Designing Effective Health Check Endpoints for Web Services

A health check endpoint is a dedicated URL that reports whether your application is functioning correctly. It's the foundation of automated monitoring, load balancer routing, and container orchestration. A well-designed health check prevents false positives, catches real issues, and provides actionable information.

Types of Health Checks

Liveness Check

Answers: "Is the application process running?" A minimal check — if it fails, the application needs to be restarted.

GET /health/live → 200 OK
{"status": "alive"}

Only check the application process itself. Don't check dependencies — a database outage shouldn't trigger an application restart (which won't fix the database).
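A liveness handler can be as small as the sketch below. The function name `livenessResponse` and the array shape it returns are illustrative assumptions, not part of any framework. The point is that the handler touches no database, cache, or network resource.

```php
<?php
/**
 * Hypothetical liveness handler: if this code runs at all, the process is alive.
 * It deliberately performs no dependency checks.
 */
function livenessResponse(): array
{
    return [
        'code' => 200,
        'body' => json_encode(['status' => 'alive']),
    ];
}

// In the script that serves /health/live, the result could be emitted like this:
// $response = livenessResponse();
// http_response_code($response['code']);
// header('Content-Type: application/json');
// echo $response['body'];
```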

Readiness Check

Answers: "Can this instance handle traffic?" Checks the dependencies the instance needs to serve requests. If it fails, remove this instance from the load balancer but don't restart it.

GET /health/ready → 200 OK
{
  "status": "ready",
  "checks": {
    "database": {"status": "up", "latency_ms": 3},
    "redis": {"status": "up", "latency_ms": 1},
    "disk_space": {"status": "ok", "free_gb": 42}
  }
}

GET /health/ready → 503 Service Unavailable
{
  "status": "not_ready",
  "checks": {
    "database": {"status": "down", "error": "Connection refused"},
    "redis": {"status": "up", "latency_ms": 1}
  }
}

Startup Check

Answers: "Has the application finished starting up?" Prevents traffic before initialization is complete (database migrations, cache warming, config loading).
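One possible way to implement a startup check is a marker file. The sketch assumes the bootstrap code creates the file once migrations, cache warming, and config loading have finished. The function name, the marker path, and the "started"/"starting" labels are all hypothetical.

```php
<?php
/**
 * Hypothetical startup handler: reports 200 only after the bootstrap code
 * has created the marker file, and 503 until then.
 */
function startupResponse(string $markerPath): array
{
    // Drop PHP's cached stat result so long-running workers see the new file.
    clearstatcache(true, $markerPath);
    $started = is_file($markerPath);

    return [
        'code' => $started ? 200 : 503,
        'body' => json_encode(['status' => $started ? 'started' : 'starting']),
    ];
}

// Example call with an assumed marker location:
// $response = startupResponse('/var/run/myapp/started');
```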

What to Check

Response Format

{
  "status": "healthy",
  "version": "2.4.1",
  "uptime_seconds": 86432,
  "timestamp": "2025-03-15T12:00:00Z",
  "checks": {
    "mysql": {
      "status": "up",
      "latency_ms": 2,
      "details": "MySQL 8.0, 45 active connections"
    },
    "redis": {
      "status": "up",
      "latency_ms": 0.5,
      "details": "Redis 6.0, 128MB used"
    },
    "disk": {
      "status": "warning",
      "free_gb": 5.2,
      "details": "Below 10GB threshold"
    }
  }
}
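A report in this shape could be assembled as sketched below. The function `buildHealthReport` is hypothetical. It assumes that only a "down" check makes the instance unhealthy, while a "warning" is reported without changing the overall status, which matches the sample above where the disk warning coexists with "healthy".

```php
<?php
/**
 * Combine per-dependency results into one report.
 * Assumptions: a check with status "down" (or with no status at all) marks
 * the instance unhealthy; "warning" does not.
 */
function buildHealthReport(array $checks, string $version, int $startedAt, ?int $now = null): array
{
    $now = $now ?? time();
    $healthy = true;

    foreach ($checks as $check) {
        if (($check['status'] ?? 'down') === 'down') {
            $healthy = false;
        }
    }

    return [
        'status' => $healthy ? 'healthy' : 'unhealthy',
        'version' => $version,
        'uptime_seconds' => $now - $startedAt,
        // UTC timestamp in ISO 8601 form, e.g. 2025-03-15T12:00:00Z
        'timestamp' => gmdate('Y-m-d\TH:i:s\Z', $now),
        'checks' => $checks,
    ];
}
```

Passing `$now` explicitly is only a convenience for testing; in a real endpoint it can be left out so the current time is used.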

HTTP Status Codes

Return 200 when the instance is healthy and 503 when it is not. Don't use 500, because that implies an unexpected error. 503 specifically means "temporarily unavailable", which is the right semantics.

Implementation Tips

Keep It Fast

Health checks are called frequently (every 5-30 seconds), so a healthy instance should respond in under 1 second. Cap each dependency check with a connection timeout of 2-3 seconds so that a hung dependency cannot stall the endpoint indefinitely.
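The sketch below shows one way to bound a dependency check with a hard connection timeout. It probes a plain TCP port with `fsockopen`; the helper name `checkTcp` and its return shape are assumptions made for illustration. For real database or cache clients, the equivalent setting would typically be the client's own timeout option, such as `PDO::ATTR_TIMEOUT` or the timeout argument of phpredis `connect()`.

```php
<?php
/**
 * Probe a TCP endpoint, giving up after $timeoutSeconds.
 * This only proves the port accepts connections, not that the service
 * behind it answers queries.
 */
function checkTcp(string $host, int $port, float $timeoutSeconds = 2.0): array
{
    $start = microtime(true);
    $errno = 0;
    $errstr = '';

    // @ suppresses the PHP warning; failure is reported via the return value.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeoutSeconds);
    $latencyMs = round((microtime(true) - $start) * 1000, 1);

    if ($socket === false) {
        return [
            'status' => 'down',
            'error' => $errstr !== '' ? $errstr : 'connection failed',
            'latency_ms' => $latencyMs,
        ];
    }

    fclose($socket);
    return ['status' => 'up', 'latency_ms' => $latencyMs];
}

// Example with an assumed local MySQL port:
// $checks['mysql_port'] = checkTcp('127.0.0.1', 3306, 2.0);
```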

Don't Break Under Load

A health check that fails under high load will cause the load balancer to remove the instance — reducing capacity exactly when you need it most. Make health checks lightweight and independent of application load.

Cache Dependency Checks

Don't query the database on every health check call. Cache results for 5-10 seconds to reduce overhead.
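A small file-based cache is one way to do this without extra extensions. The sketch below is an assumption-laden example: the helper name `cachedCheck`, the file naming scheme, and the use of the system temp directory are all hypothetical choices. A shared-memory cache would serve the same purpose where available.

```php
<?php
/**
 * Return a stored check result if it is younger than $ttlSeconds,
 * otherwise run $check (which must return an array) and store its result.
 */
function cachedCheck(string $name, int $ttlSeconds, callable $check, ?string $cacheDir = null): array
{
    $file = ($cacheDir ?? sys_get_temp_dir()) . '/health_' . md5($name) . '.json';

    // Drop PHP's cached stat data so filemtime() reflects the latest write.
    clearstatcache(true, $file);

    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        $cached = json_decode((string) file_get_contents($file), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    $result = $check();
    // LOCK_EX avoids interleaved writes from concurrent requests.
    file_put_contents($file, json_encode($result), LOCK_EX);

    return $result;
}

// Example: run the real database probe at most once every 5 seconds.
// $checks['mysql'] = cachedCheck('mysql', 5, fn () => probeMysql()); // probeMysql() is hypothetical
```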

Separate Public from Internal

Public health check (/health): returns only the status code (200/503).
Internal health check (/health/detailed): returns full diagnostics.

Protect the detailed endpoint, because it can leak infrastructure information.
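The split between the two endpoints can be sketched as follows. Both function names are hypothetical, and the shared-secret token is just one assumed way to protect the detailed view; IP allowlists or network-level restrictions are equally plausible.

```php
<?php
/**
 * Render a report for either audience: authorized callers get the full
 * JSON diagnostics, everyone else gets only the status code.
 */
function renderHealth(array $report, bool $authorized): array
{
    $code = ($report['status'] ?? '') === 'healthy' ? 200 : 503;

    return [
        'code' => $code,
        'body' => $authorized ? json_encode($report) : '',
    ];
}

/**
 * Hypothetical guard: compare a caller-supplied token with the expected
 * secret in constant time.
 */
function mayViewDetails(?string $providedToken, string $expectedToken): bool
{
    return $providedToken !== null && hash_equals($expectedToken, $providedToken);
}

// Example wiring, assuming the token arrives in a custom request header:
// $authorized = mayViewDetails($_SERVER['HTTP_X_HEALTH_TOKEN'] ?? null, $secret);
// $response = renderHealth($report, $authorized);
```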

PHP Implementation Example

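<?php
// /api/health.php
// getDB() and getRedis() are application-specific helpers. This example
// assumes they return a database handle and a Redis client that throw
// exceptions on failure.
$checks = [];
$healthy = true;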

// MySQL check
try {
    $start = microtime(true);
    $db = getDB();
    $db->query('SELECT 1');
    $checks['mysql'] = [
        'status' => 'up',
        'latency_ms' => round((microtime(true) - $start) * 1000, 1)
    ];
} catch (Exception $e) {
    $checks['mysql'] = ['status' => 'down', 'error' => $e->getMessage()];
    $healthy = false;
}

// Redis check
try {
    $start = microtime(true);
    $redis = getRedis();
    $redis->ping();
    $checks['redis'] = [
        'status' => 'up',
        'latency_ms' => round((microtime(true) - $start) * 1000, 1)
    ];
} catch (Exception $e) {
    $checks['redis'] = ['status' => 'down', 'error' => $e->getMessage()];
    $healthy = false;
}

// 200 when every check passed, 503 otherwise
http_response_code($healthy ? 200 : 503);
header('Content-Type: application/json');
echo json_encode(['status' => $healthy ? 'healthy' : 'unhealthy', 'checks' => $checks]);

Conclusion

Health check endpoints are simple to build but critical to get right. Separate liveness from readiness, keep checks fast, protect detailed endpoints, and ensure they work correctly under load. A well-designed health check is the first thing that tells you something is wrong — make sure it's reliable.
