Short answer. Kubernetes checks pod health with three probe types: livenessProbe restarts a hung container, readinessProbe removes a pod from load balancing until it can accept traffic, and startupProbe gives slow-starting apps time to initialize. This is internal self-healing. But probes cannot see the network path outside the cluster — for honest uptime from the user's perspective you need external synthetic monitoring.
Three probe types and their roles
Confusing liveness and readiness causes the most painful Kubernetes incidents. Their purposes are fundamentally different:
- livenessProbe — "is the process alive?" Failure → kubelet restarts the container;
- readinessProbe — "is it ready to accept traffic?" Failure → the pod leaves Endpoints but is not restarted;
- startupProbe — "has it finished starting?" Until it passes, liveness and readiness are disabled.
Manifest example
A correct configuration for a web service with health endpoints:
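apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web            # required; must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:1.4.2
          ports:
            - containerPort: 8080
          # Startup budget: 30 failures x 5 s = up to 150 s to initialize.
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 5
          # Restart after 3 consecutive failures, roughly 30 s without a healthy response.
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 10
            failureThreshold: 3
          # Leave rotation after 3 consecutive failures, roughly 15 s.
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8080
            periodSeconds: 5
            failureThreshold: 3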
Liveness vs readiness: common mistakes
| Mistake | Consequence | The right way |
|---|---|---|
| liveness checks a dependency (DB) | Cascading restart of all pods on a DB failure | liveness — the process only |
| No startupProbe for a slow start | liveness kills the pod before init finishes | Add a startupProbe whose failureThreshold × periodSeconds exceeds the worst-case startup time |
| readiness and liveness on the same path | Pod restarts instead of leaving rotation | Split /healthz and /readyz |
| failureThreshold too aggressive | Flapping restarts under load | failureThreshold of 3–5 with a periodSeconds long enough to ride out short latency spikes |
The golden rule: liveness checks ONLY the process itself. If liveness probes the database or an external API, a dependency failure triggers a cascading restart of every replica.
What probes cannot see
Probes run inside the cluster: kubelet knocks on the pod directly. They do not check the Ingress, the load balancer, DNS, TLS termination, or the network between the user and the cluster. A pod can be "Ready" while the site is unreachable from outside because of a misconfigured Ingress or an expired certificate on the LoadBalancer.
An external layer on top of Kubernetes
enterno.io checks the full path from outside — the way a user sees it: HTTP response through the Ingress, SSL validity on the balancer, DNS resolution, and Ping from RU / EU / US regions. This complements internal probes rather than replacing them: probes heal pods, the external monitor catches problems at the cluster edge. Free tier offers 10 monitors at a 5-minute interval; paid tiers go to 1 minute / 30 seconds. Alerts reach Telegram, Slack, email, webhook, PagerDuty.
For cron and batch jobs in Kubernetes, use heartbeat monitoring (a dead man's switch): the job pings a unique URL, and if the ping does not arrive in time, you get an alert.
FAQ
Why does a pod restart under load?
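Most often the liveness probe times out from CPU starvation: a throttled process cannot answer within timeoutSeconds, which defaults to 1 second. Increase timeoutSeconds, check CPU requests and limits, and keep heavy checks out of the liveness endpoint.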
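As a rough sketch of that advice, the liveness probe from the manifest above could be relaxed as follows. The specific values (a 3-second timeout, the CPU and memory figures) are illustrative assumptions, not measured recommendations; tune them against your own latency under load.

```yaml
# Fragment of the container spec; all values are illustrative assumptions.
livenessProbe:
  httpGet:
    path: /healthz        # cheap check: no database or external calls
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 3       # default is 1; gives a throttled process more time to answer
  failureThreshold: 3
resources:
  requests:
    cpu: 250m             # hypothetical figures; size them from real usage
    memory: 256Mi
  limits:
    cpu: "1"              # a CPU limit set too low causes throttling and slow probe responses
    memory: 512Mi
```

With these settings, a restart happens only after three consecutive probes each fail or take longer than 3 seconds.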
Is startupProbe always needed?
No, only for apps with long initialization (cache warmup, migrations). For fast services, liveness + readiness is enough.
Does readinessProbe replace external monitoring?
No. readiness controls in-cluster load balancing but cannot see the Ingress, DNS, or the path to the user. You still need an external synthetic monitor.
How do I monitor a CronJob in Kubernetes?
Use heartbeat: the job makes an HTTP ping to enterno.io/heartbeat at the end of execution; a missed ping triggers an alert that the job did not run, failed, or hung.
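A minimal sketch of such a job is shown below. The job name, image, `/app/run-report` command, and the `heartbeat` Secret holding the unique ping URL are all hypothetical placeholders; the only assumption about the container is that it ships `sh` and `curl`.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report              # hypothetical job name
spec:
  schedule: "0 3 * * *"             # daily at 03:00
  concurrencyPolicy: Forbid         # do not start a new run while one is still going
  jobTemplate:
    spec:
      backoffLimit: 1
      activeDeadlineSeconds: 1800   # terminate a hung run after 30 minutes
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: job
              image: myjob:1.0.0    # hypothetical image; must include sh and curl
              env:
                - name: HEARTBEAT_URL
                  valueFrom:
                    secretKeyRef:
                      name: heartbeat   # hypothetical Secret with the unique ping URL
                      key: url
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # Ping only if the job command exits successfully (&&).
                  /app/run-report && curl -fsS --max-time 10 --retry 3 "$HEARTBEAT_URL"
```

Because the ping is chained with `&&`, a failed or killed run sends nothing, so the missing heartbeat covers jobs that crashed or hung as well as jobs that never started.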
Add the outside-the-cluster view: create a monitor at enterno.io/monitors to check the Ingress and SSL.