Kubernetes Uptime Monitoring

Short answer. Kubernetes checks pod health with three probe types: livenessProbe restarts a hung container, readinessProbe removes a pod from load balancing until it can accept traffic, and startupProbe gives slow-starting apps time to initialize. This is internal self-healing. But probes cannot see the network path outside the cluster — for honest uptime from the user's perspective you need external synthetic monitoring.

Three probe types and their roles

Confusing liveness and readiness causes the most painful Kubernetes incidents. Their purposes are fundamentally different:

  • livenessProbe — "is the process alive?" Failure → kubelet restarts the container;
  • readinessProbe — "is it ready to accept traffic?" Failure → the pod leaves Endpoints but is not restarted;
  • startupProbe — "has it finished starting?" Until it passes, liveness and readiness are disabled.

Manifest example

A correct configuration for a web service with health endpoints:

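apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  # apps/v1 requires a selector that matches the template labels
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:1.4.2
          ports:
            - containerPort: 8080
          # Startup budget: failureThreshold x periodSeconds = 30 x 5 s = 150 s
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 5
          # Three consecutive failures make the kubelet restart the container
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 10
            failureThreshold: 3
          # Failure only removes the pod from Service endpoints; no restart
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8080
            periodSeconds: 5
            failureThreshold: 3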

Liveness vs readiness: common mistakes

| Mistake | Consequence | The right way |
| --- | --- | --- |
| liveness checks a dependency (DB) | Cascading restart of all pods on a DB failure | liveness checks the process only |
| No startupProbe for a slow start | liveness kills the pod before init finishes | Add startupProbe with headroom |
| readiness and liveness on the same path | Pod restarts instead of leaving rotation | Split /healthz and /readyz |
| failureThreshold too aggressive | Flapping restarts under load | 3–5 plus a sane periodSeconds |

The golden rule: liveness checks ONLY the process itself. If liveness probes the database or an external API, a dependency failure triggers a cascading restart of every replica.

What probes cannot see

Probes run inside the cluster: the kubelet on the pod's node sends each check straight to the container. They do not exercise the Ingress, the load balancer, DNS, TLS termination, or the network between the user and the cluster. A pod can be "Ready" while the site is unreachable from outside because of a misconfigured Ingress or an expired certificate on the load balancer.
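
To make the gap concrete, here is a hedged sketch of the edge objects that typically sit in front of such pods. It assumes the pods carry an `app: web` label; the hostname `www.example.com`, the ingress class `nginx`, and the Secret `web-tls` are illustrative placeholders, not values from this article. The commented fields can all break while every pod still reports Ready.

```yaml
# Sketch only: names, host, and ingress class are assumptions for illustration.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                 # a mismatch here leaves Ready pods with no endpoints
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  ingressClassName: nginx    # assumed controller; probes never pass through it
  tls:
    - hosts:
        - www.example.com    # DNS for this host is outside the probes' view
      secretName: web-tls    # certificate expiry is outside the probes' view
  rules:
    - host: www.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```

An external check against `https://www.example.com/` exercises every object in this sketch, which is exactly the part of the path a kubelet probe skips.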

An external layer on top of Kubernetes

enterno.io checks the full path from outside — the way a user sees it: HTTP response through the Ingress, SSL validity on the balancer, DNS resolution, and Ping from RU / EU / US regions. This complements internal probes rather than replacing them: probes heal pods, the external monitor catches problems at the cluster edge. Free tier offers 10 monitors at a 5-minute interval; paid tiers go to 1 minute / 30 seconds. Alerts reach Telegram, Slack, email, webhook, PagerDuty.

For cron jobs and bulk URL-check jobs in Kubernetes, use heartbeat monitoring (a dead man's switch): the job pings a unique URL, and if the ping does not arrive in time, you get an alert.

FAQ

Why does a pod restart under load?

Most often the liveness probe times out from CPU starvation. Increase timeoutSeconds, check resource limits, and do not put heavy checks on liveness.
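
As a rough illustration of that advice, the container-spec fragment below raises the probe timeout and sets explicit resources. All numbers are illustrative assumptions and should be tuned to the measured latency of your own health endpoint.

```yaml
# Fragment of a container spec; all numbers are illustrative.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 3       # Kubernetes default is 1 second
  failureThreshold: 3     # three consecutive failures before a restart
resources:
  requests:
    cpu: 250m             # CPU request so the health handler is less likely to starve
    memory: 256Mi
  limits:
    memory: 512Mi
```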

Is startupProbe always needed?

No, only for apps with long initialization (cache warmup, migrations). For fast services, liveness + readiness is enough.

Does readinessProbe replace external monitoring?

No. readiness controls in-cluster load balancing but cannot see the Ingress, DNS, or the path to the user. You still need an external synthetic monitor.

How do I monitor a CronJob in Kubernetes?

Use heartbeat: the job makes an HTTP ping to enterno.io/heartbeat at the end of execution; a missed ping is an alert about a job that did not run or hung.
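
A minimal sketch of that pattern as a CronJob follows. The job name, schedule, image `myjob:1.0.0`, command `/app/run-report`, and the Secret `heartbeat` are all hypothetical; the image is assumed to contain both the job binary and `curl`. The ping URL is read from a Secret because its exact format depends on the monitor you create.

```yaml
# Sketch only: image, schedule, command, and Secret are hypothetical.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: myjob:1.0.0          # assumed to ship the job binary and curl
              env:
                - name: HEARTBEAT_URL     # the unique ping URL from your monitor
                  valueFrom:
                    secretKeyRef:
                      name: heartbeat
                      key: url
              command:
                - /bin/sh
                - -c
                # && sends the ping only if the job itself exits 0
                - /app/run-report && curl -fsS --max-time 10 "$HEARTBEAT_URL"
```

Because the ping is chained with `&&`, a job that fails or hangs never sends it, and the missed heartbeat raises the alert. Note that a failed ping also makes the container exit non-zero, so Kubernetes may retry the job; keep the job idempotent if you use this form.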

Add the outside-the-cluster view: create a monitor at enterno.io/monitors to check the Ingress and SSL. Useful reading: health-check endpoints, container monitoring, monitoring for developers.
