Log aggregation is the practice of collecting logs from multiple services into a central, searchable store. The motivation: running grep across 50 servers does not scale. Stack options: ELK (Elasticsearch + Logstash + Kibana), powerful but expensive; Loki (from Grafana, cheaper); Splunk (enterprise, high license cost); CloudWatch Logs or Datadog Logs (SaaS). Critical features: search, alerts, retention, and correlation with traces.
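# Fluent Bit config: tail the nginx access log and ship it to Loki
[INPUT]
    Name    tail
    Path    /var/log/nginx/access.log

[OUTPUT]
    Name    loki
    # Host and Port are separate keys in the loki output plugin
    Host    grafana-loki
    Port    3100
    # ${HOSTNAME} is expanded from the environment when the config is loaded
    Labels  host=${HOSTNAME}, service=nginx

ELK vs Loki: ELK builds a full-text index, so search is fast but storage and compute become expensive at scale. Loki indexes only Prometheus-style labels and greps the log lines at query time, which can make it around 10× cheaper. For high-volume logs, choose Loki. For complex search, choose ELK.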
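The indexing difference above can be illustrated with a toy model. This is only a sketch in plain Python, not how Elasticsearch or Loki are implemented: the class names `LabelIndexedStore` and `FullTextStore` are hypothetical, and real systems add compression, sharding, and much richer query languages.

```python
from collections import defaultdict


class LabelIndexedStore:
    """Toy Loki-style store: only labels are indexed, lines are scanned at query time."""

    def __init__(self):
        # Maps a frozen set of (label, value) pairs to the raw lines of that stream.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector, needle):
        wanted = set(selector.items())
        hits = []
        for stream_labels, lines in self.streams.items():
            if wanted <= stream_labels:  # cheap label match picks the streams
                # grep-style scan of every line in the selected streams
                hits.extend(line for line in lines if needle in line)
        return hits


class FullTextStore:
    """Toy ELK-style store: every token is indexed at ingest time."""

    def __init__(self):
        self.lines = []
        self.inverted = defaultdict(set)  # token -> ids of lines containing it

    def push(self, labels, line):
        # Labels are ignored here; this toy indexes the line text only.
        line_id = len(self.lines)
        self.lines.append(line)
        for token in line.lower().split():
            self.inverted[token].add(line_id)

    def query(self, token):
        # Lookup cost does not depend on how many lines were ingested.
        return [self.lines[i] for i in sorted(self.inverted.get(token.lower(), ()))]
```

The trade-off shows up in `push` versus `query`: the full-text store pays on every ingested line (index size grows with the number of distinct tokens), while the label-indexed store pays only when someone searches, and only for the streams the selector matches.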
To keep cost under control: sample (for example, drop 90% of INFO logs), keep log-level discipline (INFO/WARN/ERROR in production, not DEBUG), and set a TTL (under 30 days in hot storage).
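A minimal sketch of source-side sampling, assuming the application logs through Python's standard `logging` module. The class name `InfoSampler` and the `keep_ratio` parameter are hypothetical names chosen for this example; many shippers can also sample in the pipeline, which is not shown here.

```python
import logging
import random


class InfoSampler(logging.Filter):
    """Keep a fraction of INFO records; always keep WARNING and above."""

    def __init__(self, keep_ratio=0.1, rng=None):
        super().__init__()
        self.keep_ratio = keep_ratio
        # An injectable RNG makes the sampler reproducible in tests.
        self.rng = rng if rng is not None else random.Random()

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True  # never drop warnings or errors
        # Records below WARNING are kept with probability keep_ratio.
        return self.rng.random() < self.keep_ratio


def build_logger(name="app", keep_ratio=0.1):
    """Return a logger that emits INFO and above, sampling INFO at keep_ratio."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)  # DEBUG is rejected before the filter runs
    handler = logging.StreamHandler()
    handler.addFilter(InfoSampler(keep_ratio))
    logger.addHandler(handler)
    return logger
```

With `keep_ratio=0.1` roughly 90% of INFO lines never leave the process, so they cost nothing to ship or store. The price is that an individual INFO line can no longer be assumed to exist when debugging, which is why WARNING and ERROR are exempt.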
For multi-region setups: ingest into the nearest region and replicate asynchronously. Alternatively, keep separate per-region stores and run federated search across them (Loki federation, CloudWatch cross-account).