Load Balancing Algorithms: Round Robin, Least Connections, and More
Load balancing distributes incoming network traffic across multiple servers to ensure no single server bears too much load. The algorithm used to make distribution decisions has a direct impact on application performance, reliability, and resource utilization. Choosing the right algorithm depends on your traffic patterns, server capabilities, and application architecture.
Why Load Balancing Matters
Without load balancing, a single server handles all requests. This creates a single point of failure and limits your ability to scale. Load balancers solve this by routing requests to a pool of backend servers based on a distribution algorithm. Benefits include:
- High availability — if one server fails, traffic is redirected to healthy ones
- Scalability — add more servers to handle increased load
- Performance — distribute work to reduce response times
- Maintenance — take servers offline for updates without downtime
Static Algorithms
Static algorithms make routing decisions without considering the current state of backend servers. They are simple, predictable, and have minimal overhead.
Round Robin
The simplest algorithm. Requests are distributed sequentially across the server pool. Server 1 gets the first request, Server 2 gets the second, and so on. After the last server, it cycles back to Server 1.
# Nginx round robin (default behavior)
upstream backend {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}
Best for: Servers with identical hardware and stateless applications where each request takes roughly the same processing time.
Limitations: Does not account for differences in server capacity or current load. Slow requests can pile up on one server while others sit idle, skewing the effective distribution.
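The rotation itself is trivial to sketch in Python (the backend addresses below are illustrative, not part of any real deployment):

```python
from itertools import cycle

# Hypothetical backend pool; addresses are illustrative.
servers = ["10.0.1.1:8080", "10.0.1.2:8080", "10.0.1.3:8080"]
pool = cycle(servers)

def next_server():
    """Return the next backend in strict rotation."""
    return next(pool)

# Six requests cycle through the pool exactly twice.
assignments = [next_server() for _ in range(6)]
```

Note that the rotation state is shared: a real load balancer keeps one cursor per listener, which is why round robin is cheap but also blind to what each server is actually doing.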
Weighted Round Robin
An extension of round robin where each server is assigned a weight proportional to its capacity. A server with weight 5 receives five times as many requests as a server with weight 1.
# Nginx weighted round robin
upstream backend {
    server 10.0.1.1:8080 weight=5;  # Powerful server
    server 10.0.1.2:8080 weight=3;  # Medium server
    server 10.0.1.3:8080 weight=1;  # Small server
}
Best for: Heterogeneous server pools where machines have different CPU, memory, or network capabilities.
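A naive implementation would send five requests to the big server in a row, then three to the next, which creates bursts. Nginx instead uses a "smooth" weighted round robin that interleaves picks. A minimal Python sketch of that smooth variant (server names and weights are illustrative):

```python
def smooth_wrr(servers, n):
    """Smooth weighted round robin.

    servers: list of (name, weight) pairs.
    Returns n picks; over one full cycle (sum of weights),
    each server is picked exactly `weight` times, interleaved.
    """
    current = {name: 0 for name, _ in servers}
    total = sum(w for _, w in servers)
    picks = []
    for _ in range(n):
        # Every server earns credit equal to its weight...
        for name, w in servers:
            current[name] += w
        # ...the highest-credit server wins and pays back the total.
        best = max(servers, key=lambda s: current[s[0]])[0]
        current[best] -= total
        picks.append(best)
    return picks

servers = [("big", 5), ("mid", 3), ("small", 1)]
picks = smooth_wrr(servers, 9)  # 9 = total weight, one full cycle
```

Over one cycle of 9 picks, `big` appears 5 times, `mid` 3 times, and `small` once, with the picks spread out rather than clustered.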
IP Hash
A hash of the client's IP address determines which server receives the request. The same client IP always maps to the same server (as long as the server pool does not change).
# Nginx IP hash
upstream backend {
    ip_hash;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}
Best for: Applications that require session affinity (sticky sessions) without using cookies or tokens. Useful when server-side session state cannot be externalized.
Limitations: Clients behind a NAT or corporate proxy share the same IP, causing uneven distribution. Adding or removing servers reshuffles the mapping.
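The core idea is just a deterministic hash modulo the pool size. A simplified Python sketch (Nginx's actual `ip_hash` hashes only the leading octets of an IPv4 address; this modulo version just illustrates the mechanism):

```python
import hashlib

def pick_server(client_ip, servers):
    """Map a client IP to a server deterministically.

    Simplified illustration: hash the IP, take it modulo the
    pool size. The same IP always maps to the same server as
    long as `servers` does not change.
    """
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

servers = ["10.0.1.1:8080", "10.0.1.2:8080", "10.0.1.3:8080"]

# The same client always lands on the same server...
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
```

Because the mapping depends on `len(servers)`, adding or removing a backend can remap most clients at once; consistent hashing is the usual remedy when that churn is unacceptable.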
Dynamic Algorithms
Dynamic algorithms consider the current state of backend servers when making routing decisions. They adapt to real-time conditions but require more overhead to track server state.
Least Connections
Routes each new request to the server with the fewest active connections. This naturally adapts to differences in request processing time — servers handling slow requests accumulate connections and receive fewer new ones.
# Nginx least connections
upstream backend {
    least_conn;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}
Best for: Applications with variable request processing times, such as APIs where some endpoints are fast and others involve database queries or external service calls.
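The selection rule itself is a one-liner once the balancer tracks per-server connection counts. A minimal Python sketch (the counts and addresses are made up for illustration):

```python
def least_connections(active):
    """Return the server with the fewest active connections.

    active: dict mapping server -> current open connection count.
    """
    return min(active, key=active.get)

# Hypothetical snapshot of connection counts.
active = {"10.0.1.1:8080": 12, "10.0.1.2:8080": 4, "10.0.1.3:8080": 9}

target = least_connections(active)  # the server with 4 connections
active[target] += 1                 # the new request opens a connection
```

The bookkeeping is the real cost: the balancer must decrement the count when each connection closes, which is why this is classed as a dynamic algorithm.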
Weighted Least Connections
Combines least connections with server weights. The routing decision considers both the number of active connections and each server's weight, producing a normalized connection-to-capacity ratio.
# HAProxy weighted least connections
backend app_servers
    balance leastconn
    server srv1 10.0.1.1:8080 weight 5 check
    server srv2 10.0.1.2:8080 weight 3 check
    server srv3 10.0.1.3:8080 weight 1 check
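One way to express that normalization, sketched in Python (the connections-to-weight ratio here is a common formulation, not a transcription of HAProxy's internal scoring; the numbers are illustrative):

```python
def weighted_least_conn(stats):
    """Pick the server with the lowest connections-to-weight ratio.

    stats: dict mapping server -> (active_connections, weight).
    A high-weight server is allowed proportionally more connections
    before it stops being the preferred target.
    """
    return min(stats, key=lambda s: stats[s][0] / stats[s][1])

# Hypothetical snapshot: (active connections, weight).
stats = {"srv1": (10, 5), "srv2": (5, 3), "srv3": (2, 1)}
# ratios: srv1 = 2.0, srv2 ~= 1.67, srv3 = 2.0 -> srv2 wins
choice = weighted_least_conn(stats)
```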
Least Response Time
Routes requests to the server with the fastest response time and fewest active connections. This requires the load balancer to actively measure backend response times.
Best for: Latency-sensitive applications where response time consistency is critical.
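Implementations differ in how they combine the two signals; one plausible scoring (an assumption for illustration, not any specific vendor's formula) multiplies measured average latency by current load:

```python
def least_response_time(stats):
    """Pick the server with the best latency-under-load score.

    stats: dict mapping server -> (avg_response_ms, active_connections).
    Score = latency * (connections + 1): a fast but busy server can
    lose to a slightly slower idle one.
    """
    return min(stats, key=lambda s: stats[s][0] * (stats[s][1] + 1))

# Hypothetical measurements: (avg response ms, active connections).
stats = {"srv1": (50.0, 2), "srv2": (80.0, 1), "srv3": (40.0, 4)}
# scores: srv1 = 150, srv2 = 160, srv3 = 200 -> srv1 wins
choice = least_response_time(stats)
```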
Random with Two Choices
Picks two servers at random, then sends the request to the one with fewer connections. This provides near-optimal distribution with very low computational overhead and no shared state — making it ideal for distributed load balancers.
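The entire algorithm fits in a few lines, which is much of its appeal. A Python sketch (connection counts are illustrative):

```python
import random

def two_choices(active, rng=random):
    """Power of two choices: sample two distinct servers uniformly,
    route to whichever has fewer active connections.

    active: dict mapping server -> current connection count.
    """
    a, b = rng.sample(sorted(active), 2)
    return a if active[a] <= active[b] else b

# With only two servers, the sample always covers both,
# so the less-loaded one is always chosen.
active = {"10.0.1.1:8080": 5, "10.0.1.2:8080": 2}
choice = two_choices(active)
```

Because each balancer node only needs its own (possibly stale) view of connection counts, this works well when many load balancer instances run independently.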
Algorithm Comparison
| Algorithm | Type | Session Affinity | Adapts to Load | Complexity |
|---|---|---|---|---|
| Round Robin | Static | No | No | Very Low |
| Weighted Round Robin | Static | No | No | Low |
| IP Hash | Static | Yes | No | Low |
| Least Connections | Dynamic | No | Yes | Medium |
| Weighted Least Conn | Dynamic | No | Yes | Medium |
| Least Response Time | Dynamic | No | Yes | High |
| Random Two Choices | Dynamic | No | Yes | Low |
Health Checks
Regardless of algorithm, health checks are essential. They ensure traffic is only sent to servers that can handle it:
- Passive checks — the load balancer marks a server as down after observing consecutive failures
- Active checks — the load balancer periodically sends probe requests to each server
# HAProxy active health checks
backend app_servers
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server srv1 10.0.1.1:8080 check inter 5s fall 3 rise 2
    server srv2 10.0.1.2:8080 check inter 5s fall 3 rise 2
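The `fall 3 rise 2` counters above encode a simple state machine: down after three consecutive failures, up again after two consecutive successes. A Python sketch of that logic (a simplified model, not HAProxy's implementation):

```python
class HealthTracker:
    """Mark a server down after `fall` consecutive failures,
    healthy again after `rise` consecutive successes."""

    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.failures = 0
        self.successes = 0
        self.healthy = True

    def record(self, ok):
        if ok:
            self.failures = 0
            if not self.healthy:
                self.successes += 1
                if self.successes >= self.rise:
                    self.healthy = True
        else:
            self.successes = 0
            self.failures += 1
            if self.failures >= self.fall:
                self.healthy = False

t = HealthTracker(fall=3, rise=2)
for _ in range(3):
    t.record(False)   # three straight failures -> marked down
```

Requiring consecutive failures before marking a server down (and consecutive successes before restoring it) prevents a single dropped probe from flapping the server in and out of the pool.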
Choosing the Right Algorithm
- Identical servers, uniform requests → Round Robin
- Different server capacities → Weighted Round Robin
- Variable request duration → Least Connections
- Session stickiness needed → IP Hash or cookie-based affinity
- Latency-critical applications → Least Response Time
- Distributed / multi-region → Random with Two Choices
Summary
Load balancing algorithms range from simple static approaches like round robin to sophisticated dynamic methods that adapt to real-time server conditions. The optimal choice depends on your specific requirements: server homogeneity, request patterns, session requirements, and latency sensitivity. In practice, starting with least connections is a solid default for most web applications, as it naturally adapts to variable workloads without requiring manual weight tuning.