Rate limiting in HAProxy and Nginx

Rate-limiting is a common strategy for safeguarding a server from potential DDoS attacks or sudden peaks in network traffic. Rate-limiting instructs the server to block requests from IP addresses that are sending an unusual number of requests to the system.

We can apply rate-limiting to both Nginx and HAProxy. Nginx runs on each end node hosting the service, while HAProxy serves as the load-balancer and distributes incoming requests among available nodes. This post describes how to rate-limit requests on both Nginx and HAProxy and shows how to whitelist IPs and rate-limit a single URL. The final section shows how to apply this configuration in Puppet.

1. Rate-limiting in HAProxy

This section describes how to configure HAProxy to rate-limit incoming requests and block requests that cross a certain threshold. HAProxy provides rate-limiting in the following contexts:

Rate queuing: The simplest form of rate-limiting in which requests are queued if they cross a certain threshold. Subsequent requests are served in FIFO order.
Sliding window: Requests are stored in a stick table that keeps a record of incoming IP addresses. A threshold is defined, allowing the user to make a certain number of requests in a given time period. Subsequent requests are denied with a 429 status code.
Fixed window: Similar to the sliding window, we define an interval as before; however, instead of storing the request rate, we keep a request counter. When requests for a certain user reach their limit, subsequent requests get blocked, usually for a period of 24 hours (can be modified). This is mainly used for APIs where user requests are limited to, for example, 1000 per day.
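
To make the difference between the last two schemes concrete, here is a minimal Python sketch of both. It is an illustration only, not how HAProxy implements its counters: the class names, the `allow` method, and the explicit `now` argument are assumptions made for this example, and unlike HAProxy the sliding-window class only records requests that were allowed.

```python
import time
from collections import defaultdict, deque


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # client -> (window index, request count)

    def allow(self, client, now=None):
        now = time.monotonic() if now is None else now
        index = int(now // self.window)
        seen_index, count = self.counters.get(client, (index, 0))
        if seen_index != index:
            count = 0  # a new window began, so the counter starts over
        if count >= self.limit:
            return False
        self.counters[client] = (index, count + 1)
        return True


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any trailing window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client -> timestamps of allowed requests

    def allow(self, client, now=None):
        now = time.monotonic() if now is None else now
        hits = self.hits[client]
        # Forget requests that have slid out of the trailing window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


fixed = FixedWindowLimiter(limit=2, window_seconds=10)
sliding = SlidingWindowLimiter(limit=2, window_seconds=10)
times = (0, 9, 10, 11)
fixed_results = [fixed.allow("client", now=t) for t in times]
sliding_results = [sliding.allow("client", now=t) for t in times]
```

With a limit of 2 requests per 10 seconds, the fixed window lets all four requests through because the counter resets at t=10, while the sliding window rejects the request at t=11 because two requests were already seen in the preceding 10 seconds.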

1.1. HAProxy configuration

This HAProxy blog covers the basics of applying rate-limiting on the server side.

HAProxy configuration is divided into four sections: frontend, backend, defaults, and global. The frontend and backend can also be combined into a single listen section, but keeping them separate improves maintainability and readability.

frontend: Handles all incoming requests from clients.
backend: Expected to fulfill the request.
defaults: Contains default settings for the server. Can be used to avoid duplication.
global: Global settings for the server.

To rate-limit HTTP requests that arrive over HTTPS, HAProxy first needs to decrypt the traffic using SSL offloading, because the requests cannot be counted while they are encrypted. Native SSL termination is available in HAProxy 1.5 and later.

We will only focus on the frontend and backend configuration sections, as the other sections contain default configuration and will not be used for setting up rate-limiting.

1.2. Defining the frontend configuration

In the frontend section, we specify the port and IP address where our site listens for traffic. If we have configured SSL for our site, we also need to specify the location of the certificate file.

View the haproxy systemd service ($ systemctl status haproxy) to check where the configuration file is defined.

frontend my_example_site.com # The keyword following 'frontend' is the label.
bind *:443 ssl crt /etc/ssl/cert1.pem # Listen on port 443 and terminate SSL with this certificate.
bind *:80 # Listen on port 80 on all IP addresses.
default_backend my_example_site.com_backend # Afterwards, traffic is forwarded to this backend (section 1.3).
http-request track-sc0 src # (Described in section 1.4. Stick tables).
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 50 } !white_list # (Described in section 1.5. Setting the request rate limit; the white_list ACL is defined in section 1.6).
mode http # HTTP mode instructs the server to inspect the traffic before passing it to the backend.
# The other option is 'tcp', in which HAProxy does not inspect the HTTP layer and passes the traffic on to the backend as is.
redirect scheme https code 301 if !{ ssl_fc } # Redirects clients to HTTPS with a 301 status code if they connect over unencrypted HTTP.
stick-table type ipv6 size 100k expire 30s store http_req_rate(1s) # See section 1.4 for stick tables.

1.3. Defining the backend configuration

The backend section defines a pool of servers where requests are actually handled. Below is an example backend configuration:

backend my_example_site.com_backend
balance roundrobin # Select the load balancing algorithm.
default-server inter 2s fall 2 rise 2 # See below:
# inter: Specifies the inter-check delay for health checks. In this case, 'inter 2s' indicates that the interval between two consecutive health checks for a server is 2 seconds.
# fall: Specifies the number of consecutive failed health checks after which a server is considered down. With 'fall 2', if two consecutive health checks fail, the server will be marked as down.
# rise: Specifies the number of consecutive successful health checks required for a server to be marked as up. With 'rise 2', after two consecutive successful health checks, the server will be marked as up.
mode http # Instruct backend servers to communicate using the HTTP protocol.
option httpchk HEAD /status HTTP/1.1\r\nHost:\ www.example.com # See below:
# Sends a HEAD request to the /status path of the backend servers, specifying the Host header as www.example.com, to perform health checks and determine the availability and health status of the servers.
stick on src # Enable session stickiness based on the source IP address of the client.
stick-table type ip size 20k peers sct_my_example_site # Define a stick table.
server server1 192.168.1.25:80 check # Define a list of servers, each on a separate line. 'check' enables the health checks configured above.
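
The rise and fall settings are easier to follow as a small state machine. The Python sketch below is a simplified model of the behaviour described in the comments above, not HAProxy's implementation; the `ServerHealth` class and its `record` method are names invented for this example.

```python
class ServerHealth:
    """Track whether a server is up, using 'rise' and 'fall' thresholds."""

    def __init__(self, rise=2, fall=2):
        self.rise = rise
        self.fall = fall
        self.up = True
        self.streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Record one health check result and return the resulting state."""
        if check_passed == self.up:
            self.streak = 0  # the result agrees with the current state
        else:
            self.streak += 1
            threshold = self.fall if self.up else self.rise
            if self.streak >= threshold:
                self.up = not self.up
                self.streak = 0
        return self.up


health = ServerHealth(rise=2, fall=2)
states = [health.record(r) for r in (False, True, False, False, True, True)]
```

A single failed check does not take the server out of rotation; only two consecutive failures do, and two consecutive successes bring it back.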

1.4. Stick tables

Stick tables are what make it possible to rate-limit clients. They are a key-value store that holds an incoming IP address as the key and its counters as the value. The counters are updated whenever that address makes a new request to the server. Using this information, we can define rules to block requests if they cross a certain threshold.

We can tweak how long a stick table can hold information before erasing its buffer. A stick table can be defined as follows.

backend st_src_global # The backend for which we are defining this.
stick-table type ip size 1m expire 10s store http_req_rate(10s) # A stick table that can hold 1m
# (1048576) IPs. An entry expires 10 seconds after it was last accessed.
# The HTTP request rate is calculated over an interval of 10 seconds.

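Entries are tracked through numbered stick counters. How many are available depends on how HAProxy was built: the default is three (sc0 to sc2), and builds with a raised limit provide up to 12, labeled from sc0 up to sc11.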
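
As a mental model, a stick table behaves like a dictionary whose entries disappear after a period of inactivity. The Python sketch below illustrates that idea only. The `StickTable` class and its `track` method are hypothetical, it stores a plain counter rather than a rate, and its handling of a full table (it simply stops tracking new keys) is a simplification that does not attempt to mirror HAProxy.

```python
import time


class StickTable:
    """Toy key-value store whose entries expire after a period of inactivity."""

    def __init__(self, size, expire_seconds):
        self.size = size
        self.expire = expire_seconds
        self.entries = {}  # key -> [counter, time of last access]

    def track(self, key, now=None):
        """Count one request for `key` and return its counter, or None if full."""
        now = time.monotonic() if now is None else now
        # Drop every entry that has not been accessed within the expiry period.
        self.entries = {
            k: v for k, v in self.entries.items() if now - v[1] < self.expire
        }
        if key not in self.entries:
            if len(self.entries) >= self.size:
                return None  # simplification: new keys are not tracked when full
            self.entries[key] = [0, now]
        entry = self.entries[key]
        entry[0] += 1
        entry[1] = now  # every access refreshes the expiry timer
        return entry[0]


table = StickTable(size=2, expire_seconds=10)
first = table.track("10.0.0.1", now=0)
second = table.track("10.0.0.1", now=5)
after_expiry = table.track("10.0.0.1", now=20)
```

The counter for 10.0.0.1 climbs while requests keep arriving, then starts over once the address has been idle for longer than the expiry period.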

1.5. Setting the request rate limit

We can use the HAProxy built-in http_req_rate counter to measure the request rate of each tracked IP address. In this example, we will return a 429 if a user makes more than 50 requests in an interval of 5 seconds.

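stick-table type ip size 1m expire 5s store http_req_rate(5s)
http-request track-sc0 src # Track the client IP in stick counter 0; without this, the rate below is never updated.
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 50 } !white_list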

The parameter 0 for sc_http_req_rate refers to the stick counter number.

1.6. Whitelisting an IP address

To define a whitelist, we use Access Control Lists (ACLs). In HAProxy, they allow us to test various conditions and perform a given action based on those tests. They can be defined as follows.

acl white_list src 192.168.1.1 ... # List of IP addresses to whitelist.
http-request track-sc0 src
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 25 } !white_list # This applies to all incoming IPs, except for those defined in the whitelist.
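
The decision this rule makes can be sketched in a few lines of Python. This is only an illustration of the logic: the `should_deny` function and the whitelist entries are assumptions for the example, and the request rate is passed in rather than measured.

```python
import ipaddress

# Hypothetical whitelist; the second network is an example, not from the config above.
WHITE_LIST = [
    ipaddress.ip_network("192.168.1.1/32"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def should_deny(src, request_rate, threshold=25, white_list=WHITE_LIST):
    """Return True if a request from `src` should receive a 429."""
    address = ipaddress.ip_address(src)
    if any(address in network for network in white_list):
        return False  # whitelisted sources are never rate-limited
    return request_rate > threshold  # 'gt' is a strict comparison


whitelisted = should_deny("192.168.1.1", request_rate=1000)
over_limit = should_deny("203.0.113.7", request_rate=26)
at_limit = should_deny("203.0.113.7", request_rate=25)
```

Note that a client at exactly 25 requests is still served, because the rule only fires when the rate is greater than the threshold.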

1.7. Limiting the number of open connections

The conn_cur counter can be used to count the number of open connections from an IP address. If a user has too many connections open, we can reject any further connections. The syntax is similar to the previous examples.

stick-table type ip size 1m expire 10s store conn_cur # Define a stick table.
tcp-request content track-sc0 src
tcp-request content reject if { sc_conn_cur(0) gt 10 } # The parameter `0` in sc_conn_cur refers to the stick counter number.

By using tcp-request instead of http-request, we do not evaluate HTTP headers in the packet, making the processing more efficient.
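
Unlike a request rate, conn_cur goes down again when a connection closes. The Python sketch below models that behaviour under simple assumptions; the `ConnectionLimiter` class and its methods are invented for this example and are not part of HAProxy.

```python
from collections import Counter


class ConnectionLimiter:
    """Reject new connections from a source that already has too many open."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.open = Counter()  # source address -> currently open connections

    def connect(self, src):
        # Count the new connection first, then compare, so the connection
        # being evaluated is included in the total.
        self.open[src] += 1
        if self.open[src] > self.max_connections:
            self.open[src] -= 1  # rejected connections do not stay open
            return False
        return True

    def close(self, src):
        if self.open[src] > 0:
            self.open[src] -= 1


limiter = ConnectionLimiter(max_connections=2)
attempts = [limiter.connect("10.0.0.1") for _ in range(3)]
limiter.close("10.0.0.1")
after_close = limiter.connect("10.0.0.1")
```

The third simultaneous connection is rejected, but once one of the earlier connections closes, the same client can connect again.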

HAProxy provides additional counters for measuring the error rate of a site (HTTP requests that have a 4xx status code). The bytes_out_rate counter can be used to track content that is generating the most traffic for your site. It is also possible to create custom statistics using the general-purpose counter sc_inc_gpc0.

HAProxy Enterprise has features that allow individual increments across all peer nodes. This approach is better for detecting DDoS attacks.

2. Rate-limiting in Nginx

Nginx supports various rate-limiting schemes. In the example below, we'll set up two-stage rate-limiting.

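Two-stage rate-limiting throttles a request before blocking it. To implement this, we first define the limit_req_zone in the Nginx configuration file nginx.conf. The first argument is the key that requests are counted against; here it is the client address, so each client gets its own limit.

limit_req_zone $binary_remote_addr zone=two_stage_limit_store:10m rate=5r/s;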

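To limit only certain types of requests, we can compute the key with a map block and store it in a variable such as $limit. Requests whose key is an empty string are not counted, so only requests that map to a non-empty key are rate-limited. For example, to rate-limit only POST calls, we can write: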

map $request_method $limit {
    default         '';
    POST            <IP of your server>;
}

limit_req_zone $limit zone=two_stage_limit_store:10m rate=5r/s;
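
The effect of the map block can be sketched in Python as follows. The function names are invented for this example, and 203.0.113.10 is a placeholder standing in for the `<IP of your server>` value above.

```python
from collections import Counter


def limit_key(method, post_key="203.0.113.10"):
    """Mimic the map block: POST gets a key, everything else an empty string."""
    return post_key if method == "POST" else ""


def count_limited(methods):
    """Count requests per key, skipping requests whose key is empty."""
    counts = Counter()
    for method in methods:
        key = limit_key(method)
        if key:  # an empty key means the request is not rate-limited
            counts[key] += 1
    return counts


counts = count_limited(["GET", "POST", "POST", "HEAD"])
```

Because every POST request maps to the same constant key, all POST requests share one limit in this setup, while GET and HEAD requests are never counted.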

Next, we reference the above limit in our server configuration. If the server hostname is "example.com," this file will be located at /etc/nginx/sites-enabled/example.com.conf. We first specify the response code for requests exceeding our rate-limit in the server section.

server {
    ...
    limit_req_status 429;
    ...
}

The error code 429 corresponds to "Too Many Requests."

Next, we define the location(s) we would like to rate-limit (within the server section). To rate-limit a specific endpoint, we can add:

location /token/abc {
    limit_req zone=two_stage_limit_store burst=25 delay=20;
    ...
}

Note: If we replace /token/abc with /, we will rate-limit all endpoints on our server.

Here, we reference the limit zone defined earlier in the nginx.conf file. The next two parameters are our rate-limit parameters:

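burst specifies how many requests a client may make in excess of the configured rate (5 requests per second here) before further requests are rejected.
delay specifies how many of those excess requests are forwarded without delay; requests beyond that count are throttled to the configured rate.

To summarize, the above config enforces a sustained rate of 5 requests per second and tolerates bursts of up to 25 requests. In a burst, requests 1 to 20 will reach the server without any delay. Requests 21 to 25 will be throttled so that the rate of 5 requests per second is not exceeded, and any subsequent requests will be rejected with a 429 status code.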
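
The following Python sketch simulates this behaviour with a simple leaky bucket. It is a simplified model written for this post, not Nginx's actual accounting, which may differ slightly at the boundaries; the `two_stage` function and its parameters are names chosen for the example.

```python
def two_stage(arrival_times, rate=5.0, burst=25, delay=20):
    """Classify each request as 'pass', 'delay', or 'reject'.

    arrival_times: request timestamps in seconds, in ascending order.
    """
    level = 0.0  # requests currently held in the bucket
    last = None
    outcomes = []
    for now in arrival_times:
        if last is not None:
            # The bucket drains at `rate` requests per second.
            level = max(0.0, level - rate * (now - last))
        last = now
        if level + 1 > burst:
            outcomes.append("reject")  # the burst allowance is used up
            continue
        level += 1
        outcomes.append("pass" if level <= delay else "delay")
    return outcomes


burst_outcomes = two_stage([0.0] * 30)
later_outcomes = two_stage([0.0] * 30 + [1.0])
```

For 30 requests arriving at the same instant, the model passes the first 20, delays the next 5, and rejects the last 5. One second later the bucket has drained by 5 requests, so a new request is accepted again, although it is still delayed.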


3. Configuration in Puppet

The configurations below reproduce the setup shown above in Puppet, with minor differences in values such as the request threshold. It is assumed that the required services are already installed on the system.

3.1. HAProxy configuration

# HAProxy Configuration

$ipv4s_list = ... # List of IPs.

haproxy::frontend { "my_example_site.com":
    bind    => {
        '*:80'   => [],
        ':::80'  => [],
        # SSL termination for rate-limiting.
        '*:443'  => "ssl crt ${certificate_path}",
        ':::443' => "ssl crt ${certificate_path}",
    },
    options => {
        'mode'            => 'http',
        'redirect'        => 'scheme https code 301 if !{ ssl_fc }',
        'stick-table'     => 'type ipv6 size 100k expire 30s store http_req_rate(1s)',
        'acl'             => "white_list src ${join($ipv4s_list, ' ')}",
        'http-request'    => [
            'track-sc0 src',
            # This setting allows 45 requests per second.
            'deny deny_status 429 if { sc_http_req_rate(0) gt 45 } !white_list',
        ],
        'default_backend' => "my_example_site.com_backend",
    },
}

haproxy::backend { "my_example_site.com_backend":
    options          => {
        'balance'        => 'roundrobin',
        'mode'           => 'http',
        'default-server' => 'inter 2s fall 2 rise 2',
        'option'         => [
            # Single quotes keep the \r\n sequence literal so that HAProxy receives it unchanged.
            'httpchk HEAD /status HTTP/1.1\r\nHost:\ www.example.com',
        ],
        'stick'          => 'on src',
        'stick-table'    => "type ip size 20k peers sct_my_example_site",
    },
}

3.2. Nginx configuration

Similarly, for Nginx, we define the same configuration as above.

# Two-stage rate limiting for all nodes, see: https://www.nginx.com/blog/rate-limiting-nginx/#Two-Stage-Rate-Limiting
nginx::resource::location { 'rate limit':
    ensure                => present,
    ssl                   => true,
    ssl_only              => true,
    location              => '/',
    server                => $facts['networking']['fqdn'],
    limit_zone            => 'two_stage_limit_store burst=25 delay=20',
    proxy                 => "http://my_example_site.com:8080",
    proxy_read_timeout    => '90s',
    proxy_connect_timeout => '90s',
    proxy_send_timeout    => '90s',
    proxy_set_header      => $proxy_set_header,
}

class { 'nginx':
    # Override the default Nginx log format.
    log_format              =>
    {
        'json_combined' => 'escape=json'
        '{"data": {'
            '"time_local":"${time_local}",'
            '"remote_addr":"${remote_addr}",'
            '"remote_user":"${remote_user}",'
            '"request":"${request}",'
            '"status": "${status}",'
            '"body_bytes_sent":"${body_bytes_sent}",'
            '"request_time":"${request_time}",'
            '"http_referrer":"${http_referer}",'
            '"http_user_agent":"${http_user_agent}"'
        '}}',
    },
    limit_req_zone          => "${facts['networking']['ip']} zone=two_stage_limit
