
Nginx tuning

The default PutFS docker-compose.yml includes a tuned nginx config for blob storage workloads. This page explains each directive and additional options for high-concurrency deployments.

Static file serving

sendfile on;
tcp_nopush on;
tcp_nodelay on;
  • sendfile on – serves static files via the kernel's sendfile() syscall: data is copied kernel-side from the file descriptor to the socket, with no round trip through userspace buffers. This is the single most important directive for static file performance.
  • tcp_nopush on – batches response headers and the first chunk of file data into a single TCP packet (sets TCP_CORK). Reduces packet count, especially for small files where the entire response fits in one packet.
  • tcp_nodelay on – disables Nagle's algorithm after tcp_nopush sends the initial batch. Ensures the final chunk of a response is sent immediately without waiting to fill a full TCP segment. The combination of tcp_nopush + tcp_nodelay gives you batched headers with no trailing delay.

Connection handling

keepalive_timeout 65;
keepalive_requests 1000;
reset_timedout_connection on;
  • keepalive_timeout 65 – holds idle connections open for 65 seconds. Clients making multiple requests (e.g. listing then downloading) reuse the same TCP connection, avoiding the overhead of a new handshake per request.
  • keepalive_requests 1000 – allows up to 1000 requests per keepalive connection before forcing a reconnect. Prevents long-lived connections from accumulating state. The default is 1000; explicit is defensive.
  • reset_timedout_connection on – sends a TCP RST instead of a graceful FIN for timed-out connections. The socket and its buffers are freed immediately, rather than lingering in the FIN-WAIT/TIME-WAIT states while the close completes. Important under load when connection slots are scarce.

Upload handling

client_max_body_size 0;
proxy_request_buffering off;
  • client_max_body_size 0 – disables the upload size limit entirely. For blob storage, any artificial limit is wrong – enforce quotas at the filesystem level instead (e.g. zfs set quota on ZFS).
  • proxy_request_buffering off – streams the request body directly to the API without buffering to a temp file on disk. Without this, nginx writes every upload to /tmp before forwarding it, doubling disk I/O and adding latency. Auth is evaluated before the body is read, so unauthenticated uploads are rejected at the header stage.
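If a specific prefix does need an nginx-level cap, client_max_body_size can be overridden per location while the server-level 0 keeps everything else unlimited. A sketch, with a hypothetical /public/ path and an illustrative 10 GB limit:

```nginx
# Sketch: cap uploads under a hypothetical /public/ prefix.
# The server-level "client_max_body_size 0;" still applies elsewhere.
location /public/ {
    client_max_body_size 10g;
}
```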

Logging

access_log off;
log_not_found off;
  • access_log off – disables access logging entirely. This is critical for performance: at high concurrency, writing a log line per request to stdout or a file causes blocking I/O that dominates latency. In our benchmarks, access_log /dev/stdout reduced throughput from 104K req/s to 3.8K req/s – a 27x penalty. If you need access logs, write to a buffered file with access_log /var/log/nginx/access.log buffer=64k flush=5s; instead.
  • log_not_found off – suppresses error log entries for 404 responses. Since PutFS uses try_files $uri =404, missing files are normal operation (e.g. a file that was deleted, or a path that only exists via the API). Without this, the error log fills with noise.
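A middle ground, if disabling access logs entirely is too blunt: access_log supports a conditional if= parameter, so error responses stay visible while successful blob reads go unlogged. A sketch (log path and buffer/flush values illustrative):

```nginx
# Log only non-2xx/3xx responses; successful requests are skipped.
# The map lives in the http context.
map $status $loggable {
    ~^[23]  0;
    default 1;
}
access_log /var/log/nginx/errors_only.log combined buffer=64k flush=5s if=$loggable;
```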

High-concurrency settings

For deployments expecting thousands of concurrent connections, add these to the top-level config:

worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    multi_accept on;
}
  • worker_rlimit_nofile 65535 – raises the file descriptor limit per worker process. Each connection + each static file being served consumes a file descriptor. The default (~1024) is too low for high concurrency. Requires the host's ulimit -n to be at least as high.
  • worker_connections 4096 – maximum simultaneous connections per worker. With worker_processes auto (one per CPU core), total capacity is cores × 4096. Default is 1024.
  • multi_accept on – accept all pending connections in one event loop iteration instead of one at a time. Reduces latency at high connection rates. Minor effect at low concurrency.
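worker_rlimit_nofile only helps if the host actually grants the limit. A quick shell check – raising the limit is shown as a comment, since it may require root or a docker-compose ulimits: entry:

```shell
# Show the current soft file-descriptor limit for this session.
# nginx workers inherit the master's limit unless worker_rlimit_nofile
# overrides it, and the override cannot exceed the hard limit.
ulimit -n
# Raise it for the current shell if the hard limit allows:
# ulimit -n 65535
```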

Note

use epoll is unnecessary on Linux – nginx uses epoll by default.

Async I/O for large files

aio threads;
directio 8m;
  • aio threads – reads large files using a thread pool instead of blocking the worker process. Without this, a worker serving a multi-GB file blocks until the read completes – stalling all other connections on that worker. Requires nginx compiled with --with-threads (standard on most distros).
  • directio 8m – files >= 8MB bypass the OS page cache and use direct I/O, which is then handled by aio threads. Files < 8MB continue to use sendfile (fast path via kernel zero-copy). This gives us the best of both: sendfile for small files, non-blocking AIO for large ones.

Add these inside the location / block that serves static files:

location / {
    sendfile on;
    aio threads;
    directio 8m;
    # ...
}

Note

directio disables sendfile for affected requests automatically – there's no conflict. nginx picks the right path per request based on file size.
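aio threads uses a thread pool named default unless told otherwise. If large-file reads queue up, the pool can be sized explicitly at the top level of the config; the values below are nginx's own defaults, shown as a starting point rather than a tuned recommendation:

```nginx
# Optional: explicitly size the thread pool behind "aio threads".
# threads=32 max_queue=65536 are the built-in defaults.
thread_pool default threads=32 max_queue=65536;
```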

File descriptor cache

open_file_cache max=10000 inactive=60s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;

This caches nginx's own stat() and open() results (file descriptors, sizes, mtimes) – not file content (that's the OS page cache / ZFS ARC). It saves two syscalls per request for files that are read repeatedly. At very high req/s on a hot working set, this adds up.

For most PutFS workloads (write-once, read a few times) this won't make a noticeable difference – the kernel's dentry/inode cache already handles metadata lookups efficiently, and ZFS ARC caches both data and metadata. Consider enabling it only if you see high stat()/open() syscall counts under load (check with strace -c or perf top).

Warning

Cached metadata can serve stale Content-Length or Last-Modified for up to open_file_cache_valid seconds after a file changes. Fine for WORM workloads, potentially surprising otherwise.

Gzip compression

gzip on;
gzip_types text/plain application/json text/css application/javascript;
gzip_min_length 1024;
gzip_proxied any;
  • gzip on – compresses responses for clients that support it. Most PutFS traffic is binary blobs (images, PDFs, archives) that don't compress further – but listing responses (trailing slash) return plain text key names that compress well, especially for buckets with thousands of objects.
  • gzip_types – only compress text-based content types. Binary blobs are skipped entirely, avoiding wasted CPU.
  • gzip_min_length 1024 – don't bother compressing responses under 1KB. The gzip overhead isn't worth it for tiny payloads.
  • gzip_proxied any – compress responses from the API backend (proxied requests), not just static files.
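Two optional companions if gzip is enabled: a low compression level (listing responses are highly redundant, so the cheapest level captures most of the win) and a Vary header so intermediate caches don't serve a gzipped body to a client that didn't ask for one. Both are standard nginx directives; the level choice is a judgment call:

```nginx
gzip_comp_level 1;  # cheapest level; higher levels cost CPU for modest gains
gzip_vary on;       # adds "Vary: Accept-Encoding" for correct cache keying
```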

Note

If you use ZFS compression=zstd, data is decompressed transparently on read before nginx serves it. Enabling gzip here re-compresses text content for the wire – one decompression on read plus a gzip pass per response – but the disk read is still small (ZFS ARC caches compressed blocks) and the client gets a smaller download.

Upstream keepalive

Keep connections to the PutFS API alive between requests:

upstream api {
    server api:5000;
    keepalive 32;
}
  • keepalive 32 – maintain a pool of 32 idle connections to the API backend. Without this, nginx opens a new TCP connection for every proxied request (PUT, DELETE, LIST). The pool size should match your expected concurrency to the API.

When using upstream keepalive, the proxied requests need:

proxy_http_version 1.1;
proxy_set_header Connection "";

This switches from HTTP/1.0 (which closes after each request) to HTTP/1.1 with persistent connections.
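Putting the pieces together, a proxied location might look like this. The /api/ path is illustrative; the upstream name api matches the upstream block above:

```nginx
location /api/ {
    proxy_pass http://api;
    proxy_http_version 1.1;         # persistent connections require HTTP/1.1
    proxy_set_header Connection ""; # clear the default "Connection: close"
}
```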

Auth: map vs auth_request

PutFS defaults to map-based auth – two map directives load API keys and path/method scopes into nginx hash tables at startup. Authentication ($key_ok) and authorisation ($auth_ok) are pure in-memory O(1) lookups with zero I/O.

include /keys/keys.conf;

server {
    if ($key_ok != "1") { return 401; }
    if ($auth_ok != "1") { return 403; }
    # ...
}

The alternative is auth_request, which fires a subrequest per connection to an external auth service. Use auth_request when you need dynamic auth logic (OAuth, token validation). For static API key auth, the map approach is dramatically faster.
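An illustrative shape for such a map – this is not the actual PutFS keys.conf format (see Auth for that), and the X-Api-Key header name and example key are assumptions:

```nginx
# Illustrative only: a hash-table lookup from a request header to a flag.
# Real PutFS key files are generated; see the Auth page for the format.
map $http_x_api_key $key_ok {
    default           0;
    "example-key-123" 1;
}
```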

If is not evil here

The nginx wiki warns against if inside location blocks. The if ($key_ok != "1") { return 401; } pattern at the server level with return only is explicitly safe – return doesn't interact with content handlers.

Key management requires nginx -s reload after changes to keys.conf. See Auth for the key file format.

Timeouts

client_body_timeout 300s;
proxy_read_timeout 300s;
  • client_body_timeout 300s – time to wait between successive reads of the request body from the client. The default (60s) will time out multi-GB uploads on slow connections.
  • proxy_read_timeout 300s – time to wait for the API to send a response. Large listing operations or slow disk reads can exceed the default 60s.
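Depending on traffic, two related timeouts may also need raising. The 300s values mirror those above and are illustrative:

```nginx
proxy_send_timeout 300s;  # gap allowed between writes of the request to the API
send_timeout 300s;        # gap allowed between writes of the response to the client
```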

Unix socket

A Unix socket eliminates TCP loopback overhead entirely – no port allocation, no SYN/ACK, no TIME_WAIT accumulation.

Granian

granian --interface asgi --uds /run/putfs/putfs.sock \
    --workers 4 --runtime-mode mt --loop uvloop \
    --backpressure 16 --http 1 --no-ws \
    putfs.api:app

Nginx upstream

upstream api {
    server unix:/run/putfs/putfs.sock;
    keepalive 32;
}

Docker

For docker-compose, Unix sockets require a shared volume between the api and nginx containers:

services:
  api:
    volumes:
      - putfs_run:/run/putfs
    command: >-
      granian --interface asgi --uds /run/putfs/putfs.sock
      --workers 4 --runtime-mode mt --loop uvloop
      putfs.api:app

  nginx:
    volumes:
      - putfs_run:/run/putfs:ro

volumes:
  putfs_run:

Further reading