
Granian tuning

Granian is the ASGI server that runs the PutFS API. For most deployments, the defaults work well. This page covers the settings that matter for blob storage workloads – primarily streaming uploads (PUT).

Note

Reads (GET/HEAD) are served by nginx via sendfile and never reach granian. Tuning granian only affects write throughput.

Workers

--workers 4

Number of worker processes. Each worker runs an independent Python interpreter. As a rule of thumb, match the number of CPU cores and never exceed 2x cores. Every additional worker costs memory and adds context-switching overhead.

CPU cores   Workers
2           2
4           4
8           4–8
16+         8–16

PutFS's API is I/O-bound (streaming to disk), not CPU-bound. Going beyond the core count adds process overhead without improving throughput.
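The table above can be expressed as a small helper. This is only a sketch: the name `suggest_workers` is hypothetical and not part of PutFS or granian, and core counts that the table does not list (for example 6 or 12) are interpolated by assumption.

```python
import os


def suggest_workers(cpu_cores: int) -> tuple[int, int]:
    """Return a (low, high) worker range following the table above.

    Hypothetical helper: core counts not listed in the table are
    interpolated, which is an assumption rather than PutFS guidance.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if cpu_cores <= 4:
        # Small machines: one worker per core.
        return cpu_cores, cpu_cores
    # Larger machines: from half the cores (at least 4, at most 8)
    # up to one worker per core, capped at 16.
    low = min(max(cpu_cores // 2, 4), 8)
    high = min(cpu_cores, 16)
    return low, high


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    low, high = suggest_workers(cores)
    print(f"{cores} cores: try --workers between {low} and {high}")
```

Starting at the low end of the range and raising it only if uploads queue is consistent with the I/O-bound reasoning above.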

Runtime mode

--runtime-mode mt
  • mt (multi-threaded) – each worker runs a single multi-threaded Rust async runtime shared by its runtime threads. More efficient on multi-core systems. Recommended for PutFS.
  • st (single-threaded) – each runtime thread gets its own single-threaded runtime. More efficient with very few workers (1–2).

Event loop

--loop uvloop

uvloop is a Cython-based event loop. In our benchmarks, uvloop has ~20% lower per-request latency than rloop (Rust-based) on small writes (1K–1M). rloop has higher throughput on large sequential writes (10G+), but that difference disappears behind real network I/O. An rloop image can be built with: docker build --target rloop .

Backpressure

--backpressure 128

Maximum concurrent requests per worker. For streaming uploads, each PUT is a long-lived connection, so a higher limit lets more uploads run simultaneously. When the flag is not set, granian derives the value from backlog / workers; the PutFS image sets it to 128.

Workload             Backpressure
Few large uploads    16–32
Many small uploads   128–256
Mixed                128 (default in PutFS)

Setting this too high wastes memory (each in-flight request holds a buffer). Setting it too low causes request queuing.
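The arithmetic behind these numbers is easy to check. The sketch below assumes granian's default backlog is 1024 and that the division is rounded down; both are assumptions to confirm against `granian --help` for your version. The function names are hypothetical.

```python
def default_backpressure(backlog: int, workers: int) -> int:
    """Per-worker limit when --backpressure is not set (backlog / workers).

    Integer division is an assumption about how granian rounds.
    """
    if backlog < 1 or workers < 1:
        raise ValueError("backlog and workers must be positive")
    return max(1, backlog // workers)


def max_in_flight(workers: int, backpressure: int) -> int:
    """Total requests the whole server processes concurrently."""
    return workers * backpressure


# Assumed granian default backlog of 1024, with 4 workers.
print(default_backpressure(1024, 4))  # 256
# PutFS settings: 4 workers, backpressure 128.
print(max_in_flight(4, 128))  # 512
```

With the PutFS settings, up to 512 uploads can be in flight across the server before further requests wait in the backlog.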

HTTP/1 buffer size

--http1-buffer-size 10485760

Maximum buffer size for HTTP/1.1 connections, in bytes. Granian's default is about 408 KB. PutFS sets this to 10 MB (10485760 bytes) to match the streaming chunk size: fewer read syscalls per upload chunk means less overhead per PUT.
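Because each in-flight request can hold a buffer, the buffer size interacts with workers and backpressure. The sketch below estimates a worst-case ceiling, under the assumption that every in-flight connection grows its buffer to the configured maximum at the same time. Real usage is normally well below this; the function name is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# The value passed to --http1-buffer-size above.
PUTFS_HTTP1_BUFFER = 10 * MIB  # 10485760 bytes


def worst_case_buffer_bytes(workers: int, backpressure: int, buffer_size: int) -> int:
    """Upper bound on buffer memory if every in-flight request fills its buffer."""
    return workers * backpressure * buffer_size


ceiling = worst_case_buffer_bytes(workers=4, backpressure=128, buffer_size=PUTFS_HTTP1_BUFFER)
print(f"{ceiling / GIB:.1f} GiB")  # 5.0 GiB
```

If that ceiling is uncomfortable for the host's memory, lowering backpressure reduces it linearly.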

Protocol

--http 1 --no-ws
  • --http 1 – HTTP/1.1 only. HTTP/2 adds framing overhead for no benefit since nginx proxies via HTTP/1.1 anyway.
  • --no-ws – disable WebSocket support. PutFS doesn't use WebSockets.

Unix socket

The default Docker image uses a Unix socket at /run/putfs/putfs.sock. Override via environment:

GRANIAN_UDS=/run/putfs/putfs.sock
GRANIAN_UDS_PERMISSIONS=666

To use TCP instead (not recommended; slower than the Unix socket):

GRANIAN_HOST=127.0.0.1
GRANIAN_PORT=5000
# unset GRANIAN_UDS
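When debugging a socket setup, it helps to confirm that the path really is a Unix socket with the expected permissions. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and nothing here is part of PutFS.

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to stat it.
        return False
    return stat.S_ISSOCK(mode)


def socket_permissions(path: str) -> int:
    """Return the permission bits of `path`, e.g. 0o666."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    sock = os.environ.get("GRANIAN_UDS", "/run/putfs/putfs.sock")
    if is_unix_socket(sock):
        print(f"{sock}: socket, mode {socket_permissions(sock):o}")
    else:
        print(f"{sock}: not a Unix socket (is granian running?)")
```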

Environment variables

All granian settings can be configured via GRANIAN_-prefixed environment variables. The Docker image sets sensible defaults; override only what you need:

GRANIAN_INTERFACE=asgi          # don't change
GRANIAN_UDS=/run/putfs/putfs.sock
GRANIAN_UDS_PERMISSIONS=666
GRANIAN_WORKERS=4               # match CPU cores
GRANIAN_RUNTIME_MODE=mt
GRANIAN_LOOP=uvloop
GRANIAN_BACKPRESSURE=128
GRANIAN_HTTP1_BUFFER_SIZE=10485760
GRANIAN_HTTP=1
GRANIAN_NO_WS=true
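For scripted deployments, the "override only what you need" pattern can be sketched as a dictionary merge. The defaults below are copied from the list above; the helper name `granian_env` is hypothetical, and it simply upper-cases keyword names into GRANIAN_ variables without checking that granian recognizes them.

```python
# Defaults copied from the environment list above.
PUTFS_GRANIAN_DEFAULTS = {
    "GRANIAN_INTERFACE": "asgi",
    "GRANIAN_UDS": "/run/putfs/putfs.sock",
    "GRANIAN_UDS_PERMISSIONS": "666",
    "GRANIAN_WORKERS": "4",
    "GRANIAN_RUNTIME_MODE": "mt",
    "GRANIAN_LOOP": "uvloop",
    "GRANIAN_BACKPRESSURE": "128",
    "GRANIAN_HTTP1_BUFFER_SIZE": "10485760",
    "GRANIAN_HTTP": "1",
    "GRANIAN_NO_WS": "true",
}


def granian_env(**overrides: object) -> dict[str, str]:
    """Return the defaults with selected settings overridden.

    Keyword names map to variables: workers=8 -> GRANIAN_WORKERS=8.
    The name is not validated against granian's real option list.
    """
    env = dict(PUTFS_GRANIAN_DEFAULTS)  # copy, so the defaults stay untouched
    for name, value in overrides.items():
        env[f"GRANIAN_{name.upper()}"] = str(value)
    return env


# Example: an 8-core host serving a few large uploads.
env = granian_env(workers=8, backpressure=32)
print(env["GRANIAN_WORKERS"], env["GRANIAN_BACKPRESSURE"])  # 8 32
```

The result can be merged over `os.environ` and passed as `env=` to `subprocess.run` when launching granian; the application target to pass on the command line depends on your PutFS installation and is not shown here.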

What not to change

  • --blocking-threads – only supported for WSGI, not ASGI. Granian will error on startup.
  • --http 2 – adds framing overhead, breaks nginx sendfile for reads, no benefit for streaming uploads.
  • --runtime-mode st – less efficient on multi-core systems. Only useful with 1–2 workers.
  • --workers 1 --threads 16 with Python 3.14t – we tested free-threaded Python and it was slower: considerably higher write latency and substantially more memory due to thread-safe allocator overhead. The GIL is not the bottleneck: PutFS is I/O-bound and the GIL is released during I/O.
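The first three items above can be checked mechanically before a deployment. This is a hedged sketch, not a PutFS tool: `check_settings` is a hypothetical name, and the variable name GRANIAN_BLOCKING_THREADS is assumed from the GRANIAN_ prefix convention rather than confirmed.

```python
def check_settings(env: dict[str, str]) -> list[str]:
    """Return warnings for settings this page advises against."""
    warnings = []
    # Assumed variable name, derived from the --blocking-threads flag.
    if "GRANIAN_BLOCKING_THREADS" in env:
        warnings.append("--blocking-threads is not supported for ASGI")
    if env.get("GRANIAN_HTTP") == "2":
        warnings.append("--http 2 adds framing overhead with no benefit behind nginx")
    workers = int(env.get("GRANIAN_WORKERS", "1"))
    if env.get("GRANIAN_RUNTIME_MODE") == "st" and workers > 2:
        warnings.append("--runtime-mode st is only useful with 1-2 workers")
    return warnings


for message in check_settings({"GRANIAN_WORKERS": "8", "GRANIAN_RUNTIME_MODE": "st"}):
    print("warning:", message)
```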

Further reading