Granian tuning

Granian is the ASGI server that runs the PutFS API. For most deployments, the defaults work well. This page covers the settings that matter for blob storage workloads – primarily streaming uploads (PUT).

Note

Reads (GET/HEAD) are served by nginx via sendfile and never reach Granian. Tuning Granian only affects write throughput.

Workers

--workers 4

Number of worker processes. Each worker runs an independent Python interpreter. As a general rule, match the number of CPU cores and never exceed 2x cores: every extra worker costs memory and adds context-switching overhead.

CPU cores	Workers
2	2
4	4
8	4–8
16+	8–16

PutFS's API is I/O-bound (streaming to disk), not CPU-bound. Going beyond the core count adds process overhead without improving throughput.
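The sizing table can be folded into a small helper, for example in a deployment script that sets GRANIAN_WORKERS. This is only a sketch: recommended_workers is a hypothetical name, and where the table gives a range it takes the upper end (one worker per core), capped at 16. Note that os.cpu_count() reports the host's CPUs and can overstate what a CPU-limited container actually gets.

```python
import os


def recommended_workers(cpu_cores: int) -> int:
    """Pick a worker count from the sizing table above.

    Hypothetical helper: where the table gives a range, this takes the
    upper end (one worker per core) and caps the result at 16.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return min(cpu_cores, 16)


if __name__ == "__main__":
    # os.cpu_count() may return None, so fall back to a single core.
    cores = os.cpu_count() or 1
    print(f"GRANIAN_WORKERS={recommended_workers(cores)}")
```

On an 8-core host this selects 8 workers; pick a lower value by hand if memory is tight.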

Runtime mode

--runtime-mode mt

mt (multi-threaded) – each worker runs a single multi-threaded Rust runtime shared by all of its runtime threads. Scales better on multi-core systems. Recommended for PutFS.
st (single-threaded) – each runtime thread gets its own single-threaded Rust runtime. More efficient with very few workers (1–2).

Event loop

--loop uvloop

uvloop is a Cython-based event loop. In our benchmarks, uvloop has ~20% lower per-request latency than rloop (Rust-based) on small writes (1K–1M). rloop has higher throughput on large sequential writes (10G+), but that difference disappears behind real network I/O. An rloop image can be built with: docker build --target rloop .

Backpressure

--backpressure 128

Maximum number of concurrent requests per worker. For streaming uploads, each PUT is a long-lived connection, so a higher limit lets more uploads run simultaneously. If unset, Granian derives the default as backlog / workers.

Workload	Backpressure
Few large uploads	16–32
Many small uploads	128–256
Mixed	128 (default in PutFS)

Setting this too high wastes memory (each in-flight request holds a buffer). Setting it too low causes request queuing.
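To make the trade-off concrete, the sketch below computes the derived per-worker default and the total number of requests that can be in flight. It assumes Granian's default backlog of 1024 and that the division rounds down; both function names are illustrative and not part of Granian.

```python
def default_backpressure(workers: int, backlog: int = 1024) -> int:
    """Per-worker limit used when --backpressure is not set.

    Assumptions: the backlog defaults to 1024 and the division rounds down.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return max(1, backlog // workers)


def total_in_flight(workers: int, backpressure: int) -> int:
    """Upper bound on requests processed concurrently across all workers."""
    return workers * backpressure


if __name__ == "__main__":
    print(default_backpressure(workers=4))                # derived default per worker
    print(total_in_flight(workers=4, backpressure=128))   # with the PutFS setting
```

Under these assumptions, 4 workers would get a derived default of 256 each, while the PutFS setting of 128 allows up to 512 concurrent requests in total.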

HTTP/1 buffer size

--http1-buffer-size 10485760

Maximum buffer size for HTTP/1.1 connections, in bytes. Granian's default is about 408 KiB (417792 bytes). PutFS sets this to 10 MiB (10485760 bytes) to match the streaming chunk size – fewer read syscalls per upload chunk means less overhead per PUT.
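Since every in-flight request can hold a buffer of up to this size, a rough ceiling on buffer memory is workers * backpressure * buffer size. The sketch below computes that ceiling. It is a deliberately pessimistic assumption (every request filling its buffer at the same time), not a measurement, and buffer_memory_ceiling is a hypothetical helper.

```python
def buffer_memory_ceiling(workers: int, backpressure: int, buffer_size: int) -> int:
    """Worst-case bytes held in HTTP/1 buffers across all workers.

    Assumes every in-flight request fills its buffer to the maximum at once,
    which real workloads are unlikely to do.
    """
    return workers * backpressure * buffer_size


if __name__ == "__main__":
    # Values from this page: 4 workers, backpressure 128, 10485760-byte buffers.
    ceiling = buffer_memory_ceiling(workers=4, backpressure=128, buffer_size=10485760)
    print(f"{ceiling / 2**30:.1f} GiB")
```

With the values used on this page the ceiling works out to 5 GiB, which is worth comparing against the container's memory limit before raising either backpressure or the buffer size.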

Protocol

--http 1 --no-ws

--http 1 – HTTP/1.1 only. HTTP/2 adds framing overhead for no benefit since nginx proxies via HTTP/1.1 anyway.
--no-ws – disable WebSocket support. PutFS doesn't use WebSockets.

Unix socket

The default Docker image uses a Unix socket at /run/putfs/putfs.sock. Override via environment:

GRANIAN_UDS=/run/putfs/putfs.sock
GRANIAN_UDS_PERMISSIONS=666

To use TCP instead (not recommended; slower than the Unix socket):

GRANIAN_HOST=127.0.0.1
GRANIAN_PORT=5000
# unset GRANIAN_UDS
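After changing the socket path, a quick probe can confirm that something is actually listening on it. This is a minimal sketch using only the Python standard library; uds_accepts_connections is a hypothetical helper, and the path in the example is the default from this page.

```python
import socket


def uds_accepts_connections(path: str, timeout: float = 1.0) -> bool:
    """Return True if a server accepts connections on the Unix socket at path."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # Missing file, wrong permissions, or nothing listening.
            return False
        return True


if __name__ == "__main__":
    print(uds_accepts_connections("/run/putfs/putfs.sock"))
```

A False result from the nginx container usually points at the socket path or its permissions rather than at Granian itself, so check GRANIAN_UDS_PERMISSIONS and the shared volume first.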

Environment variables

All Granian settings can be configured via GRANIAN_-prefixed environment variables. The Docker image sets sensible defaults – override only what you need:

GRANIAN_INTERFACE=asgi          # don't change
GRANIAN_UDS=/run/putfs/putfs.sock
GRANIAN_UDS_PERMISSIONS=666
GRANIAN_WORKERS=4               # match CPU cores
GRANIAN_RUNTIME_MODE=mt
GRANIAN_LOOP=uvloop
GRANIAN_BACKPRESSURE=128
GRANIAN_HTTP1_BUFFER_SIZE=10485760
GRANIAN_HTTP=1
GRANIAN_NO_WS=true
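One way to follow the "override only what you need" advice is to compare a desired configuration against the image defaults and keep only the differences. The sketch below does that; IMAGE_DEFAULTS is copied from the listing above, and minimal_overrides is a hypothetical helper, not something shipped with PutFS.

```python
# Defaults copied from the listing above.
IMAGE_DEFAULTS = {
    "GRANIAN_INTERFACE": "asgi",
    "GRANIAN_UDS": "/run/putfs/putfs.sock",
    "GRANIAN_UDS_PERMISSIONS": "666",
    "GRANIAN_WORKERS": "4",
    "GRANIAN_RUNTIME_MODE": "mt",
    "GRANIAN_LOOP": "uvloop",
    "GRANIAN_BACKPRESSURE": "128",
    "GRANIAN_HTTP1_BUFFER_SIZE": "10485760",
    "GRANIAN_HTTP": "1",
    "GRANIAN_NO_WS": "true",
}


def minimal_overrides(desired: dict[str, str]) -> dict[str, str]:
    """Keep only the settings whose value differs from the image default."""
    if desired.get("GRANIAN_INTERFACE", "asgi") != "asgi":
        raise ValueError("GRANIAN_INTERFACE must stay 'asgi'")
    return {k: v for k, v in desired.items() if IMAGE_DEFAULTS.get(k) != v}


if __name__ == "__main__":
    wanted = {"GRANIAN_WORKERS": "8", "GRANIAN_LOOP": "uvloop"}
    for key, value in minimal_overrides(wanted).items():
        print(f"{key}={value}")  # lines suitable for an env file
```

The printed lines can be saved to a file and passed to the container with docker run --env-file.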

What not to change

--blocking-threads – values above 1 are only supported for WSGI, not ASGI. Granian will error on startup.
--http 2 – adds framing overhead, breaks nginx sendfile for reads, no benefit for streaming uploads.
--runtime-mode st – less efficient on multi-core systems. Only useful with 1–2 workers.
--workers 1 --threads 16 with Python 3.14t – we tested free-threaded Python and it was slower: considerably higher write latency and much higher memory use due to thread-safe allocator overhead. The GIL is not the bottleneck – PutFS is I/O-bound, and the GIL is released during I/O.
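These rules can also be checked mechanically before a deployment. The sketch below flags the discouraged settings in an environment mapping. It assumes the environment variable names mirror the CLI flags (for example GRANIAN_BLOCKING_THREADS for --blocking-threads), and discouraged_settings is a hypothetical helper.

```python
def discouraged_settings(env: dict[str, str]) -> list[str]:
    """Return one warning per setting this page advises against."""
    warnings = []
    # Assumption: the env var name mirrors the --blocking-threads flag.
    if "GRANIAN_BLOCKING_THREADS" in env:
        warnings.append("GRANIAN_BLOCKING_THREADS: not supported for ASGI")
    if env.get("GRANIAN_HTTP") == "2":
        warnings.append("GRANIAN_HTTP=2: framing overhead, no benefit behind nginx")
    # Assumption: 4 workers (the PutFS default) when GRANIAN_WORKERS is unset.
    workers = int(env.get("GRANIAN_WORKERS", "4"))
    if env.get("GRANIAN_RUNTIME_MODE") == "st" and workers > 2:
        warnings.append("GRANIAN_RUNTIME_MODE=st: only useful with 1-2 workers")
    return warnings


if __name__ == "__main__":
    import os

    for warning in discouraged_settings(dict(os.environ)):
        print(f"warning: {warning}")
```

Running it inside the container prints nothing when the configuration follows the recommendations on this page.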