S3 (not implemented)

PutFS does not ship an S3 endpoint. We prototyped an S3-compatible extension to the minimal API, but decided against shipping it.

This page records why we discarded it and what to watch out for if you decide to reimplement it. The intended audience is someone who wants S3 wire compatibility on top of PutFS and is willing to maintain it.

Why we removed it

The prototype used awssig (the only readily available pure-Python SigV4 verifier on PyPI) and called AWSSigV4S3Verifier(..., body=b"") in the auth middleware. We can't rely on awssig, for two reasons:

  1. No body integrity. awssig does not hash the request body. It copies the client-supplied X-Amz-Content-Sha256 header into the canonical request and verifies the signature over that, never comparing the header value to the bytes actually received. A tampered body sent with the original header therefore still verifies. This is contrary to AWS's expectation that the server independently verifies X-Amz-Content-Sha256.
  2. awssig is unmaintained. Last release: 0.5.0 on 2019-12-14. It still depends on six, a Python 2/3 compatibility library that only matters for the long-EOL Python 2. No CVEs have been filed, which more likely reflects a lack of auditing than a lack of bugs. Signature comparison uses == (not constant-time), and the S3 variant's URI normalisation diverges from standard SigV4: it preserves .. and //, so the signed canonical path can differ from the path the OS resolves later.
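
The path divergence in point 2 can be illustrated with a few lines of Python. This is a minimal sketch, not code from the removed prototype: has_ambiguous_segments is a hypothetical helper that expects an absolute path, and rejecting such paths outright is an assumed policy (S3 itself permits keys that contain these segments).

```python
import posixpath


def has_ambiguous_segments(raw_path: str) -> bool:
    """Return True if an absolute path contains '.', '..' or empty segments."""
    segments = raw_path.split("/")[1:]  # drop the empty string before the leading slash
    if segments and segments[-1] == "":
        segments = segments[:-1]  # tolerate a single trailing slash
    return any(seg in ("", ".", "..") for seg in segments)


signed = "/bucket/public/../private//secret.txt"
# The S3 flavour of SigV4 signs the path as sent; the filesystem collapses it.
resolved = posixpath.normpath(signed)
print(signed, "->", resolved)
print("ambiguous:", has_ambiguous_segments(signed))
```

A server that maps keys onto a filesystem would probably want to run a check like this before authorising the request, so that the path it authorises is the path it later opens.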

Design notes for re-implementers

If you still want to add S3 to PutFS as an in-tree feature, the design has to address the problems above. Notes from the prototype:

SigV4 verification

  • Don't use awssig. Or if you do, treat it as unaudited code: re-implement the canonical request locally, verify the signature with hmac.compare_digest, and use awssig only as a reference. SigV4 verification fits in roughly 50 lines of code, and the canonical-request rules are precise and well documented. (AWS spec)
  • Always re-hash the body. Stream the request body through hashlib.sha256() and compare the digest to the value of X-Amz-Content-Sha256 from the (verified) headers. On mismatch, refuse the request – 400 XAmzContentSHA256Mismatch.
  • Decide a policy for UNSIGNED-PAYLOAD. boto3 defaults to UNSIGNED-PAYLOAD for streaming uploads (it intentionally does not include the body in the signature). This is AWS-spec-compliant but means body integrity is provided only by transport security. If you require integrity end-to-end, refuse UNSIGNED-PAYLOAD for write methods unless the deployer opts in.
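  • STREAMING-AWS4-HMAC-SHA256-PAYLOAD signs each chunk in the body. Supporting it requires parsing the aws-chunked framing and verifying the chunk-signature attached to every chunk, each of which chains from the previous signature. The spec is in the SigV4 documentation linked above.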
  • Use a constant-time comparison (hmac.compare_digest) for both signature and content-hash comparisons.
  • Pin a strict timestamp window (15 min matches AWS). SigV4 has no nonce tracking, so a captured signed request can be replayed within the window if it travelled over plain HTTP. Require HTTPS.
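
Taken together, the checks in this list amount to roughly the following. This is a minimal sketch under stated assumptions, not the removed prototype's code: every function name is hypothetical, Authorization-header parsing and credential lookup are left out, the path is used exactly as it appeared on the request line, and STREAMING-AWS4-HMAC-SHA256-PAYLOAD is not handled.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

MAX_SKEW = timedelta(minutes=15)  # same window as AWS
UNSIGNED = "UNSIGNED-PAYLOAD"


def canonical_request(method, encoded_path, query_pairs, headers, signed_headers, payload_hash):
    """Build the SigV4 canonical request.

    Assumes `headers` maps lowercase names to values, `signed_headers` is the
    lowercase list from the Authorization header, and `query_pairs` holds
    decoded (name, value) tuples.
    """
    encoded = sorted((quote(k, safe="-_.~"), quote(v, safe="-_.~")) for k, v in query_pairs)
    query = "&".join(f"{k}={v}" for k, v in encoded)
    names = sorted(signed_headers)
    # Trim each value and collapse whitespace runs; one "name:value\n" per header.
    canon_headers = "".join(f"{n}:{' '.join(headers[n].split())}\n" for n in names)
    return "\n".join([method, encoded_path, query, canon_headers, ";".join(names), payload_hash])


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def expected_signature(secret, amz_date, region, service, canon_req):
    """Derive the signing key and sign the string-to-sign."""
    date = amz_date[:8]
    scope = f"{date}/{region}/{service}/aws4_request"
    digest = hashlib.sha256(canon_req.encode("utf-8")).hexdigest()
    string_to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, digest])
    key = ("AWS4" + secret).encode("utf-8")
    for part in (date, region, service, "aws4_request"):
        key = _hmac(key, part)
    return _hmac(key, string_to_sign).hex()


def signature_ok(claimed, secret, amz_date, region, service, canon_req):
    good = expected_signature(secret, amz_date, region, service, canon_req)
    # Compare as bytes: compare_digest rejects non-ASCII str input.
    return hmac.compare_digest(good.encode(), claimed.encode())


def timestamp_ok(amz_date, now=None):
    """Accept only X-Amz-Date values within MAX_SKEW of the server clock."""
    now = now or datetime.now(timezone.utc)
    try:
        sent = datetime.strptime(amz_date, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
    except ValueError:
        return False
    return abs(now - sent) <= MAX_SKEW


def body_matches(declared_hash, chunks, allow_unsigned=False):
    """Re-hash the streamed body and compare it to X-Amz-Content-Sha256."""
    if declared_hash == UNSIGNED:
        return allow_unsigned  # deployer opt-in, per the policy bullet
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return hmac.compare_digest(digest.hexdigest().encode(), declared_hash.lower().encode())
```

A plausible order of checks is timestamp first, then signature, then body hash, since the body hash can only be confirmed once the whole payload has been read. In a real handler the region and service would come from the credential scope in the Authorization header, and a body mismatch would map to 400 XAmzContentSHA256Mismatch.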

X-Accel-Redirect

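Serve GET responses through nginx with X-Accel-Redirect, so that nginx sends the file itself (via sendfile) instead of the Python process streaming it. Build the redirect from the resolved relative path: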

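from urllib.parse import quote
# nginx treats the X-Accel-Redirect value as an escaped URI, so percent-encode it
rel = path.relative_to(root).as_posix()
headers["x-accel-redirect"] = f"/_internal/{quote(rel)}"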

The corresponding nginx block:

location /_internal/ {
    internal;
    alias /data/;
}

In the proxy block, also set the Host header explicitly so that the host nginx forwards is the one the client signed against:

proxy_set_header Host $http_host;
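
On the application side, the redirect is only safe if the resolved path stays inside the data root. The following is a minimal sketch of that check, assuming a bucket-per-directory layout under root; accel_redirect and INTERNAL_PREFIX are hypothetical names, and the prefix has to match the internal location configured in nginx.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import quote

INTERNAL_PREFIX = "/_internal/"  # must match the nginx `location` above


def accel_redirect(root: Path, bucket: str, key: str) -> Optional[str]:
    """Return an X-Accel-Redirect value, or None if the key escapes its bucket."""
    root = root.resolve()
    bucket_dir = (root / bucket).resolve()
    target = (bucket_dir / key).resolve()
    try:
        bucket_dir.relative_to(root)    # bucket must live under the data root
        target.relative_to(bucket_dir)  # key must live under its bucket
    except ValueError:
        return None
    # nginx expects an escaped URI in X-Accel-Redirect, so percent-encode it.
    return INTERNAL_PREFIX + quote(target.relative_to(root).as_posix())


print(accel_redirect(Path("/srv/putfs-demo"), "photos", "2024/cat 1.jpg"))
print(accel_redirect(Path("/srv/putfs-demo"), "photos", "../../etc/passwd"))
```

Returning None lets the caller answer with an ordinary S3 error instead of handing nginx a path outside the aliased directory.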

What the prototype implemented

For reference, the removed prototype handled: GetObject, PutObject, HeadObject, DeleteObject, ListObjectsV2 (with prefix, delimiter, continuation-token pagination), ListBuckets, CreateBucket, HeadBucket, DeleteBucket. Multipart upload was not implemented. XML responses were built with xml.sax.saxutils.escape. That is fine for the static fields, but a serious gap for any field that needs proper XML serialisation, e.g. object keys containing control characters or other binary data that XML 1.0 cannot represent.
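
One way to close that gap is to build responses with a real XML serialiser and URL-encode the keys, as S3 does when a client asks for encoding-type=url. The sketch below assumes Python 3.8+ (for the xml_declaration argument); list_objects_xml is a hypothetical helper that emits only a small subset of the ListObjectsV2 fields, so it is not a complete response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def list_objects_xml(bucket: str, keys: list, url_encode: bool = True) -> bytes:
    """Serialise a minimal ListBucketResult document."""
    root = ET.Element("ListBucketResult", xmlns=S3_NS)
    ET.SubElement(root, "Name").text = bucket
    ET.SubElement(root, "KeyCount").text = str(len(keys))
    if url_encode:
        # Tells the client that Key values are percent-encoded.
        ET.SubElement(root, "EncodingType").text = "url"
    for key in keys:
        contents = ET.SubElement(root, "Contents")
        # Percent-encoding keeps control characters out of the XML text.
        ET.SubElement(contents, "Key").text = quote(key, safe="/") if url_encode else key
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


doc = list_objects_xml("demo", ["a\x01b&<c>.txt"])
print(doc.decode("utf-8"))
```

ElementTree escapes markup characters in text but does not reject control characters, which is why the key is percent-encoded before it reaches the serialiser.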