Data model

PutFS stores files in a plain directory hierarchy. It doesn't enforce any structure – PUT /hello.txt works fine.

For multi-tenant or multi-application deployments, there's a logical model:

tenant    →  a PutFS deployment (maps to a ZFS pool or data root)
└── dataset   →  first path segment (maps to a ZFS dataset, or "bucket" by convention)
    └── object    →  everything after (just a file)

https://acme.putfs.example.org/invoices/q1-2024.pdf
        ^^^^                   ^^^^^^^^ ^^^^^^^^^^^
        tenant                 dataset  object path
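The mapping in the example URL can be sketched in a few lines. This is an illustration of the ops convention only: PutFS itself never parses paths this way, the function name split_putfs_url is hypothetical, and treating the first DNS label as the tenant is an assumption taken from the example URL above.

```python
from urllib.parse import urlsplit


def split_putfs_url(url):
    """Split a URL into (tenant, dataset, object_path) by convention.

    Assumption: the tenant is the first DNS label of the host,
    as in the example URL. Nothing in PutFS enforces this.
    """
    parts = urlsplit(url)
    tenant = parts.hostname.split(".")[0]
    first, sep, rest = parts.path.lstrip("/").partition("/")
    if not sep:
        # A root-level key such as /hello.txt has no dataset segment.
        return tenant, None, first
    # First path segment is the dataset, everything after is the object.
    return tenant, first, rest


print(split_putfs_url("https://acme.putfs.example.org/invoices/q1-2024.pdf"))
# → ('acme', 'invoices', 'q1-2024.pdf')
```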

On disk:

/srv/putfs/
├── invoices/           ← dataset
│   └── q1-2024.pdf     ← object
└── contracts/          ← dataset
    └── nda.pdf         ← object

Transparency: the filesystem is the source of truth

PutFS has no metadata database, no index, no sidecar files – the bytes on disk are the state PutFS exposes. Any change reaches every observer immediately, and in both directions: a PUT via the API is visible to ls -la the next instant, an mv or rm on disk shows up in the next API listing with no reindex step. rsync, zfs send, snapshot rollbacks, manual edits – they all just work; none of them need to know that PutFS exists.
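To make the "no index" point concrete, here is a minimal sketch of a listing that is computed from the directory tree on every call. It is not PutFS's actual code; list_objects and the temporary data root are stand-ins invented for the illustration.

```python
import os
import tempfile


def list_objects(data_root):
    """Return every key under data_root by walking the directory tree.

    Nothing is cached: each call reflects the current on-disk state.
    """
    keys = []
    for dirpath, _dirnames, filenames in os.walk(data_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), data_root)
            keys.append("/" + rel.replace(os.sep, "/"))
    return sorted(keys)


# Hypothetical data root, standing in for something like /srv/putfs.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "invoices"))
with open(os.path.join(root, "invoices", "q1-2024.pdf"), "wb") as f:
    f.write(b"hi")
before = list_objects(root)

# An out-of-band change, as if an operator ran rm on the server.
os.remove(os.path.join(root, "invoices", "q1-2024.pdf"))
after = list_objects(root)

print(before)  # → ['/invoices/q1-2024.pdf']
print(after)   # → []
```

Because the listing is derived from the filesystem each time, there is nothing to invalidate or rebuild after the file is removed.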

The flip side: any feature that would require persistent metadata (object tags, custom ACLs, version history independent of the filesystem) is out of scope by construction.

Write atomicity and durability

Every PUT streams into a private temporary file and is published with an atomic rename (plain keys) or a create-only link (WORM keys), so a key only ever appears on disk as a complete object. A reader, or a racing PUT, sees the old object or the new one – never a partial or torn write – and a failed or aborted upload leaves only its own temp file, never a half-written key. This atomicity is unconditional: it holds regardless of filesystem or tuning.
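The publish step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not PutFS's implementation: put_object, the ".upload-" temp prefix, and the worm flag are hypothetical names, the body arrives as bytes rather than a stream, and the sketch does no path validation. It also removes the temp file on failure, which is a choice made here rather than documented behaviour.

```python
import os
import tempfile


def put_object(data_root, key, body, worm=False):
    """Write body to key so the key only ever appears as a complete object."""
    dest = os.path.join(data_root, key.lstrip("/"))
    dest_dir = os.path.dirname(dest)
    os.makedirs(dest_dir, exist_ok=True)
    # The temp file lives next to the destination: rename and link
    # are only atomic within a single filesystem.
    fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
        # No fsync here: this sketch only demonstrates atomic visibility.
        if worm:
            # Create-only publish: raises FileExistsError if the key exists.
            os.link(tmp, dest)
        else:
            # Atomic replace: readers see the old object or the new one.
            os.replace(tmp, dest)
            tmp = None  # the temp name no longer exists after the rename
    finally:
        if tmp is not None:
            os.unlink(tmp)


root = tempfile.mkdtemp()  # hypothetical data root
put_object(root, "/invoices/q1.pdf", b"old")
put_object(root, "/invoices/q1.pdf", b"new")  # plain key: replaced
put_object(root, "/contracts/nda.pdf", b"v1", worm=True)
try:
    put_object(root, "/contracts/nda.pdf", b"v2", worm=True)
    worm_overwritten = True
except FileExistsError:
    worm_overwritten = False  # WORM key kept its first version
```

The two branches differ only in the final system call: os.replace overwrites an existing name, while os.link refuses to, which is what makes the second variant create-only.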

Durability is a separate guarantee. PutFS issues no fsync on the write path – it writes the temp, closes it, renames it into place, and returns 201. A successful PUT therefore means "atomically visible", not "already flushed to stable storage": PutFS delegates durability to the filesystem, exactly as it delegates redundancy and checksums (see Data integrity).

A 201 is not a power-loss barrier by default

Because PutFS never calls fsync, its writes are asynchronous as far as the filesystem is concerned. On ZFS with the default sync=standard, a just-acknowledged PUT sits in the open transaction group and is flushed at the next commit (up to zfs_txg_timeout, default 5s), so a power loss in that window can drop the object. It is dropped atomically – you lose the whole object or none of it, never a corrupt or partial one – because copy-on-write never exposes a torn state. To make an acknowledged PUT survive power loss, set sync=always on the dataset so every write is committed to the ZFS intent log (ZIL, held on a dedicated SLOG device if one is configured) before it returns. The ZFS tuning guide covers the latency trade-off.

Tenants

A tenant is a single PutFS deployment – one server, one nginx, one data root. Isolation between tenants is at the deployment level (separate servers, separate DNS, separate auth). When multiple instances run on one server, tenant isolation can instead be enforced by giving each instance its own mount point or ZFS pool as its data root.

A tenant could map to an application, a team, a customer, or the equivalent of an AWS region.

Datasets

The first path segment. This is a convention, not enforced by PutFS. It mirrors the way path-style S3 addressing treats the first segment as a "bucket" – handy for ops, but PutFS never parses it specially.

Maps naturally to:

ZFS dataset – zfs create tank/putfs/invoices for per-dataset quotas, snapshots, encryption
A directory – on any filesystem, it's just a folder

Datasets are created implicitly on first PUT.

Objects

Everything after the dataset prefix. Can be nested: /invoices/2024/q1/report.pdf is fine. Directories are created automatically.

No structure required

PutFS doesn't validate or enforce this model. All of these work:

curl -X PUT -d "hi" http://localhost:8000/hello.txt
curl -X PUT -d "hi" http://localhost:8000/a/b/c/d/e/f.txt
curl -X PUT -d "hi" http://localhost:8000/invoices/q1.pdf

It's just files in directories. The tenant/dataset model is a convention for ops.