Storage backends¶
SpecStar separates persistence into three related but distinct layers:
- metadata — IDs, revision information, search/index state, lifecycle flags
- resource data — the structured payload of the resource itself
- blob data — binary uploads and file-like artifacts
There are two supported levels of backend wiring:
- the higher-level unified backend config via `backend=`
- the lower-level storage and queue factories for more explicit control
This page starts with the unified backend config, then maps the lower-level factory behavior in detail.
The unified backend config is the most compact way to choose the default backend:
```python
from specstar import BackendBinding, BackendConfig, ConnectionProfile, spec

spec.configure(
    backend=BackendConfig(
        connections={
            "local": ConnectionProfile(
                type="disk",
                options={"rootdir": "./data"},
            ),
            "jobs": ConnectionProfile(
                type="simple",
                options={"max_retries": 3},
            ),
        },
        meta=BackendBinding(use="local"),
        resource=BackendBinding(use="local"),
        blob=BackendBinding(use="local"),
        mq=BackendBinding(use="jobs"),
    )
)
```
You can also load the same shape from a JSON file:
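A minimal sketch of the file round-trip, using only the standard library: it writes the unified schema (the same shape shown under "Unified backend schema") to a JSON file and reads it back as a dict. How SpecStar consumes the loaded dict — for example via `BackendConfig` or a dedicated loader — is not shown here and is left as an assumption.

```python
import json
import tempfile
from pathlib import Path

# Write the unified backend config to a JSON file, then load it back.
# Only the stdlib round-trip is shown; handing the dict to SpecStar
# (e.g. something like BackendConfig(**config)) is an assumption.
config_path = Path(tempfile.mkdtemp()) / "backend.json"
config_path.write_text(json.dumps({
    "version": 1,
    "connections": {
        "local": {"type": "disk", "options": {"rootdir": "./data"}},
        "jobs": {"type": "simple", "options": {"max_retries": 3}},
    },
    "meta": {"use": "local"},
    "resource": {"use": "local"},
    "blob": {"use": "local"},
    "mq": {"use": "jobs"},
}))

# Loading returns a plain dict matching the schema.
config = json.loads(config_path.read_text())
```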
The lower-level storage_factory= and message_queue_factory= parameters remain fully supported when you need more direct control over the underlying storage composition.
For a full option-by-option backend settings reference, see Backend configuration reference.
Unified backend schema¶
The unified config is schema-first and uses `type` as the backend discriminator.
```json
{
  "version": 1,
  "connections": {
    "local": {
      "type": "disk",
      "options": {
        "rootdir": "./data"
      }
    },
    "jobs": {
      "type": "simple",
      "options": {
        "max_retries": 3
      }
    }
  },
  "meta": {"use": "local"},
  "resource": {"use": "local"},
  "blob": {"use": "local"},
  "mq": {"use": "jobs"}
}
```
This structure lets you:
- reuse one connection profile across multiple backend roles
- configure metadata, resource, blob, and queue backends together
- register custom backend providers under a new `type` and reference them from config
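One property the list above implies is that every role binding's `use` must name a declared connection profile. The helper below is a hypothetical sanity check, not part of SpecStar itself; it only encodes that cross-reference rule over the plain config dict.

```python
def check_bindings(config: dict) -> list[str]:
    """Return errors for role bindings that reference undeclared connections.

    A hypothetical validation sketch, not a SpecStar API: it only checks
    that every role's `use` names a key under `connections`.
    """
    connections = config.get("connections", {})
    errors = []
    for role in ("meta", "resource", "blob", "mq"):
        binding = config.get(role)
        if binding and binding["use"] not in connections:
            errors.append(f"{role}: unknown connection {binding['use']!r}")
    return errors

config = {
    "version": 1,
    "connections": {"local": {"type": "disk", "options": {"rootdir": "./data"}}},
    "meta": {"use": "local"},
    "resource": {"use": "local"},
    "blob": {"use": "local"},
    "mq": {"use": "jobs"},  # "jobs" is not declared above, so this is flagged
}
print(check_bindings(config))  # ["mq: unknown connection 'jobs'"]
```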
What the storage factory controls¶
| Factory | Metadata | Resource data | Blob data |
|---|---|---|---|
| `MemoryStorageFactory()` | memory | memory | memory |
| `DiskStorageFactory("./data")` | local files | local files | local files |
| `S3StorageFactory(...)` | SQLite synced to S3 | S3 | S3 |
| `PostgresStorageFactory(...)` | PostgreSQL | PostgreSQL | memory by default |
| `PostgreSQLS3StorageFactory(...)` | PostgreSQL | S3 | S3 |
| `PostgresDiskStorageFactory(...)` | PostgreSQL | local disk | memory by default |
| `PostgresDiskS3StorageFactory(...)` | PostgreSQL | local disk | S3 |
That last column matters. If your app uses file uploads or binary fields, do not assume every SQL-backed setup automatically gives you durable blobs.
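The last column can be restated as a lookup so a deployment script fails fast when an app with uploads picks a factory whose blobs default to memory. This mapping just mirrors the table above; neither the dict nor the helper is a SpecStar API.

```python
# Blob-durability summary of the table above (not a SpecStar API).
BLOB_DURABLE = {
    "MemoryStorageFactory": False,
    "DiskStorageFactory": True,
    "S3StorageFactory": True,
    "PostgresStorageFactory": False,       # blobs in memory by default
    "PostgreSQLS3StorageFactory": True,
    "PostgresDiskStorageFactory": False,   # blobs in memory by default
    "PostgresDiskS3StorageFactory": True,
}

def assert_durable_blobs(factory_name: str, app_has_uploads: bool) -> None:
    """Fail fast when an upload-handling app picks a non-durable blob backend."""
    if app_has_uploads and not BLOB_DURABLE[factory_name]:
        raise ValueError(f"{factory_name} does not persist blobs by default")
```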
Options¶
DiskStorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import DiskStorageFactory

spec.configure(
    storage_factory=DiskStorageFactory("./data")
)
```
Best for:
- local development
- single-node deployments
- MVPs that need restart-safe persistence
Pros:
- zero extra infrastructure
- easy local inspection and backups
- blob persistence works out of the box
Cons:
- not ideal for multi-node deployments
- limited concurrent write scalability
S3StorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import S3StorageFactory

spec.configure(
    storage_factory=S3StorageFactory(
        bucket="my-bucket",
        endpoint_url="https://s3.amazonaws.com",
    )
)
```
Best for:
- cloud deployments
- shared storage across app instances
- object-storage-first architectures
Pros:
- durable resource and blob storage
- works with AWS S3 and S3-compatible services such as MinIO
- no separate database required for the simplest cloud setup
Cons:
- metadata queries are less SQL-like than PostgreSQL-backed setups
- still requires object storage infrastructure and credentials
PostgresStorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import PostgresStorageFactory

spec.configure(
    storage_factory=PostgresStorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
    )
)
```
Best for:
- production systems with query-heavy workloads
- teams that want a database-centric architecture
- APIs that mainly serve structured records rather than uploaded files
Pros:
- fast searchable metadata
- strong indexing and SQL operations
- all structured data stays in PostgreSQL
Cons:
- requires database infrastructure
- blob data is not durable by default, so binary-upload workloads need an additional plan
If you need durable file uploads as well, prefer PostgresDiskS3StorageFactory for the default production path, or use PostgreSQLS3StorageFactory when you want S3 for resource payloads too.
PostgresDiskStorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import PostgresDiskStorageFactory

spec.configure(
    storage_factory=PostgresDiskStorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
        rootdir="./data",
    )
)
```
Best for:
- the recommended production setup
- systems that want PostgreSQL-backed metadata and search
- deployments that prefer keeping structured resource payloads on mounted disk
Pros:
- strong metadata querying in PostgreSQL
- straightforward local or mounted-volume resource persistence
- a good fit when blob uploads are handled separately in S3
Cons:
- blob durability is still a separate configuration decision
- not as stateless as storing resource payloads fully in object storage
Pair this setup with S3-backed blob handling when your application stores uploads or binary artifacts.
PostgresDiskS3StorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import PostgresDiskS3StorageFactory

spec.configure(
    storage_factory=PostgresDiskS3StorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
        rootdir="./data",
        s3_bucket="my-blob-bucket",
    )
)
```
Best for:
- the default production-ready storage setup
- PostgreSQL-backed search and metadata
- disk-backed resource payloads plus durable S3 blob uploads
Pros:
- keeps structured payloads on local or mounted disk
- stores blobs durably in S3-compatible storage
- works out of the box with the current recommended architecture
Cons:
- still requires both database and object storage infrastructure
- resource payloads remain tied to mounted disk rather than full object storage
PostgreSQLS3StorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import Encoding, PostgreSQLS3StorageFactory

spec.configure(
    storage_factory=PostgreSQLS3StorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
        s3_bucket="my-bucket",
        s3_region="us-east-1",
        encoding=Encoding.msgpack,
    )
)
```
Best for:
- production systems with durable uploads
- multi-node services
- teams that want PostgreSQL search plus object-storage durability
Pros:
- searchable metadata in PostgreSQL
- resource data and blobs stored durably in S3
- strong fit for production deployments
Cons:
- requires both database and object storage infrastructure
- more moving parts than local disk setups
MemoryStorageFactory¶
```python
from specstar import spec
from specstar.resource_manager import MemoryStorageFactory

spec.configure(
    storage_factory=MemoryStorageFactory()
)
```
Best for:
- tests
- short-lived demos
- experiments where restart durability does not matter
⚠️ Data is lost when the process exits.
Choosing a backend¶
| Use case | Recommended backend |
|---|---|
| unit tests and demos | `MemoryStorageFactory()` |
| local development or MVP | `DiskStorageFactory("./data")` |
| recommended production setup | `PostgresDiskS3StorageFactory(...)` |
| object-storage-first production | `PostgreSQLS3StorageFactory(...)` |
| cloud-first object storage setup | `S3StorageFactory(...)` |
Rule of thumb:
- start local → `DiskStorageFactory`
- recommended production → `PostgresDiskS3StorageFactory` + RabbitMQ
- prefer fully object-backed resource payloads → `PostgreSQLS3StorageFactory`
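The rule of thumb can be written down as a tiny lookup so the choice lives in one place in a deployment script. The stage names here are this example's own convention, not a SpecStar concept.

```python
# Rule-of-thumb mapping from deployment stage to recommended factory name.
# Stage keys are illustrative only, not SpecStar terminology.
RECOMMENDED = {
    "local": "DiskStorageFactory",
    "production": "PostgresDiskS3StorageFactory",   # pair with RabbitMQ
    "object_first": "PostgreSQLS3StorageFactory",
}

def recommend(stage: str) -> str:
    """Return the recommended storage factory name for a deployment stage."""
    return RECOMMENDED[stage]
```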
Per-model override¶
Different resources can use different storage backends.
```python
from specstar import SpecStar
from specstar.resource_manager import DiskStorageFactory, S3StorageFactory

spec = SpecStar(
    storage_factory=DiskStorageFactory("./data")
)

spec.add_model(User)
spec.add_model(
    Image,
    storage=S3StorageFactory(bucket="image-bucket")
)
```
This is useful when:
- one resource is local-only while another needs cloud durability
- binary-heavy resources should live in object storage
- you want to migrate one model at a time rather than the whole system at once
All resources still share the same SpecStar programming model.
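The routing idea behind per-model overrides can be sketched without SpecStar at all: a default factory plus a per-model override table. The dummy classes below stand in for the real `DiskStorageFactory` and `S3StorageFactory`; only the lookup pattern is the point, not SpecStar's internals.

```python
# Self-contained sketch of per-model storage routing (not SpecStar internals).
class DiskFactory:
    def __init__(self, rootdir: str):
        self.rootdir = rootdir

class S3Factory:
    def __init__(self, bucket: str):
        self.bucket = bucket

class Router:
    """Default storage factory with optional per-model overrides."""
    def __init__(self, default):
        self.default = default
        self.overrides = {}

    def add_model(self, model, storage=None):
        # Models without an explicit storage fall back to the default.
        if storage is not None:
            self.overrides[model] = storage

    def storage_for(self, model):
        return self.overrides.get(model, self.default)

router = Router(DiskFactory("./data"))
router.add_model("User")
router.add_model("Image", storage=S3Factory("image-bucket"))
```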
Common gotchas¶
- call `spec.configure(...)` before `add_model(...)`
- do not use in-memory storage if restarts must preserve data
- if the app stores files, verify the selected factory gives you durable blob storage
- for jobs and workers, combine the storage decision with a queue decision from the Job Queue quickstart
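The first gotcha can be enforced as a guard in your own startup code: refuse model registration until configuration has run. This sketch only mirrors the ordering rule; what SpecStar actually does when the calls are misordered is not documented here.

```python
# Guard sketch for the configure-before-add_model ordering rule.
# Illustrative only; not SpecStar's actual behavior on misordered calls.
class OrderedSpec:
    def __init__(self):
        self._configured = False

    def configure(self, **kwargs):
        # Real configuration would happen here; we only record the ordering.
        self._configured = True

    def add_model(self, model):
        if not self._configured:
            raise RuntimeError("call configure() before add_model()")
```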