Storage backends

SpecStar separates persistence into three related but distinct layers:

  • metadata — IDs, revision information, search/index state, lifecycle flags
  • resource data — the structured payload of the resource itself
  • blob data — binary uploads and file-like artifacts

There are two supported setup levels for backend wiring:

  • the higher-level unified backend config via backend=
  • the lower-level storage and queue factories for more explicit control

This page starts with the unified backend config, then covers the lower-level factory behavior in detail.

The unified backend config is the most compact way to choose the default backend:

from specstar import BackendBinding, BackendConfig, ConnectionProfile, spec

spec.configure(
    backend=BackendConfig(
        connections={
            "local": ConnectionProfile(
                type="disk",
                options={"rootdir": "./data"},
            ),
            "jobs": ConnectionProfile(
                type="simple",
                options={"max_retries": 3},
            ),
        },
        meta=BackendBinding(use="local"),
        resource=BackendBinding(use="local"),
        blob=BackendBinding(use="local"),
        mq=BackendBinding(use="jobs"),
    )
)

You can also load the same shape from a JSON file:

spec.configure(backend="./backend.json")

The lower-level storage_factory= and message_queue_factory= parameters remain fully supported when you need more direct control over the underlying storage composition.

For a full option-by-option backend settings reference, see Backend configuration reference.


Unified backend schema

The unified config is schema-first and uses "type" as the backend discriminator.

{
  "version": 1,
  "connections": {
    "local": {
      "type": "disk",
      "options": {
        "rootdir": "./data"
      }
    },
    "jobs": {
      "type": "simple",
      "options": {
        "max_retries": 3
      }
    }
  },
  "meta": {"use": "local"},
  "resource": {"use": "local"},
  "blob": {"use": "local"},
  "mq": {"use": "jobs"}
}
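A quick consistency check over this shape can be written in plain Python. The validator below is illustrative, not part of SpecStar: it only confirms that every connection declares its "type" discriminator and that every role binding points at a defined connection.

```python
import json

# Hedged sketch: minimal sanity checks for the unified backend schema.
# The role names and the "use" indirection come from the schema above;
# the validation logic itself is an illustrative convention.
ROLES = ("meta", "resource", "blob", "mq")

def validate_backend_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape is consistent."""
    problems = []
    connections = cfg.get("connections", {})
    for name, profile in connections.items():
        if "type" not in profile:
            problems.append(f"connection {name!r} is missing its 'type' discriminator")
    for role in ROLES:
        binding = cfg.get(role)
        if binding is None:
            continue  # a role may simply fall back to defaults
        target = binding.get("use")
        if target not in connections:
            problems.append(f"role {role!r} references unknown connection {target!r}")
    return problems

cfg = json.loads("""
{
  "version": 1,
  "connections": {"local": {"type": "disk", "options": {"rootdir": "./data"}}},
  "meta": {"use": "local"},
  "blob": {"use": "missing"}
}
""")
print(validate_backend_config(cfg))  # flags the dangling "blob" binding
```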

This structure lets you:

  • reuse one connection profile across multiple backend roles
  • configure metadata, resource, blob, and queue backends together
  • register custom backend providers under a new type and reference them from config
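The "custom backend providers" point amounts to a type-discriminated registry. The sketch below shows one way such a registry could work; the `register_backend` and `create_connection` names and the `DiskConnection` class are hypothetical stand-ins, not SpecStar's actual extension API.

```python
# Hedged sketch: a dict-based provider registry keyed on the "type"
# discriminator from a ConnectionProfile. Illustrative only.
from typing import Any, Callable

_PROVIDERS: dict[str, Callable[..., Any]] = {}

def register_backend(type_name: str, provider: Callable[..., Any]) -> None:
    """Associate a "type" string with a provider callable."""
    _PROVIDERS[type_name] = provider

def create_connection(profile: dict) -> Any:
    """Instantiate the provider named by profile["type"] with its options."""
    provider = _PROVIDERS[profile["type"]]
    return provider(**profile.get("options", {}))

class DiskConnection:
    def __init__(self, rootdir: str):
        self.rootdir = rootdir

register_backend("disk", DiskConnection)
conn = create_connection({"type": "disk", "options": {"rootdir": "./data"}})
print(conn.rootdir)  # ./data
```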

What the storage factory controls

Factory                            Metadata             Resource data   Blob data
MemoryStorageFactory()             memory               memory          memory
DiskStorageFactory("./data")       local files          local files     local files
S3StorageFactory(...)              SQLite synced to S3  S3              S3
PostgresStorageFactory(...)        PostgreSQL           PostgreSQL      memory by default
PostgreSQLS3StorageFactory(...)    PostgreSQL           S3              S3
PostgresDiskStorageFactory(...)    PostgreSQL           local disk      memory by default
PostgresDiskS3StorageFactory(...)  PostgreSQL           local disk      S3

That last column matters. If your app uses file uploads or binary fields, do not assume every SQL-backed setup automatically gives you durable blobs.
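Deployment code can encode that column and fail fast instead of discovering non-durable blobs in production. The mapping below mirrors the table above; the helper itself is an illustrative convention, not a SpecStar API.

```python
# Hedged sketch: blob-durability per factory, taken from the table above.
DURABLE_BLOBS = {
    "MemoryStorageFactory": False,          # everything in memory
    "DiskStorageFactory": True,             # local files
    "S3StorageFactory": True,               # S3
    "PostgresStorageFactory": False,        # memory by default
    "PostgreSQLS3StorageFactory": True,     # S3
    "PostgresDiskStorageFactory": False,    # memory by default
    "PostgresDiskS3StorageFactory": True,   # S3
}

def require_durable_blobs(factory_name: str) -> None:
    """Raise at startup if the chosen factory cannot persist blob data."""
    if not DURABLE_BLOBS.get(factory_name, False):
        raise RuntimeError(
            f"{factory_name} does not persist blob data; "
            "choose an S3- or disk-backed blob setup for file uploads"
        )

require_durable_blobs("PostgresDiskS3StorageFactory")  # passes silently
```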


Options

DiskStorageFactory

from specstar import spec
from specstar.resource_manager import DiskStorageFactory

spec.configure(
    storage_factory=DiskStorageFactory("./data")
)

Best for:

  • local development
  • single-node deployments
  • MVPs that need restart-safe persistence

Pros:

  • zero extra infrastructure
  • easy local inspection and backups
  • blob persistence works out of the box

Cons:

  • not ideal for multi-node deployments
  • limited concurrent write scalability

S3StorageFactory

from specstar import spec
from specstar.resource_manager import S3StorageFactory

spec.configure(
    storage_factory=S3StorageFactory(
        bucket="my-bucket",
        endpoint_url="https://s3.amazonaws.com",
    )
)

Best for:

  • cloud deployments
  • shared storage across app instances
  • object-storage-first architectures

Pros:

  • durable resource and blob storage
  • works with AWS S3 and S3-compatible services such as MinIO
  • no separate database required for the simplest cloud setup

Cons:

  • metadata queries are less SQL-like than PostgreSQL-backed setups
  • still requires object storage infrastructure and credentials

PostgresStorageFactory

from specstar import spec
from specstar.resource_manager import PostgresStorageFactory

spec.configure(
    storage_factory=PostgresStorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
    )
)
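One practical wrinkle with the connection_string above: passwords containing characters such as "@" or "/" must be percent-encoded before they go into the URL. A small sketch using only the standard library (the helper name is ours, not SpecStar's):

```python
# Hedged sketch: build a PostgreSQL DSN with safely quoted credentials.
from urllib.parse import quote

def postgres_dsn(user: str, password: str, host: str, port: int, db: str) -> str:
    """Percent-encode user and password so reserved characters survive."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )

dsn = postgres_dsn("app", "p@ss/word", "db.internal", 5432, "appdb")
print(dsn)  # postgresql://app:p%40ss%2Fword@db.internal:5432/appdb
```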

Best for:

  • production systems with query-heavy workloads
  • teams that want a database-centric architecture
  • APIs that mainly serve structured records rather than uploaded files

Pros:

  • fast searchable metadata
  • strong indexing and SQL operations
  • all structured data stays in PostgreSQL

Cons:

  • requires database infrastructure
  • blob data is not durable by default, so binary-upload workloads need an additional plan

If you need durable file uploads as well, prefer PostgresDiskS3StorageFactory for the default production path, or use PostgreSQLS3StorageFactory when you want S3 for resource payloads too.


PostgresDiskStorageFactory

from specstar import spec
from specstar.resource_manager import PostgresDiskStorageFactory

spec.configure(
    storage_factory=PostgresDiskStorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
        rootdir="./data",
    )
)

Best for:

  • the recommended production setup
  • systems that want PostgreSQL-backed metadata and search
  • deployments that prefer keeping structured resource payloads on mounted disk

Pros:

  • strong metadata querying in PostgreSQL
  • straightforward local or mounted-volume resource persistence
  • a good fit when blob uploads are handled separately in S3

Cons:

  • blob durability is still a separate configuration decision
  • not as stateless as storing resource payloads fully in object storage

Pair this setup with S3-backed blob handling when your application stores uploads or binary artifacts.


PostgresDiskS3StorageFactory

from specstar import spec
from specstar.resource_manager import PostgresDiskS3StorageFactory

spec.configure(
    storage_factory=PostgresDiskS3StorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
        rootdir="./data",
        s3_bucket="my-blob-bucket",
    )
)

Best for:

  • the default production-ready storage setup
  • PostgreSQL-backed search and metadata
  • disk-backed resource payloads plus durable S3 blob uploads

Pros:

  • keeps structured payloads on local or mounted disk
  • stores blobs durably in S3-compatible storage
  • works out of the box with the current recommended architecture

Cons:

  • still requires both database and object storage infrastructure
  • resource payloads remain tied to mounted disk rather than full object storage

PostgreSQLS3StorageFactory

from specstar import spec
from specstar.resource_manager import Encoding, PostgreSQLS3StorageFactory

spec.configure(
    storage_factory=PostgreSQLS3StorageFactory(
        connection_string="postgresql://user:pass@host:5432/appdb",
        s3_bucket="my-bucket",
        s3_region="us-east-1",
        encoding=Encoding.msgpack,
    )
)

Best for:

  • production systems with durable uploads
  • multi-node services
  • teams that want PostgreSQL search plus object-storage durability

Pros:

  • searchable metadata in PostgreSQL
  • resource data and blobs stored durably in S3
  • strong fit for production deployments

Cons:

  • requires both database and object storage infrastructure
  • more moving parts than local disk setups

MemoryStorageFactory

from specstar import spec
from specstar.resource_manager import MemoryStorageFactory

spec.configure(
    storage_factory=MemoryStorageFactory()
)

Best for:

  • tests
  • short-lived demos
  • experiments where restart durability does not matter

⚠️ Data is lost when the process exits.


Choosing a backend

Use case                          Recommended backend
unit tests and demos              MemoryStorageFactory()
local development or MVP          DiskStorageFactory("./data")
recommended production setup      PostgresDiskS3StorageFactory(...)
object-storage-first production   PostgreSQLS3StorageFactory(...)
cloud-first object storage setup  S3StorageFactory(...)

Rule of thumb: start with DiskStorageFactory("./data") for local work, and move to PostgresDiskS3StorageFactory(...) once you need production-grade search and durable uploads.

Per-model override

Different resources can use different storage backends.

from specstar import SpecStar
from specstar.resource_manager import DiskStorageFactory, S3StorageFactory

spec = SpecStar(
    storage_factory=DiskStorageFactory("./data")
)

spec.add_model(User)

spec.add_model(
    Image,
    storage=S3StorageFactory(bucket="image-bucket")
)

This is useful when:

  • one resource is local-only while another needs cloud durability
  • binary-heavy resources should live in object storage
  • you want to migrate one model at a time rather than the whole system at once

All resources still share the same SpecStar programming model.
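Mechanically, the per-model override amounts to a lookup with a fallback to the default factory. The sketch below shows that pattern in isolation; the model and factory classes are stand-ins, not SpecStar types.

```python
# Hedged sketch: per-model storage resolution as a dict lookup with a default.
class DefaultFactory: ...
class S3Factory: ...

default_factory = DefaultFactory()
overrides = {"Image": S3Factory()}  # only models that deviate are listed

def factory_for(model_name: str):
    """Return the override for a model, or the shared default factory."""
    return overrides.get(model_name, default_factory)

print(type(factory_for("Image")).__name__)  # S3Factory
print(type(factory_for("User")).__name__)   # DefaultFactory
```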


Common gotchas

  • call spec.configure(...) before add_model(...)
  • do not use in-memory storage if restarts must preserve data
  • if the app stores files, verify the selected factory gives you durable blob storage
  • for jobs and workers, combine the storage decision with a queue decision from the Job Queue quickstart