Strata
Deterministic Data. Canonical truth.
What Strata is
Strata is a deterministic binary data format with canonical encoding. Every value has exactly one valid binary representation; identical logical values must produce identical bytes across all implementations.
It is a protocol and specification for representing structured data in a way that is reproducible across implementations, platforms, and time.
Why it exists
Most data formats allow multiple representations of the same logical value.
JSON objects can order keys differently.
Protocol Buffers permit field reordering.
MessagePack supports multiple integer encodings.
These formats prioritize flexibility, evolution, and developer convenience over byte-level determinism.
This flexibility is a liability when correctness depends on exact bytes: content addressing, cryptographic hashing, digital signatures, and distributed consensus.
Strata eliminates representational ambiguity by design. If two values are logically equal, their encodings are byte-identical.
Core guarantees
Canonical encoding. Every value has exactly one valid binary representation. Stable hashing. Hashes are computed over canonical encoded bytes. The encoding is the canonical form. Cross-language determinism. Independent implementations in different languages produce identical bytes for identical input. Strict validation. Non-canonical encodings are rejected during canonical encoding, not silently accepted.
Explicit non-goals
Strata Core Binary (.scb) is intentionally not optimized for the following:
Human readability.
Compression ratio.
Schema evolution.
It does not support optional fields, default values, backward-compatible changes, or floating-point numbers.
These constraints are deliberate. Flexibility in representation introduces ambiguity. Strata chooses precision over convenience.
Strata Text (.st) exists as a human-readable authoring format that compiles into canonical Strata Core Binary.
Value model
Strata supports a minimal, fixed set of value types, each with an unambiguous binary representation:
null - absence of value bool - true or false int - signed 64-bit integers bytes - arbitrary byte sequences string - UTF-8 text list - ordered sequences map - key-value pairs with string keys, sorted by UTF-8 byte order
There are no floats, no optional types, and no undefined behavior.
Stability & Versioning
Strata is versioned. Once a version is finalized, its encoding rules are frozen.
There are no deprecations, no migrations, and no silent changes within a version line.
Code written against a finalized Strata version will produce identical bytes and hashes for the lifetime of that version line. This is a requirement, not a goal.