Canonical rules
This page defines what canonical means in Strata.
Canonical rules are not guidelines. They are hard invariants enforced by every correct implementation.
If two implementations disagree on canonical output, at least one of them is wrong.
Canonical definition
A Strata value has exactly one valid binary representation.
No alternatives. No equivalent forms. No normalization step.
Given the same logical value, all correct implementations must produce identical bytes.
Scope of canonical rules
Canonical rules apply to:
Binary encoding of all value types
Integer representation
UTF-8 string encoding
Byte sequence representation
Map key ordering
Hash input definition
Canonical rules do not apply to:
Transport framing
Streaming boundaries
Envelopes or wrappers
Compression layers
Encryption or signatures
Protocol metadata
Canonicality exists strictly at the value -> bytes boundary.
Single representation rule
For every Strata value:
there is one valid encoding
all other encodings are invalid
invalid encodings must be rejected
There is no concept of:
equivalent encodings
permissive decoding
post-decode normalization
Canonical form is absolute.
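A minimal sketch of why the rule matters, assuming a hypothetical variable-length integer format (unsigned LEB128). This page does not specify Strata's integer wire format, so the format and the function names below are illustrative only. The point is that a format may allow several byte strings to denote the same number, and a canonical system treats all but the shortest as invalid.

```python
# Hypothetical format for illustration only: unsigned LEB128.
# Strata's real integer encoding is not defined on this page.

def encode_uleb128(n: int) -> bytes:
    """Encode a non-negative integer using the fewest possible bytes."""
    if n < 0:
        raise ValueError("only non-negative integers are supported in this sketch")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # final byte
            return bytes(out)


def read_uleb128(data: bytes) -> int:
    """Read the integer that `data` denotes, without judging its form."""
    result = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            if i != len(data) - 1:
                raise ValueError("trailing bytes after value")
            return result
    raise ValueError("truncated input")


def is_canonical(data: bytes) -> bool:
    """True only if `data` is exactly the bytes the encoder would emit."""
    return encode_uleb128(read_uleb128(data)) == data
```

In this sketch both `b"\x01"` and `b"\x81\x00"` read back as the integer 1, but only the first passes `is_canonical`. The padded form is a second encoding of the same value and must be rejected rather than treated as equivalent.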
Encoding authority
Encoding is the source of truth.
Encoding enforces canonical rules by construction.
If a value violates a rule:
encoding fails
no bytes are produced
Encoding never:
guesses intent
repairs data
coerces types
relaxes constraints
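The behavior above can be sketched for string payloads. This is an illustration under stated assumptions, not Strata's encoder: `EncodeError` and `encode_text_payload` are hypothetical names, and only the UTF-8 payload bytes are produced because this page does not define tags or length prefixes.

```python
class EncodeError(ValueError):
    """Hypothetical error type: the value cannot be encoded canonically."""


def encode_text_payload(value: object) -> bytes:
    """Return the UTF-8 bytes of a string, or fail without producing bytes."""
    # No coercion: a non-string is rejected, never passed through str().
    if not isinstance(value, str):
        raise EncodeError(f"expected str, got {type(value).__name__}")
    try:
        # Strict mode is the default. errors="replace" or "ignore" would
        # silently repair the data, which the rules above forbid.
        return value.encode("utf-8")
    except UnicodeEncodeError as exc:
        raise EncodeError(f"string is not encodable as UTF-8: {exc.reason}") from exc
```

Under this sketch, `encode_text_payload(42)` raises instead of emitting the bytes of `"42"`, and a string holding a lone surrogate such as `"\ud800"` raises instead of being replaced with a substitute character.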
Decoding relationship to canonicality
Decoding does not enforce canonical form.
Decoding:
reconstructs structure
exposes malformed or hostile input
preserves observed ordering where applicable
Canonicality is enforced only during encoding.
This separation is intentional.
Decoding reveals reality. Encoding enforces truth.
Canonical ordering
Where ordering is defined, it is deterministic and absolute.
Examples:
Map keys are ordered by UTF-8 byte sequence
Lists preserve explicit order
Bytes preserve exact sequence
No locale rules. No Unicode normalization. No platform influence.
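A short sketch of the map key rule, with `canonical_key_order` as a hypothetical helper name. Keys are compared as raw UTF-8 bytes, so precomposed and decomposed forms of the same glyph stay distinct keys. A sort by UTF-16 code units, which is the default string order on some platforms, is included for contrast because it places supplementary-plane characters differently.

```python
def canonical_key_order(keys):
    """Sort map keys by their UTF-8 byte sequence (hypothetical helper)."""
    return sorted(keys, key=lambda k: k.encode("utf-8"))


def utf16_code_unit_order(keys):
    """Platform-style order shown only for contrast; not canonical."""
    return sorted(keys, key=lambda k: k.encode("utf-16-be"))


keys = ["b", "\U0001f600", "é", "Z", "e\u0301", "\uff5e", "a"]

by_utf8 = canonical_key_order(keys)
by_utf16 = utf16_code_unit_order(keys)

# "é" (U+00E9) and "e\u0301" (e + combining acute) are different byte
# sequences, so they are different keys. Nothing is normalized.
assert "é".encode("utf-8") != "e\u0301".encode("utf-8")

# U+FF5E sorts before U+1F600 in UTF-8 bytes, but after it in UTF-16
# code units, because U+1F600 is stored as a surrogate pair there.
assert by_utf8.index("\uff5e") < by_utf8.index("\U0001f600")
assert by_utf16.index("\uff5e") > by_utf16.index("\U0001f600")
```

The two orders agree for most text and differ only in edge cases, which is why the comparison has to be pinned to bytes rather than left to whatever the host language does by default.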
Canonical rejection
Non-canonical input must not be silently accepted.
Examples of non-canonical states:
multiple encodings for the same integer
non-canonical map ordering
invalid UTF-8
duplicate map keys at encode time
trailing bytes after a value
Implementations must fail explicitly.
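Several of these checks can be sketched as explicit failures. The names `NonCanonicalError`, `check_map_keys`, and `check_fully_consumed` are assumptions made for this example, and the input shape (a list of raw key bytes in observed order, plus a count of consumed bytes) is a simplification rather than Strata's actual API.

```python
class NonCanonicalError(ValueError):
    """Hypothetical error type for input that violates a canonical rule."""


def check_map_keys(raw_keys: list[bytes]) -> None:
    """Require valid UTF-8 keys in strictly increasing byte order."""
    previous = None
    for raw in raw_keys:
        try:
            raw.decode("utf-8")  # strict by default: raises on invalid UTF-8
        except UnicodeDecodeError as exc:
            raise NonCanonicalError("invalid UTF-8 in map key") from exc
        if previous is not None:
            if raw == previous:
                raise NonCanonicalError("duplicate map key")
            if raw < previous:
                raise NonCanonicalError("map keys out of order")
        previous = raw


def check_fully_consumed(data: bytes, consumed: int) -> None:
    """Require that a value accounts for every byte of its input."""
    if consumed != len(data):
        extra = len(data) - consumed
        raise NonCanonicalError(f"{extra} trailing byte(s) after value")
```

Each failure raises with a specific reason. None of the checks sorts, deduplicates, or trims the input in order to continue.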
Hash canonicality
Hashes are computed over canonical bytes only.
Hash input:
includes only the canonical binary encoding
excludes transport, framing, or metadata
is identical across platforms and languages
If two systems hash different bytes for the same value, canonical rules were violated earlier.
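A minimal sketch of the hash input boundary. SHA-256 is used only because it ships with Python's standard library; this page does not name Strata's hash function. The 4-byte length prefix in `frame` stands in for any hypothetical transport framing.

```python
import hashlib


def value_digest(canonical_bytes: bytes) -> str:
    """Hash exactly the canonical bytes of a value (SHA-256 is an assumption)."""
    return hashlib.sha256(canonical_bytes).hexdigest()


def frame(canonical_bytes: bytes) -> bytes:
    """Hypothetical transport framing: 4-byte big-endian length prefix."""
    return len(canonical_bytes).to_bytes(4, "big") + canonical_bytes


canonical = b"\x01"          # placeholder for some value's canonical bytes
framed = frame(canonical)    # what might travel on the wire

# Correct: strip the framing, then hash the value bytes alone.
assert value_digest(framed[4:]) == value_digest(canonical)

# Incorrect: hashing the framed bytes yields a different digest.
assert hashlib.sha256(framed).hexdigest() != value_digest(canonical)
```

Keeping framing out of the hash input means the same value produces the same digest regardless of how it was transported.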
Frozen guarantees
Once canonical rules are finalized for a version:
they do not change
they are never weakened
they are never retroactively reinterpreted
Any change that alters canonical bytes:
requires a new version boundary
requires a new Northstar invariant
must be explicitly documented
Non-goals of canonical rules
Canonical rules do not attempt to be:
human-friendly
flexible
forward-compatible by default
schema-aware
self-describing
These concerns belong in higher layers, not in the canonical core.
Philosophy
Canonical rules exist to answer one question conclusively:
"Are these bytes correct?"
If the answer is uncertain, the system must refuse to proceed.
Summary
one value -> one encoding
encoding enforces canonical truth
decoding does not normalize
hashes depend on canonical bytes
canonical rules are frozen per version
Canonical encoding is the foundation that everything else stands on.