Decoding
Decoding is the inverse of encoding, but not its mirror.
Encoding enforces truth. Decoding reveals reality.
The JavaScript decoder takes raw Strata Core Binary (.scb) bytes and reconstructs a structured Value while preserving all semantic guarantees and rejecting malformed input explicitly.
Role of the decoder
Decoding exists to answer one question only:
"What do these bytes actually say?"
It does not:
Normalize
Repair
Guess intent
Rewrite structure
If the bytes are wrong, decoding fails. If the bytes are non-canonical, decoding still succeeds, but the structure is preserved exactly.
Entry point
The public API is:
This function:
Consumes raw bytes
Walks the byte stream sequentially
Reconstructs a
ValueEnsures the input is fully consumed
If any bytes remain after decoding, decoding fails.
Decoder model
Decoding is implemented as a stateful cursor over the byte array.
Internal state:
offset: current read positioninput: immutable byte buffer
All reads advance the cursor. All errors report the exact offset where decoding failed.
Tag dispatch
Decoding always begins by reading a single tag byte.
The tag determines:
Value type
How many bytes follow
How to interpret the payload
Unknown tags are fatal errors.
Primitive decoding
Null and booleans
Tags:
0x00 → null
0x01 → false
0x02 → true
These values have no payload.
Integer decoding
Integers are decoded as:
Tag: 0x10
Payload: SLEB128
Rules:
Decoded as
bigintMust fit signed 64-bit range
Overflow is rejected
Truncated input is rejected
Any malformed varint results in InvalidVarint.
String decoding
Strings are decoded as:
Tag: 0x20
Length: ULEB128
Payload: UTF-8 bytes
Rules:
Length is byte count
UTF-8 decoding is strict
Invalid UTF-8 is rejected
No normalization is applied
UTF-8 failure produces InvalidUtf8 at the byte offset where decoding started.
Bytes decoding
Bytes are decoded as:
Tag: 0x21
Length: ULEB128
Payload: raw bytes
Rules:
Bytes are preserved exactly
No interpretation
No transformation
List decoding
Lists are decoded as:
Tag: 0x30
Count: ULEB128
Payload:
countencoded values
Rules:
Order is preserved exactly
Count controls iteration
Nested decoding is recursive
Truncated lists fail with UnexpectedEOF.
Map decoding
Maps are decoded as:
Tag: 0x40
Count: ULEB128
Payload: key-value pairs
Rules:
Keys MUST decode to strings
Values may be any valid value
Duplicate keys are allowed
Insertion order reflects wire order
Maps are stored in a Map<string, Value>.
Duplicate keys
Decoding does not reject duplicate keys.
If a key appears multiple times:
The last value wins in the JS map
Earlier entries are overwritten
No error is raised
This is intentional and mirrors Rust behavior.
Decoding shows reality. Encoding enforces rules.
Trailing bytes check
After decoding a single root value, the decoder checks:
This prevents:
Concatenated payloads
Framing leaks
Ambiguous inputs
A valid .scb file encodes exactly one value.
Error model
All decoding failures throw DecodeError.
Error kinds:
InvalidTag
UnexpectedEOF
InvalidVarint
InvalidUtf8
TrailingBytes
Each error includes:
Exact byte offset
Structured error kind
No silent failures. No partial values.
Canonical vs non-canonical input
The decoder:
Accepts non-canonical encodings
Preserves structure
Allows re-encoding into canonical form
This enables:
Inspection
Debugging
Migration
Verification
Canonicality is enforced at encode time, not decode time.
Relationship to Northstar tests
Northstar T2 and T3 rely directly on decoder correctness.
If decoding:
Drops bytes
Reorders data
Misinterprets tags
Then wire determinism is broken.
The decoder is therefore a security boundary.
Summary
JavaScript decoding in Strata is:
Strict
Offset-aware
Non-normalizing
Structure-preserving
Failure-explicit
It does not lie. It does not guess. It tells you exactly what the bytes mean, or it refuses to speak.
Last updated
Was this helpful?