Draft. No backwards compatibility. No flag day. No migration window.
Activated at the genesis of the new final Lux network: **2025-12-25
16:20 Pacific (unix 1766708400)**. The pre-Quasar Edition Lux
network (2020–2025) is a separate network and is out of scope — its
TCP-framed luxfi/p2p + linearcodec stack supports none of the
primitives specified here.
The ZAP buffer is immutable. ZapDB is MVCC. QUIC streams are
multiplexed and independently resettable. The cryptographic transcript
is a pure function over typed offsets. Sign and Verify read those
offsets directly. Block bytes are ZAP frames containing tx ZAP frames.
These four properties — immutability of the buffer, MVCC of the store,
multiplexing of the transport, purity of the transcript — compose into
a single architectural fact: every operation on the stack can be
issued speculatively, observed by N independent consumers, and
unwound without retroactive coordination.
This LP formalizes the contract. It enumerates the five canonical
pipelines that the ZAP Stack admits, the depth at which each
saturates, and the atomic-unwind primitive that backs each mutation
boundary. It specifies the speculative-execution invariant, the QUIC
stream-isolation invariant, the cert leg degradation behavior, and
the light-client chunk re-fetch path. It defines the cross-impl test
vectors that any conformant implementation in Go, Python, Rust, C++,
or C must reproduce byte-for-byte.
Codec-based stacks cannot do this. A codec.Manager-backed pipeline
must re-marshal at every hop because the in-memory struct and the
on-disk row are different shapes; re-marshal is not pure; speculation
costs N codec round-trips, and unwind requires writing a reverse
codec for every mutation. The ZAP Stack removes the marshal step,
which removes the speculation cost, which removes the unwind cost.
The pipelining contract is the operational dividend of that removal.
This LP exists so that consumers of the stack — wallets, light
clients, bridge contracts, indexers, replay tools — know what to
expect, and so that cross-implementation conformance tests can be
written against a stable behavior surface.
Five canonical pipelines. Each is defined by a stage list, the
parallelism dimension along which it scales, and the unwind point at
which a stage failure is realized. No other pipeline shape exists in
the stack — these five cover every concurrent code path from peer
ingress to ZapDB commit.
Pipeline depth = QUIC concurrent-stream limit per peer connection
(default 64 unidirectional streams per peer). A peer can have 64 txs
in flight simultaneously, each at a different stage. The handler is a
goroutine per stream; the immutable buffer makes per-stage state
threadsafe without locks.
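
The depth bound can be sketched as follows. This is a minimal illustration, not the reference handler: `maxStreams`, `handleTx`, and `processAll` are hypothetical names, and a slice of byte buffers stands in for streams accepted from a peer. It shows one goroutine per in-flight buffer, capped at 64 by a buffered-channel semaphore, with handlers that only read the buffer.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// maxStreams mirrors the default per-peer concurrent-stream limit quoted above.
const maxStreams = 64

// handleTx stands in for the per-stream stages (hypothetical). It only reads
// buf, so any number of handlers may share a buffer without locking it.
func handleTx(buf []byte) [32]byte {
	return sha256.Sum256(buf)
}

// processAll runs one goroutine per buffer with at most maxStreams in flight.
// It returns the digests in input order and the peak concurrency observed.
func processAll(bufs [][]byte) ([][32]byte, int) {
	out := make([][32]byte, len(bufs))
	sem := make(chan struct{}, maxStreams)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex // guards the counters only, never the buffers
		inFlight int
		peak     int
	)
	for i, b := range bufs {
		sem <- struct{}{} // blocks while maxStreams handlers are running
		wg.Add(1)
		go func(i int, b []byte) {
			defer wg.Done()
			defer func() { <-sem }()
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()
			out[i] = handleTx(b) // each goroutine writes a distinct slot
			mu.Lock()
			inFlight--
			mu.Unlock()
		}(i, b)
	}
	wg.Wait()
	return out, peak
}

func main() {
	bufs := make([][]byte, 200)
	for i := range bufs {
		bufs[i] = []byte(fmt.Sprintf("tx-%d", i))
	}
	digests, peak := processAll(bufs)
	fmt.Println(len(digests), peak <= maxStreams)
}
```

The semaphore is acquired before the goroutine starts, so the 65th buffer waits in the caller rather than in a queue of parked goroutines, which is the backpressure behavior a stream limit gives.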
Pipeline depth = 2–3 blocks in flight. The proposer can begin
selecting block N+1 before block N has finished aggregating its cert
legs, because the selection touches the mempool's ranked heap and
the cert aggregation touches the consensus engine — orthogonal state.
All four legs run in parallel. Wall-clock per round = `max(legs) + fan-out overhead`. The cert wire format (LP-182 QuasarCert) is one
struct with one field per leg; an absent leg is wire-encoded as a
zero-length byte string. The relying party's cert-mode policy decides
whether the cert is acceptable at the current observed tier. That
policy is defined in LP-217 (PQ-off / PQ-fast / PQ-strict / PQ-heavy
plus the pure-PQ variants), which supersedes the historical LP-017
codenames Pulsar / Aurora / Polaris. The cert struct has the same
shape regardless of how many legs have arrived; mode policy is the
relying party's concern. See **Cert leg degradation tiers** below.
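
A minimal sketch of the fan-out, assuming hypothetical names throughout (`Cert`, `Leg`, `aggregate`, `ready`, `stalled`, `demo`); it is not the LP-182 wire struct or the consensus engine. It shows four legs running concurrently under one round deadline, with a leg that misses the deadline left as a zero-length field. For simplicity it returns once every leg has finished or timed out, whereas the text above describes the cert as observable while legs are still arriving.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Cert is a stand-in for the cert shape: one field per leg.
// An absent leg is a zero-length byte string.
type Cert struct {
	BLS, Pulsar, Corona, Magnetar []byte
}

// Leg produces one leg's aggregate, or nil if ctx ends first.
type Leg func(ctx context.Context) []byte

// aggregate runs all four legs in parallel under a single deadline, so the
// wall-clock cost is that of the slowest leg (or the timeout), not the sum.
func aggregate(timeout time.Duration, bls, pulsar, corona, magnetar Leg) Cert {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var c Cert
	var wg sync.WaitGroup
	run := func(dst *[]byte, leg Leg) {
		defer wg.Done()
		*dst = leg(ctx) // each goroutine writes a distinct field
	}
	wg.Add(4)
	go run(&c.BLS, bls)
	go run(&c.Pulsar, pulsar)
	go run(&c.Corona, corona)
	go run(&c.Magnetar, magnetar)
	wg.Wait()
	return c
}

// ready simulates a leg whose aggregate is available immediately.
func ready(sig string) Leg {
	return func(context.Context) []byte { return []byte(sig) }
}

// stalled simulates a leg that never reaches threshold before the deadline.
func stalled() Leg {
	return func(ctx context.Context) []byte {
		<-ctx.Done()
		return nil
	}
}

// demo aggregates a round in which only the Corona leg misses the deadline.
func demo() Cert {
	return aggregate(20*time.Millisecond,
		ready("bls"), ready("pulsar"), stalled(), ready("magnetar"))
}

func main() {
	c := demo()
	fmt.Println(len(c.BLS), len(c.Pulsar), len(c.Corona), len(c.Magnetar))
}
```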
All sig-verifies, schema-validates, and offset-bounds checks run in
parallel goroutines reading the same buffer. Order-dependent checks
(balance debits, nonce monotonicity) run serially over the verified
set after the parallel batch joins. A failed parallel verify
invalidates one tx in the block; an invalid block (failing
order-dependent checks) is rejected as a whole and never reaches the
state-apply phase.
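
The two phases can be sketched as below. The `Tx` fields, the `SigOK` flag (standing in for a real signature check), and the nonce rule (a sender's next nonce must equal its current counter) are assumptions made for illustration, not the executor's actual types or rules.

```go
package main

import (
	"fmt"
	"sync"
)

// Tx is a hypothetical transaction shape used only for this sketch.
type Tx struct {
	From   string
	Nonce  uint64
	Amount uint64
	SigOK  bool // stand-in for the outcome of a real signature check
}

// verifyBlock runs stateless checks in parallel, then order-dependent checks
// serially over the verified set. It returns the indices of txs dropped in the
// parallel phase and whether the block as a whole is accepted.
func verifyBlock(txs []Tx, balances, nonces map[string]uint64) (dropped []int, ok bool) {
	// Phase 1: parallel, read-only over txs; each goroutine writes one slot.
	valid := make([]bool, len(txs))
	var wg sync.WaitGroup
	for i := range txs {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			valid[i] = txs[i].SigOK
		}(i)
	}
	wg.Wait()

	// Phase 2: serial, on private copies so a rejected block leaves the
	// caller's state untouched.
	bal := make(map[string]uint64, len(balances))
	for k, v := range balances {
		bal[k] = v
	}
	non := make(map[string]uint64, len(nonces))
	for k, v := range nonces {
		non[k] = v
	}
	for i, tx := range txs {
		if !valid[i] {
			dropped = append(dropped, i) // invalidates this tx only
			continue
		}
		if tx.Nonce != non[tx.From] || bal[tx.From] < tx.Amount {
			return dropped, false // whole block rejected
		}
		non[tx.From]++
		bal[tx.From] -= tx.Amount
	}
	return dropped, true
}

func main() {
	txs := []Tx{
		{From: "a", Nonce: 0, Amount: 5, SigOK: true},
		{From: "a", Nonce: 1, Amount: 5, SigOK: false},
		{From: "a", Nonce: 1, Amount: 5, SigOK: true},
	}
	dropped, ok := verifyBlock(txs, map[string]uint64{"a": 10}, map[string]uint64{"a": 0})
	fmt.Println(dropped, ok)
}
```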
A bootstrapping node opens these five streams in parallel against
every peer it dials. Each stream is independent — the state-chunk
stream can stall on a slow peer without blocking the live-vote
subscription. Recovery is per-stream: a stalled chunk stream is
canceled and re-issued against a different peer; the other four
streams to the same peer survive.
Reference numbers from a production validator (Apple M4 Max, 16 CPU
cores, GOMAXPROCS=16). These are measured production saturation
points, not synthetic peaks.
The Tx receive number is the figure that drives backpressure: at
12.8k in-flight ZAP buffers, the working set is bounded by `avg(tx
size) × 12.8k`. For a 500 B average tx, that is 6.4 MB (about
6.1 MiB) resident, well under L3.
Every mutation boundary in the stack has an O(1) or O(diff) unwind
primitive. The unwind is atomic in the sense that the layer either
commits or it does not — there is no half-state observable by any
other layer.
| Unwind primitive | Cost |
|---|---|
| `txn.Discard()` releases the version pointer | O(1) |
| `txn.Rollback(diff)` | O(diff size) |
| `round.Abandon()` advances the round counter; in-flight legs are GC'd | O(N) over pending vote map |
| `stream.CancelWrite(errCode)` | O(0); peer observes RESET_STREAM |

The contract: any layer can call its unwind primitive from any
goroutine at any time. No layer's unwind blocks on any other layer.
No layer's unwind leaves observable state behind. The MVCC discipline
of ZapDB, the ref-counting of the ZAP buffer, the cancel semantics of
QUIC, and the abandon-round semantics of consensus compose so that a
cascading failure at one layer is GC'd at the next, without
coordination.
For each failure type: what triggers it, what unwinds at each layer,
what recovery path the stack takes.
| `txn.Rollback(diff)` | block rejected; round abandoned | none (block already received) | proposer re-builds without the tx; next round |
| `txn.Discard()`; transaction is a no-op | node halts at last committed block | live-vote stream stalls; peers reset streams after timeout | operator action; node restarts and re-syncs from peers |
| `stream.CancelWrite(STREAM_PROTOCOL_VIOLATION)` | peer reputation decrement; other streams survive |
| `sha256(block.Bytes())` |

The reorg row is not a stub: under Q-Chain finality, a block at
height H is committed to ZapDB only after the QuasarCert at height H
has satisfied its cert-profile policy. The cert is a function of the
block bytes; the block bytes are content-addressed. No alternative
chain at height H can satisfy the same cert (the BLS leg alone makes
that infeasible against a 2/3-honest validator set). There is no
codepath in the executor that accepts a competing block at a
finalized height. The pre-genesis network's reorg surface is gone
because the pre-genesis network is gone.
ZapDB's MVCC primitive admits speculative execution at O(1) snapshot
cost. The pattern:
```go
// Speculatively execute two candidate blocks while consensus decides.
// MVCC snapshots are O(1): version pointer release on the prior root.
snapA := state.NewSnapshot()
snapB := state.NewSnapshot()

var wg sync.WaitGroup
wg.Add(2)
go func() { defer wg.Done(); applyBlock(blockA, snapA) }()
go func() { defer wg.Done(); applyBlock(blockB, snapB) }()
wg.Wait()

winner := <-consensusDecision // arrives ~350 ms later (Corona leg)
if winner.ID == blockA.ID {
	snapA.Commit()
	snapB.Discard()
} else {
	snapA.Discard()
	snapB.Commit()
}
```
Cost model. Speculation cost = O(executed-tx-count) for each
losing branch. The losing snapshot's diffs are released by
Discard() at O(1); the wasted CPU is bounded by the speculative
fan-out width times the block's tx count. The win is that the
winning branch is already commit-ready when consensus returns, so
the apparent latency from consensusDecision arrival to ZapDB commit
is max(0, applyTime − consensusTime) instead of applyTime.
Equivalently, end-to-end round latency is max(applyTime,
consensusTime) instead of `applyTime + consensusTime`. On typical
workloads, where applyTime ≤ consensusTime, this hides one full
apply latency per round.
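
As a worked example of the cost model, the sketch below computes the latency left over after the decision arrives when apply was started speculatively at round start. The function name `residual` and the 40 ms and 500 ms apply times are hypothetical; the 350 ms figure is the Corona-leg timing quoted in the code comment above.

```go
package main

import (
	"fmt"
	"time"
)

// residual returns max(0, apply - consensus): the time still needed to reach
// commit after the consensus decision arrives, given that apply ran in
// parallel with consensus.
func residual(apply, consensus time.Duration) time.Duration {
	if apply > consensus {
		return apply - consensus
	}
	return 0
}

func main() {
	consensus := 350 * time.Millisecond

	// Apply finishes well before consensus: nothing left to wait for.
	fmt.Println(residual(40*time.Millisecond, consensus)) // 0s

	// Apply is slower than consensus: only the overhang remains.
	fmt.Println(residual(500*time.Millisecond, consensus)) // 150ms
}
```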
Invariant. The buffer that applyBlock reads is the same
immutable buffer that the verifier read, that the consensus engine
hashed for the transcript, that ZapDB will write. No re-marshal,
no copy. N speculative branches share N pointers to one buffer.
Bounds. Speculative fan-out width is bounded by the proposer set
size — there are at most |proposers| candidate blocks at any height.
Concrete implementations should cap speculative width at a small
constant (the reference implementation caps at 3) to bound memory.
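
To make the pattern runnable end to end, here is a toy versioned store. It is not ZapDB: `Store`, `Snapshot`, `applyBlock`, and `speculate` are hypothetical, blocks are reduced to maps of balance deltas, and `Commit` copies the whole map, which a real MVCC store would avoid. What it does preserve is the shape of the contract: both branches read the same published version, write only to private diffs, and the loser is dropped without touching shared state.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a toy versioned key-value store. The published root map is
// treated as immutable; a commit swaps in a new map.
type Store struct {
	mu   sync.Mutex
	root map[string]int
}

// Snapshot captures a version pointer plus a private diff.
type Snapshot struct {
	s    *Store
	base map[string]int // shared, read-only
	diff map[string]int // private writes
}

// NewSnapshot copies only the root pointer, not the data.
func (s *Store) NewSnapshot() *Snapshot {
	s.mu.Lock()
	defer s.mu.Unlock()
	return &Snapshot{s: s, base: s.root, diff: map[string]int{}}
}

// Get reads a key from the currently published version.
func (s *Store) Get(k string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.root[k]
}

func (sn *Snapshot) Get(k string) int {
	if v, ok := sn.diff[k]; ok {
		return v
	}
	return sn.base[k]
}

func (sn *Snapshot) Set(k string, v int) { sn.diff[k] = v }

// Commit publishes base+diff as the new root (full copy in this toy).
func (sn *Snapshot) Commit() {
	next := make(map[string]int, len(sn.base)+len(sn.diff))
	for k, v := range sn.base {
		next[k] = v
	}
	for k, v := range sn.diff {
		next[k] = v
	}
	sn.s.mu.Lock()
	sn.s.root = next
	sn.s.mu.Unlock()
}

// Discard drops the private diff; no shared state is touched.
func (sn *Snapshot) Discard() { sn.base, sn.diff = nil, nil }

// applyBlock applies a block, modeled here as per-key deltas.
func applyBlock(deltas map[string]int, sn *Snapshot) {
	for k, d := range deltas {
		sn.Set(k, sn.Get(k)+d)
	}
}

// speculate executes both candidates in parallel, then commits the winner.
func speculate(s *Store, blockA, blockB map[string]int, winnerIsA bool) {
	snapA, snapB := s.NewSnapshot(), s.NewSnapshot()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); applyBlock(blockA, snapA) }()
	go func() { defer wg.Done(); applyBlock(blockB, snapB) }()
	wg.Wait()
	if winnerIsA {
		snapA.Commit()
		snapB.Discard()
	} else {
		snapA.Discard()
		snapB.Commit()
	}
}

func main() {
	s := &Store{root: map[string]int{"x": 1}}
	speculate(s, map[string]int{"x": 10}, map[string]int{"x": 100}, false)
	fmt.Println(s.Get("x"))
}
```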
QUIC's stream model is the substrate for atomic unwind at the network
layer. The invariants:
| Primitive | Scope | Other streams |
|---|---|---|
| `stream.CancelWrite(code)` | one stream, sender → receiver | unaffected |
| `stream.StopSending(code)` | one stream, receiver → sender | unaffected |
| RESET_STREAM frame | one stream, abrupt termination | unaffected |
| STREAM_DATA_BLOCKED | one stream, flow control | unaffected |
| STOP_SENDING | one stream | unaffected |

The stream-level errors map to consumer unwind primitives:
| Stream-level error | Consumer unwind |
|---|---|
| STREAM_DATA_BLOCKED | back off; let flow-control window open |
| RESET_STREAM | drop partial buffer; if request-stream, mark peer down for this request and try another |
| STOP_SENDING | stop transmitting; treat the request as canceled by peer |
| FLOW_CONTROL_ERROR | bug; close the connection (consumer code violated the contract) |
| STREAM_LIMIT_ERROR | back off; we exceeded the peer's concurrent stream limit |

Every stream type defined in LP-201 (stream-type bytes 0xD0..0xDF)
maps its application-level errors onto these QUIC primitives. No
custom error envelope. No correlation table. No retry buffer above
the transport.
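
The mapping from stream-level errors to consumer actions is small enough to be a single switch. The sketch below uses hypothetical enum types (`StreamEvent`, `Action`) rather than any QUIC library's error types, and its default branch (close the connection on an unrecognized event) is an assumption, not something this LP specifies.

```go
package main

import "fmt"

// StreamEvent enumerates the stream-level conditions listed above.
type StreamEvent int

const (
	StreamDataBlocked StreamEvent = iota
	ResetStream
	StopSending
	FlowControlError
	StreamLimitError
)

// Action is what the consumer does in response.
type Action int

const (
	BackOff Action = iota
	DropPartial
	RetryOtherPeer // drop partial buffer, mark peer down for this request
	StopTransmitting
	CloseConnection
)

// consumerAction maps a stream-level event to the consumer's unwind step.
// isRequest reports whether the stream carried an outstanding request.
func consumerAction(ev StreamEvent, isRequest bool) Action {
	switch ev {
	case StreamDataBlocked, StreamLimitError:
		return BackOff
	case ResetStream:
		if isRequest {
			return RetryOtherPeer
		}
		return DropPartial
	case StopSending:
		return StopTransmitting
	case FlowControlError:
		return CloseConnection
	default:
		// Assumption: treat anything unrecognized as a contract violation.
		return CloseConnection
	}
}

func main() {
	fmt.Println(consumerAction(ResetStream, true) == RetryOtherPeer)
	fmt.Println(consumerAction(StreamLimitError, false) == BackOff)
}
```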
QuasarCert's leg composition is defined in LP-182 (wire format) and
its operator-facing posture is CertPolicy (LP-217 §"Operator
config" — the single record that pins Mode / Variant /
TimeoutMs / Fallback). LP-017 (historical) is the v1 codename
source that LP-217 supersedes. This LP specifies the temporal
behavior of the cert as legs arrive — the static policy is in LP-217;
the timing is here.
The round deadline this section refers to is the chain's
CertPolicy.TimeoutMs. The tier this section refers to when "the
cert degrades to a lower mode" is the chain's CertPolicy.Fallback.
Both fields are read from the chain's genesis quasar.cert_policy
block. There is no standalone cert_timeout_ms knob; the timeout is
a field of CertPolicy, validated at chain launch (LP-217
§"Validation rules" rule 3: `TimeoutMs >= 2 × expected_floor_latency(Mode)`).
Reference timeline for a single round at a 40-validator network
(mapped to LP-217 modes; v1 codename in parentheses):
```
t = 0 ms      Round R starts; proposer broadcasts block hash
t = ~10 ms    BLS aggregate complete → cert observable at PQ-off (was BLS-only / Pulsar-profile preliminary)
t = ~15 ms    Pulsar aggregate complete → cert observable at PQ-fast (was Pulsar profile)
t = ~350 ms   Corona R2 complete → cert observable at PQ-strict (was Aurora profile)
t = ~400 ms   Magnetar SLH-DSA complete → cert observable at PQ-heavy (was Polaris profile)
```
The block is unchanged across these events; the cert wire format is
unchanged; only the cert's posture relative to the LP-217 cert-mode
policy upgrades. A relying party at PQ-fast finalizes at t≈15 ms;
one at PQ-strict finalizes at t≈350 ms; one at PQ-heavy finalizes
at t≈400 ms. All three are looking at the same block, the same cert
struct, the same chain.
Leg failure does not abandon the round. If, say, the Corona leg
fails to gather sufficient sigs by its deadline (insufficient
threshold participation), the cert is published with an empty
Corona field. Relying parties whose mode requires Corona (PQ-strict,
PQ-heavy, strict-PQ-strict, strict-PQ-heavy) wait or fall back
to the previous height; relying parties at PQ-fast finalize
normally. Consensus liveness is preserved by the surviving legs. This
is the operational meaning of "degradation tier" — the cert never
blocks on the slowest primitive.
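
A relying party's tier check can be sketched as a function of which leg fields are non-empty. The sketch assumes the modes are cumulative in the order the reference timeline suggests (PQ-off needs BLS, PQ-fast adds Pulsar, PQ-strict adds Corona, PQ-heavy adds Magnetar); the authoritative per-mode leg sets are in LP-217, and the pure-PQ variants are omitted. `Cert`, `Mode`, `observedMode`, and `acceptable` are hypothetical names.

```go
package main

import "fmt"

// Cert is a stand-in for the cert shape: one field per leg,
// zero-length when the leg is absent.
type Cert struct {
	BLS, Pulsar, Corona, Magnetar []byte
}

// Mode is ordered from weakest to strongest posture.
type Mode int

const (
	ModeNone Mode = iota
	PQOff
	PQFast
	PQStrict
	PQHeavy
)

// observedMode returns the highest mode whose legs are all present, under
// the cumulative-ordering assumption stated above. A gap (for example an
// empty Corona field) caps the mode even if a later leg has arrived.
func observedMode(c Cert) Mode {
	m := ModeNone
	for _, leg := range [][]byte{c.BLS, c.Pulsar, c.Corona, c.Magnetar} {
		if len(leg) == 0 {
			break
		}
		m++
	}
	return m
}

// acceptable reports whether the cert satisfies the relying party's mode.
func acceptable(c Cert, required Mode) bool {
	return observedMode(c) >= required
}

func main() {
	// Corona failed to reach threshold; the other three legs arrived.
	c := Cert{BLS: []byte{1}, Pulsar: []byte{1}, Magnetar: []byte{1}}
	fmt.Println(acceptable(c, PQFast), acceptable(c, PQStrict))
}
```

The same struct value is evaluated by every relying party; only the `required` argument differs, which is the sense in which mode policy is the relying party's concern.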
If a light client requests a block but the serving peer disconnects
mid-stream, the recovery path is content-addressable, not
peer-dependent.
1. Partial buffer is GC'd. No allocations leak; the ZAP buffer
is ref-counted and a partial buffer with no live readers is
reclaimed at the next GC cycle. There is no codec partial-state
to clean up because there is no codec.
2. Re-fetch via Kademlia DHT keyed by sha256(block.Bytes()).
LP-201 Layer C provides the DHT lookup; the block's identity is
its sha256, which is the same hash the proposer signed and the
cert covers.
3. Multiple peers serve the same content. DHT replication factor
k=20 (LP-201 standard parameters) means up to 20 peers can be
sources for the same block hash.
4. Chunked parallel pull. A light client can pull 64-KiB chunks
in parallel from k different peers. Each chunk is sha256-verified
against a Merkle commitment in the block header before insertion
into the receive buffer. Out-of-order chunk arrival is fine; the
buffer is reassembled by offset.
5. Final buffer is byte-identical to what the original peer
would have served, regardless of chunk arrival order. This is a
conformance requirement — see Cross-impl test vectors.
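
The reassembly steps above can be sketched as follows. One simplification is labeled loudly: the Merkle commitment is modeled as a flat list of per-chunk sha256 hashes, where a real implementation would verify each chunk against a Merkle path to the root in the block header. `Chunk`, `split`, `leafHashes`, and `reassemble` are hypothetical names, and DHT lookup and peer selection are out of scope.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

const chunkSize = 64 * 1024 // 64 KiB

// Chunk is one piece of a block as served by some peer.
type Chunk struct {
	Index int
	Data  []byte
}

// split cuts a block into chunkSize pieces; the last may be shorter.
func split(block []byte) []Chunk {
	var out []Chunk
	for i := 0; i*chunkSize < len(block); i++ {
		end := (i + 1) * chunkSize
		if end > len(block) {
			end = len(block)
		}
		out = append(out, Chunk{Index: i, Data: block[i*chunkSize : end]})
	}
	return out
}

// leafHashes is the simplified commitment: one sha256 per chunk.
func leafHashes(block []byte) [][32]byte {
	var out [][32]byte
	for _, c := range split(block) {
		out = append(out, sha256.Sum256(c.Data))
	}
	return out
}

// reassemble places verified chunks by offset. Duplicates, out-of-range
// indices, and chunks whose hash does not match are ignored. It reports
// whether every chunk was received.
func reassemble(total int, leaves [][32]byte, arrivals []Chunk) ([]byte, bool) {
	buf := make([]byte, total)
	have := make([]bool, len(leaves))
	missing := len(leaves)
	for _, c := range arrivals {
		if c.Index < 0 || c.Index >= len(leaves) || have[c.Index] {
			continue
		}
		off := c.Index * chunkSize
		if off+len(c.Data) > total || sha256.Sum256(c.Data) != leaves[c.Index] {
			continue // bogus chunk from an adversarial or faulty peer
		}
		copy(buf[off:], c.Data)
		have[c.Index] = true
		missing--
	}
	return buf, missing == 0
}

func main() {
	block := make([]byte, 3*chunkSize+100)
	for i := range block {
		block[i] = byte(i % 251)
	}
	ch := split(block)
	arrivals := []Chunk{ch[3], {Index: 1, Data: []byte("bogus")}, ch[1], ch[1], ch[0], ch[2]}
	got, ok := reassemble(len(block), leafHashes(block), arrivals)
	fmt.Println(ok, bytes.Equal(got, block))
}
```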
A conformant implementation in Go, Python, Rust, C++, or C must
reproduce the following four behaviors byte-identically:
1. MVCC unwind. Given a fixed sequence of `(insert, snapshot,
insert, discard)` operations against a ZapDB instance with a
pinned random seed for compaction ordering, the post-discard
state root is byte-identical across implementations.
2. Cert leg degradation determinism. Given a fixed schedule of
per-leg arrival timestamps `(t_BLS, t_Pulsar, t_Corona,
t_Magnetar)`, the sequence of cert-tier transitions produced by
the relying-party policy at each timestamp is identical across
implementations.
3. QUIC stream reset semantics. Given a fixed sequence of
(open_stream, write_n_bytes, CancelWrite(code)) operations,
the receiver-observed event sequence (`bytes_received,
RESET_STREAM(code)`) is identical across implementations. This
tests that the application code above the transport reacts the
same way regardless of host QUIC library (quic-go vs quiche vs
msquic vs picoquic).
4. DHT chunk re-fetch. Given a block split into N chunks of 64 KiB
with a published Merkle commitment, and a fixed adversarial
chunk-arrival schedule (out-of-order, with duplicates, with
bogus chunks injected by adversarial peers), the final
reassembled buffer is byte-identical to the original block bytes
across implementations.
Test vector format: JSONL files in
~/work/lux/consensus/test/vectors/pipeline/. Each line is one
test case with input fields and expected output. CI runs the
conformance harness against every implementation tagged for v1.
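
A harness might load such a file along the following lines. The field names `name`, `input`, and `expected` are assumptions for illustration; this LP states only that each line carries input fields and an expected output. Keeping `input` and `expected` as raw JSON lets each test family decode its own shape.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Vector is one JSONL test case. Field names are hypothetical.
type Vector struct {
	Name     string          `json:"name"`
	Input    json.RawMessage `json:"input"`
	Expected json.RawMessage `json:"expected"`
}

// parseVectors reads one JSON object per line, skipping blank lines.
func parseVectors(r io.Reader) ([]Vector, error) {
	var out []Vector
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20) // allow lines up to 1 MiB
	line := 0
	for sc.Scan() {
		line++
		text := strings.TrimSpace(sc.Text())
		if text == "" {
			continue
		}
		var v Vector
		if err := json.Unmarshal([]byte(text), &v); err != nil {
			return nil, fmt.Errorf("line %d: %w", line, err)
		}
		out = append(out, v)
	}
	return out, sc.Err()
}

// parseVectorsString is a convenience wrapper for in-memory input.
func parseVectorsString(s string) ([]Vector, error) {
	return parseVectors(strings.NewReader(s))
}

func main() {
	sample := `{"name":"mvcc-unwind-1","input":["insert","snapshot","insert","discard"],"expected":"00ff"}
{"name":"stream-reset-1","input":{"write_n_bytes":128,"code":7},"expected":["bytes_received","RESET_STREAM"]}
`
	vs, err := parseVectorsString(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, v := range vs {
		fmt.Println(v.Name, string(v.Expected))
	}
}
```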
Pipelining vs single-codec serial. Numbers from luxd
microbenchmarks against the legacy linearcodec-based stack; the
serial-baseline column is the legacy implementation, not a worst
case.
The consensus row deserves a note: the 14 s serial-aggregate
baseline is the legacy Corona serial implementation, which was
itself a bug per the LP-200 measurement; the honest comparison
point against a correctly-implemented serial-aggregate consensus
would be smaller. The pipelining win that this LP claims is the
parallelism gate, not the Corona serial-vs-parallel gate. The
speculative-execution row is the load-bearing one: speculation is
only operationally usable when the per-branch cost is
pointer-share, not codec round-trip.
this whole contract possible)
PQ-off / PQ-fast / PQ-strict / PQ-heavy — whose tiered observability
this LP times)
v1 codenames Pulsar / Aurora / Polaris remain as internal identifiers
per LP-217 §"Mode-to-internal mapping")
tiered modes this LP times)
this LP relies on for network-layer unwind)