Draft. No backwards compatibility. No flag day.
Activated at the genesis of the new final Lux network: **2025-12-25
16:20 Pacific (unix 1766708400)**. The pre-Quasar Edition Lux
network (2020–2025) is round-blocking by construction and is a separate
network out of scope for this LP.
Quasar today is round-blocking: round N must finalize before round N+1
starts. The proposer cannot begin selecting candidate transactions for
N+1 until N's QuasarCert has aggregated. For a PQ-heavy-mode chain
(LP-217; was "Polaris" in LP-017) with cert wall-clock ~400 ms, this
caps single-leader throughput at
2.5 blocks/sec regardless of how fast the underlying stages are.
Pipelined Quasar removes this cap. Three blocks are in flight
simultaneously: block N is in the Sign stage while N+1 is in the Build
stage while N+2 is in the Select stage. Each stage runs against an
orthogonal subsystem — Select touches mempool, Build touches ZAP
encoder + Block-STM speculator, Sign touches the cert legs (LP-217
mode) — so they execute in parallel without cross-stage locks. Per-block
wall-clock collapses from sum(stages) to max(stages).
The depth bound is set by network RTT and consensus-round
independence: depth D requires D consecutive different leaders so that
a single byzantine proposer cannot stall the pipeline by holding all
D blocks. The reference implementation pipelines at depth 3, which
matches the LP-202 block-production stage list (Select / Build / Sign)
and saturates against a single-leader-per-block round-robin rotation.
The wire format is unchanged. The QuasarCert struct (LP-182) is the
same shape per block; the cert-mode (LP-217) is the same per chain;
the round-digest binding (LP-077) is the same per height. Pipelined
Quasar is a protocol-orchestration change: it specifies WHEN a stage
runs relative to other in-flight blocks, not what it produces.
This LP composes with the DAG mempool (LP-208 + LP-209) without
conflict: pipelined Quasar specifies the leader's pipeline; DAG
mempool specifies where the txs come from. A single-leader pipelined
Quasar consuming from a leaderless DAG mempool is the operational
shape of the billion-user sub-millisecond path together with LP-202
(atomic unwind), LP-203 (GPU verify), and LP-217 (PQ-off mode for the
HFT critical path).
Three concrete bottlenecks of round-blocking Quasar at billion-user
scale:
1. Per-block latency dominates throughput. A PQ-heavy-mode chain
(LP-217; was "Polaris" in LP-017) at ~400 ms cert wall-clock is
throughput-capped at 2.5 blocks/sec. If
each block holds 1000 txs, that is 2,500 tps single-leader
regardless of how fast the matching engine, block-STM executor, or
GPU verify is. The bottleneck is not capacity — it is the serial
per-block wait on the cert.
2. Stage utilization is poor. During the Sign stage of block N,
the mempool, the Block-STM speculator, and the ZAP encoder all sit
idle. The cert-leg threshold signing is not bottlenecked on those
subsystems. Round-blocking wastes ~80% of single-validator wall
clock.
3. Leader rotation cost is paid per block. A new leader takes the
ZAP buffer's first byte for its block at round-start. Round-blocking
means the leader switch happens N times for N blocks; pipelining
means the leader-rotation happens at N consecutive heights in
parallel and the switch cost is amortized over the in-flight depth.
Pipelining is the standard HotStuff/PBFT throughput pattern (Buterin,
Castro-Liskov, the HotStuff-1B paper) applied to Quasar's specific
4-leg cert composition. The novelty here is not the pipeline
technique — it is the composition with LP-202's atomic-unwind
primitives, LP-203's GPU sign pipeline, and LP-217's mode-tier
degradation, so that pipeline failures unwind cleanly without manual
coordination.
Three stages, one stage per in-flight block. Each stage is defined by
its trigger, its work, the subsystem it touches, and its independence
properties relative to the other two stages.
The stage list maps 1:1 onto LP-202's "Block production pipeline"
(3-stage). LP-202 specifies the per-block pipeline; this LP specifies
the multi-block pipeline composed of multiple per-block pipelines
running concurrently.
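
The resulting schedule can be sketched as follows. This is a minimal illustration, assuming every stage takes exactly one tick (real stage times differ); `stageAt` and the tick model are hypothetical and not part of LP-202 or the reference implementation.

```go
package main

import "fmt"

// Stage names follow the Select / Build / Sign stage list.
var stages = []string{"Select", "Build", "Sign"}

// stageAt reports which stage block `height` occupies at `tick`, under the
// idealized assumption that block h enters Select at tick h and every stage
// lasts exactly one tick. ok is false when the block is not in flight.
func stageAt(height, tick int) (stage string, ok bool) {
	i := tick - height
	if i < 0 || i >= len(stages) {
		return "", false
	}
	return stages[i], true
}

func main() {
	// Print the in-flight set for the first few ticks.
	for tick := 0; tick < 5; tick++ {
		fmt.Printf("tick %d:", tick)
		for h := 0; h <= tick; h++ {
			if s, ok := stageAt(h, tick); ok {
				fmt.Printf(" block %d=%s", h, s)
			}
		}
		fmt.Println()
	}
}
```

From tick 2 onward three blocks are in flight at once, one per stage: block 0 in Sign, block 1 in Build, block 2 in Select.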
Pipeline depth is the number of consecutive blocks in flight at any
instant. Depth = 3 is the reference.
Reference cap = 3. Depth 3 saturates the per-block stage list. A
fourth in-flight block would either repeat a stage (two Select stages
operating on different blocks) or operate ahead of leader rotation.
The first case is bounded by stage independence; the second is
bounded by the leader-rotation contract.
Depth limit. The hard depth bound is:

    depth_max = min(
        cert_round_independence,       // how many heights have non-overlapping cert deps
        leader_rotation_independence,  // how many heights have different leaders
        mempool_speculation_width      // how many speculative branches the mempool can pin
    )

For Lux validator sets with round-robin leader rotation, each of these
is at least N (validator count). For VRF-weighted random rotation, the
minimum is min(N, expected_consecutive_distinct_leaders). Depth 3 is
safe against both rotation schemes for any N ≥ 4.
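
As a sketch, the bound is simply the smallest of the three limits. The struct, its field names, and the example values below are hypothetical; how each limit is actually measured is outside this illustration.

```go
package main

import "fmt"

// DepthBounds holds the three independence limits named in the depth_max
// formula. The type and field names are illustrative only.
type DepthBounds struct {
	CertRoundIndependence      int // heights with non-overlapping cert deps
	LeaderRotationIndependence int // consecutive heights with different leaders
	MempoolSpeculationWidth    int // speculative branches the mempool can pin
}

// Max returns the hard depth bound: the minimum of the three limits.
func (b DepthBounds) Max() int {
	m := b.CertRoundIndependence
	if b.LeaderRotationIndependence < m {
		m = b.LeaderRotationIndependence
	}
	if b.MempoolSpeculationWidth < m {
		m = b.MempoolSpeculationWidth
	}
	return m
}

func main() {
	// Made-up values for illustration, not measurements.
	b := DepthBounds{
		CertRoundIndependence:      64,
		LeaderRotationIndependence: 5,
		MempoolSpeculationWidth:    8,
	}
	fmt.Println("depth_max =", b.Max())
}
```

With these example values the leader-rotation limit binds, so the reference depth of 3 fits under the bound.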
Pipelined Quasar requires that every in-flight block has a different
leader. Otherwise a single byzantine leader can stall the pipeline by
withholding its block at multiple in-flight heights.
Round-robin rotation. Leader at height H is
validators[H mod N]. Pipeline depth 3 requires that
validators[H mod N], validators[(H+1) mod N], and
validators[(H+2) mod N] be distinct, which holds for any N ≥ 3.
VRF-weighted random rotation. Leader at height H is sampled from
the stake-weighted validator set via VRF over the previous block's
randomness beacon. Pipeline depth 3 requires three consecutive distinct
samples; on stake-weighted draws this holds with probability
1 − (stake_top / total_stake)^2 per step. For a sane stake
distribution (top validator ≤ 1/3 of total) this is at least
1 − 1/9 ≈ 88.9% per height.
When the rotation produces a non-distinct leader at depth, the
pipeline shortens: the second instance of that leader's block waits
for its predecessor to complete Sign before entering its own Build
stage. Shortened-depth pipelining is throughput-degraded but
liveness-preserving.
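
The distinct-leader requirement and the shortening rule can be sketched as a window check. This is a simplification under stated assumptions: `effectiveDepth` is a hypothetical helper that only counts how many consecutive heights have distinct leaders; it does not model the wait on the predecessor's Sign stage, and the leader function is passed in so either rotation scheme can be plugged in.

```go
package main

import "fmt"

// roundRobinLeader returns the validator index leading height h under
// round-robin rotation: validators[h mod n].
func roundRobinLeader(h, n uint64) uint64 { return h % n }

// effectiveDepth returns how many consecutive heights starting at h can be
// in flight together. It stops at the first height whose leader already
// owns an in-flight block, and never exceeds maxDepth.
func effectiveDepth(h uint64, maxDepth int, leaderAt func(uint64) uint64) int {
	seen := make(map[uint64]bool, maxDepth)
	for d := 0; d < maxDepth; d++ {
		l := leaderAt(h + uint64(d))
		if seen[l] {
			return d // collision: the pipeline shortens to d
		}
		seen[l] = true
	}
	return maxDepth
}

func main() {
	for _, n := range []uint64{4, 2} {
		n := n // capture the per-iteration value
		leaderAt := func(h uint64) uint64 { return roundRobinLeader(h, n) }
		fmt.Printf("N=%d: effective depth at height 10 = %d\n",
			n, effectiveDepth(10, 3, leaderAt))
	}
}
```

With four validators the three leaders are distinct and the full depth is available; with only two validators the third height repeats a leader, so the window shortens to 2.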
Leader failure. If the in-flight leader fails to propose
(crash, partition, byzantine withhold), the next height's leader takes
over via the standard Quasar timeout path (LP-017 §Round-timeout —
historical; LP-217 inherits this timeout contract unchanged). The
pipeline shortens by 1 for that span. The atomic-unwind primitive
(LP-202 round.Abandon()) discards the in-flight Select/Build state
for the failed height; the cert-leg aggregators GC their partial
state. No cross-height coordination needed.
round.Abandon() discards build state | Quorum override at LP-202 lower tier; next leader at H+1 proposes |
cert_timeout_ms | Cert published at LP-217 lower tier; block finalizes at the highest mode whose required-leg set is satisfied | LP-202 tier-degradation contract; no pipeline action needed |

The failure-mode contract: any single-block failure unwinds via LP-202
primitives and does not propagate to other in-flight blocks. A
partition-class failure freezes the whole pipeline at its current
stage configuration; healing restarts from the frozen position
without re-doing the completed stages.
The throughput model for pipelined Quasar at depth D:

    throughput = D / sum(stage_times)    (round-blocking baseline if D = 1)
               = 1 / max(stage_times)    (steady state at D = number_of_stages = 3)

Reference numbers from the LP-217 mode table (Blackwell, N=64
validators) crossed with the LP-202 pipeline-depth saturation:
Numbers above the PQ-strict row are GPU-saturated (LP-203 measured);
PQ-strict and PQ-heavy are projected based on LP-203's Pulsar +
Corona + Magnetar bench rows extrapolated to depth 3. The Select+Build
~3 ms estimate is the Block-STM ordering pass (LP-010 measured at
~1.2 ms for a 1000-tx block on M4 Max) plus the ZAP encoder
(measured at ~150 µs per block of 1000 txs) plus the mempool sample
pass (~1 ms heap traversal): ~2.35 ms in total, rounded up to ~3 ms.
At PQ-off mode with the projected ~1 ms cert wall-clock and ~3 ms of
Select+Build prep, round-blocking yields 1 / (1 ms + 3 ms) = 250
blocks/sec, while depth-3 pipelining yields 1 / max(stages) = 1 / 3 ms
≈ 333 blocks/sec. Pipelining wins by ~33% in this mode; the size of the
win depends on the ratio of cert-time to non-cert-time. At PQ-heavy
mode with 80 ms cert and 3 ms prep, depth-3 pipelining yields
1 / 80 ms = 12.5 blocks/sec versus round-blocking 1 / 83 ms ≈ 12.0
blocks/sec, a gain of under 4%, because cert dominates.
Sweet spot. Pipelining wins are largest when stage times are
balanced. The LP-217 PQ-fast mode is the sweet spot: cert
wall-clock ~5 ms versus prep ~3 ms gives depth-3 throughput
1 / 5 ms = 200 blocks/sec vs round-blocking `1 / 8 ms = 125
blocks/sec` — a 60% win.
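
The arithmetic in this section can be reproduced with a short sketch. It follows the section's own simplification of treating Select+Build as a single ~3 ms prep term next to the cert term; the function names are hypothetical, and the cert times are the projected figures quoted above, not measurements.

```go
package main

import "fmt"

// roundBlocking returns blocks/sec when a block's stages run back to back
// and the next block waits for the cert: 1 / sum(stage times in ms).
func roundBlocking(stagesMs ...float64) float64 {
	sum := 0.0
	for _, s := range stagesMs {
		sum += s
	}
	return 1000 / sum
}

// pipelined returns steady-state blocks/sec with one block per stage in
// flight: 1 / max(stage times in ms).
func pipelined(stagesMs ...float64) float64 {
	longest := 0.0
	for _, s := range stagesMs {
		if s > longest {
			longest = s
		}
	}
	return 1000 / longest
}

func main() {
	const prepMs = 3.0 // Select+Build estimate used in this section
	modes := []struct {
		name   string
		certMs float64
	}{{"PQ-off", 1}, {"PQ-fast", 5}, {"PQ-heavy", 80}}
	for _, m := range modes {
		rb := roundBlocking(prepMs, m.certMs)
		pl := pipelined(prepMs, m.certMs)
		fmt.Printf("%-8s round-blocking %6.1f  pipelined %6.1f  gain %4.1f%%\n",
			m.name, rb, pl, (pl/rb-1)*100)
	}
}
```

The gain is largest for PQ-fast, where the two terms are closest in size, and nearly vanishes for PQ-heavy, where the cert term dominates both the sum and the max.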
Beyond depth 3. Depth 4+ does not gain throughput because there
are only 3 stages. Additional depth would require either pipelining
across heights at the cert level (out of scope; would be a HotStuff-2B
4-phase pipeline) or restructuring the per-block stage list. Neither
is in scope for this LP.
Projected. All numbers in this section are projected from the LP-202
+ LP-203 + LP-217 measured rows. Measured pipelined-Quasar throughput
will land in the LP-203 bench addendum once the pipelined-Quasar
implementation tag ships.
LP-202 specifies the per-block atomic-unwind primitives. Pipelined
Quasar uses every one of them:
txn.Discard() on Block-STM transaction |
stream.CancelWrite(STREAM_PROTOCOL_VIOLATION) on the partial block frame |

The composition is the load-bearing claim. Pipelined Quasar is
not a new consensus protocol — it is LP-202's per-block pipeline
contract instantiated at N=3 with leader rotation interleaving the
heights. The unwind primitives at each stage are unchanged. The
QuasarCert struct is unchanged. The relying-party policy (LP-217 mode)
is unchanged. Only the orchestration — which stage runs against which
block at which wall-clock — is new.
This LP introduces no new wire types. The block frame is the LP-186
block-VM frame; the cert is the LP-182 QuasarCert struct; the
round-digest is LP-077. Schema IDs 0xD0..0xDF (LP-201) and the
schema IDs in LP-200 are unaffected.
The pipeline-depth knob is a runtime config setting:

    quasar:
      pipeline_depth: 3
      # default 3; minimum 1 (round-blocking); maximum bounded by
      # leader-rotation-independence (see "Leader rotation" above)

The knob is per-chain at genesis and is upgradable via the chain's
governance path (analogous to the LP-217 mode upgrade). Validators in a
chain must agree on the pipeline depth; mixed-depth validators within
a single chain would race on stage entry. The depth field is
broadcast in the LP-022 Handshake message under a new optional
field (no schema bump — backwards-tolerant field at the end of the
handshake payload).
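
A config-load check for the knob might look like the sketch below. `QuasarConfig`, `Validate`, and the `rotationIndependence` parameter are hypothetical names; the sketch assumes the chain's leader-rotation independence has already been computed elsewhere and is passed in as a plain integer.

```go
package main

import "fmt"

// QuasarConfig mirrors the quasar.pipeline_depth knob. The struct is
// illustrative and not taken from the reference implementation.
type QuasarConfig struct {
	PipelineDepth int
}

const (
	minDepth   = 1 // round-blocking
	stageCount = 3 // Select / Build / Sign
)

// Validate rejects depths outside [1, 3] and depths that exceed the
// chain's leader-rotation independence.
func (c QuasarConfig) Validate(rotationIndependence int) error {
	if c.PipelineDepth < minDepth || c.PipelineDepth > stageCount {
		return fmt.Errorf("pipeline_depth %d outside [%d, %d]",
			c.PipelineDepth, minDepth, stageCount)
	}
	if c.PipelineDepth > rotationIndependence {
		return fmt.Errorf("pipeline_depth %d exceeds leader-rotation independence %d",
			c.PipelineDepth, rotationIndependence)
	}
	return nil
}

func main() {
	for _, d := range []int{1, 3, 4} {
		err := QuasarConfig{PipelineDepth: d}.Validate(64)
		fmt.Printf("depth %d: %v\n", d, err)
	}
}
```

Failing at config load rather than at runtime keeps a misconfigured validator from racing on stage entry against peers running a different depth.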

    activates: 2025-12-25T16:20:00-08:00
    activates-unix: 1766708400

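
The two activation values can be cross-checked against each other with the Go standard library. This is a minimal sketch; `activationUnix` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// activationUnix parses an RFC 3339 timestamp and returns its Unix time
// in seconds.
func activationUnix(ts string) (int64, error) {
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

func main() {
	const declared = int64(1766708400) // the activates-unix value
	got, err := activationUnix("2025-12-25T16:20:00-08:00")
	if err != nil {
		panic(err)
	}
	fmt.Println("parsed:", got, "matches declared:", got == declared)
}
```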
Pipelined Quasar is the default at activation. Round-blocking
(pipeline_depth: 1) remains a valid config for chains that
explicitly disable pipelining — typically PQ-heavy chains where the
cert dominates the wall-clock and pipelining gains are negligible.
The pipeline orchestrator lives under
~/work/lux/consensus/protocol/quasar/pipeline/:

- pipeline.go: Pipeline struct; 3-stage state machine; one goroutine per in-flight block; stage entry gated on the predecessor's stage exit.
- stages.go: Select, Build, Sign stage interfaces; each stage is a `func(context.Context, *Block) (*Block, error)`.
- rotation.go: leader-rotation independence checker; rejects depth > rotation-independence at config load.
- metrics.go: per-stage wall-clock histograms exposed via Prometheus.

Stage implementations are the existing Quasar code:

- ~/work/lux/consensus/protocol/quasar/select.go (mempool sample + Block-STM speculate)
- ~/work/lux/consensus/protocol/quasar/build.go (ZAP encoder + round-digest binder)
- ~/work/lux/consensus/protocol/quasar/sign.go (cert-leg ceremony per LP-217 mode)

The orchestrator composes the existing implementations; no per-stage
code is rewritten for this LP.
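
The orchestration pattern described above (one goroutine per in-flight block, stage entry gated on the predecessor's stage exit) can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: `Block`, `Run`, `named`, and `runDemo` are hypothetical, the stages are toy functions, and a failed block here simply skips its remaining stages instead of invoking the LP-202 unwind primitives.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Block is a stand-in for the real block type; Trace exists only for this demo.
type Block struct {
	Height int
	Trace  []string
}

// Stage uses the stage signature described for stages.go.
type Stage func(context.Context, *Block) (*Block, error)

// named returns a toy stage that records its name on the block.
func named(name string) Stage {
	return func(_ context.Context, b *Block) (*Block, error) {
		b.Trace = append(b.Trace, name)
		return b, nil
	}
}

// Run drives heights [0, n) through stages with at most depth blocks in
// flight. Block h enters stage s only after block h-1 has exited stage s.
// It returns the blocks and the first per-block error, if any.
func Run(ctx context.Context, stages []Stage, n, depth int) ([]*Block, error) {
	if depth < 1 {
		return nil, errors.New("depth must be at least 1")
	}
	// exited[s][h] is closed when block h leaves stage s.
	exited := make([][]chan struct{}, len(stages))
	for s := range exited {
		exited[s] = make([]chan struct{}, n)
		for h := range exited[s] {
			exited[s][h] = make(chan struct{})
		}
	}
	out := make([]*Block, n)
	errs := make([]error, n)
	slots := make(chan struct{}, depth) // bounds the in-flight set
	var wg sync.WaitGroup
	for h := 0; h < n; h++ {
		slots <- struct{}{} // blocks while depth blocks are already in flight
		wg.Add(1)
		go func(h int) {
			defer wg.Done()
			defer func() { <-slots }()
			b := &Block{Height: h}
			for s, stage := range stages {
				if h > 0 {
					<-exited[s][h-1] // gate on the predecessor's stage exit
				}
				if errs[h] == nil {
					b, errs[h] = stage(ctx, b)
				}
				close(exited[s][h]) // always release the successor, even on failure
			}
			out[h] = b
		}(h)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return out, err
		}
	}
	return out, nil
}

// runDemo wires three toy stages and runs n blocks at the given depth.
func runDemo(n, depth int) ([]*Block, error) {
	stages := []Stage{named("Select"), named("Build"), named("Sign")}
	return Run(context.Background(), stages, n, depth)
}

func main() {
	blocks, err := runDemo(5, 3)
	if err != nil {
		panic(err)
	}
	for _, b := range blocks {
		fmt.Println(b.Height, b.Trace)
	}
}
```

Closing the gate channel even when a stage fails means a failed block never stalls its successors, which mirrors the requirement that other in-flight blocks continue.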
A conformant implementation MUST:
1. Accept quasar.pipeline_depth ∈ {1, 2, 3} in config; reject
higher values that exceed the chain's leader-rotation independence.
2. At depth 3, produce one valid block per max(stage_times) wall-clock
in steady state, with 3 blocks in flight.
3. On any in-flight stage failure, unwind via the LP-202 primitive
listed in §"Composition with LP-202"; other in-flight blocks
continue.
4. On leader-rotation collision (non-distinct leader at depth),
shorten the pipeline transparently without violating liveness.
5. Produce a QuasarCert at every height whose mode satisfies
cert_mode configured in LP-217.
Conformance test vectors land at
~/work/lux/consensus/test/vectors/pipeline/quasar-pipelined.jsonl.
PQ-heavy); this LP pipelines the block production around the
cert-leg ceremony at whichever mode the chain is configured to.
Pulsar / Aurora / Polaris are now internal identifiers superseded by
LP-217 cert modes).
top of the LP-020 round structure.
stage.
finality commits.
legs reset without disturbing other blocks.
composes at N=3.
Blackwell.
knob.
from a DAG mempool (LP-208 changes WHERE txs come from, not how
blocks are signed).
parameterized by the chosen cert mode.
with this LP — DAG mempool replaces the Select stage's mempool with
a Narwhal-style DAG.
with this LP — Mysticeti's ordered output replaces the Select
stage's tx-ordering substep.
txs in the Select stage of block N+2 while N is being signed;
commit on Sign of N+2.
shard interlocks with the LP-205 pipeline at peer shards via 2PC
over the cert layer.
Copyright and related rights waived via CC0.