LP-3621: Stage Sync Pipeline for Coreth

Status: Review
Track: EVM & Execution
Category: Core
Created: 2025-07-24

Study and prototype a staged sync architecture, inspired by Erigon, to accelerate and optimize the C-Chain initial sync in Lux's geth fork (coreth).

Abstract

This LP proposes a feasibility study and benchmarking plan to refactor our Lux-forked geth client (coreth) into a staged pipeline—modeled after Erigon’s Stage Sync—to boost initial sync performance, reduce resource spikes, and improve observability. By decoupling header fetch, body fetch, transaction execution, trie updates, and index building into discrete stages, we expect significant efficiency gains.

Motivation

Our current bootstrapping approach tightly couples all work—network download, verification, execution, and indexing—into a single loop. This monolithic design leads to:

  • Long cold-start sync times (days for full nodes)
  • Bursty spikes in CPU and disk I/O
  • Limited insight into per-phase performance bottlenecks

Erigon demonstrated that a staged architecture yields faster, steadier syncs and better resource utilization by organizing work into independent stages with lightweight handoff buffers.

Specification

We will conduct an iterative study comprising:

1. Prototype Stage Loop Framework

  • Introduce a stages/ package hosting a StageLoop runner
  • Define a Stage interface with methods Run(*Context) error and Ingest(data) (sketched after this list)
  • Implement ring-buffer exchanges for header and body I/O
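
As a concrete starting point, a minimal sketch of these pieces is shown below. The Context fields are illustrative, and a fixed-capacity channel stands in for the ring-buffer exchange; none of these names should be read as final API.

package stages

// Context carries the shared state handed to every stage (DB handles,
// progress markers, metrics); the concrete fields are illustrative.
type Context struct {
    // ...
}

// Stage is one phase of the pipeline, as outlined above.
type Stage interface {
    // Run advances the stage as far as its buffered input allows.
    Run(ctx *Context) error
    // Ingest hands the stage a batch of data from the upstream stage.
    Ingest(data interface{})
}

// Exchange is the bounded handoff buffer between two stages; a
// fixed-capacity channel approximates the ring-buffer semantics.
type Exchange[T any] struct {
    ch chan T
}

func NewExchange[T any](capacity int) *Exchange[T] {
    return &Exchange[T]{ch: make(chan T, capacity)}
}

// Put blocks while the buffer is full, which is what gives the pipeline
// natural backpressure; Take blocks until an item is available.
func (e *Exchange[T]) Put(item T) { e.ch <- item }
func (e *Exchange[T]) Take() T    { return <-e.ch }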

2. Implement Core Stages

| Stage Name    | Responsibility                                             |
|---------------|------------------------------------------------------------|
| Headers       | Download, verify, and persist block headers                |
| BlockHashes   | Build inverted hash→height index                           |
| Bodies        | Download, verify, and persist block bodies & transactions  |
| Senders       | Batch ECDSA sender recovery                                |
| Execution     | Execute EVM transactions, record change-sets & receipts    |
| HashedState   | Convert change-sets into keccak'd state entries            |
| TrieUpdate    | Incremental Merkle trie updates and state-root computation |
| IndexBuilders | Build Log, Receipt, and TxLookup indices                   |
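
To make one row of the table concrete, below is a hedged sketch of batch sender recovery. It borrows upstream go-ethereum types purely for illustration; SendersStage, its fields, and Recover are assumed names, not code from the prototype.

package stages

import (
    "math/big"
    "sync"

    "github.com/ethereum/go-ethereum/common"
    "github.com/ethereum/go-ethereum/core/types"
)

// SendersStage recovers the ECDSA sender of every transaction in a block,
// so the Execution stage never pays for recovery on its hot path.
type SendersStage struct {
    chainID *big.Int
    workers int
}

// Recover resolves senders for one block's transactions concurrently.
func (s *SendersStage) Recover(txs types.Transactions) ([]common.Address, error) {
    signer := types.LatestSignerForChainID(s.chainID)
    out := make([]common.Address, len(txs))
    errs := make([]error, len(txs))

    var wg sync.WaitGroup
    sem := make(chan struct{}, s.workers) // cap concurrent recoveries
    for i, tx := range txs {
        wg.Add(1)
        sem <- struct{}{}
        go func(i int, tx *types.Transaction) {
            defer wg.Done()
            defer func() { <-sem }()
            out[i], errs[i] = types.Sender(signer, tx)
        }(i, tx)
    }
    wg.Wait()

    for _, err := range errs {
        if err != nil {
            return nil, err
        }
    }
    return out, nil
}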

3. Benchmark and Profiling Plan

| Metric               | Definition                          |
|----------------------|-------------------------------------|
| Total Sync Time      | Time to reach tip from genesis      |
| Max Memory Usage     | Peak RSS during sync                |
| Disk I/O Throughput  | Read/write MB/s per stage           |
| CPU Utilization      | % CPU per stage                     |
| DB Operation Latency | Average latency for table writes    |
| Stage Latency        | Time spent in each Stage.Run method |

Collect metrics via Prometheus/Grafana and pprof, comparing against the legacy monolithic sync.
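
One plausible way to wire the per-stage numbers into Prometheus, assuming the standard client_golang library; the StageMetrics shape and metric names below are placeholders rather than the prototype's API.

package stages

import (
    "time"

    "github.com/prometheus/client_golang/prometheus"
)

// StageMetrics records how long each Stage.Run invocation takes, labelled
// by stage name, so per-stage latency shows up directly in Grafana.
type StageMetrics struct {
    runSeconds *prometheus.HistogramVec
}

func NewStageMetrics(reg prometheus.Registerer) *StageMetrics {
    m := &StageMetrics{
        runSeconds: prometheus.NewHistogramVec(prometheus.HistogramOpts{
            Namespace: "coreth",
            Subsystem: "staged_sync",
            Name:      "stage_run_seconds",
            Help:      "Wall-clock time spent in each Stage.Run call.",
            Buckets:   prometheus.ExponentialBuckets(0.001, 2, 15),
        }, []string{"stage"}),
    }
    reg.MustRegister(m.runSeconds)
    return m
}

// ObserveRun wraps a stage execution and records its duration.
func (m *StageMetrics) ObserveRun(stage string, run func() error) error {
    start := time.Now()
    err := run()
    m.runSeconds.WithLabelValues(stage).Observe(time.Since(start).Seconds())
    return err
}

Registering the legacy sync's timers against the same registry keeps the before/after comparison on a single dashboard.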

4. Data-Driven Iteration

  • Identify hot stages and optimize batch sizes or concurrency (a config sketch follows this list)
  • Tune ring-buffer sizes to balance throughput vs. memory
  • Introduce backpressure for network vs. disk workloads
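
The knobs that iteration would turn can be gathered in one config struct. The sketch below is illustrative only, and the default values are starting points for benchmarking, not measured optima.

package stages

// StageConfig gathers the tuning knobs referenced above. ExchangeCap is
// the ring-buffer capacity between stages and also the backpressure
// limit: a full exchange blocks the producing stage.
type StageConfig struct {
    HeaderBatch   int    // headers requested per network round-trip
    BodyBatch     int    // bodies requested per network round-trip
    SenderWorkers int    // concurrent ECDSA recoveries in the Senders stage
    ExecBatchGas  uint64 // gas-weighted batch size for the Execution stage
    ExchangeCap   int    // capacity of each inter-stage exchange
}

// DefaultStageConfig returns conservative starting values for benchmarking.
func DefaultStageConfig() StageConfig {
    return StageConfig{
        HeaderBatch:   192,
        BodyBatch:     128,
        SenderWorkers: 8,
        ExecBatchGas:  30_000_000,
        ExchangeCap:   1024,
    }
}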

Rationale

Erigon’s Stage Sync is a proven blueprint for achieving faster, more stable bootstrapping. A staged pipeline will improve developer and validator UX, reduce infrastructure costs, and provide the observability needed to prioritize further optimizations.

Backwards Compatibility

The staged sync is fully additive. Nodes started without the --staged-sync flag use the legacy path, preserving all existing behavior.
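
Purely as an illustration of the gate, assuming a plain boolean flag; the actual coreth configuration plumbing will differ, and runStagedSync/runLegacySync are placeholders.

package main

import (
    "flag"
    "log"
)

func main() {
    // --staged-sync opts a node into the new pipeline; the default
    // remains the legacy monolithic sync path.
    stagedSync := flag.Bool("staged-sync", false, "bootstrap with the staged sync pipeline")
    flag.Parse()

    if *stagedSync {
        log.Println("bootstrapping via the staged sync pipeline")
        // runStagedSync(...): new, opt-in path
    } else {
        log.Println("bootstrapping via the legacy sync loop")
        // runLegacySync(...): existing behavior, unchanged
    }
}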

Implementation

Staged Sync Pipeline Architecture

Location: ~/work/lux/node/evm/stages/
GitHub: github.com/luxfi/node/tree/main/evm/stages

Core Stage Components:

Stage Loop Driver:

// From driver.go
package stages

import "context"

// StageLoop drives the pipeline: it runs each registered stage in order
// and repeats the pass until the context is cancelled.
type StageLoop struct {
    stages  []Stage
    buffers map[StageID]chan interface{} // bounded handoff exchanges between stages
    metrics *StageMetrics
}

func (sl *StageLoop) Run(ctx context.Context) error {
    for {
        // One full pass over the pipeline, in stage order.
        for _, stage := range sl.stages {
            if err := stage.Run(ctx, sl); err != nil {
                return err
            }
        }

        // Yield between passes and stop once the sync is cancelled.
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
        }
    }
}
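
For orientation, a sketch of how a bootstrap routine might wire and drive the loop; runExample and the way stages are supplied are placeholders, not the prototype's actual API.

package stages

import (
    "context"
    "log"
)

// runExample wires the loop with an already-constructed stage list and
// runs it until the supplied context is cancelled.
func runExample(ctx context.Context, pipeline []Stage) {
    loop := &StageLoop{
        stages:  pipeline,
        buffers: make(map[StageID]chan interface{}),
    }

    if err := loop.Run(ctx); err != nil && err != context.Canceled {
        log.Printf("staged sync aborted: %v", err)
    }
}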

Testing:

cd ~/work/lux/node
go test ./evm/stages/... -v -bench=. -benchmem

# Sync benchmark with metrics
cd ~/work/lux/node
go test -run TestStagedSync -bench=BenchmarkStagedSync -benchmem ./evm/stages/...
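
A self-contained sketch of the benchmark's shape follows; fakeStage stands in for real stages, and the actual benchmark would construct a StageLoop over a pre-generated block fixture instead.

package stages

import (
    "context"
    "testing"
)

// fakeStage stands in for a real pipeline stage and burns a fixed amount
// of CPU, so the benchmark isolates loop and handoff overhead.
type fakeStage struct{ sink uint64 }

func (s *fakeStage) run(ctx context.Context) error {
    for i := 0; i < 1_000; i++ {
        s.sink += uint64(i)
    }
    return ctx.Err()
}

// BenchmarkStagedSync (sketch): one pass over a fake pipeline per iteration.
func BenchmarkStagedSync(b *testing.B) {
    pipeline := []*fakeStage{{}, {}, {}}
    ctx := context.Background()

    b.ReportAllocs()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        for _, st := range pipeline {
            if err := st.run(ctx); err != nil {
                b.Fatal(err)
            }
        }
    }
}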

Performance Metrics

Per-stage latency benchmarks (measured on reference hardware):

| Stage       | Latency/Block | Throughput      |
|-------------|---------------|-----------------|
| Headers     | ~500 µs       | 2,000 headers/s |
| Bodies      | ~1,500 µs     | 667 bodies/s    |
| Senders     | ~800 µs       | 1,250 batches/s |
| Execution   | ~8,000 µs     | 125 blocks/s    |
| HashedState | ~2,000 µs     | 500 blocks/s    |
| TrieUpdate  | ~3,000 µs     | 333 blocks/s    |
| Indexes     | ~1,000 µs     | 1,000 blocks/s  |

Test Cases

  1. Baseline Comparison: sync full history on fresh disk with/without staged-sync; compare all metrics
  2. Stage Resilience: inject header/body fetch failures; ensure recovery (a sketch follows this list)
  3. Resource Profiles: capture pprof and per-stage Prometheus metrics
  4. Backpressure: verify stage coordination and ring-buffer overflow handling
  5. Concurrent Stages: test parallel execution where applicable (headers + bodies)
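
For test case 2, a hedged sketch of the failure-injection pattern; flakyStage and the retry loop are illustrative helpers and deliberately avoid depending on the prototype's exact Stage interface.

package stages

import (
    "context"
    "errors"
    "testing"
)

// flakyStage fails its first `failures` runs before delegating to the
// wrapped run function, simulating dropped header or body fetches.
type flakyStage struct {
    run      func(ctx context.Context) error
    failures int
}

func (f *flakyStage) Run(ctx context.Context) error {
    if f.failures > 0 {
        f.failures--
        return errors.New("injected fetch failure")
    }
    return f.run(ctx)
}

func TestHeaderFetchRecovery(t *testing.T) {
    stage := &flakyStage{
        run:      func(context.Context) error { return nil }, // stands in for the real Headers stage
        failures: 2,
    }

    // The loop (or its caller) is expected to retry a failed stage rather
    // than abort the sync; three attempts are enough to recover here.
    var err error
    for attempt := 0; attempt < 3; attempt++ {
        if err = stage.Run(context.Background()); err == nil {
            break
        }
    }
    if err != nil {
        t.Fatalf("stage did not recover after retries: %v", err)
    }
}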

Reference Implementation

See prototype code under node/evm/stages/ in the Lux repo:

node/evm/stages/
├─ driver.go
├─ headers.go
├─ bodies.go
├─ execution.go
├─ state.go
├─ indexes.go
└─ stages_test.go

Security Considerations

  • Bound stage exchanges to prevent unlimited memory consumption
  • Perform RLP and signature verification within each stage before persisting data
  • Enforce gas and time limits on index-building loops to avoid DoS

Economic Impact

Faster sync lowers hardware requirements and validator operational costs, broadening decentralization.

Open Questions

  1. Optimal ring-buffer sizes per stage?
  2. Safe concurrency models for trie updates?
  3. Stage-loop scheduling: sequential vs concurrent?

Copyright

Copyright and related rights waived via CC0.