Q1: What happens from when a new block arrives to when it’s persisted on disk?
Overview
```
Block arrives (network / Engine API)
  │
  ▼
InsertChain()
  ├─ Contiguity check (sequential block numbers? parent hashes match?)
  ├─ Acquire write lock (chainmu)
  │
  │ Three things start concurrently:
  │ ├─ Background: ECDSA signature recovery (most CPU-intensive)
  │ ├─ Background: engine.VerifyHeaders() (consensus rule checks)
  │ └─ Foreground: process blocks one by one ↓
  │
  ▼
ProcessBlock() (for each block)
  ├─ ① Create StateDB from parent's state root
  ├─ ② processor.Process() — execute all transactions (Chapter 6)
  ├─ ③ validator.ValidateState() — verify gas, receipt root, state root
  └─ ④ Write to disk ↓
  │
  ▼
writeBlockAndSetHead()
  ├─ writeBlockWithState() — persist block + state
  ├─ reorg() (if needed) — switch canonical chain
  ├─ writeHeadBlock() — update head pointers
  └─ Emit events (ChainEvent, ChainHeadEvent)
```
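The contiguity check at the top of InsertChain() rejects out-of-order batches before any heavy work starts. A minimal sketch of the idea, assuming `chain` is the `[]*types.Block` argument (illustrative, not geth's exact code):

```go
// Sanity check: every block must directly extend its predecessor.
for i := 1; i < len(chain); i++ {
	prev, cur := chain[i-1], chain[i]
	if cur.NumberU64() != prev.NumberU64()+1 || cur.ParentHash() != prev.Hash() {
		return 0, fmt.Errorf("non-contiguous insert: block %d does not extend block %d",
			cur.NumberU64(), prev.NumberU64())
	}
}
```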
Parallelization design

Three things happen concurrently to maximize throughput:
Signature recovery runs in background goroutines, performing ECDSA recovery on all transactions. This is the most CPU-intensive part of validation — starting early means results are ready by the time execution needs them.
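A hedged sketch of the idea: types.MakeSigner and types.Sender are real go-ethereum APIs (geth actually routes this through an internal sender cacher), and the goroutine wrapper here is illustrative:

```go
// Warm the per-transaction sender cache in the background.
signer := types.MakeSigner(bc.chainConfig, block.Number(), block.Time())
go func() {
	for _, tx := range block.Transactions() {
		// ECDSA recovery; the result is memoized on the tx, so the
		// later lookup during execution is a cheap cache hit.
		types.Sender(signer, tx)
	}
}()
```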
Header verification also runs in parallel. The consensus engine checks each header’s fields (timestamp, gas limit, base fee, difficulty…). Results arrive via channel to the main loop.
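The interface is channel-based. VerifyHeaders() hands back an abort channel plus a results channel (that shape matches geth's consensus.Engine); the consuming loop below is a simplified illustration:

```go
abort, results := bc.engine.VerifyHeaders(bc, headers)
defer close(abort) // cancel outstanding verifications if we bail out early

for i := range headers {
	// Read block i's verdict just before executing block i, so later
	// headers keep verifying in the background during execution.
	if err := <-results; err != nil {
		return i, err
	}
	// ... execute block i ...
}
```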
Block execution must be sequential — each block’s state depends on the previous block’s result, so it cannot be parallelized.
Three layers of persistence
writeBlockWithState() writes across three layers:
```
Layer 1: Block data (atomic batch)
  ├─ rawdb.WriteBlock()     — header + body
  ├─ rawdb.WriteReceipts()  — receipts
  └─ rawdb.WritePreimages() — preimage mappings
  → batch.Write() (all succeed or all fail)

Layer 2: State commit
  └─ statedb.Commit() — dirty state → trie database

Layer 3: Trie GC
  ├─ Archive node: flush to disk every block
  └─ Full node: keep last 128 blocks' tries in memory, older ones eligible for GC
```
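Layer 1's all-or-nothing behavior comes from the database batch: every helper stages its writes into the same batch, and a single Write() flushes them. A condensed sketch using the real rawdb helpers (error handling trimmed to the essential):

```go
batch := bc.db.NewBatch()
rawdb.WriteBlock(batch, block)                                        // header + body
rawdb.WriteReceipts(batch, block.Hash(), block.NumberU64(), receipts) // receipts
rawdb.WritePreimages(batch, statedb.Preimages())                      // preimage mappings
if err := batch.Write(); err != nil {
	log.Crit("Failed to write block into disk", "err", err) // unrecoverable
}
```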
Updating the chain head

writeHeadBlock() atomically writes five markers:
```go
batch := bc.db.NewBatch()
rawdb.WriteHeadHeaderHash(batch, block.Hash())     // head header
rawdb.WriteHeadFastBlockHash(batch, block.Hash())  // snap sync head
rawdb.WriteCanonicalHash(batch, block.Hash(), num) // block number → hash mapping
rawdb.WriteTxLookupEntriesByBlock(batch, block)    // tx hash → block number
rawdb.WriteHeadBlockHash(batch, block.Hash())      // head block
batch.Write()

// Then update the in-memory atomic pointer
bc.currentBlock.Store(block.Header())
```

Because the pointers are atomic.Pointer, readers (RPC, tx pool, etc.) immediately see the new head without waiting for locks.
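For illustration, the read side is a single atomic load (CurrentBlock() is a real BlockChain accessor; body simplified):

```go
// CurrentBlock returns the head of the canonical chain without locking.
func (bc *BlockChain) CurrentBlock() *types.Header {
	return bc.currentBlock.Load()
}
```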
Engine API’s two-step path
The Engine API uses a slightly different path:
```
NewPayload        → InsertBlockWithoutSetHead() — validate + store, don't move head
ForkchoiceUpdated → SetCanonical()              — move head to specified block
```

Why two steps? Because the CL may ask you to validate multiple blocks before telling you which one is canonical. Block validation and chain head selection are decoupled.
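Sketched in code, the decoupling looks roughly like this (the BlockChain method names are geth's, but the handler bodies are heavily simplified and the exact signatures vary between versions):

```go
// engine_newPayload: validate and persist, but leave the head alone.
func (api *ConsensusAPI) newPayload(block *types.Block) error {
	return api.eth.BlockChain().InsertBlockWithoutSetHead(block)
}

// engine_forkchoiceUpdated: the CL names the canonical head explicitly.
func (api *ConsensusAPI) forkchoiceUpdated(head *types.Block) error {
	_, err := api.eth.BlockChain().SetCanonical(head)
	return err
}
```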
Q2: How does chain reorganization (reorg) work?
When does a reorg happen?
When a new block’s parent is not the current chain head, a reorg is needed. For example:
Current canonical chain: 1 → 2 → 3 → A4 → A5 (current head)
New block B5 arrives, parent is B4, B4's parent is 3:
```
             ┌→ A4 → A5  (old head)
1 → 2 → 3 ──┤
             └→ B4 → B5  (new head)
```

Geth needs to switch the canonical chain from fork A to fork B.
Algorithm: finding the common ancestor
```
Step 1: Bring both chains to the same height
  Old chain: A5 (height 5)
  New chain: B5 (height 5)
  → Already equal, skip

Step 2: Walk both back simultaneously until the hashes match
  A5 vs B5 → different, continue
  A4 vs B4 → different, continue
  3  vs 3  → match! Common ancestor = block 3

Result:
  Old chain only: [A4, A5]
  New chain only: [B4, B5]
```

If the two chains have different heights, the longer one is walked back to match the shorter one first, then both are walked back in lockstep.
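A minimal sketch of that search, assuming a hypothetical parent() helper that loads a header's parent (the real reorg() interleaves this walk with collecting the affected blocks, transactions, and logs):

```go
func findCommonAncestor(oldHead, newHead *types.Header) *types.Header {
	// Step 1: bring both chains to the same height.
	for oldHead.Number.Uint64() > newHead.Number.Uint64() {
		oldHead = parent(oldHead)
	}
	for newHead.Number.Uint64() > oldHead.Number.Uint64() {
		newHead = parent(newHead)
	}
	// Step 2: walk both back in lockstep until the hashes match.
	for oldHead.Hash() != newHead.Hash() {
		oldHead = parent(oldHead)
		newHead = parent(newHead)
	}
	return oldHead // the common ancestor
}
```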
The switch process
After finding the common ancestor, five steps execute the switch:
1. Collect logs from old chain blocks → send RemovedLogsEvent (notify subscribers these logs are no longer valid)
2. Collect transaction hashes from old chain blocks → deletion list
3. Iterate new chain blocks in forward order, for each:
   ├─ writeHeadBlock() — update canonical hash mapping and tx lookup index
   └─ Collect reborn logs → send to logsFeed
4. Clean up tx index: Deleted txs = old chain txs - new chain txs (some txs may exist in both chains — don't accidentally delete those)
5. Clear tx lookup LRU cache (may hold stale data)

Concrete example
```
Old chain: block A4 contains tx1, tx2; block A5 contains tx3
New chain: block B4 contains tx1, tx4; block B5 contains tx5

Old chain txs: {tx1, tx2, tx3}
New chain txs: {tx1, tx4, tx5}

Need to delete index for: {tx2, tx3} (old-chain-only)
tx1 exists in both chains → keep
tx4, tx5 are new → create index
```
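Applied to this example, step 4's set difference boils down to the following (hedged sketch; rawdb.DeleteTxLookupEntry is the real helper, the surrounding bookkeeping is simplified):

```go
deletedTxs := make(map[common.Hash]struct{})
for _, tx := range oldChainTxs { // tx1, tx2, tx3
	deletedTxs[tx.Hash()] = struct{}{}
}
for _, tx := range newChainTxs { // tx1, tx4, tx5
	delete(deletedTxs, tx.Hash()) // tx1 appears in both chains → keep its index
}
for hash := range deletedTxs { // tx2, tx3 remain
	rawdb.DeleteTxLookupEntry(batch, hash)
}
```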
Interaction with the transaction pool

After reorg completes, ChainHeadEvent is sent. The transaction pool receives it and executes Reset() (covered in Chapter 8):
```
TxPool Reset:
  1. Find common ancestor (similar logic to reorg)
  2. Transactions from old-chain-only blocks → re-inject into pool
  3. promoteExecutables()  → promote executable transactions
  4. demoteUnexecutables() → demote invalid transactions
```

So tx2 and tx3 from A4/A5 don't vanish — they return to the transaction pool, waiting to be re-included in future blocks.
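A hedged sketch of step 2, the reinjection (pool.add and the slice names are hypothetical stand-ins for the legacy pool's internal reset helpers):

```go
var reinject types.Transactions
included := make(map[common.Hash]struct{})
for _, b := range newChainOnlyBlocks { // blocks applied by the reorg
	for _, tx := range b.Transactions() {
		included[tx.Hash()] = struct{}{}
	}
}
for _, b := range oldChainOnlyBlocks { // blocks unwound by the reorg
	for _, tx := range b.Transactions() {
		if _, ok := included[tx.Hash()]; !ok {
			reinject = append(reinject, tx) // e.g. tx2, tx3
		}
	}
}
pool.add(reinject) // hypothetical: back into the pending/queued sets
```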
Large reorg warning
Reorgs deeper than 63 blocks trigger log.Warn("Large chain reorg detected") to alert operators. Under normal conditions, reorgs are only 1-2 blocks deep; excessive depth usually indicates network issues or consensus failures.
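The check in geth's reorg() looks roughly like this (condensed from core/blockchain.go; some log fields omitted):

```go
logFn, msg := log.Info, "Chain reorg detected"
if len(oldChain) > 63 {
	logFn, msg = log.Warn, "Large chain reorg detected"
}
logFn(msg, "number", commonBlock.Number(), "hash", commonBlock.Hash(),
	"drop", len(oldChain), "add", len(newChain))
```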