Geth(10) The Blockchain

The previous chapters showed how blocks are built (Chapter 09) and how transactions are executed (Chapter 06). This chapter covers what happens after a block is ready: how geth inserts it into the chain, decides which fork is canonical, handles reorganizations, and persists everything to disk. The central type is BlockChain in core/blockchain.go — the orchestrator that ties together the consensus engine, the state database, the trie layer, and the on-disk storage.


The Block Insertion Pipeline

When a new block arrives — whether from the network, the consensus layer’s NewPayload, or local block building — it travels through this pipeline:

InsertChain(blocks)
        |
        v
1. Sanity checks ──────────── contiguous? linked? chain not stopped?
        |
        v
2. Parallel header verify ── engine.VerifyHeaders() runs concurrently
        |
        v
3. Skip known blocks ──────── already in DB and behind current head?
        |
        v
4. For each new block:
        |
        +-- ProcessBlock()
        |       |
        |       +-- a. Create StateDB from parent root
        |       +-- b. Prefetch state in background goroutine
        |       +-- c. processor.Process() ────── execute all transactions
        |       +-- d. validator.ValidateState() ── check roots, gas, bloom
        |       |
        |       +-- e. writeBlockAndSetHead()
        |               |
        |               +-- writeBlockWithState() ── persist block + state
        |               +-- reorg() (if needed) ──── switch canonical chain
        |               +-- writeHeadBlock() ─────── update head pointers
        |               +-- emit events ──────────── ChainEvent, logs, ChainHeadEvent
        |
        v
5. Fire accumulated ChainHeadEvent

Each stage is covered in detail below.


The BlockChain Struct

The BlockChain struct is the central manager for the canonical chain. It holds references to every subsystem that touches block storage and validation:

core/blockchain.go
```go
type BlockChain struct {
	chainConfig *params.ChainConfig // Chain & network configuration
	cfg         *BlockChainConfig   // Blockchain configuration
	db          ethdb.Database      // Low level persistent database
	snaps       *snapshot.Tree      // Snapshot tree for fast trie leaf access

	triegc        *prque.Prque[int64, common.Hash] // Priority queue mapping block numbers to tries to gc
	gcproc        time.Duration                    // Accumulates canonical block processing for trie dumping
	lastWrite     uint64                           // Last block when the state was flushed
	flushInterval atomic.Int64                     // Time interval after which to flush a state

	triedb    *triedb.Database // Trie node database handler
	statedb   *state.CachingDB // State database with caching
	txIndexer *txIndexer       // Transaction indexer (optional)

	hc            *HeaderChain
	rmLogsFeed    event.Feed
	chainFeed     event.Feed
	chainHeadFeed event.Feed
	logsFeed      event.Feed
	blockProcFeed event.Feed
	// ...
	genesisBlock *types.Block

	chainmu *syncx.ClosableMutex // Synchronizes chain write operations

	currentBlock      atomic.Pointer[types.Header] // Current head of the chain
	currentSnapBlock  atomic.Pointer[types.Header] // Current head of snap-sync
	currentFinalBlock atomic.Pointer[types.Header] // Latest (consensus) finalized block
	currentSafeBlock  atomic.Pointer[types.Header] // Latest (consensus) safe block

	bodyCache     *lru.Cache[common.Hash, *types.Body]
	bodyRLPCache  *lru.Cache[common.Hash, rlp.RawValue]
	receiptsCache *lru.Cache[common.Hash, []*types.Receipt]
	blockCache    *lru.Cache[common.Hash, *types.Block]
	txLookupCache *lru.Cache[common.Hash, txLookup]

	stopping      atomic.Bool // true when chain is stopped
	procInterrupt atomic.Bool // interrupt signaler for block processing

	engine     consensus.Engine
	validator  Validator
	prefetcher Prefetcher
	processor  Processor
	// ...
}
```

The fields group into several roles:

Role        | Key Fields
------------|-----------------------------------------------------------------
Persistence | db (key-value store), triedb (trie nodes), statedb (state cache)
Chain heads | currentBlock, currentSnapBlock, currentFinalBlock, currentSafeBlock — all atomic.Pointer[types.Header]
Concurrency | chainmu (write lock), stopping / procInterrupt (shutdown signals)
Caches      | LRU caches for bodies, receipts, blocks, tx lookups
Events      | chainFeed, chainHeadFeed, logsFeed, rmLogsFeed — pub-sub feeds
Validation  | engine (consensus), validator (block/state), processor (tx execution)
GC          | triegc (priority queue), gcproc, lastWrite, flushInterval

The four head pointers deserve special attention. They are all atomic.Pointer so readers can access them lock-free:

  • currentBlock — the latest fully-validated block. This is the canonical chain tip.
  • currentSnapBlock — the latest block whose state was downloaded during snap sync (may be ahead of currentBlock during sync).
  • currentFinalBlock — the latest block marked as finalized by the consensus layer. Once finalized, this block and its ancestors will never be reverted.
  • currentSafeBlock — the latest block the consensus layer considers safe (very unlikely to be reverted, but not yet finalized).
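To make the lock-free access pattern concrete, here is a minimal self-contained sketch (not geth code: Header and Chain are toy stand-ins for types.Header and BlockChain) of how an atomic.Pointer head marker behaves:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Header is a stand-in for types.Header, with just enough fields to
// illustrate the lock-free head-pointer pattern used by BlockChain.
type Header struct {
	Number uint64
	Hash   string
}

// Chain mirrors two of the four head markers: each is an atomic.Pointer,
// so readers never need the chain write lock.
type Chain struct {
	currentBlock      atomic.Pointer[Header]
	currentFinalBlock atomic.Pointer[Header]
}

// CurrentBlock is a lock-free read, like core/blockchain_reader.go.
func (c *Chain) CurrentBlock() *Header { return c.currentBlock.Load() }

// writeHeadBlock swaps the pointer in one atomic store; concurrent
// readers observe either the old head or the new one, never a torn value.
func (c *Chain) writeHeadBlock(h *Header) { c.currentBlock.Store(h) }

func main() {
	var c Chain
	c.writeHeadBlock(&Header{Number: 1, Hash: "0xaa"})
	c.writeHeadBlock(&Header{Number: 2, Hash: "0xbb"})
	fmt.Println(c.CurrentBlock().Number) // prints 2
}
```

A goroutine racing with writeHeadBlock simply sees whichever pointer was stored last; this is why geth's reader methods can skip chainmu entirely.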

Initialization: NewBlockChain

The constructor wires together every component:

core/blockchain.go
```go
func NewBlockChain(db ethdb.Database, genesis *Genesis, engine consensus.Engine,
	cfg *BlockChainConfig) (*BlockChain, error) {
	// ...
	// 1. Open trie database
	triedb := triedb.NewDatabase(db, cfg.triedbConfig(enableVerkle))

	// 2. Write or verify genesis block
	chainConfig, genesisHash, compatErr, err := SetupGenesisBlockWithOverride(
		db, triedb, genesis, cfg.Overrides)

	// 3. Allocate the BlockChain struct with LRU caches
	bc := &BlockChain{
		chainConfig:   chainConfig,
		db:            db,
		triedb:        triedb,
		triegc:        prque.New[int64, common.Hash](nil),
		chainmu:       syncx.NewClosableMutex(),
		bodyCache:     lru.NewCache[common.Hash, *types.Body](bodyCacheLimit),
		receiptsCache: lru.NewCache[common.Hash, []*types.Receipt](receiptsCacheLimit),
		blockCache:    lru.NewCache[common.Hash, *types.Block](blockCacheLimit),
		txLookupCache: lru.NewCache[common.Hash, txLookup](txLookupCacheLimit),
		engine:        engine,
		// ...
	}

	// 4. Create the header chain
	bc.hc, _ = NewHeaderChain(db, chainConfig, engine, bc.insertStopped)

	// 5. Set up state database, validator, prefetcher, processor
	bc.statedb = state.NewDatabase(bc.triedb, nil)
	bc.validator = NewBlockValidator(chainConfig, bc)
	bc.prefetcher = newStatePrefetcher(chainConfig, bc.hc)
	bc.processor = NewStateProcessor(bc.hc)

	// 6. Restore chain state from disk
	bc.loadLastState()

	// 7. Recover if head state is missing
	// 8. Set up snapshots
	// 9. Start optional services (tx indexer, state size tracker)
	// ...
}
```

Walking through the key steps:

  • Step 1 opens the trie database with the configured scheme (hash-based or path-based — see Chapter 05).
  • Step 2 calls SetupGenesisBlockWithOverride(), which writes the genesis block if the database is empty or verifies it matches the stored one.
  • Steps 3–5 create the struct and wire in the header chain, state database, validator, and processor. The HeaderChain (covered later in this chapter) manages header storage and is embedded within BlockChain.
  • Step 6 calls loadLastState() which reads the persisted head markers from the database and restores currentBlock, currentSnapBlock, currentFinalBlock, and currentSafeBlock.
  • Step 7 handles the case where the head block’s state is missing (e.g., after a crash). It rewinds the chain to a block whose state is available.

InsertChain: The Public Entry Point

InsertChain() is the main entry point for adding blocks to the chain. Both the downloader (during sync) and the Engine API (for new payloads) ultimately call this:

core/blockchain.go
```go
func (bc *BlockChain) InsertChain(chain types.Blocks) (int, error) {
	if len(chain) == 0 {
		return 0, nil
	}
	// Verify the chain is contiguous and properly linked
	for i := 1; i < len(chain); i++ {
		block, prev := chain[i], chain[i-1]
		if block.NumberU64() != prev.NumberU64()+1 || block.ParentHash() != prev.Hash() {
			return 0, fmt.Errorf("non contiguous insert: ...")
		}
	}
	// Acquire the chain write lock
	if !bc.chainmu.TryLock() {
		return 0, errChainStopped
	}
	defer bc.chainmu.Unlock()

	_, n, err := bc.insertChain(chain, true, false)
	return n, err
}
```

The method does two things before delegating to the internal insertChain():

  1. Contiguity check — every block must be the direct child of the previous one (consecutive numbers, matching parent hash).
  2. Lock acquisition — chainmu.TryLock() ensures only one goroutine writes to the chain at a time. If the chain is stopped, TryLock() returns false immediately rather than blocking forever.

The setHead parameter, passed here as true, tells insertChain to update the canonical head. There is also InsertBlockWithoutSetHead(), which passes false — the Engine API uses it to insert a block without changing the canonical tip, letting SetCanonical() do that separately.
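The contiguity rule can be distilled into a standalone function. This is an illustrative sketch over a toy Block type, not geth's actual types:

```go
package main

import "fmt"

// Block is a minimal stand-in for types.Block with the two fields the
// contiguity check in InsertChain actually inspects.
type Block struct {
	Number     uint64
	Hash       string
	ParentHash string
}

// checkContiguous mirrors the loop at the top of InsertChain: every
// block must carry the previous block's number+1 and hash as its parent.
func checkContiguous(chain []Block) error {
	for i := 1; i < len(chain); i++ {
		b, prev := chain[i], chain[i-1]
		if b.Number != prev.Number+1 || b.ParentHash != prev.Hash {
			return fmt.Errorf("non contiguous insert: item %d (#%d) is not a child of item %d (#%d)",
				i, b.Number, i-1, prev.Number)
		}
	}
	return nil
}

func main() {
	good := []Block{{1, "a", "0"}, {2, "b", "a"}, {3, "c", "b"}}
	bad := []Block{{1, "a", "0"}, {3, "c", "a"}} // number gap: 1 -> 3
	fmt.Println(checkContiguous(good))           // prints <nil>
	fmt.Println(checkContiguous(bad) != nil)     // prints true
}
```

Rejecting a malformed batch up front is cheap; everything after this point may acquire locks and touch disk.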


insertChain: The Internal Implementation

The internal insertChain() method does the heavy lifting. It processes blocks one by one, verifying headers in parallel:

```go
// core/blockchain.go (simplified)
func (bc *BlockChain) insertChain(chain types.Blocks, setHead bool, makeWitness bool) (
	*stateless.Witness, int, error) {
	// Start parallel signature recovery for all blocks
	SenderCacher().RecoverFromBlocks(types.MakeSigner(bc.chainConfig, ...), chain)

	var lastCanon *types.Block
	defer func() {
		if lastCanon != nil && bc.CurrentBlock().Hash() == lastCanon.Hash() {
			bc.chainHeadFeed.Send(ChainHeadEvent{Header: lastCanon.Header()})
		}
	}()

	// Start the parallel header verifier
	headers := make([]*types.Header, len(chain))
	for i, block := range chain {
		headers[i] = block.Header()
	}
	abort, results := bc.engine.VerifyHeaders(bc, headers)
	defer close(abort)

	// Create an iterator that pairs blocks with verification results
	it := newInsertIterator(chain, results, bc.validator)
	block, err := it.next()

	// Skip known blocks that are behind the current head
	// ...

	// Main processing loop
	for ; block != nil && (err == nil || errors.Is(err, ErrKnownBlock)); block, err = it.next() {
		if bc.insertStopped() {
			break
		}
		parent := it.previous()
		if parent == nil {
			parent = bc.GetHeader(block.ParentHash(), block.NumberU64()-1)
		}
		// Execute and validate the block
		res, err := bc.ProcessBlock(parent.Root, block, setHead, makeWitness)
		if err != nil {
			return nil, it.index, err
		}
		// Track stats and report progress
		// ...
	}
	return witness, it.index, err // witness: collected when makeWitness is set (elided above)
}
```

Three things happen concurrently to maximize throughput:

  1. Sender recovery — SenderCacher().RecoverFromBlocks() starts ECDSA signature recovery for all transactions across all blocks in background goroutines. This is the most CPU-intensive part of validation.
  2. Header verification — engine.VerifyHeaders() launches a goroutine that checks consensus rules for all headers. Results arrive via a channel, consumed by the insertIterator.
  3. Sequential block execution — the main loop processes blocks one at a time (state execution cannot be parallelized since each block depends on the previous state).

The deferred function at the end fires a single ChainHeadEvent if the chain progressed — this avoids emitting one event per block during batch imports.
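The abort/results channel contract of VerifyHeaders can be sketched in isolation. The following toy verifyHeaders is a stand-in that performs a trivial ordering check instead of real consensus rules; what it demonstrates is the pattern: results stream back in header order, and closing abort cancels outstanding work:

```go
package main

import "fmt"

// verifyHeaders sketches the engine.VerifyHeaders contract: verification
// runs in a background goroutine, and the caller gets an abort channel
// plus a results channel delivering one result per header, in order.
func verifyHeaders(numbers []uint64) (chan<- struct{}, <-chan error) {
	abort := make(chan struct{})
	results := make(chan error, len(numbers))
	go func() {
		for i, n := range numbers {
			var err error
			if i > 0 && n != numbers[i-1]+1 {
				err = fmt.Errorf("header %d out of order", n)
			}
			select {
			case results <- err:
			case <-abort: // caller gave up (e.g. an earlier block failed)
				return
			}
		}
	}()
	return abort, results
}

func main() {
	abort, results := verifyHeaders([]uint64{10, 11, 12})
	defer close(abort) // mirrors `defer close(abort)` in insertChain
	for i := 0; i < 3; i++ {
		fmt.Println(<-results) // consumed in order, like the insertIterator does
	}
}
```

The deferred close(abort) is what makes early returns from insertChain safe: the verifier goroutine notices the closed channel and exits instead of leaking.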

Handling Edge Cases

Before the main loop, insertChain handles several edge cases:

  • Known blocks behind the head are skipped — they are already in the database and do not change the canonical chain.
  • Pruned ancestor (ErrPrunedAncestor) — the parent block’s state has been garbage-collected. If setHead is true, the block is inserted as a sidechain; otherwise recoverAncestors() re-executes ancestors to rebuild the state.
  • Clique blocks get special handling because Clique’s proof-of-authority mechanism allows blocks to share state, so a known block might still need re-import to fill in snapshots.

ProcessBlock: Execute and Persist

ProcessBlock() is where a single block is executed and, if valid, written to storage:

```go
// core/blockchain.go (simplified)
func (bc *BlockChain) ProcessBlock(parentRoot common.Hash, block *types.Block,
	setHead bool, makeWitness bool) (*blockProcessingResult, error) {
	// 1. Create state database from parent root
	statedb, err := state.New(parentRoot, bc.statedb)
	// (If prefetching is enabled, a throwaway state runs transactions
	// in parallel to warm up the trie cache)

	// 2. Execute all transactions in the block
	res, err := bc.processor.Process(block, statedb, bc.cfg.VmConfig)

	// 3. Validate the execution results against the header
	err = bc.validator.ValidateState(block, statedb, res, false)

	// 4. Write block and state to disk
	var status WriteStatus
	if !setHead {
		err = bc.writeBlockWithState(block, res.Receipts, statedb)
	} else {
		status, err = bc.writeBlockAndSetHead(block, res.Receipts, res.Logs, statedb, false)
	}
	return &blockProcessingResult{usedGas: res.GasUsed, procTime: ..., status: status}, nil
}
```

Walking through each step:

  • Step 1 creates a StateDB from the parent block’s state root. If prefetching is enabled (the default), two readers are created — one for a throwaway prefetch execution and one for the real execution — sharing a cache so the real execution benefits from the prefetcher’s trie node lookups.
  • Step 2 calls processor.Process() which executes every transaction in the block (see Chapter 06 for the full execution pipeline).
  • Step 3 calls validator.ValidateState() which checks that GasUsed, the bloom filter, the receipt root, the requests hash (Prague), and the state root all match the block header (see Chapter 09 for validation details).
  • Step 4 persists the block. The setHead flag controls whether the canonical head is updated. When called from InsertChain, setHead is true. When called from the Engine API’s InsertBlockWithoutSetHead, it is false — the block is stored but the head is not moved until the consensus layer explicitly calls SetCanonical().

writeBlockWithState: Persisting Block and State

Once a block passes validation, it must be durably stored:

core/blockchain.go
```go
func (bc *BlockChain) writeBlockWithState(block *types.Block, receipts []*types.Receipt,
	statedb *state.StateDB) error {
	// 1. Write block data atomically
	blockBatch := bc.db.NewBatch()
	rawdb.WriteBlock(blockBatch, block)
	rawdb.WriteReceipts(blockBatch, block.Hash(), block.NumberU64(), receipts)
	rawdb.WritePreimages(blockBatch, statedb.Preimages())
	blockBatch.Write()

	// 2. Commit state changes from memory into the trie database
	root, stateUpdate, err := statedb.CommitWithUpdate(block.NumberU64(), ...)

	// 3. Trie garbage collection (hash-based scheme only)
	if bc.triedb.Scheme() == rawdb.PathScheme {
		return nil // path-based scheme handles GC internally
	}
	if bc.cfg.ArchiveMode {
		return bc.triedb.Commit(root, false) // archive: always flush
	}
	// Full node: selective garbage collection
	bc.triedb.Reference(root, common.Hash{})
	bc.triegc.Push(root, -int64(block.NumberU64()))
	// ...
}
```

The persistence splits into three layers:

Layer 1 — Block data is written in an atomic batch: the block (header + body), receipts, and any preimage mappings. The batch ensures all components are either fully written or not written at all.

Layer 2 — State commit flushes dirty state from StateDB’s in-memory maps into the trie database (see Chapter 04). CommitWithUpdate() returns the state root (which should match the block header’s Root field) and a state update record for the size tracker.

Layer 3 — Trie garbage collection manages which trie nodes stay in memory and which get flushed to disk. This only applies to the hash-based trie scheme (the path-based scheme handles GC internally):

  • Archive nodes flush every block’s trie to disk immediately.
  • Full nodes keep recent tries in memory and use a priority queue (triegc) to track which tries can be garbage-collected. Tries are flushed when either the memory limit (TrieDirtyLimit) is exceeded or enough processing time has accumulated (flushInterval). The constant TriesInMemory (128) controls how far back tries are retained — tries older than HEAD - 128 are eligible for dereference.

writeBlockAndSetHead: Updating the Canonical Chain

When a block should become the new canonical head, writeBlockAndSetHead() handles both persistence and head updates:

core/blockchain.go
```go
func (bc *BlockChain) writeBlockAndSetHead(block *types.Block, receipts []*types.Receipt,
	logs []*types.Log, state *state.StateDB, emitHeadEvent bool) (WriteStatus, error) {
	// 1. Persist block and state
	if err := bc.writeBlockWithState(block, receipts, state); err != nil {
		return NonStatTy, err
	}
	// 2. Reorganise if the parent is not the current head
	currentBlock := bc.CurrentBlock()
	if block.ParentHash() != currentBlock.Hash() {
		if err := bc.reorg(currentBlock, block.Header()); err != nil {
			return NonStatTy, err
		}
	}
	// 3. Update head pointers
	bc.writeHeadBlock(block)

	// 4. Emit events
	bc.chainFeed.Send(ChainEvent{
		Header: block.Header(), Receipts: receipts, Transactions: block.Transactions(),
	})
	if len(logs) > 0 {
		bc.logsFeed.Send(logs)
	}
	if emitHeadEvent {
		bc.chainHeadFeed.Send(ChainHeadEvent{Header: block.Header()})
	}
	return CanonStatTy, nil
}
```

The critical decision is in step 2: if the new block’s parent is not the current head, a reorganization is needed. This happens when a fork produces a block that becomes the preferred chain tip. The reorg() function (covered in the next section) switches the canonical chain from the old fork to the new one.

Step 3 calls writeHeadBlock(), which updates both on-disk markers and in-memory atomic pointers:

core/blockchain.go
```go
func (bc *BlockChain) writeHeadBlock(block *types.Block) {
	batch := bc.db.NewBatch()
	rawdb.WriteHeadHeaderHash(batch, block.Hash())
	rawdb.WriteHeadFastBlockHash(batch, block.Hash())
	rawdb.WriteCanonicalHash(batch, block.Hash(), block.NumberU64())
	rawdb.WriteTxLookupEntriesByBlock(batch, block)
	rawdb.WriteHeadBlockHash(batch, block.Hash())
	batch.Write()

	bc.hc.SetCurrentHeader(block.Header())
	bc.currentSnapBlock.Store(block.Header())
	bc.currentBlock.Store(block.Header())
}
```

The database batch writes five things atomically: the head header hash, the head fast (snap) block hash, the canonical number-to-hash mapping, transaction lookup entries (hash → block number), and the head block hash. Then the in-memory pointers are updated. Because the pointers are atomic.Pointer, readers see the new head immediately without needing the chain lock.
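The property the batch provides can be illustrated with a toy in-memory key-value store (a sketch, not ethdb): buffered writes are invisible until Write() applies them all at once:

```go
package main

import "fmt"

// kv is a toy key-value store; batch buffers writes in memory and
// applies them in one step -- the property writeHeadBlock relies on:
// either all head markers land together, or none do.
type kv map[string]string

type batch struct {
	db      kv
	pending []struct{ k, v string }
}

// Put buffers a write without touching the underlying store.
func (b *batch) Put(k, v string) {
	b.pending = append(b.pending, struct{ k, v string }{k, v})
}

// Write applies every buffered entry; a real ethdb batch additionally
// makes this crash-atomic at the storage layer.
func (b *batch) Write() {
	for _, e := range b.pending {
		b.db[e.k] = e.v
	}
	b.pending = nil
}

func main() {
	db := kv{}
	b := &batch{db: db}
	b.Put("HeadHeaderHash", "0xabc")
	b.Put("HeadBlockHash", "0xabc")
	b.Put("canonical-100", "0xabc")
	fmt.Println(len(db)) // prints 0 -- nothing visible before Write
	b.Write()
	fmt.Println(len(db)) // prints 3 -- all markers become visible together
}
```

Without the batch, a crash between individual puts could leave the head header hash pointing at a different block than the head block hash, which is exactly the inconsistency the atomic write rules out.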


Chain Reorganization

A reorg occurs when a new block’s parent is not the current chain tip — meaning the chain must switch from one fork to another. The reorg() function handles this:

Old canonical chain               New canonical chain
         |                                 |
     block A5 (current head)           block B5   <-- new head
         |                                 |
     block A4                          block B4
         |                                 |
     block A3                          block B3
          \                               /
           +------ common ancestor ------+
                       block 2

The algorithm:

```go
// core/blockchain.go (simplified)
func (bc *BlockChain) reorg(oldHead *types.Header, newHead *types.Header) error {
	var (
		newChain    []*types.Header
		oldChain    []*types.Header
		commonBlock *types.Header
	)
	// Step 1: Reduce the longer chain to match the shorter one's height
	if oldHead.Number.Uint64() > newHead.Number.Uint64() {
		for ; oldHead.Number.Uint64() != newHead.Number.Uint64(); oldHead = bc.GetHeader(...) {
			oldChain = append(oldChain, oldHead)
		}
	} else {
		for ; newHead.Number.Uint64() != oldHead.Number.Uint64(); newHead = bc.GetHeader(...) {
			newChain = append(newChain, newHead)
		}
	}
	// Step 2: Walk both chains back until hashes match
	for {
		if oldHead.Hash() == newHead.Hash() {
			commonBlock = oldHead
			break
		}
		oldChain = append(oldChain, oldHead)
		newChain = append(newChain, newHead)
		oldHead = bc.GetHeader(oldHead.ParentHash, ...)
		newHead = bc.GetHeader(newHead.ParentHash, ...)
	}
	// Step 3: Undo old blocks, apply new blocks
	// ...
}
```

After finding the common ancestor, the function performs the actual switch:

  1. Emit removed logs — iterates through the old chain in forward order, collects logs from removed blocks, and sends them via rmLogsFeed as RemovedLogsEvent.
  2. Collect deleted transactions — iterates through the old chain, gathering all transaction hashes that will be removed from the canonical index.
  3. Apply new blocks — iterates through the new chain in forward order, calling writeHeadBlock() for each block to update canonical hash mappings and tx lookup entries. Collects reborn logs and emits them via logsFeed.
  4. Clean up indexes — deletes tx lookup entries for transactions that were in the old chain but not the new one (using HashDifference(deletedTxs, rebirthTxs)). Removes stale canonical hash mappings above the new head.
  5. Purge tx lookup cache — the LRU cache may hold stale entries from the old chain.

The entire tx-lookup mutation is protected by txLookupLock, a separate sync.RWMutex that ensures API readers see a consistent view of the transaction index during the reorg.

Large reorgs (more than 63 blocks) trigger a log.Warn("Large chain reorg detected") to alert operators.
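The ancestor search at the heart of reorg() can be reproduced as a small standalone function, here over a toy header type with a map standing in for bc.GetHeader:

```go
package main

import "fmt"

// header is a minimal stand-in for types.Header.
type header struct {
	number uint64
	hash   string
	parent string
}

// findCommonAncestor mirrors reorg(): first level the two heads to the
// same height, then step both back in lockstep until the hashes match.
// headers maps hash -> header, standing in for bc.GetHeader.
func findCommonAncestor(headers map[string]header, oldHead, newHead header) (header, []header, []header) {
	var oldChain, newChain []header
	// Step 1: reduce the longer chain to the shorter one's height
	for oldHead.number > newHead.number {
		oldChain = append(oldChain, oldHead)
		oldHead = headers[oldHead.parent]
	}
	for newHead.number > oldHead.number {
		newChain = append(newChain, newHead)
		newHead = headers[newHead.parent]
	}
	// Step 2: walk both chains back until the hashes match
	for oldHead.hash != newHead.hash {
		oldChain = append(oldChain, oldHead)
		newChain = append(newChain, newHead)
		oldHead = headers[oldHead.parent]
		newHead = headers[newHead.parent]
	}
	return oldHead, oldChain, newChain
}

func main() {
	// genesis g -> a1 -> a2 (old head), and g -> b1 -> b2 -> b3 (new head)
	hs := map[string]header{
		"g":  {0, "g", ""},
		"a1": {1, "a1", "g"}, "a2": {2, "a2", "a1"},
		"b1": {1, "b1", "g"}, "b2": {2, "b2", "b1"}, "b3": {3, "b3", "b2"},
	}
	anc, oldC, newC := findCommonAncestor(hs, hs["a2"], hs["b3"])
	fmt.Println(anc.hash, len(oldC), len(newC)) // prints g 2 3
}
```

The returned oldChain and newChain are exactly what the real reorg() then iterates over: oldChain to emit removed logs and collect deleted transactions, newChain (in forward order) to re-apply canonical mappings.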


Head Management: Finalized and Safe Blocks

Post-Merge, the consensus layer manages two additional head markers beyond the canonical tip:

core/blockchain.go
```go
func (bc *BlockChain) SetFinalized(header *types.Header) {
	bc.currentFinalBlock.Store(header)
	if header != nil {
		rawdb.WriteFinalizedBlockHash(bc.db, header.Hash())
	} else {
		rawdb.WriteFinalizedBlockHash(bc.db, common.Hash{})
	}
}

func (bc *BlockChain) SetSafe(header *types.Header) {
	bc.currentSafeBlock.Store(header)
	// ...
}
```
  • SetFinalized() is called by the Engine API’s ForkchoiceUpdated. A finalized block is guaranteed to never be reverted by the consensus protocol. It is both stored to the atomic pointer (for fast reads) and persisted to disk (so it survives restarts).
  • SetSafe() marks a block as safe — very unlikely to be reverted but not yet finalized. Unlike finalized, the safe block is not persisted to disk. On restart, loadLastState() sets safe equal to finalized.

The reader methods are lock-free:

core/blockchain_reader.go
```go
func (bc *BlockChain) CurrentBlock() *types.Header      { return bc.currentBlock.Load() }
func (bc *BlockChain) CurrentSnapBlock() *types.Header  { return bc.currentSnapBlock.Load() }
func (bc *BlockChain) CurrentFinalBlock() *types.Header { return bc.currentFinalBlock.Load() }
func (bc *BlockChain) CurrentSafeBlock() *types.Header  { return bc.currentSafeBlock.Load() }
```

SetHead: Rewinding the Chain

SetHead() rewinds the chain to a specific block number, deleting blocks and state above that point:

core/blockchain.go
```go
func (bc *BlockChain) SetHead(head uint64) error {
	if _, err := bc.setHeadBeyondRoot(head, 0, common.Hash{}, false); err != nil {
		return err
	}
	bc.chainHeadFeed.Send(ChainHeadEvent{Header: bc.CurrentBlock()})
	return nil
}
```

There is also SetHeadWithTimestamp() which rewinds to the latest block at or before a given timestamp. Both delegate to setHeadBeyondRoot(), which handles the complexity of deleting data while preserving consistency between the key-value store and the freezer (ancient) database.


The Engine API Path: InsertBlockWithoutSetHead

The Engine API uses a two-step approach to inserting blocks:

core/blockchain.go
```go
func (bc *BlockChain) InsertBlockWithoutSetHead(block *types.Block, makeWitness bool) (
	*stateless.Witness, error) {
	// ...
	witness, _, err := bc.insertChain(types.Blocks{block}, false, makeWitness)
	return witness, err
}

func (bc *BlockChain) SetCanonical(head *types.Block) (common.Hash, error) {
	// Recover state if missing
	if !bc.HasState(head.Root()) {
		bc.recoverAncestors(head, false)
	}
	// Run reorg if needed and update head
	if head.ParentHash() != bc.CurrentBlock().Hash() {
		bc.reorg(bc.CurrentBlock(), head.Header())
	}
	bc.writeHeadBlock(head)
	// ...
}
```

This split exists because in post-Merge Ethereum, the consensus layer tells geth which block to build on via ForkchoiceUpdated, separately from which blocks to validate via NewPayload. A block might be validated (inserted without setting head) long before the consensus layer decides it should be canonical.


The HeaderChain

HeaderChain manages header-only storage and is embedded within BlockChain. During snap sync, geth downloads headers first (via the skeleton syncer) before fetching bodies and state, so the header chain can be ahead of the full block chain.

core/headerchain.go
```go
type HeaderChain struct {
	config            *params.ChainConfig
	chainDb           ethdb.Database
	genesisHeader     *types.Header
	currentHeader     atomic.Pointer[types.Header]
	currentHeaderHash common.Hash
	headerCache       *lru.Cache[common.Hash, *types.Header]
	numberCache       *lru.Cache[common.Hash, uint64]
	procInterrupt     func() bool
	engine            consensus.Engine
}
```

HeaderChain provides:

  • GetHeader(hash, number) / GetHeaderByNumber(n) — read headers from cache or database.
  • InsertHeaderChain(headers) — validates and inserts a batch of headers. Used during snap sync’s skeleton phase.
  • Reorg(headers) — reorganizes the header chain to point to a new set of canonical headers.
  • SetCurrentHeader(header) — updates the in-memory head header pointer.

During normal block insertion, BlockChain calls hc.SetCurrentHeader() as part of writeHeadBlock(). During snap sync, the header chain can advance independently via InsertHeaderChain().


The Genesis Block

Every chain starts with a genesis block — block number 0, with no parent, containing the initial state allocation (pre-funded accounts, contract code, etc.). The Genesis struct defines this initial state:

core/genesis.go
```go
type Genesis struct {
	Config        *params.ChainConfig `json:"config"`
	Nonce         uint64              `json:"nonce"`
	Timestamp     uint64              `json:"timestamp"`
	ExtraData     []byte              `json:"extraData"`
	GasLimit      uint64              `json:"gasLimit" gencodec:"required"`
	Difficulty    *big.Int            `json:"difficulty" gencodec:"required"`
	Mixhash       common.Hash         `json:"mixHash"`
	Coinbase      common.Address      `json:"coinbase"`
	Alloc         types.GenesisAlloc  `json:"alloc" gencodec:"required"`
	BaseFee       *big.Int            `json:"baseFeePerGas"` // EIP-1559
	ExcessBlobGas *uint64             `json:"excessBlobGas"` // EIP-4844
	BlobGasUsed   *uint64             `json:"blobGasUsed"`   // EIP-4844
	// ...
}
```

The Alloc field is a GenesisAlloc (a map of addresses to accounts), defining the initial balances, code, nonces, and storage for all pre-existing accounts. For mainnet, this includes the crowdsale allocation, system contracts, and other initial state.

Genesis Commit

When a node starts for the first time, the genesis block is committed to the database:

core/genesis.go
```go
func (g *Genesis) Commit(db ethdb.Database, triedb *triedb.Database) (*types.Block, error) {
	if g.Number != 0 {
		return nil, errors.New("can't commit genesis block with number > 0")
	}
	// Validate config and signers
	// ...

	// Flush genesis allocations into the trie → compute state root
	root, err := flushAlloc(&g.Alloc, triedb)

	// Build the genesis block with the computed state root
	block := g.toBlockWithRoot(root)

	// Write everything atomically
	batch := db.NewBatch()
	rawdb.WriteGenesisStateSpec(batch, block.Hash(), blob) // JSON alloc spec
	rawdb.WriteBlock(batch, block)                         // header + body
	rawdb.WriteReceipts(batch, block.Hash(), 0, nil)       // empty receipts
	rawdb.WriteCanonicalHash(batch, block.Hash(), 0)       // number 0 → hash
	rawdb.WriteHeadBlockHash(batch, block.Hash())          // head block
	rawdb.WriteHeadFastBlockHash(batch, block.Hash())      // head fast block
	rawdb.WriteHeadHeaderHash(batch, block.Hash())         // head header
	rawdb.WriteChainConfig(batch, block.Hash(), config)    // chain config
	return block, batch.Write()
}
```

The flushAlloc() function iterates over every account in Alloc, inserts them into a trie, and returns the state root. This root becomes the genesis block’s Root field. The entire genesis state — block, receipts, canonical mappings, head markers, and chain config — is written in a single atomic batch.

SetupGenesisBlock

NewBlockChain calls SetupGenesisBlockWithOverride() which decides what to do based on the database state:

                  | genesis == nil      | genesis != nil
------------------|---------------------|--------------------------
DB has no genesis | Use mainnet default | Commit provided genesis
DB has genesis    | Load from DB        | Verify compatible, use DB

If the database already has a genesis block and the provided one conflicts, a GenesisMismatchError is returned. If the chain config is updated (e.g., a new fork is scheduled), the stored config is updated as long as the fork activation points are above the current chain head.
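The decision table can be sketched as a single switch, with genesis hashes as plain strings and a placeholder standing in for the mainnet default (illustrative only, not geth's actual API):

```go
package main

import "fmt"

// setupGenesis sketches the four-way decision table: stored is the
// genesis hash already in the database ("" if empty), provided is the
// hash of a caller-supplied Genesis ("" if none was given).
func setupGenesis(stored, provided string) (string, error) {
	const mainnetDefault = "mainnet-default" // placeholder, not the real mainnet hash
	switch {
	case stored == "" && provided == "":
		return mainnetDefault, nil // empty DB, no spec: use the mainnet default
	case stored == "":
		return provided, nil // empty DB: commit the provided genesis
	case provided == "" || provided == stored:
		return stored, nil // DB wins; a provided genesis must match it
	default:
		return "", fmt.Errorf("genesis mismatch: database has %s, config wants %s", stored, provided)
	}
}

func main() {
	h, _ := setupGenesis("", "0xaaa")
	fmt.Println(h) // prints 0xaaa: a fresh database commits the provided genesis
	if _, err := setupGenesis("0xaaa", "0xbbb"); err != nil {
		fmt.Println(err) // a conflicting genesis is rejected
	}
}
```

The real SetupGenesisBlockWithOverride() additionally handles chain-config compatibility checks, which this sketch omits.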


Startup State Recovery

When geth restarts, loadLastState() restores the persisted chain markers; a separate recovery step afterwards handles the case where the head block's state trie is not available (e.g., after a crash before the trie was flushed):

```go
// core/blockchain.go (inside loadLastState, simplified)
func (bc *BlockChain) loadLastState() error {
	head := rawdb.ReadHeadBlockHash(bc.db)
	if head == (common.Hash{}) {
		return bc.Reset() // empty database
	}
	headBlock := bc.GetBlockByHash(head)
	if headBlock == nil {
		return bc.Reset() // corrupt database
	}
	bc.currentBlock.Store(headBlock.Header())

	// Restore snap sync head
	bc.currentSnapBlock.Store(headBlock.Header())
	if head := rawdb.ReadHeadFastBlockHash(bc.db); head != (common.Hash{}) {
		if block := bc.GetBlockByHash(head); block != nil {
			bc.currentSnapBlock.Store(block.Header())
		}
	}
	// Restore finalized and safe blocks
	if head := rawdb.ReadFinalizedBlockHash(bc.db); head != (common.Hash{}) {
		if block := bc.GetBlockByHash(head); block != nil {
			bc.currentFinalBlock.Store(block.Header())
			bc.currentSafeBlock.Store(block.Header()) // safe defaults to finalized
		}
	}
	return nil
}
```

After loadLastState(), the constructor checks whether the head block’s state is actually available. If not, it calls setHeadBeyondRoot() to rewind the chain until it finds a block with available state. This makes geth resilient to crashes — at worst, it re-executes a few recent blocks on the next startup.


Events

The BlockChain emits events through event.Feed channels that other subsystems subscribe to:

core/events.go
```go
type ChainEvent struct {
	Header       *types.Header
	Receipts     []*types.Receipt
	Transactions []*types.Transaction
}

type ChainHeadEvent struct {
	Header *types.Header
}

type RemovedLogsEvent struct {
	Logs []*types.Log
}
```
Feed          | Event Type       | When Emitted
--------------|------------------|----------------------------------------------------------
chainFeed     | ChainEvent       | Every new canonical block
chainHeadFeed | ChainHeadEvent   | When the chain tip changes (batched per insertChain call)
logsFeed      | []*types.Log     | New logs from canonical blocks or reborn blocks during a reorg
rmLogsFeed    | RemovedLogsEvent | Logs from blocks removed during a reorg
blockProcFeed | bool             | true when block processing starts, false when it ends

The transaction pool subscribes to ChainHeadEvent to re-validate pending transactions against the new state. The eth handler subscribes to ChainHeadEvent to broadcast new blocks to peers. The RPC layer uses chainFeed and logsFeed to serve eth_subscribe notifications.
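The pub-sub mechanics can be sketched with a generic toy feed (a simplified stand-in for event.Feed, which in geth also handles unsubscription and typed reflection-based delivery):

```go
package main

import "fmt"

// feed is a tiny stand-in for event.Feed: subscribers register a
// channel, and Send delivers the value to every subscriber.
type feed[T any] struct {
	subs []chan T
}

// Subscribe registers a new buffered channel and returns its receive end.
func (f *feed[T]) Subscribe(buf int) <-chan T {
	ch := make(chan T, buf)
	f.subs = append(f.subs, ch)
	return ch
}

// Send delivers v to every subscriber and reports how many received it,
// mirroring event.Feed.Send's return value.
func (f *feed[T]) Send(v T) int {
	for _, ch := range f.subs {
		ch <- v
	}
	return len(f.subs)
}

type chainHeadEvent struct{ number uint64 }

func main() {
	var heads feed[chainHeadEvent]
	txpool := heads.Subscribe(1)  // e.g. tx pool re-validating pending txs
	handler := heads.Subscribe(1) // e.g. eth handler announcing to peers
	n := heads.Send(chainHeadEvent{number: 100})
	fmt.Println(n, (<-txpool).number, (<-handler).number) // prints 2 100 100
}
```

One Send fanning out to independent subscriber channels is what lets the tx pool, the eth handler, and the RPC layer all react to the same head change without knowing about each other.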


Graceful Shutdown

Stop() ensures all in-memory state is safely persisted before the node exits:

core/blockchain.go
```go
func (bc *BlockChain) Stop() {
	bc.stopWithoutSaving()

	// Journal snapshots to disk
	if bc.snaps != nil {
		bc.snaps.Journal(bc.CurrentBlock().Root)
		bc.snaps.Release()
	}
	if bc.triedb.Scheme() == rawdb.PathScheme {
		bc.triedb.Journal(bc.CurrentBlock().Root)
	} else {
		// Hash-based scheme: commit recent tries
		if !bc.cfg.ArchiveMode {
			for _, offset := range []uint64{0, 1, state.TriesInMemory - 1} {
				if number := bc.CurrentBlock().Number.Uint64(); number > offset {
					recent := bc.GetBlockByNumber(number - offset)
					bc.triedb.Commit(recent.Root(), true)
				}
			}
			// Dereference all remaining GC entries
			for !bc.triegc.Empty() {
				bc.triedb.Dereference(bc.triegc.PopItem())
			}
		}
	}
	bc.triedb.Close()
}
```

The shutdown strategy depends on the trie scheme:

  • Path-based scheme — journals the current trie state to disk. On restart, the journal is replayed to restore the in-memory state.
  • Hash-based scheme — commits three specific tries to disk: HEAD, HEAD-1, and HEAD-127. This covers three restart scenarios: normal restart (HEAD state is there), uncle-reorg (HEAD-1 is there), and worst-case rewind (HEAD-127 limits re-execution to 127 blocks). The remaining GC queue entries are all dereferenced to free memory.

The stopWithoutSaving() helper handles the non-persistence part of shutdown: setting the stopping flag, closing the tx indexer, unsubscribing events, signaling procInterrupt to abort any in-progress block insertion, and closing the chainmu mutex.


WriteStatus: Canonical vs. Sidechain

Every block insertion returns a WriteStatus indicating the block’s relationship to the canonical chain:

core/blockchain.go
```go
type WriteStatus byte

const (
	NonStatTy   WriteStatus = iota // Unknown or non-canonical
	CanonStatTy                    // Part of the canonical chain
	SideStatTy                     // Stored but not canonical (fork)
)
```

CanonStatTy means the block extended or replaced the canonical chain tip. SideStatTy means the block was stored (it might become canonical later during a reorg) but did not change the current head. The insertChain loop uses this status for logging — canonical blocks log at Debug level as “Inserted new block”, while side chain blocks log as “Inserted forked block”.

Author: Kehao Zheng
Published: 2026-04-19
License: CC BY-NC-SA 4.0
Source: https://kehaozheng.vercel.app/posts/chainethgeth/10_the_blockchain/