Q1: How does geth assemble and start all subsystems at startup?
Startup pipeline overview
main()
  │
  ▼
app.Run(os.Args)        urfave/cli framework parses command line
  │
  ▼
geth()                  default action when no subcommand given
  ├─ prepare()          log network type, bump mainnet cache to 4096MB
  ├─ makeFullNode()     build Node container + create all subsystems ← core
  ├─ startNode()        start all services + install signal handler
  └─ stack.Wait()       block until Close() is called

The geth() function itself is very concise:
func geth(ctx *cli.Context) error {
    prepare(ctx)
    stack := makeFullNode(ctx)
    defer stack.Close()

    startNode(ctx, stack, false)
    stack.Wait() // blocks on n.stop channel until shutdown
    return nil
}

Four calls, four stages. Let's expand each layer.
Stage 1: Configuration loading
Configuration is layered from three sources:
Hardcoded defaults (ethconfig.Defaults, defaultNodeConfig())
  │
  ▼ override
TOML config file (if --config is set)
  │
  ▼ override
CLI flags (--syncmode, --cache, --maxpeers, etc.)

The final output is two config structs: node.Config (P2P, RPC, data directory, etc.) and ethconfig.Config (sync mode, gas price, cache allocation, etc.).
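The layering is easy to picture in code. Below is a minimal sketch of the three-step override, assuming a combined config struct like the one geth keeps in cmd/geth/config.go; loadTOML and applyFlags are hypothetical stand-ins for the real helpers, not geth identifiers:

// Sketch only: mirrors the layering done in cmd/geth/config.go.
type gethConfig struct {
    Eth  ethconfig.Config // sync mode, gas price, cache allocation, ...
    Node node.Config      // P2P, RPC, data directory, ...
}

func makeConfig(ctx *cli.Context) gethConfig {
    // Layer 1: hardcoded defaults
    cfg := gethConfig{
        Eth:  ethconfig.Defaults,
        Node: defaultNodeConfig(),
    }
    // Layer 2: TOML file (only if --config was given) overrides the defaults
    if file := ctx.String("config"); file != "" {
        loadTOML(file, &cfg) // hypothetical helper: toml-decodes the file into cfg
    }
    // Layer 3: CLI flags (--syncmode, --cache, ...) override everything else
    applyFlags(ctx, &cfg) // hypothetical helper wrapping the flag-to-config setters
    return cfg
}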
Stage 2: Node container creation
node.New() creates the Node but does not start it:
func New(conf *Config) (*Node, error) {
    node := &Node{
        config:        conf,
        inprocHandler: rpc.NewServer(),                    // in-process RPC server
        server:        &p2p.Server{...},                   // P2P server (not started)
        databases:     make(map[*closeTrackingDB]struct{}),
        stop:          make(chan struct{}),                // channel Wait() blocks on
    }
    node.rpcAPIs = append(node.rpcAPIs, node.apis()...) // admin, debug, web3
    node.openDataDir()                                  // file lock prevents duplicate instances
    // Create HTTP, WS, IPC server objects (not started)
    return node, nil
}

Key design: Node has a three-state state machine:

initializingState (0) ──Start()──→ runningState (1) ──Close()──→ closedState (2)

All registrations (RegisterLifecycle, RegisterAPIs, RegisterProtocols) must complete before Start() — calling them in a non-initializing state panics.
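The guard is the same for every Register* method. A simplified version of what RegisterLifecycle does (the real method in node/node.go additionally rejects duplicate registrations):

func (n *Node) RegisterLifecycle(lifecycle Lifecycle) {
    n.lock.Lock()
    defer n.lock.Unlock()

    if n.state != initializingState {
        // Registering after Start() (or after Close()) is a programming error.
        panic("can't register lifecycle on running/stopped node")
    }
    n.lifecycles = append(n.lifecycles, lifecycle)
}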
Stage 3: eth.New() — core assembly
This is the chapter’s most important function. It creates and connects all components from the previous 13 chapters in dependency order:
func New(stack *node.Node, config *ethconfig.Config) (*Ethereum, error) {
    // Step 1-2: Open database
    chainDb := stack.OpenDatabaseWithOptions("chaindata", ...)

    // Step 3: Determine state storage scheme
    scheme := rawdb.ParseStateScheme(config.StateScheme, chainDb)

    // Step 4: Load chain config, create consensus engine
    chainConfig := core.LoadChainConfig(chainDb, config.Genesis)
    engine := ethconfig.CreateConsensusEngine(chainConfig, chainDb)

    // Step 5: Assemble Ethereum struct
    eth := &Ethereum{config, chainDb, engine, ...}

    // Step 6: Create BlockChain
    eth.blockchain = core.NewBlockChain(chainDb, config.Genesis, engine, ...)

    // Step 7: Log index
    eth.filterMaps = filtermaps.NewFilterMaps(...)

    // Step 8: Create transaction pools
    legacyPool := legacypool.New(config.TxPool, eth.blockchain)
    eth.blobTxPool = blobpool.New(config.BlobPool, eth.blockchain, ...)
    eth.txPool = txpool.New(..., []txpool.SubPool{legacyPool, eth.blobTxPool})

    // Step 9: Create protocol handler
    eth.handler = newHandler(&handlerConfig{...})

    // Step 10: Create miner
    eth.miner = miner.New(eth, config.Miner, engine)

    // Step 11: Create API backend
    eth.APIBackend = &EthAPIBackend{...}

    // Step 12: Register on Node
    stack.RegisterAPIs(eth.APIs())           // JSON-RPC methods
    stack.RegisterProtocols(eth.Protocols()) // eth/68, snap/1 sub-protocols
    stack.RegisterLifecycle(eth)             // Start/Stop lifecycle
}

Dependency chain visualization
The order is not arbitrary — each step depends on the previous ones:
chainDb (Ch.5: storage layer)
  │
  ▼
engine (Ch.9: consensus engine) ─── depends on chainDb for chain config
  │
  ▼
BlockChain (Ch.10) ──────────────── depends on chainDb + engine
  │
  ▼
TxPool (Ch.8) ───────────────────── depends on BlockChain (chain head, state validation)
  │
  ▼
handler (Ch.12) ─────────────────── depends on BlockChain + TxPool (sync and broadcast)
  │
  ▼
miner (Ch.9) ────────────────────── depends on Ethereum + engine (block building)
  │
  ▼
APIBackend (Ch.13) ──────────────── depends on all of the above (RPC method entry point)

If TxPool were created before BlockChain, TxPool would have no chain head state to validate transactions against. If handler were created before TxPool, handler would have no pool to route transactions to.
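The TxPool → BlockChain edge becomes concrete if you look at the chain view a pool needs for admission checks. The interface below is illustrative only, not the exact one in the txpool package:

// Illustrative: the minimum chain access a transaction pool needs in order
// to validate incoming transactions against the current head.
type chainView interface {
    CurrentBlock() *types.Header                       // head block for fee and gas-limit checks
    StateAt(root common.Hash) (*state.StateDB, error)  // account nonces and balances at that head
}

Without a live BlockChain behind such an interface, the pool could not reject stale-nonce or underfunded transactions at admission time.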
Stage 4: Optional services and Engine API
Once the core Ethereum service is in place, optional components are registered:
// Log filter API
filterSystem := utils.RegisterFilterAPI(stack, backend, &cfg.Eth)

// GraphQL (optional)
if ctx.IsSet(utils.GraphQLEnabledFlag.Name) {
    utils.RegisterGraphQLService(stack, backend, filterSystem, &cfg.Node)
}

// Engine API (three modes)
if --dev {
    SimulatedBeacon     // dev mode: auto-seal blocks
} else if --beacon.api {
    BLSync              // experimental light sync
} else {
    catalyst.Register   // normal mode: Engine API connects to external CL
}

Stage 5: Node.Start()
After all registration is complete, startNode() calls stack.Start():
func (n *Node) Start() error {
    // 1. State check: must be initializingState
    n.state = runningState

    // 2. Start network endpoints
    n.openEndpoints()
    // ├─ n.server.Start() → P2P server starts (Chapter 11)
    // └─ n.startRPC()     → HTTP, WS, IPC, auth endpoints all start

    // 3. Start all lifecycles in registration order
    for _, lifecycle := range lifecycles {
        lifecycle.Start() // Ethereum.Start() is called here
    }
    // If any fails, already-started ones are stopped in reverse order
}
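Everything the Node starts and stops implements the small Lifecycle interface, and the start loop tracks what has already been started so a failure can be rolled back. Condensed from node/lifecycle.go and node/node.go (err is declared earlier in Start()):

// Lifecycle is what RegisterLifecycle expects: anything Node can start and stop.
type Lifecycle interface {
    Start() error
    Stop() error
}

// Inside Node.Start(), condensed: start in registration order, roll back on error.
var started []Lifecycle
for _, lifecycle := range n.lifecycles {
    if err = lifecycle.Start(); err != nil {
        break
    }
    started = append(started, lifecycle)
}
if err != nil {
    n.stopServices(started) // stop the already-started ones, in reverse order
    n.doClose(nil)
}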
Ethereum.Start() starts the network layer:

func (s *Ethereum) Start() error {
    s.setupDiscovery()   // DNS + DHT hybrid discovery (Chapter 11)
    s.handler.Start(...) // sync + tx/block broadcast loops (Chapter 12)
    s.dropper.Start(...) // connection quality management
    s.filterMaps.Start() // log indexer
}

After startup completes
stack.Wait() blocks on n.stop channel
  │
  Concurrently running goroutines:
  ├─ P2P server:     accept inbound connections, discover new nodes
  ├─ handler:        sync blockchain, broadcast transactions
  ├─ miner:          wait for CL's ForkchoiceUpdated to build blocks
  ├─ txPool:         receive and manage transactions
  ├─ RPC servers:    handle external requests
  └─ signal handler: wait for SIGINT/SIGTERM

Q2: How does geth shut down gracefully? Why is teardown order reversed?
Signal handling
utils.StartNode() installs a signal handler:
func StartNode(ctx *cli.Context, stack *node.Node, isConsole bool) {
    stack.Start()

    go func() {
        sigc := make(chan os.Signal, 1)
        signal.Notify(sigc, syscall.SIGINT, syscall.SIGTERM)

        <-sigc // first signal
        log.Info("Got interrupt, shutting down...")
        go stack.Close() // close in separate goroutine (may take time)

        // Wait for up to 10 more signals
        for i := 10; i > 0; i-- {
            <-sigc
            log.Warn("Already shutting down, interrupt more to panic.", "times", i-1)
        }
        debug.LoudPanic("boom") // 11th time: force kill process
    }()
}

Design:
1st Ctrl-C  → start graceful shutdown
2nd-10th    → print warning + countdown
11th        → panic force exit (last resort for a stuck shutdown)

In console mode (geth attach), SIGINT is ignored (left for the JavaScript console); only SIGTERM triggers shutdown.
Disk space monitoring
There’s a hidden shutdown trigger — low disk space:
Background goroutine checks available disk space every 30 seconds
  │  if free space < 2 × TrieDirtyCache (default 512MB)
  ▼
Automatically send SIGTERM → trigger graceful shutdown

Why? Because databases (Pebble/LevelDB) can become corrupted when the disk fills up. Shutting down early is much better than data corruption.
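A sketch of what that watchdog looks like; the real version lives under cmd/geth, and getFreeDiskSpace here is a hypothetical stand-in for the platform-specific helper:

// Sketch: periodically check free space and trigger the normal shutdown path.
func monitorFreeDiskSpace(sigc chan os.Signal, path string, critical uint64) {
    for {
        free, err := getFreeDiskSpace(path) // hypothetical helper
        if err == nil && free < critical {
            log.Warn("Low disk space, gracefully shutting down", "available", free)
            sigc <- syscall.SIGTERM // reuses the same channel the Ctrl-C handler reads
            return
        }
        time.Sleep(30 * time.Second)
    }
}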
Node.Close() teardown order
func (n *Node) Close() error {
    // 1. Stop all services
    n.stopServices(n.lifecycles)
    // 2. Release remaining resources
    n.doClose(errs)
}

The key in stopServices(): reverse order:
func (n *Node) stopServices(running []Lifecycle) error {
    // First stop RPC
    n.stopRPC()

    // Stop all lifecycles in reverse order
    for i := len(running) - 1; i >= 0; i-- {
        running[i].Stop()
    }

    // Last, stop the P2P server
    n.server.Stop()
}

Ethereum.Stop() internals
When Node calls eth.Stop():
func (s *Ethereum) Stop() error {
    // Layer 1: Stop network (no more incoming data)
    s.discmix.Close()   // stop node discovery
    s.dropper.Stop()    // stop connection management
    s.handler.Stop()    // stop sync and broadcast

    // Layer 2: Stop internal processing
    s.filterMaps.Stop() // stop log indexer
    s.txPool.Close()    // close transaction pool
    s.blockchain.Stop() // stop blockchain (flush trie to disk)
    s.engine.Close()    // close consensus engine

    // Layer 3: Persistence and cleanup
    s.shutdownTracker.Stop() // mark clean shutdown
    s.chainDb.Close()        // close database
    s.eventMux.Stop()        // stop event dispatch
}

Why reverse teardown is necessary
Consider what happens with an incorrect order.

Wrong example: closing the database before stopping the handler

handler is syncing blocks
  → calls blockchain.InsertChain()
  → calls chainDb.Write()
  → the database is already closed!
  → panic or data corruption

The correct order follows the reverse of the dependency chain:
Startup order:  chainDb → engine → blockchain → txPool → handler → miner
Shutdown order: handler → txPool → blockchain → engine → chainDb

Principle: stop data consumers first, then data producers, then storage last.

Complete shutdown hierarchy:
Layer 1: Cut off external input
  ├─ stopRPC()        → stop accepting RPC requests
  ├─ discmix.Close()  → stop discovering new nodes
  └─ handler.Stop()   → stop sync and broadcast

Layer 2: Stop internal processing
  ├─ txPool.Close()    → stop transaction processing
  ├─ blockchain.Stop() → flush trie cache to disk
  └─ engine.Close()    → close consensus engine

Layer 3: Close infrastructure
  ├─ chainDb.Close() → close database
  ├─ server.Stop()   → close P2P server
  └─ close(n.stop)   → unblock Wait() → process exits

doClose() final cleanup
After all lifecycles stop, doClose() releases remaining resources:
func (n *Node) doClose(errs []error) error {
    n.state = closedState
    n.closeDatabases() // close all tracked databases
    n.accman.Close()   // stop hardware wallet USB monitoring
    if n.keyDirTemp {
        os.RemoveAll(n.keyDir) // delete temp key directory
    }
    n.closeDataDir() // release file lock
    close(n.stop)    // unblock stack.Wait()
}

close(n.stop) is the final step — it unblocks stack.Wait() in the geth() function, geth() returns, main() returns, and the process exits.
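Wait() itself is essentially a one-line channel receive (slightly simplified from node/node.go):

// Wait blocks until the stop channel is closed by doClose().
func (n *Node) Wait() {
    <-n.stop
}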