Geth(15) QA

Q1: What is the complete lifecycle of a transaction from user submission to finalization?

Overall flow

User / DApp
       │ eth_sendRawTransaction (JSON-RPC)
┌──────┴───────┐
│  RPC Server  │ ← Ch.13: transport, dispatch, reflection
└──────┬───────┘
       │
┌──────┴───────┐
│TransactionAPI│ ← Ch.13: decode raw bytes → types.Transaction
└──────┬───────┘
       │ txPool.Add()
┌──────┴───────┐
│ Transaction  │ ← Ch.8: validation, pending/queue, blob pool
│     Pool     │
└──────┬───────┘
       │ NewTxsEvent
┌──────┴───────┐
│   Handler    │ ← Ch.12: broadcast to peers,
└──────┬───────┘   sqrt(N) direct + rest hash announcement
       │ CL calls ForkchoiceUpdated (triggers block building)
┌──────┴───────┐
│    Miner     │ ← Ch.9: fillTransactions, sort by tip
└──────┬───────┘
       │ ApplyTransaction for each tx
┌──────┴───────┐
│    State     │ ← Ch.6: preCheck → buyGas → EVM → refund → fee distribution
│  Transition  │
└──────┬───────┘
       │ evm.Call() or evm.Create()
┌──────┴───────┐
│     EVM      │ ← Ch.7: interpreter loop, opcode execution, gas metering
└──────┬───────┘
       │ SSTORE, CREATE etc. modify state
┌──────┴───────┐
│   StateDB    │ ← Ch.4: stateObject dirty map, journal rollback
└──────┬───────┘
       │ FinalizeAndAssemble
┌──────┴───────┐
│    Block     │ ← Ch.9: compute state root, assemble block
│   Assembly   │
└──────┬───────┘
       │ CL: getPayload → newPayload → forkchoiceUpdated
┌──────┴───────┐
│  BlockChain  │ ← Ch.10: InsertChain, validate, writeBlockWithState
└──────┬───────┘
       │ ChainHeadEvent
┌──────┴───────┐
│   Storage    │ ← Ch.3/4/5: trie commit → TrieDB → Pebble/LevelDB + Freezer
│    Stack     │
└──────┬───────┘
       │ broadcast to peers
┌──────┴───────┐
│ P2P Network  │ ← Ch.11/12: block propagation, lagging nodes sync
└──────┬───────┘
       │
   Finalized ← CL finalizes, never rolled back

Stage 1: RPC arrival (Chapter 13)

The user calls eth_sendRawTransaction via HTTP/WebSocket/IPC, submitting a signed raw transaction.

HTTP POST → RPC Server decodes JSON
→ split method name on "_": service="eth", method="sendRawTransaction"
→ reflection lookup → TransactionAPI.SendRawTransaction
→ tx.UnmarshalBinary(rawBytes) → types.Transaction
→ SubmitTransaction() checks fee cap, EIP-155 protection
→ txPool.Add()

From this point, the transaction leaves the external world and enters geth internals.
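
The method-name split and reflective lookup can be sketched as follows; `splitMethod` is an invented name for illustration, not geth's actual dispatcher API.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMethod mimics the JSON-RPC dispatch convention described above:
// "eth_sendRawTransaction" splits on the first "_" into a service
// ("eth") and a method ("sendRawTransaction"); geth then resolves the
// method via reflection on the registered service struct.
func splitMethod(name string) (service, method string, ok bool) {
	idx := strings.Index(name, "_")
	if idx < 0 {
		return "", "", false
	}
	return name[:idx], name[idx+1:], true
}

func main() {
	svc, m, _ := splitMethod("eth_sendRawTransaction")
	fmt.Println(svc, m) // eth sendRawTransaction
}
```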

Stage 2: Transaction pool validation and storage (Chapter 8)

The TxPool coordinator routes to the appropriate sub-pool based on transaction type:

TxPool.Add()
├─ Normal transactions (type 0/1/2/4) → LegacyPool
└─ Blob transactions (type 3) → BlobPool
Sub-pool validation:
├─ Signature recovery (ECDSA recover)
├─ Nonce check (not too low, not too large a gap)
├─ Balance check (can cover gasLimit × gasFeeCap + value)
├─ Gas limit check (doesn't exceed block gas limit)
└─ Pool-level limits (account slots, global slots, minimum price)
Passes validation → pending map (ready for block inclusion)

The pool emits NewTxsEvent via event.Feed.

Stage 3: P2P broadcast (Chapter 12)

The handler’s txBroadcastLoop() receives the event and executes dual-layer broadcast:

BroadcastTransactions()
├─ ~sqrt(N) peers: send full transaction (TransactionsMsg)
└─ remaining peers: send only hash (NewPooledTransactionHashesMsg)
▼ remote peers receiving hash
TxFetcher 3-stage pipeline
├─ wait 500ms (direct broadcast may arrive)
├─ didn't arrive → queue for request
└─ send GetPooledTransactionsMsg to fetch
Within seconds → virtually every node in the network has this transaction
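
The sqrt(N) split is just arithmetic over the peer count; actual peer selection in geth is randomized per transaction. A minimal sketch (`splitPeers` is an invented helper):

```go
package main

import (
	"fmt"
	"math"
)

// splitPeers models the dual-layer broadcast: roughly sqrt(N) peers
// receive the full transaction body, the rest only its hash.
func splitPeers(n int) (direct, announce int) {
	direct = int(math.Sqrt(float64(n)))
	return direct, n - direct
}

func main() {
	d, a := splitPeers(25)
	fmt.Println(d, a) // 5 20
}
```

This keeps bandwidth roughly O(sqrt(N)) per node while the hash announcements still reach everyone quickly.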

Stage 4: Block building trigger (Chapter 9)

The transaction waits in the pool. When the consensus layer decides “you are this slot’s proposer”:

CL calls ForkchoiceUpdated (with payloadAttributes)
→ Engine API notifies miner
→ BuildPayload() starts:
├─ Immediately build empty block (guarantee, never miss a slot)
└─ Background goroutine repeatedly builds full blocks
├─ 0s: first full build
├─ 2s: rebuild (new txs may have arrived)
├─ 4s: rebuild again
└─ 6s: CL calls GetPayload, return best version

Stage 5: Transaction execution (Chapter 6)

During each build, fillTransactions() pulls transactions from the pool, sorts by effective tip, and executes one by one:

For each transaction:
core.ApplyTransaction()
→ stateTransition.execute()
├─ preCheck() validate nonce, balance
├─ buyGas() pre-deduct gasLimit × gasFeeCap
├─ EVM dispatch evm.Call() or evm.Create()
├─ calcRefund() refund cap = gasUsed / 5 (EIP-3529)
├─ returnGas() return remaining gas to sender
└─ fee distribution tip → coinbase, baseFee → burned
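
The tip and refund steps reduce to a few lines of arithmetic. This sketch uses uint64 where geth uses big.Int, and `effectiveTip` is an invented helper name; it assumes gasFeeCap ≥ baseFee (otherwise the transaction would not be executable).

```go
package main

import "fmt"

// effectiveTip is min(gasTipCap, gasFeeCap - baseFee), per EIP-1559:
// the tip × gasUsed goes to the coinbase, baseFee × gasUsed is burned.
func effectiveTip(gasFeeCap, gasTipCap, baseFee uint64) uint64 {
	if tip := gasFeeCap - baseFee; tip < gasTipCap {
		return tip
	}
	return gasTipCap
}

func main() {
	gasUsed := uint64(50000)
	refunded := uint64(20000)
	if cap := gasUsed / 5; refunded > cap { // EIP-3529: refund capped at gasUsed/5
		refunded = cap
	}
	tip := effectiveTip(100, 10, 95)
	fmt.Println(tip, refunded) // 5 10000
}
```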

If execution fails, state rolls back to the pre-transaction snapshot and the gas pool is restored — the failed transaction's writes leave no trace in state.
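
The snapshot/revert mechanics behind that rollback can be sketched with a minimal journal in the spirit of StateDB's: every write records an undo closure, and revert replays them in reverse. This is an illustrative structure, not geth's actual journal type.

```go
package main

import "fmt"

type state struct {
	balances map[string]int
	undo     []func() // journal: one undo entry per write
}

// setBalance records the previous value before overwriting it.
func (s *state) setBalance(addr string, v int) {
	prev, existed := s.balances[addr]
	s.undo = append(s.undo, func() {
		if existed {
			s.balances[addr] = prev
		} else {
			delete(s.balances, addr)
		}
	})
	s.balances[addr] = v
}

// snapshot returns a position in the journal; revert rolls back to it,
// undoing entries in reverse order.
func (s *state) snapshot() int { return len(s.undo) }
func (s *state) revert(snap int) {
	for i := len(s.undo) - 1; i >= snap; i-- {
		s.undo[i]()
	}
	s.undo = s.undo[:snap]
}

func main() {
	s := &state{balances: map[string]int{"alice": 100}}
	snap := s.snapshot()
	s.setBalance("alice", 50) // a failing tx's writes...
	s.revert(snap)            // ...are undone, leaving no trace
	fmt.Println(s.balances["alice"]) // 100
}
```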

Stage 6: EVM execution (Chapter 7)

Inside evm.Call(), the interpreter loop runs the contract bytecode:

for {
    ① op        = contract.GetOp(pc)  // fetch opcode
    ② operation = jumpTable[op]       // look up gas cost and handler
    ③ validate stack depth
    ④ deduct constantGas
    ⑤ compute and deduct dynamicGas (e.g., SLOAD cold/warm)
    ⑥ expand memory and charge for it
    ⑦ operation.execute()             // execute!
    pc++
}

State-modifying opcodes (SSTORE, CREATE, etc.) write to StateDB’s dirty map, protected by the journal for rollback.
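
A toy jump-table interpreter in the shape of that loop: fetch the opcode, look up its handler and gas cost, charge gas, execute. The opcodes and costs here are invented for illustration, not real EVM opcodes.

```go
package main

import (
	"errors"
	"fmt"
)

// op pairs a constant gas cost with its handler, like a jump-table entry.
type op struct {
	constantGas uint64
	execute     func(stack *[]uint64)
}

// run is the fetch → lookup → charge → execute loop.
func run(code []byte, gas uint64, table map[byte]op) ([]uint64, error) {
	var stack []uint64
	for pc := 0; pc < len(code); pc++ {
		o, ok := table[code[pc]]
		if !ok {
			return nil, errors.New("invalid opcode")
		}
		if gas < o.constantGas {
			return nil, errors.New("out of gas")
		}
		gas -= o.constantGas
		o.execute(&stack)
	}
	return stack, nil
}

func main() {
	table := map[byte]op{
		0x01: {3, func(s *[]uint64) { *s = append(*s, 1) }}, // push 1
		0x02: {3, func(s *[]uint64) { // add top two stack items
			n := len(*s)
			(*s)[n-2] += (*s)[n-1]
			*s = (*s)[:n-1]
		}},
	}
	stack, err := run([]byte{0x01, 0x01, 0x02}, 100, table)
	fmt.Println(stack, err) // [2] <nil>
}
```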

Stage 7: State commit (Chapters 3/4)

After all transactions are executed:

FinalizeAndAssemble()
├─ Process withdrawals (validator balances)
├─ System-level operations (beacon root, etc.)
├─ statedb.Commit()
│ ├─ Each modified account → storage trie updated and hashed
│ ├─ Account trie updated and hashed
│ └─ Produces 32-byte Merkle state root
└─ Assemble block (header + txs + receipts + withdrawals)

This block is the payload returned to the CL via GetPayload.

Stage 8: Block insertion (Chapter 10)

After the CL validates the block on the beacon chain, it sends it back to geth via NewPayload:

InsertBlockWithoutSetHead()
├─ Header validation (timestamp, gas limit, consensus constraints)
├─ Body validation (tx root, uncles hash, withdrawals hash)
├─ State processing — re-execute all transactions (identical to stages 5-6!)
│ Compare resulting state root with header's StateRoot
│ Mismatch → reject block (INVALID)
└─ writeBlockWithState() persist to disk
Then CL calls ForkchoiceUpdated to designate new head:
→ writeHeadBlock() updates canonical chain pointers
→ emit ChainHeadEvent

Stage 9: Persistence and propagation (Chapters 5/11/12)

Data flows down the storage stack:

StateDB → Trie → TrieDB → ethdb (Pebble/LevelDB)
└─ Old blocks → Freezer (append-only ancient storage)

Meanwhile, the handler broadcasts the new block to peers. Lagging nodes catch up via snap sync or full sync.

Stage 10: Finalization

The CL eventually marks the block (or its ancestor) as finalized:

Geth updates currentFinalBlock pointer
→ This transaction can never be rolled back
→ Permanently part of the canonical chain

From user pressing send to finalization, a transaction passes through RPC → pool → P2P → miner → state transition → EVM → StateDB → trie → blockchain → storage → network → finality — spanning every major subsystem in geth.


Q2: What are the cross-cutting design patterns in the geth codebase?

Recognizing these patterns gives you an “I’ve seen this before” feeling when reading any subsystem.

Pattern 1: Lifecycle interface (start/stop contract)

type Lifecycle interface {
    Start() error
    Stop() error
}

Almost all long-running components follow this contract:

Component    Start()                        Stop()
───────────────────────────────────────────────────────────────────────────
Ethereum     setupDiscovery, handler.Start  handler.Stop, txPool.Close, blockchain.Stop
handler      txBroadcastLoop, txFetcher     stop sync and broadcast
P2P Server   TCP listen, discovery, dialer  disconnect all peers

Key rules:

  • Registration must precede Start — registering after the node has started panics
  • Stop in reverse order — consumers first, producers next, storage last
  • Naming may vary (Init/Close, New/Stop), but the pattern is the same
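
The reverse-order rule can be sketched as follows; `component` and `runStack` are illustrative names in the spirit of geth's node package, not its actual code.

```go
package main

import "fmt"

// Lifecycle is the start/stop contract from the table above.
type Lifecycle interface {
	Start() error
	Stop() error
}

type component struct {
	name string
	log  *[]string // records start/stop order for illustration
}

func (c *component) Start() error { *c.log = append(*c.log, "start "+c.name); return nil }
func (c *component) Stop() error  { *c.log = append(*c.log, "stop "+c.name); return nil }

// runStack starts every lifecycle in registration order and stops them
// in reverse, mirroring "consumers first, storage last".
func runStack(stack []Lifecycle) {
	for _, l := range stack {
		l.Start()
	}
	for i := len(stack) - 1; i >= 0; i-- {
		stack[i].Stop()
	}
}

func main() {
	var log []string
	runStack([]Lifecycle{
		&component{"storage", &log}, // registered first, stopped last
		&component{"txpool", &log},
		&component{"handler", &log}, // registered last, stopped first
	})
	fmt.Println(log)
}
```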

Pattern 2: Event Feed (publish/subscribe decoupling)

// Producer
type BlockChain struct {
    chainHeadFeed event.Feed
}

func (bc *BlockChain) insertChain(...) {
    bc.chainHeadFeed.Send(ChainHeadEvent{Block: block})
}

// Consumer
headCh := make(chan core.ChainHeadEvent, 64)
sub := bc.SubscribeChainHeadEvent(headCh)
for {
    select {
    case head := <-headCh:
        // react to new chain head
    case err := <-sub.Err():
        return
    }
}

Key feeds in geth:

Event           Producer         Consumer                   Purpose
──────────────────────────────────────────────────────────────────────────
ChainHeadEvent  BlockChain       miner, handler,            new block arrived
                                 filter system, txpool
NewTxsEvent     TxPool           handler                    new tx, needs broadcast
WalletEvent     account backend  startNode wallet listener  wallet plugged/unplugged
ChainSideEvent  BlockChain       filter system              side chain block (uncles)

Why Feed instead of direct calls? Decoupling. BlockChain doesn’t need to import the miner package, miner doesn’t need to import the handler package. They communicate through events, unaware of each other’s existence. Adding a new consumer requires no changes to producer code.

Pattern 3: Backend interface (API vs. implementation separation)

JSON-RPC methods (ethapi/api.go)
      │ calls
Backend interface (ethapi/backend.go)
      │ implemented by
      ├─ EthAPIBackend (full node)
      └─ LESAPIBackend (light client)
            │ delegates to
BlockChain, TxPool, Miner, StateDB...

eth_getBalance code doesn’t care whether it’s running on a full node or light client — it just calls backend.StateAndHeaderByNumberOrHash(). The concrete implementation decides where data comes from (local database vs. remote request).

This pattern appears in many places:

  • Backend interface → bridge between RPC API and core
  • consensus.Engine interface → bridge between consensus logic and blockchain
  • ethdb.KeyValueStore interface → bridge between storage logic and concrete KV engine
  • txpool.SubPool interface → bridge between pool coordinator and pool implementations
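
The shape of the Backend pattern in miniature: the API layer programs against an interface and never learns which implementation serves it. The types below are illustrative stand-ins for EthAPIBackend/LESAPIBackend, not geth's real declarations.

```go
package main

import "fmt"

// Backend is the bridge interface the RPC layer calls.
type Backend interface {
	BalanceAt(addr string) uint64
}

// fullNodeBackend would read the local database.
type fullNodeBackend struct{}

func (fullNodeBackend) BalanceAt(addr string) uint64 { return 100 }

// lightClientBackend would fetch the same data from remote peers.
type lightClientBackend struct{}

func (lightClientBackend) BalanceAt(addr string) uint64 { return 100 }

// getBalance is the RPC-handler side: identical code for both backends.
func getBalance(b Backend, addr string) uint64 {
	return b.BalanceAt(addr)
}

func main() {
	fmt.Println(getBalance(fullNodeBackend{}, "0xabc"))
	fmt.Println(getBalance(lightClientBackend{}, "0xabc"))
}
```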

Pattern 4: Config struct (one configuration aggregate per subsystem)

CLI flags
│ utils.SetNodeConfig()
node.Config → data dir, P2P settings, RPC endpoints
│ utils.SetEthConfig()
ethconfig.Config → sync mode, cache size, gas price
├─ p2p.Config → max peers, listen addr, NAT, bootnodes
├─ ChainConfig → fork activation times, chain ID, consensus rules
└─ TxPool.Config → price limits, slot limits, journal settings

Each Config has hardcoded defaults, overridable by a TOML file, overridable again by CLI flags. The three layers stack, and the result is passed to the corresponding subsystem’s constructor.
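
The three-layer override can be sketched as successive overlays; the `Config` fields and `apply` helper here are invented for illustration.

```go
package main

import "fmt"

// Config is a toy subsystem configuration aggregate.
type Config struct {
	MaxPeers int
	DataDir  string
}

// defaults is layer 1: hardcoded defaults.
func defaults() Config { return Config{MaxPeers: 50, DataDir: "~/.ethereum"} }

// apply overlays the non-zero fields of an override layer onto base.
func apply(base, over Config) Config {
	if over.MaxPeers != 0 {
		base.MaxPeers = over.MaxPeers
	}
	if over.DataDir != "" {
		base.DataDir = over.DataDir
	}
	return base
}

func main() {
	cfg := defaults()
	cfg = apply(cfg, Config{MaxPeers: 100})    // layer 2: from TOML file
	cfg = apply(cfg, Config{DataDir: "/data"}) // layer 3: from CLI flag
	fmt.Println(cfg.MaxPeers, cfg.DataDir) // 100 /data
}
```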

Pattern 5: Four-layer storage model (read-through + write-flush)

Layer 4: StateDB in-memory dirty maps + journal rollback
│ read-through ↓ ↑ writes accumulate
Layer 3: Trie Merkle Patricia Trie nodes
│ read-through ↓ ↑ commit flushes
Layer 2: TrieDB caching layer (path-based or hash-based)
│ read-through ↓ ↑ flush to disk
Layer 1: ethdb KV store (Pebble/LevelDB) + Freezer

Read direction: StateDB checks its dirty cache first → on a miss, the read falls through to the trie → the trie falls through to TrieDB → TrieDB falls through to disk.

Write direction: modifications accumulate in StateDB’s dirty map → commit flushes them to the trie → the trie commits to TrieDB → TrieDB eventually flushes to disk.

This “accumulate at upper layers, periodically flush to lower layers” pattern makes per-transaction state modifications extremely cheap (memory only), paying the trie hashing and disk write cost only at block commit time.
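
The read-through and write-flush directions can be sketched with a generic two-layer cache; the `layer` type is an invented illustration of the same idea, not geth's storage code.

```go
package main

import "fmt"

// layer is one level of the stack: its own cache plus the layer below.
type layer struct {
	name  string
	cache map[string]string
	below *layer
}

// get checks this layer's cache and falls through downward on a miss,
// populating the cache on the way back up (read-through).
func (l *layer) get(key string) (string, bool) {
	if v, ok := l.cache[key]; ok {
		return v, true
	}
	if l.below == nil {
		return "", false
	}
	v, ok := l.below.get(key)
	if ok {
		l.cache[key] = v
	}
	return v, ok
}

// commit flushes this layer's accumulated writes to the layer below,
// like StateDB flushing to the trie at block commit time.
func (l *layer) commit() {
	for k, v := range l.cache {
		l.below.cache[k] = v
	}
	l.cache = map[string]string{}
}

func main() {
	disk := &layer{name: "ethdb", cache: map[string]string{"acct": "v0"}}
	statedb := &layer{name: "StateDB", cache: map[string]string{}, below: disk}

	v, _ := statedb.get("acct")
	fmt.Println(v)                  // v0: read fell through to disk
	statedb.cache["acct"] = "v1"    // dirty write stays in memory
	statedb.commit()                // flushed only at commit
	fmt.Println(disk.cache["acct"]) // v1
}
```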

Why recognizing these patterns matters

When you encounter an unfamiliar subsystem:

1. Does it have Start/Stop? → Lifecycle pattern; find where it's registered
2. Does it Send or Subscribe to something? → Event Feed pattern; find the producers and consumers
3. Does it call an interface or a concrete type? → Backend pattern; find the implementation
4. Does its constructor accept a Config? → Config pattern; find the defaults and CLI mapping
5. How many layers does its data cross? → Four-layer storage; find the read-through and flush paths

These five questions let you locate any subsystem’s position in the geth architecture within minutes.

https://kehaozheng.vercel.app/posts/chainethgeth/15_qa/
Author
Kehao Zheng
Published at
2026-04-24
License
CC BY-NC-SA 4.0
