Q1: How does a transaction propagate across the entire network?
Overall flow
```
User submits transaction (eth_sendRawTransaction)
  │
  ▼
Local txpool receives it, emits NewTxsEvent
  │
  ▼
handler.txBroadcastLoop() receives the event
  │
  ▼
BroadcastTransactions() dual-layer distribution
  ├─ Send full tx to ~sqrt(N) peers directly (TransactionsMsg)
  └─ Send only hash to all remaining peers (NewPooledTransactionHashesMsg)
        │                              │
        ▼                              ▼
  Peers receiving full tx        Peers receiving hash announcement
  Add directly to txpool         TxFetcher 3-stage pipeline
                                 ├─ Wait 500ms (direct broadcast may arrive)
                                 ├─ Didn't arrive? Queue for request
                                 └─ Send GetPooledTransactionsMsg to fetch
                                      │
                                      ▼
                                 Receive full tx, add to txpool
                                      │
                                      ▼
This peer also fires NewTxsEvent → continues propagating to its peers...
```

Within seconds, the transaction reaches virtually every node in the network.
Dual-layer broadcast strategy
Why not send the full transaction to all peers? Because it wastes too much bandwidth: with 50 peers, each node would push 50 full copies of every transaction, and with every node doing the same, traffic would explode.
Geth’s strategy is direct broadcast + hash announcement:
```go
func (h *handler) BroadcastTransactions(txs types.Transactions) {
    for _, tx := range txs {
        switch {
        case tx.Type() == types.BlobTxType:
            // Blob transactions: announce only (~768KB, too large)
        case tx.Size() > txMaxBroadcastSize: // 4KB
            // Large transactions: announce only
        default:
            // Normal transactions: select sqrt(N) peers for direct send
            directSet = choice.choosePeers(peers, txSender)
        }

        for _, peer := range peers {
            if peer.KnownTransaction(tx.Hash()) {
                continue // peer already knows this tx, skip
            }
            if _, ok := directSet[peer]; ok {
                txset[peer] = append(...) // direct send list
            } else {
                annos[peer] = append(...) // hash announcement list
            }
        }
    }
}
```

Three key details:
1) sqrt(N) selection: With 50 peers, roughly 7 get the full transaction, the remaining 43 get only the hash. Those 43 will actively request the full transaction if they need it.
2) Deterministic selection: Which peers are chosen is not random; it is computed deterministically via siphash(key, self_id, peer_id, tx_sender). Different nodes therefore pick different peer subsets for direct broadcast, giving good network-wide coverage (see the sketch after this list).
3) Special cases: Blob transactions (~768KB) and large transactions (>4KB) are always announce-only — sending full data is too expensive.
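For illustration, here is a minimal, self-contained sketch of the deterministic selection from point 2. Geth keys a siphash with a per-node secret; this sketch substitutes FNV-1a from the standard library, and all names (choosePeers, selfID, and so on) are hypothetical, so treat it as the shape of the idea rather than Geth's actual code.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"sort"
)

// choosePeers deterministically picks ~sqrt(N) peers for direct broadcast.
// Geth uses a keyed siphash; FNV-1a stands in here to stay dependency-free.
func choosePeers(selfID string, peerIDs []string, txSender string) map[string]bool {
	n := int(math.Ceil(math.Sqrt(float64(len(peerIDs)))))

	type scored struct {
		id    string
		score uint64
	}
	ranked := make([]scored, 0, len(peerIDs))
	for _, id := range peerIDs {
		h := fnv.New64a()
		h.Write([]byte(selfID))   // different nodes rank peers differently
		h.Write([]byte(id))       // each peer gets its own score
		h.Write([]byte(txSender)) // the same sender always maps to the same subset
		ranked = append(ranked, scored{id, h.Sum64()})
	}
	// Lowest scores win: the choice is stable for a given (self, peer, sender) triple.
	sort.Slice(ranked, func(i, j int) bool { return ranked[i].score < ranked[j].score })

	direct := make(map[string]bool, n)
	for _, p := range ranked[:n] {
		direct[p.id] = true
	}
	return direct
}

func main() {
	peers := []string{"peer-a", "peer-b", "peer-c", "peer-d"}
	fmt.Println(choosePeers("self-node", peers, "0xSenderAddress"))
}
```

Because every node mixes its own identity into the hash, the ~sqrt(N) subsets chosen across the network differ, which is what gives the full transaction good coverage without everyone sending it to everyone.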
knownTxs: avoiding duplicate sends
Each peer maintains a knownCache (up to 32768 hashes) recording “transactions this peer already knows about”:
```
Peer A's knownTxs: {tx1, tx2, tx3, ...}

When I want to broadcast tx2 to peer A:
  peer.KnownTransaction(tx2.Hash()) == true → skip, don't send
```

A transaction is marked as known when:
- I send it to the peer → mark
- The peer sends it to me → mark
- The peer announces it to me → mark
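As a rough illustration, the sketch below implements a bounded known-set with simple FIFO eviction. The type and method names are hypothetical; Geth's knownCache is a separate implementation, but the contract is the same: remember recently seen hashes, cap memory, and answer membership queries.

```go
// knownCache is a hypothetical bounded set of transaction hashes.
type knownCache struct {
	max    int
	hashes map[[32]byte]struct{}
	order  [][32]byte // insertion order, oldest first
}

func newKnownCache(max int) *knownCache {
	return &knownCache{max: max, hashes: make(map[[32]byte]struct{})}
}

// Add records a hash, evicting the oldest entry once the cap is reached.
func (c *knownCache) Add(h [32]byte) {
	if _, ok := c.hashes[h]; ok {
		return
	}
	if len(c.order) >= c.max {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.hashes, oldest)
	}
	c.hashes[h] = struct{}{}
	c.order = append(c.order, h)
}

// Contains reports whether the peer is already assumed to know this hash.
func (c *knownCache) Contains(h [32]byte) bool {
	_, ok := c.hashes[h]
	return ok
}
```

The cap (32768 in Geth) keeps per-peer memory bounded; evicting an old entry at worst causes an occasional duplicate send, never a missed one.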
Per-peer send queues
Direct broadcasts and hash announcements each have an async channel:
```
Peer struct:
  txBroadcast chan []common.Hash   ← direct broadcast queue
  txAnnounce  chan []common.Hash   ← hash announcement queue

broadcastTransactions() goroutine:
  Read hashes from txBroadcast → Fetch full tx data from txpool
  → Pack into packets up to 100KB → Send TransactionsMsg

announceTransactions() goroutine:
  Read hashes from txAnnounce
  → Send NewPooledTransactionHashesMsg (includes hash + type + size)
```

Announcements carry type and size metadata, letting the receiver decide whether to fetch without an extra round trip.
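A minimal sketch of this per-peer queue pattern, using hypothetical names and plain string hashes: the protocol handler enqueues hashes, and a dedicated goroutine per peer drains the channel and sends bounded batches.

```go
// announceLoop drains one peer's announcement queue and sends bounded batches.
// The real Geth goroutines also pull tx metadata from the pool and respect
// per-message size limits; maxBatch here is illustrative, not a Geth constant.
func announceLoop(txAnnounce <-chan []string, send func(batch []string)) {
	const maxBatch = 256
	for hashes := range txAnnounce {
		for len(hashes) > 0 {
			n := len(hashes)
			if n > maxBatch {
				n = maxBatch
			}
			send(hashes[:n]) // e.g. one NewPooledTransactionHashesMsg on the wire
			hashes = hashes[n:]
		}
	}
}
```

The queue decouples the broadcast loop from slow peers: in Geth, the enqueue side drops announcements when a peer's queue is full rather than blocking, so one laggy connection cannot stall propagation to everyone else.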
TxFetcher: three-stage fetch pipeline
When a peer announces a transaction hash, the receiving node’s TxFetcher processes it through a three-stage pipeline:
Stage 1: Waitlist (wait 500ms)

```
Hash announced → placed in waitlist
Wait 500ms to see if the full tx arrives via another peer's direct broadcast
```

Why wait? Because another peer is very likely already broadcasting the full tx to you. If it arrives, there is nothing left to do and no need to fetch actively.

Stage 2: Queue (ready to request)

```
500ms passed and the tx still hasn't arrived → move it to the request queue
```

Stage 3: Fetching (send request)

```
Take hashes from the queue, send GetPooledTransactionsMsg to a peer that announced them
→ Peer replies with PooledTransactionsMsg (full tx data)
→ Add to local txpool
```
If no reply within 5 seconds → retry with a different peer.
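To make the three stages concrete, here is a heavily simplified, hypothetical state machine (it is not Geth's txfetcher.go): hashes move from waitlist to queue after the arrival timeout, and in-flight requests that time out are re-queued so another peer can be asked.

```go
package sketch

import "time"

const (
	arriveTimeout = 500 * time.Millisecond // give direct broadcasts a chance first
	fetchTimeout  = 5 * time.Second        // then give up on a peer and retry
)

// fetcher tracks each announced hash through the three stages.
type fetcher struct {
	waitlist map[string]time.Time // hash → when it was announced
	queue    map[string]bool      // ready to be requested
	fetching map[string]time.Time // hash → when the request was sent
}

func newFetcher() *fetcher {
	return &fetcher{
		waitlist: make(map[string]time.Time),
		queue:    make(map[string]bool),
		fetching: make(map[string]time.Time),
	}
}

// announce records a hash heard in a peer announcement.
func (f *fetcher) announce(hash string) {
	if _, ok := f.waitlist[hash]; !ok {
		f.waitlist[hash] = time.Now()
	}
}

// delivered is called when the full tx arrives, by direct broadcast or fetch reply.
func (f *fetcher) delivered(hash string) {
	delete(f.waitlist, hash)
	delete(f.queue, hash)
	delete(f.fetching, hash)
}

// tick promotes stale waitlist entries, issues requests, and re-queues timeouts.
func (f *fetcher) tick(request func(hash string)) {
	now := time.Now()
	for hash, since := range f.waitlist {
		if now.Sub(since) >= arriveTimeout {
			delete(f.waitlist, hash)
			f.queue[hash] = true
		}
	}
	for hash, since := range f.fetching {
		if now.Sub(since) >= fetchTimeout {
			delete(f.fetching, hash)
			f.queue[hash] = true // retry via a different announcing peer next time
		}
	}
	for hash := range f.queue {
		delete(f.queue, hash)
		f.fetching[hash] = now
		request(hash) // e.g. a GetPooledTransactionsMsg to a peer that announced it
	}
}
```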
Key constants:

```
maxTxAnnounces     = 4096        // max pending announcements per peer
maxTxRetrievals    = 256         // max txs per fetch request
maxTxRetrievalSize = 128 * 1024  // max 128KB per fetch request
txArriveTimeout    = 500ms       // waitlist timeout
txFetchTimeout     = 5s          // fetch request timeout
```

Underpriced transaction cache
TxFetcher also tracks transactions rejected by the txpool as “too cheap”:
```
Fetch tx8 → txpool rejects (fee too low) → tx8's hash is cached for 5 minutes

During those 5 minutes, if other peers also announce tx8:
→ TxFetcher skips it immediately, no re-request
```

This avoids repeatedly wasting bandwidth fetching the same transaction that is doomed to be rejected.
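A minimal sketch of such a time-bounded rejection cache, with hypothetical names; the only point is that entries expire on their own after the TTL.

```go
package sketch

import "time"

// underpricedCache remembers hashes the txpool recently rejected as too cheap.
type underpricedCache struct {
	ttl     time.Duration
	entries map[string]time.Time // hash → expiry time
}

func newUnderpricedCache(ttl time.Duration) *underpricedCache {
	return &underpricedCache{ttl: ttl, entries: make(map[string]time.Time)}
}

// Reject marks a hash as recently rejected by the txpool.
func (c *underpricedCache) Reject(hash string) {
	c.entries[hash] = time.Now().Add(c.ttl)
}

// Skip reports whether an announced hash should be ignored; expired entries
// are cleaned up lazily as they are looked up again.
func (c *underpricedCache) Skip(hash string) bool {
	expiry, ok := c.entries[hash]
	if !ok {
		return false
	}
	if time.Now().After(expiry) {
		delete(c.entries, hash)
		return false
	}
	return true
}
```

With a 5-minute TTL, repeated announcements of the same underpriced transaction are dropped before any network request is made.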
Q2: How does a node sync the entire chain from scratch?
Two sync modes
| | Full Sync | Snap Sync |
|---|---|---|
| Downloads | headers + bodies | headers + bodies + receipts |
| State acquisition | Re-execute every tx from genesis | Download state snapshot at pivot block |
| Tx execution | All (hundreds of millions) | Only the last ~64 blocks |
| Time | Days | Hours |
| Requires snap protocol peers | No | Yes |
Snap sync is the default mode. It skips the most time-consuming step: re-executing historical transactions.
Sync pipeline overview
```
Consensus layer: "New head at block N"
  │
  ▼
Stage 1: Skeleton sync (download headers backwards)
  Download header chain from block N towards genesis
  512 headers per batch, stored in scratch space
  │
  ▼
Stage 2: Backfill (concurrently download bodies + receipts)
  Once skeleton links to local chain, spawn concurrent fetchers:
  ├─ fetchHeaders()  — read already-downloaded headers from skeleton
  ├─ fetchBodies()   — download block bodies (txs, uncles, withdrawals)
  └─ fetchReceipts() — download receipts (snap sync only)
  │
  ▼
Stage 3: Processing and import
  Full sync: processFullSyncContent()
    → InsertChain() executes every tx, builds state from scratch
  Snap sync: processSnapSyncContent()
    ├─ Below pivot: import with downloaded receipts (no execution)
    ├─ SnapSyncer downloads state snapshot at pivot in parallel
    └─ Above pivot (~64 blocks): full execution
```

Skeleton syncer: why download backwards?
The skeleton is the core mechanism for header downloading. It starts from the head provided by the consensus layer and downloads backwards:
```
Chain head ────────────────────────────→ Genesis
           (skeleton download direction)
```

Why backwards? Because after the Merge, the consensus layer tells you "the chain head is here." You only know the endpoint, not what blocks lie in between. Downloading backwards allows:
- Starting from a known trusted point (head provided by CL)
- Each header contains `parentHash`, so chain continuity can be verified link by link in the backward direction (see the sketch after this list)
- Eventually connecting to locally existing chain data
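A minimal sketch of that backward link check, using a simplified header type rather than Geth's types.Header: each older header must be exactly the parent that the newer, already-trusted header commits to.

```go
package sketch

import "fmt"

// header is a simplified stand-in for types.Header.
type header struct {
	Number     uint64
	Hash       [32]byte
	ParentHash [32]byte
}

// verifyBackwards checks a batch ordered newest → oldest, starting from an
// already-trusted header (ultimately the CL-provided chain head).
func verifyBackwards(trusted header, olderBatch []header) error {
	prev := trusted
	for _, h := range olderBatch {
		if h.Hash != prev.ParentHash {
			return fmt.Errorf("header %d is not the parent of header %d", h.Number, prev.Number)
		}
		if h.Number+1 != prev.Number {
			return fmt.Errorf("non-contiguous numbers: %d followed by %d", prev.Number, h.Number)
		}
		prev = h
	}
	return nil
}
```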
Subchains: handling interruptions and restarts
Sync can be interrupted (network disconnect, node restart). The skeleton tracks progress using a subchain list:
```
Initial state (CL announces head = 1000):
  Subchain 1: [Head: 1000, Tail: 1000]   (just the tip)

After downloading 200 headers:
  Subchain 1: [Head: 1000, Tail: 800]    (headers 800~1000 downloaded)

Node restarts, CL announces new head = 1050:
  Subchain 1: [Head: 1050, Tail: 1050]   ← new tip
  Subchain 2: [Head: 1000, Tail: 800]    ← previous progress

After filling the gap between 1000~1050:
  Subchain 1: [Head: 1050, Tail: 800]    ← merged!

Continue downloading backwards, eventually link to local chain (or genesis):
  Subchain 1: [Head: 1050, Tail: 0]      ← complete
```

Each subchain records three values: Head (newest block number), Tail (oldest block number), and Next (parent hash of Tail, used for link verification). Progress is persisted to disk, so restarts don't lose already-downloaded data.
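Here is a minimal sketch of the bookkeeping, under the simplifying assumption that subchains are kept newest-first and only the newest one is being extended; field and function names are illustrative, not Geth's skeleton code (which also verifies the hash link before merging).

```go
package sketch

// subchain mirrors the three values described above.
type subchain struct {
	Head uint64   // newest block number covered
	Tail uint64   // oldest block number covered
	Next [32]byte // parent hash of Tail, used to verify the next link down
}

// extendTail records one more accepted header at the bottom of the newest
// subchain; if the tail now touches the head of the next (older) subchain,
// the two are merged into one.
func extendTail(chains []subchain, number uint64, parentHash [32]byte) []subchain {
	cur := &chains[0]
	cur.Tail = number
	cur.Next = parentHash

	if len(chains) > 1 && chains[1].Head+1 == number {
		// Merged: keep the newer head, adopt the older tail and its link hash.
		cur.Tail = chains[1].Tail
		cur.Next = chains[1].Next
		chains = append(chains[:1], chains[2:]...)
	}
	return chains
}
```

In the example above, once subchain 1's tail reaches 1001 (whose parent is subchain 2's head at 1000), the two merge into [Head: 1050, Tail: 800].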
Snap sync’s pivot block
The key concept in snap sync is the pivot block:
```
Genesis ─────────────────────────── Pivot ──────────── Chain head
│            Zone A               │       Zone B      │
│  Import with downloaded         │  Full execution   │
│  receipts (no execution)        │  (~64 blocks)     │
└─────────────────────────────────┘
        Meanwhile: SnapSyncer downloads the
        state snapshot at the pivot block
```

The pivot is chosen at least 64 blocks behind the chain head. Why 64?
- Full execution of 64 blocks ensures local state is correct
- 64 also corresponds to half the trie’s in-memory retention depth (the 128-block GC window from Chapter 10)
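As a toy illustration of the head-minus-64 rule (the constant name mirrors the one listed under "Key limits" below; real pivot handling in Geth also moves the pivot forward as the chain advances during a long sync):

```go
package sketch

// fsMinFullBlocks is the number of most recent blocks snap sync still executes in full.
const fsMinFullBlocks = 64

// pivotBlock picks the block whose state will be downloaded as a snapshot.
func pivotBlock(head uint64) uint64 {
	if head <= fsMinFullBlocks {
		return 0 // chain too short to skip anything: behave like full sync from genesis
	}
	return head - fsMinFullBlocks
}
```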
SnapSyncer uses the snap protocol to download the complete state at the pivot in parallel from multiple peers — account trie, storage tries, and contract bytecode. This is much faster than re-executing all historical transactions.
Concurrent fetcher architecture
Body and receipt downloading use a concurrent fetcher pattern:
```
fetchBodies() goroutine:
  for {
    1. Take a batch of headers needing bodies from the queue (up to 128)
    2. Select a peer that has this data
    3. Send GetBlockBodiesMsg
    4. Wait for BlockBodiesMsg response
    5. Validate data, pass to processor
  }

fetchReceipts() goroutine: (snap sync only)
  Similar logic, up to 256 receipts per batch
```

Multiple fetchers run concurrently, pulling different ranges of data from different peers to maximize download bandwidth utilization.
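A minimal sketch of the pattern with hypothetical types: several workers drain a shared queue of block ranges, and each request goes out to a different peer in parallel.

```go
package sketch

import "sync"

// task is a range of block numbers whose bodies one request should cover.
type task struct{ from, to uint64 }

// runFetchers starts n workers that drain the task queue concurrently; fetch
// stands in for "pick a peer, send GetBlockBodiesMsg, validate the reply".
func runFetchers(n int, tasks <-chan task, fetch func(task)) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range tasks {
				fetch(t)
			}
		}()
	}
	wg.Wait()
}
```

Throughput then scales with the number of healthy peers rather than with any single connection's bandwidth.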
Key limits
```
MaxBlockFetch     = 128   // max 128 bodies per request
MaxHeaderFetch    = 192   // max 192 headers per request
MaxReceiptFetch   = 256   // max 256 receipts per request
maxResultsProcess = 2048  // max 2048 results to import at once
fsMinFullBlocks   = 64    // min fully-executed blocks in snap sync
```

Q3: What is Fork ID and why is it needed? (EIP-2124)
The problem
Ethereum has gone through many hard forks (Homestead, Byzantium, London, Shanghai…). Different nodes may run different software versions with different forks activated. If two nodes on different chains try to sync with each other, they only waste time and bandwidth.
How to quickly determine during handshake whether two nodes are compatible?
Fork ID design
Fork ID is an extremely compact identifier, just a 4-byte checksum plus an 8-byte "next fork" value:
```go
type ID struct {
    Hash [4]byte // CRC32(genesis hash + all activated fork block numbers)
    Next uint64  // Block/timestamp of next upcoming fork (0 = no known future fork)
}
```

How it's computed
```go
func NewID(config *params.ChainConfig, genesis *types.Block, head, time uint64) ID {
    // Start from the genesis block hash
    hash := crc32.ChecksumIEEE(genesis.Hash().Bytes())

    // Gather all fork points (by block number and by timestamp)
    forksByBlock, forksByTime := gatherForks(config, genesis.Time())

    // Mix in each already-passed fork
    for _, fork := range forksByBlock {
        if fork <= head {
            hash = checksumUpdate(hash, fork) // activated → mix in
            continue
        }
        return ID{Hash: checksumToBytes(hash), Next: fork} // not yet → set as Next
    }
    // Same for timestamp-based forks...

    return ID{Hash: checksumToBytes(hash), Next: 0} // no known future forks
}
```

Concrete example:
```
Mainnet genesis hash: 0xd4e56740...
Activated forks: Homestead(1150000), Byzantium(4370000), London(12965000), ...
Current head: 20000000
Next unactivated fork: suppose Prague at block 21000000

Hash = CRC32(genesis_hash)
     = CRC32_update(hash, 1150000)   // Homestead
     = CRC32_update(hash, 4370000)   // Byzantium
     = CRC32_update(hash, 12965000)  // London
     = ... each activated fork mixed in sequentially
     = 0xABCD1234 (final 4 bytes)

Next = 21000000 (next unactivated fork)

Fork ID = {Hash: 0xABCD1234, Next: 21000000}
```
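The checksum update itself is just CRC32 continued over the fork point encoded as 8 big-endian bytes. A minimal sketch follows; the function names mirror the excerpt above, but this is an illustrative reimplementation rather than a copy of Geth's forkid package.

```go
package sketch

import (
	"encoding/binary"
	"hash/crc32"
)

// checksumUpdate folds one fork point (block number or timestamp) into the
// running CRC32 by hashing its 8-byte big-endian encoding.
func checksumUpdate(hash uint32, fork uint64) uint32 {
	var blob [8]byte
	binary.BigEndian.PutUint64(blob[:], fork)
	return crc32.Update(hash, crc32.IEEETable, blob[:])
}

// checksumToBytes turns the running CRC32 into the 4-byte Hash field.
func checksumToBytes(hash uint32) [4]byte {
	var out [4]byte
	binary.BigEndian.PutUint32(out[:], hash)
	return out
}
```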
Four validation scenarios

During handshake, two nodes exchange Fork IDs and check compatibility:
Scenario 1: Same fork state (fully compatible)

```
Local:  Hash=0xABCD, Next=21000000
Remote: Hash=0xABCD, Next=21000000
→ ✓ On the same chain, same forks activated
```

Scenario 2: Remote is a subset, still syncing (compatible)

```
Local:  Hash=0xABCD, Next=21000000  (London etc. activated)
Remote: Hash=0x1234, Next=12965000  (London not yet activated)
→ ✓ Remote node may still be syncing and hasn't reached the London fork block yet,
  but it knows Next=12965000 (London), meaning it's aware of this fork
```

Scenario 3: Remote is a superset, we're behind (compatible)

```
Local:  Hash=0x1234, Next=12965000  (London not yet activated)
Remote: Hash=0xABCD, Next=21000000  (London activated)
→ ✓ Local node may be behind; it will catch up
```

Scenario 4: Mismatch (incompatible, disconnect)

```
Local:  Hash=0xABCD (mainnet)
Remote: Hash=0x9999 (some testnet)
→ ✗ Different fork history, cannot be the same chain
```
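The four outcomes can be sketched as a single classification function. The assumptions are spelled out in the comments; the real rules in EIP-2124 and Geth's forkid.Filter are stricter (for example, they also catch a remote that announces a fork our own head should already have passed).

```go
package sketch

// classify reduces the handshake check to the four scenarios above.
// Assumptions: sums[i] is our checksum after mixing in forks[0..i-1]
// (so len(sums) == len(forks)+1), and headIdx is how many of those forks
// our current head has already passed.
func classify(sums []uint32, forks []uint64, headIdx int, remoteHash uint32, remoteNext uint64) string {
	// Scenario 1: the remote reports exactly our current fork state.
	if remoteHash == sums[headIdx] {
		return "compatible: same fork state"
	}
	// Scenario 2: the remote matches an older prefix of our history and its
	// Next announces the very fork we know comes after that prefix → it is behind.
	for i := 0; i < headIdx; i++ {
		if remoteHash == sums[i] && remoteNext == forks[i] {
			return "compatible: remote still syncing"
		}
	}
	// Scenario 3: the remote matches a longer prefix than our head has reached → we are behind.
	for i := headIdx + 1; i < len(sums); i++ {
		if remoteHash == sums[i] {
			return "compatible: we are behind"
		}
	}
	// Scenario 4: no recognisable shared fork history.
	return "incompatible: different chain or fork set"
}
```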
Why not just compare genesis hash + full fork list?

- Compactness: Fork ID is just a 4-byte checksum plus an 8-byte Next value, while a full fork list grows with every hard fork.
- Forward compatibility: The `Next` field lets older node versions know "a new fork is coming," even if they don't know its specifics.
- CRC32 irreversibility: You can't reverse-engineer which specific forks were activated from the Fork ID, but that doesn't matter; you only need to know "are we compatible."
gatherForks() implementation
gatherForks() scans the ChainConfig struct via reflection to collect all fork points:
```
ChainConfig fields:
  HomesteadBlock: 1150000      ← block-number-based forks
  ByzantiumBlock: 4370000
  LondonBlock:    12965000
  ShanghaiTime:   1681338455   ← timestamp-based forks
  CancunTime:     1710338135
  ...

gatherForks() output:
  forksByBlock = [1150000, 4370000, 12965000, ...]
  forksByTime  = [1681338455, 1710338135, ...]
```

The lists are deduplicated and sorted, then mixed into the CRC32 sequentially. This means that every time a new hard fork is added to ChainConfig, the Fork ID updates automatically, with no need to manually maintain a compatibility list.
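A minimal sketch of the normalisation step, assuming the fork numbers have already been pulled out of ChainConfig (the reflection walk itself is omitted): unset forks are dropped, duplicates collapse to one entry, and the result is sorted so the CRC32 mixing order is stable.

```go
package sketch

import "sort"

// normalizeForks deduplicates, drops unset (zero) entries, and sorts ascending,
// which is the order in which the fork points are folded into the checksum.
func normalizeForks(raw []uint64) []uint64 {
	seen := make(map[uint64]bool)
	out := make([]uint64, 0, len(raw))
	for _, f := range raw {
		if f == 0 || seen[f] { // 0 means "this fork is not configured on this chain"
			continue
		}
		seen[f] = true
		out = append(out, f)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}
```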