Project: Scanning through the entire Bitcoin blockchain

If you have any questions, comments, corrections, or suggestions, please contact StJohn Piano on Tela:
tela.app/id/stjohn_piano/7c51a6


Project Summary:


NewHedge (newhedge.io), a Bitcoin financial terminal company, hired me through Toptal Network to load the Bitcoin blockchain and store it in a database.

Using an 8-core machine and Python multiprocessing, I retrieved the first 800000 blocks from a Bitcoin node and stored them in a PostgreSQL database.

Average time per block: 0.32 seconds

Estimated time required to process the entire blockchain: about 3 days

Project Outcome:


NewHedge hired me through Toptal Network to load the Bitcoin blockchain and store it in a database. NewHedge is a new Bitcoin financial terminal that helps investors make data-driven decisions. It offers a personalised dashboard that tracks news updates, on-chain data, and adoption metrics.

Working on an 8-core machine connected to a dedicated Bitcoin node, I wrote a Python script that retrieved the first 800000 blocks (out of 807449) from the node and stored them in a PostgreSQL database. The script used Python multiprocessing to keep all the available cores busy and to make concurrent block retrieval requests to the node. Queuing and retrieval tracking were required to distribute the work among the cores, calculate the request failure rate, and retry failures.
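The queuing, failure tracking, and retry logic described above can be sketched roughly as follows. This is a minimal illustration, not the script used in the project: `fetch_block` is a hypothetical stand-in that simulates a node request (it fails on the first attempt for every fifth block), and the function names and the `(height, attempt)` signature are assumptions made for the example. Storing the blocks in PostgreSQL is left out.

```python
import multiprocessing as mp
from functools import partial


def fetch_block(height, attempt):
    """Hypothetical stand-in for a request to the Bitcoin node.

    Simulates a transient failure: every 5th height fails on its first attempt.
    """
    if height % 5 == 0 and attempt == 0:
        raise ConnectionError(f"simulated timeout for block {height}")
    return {"height": height, "tx_count": height % 7}


def _attempt(fetch, attempt, height):
    """Run one request and report success or failure instead of raising."""
    try:
        return height, fetch(height, attempt), None
    except Exception as exc:
        return height, None, repr(exc)


def retrieve_blocks(heights, fetch, map_fn=map, max_attempts=3):
    """Fetch every height, retry failures, and track the request failure rate."""
    pending = list(heights)
    blocks, requests, failures = {}, 0, 0
    for attempt in range(max_attempts):
        if not pending:
            break
        outcomes = list(map_fn(partial(_attempt, fetch, attempt), pending))
        requests += len(outcomes)
        pending = []  # rebuilt below from the heights that failed this round
        for height, block, error in outcomes:
            if error is None:
                blocks[height] = block
            else:
                failures += 1
                pending.append(height)
    failure_rate = failures / requests if requests else 0.0
    return blocks, pending, failure_rate


def retrieve_with_pool(heights, workers=8):
    """Same loop, with each round of requests spread across worker processes."""
    with mp.Pool(workers) as pool:
        return retrieve_blocks(heights, fetch_block, map_fn=pool.map)
```

Calling `retrieve_blocks(range(10), fetch_block)` runs the loop serially, which is convenient for testing. `retrieve_with_pool` passes `pool.map` instead, so the same retry logic runs across 8 processes. On platforms that start workers with spawn, it should be called from under an `if __name__ == "__main__":` guard.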

For the last 434342 blocks, I added timing code. The script processed these 434342 blocks in 139815 seconds (about 39 hours, or 1.6 days), which is an average of 0.32 seconds per block. Applying this average to the earlier 365658 blocks as well gives an estimate for all 800000 blocks: 800000 * 0.32 = 256000 seconds (about 71 hours, or 3 days). Note: Early blocks were not particularly full, so the actual time to process the blockchain from start to finish would probably be less than this.
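The arithmetic above can be reproduced with a short Python check. The variable names are illustrative; the figures are the ones quoted in this section.

```python
timed_blocks = 434342      # blocks processed with timing enabled
elapsed_seconds = 139815   # wall-clock time for those blocks
total_blocks = 800000      # blocks retrieved overall

# Average time per block, rounded to two decimals as in the text.
avg_seconds = round(elapsed_seconds / timed_blocks, 2)

# Extrapolate the rounded average to the full run.
estimate_seconds = total_blocks * avg_seconds

print(f"average per block: {avg_seconds} s")
print(f"timed run: {elapsed_seconds / 3600:.1f} hours")
print(f"estimate: {estimate_seconds / 3600:.1f} hours, "
      f"{estimate_seconds / 86400:.1f} days")
```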

Project Data:


During this project, I gathered some interesting data about the Bitcoin blockchain.

  • The average block size during each set of 10000 blocks.
  • The average number of transactions during each set of 10000 blocks.
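A per-set average like the two listed above could be computed along these lines. This is a sketch under assumptions: the `(height, size_bytes, tx_count)` row shape and the function name are hypothetical, and the sample input is synthetic rather than real chain data.

```python
from collections import defaultdict


def bucket_averages(blocks, bucket_size=10000):
    """Average block size and transaction count per bucket of block heights.

    `blocks` is an iterable of (height, size_bytes, tx_count) tuples,
    a hypothetical row shape chosen for this example.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # [block count, size sum, tx sum]
    for height, size_bytes, tx_count in blocks:
        bucket = totals[height // bucket_size]
        bucket[0] += 1
        bucket[1] += size_bytes
        bucket[2] += tx_count
    # Key each bucket by its first block height.
    return {
        index * bucket_size: (size_sum / count, tx_sum / count)
        for index, (count, size_sum, tx_sum) in sorted(totals.items())
    }


# Tiny synthetic input (not real chain data), bucketed in sets of 2 blocks.
sample = [(0, 200, 1), (1, 400, 3), (2, 1000, 5), (3, 3000, 7)]
averages = bucket_averages(sample, bucket_size=2)
print(averages)
```

With the default `bucket_size=10000`, 800000 blocks would produce 80 points per series.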

Here is a chart of the results:

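Note: The 1 MB limit applies to a block's base (non-witness) data. Since SegWit activated in August 2017, blocks have been limited to 4 million weight units, so a block's total serialized size can exceed 1 MB.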

Observations:

  • Generally the two measurements correlate quite well.
  • The story of the last 15 years is one of a steady buildup to peak throughput and size at around block 500k, followed by a slow decline, and then a recent uptick.
  • Towards the right-hand side of the chart, the average number of transactions per block has spiked upwards, above its usual position relative to the average block size. This is interesting. Perhaps the ecosystem has now optimised more thoroughly for throughput.
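One way to put numbers on these observations would be to compute the correlation between the two series, and the average bytes per transaction in each set of blocks. A falling bytes-per-transaction figure would correspond to the transaction count rising relative to block size. The sketch below assumes the two series are available as equal-length lists. The function names are illustrative and the input values are synthetic placeholders, not measurements from this project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def bytes_per_transaction(avg_sizes, avg_tx_counts):
    """Average block size divided by average transaction count, per set."""
    return [size / txs for size, txs in zip(avg_sizes, avg_tx_counts)]


# Synthetic placeholder series, not measurements from this project.
avg_sizes = [100.0, 200.0, 300.0, 300.0]
avg_tx_counts = [1.0, 2.0, 3.0, 6.0]

print(pearson(avg_sizes, avg_tx_counts))
print(bytes_per_transaction(avg_sizes, avg_tx_counts))
```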
