Bitcoin v27.2 - Codebase Metrics

We analyze the Bitcoin v27.2 codebase and get some general metrics.

➡️ Welcome to Tela Blog. I'm StJohn Piano, CTO @ Solidi Exchange and a Blockchain Advisor @ Tela Network.

🌎 Sponsor: Solidi Exchange
www.solidi.co

Solidi anchors crypto in the UK. Solidi is a UK-based cryptocurrency exchange and is FCA-registered. Its platform supports fast, simple purchase of Bitcoin and several other major cryptos, and it offers a phone-based OTC service for any other coins.

🌎 Sponsor: Legala
legala.app

Legala is a client-facing chatbot for legal firms that uses artificial intelligence to automate client interviews, gather key facts, and free up valuable time for what really matters: your legal work.

🌎 Sponsor: Tela Network
tela.network

Tela Network is a narrow channel for a noisy world. Goal: Survive and thrive in the age of AI and blockchain. Subscribe to the channel for free. Join the network to post.




Title

Bitcoin v27.2 Codebase Metrics

Executive Summary

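We analyze the Bitcoin v27.2 codebase and report general metrics: developer and branch counts, file types, line counts for source and test code, on-disk size, and directory layout.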

Contents

  • Executive Summary
  • Contents
  • Results
  • Sources
  • Project Log

Results

Total developers (distinct author name/email pairs):
1356

Top 20 developers by number of commits:

  6797	Wladimir J. van der Laan <laanwj@gmail.com>
  5309	MarcoFalke <falke.marco@gmail.com>
  3980	fanquake <fanquake@gmail.com>
  1934	Pieter Wuille <pieter.wuille@gmail.com>
  1836	Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
  1209	Gavin Andresen <gavinandresen@gmail.com>
   957	Andrew Chow <achow101-github@achow101.com>
   889	Luke Dashjr <luke-jr+git@utopios.org>
   846	Wladimir J. van der Laan <laanwj@protonmail.com>
   827	John Newbery <john@johnnewbery.com>
   809	MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
   793	Cory Fields <cory-nospam-@coryfields.com>
   729	practicalswift <practicalswift@users.noreply.github.com>
   715	Jon Atack <jon@atack.com>
   701	Philip Kaufmann <phil.kaufmann@t-online.de>
   629	Jonas Schnelli <dev@jonasschnelli.ch>
   569	Matt Corallo <git@bluematt.me>
   558	Sebastian Falbesoner <sebastian.falbesoner@gmail.com>
   525	MacroFake <falke.marco@gmail.com>
   491	glozow <gloriajzhao@gmail.com>

Total branches:
9845

Total files:
2684

Total files by MIME type:

 692 text/x-c
 658 text/x-c++
 609 text/plain
 346 text/x-script.python
  98 image/png
  64 text/x-shellscript
  52 text/xml
  44 application/json
  34 text/x-diff
  20 image/svg+xml
  18 text/x-m4
   7 text/x-makefile
   7 text/troff
   7 inode/x-empty
   6 text/html
   5 image/x-xpmi
   3 text/csv
   3 image/vnd.microsoft.icon
   2 text/x-asm
   2 image/bmp
   2 application/octet-stream
   1 text/x-tex
   1 text/x-objective-c
   1 text/x-java
   1 image/x-icns
   1 font/sfnt

Total lines by text MIME type:

text/x-c: 184272
text/x-c++: 164702
text/plain: 380564
text/x-script.python: 81611
text/x-shellscript: 8450
text/xml: 21604
text/x-diff: 1664
text/x-m4: 6202
text/x-makefile: 552
text/troff: 2171
text/html: 1151
text/csv: 126
text/x-asm: 925
text/x-tex: 15
text/x-objective-c: 62
text/x-java: 23

Total lines for all text MIME types:
854094

Functional tests runtime:
3328 s

Total Source Lines of Code (SLOC) by MIME type:

text/x-c: 151899
text/x-c++: 138974
text/plain: 335045
text/x-script.python: 2815
text/x-shellscript: 283
text/xml: 20239
text/x-diff: 0
text/x-m4: 1682
text/x-makefile: 17
text/troff: 0
text/html: 743
text/csv: 0
text/x-asm: 925
text/x-tex: 0
text/x-objective-c: 62
text/x-java: 23

Total SLOC:
652707

Total Test Lines of Code (TLOC) by MIME type:

text/x-c: 31532
text/x-c++: 23949
text/plain: 2092
text/x-script.python: 70787
text/x-shellscript: 185
text/xml: 0
text/x-diff: 0
text/x-m4: 0
text/x-makefile: 6
text/troff: 0
text/html: 40
text/csv: 126
text/x-asm: 0
text/x-tex: 0
text/x-objective-c: 0
text/x-java: 0

Total TLOC:
128717

SLOC line percentage:
76%

TLOC line percentage:
15%

Codebase size (after compilation):
3.7 GB

Subdirectory sizes:

3.4G	src/
 22M	test/
5.1M	autom4te.cache/
1.9M	doc/
1.0M	contrib/
916K	build-aux/
568K	share/
468K	depends/
188K	build_msvc/
152K	ci/

Subdirectory sizes (2 levels):

3.3G	src/
1.4G	src/test/
305M	src/bench/
121M	src/rpc/
 85M	src/node/
 47M	src/script/
 34M	src/leveldb/
 30M	src/crypto/
 29M	src/secp256k1/
 22M	test/
 22M	src/util/
 22M	src/index/
 20M	src/qt/
 18M	src/common/
 17M	test/cache/
 16M	src/kernel/
 14M	src/policy/
 11M	src/univalue/
7.3M	src/minisketch/
6.7M	src/primitives/
5.1M	autom4te.cache/
4.9M	src/init/
4.2M	test/functional/
4.2M	src/consensus/
1.9M	src/wallet/
1.9M	doc/
1.2M	doc/release-notes/
1.0M	src/support/
1.0M	contrib/
916K	build-aux/
568K	share/
500K	share/pixmaps/
468K	depends/
444K	build-aux/m4/
436K	src/crc32c/
348K	test/util/
228K	contrib/seeds/
200K	contrib/guix/
188K	build_msvc/
176K	contrib/devtools/
152K	test/lint/
152K	ci/
136K	depends/packages/
128K	doc/man/
124K	depends/patches/
108K	ci/test/
104K	src/ipc/
 80K	contrib/tracing/
 76K	contrib/macdeploy/
 72K	src/zmq/
 68K	src/interfaces/
 52K	src/compat/
 40K	doc/design/
 40K	contrib/signet/
 40K	contrib/completions/
 36K	depends/hosts/
 36K	contrib/verify-commits/
 36K	contrib/verify-binaries/
 28K	src/config/
 28K	contrib/linearize/
 28K	contrib/init/
 24K	share/examples/
 24K	doc/policy/
 24K	depends/builders/
 16K	test/fuzz/
 16K	contrib/testgen/
 16K	build_msvc/libbitcoin_qt/
 12K	test/sanitizer_suppressions/
 12K	share/rpcauth/
 12K	share/qt/
 12K	contrib/windeploy/
 12K	contrib/message-capture/
 12K	ci/retry/
 12K	ci/lint/
8.0K	contrib/shell/
8.0K	contrib/qos/
8.0K	contrib/debian/
8.0K	build_msvc/test_bitcoin/
8.0K	build_msvc/test_bitcoin-qt/
8.0K	build_msvc/msbuild/
8.0K	build_msvc/bitcoind/
4.0K	src/obj/
4.0K	src/logging/
4.0K	contrib/zmq/
4.0K	build_msvc/libunivalue/
4.0K	build_msvc/libtest_util/
4.0K	build_msvc/libsecp256k1/
4.0K	build_msvc/libminisketch/
4.0K	build_msvc/libleveldb/
4.0K	build_msvc/libbitcoin_zmq/
4.0K	build_msvc/libbitcoin_wallet_tool/
4.0K	build_msvc/libbitcoin_wallet/
4.0K	build_msvc/libbitcoin_util/
4.0K	build_msvc/libbitcoin_node/
4.0K	build_msvc/libbitcoin_crypto/
4.0K	build_msvc/libbitcoin_consensus/
4.0K	build_msvc/libbitcoin_common/
4.0K	build_msvc/libbitcoin_cli/
4.0K	build_msvc/bitcoin-wallet/
4.0K	build_msvc/bitcoin-util/
4.0K	build_msvc/bitcoin-tx/
4.0K	build_msvc/bitcoin-qt/
4.0K	build_msvc/bitcoin-cli/
4.0K	build_msvc/bench_bitcoin/

Selected output:

3.4G	src/
1.5G	src/test/
[...]
 22M	test/
[...]

Total size of src without src/test:
1.9 GB

src size percentage:
51%

Total size of src/test + test:
1.522 GB

Total test size percentage:
41%

Test-to-source size ratio:
80%

i.e. test code is 80% of the size of the source code.

Size of the code that is neither source nor tests:
7.5%

Codebase layout:

Tree (1 layer):

├── autom4te.cache
├── build-aux
├── build_msvc
├── ci
├── contrib
├── depends
├── doc
├── share
├── src
└── test

10 directories.

Tree (2 layers):

├── autom4te.cache
├── build-aux
│   └── m4
├── build_msvc
│   ├── bench_bitcoin
│   ├── bitcoin-cli
│   ├── bitcoin-qt
│   ├── bitcoin-tx
│   ├── bitcoin-util
│   ├── bitcoin-wallet
│   ├── bitcoind
│   ├── libbitcoin_cli
│   ├── libbitcoin_common
│   ├── libbitcoin_consensus
│   ├── libbitcoin_crypto
│   ├── libbitcoin_node
│   ├── libbitcoin_qt
│   ├── libbitcoin_util
│   ├── libbitcoin_wallet
│   ├── libbitcoin_wallet_tool
│   ├── libbitcoin_zmq
│   ├── libleveldb
│   ├── libminisketch
│   ├── libsecp256k1
│   ├── libtest_util
│   ├── libunivalue
│   ├── msbuild
│   ├── test_bitcoin
│   └── test_bitcoin-qt
├── ci
│   ├── lint
│   ├── retry
│   └── test
├── contrib
│   ├── completions
│   ├── debian
│   ├── devtools
│   ├── guix
│   ├── init
│   ├── linearize
│   ├── macdeploy
│   ├── message-capture
│   ├── qos
│   ├── seeds
│   ├── shell
│   ├── signet
│   ├── testgen
│   ├── tracing
│   ├── verify-binaries
│   ├── verify-commits
│   ├── windeploy
│   └── zmq
├── depends
│   ├── builders
│   ├── hosts
│   ├── packages
│   └── patches
├── doc
│   ├── design
│   ├── man
│   ├── policy
│   └── release-notes
├── share
│   ├── examples
│   ├── pixmaps
│   ├── qt
│   └── rpcauth
├── src
│   ├── bench
│   ├── common
│   ├── compat
│   ├── config
│   ├── consensus
│   ├── crc32c
│   ├── crypto
│   ├── index
│   ├── init
│   ├── interfaces
│   ├── ipc
│   ├── kernel
│   ├── leveldb
│   ├── logging
│   ├── minisketch
│   ├── node
│   ├── obj
│   ├── policy
│   ├── primitives
│   ├── qt
│   ├── rpc
│   ├── script
│   ├── secp256k1
│   ├── support
│   ├── test
│   ├── univalue
│   ├── util
│   ├── wallet
│   └── zmq
└── test
    ├── cache
    ├── functional
    ├── fuzz
    ├── lint
    ├── sanitizer_suppressions
    └── util

104 directories (the tree output in the Project Log reports 105, one more than the number of directories listed).

Deepest directory level:
7

Sources

https://en.bitcoin.it/wiki/Main_Page

https://bitcoin.org/en/bitcoin-core/

https://github.com/bitcoin-dot-org/developer.bitcoin.org

Developer Documentation
https://developer.bitcoin.org

Developer Guides
https://developer.bitcoin.org/reference/index.html

https://github.com/bitcoin/bitcoin/tree/master

https://stackoverflow.com/questions/1404796/how-can-i-get-the-latest-tag-name-in-current-branch-in-git

https://stackoverflow.com/questions/16107438/how-can-i-get-number-of-commits-on-all-branches-per-developer

https://stackoverflow.com/questions/65603593/how-to-get-the-total-number-of-branches-ever-created-in-a-git-repository

Running A Full Node
https://bitcoin.org/en/full-node

https://bitcoin.stackexchange.com/questions/99620/why-is-the-bitcoin-core-wallet-database-moving-from-berkeley-db-to-sqlite

Project Log

Bitcoin v27.2 was the latest tagged version at the start of this project.

Environment:


stjohn@spartan ~ % $SHELL --version
zsh 5.9 (x86_64-apple-darwin23.0)

Set up codebase.


# Change to work directory
cd work

# Clone the repo
gh repo clone bitcoin/bitcoin

# Change directory
cd bitcoin

# Get total branches
git branch --all | wc -l
       8

# Get latest tag across all branches
git describe --tags $(git rev-list --tags --max-count=1)
v27.2

# Checkout the tagged version
git checkout tags/v27.2
...
HEAD is now at bf03c458e9 Merge bitcoin/bitcoin#31154: [27.x] rc2 or final

Question: Who are the developers ?


# Get number of commits per developer on all branches
git shortlog -s -n -e --all

# How many developers in total ? (each distinct name/email pair counts as one)
git shortlog -s -n -e --all | wc -l
    1356

# First 20 results
git shortlog -s -n -e --all | head -20
  6797	Wladimir J. van der Laan <laanwj@gmail.com>
  5309	MarcoFalke <falke.marco@gmail.com>
  3980	fanquake <fanquake@gmail.com>
  1934	Pieter Wuille <pieter.wuille@gmail.com>
  1836	Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
  1209	Gavin Andresen <gavinandresen@gmail.com>
   957	Andrew Chow <achow101-github@achow101.com>
   889	Luke Dashjr <luke-jr+git@utopios.org>
   846	Wladimir J. van der Laan <laanwj@protonmail.com>
   827	John Newbery <john@johnnewbery.com>
   809	MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
   793	Cory Fields <cory-nospam-@coryfields.com>
   729	practicalswift <practicalswift@users.noreply.github.com>
   715	Jon Atack <jon@atack.com>
   701	Philip Kaufmann <phil.kaufmann@t-online.de>
   629	Jonas Schnelli <dev@jonasschnelli.ch>
   569	Matt Corallo <git@bluematt.me>
   558	Sebastian Falbesoner <sebastian.falbesoner@gmail.com>
   525	MacroFake <falke.marco@gmail.com>
   491	glozow <gloriajzhao@gmail.com>


git shortlog -s -n -e --all > ../developer_list.txt

# List remote-tracking branches
git branch -r
  origin/24.x
  origin/25.x
  origin/26.x
  origin/27.x
  origin/28.x
  origin/HEAD -> origin/master
  origin/master
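
The count of 1356 treats every distinct name/email pair as a separate developer, so a contributor who committed under several addresses is counted more than once (in the top 20 above, Wladimir J. van der Laan appears under two addresses). As a rough sketch, the shortlog output could be merged by author name before counting. The names merge_by_name and sample_shortlog are hypothetical; the sample rows are copied from the top-20 list above. Merging by name is only a heuristic: it would not join MarcoFalke with MacroFake, and it could wrongly join two different people who share a name.

```shell
# Sum commit counts per author name, ignoring the e-mail address.
# stdin: lines in "git shortlog -s -n -e" format: "<count><TAB><Name> <email>"
merge_by_name() {
    awk -F'\t' '
        {
            name = $2
            sub(/ <[^<]*>$/, "", name)   # strip the trailing " <email>"
            total[name] += $1            # leading blanks in $1 are ignored
        }
        END { for (n in total) printf "%d\t%s\n", total[n], n }
    ' | sort -rn
}

# Three rows copied from the top-20 list above (hypothetical helper, for illustration)
sample_shortlog() {
    printf '%s\t%s\n' \
        '  6797' 'Wladimir J. van der Laan <laanwj@gmail.com>' \
        '  3980' 'fanquake <fanquake@gmail.com>' \
        '   846' 'Wladimir J. van der Laan <laanwj@protonmail.com>'
}

sample_shortlog | merge_by_name
```

On the repository itself, the equivalent call would be git shortlog -s -n -e --all | merge_by_name | wc -l, which should give a number lower than 1356.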

Question: How many branches are there in total ?

Approach: Count the commits that are an extra (second or later) child of some commit and that have a single parent (i.e. they are not merge commits). Each such commit starts a new branch off an existing line of history. We also count the root commits (commits with no parent) as branches.

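# Feed awk two listings separated by a blank line: commits with their children, then non-merge commits with their parent
( git rev-list --all --children; echo; git rev-list --all --parents --no-merges ) \
| awk '
    # First listing: mark every child after the first as the start of a branch
    !doneloading && NF>2 { i=2; while(++i<=NF) branchchild[$i]=1 }
    # The blank line separates the two listings
    /^$/ { doneloading=1 }
    # Second listing: print root commits (no parent) and the marked branch starts
    doneloading && (NF==1 || $1 in branchchild) { print $1 }
' | wc -l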

Result:
9845
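
To see what the awk script counts, it can be run on a hand-written history instead of a real repository. The sketch below assumes a hypothetical six-commit history: a root A, two children B and C forking from A, D on top of B, E on top of C, and a merge commit M joining D and E. The function names and single-letter commit IDs are made up for illustration; only the awk rules come from the command above, with the final print replaced by a counter.

```shell
# Same rules as the awk script above, but counting in awk instead of piping to wc -l.
# stdin: "commit child..." lines, a blank line, then "commit parent" lines (no merges)
count_branch_starts() {
    awk '
        !doneloading && NF>2 { i=2; while(++i<=NF) branchchild[$i]=1 }
        /^$/ { doneloading=1 }
        doneloading && (NF==1 || $1 in branchchild) { n++ }
        END { print n+0 }
    '
}

# Hypothetical toy history, written in the two rev-list output formats
toy_history() {
    # Shape of "git rev-list --all --children": a commit, then its children
    printf '%s\n' 'M' 'E M' 'D M' 'C E' 'B D' 'A B C'
    echo
    # Shape of "git rev-list --all --parents --no-merges": a commit, then its parent
    printf '%s\n' 'E C' 'D B' 'C A' 'B A' 'A'
}

toy_history | count_branch_starts   # counts A (root) and C (extra child of A)
```

The merge commit M is never counted, because it is excluded from the second listing by --no-merges.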

Question: How many files are there, and what are their types ?

A media type (also known as a Multipurpose Internet Mail Extensions or MIME type) indicates the nature and format of a document, file, or assortment of bytes.


# Count files
git ls-files | wc -l
    2684
    
# Count and list the total number of each MIME type
git ls-files | xargs -I {} file --mime-type {} | awk '{print $2}' | sort | uniq -c | sort -nr
 692 text/x-c
 658 text/x-c++
 609 text/plain
 346 text/x-script.python
  98 image/png
  64 text/x-shellscript
  52 text/xml
  44 application/json
  34 text/x-diff
  20 image/svg+xml
  18 text/x-m4
   7 text/x-makefile
   7 text/troff
   7 inode/x-empty
   6 text/html
   5 image/x-xpmi
   3 text/csv
   3 image/vnd.microsoft.icon
   2 text/x-asm
   2 image/bmp
   2 application/octet-stream
   1 text/x-tex
   1 text/x-objective-c
   1 text/x-java
   1 image/x-icns
   1 font/sfnt
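
One caveat with the pipeline above: awk '{print $2}' assumes that no path contains a space, because file prints each result as "path: type". A small variation, sketched below, takes the last field instead. The function name tally_mime_types and the sample paths are hypothetical, and we have not checked whether any tracked path in this repository actually contains a space.

```shell
# Tally MIME types from "file --mime-type" output.
# stdin: lines of the form "<path>: <mime-type>"; the type is always the last field
tally_mime_types() {
    awk '{ print $NF }' | sort | uniq -c | sort -nr
}

# Hypothetical sample input, including a path that contains a space
printf '%s\n' \
    'src/a.cpp: text/x-c++' \
    'doc/my notes.txt: text/plain' \
    'src/b.cpp: text/x-c++' \
    | tally_mime_types
```

On the repository this would be used as git ls-files | xargs -I {} file --mime-type {} | tally_mime_types.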

Question: How many lines exist for each file type ?


# Step 1: Count the total number of lines for text/x-c MIME type

git ls-files | xargs -I {} sh -c 'file --mime-type "{}" | grep -E -q "text/x-c$" && wc -l "{}"' | awk '{sum += $1} END {print sum}'
184272

# Write a bash function

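count_lines_by_mime() {
    # Each argument is a regex that must match the end of a file's MIME type
    mime_types=("$@")
    total_lines=0

    for mime in "${mime_types[@]}"; do
        # For every tracked file whose MIME type matches, run wc -l, then sum the counts.
        # "sum + 0" makes awk print 0 instead of an empty string when no file matches.
        mime_count=$(git ls-files | xargs -I {} sh -c 'file --mime-type "{}" | grep -E -q "$0$" && wc -l "{}"' "$mime" | awk '{sum += $1} END {print sum + 0}')
        echo "Total lines for MIME type $mime: $mime_count"
        total_lines=$((total_lines + mime_count))
    done

    echo "Total lines for all specified MIME types: $total_lines"
}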

# Check that the function exists
type count_lines_by_mime

# Print the function
declare -f count_lines_by_mime

# Run function for text/x-c MIME type
count_lines_by_mime "text/x-c"
Total lines for MIME type text/x-c: 184272
Total lines for all specified MIME types: 184272

# Run function for all MIME types found earlier
count_lines_by_mime "text/x-c" "text/x-c\+\+" "text/plain" "text/x-script.python" "text/x-shellscript" "text/xml" "text/x-diff" "text/x-m4" "text/x-makefile" "text/troff" "text/html" "text/csv" "text/x-asm" "text/x-tex" "text/x-objective-c" "text/x-java"
Total lines for MIME type text/x-c: 184272
Total lines for MIME type text/x-c\+\+: 164702
Total lines for MIME type text/plain: 380564
Total lines for MIME type text/x-script.python: 81611
Total lines for MIME type text/x-shellscript: 8450
Total lines for MIME type text/xml: 21604
Total lines for MIME type text/x-diff: 1664
Total lines for MIME type text/x-m4: 6202
Total lines for MIME type text/x-makefile: 552
Total lines for MIME type text/troff: 2171
Total lines for MIME type text/html: 1151
Total lines for MIME type text/csv: 126
Total lines for MIME type text/x-asm: 925
Total lines for MIME type text/x-tex: 15
Total lines for MIME type text/x-objective-c: 62
Total lines for MIME type text/x-java: 23
Total lines for all specified MIME types: 854094
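
count_lines_by_mime runs file over every tracked file once per MIME type, so sixteen types mean sixteen passes over the repository. A possible alternative is to classify each file once, emit a "type lines" pair, and aggregate afterwards. The sketch below shows only the aggregation step, on hypothetical sample pairs; sum_lines_by_mime is a made-up name, and the commented producer loop is an untested assumption about how the pairs could be generated.

```shell
# Aggregate line counts per MIME type in one pass.
# stdin: one "<mime-type> <line-count>" pair per line
sum_lines_by_mime() {
    awk '
        { lines[$1] += $2; total += $2 }
        END {
            for (m in lines) printf "%s: %d\n", m, lines[m]
            printf "Total: %d\n", total
        }
    '
}

# Hypothetical pairs, standing in for the output of a loop such as:
#   git ls-files | while IFS= read -r f; do
#       printf '%s %s\n' "$(file --brief --mime-type "$f")" "$(wc -l < "$f")"
#   done
printf '%s\n' \
    'text/x-c 120' \
    'text/x-c++ 300' \
    'text/x-c 80' \
    'text/plain 15' \
    | sum_lines_by_mime
```

Because the MIME type is compared as an exact string here, the "\+\+" escaping needed for the grep-based function is no longer required. The per-type lines come out in no particular order; the total is always printed last.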

Question: How many source lines of code (SLOC) are there ?

We'll count all the lines in src, excluding src/test. Note that this is a raw line count (wc -l), so blank lines and comment lines are included.

Write a bash function:

count_lines_by_mime_src() {

    # Count only files under src, skipping anything under src/test
    target_dir="src"
    exclude_dir="src/test"

    mime_types=("$@")
    total_lines=0

    for mime in "${mime_types[@]}"; do

        # "sum + 0" makes awk print 0 instead of an empty string when no file matches
        mime_count=$(git ls-files "$target_dir" | grep -v "^$exclude_dir/" \
            | xargs -I {} sh -c 'file --mime-type "{}" | grep -E -q "$0$" && wc -l "{}"' "$mime" \
            | awk '{sum += $1} END {print sum + 0}')

        echo "Total lines for MIME type $mime: $mime_count"
        total_lines=$((total_lines + mime_count))
    done

    echo "Total lines for all specified MIME types: $total_lines"
}

# Run function for all MIME types found earlier
count_lines_by_mime_src "text/x-c" "text/x-c\+\+" "text/plain" "text/x-script.python" "text/x-shellscript" "text/xml" "text/x-diff" "text/x-m4" "text/x-makefile" "text/troff" "text/html" "text/csv" "text/x-asm" "text/x-tex" "text/x-objective-c" "text/x-java"
Total lines for MIME type text/x-c: 151899
Total lines for MIME type text/x-c\+\+: 138974
Total lines for MIME type text/plain: 335045
Total lines for MIME type text/x-script.python: 2815
Total lines for MIME type text/x-shellscript: 283
Total lines for MIME type text/xml: 20239
Total lines for MIME type text/x-diff: 0
Total lines for MIME type text/x-m4: 1682
Total lines for MIME type text/x-makefile: 17
Total lines for MIME type text/troff: 0
Total lines for MIME type text/html: 743
Total lines for MIME type text/csv: 0
Total lines for MIME type text/x-asm: 925
Total lines for MIME type text/x-tex: 0
Total lines for MIME type text/x-objective-c: 62
Total lines for MIME type text/x-java: 23
Total lines for all specified MIME types: 652707

Question: How many Test Lines of Code (TLOC) are there ?

We'll count all the lines in test and src/test.

Write a bash function:

count_lines_by_mime_test() {

    # Test code lives in two places: the top-level test dir and src/test
    target_dirs=("test" "src/test")

    mime_types=("$@")
    total_lines=0

    for mime in "${mime_types[@]}"; do

        mime_count=0

        # Sum the matching line counts over both test directories
        for dir in "${target_dirs[@]}"; do

            dir_count=$(git ls-files "$dir" \
                | xargs -I {} sh -c 'file --mime-type "{}" | grep -E -q "$0$" && wc -l "{}"' "$mime" \
                | awk '{sum += $1} END {print sum + 0}')

            mime_count=$((mime_count + dir_count))
        done

        echo "Total lines for MIME type $mime: $mime_count"
        total_lines=$((total_lines + mime_count))
    done

    echo "Total lines for all specified MIME types: $total_lines"
}

# Run function for all MIME types found earlier
count_lines_by_mime_test "text/x-c" "text/x-c\+\+" "text/plain" "text/x-script.python" "text/x-shellscript" "text/xml" "text/x-diff" "text/x-m4" "text/x-makefile" "text/troff" "text/html" "text/csv" "text/x-asm" "text/x-tex" "text/x-objective-c" "text/x-java"
Total lines for MIME type text/x-c: 31532
Total lines for MIME type text/x-c\+\+: 23949
Total lines for MIME type text/plain: 2092
Total lines for MIME type text/x-script.python: 70787
Total lines for MIME type text/x-shellscript: 185
Total lines for MIME type text/xml: 0
Total lines for MIME type text/x-diff: 0
Total lines for MIME type text/x-m4: 0
Total lines for MIME type text/x-makefile: 6
Total lines for MIME type text/troff: 0
Total lines for MIME type text/html: 40
Total lines for MIME type text/csv: 126
Total lines for MIME type text/x-asm: 0
Total lines for MIME type text/x-tex: 0
Total lines for MIME type text/x-objective-c: 0
Total lines for MIME type text/x-java: 0
Total lines for all specified MIME types: 128717

Question: What percentage of the line count is source code and what percentage is test code ?

Total lines = 854094

Total source lines = 652707

Total test lines = 128717

652707 / 854094 ~= 0.76 = 76% source code

128717 / 854094 ~= 0.15 = 15% test code

Question: How many lines are not source or test code ?

854094 - 652707 - 128717 = 72670

72670 / 854094 ~= 0.085 = 8.5% other
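
The same arithmetic can be reproduced in the shell. This is a minimal sketch in which pct is a hypothetical helper name and the inputs are the totals reported above.

```shell
# Print part/total as a percentage with one decimal place.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

total=854094       # total lines for all text MIME types
src_lines=652707   # lines in src, excluding src/test
test_lines=128717  # lines in test and src/test
other=$((total - src_lines - test_lines))

echo "source: $(pct "$src_lines" "$total")%"
echo "test:   $(pct "$test_lines" "$total")%"
echo "other:  $(pct "$other" "$total")% ($other lines)"
```

With one decimal place the three shares come out as 76.4%, 15.1% and 8.5%, which agrees with the rounded figures above.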

Question: What is the size of the codebase ?

Size of current dir (after compilation):

du -sh .
3.7G	.

Size of subdirs in descending order:

du -sh */ | sort -hr
3.4G	src/
 22M	test/
5.1M	autom4te.cache/
1.9M	doc/
1.0M	contrib/
916K	build-aux/
568K	share/
468K	depends/
188K	build_msvc/
152K	ci/

Size of subdirs (2 levels) in descending order:

du -sh */ */*/ | sort -hr
3.3G	src/
1.4G	src/test/
305M	src/bench/
121M	src/rpc/
 85M	src/node/
 47M	src/script/
 34M	src/leveldb/
 30M	src/crypto/
 29M	src/secp256k1/
 22M	test/
 22M	src/util/
 22M	src/index/
 20M	src/qt/
 18M	src/common/
 17M	test/cache/
 16M	src/kernel/
 14M	src/policy/
 11M	src/univalue/
7.3M	src/minisketch/
6.7M	src/primitives/
5.1M	autom4te.cache/
4.9M	src/init/
4.2M	test/functional/
4.2M	src/consensus/
1.9M	src/wallet/
1.9M	doc/
1.2M	doc/release-notes/
1.0M	src/support/
1.0M	contrib/
916K	build-aux/
568K	share/
500K	share/pixmaps/
468K	depends/
444K	build-aux/m4/
436K	src/crc32c/
348K	test/util/
228K	contrib/seeds/
200K	contrib/guix/
188K	build_msvc/
176K	contrib/devtools/
152K	test/lint/
152K	ci/
136K	depends/packages/
128K	doc/man/
124K	depends/patches/
108K	ci/test/
104K	src/ipc/
 80K	contrib/tracing/
 76K	contrib/macdeploy/
 72K	src/zmq/
 68K	src/interfaces/
 52K	src/compat/
 40K	doc/design/
 40K	contrib/signet/
 40K	contrib/completions/
 36K	depends/hosts/
 36K	contrib/verify-commits/
 36K	contrib/verify-binaries/
 28K	src/config/
 28K	contrib/linearize/
 28K	contrib/init/
 24K	share/examples/
 24K	doc/policy/
 24K	depends/builders/
 16K	test/fuzz/
 16K	contrib/testgen/
 16K	build_msvc/libbitcoin_qt/
 12K	test/sanitizer_suppressions/
 12K	share/rpcauth/
 12K	share/qt/
 12K	contrib/windeploy/
 12K	contrib/message-capture/
 12K	ci/retry/
 12K	ci/lint/
8.0K	contrib/shell/
8.0K	contrib/qos/
8.0K	contrib/debian/
8.0K	build_msvc/test_bitcoin/
8.0K	build_msvc/test_bitcoin-qt/
8.0K	build_msvc/msbuild/
8.0K	build_msvc/bitcoind/
4.0K	src/obj/
4.0K	src/logging/
4.0K	contrib/zmq/
4.0K	build_msvc/libunivalue/
4.0K	build_msvc/libtest_util/
4.0K	build_msvc/libsecp256k1/
4.0K	build_msvc/libminisketch/
4.0K	build_msvc/libleveldb/
4.0K	build_msvc/libbitcoin_zmq/
4.0K	build_msvc/libbitcoin_wallet_tool/
4.0K	build_msvc/libbitcoin_wallet/
4.0K	build_msvc/libbitcoin_util/
4.0K	build_msvc/libbitcoin_node/
4.0K	build_msvc/libbitcoin_crypto/
4.0K	build_msvc/libbitcoin_consensus/
4.0K	build_msvc/libbitcoin_common/
4.0K	build_msvc/libbitcoin_cli/
4.0K	build_msvc/bitcoin-wallet/
4.0K	build_msvc/bitcoin-util/
4.0K	build_msvc/bitcoin-tx/
4.0K	build_msvc/bitcoin-qt/
4.0K	build_msvc/bitcoin-cli/
4.0K	build_msvc/bench_bitcoin/

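Selected output (the full listing above shows 3.3G for src/ and 1.4G for src/test/; the calculations below use the figures shown here):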

du -sh */ */*/ | sort -hr
3.4G	src/
1.5G	src/test/
[...]
 22M	test/
[...]

The total size of src without src/test is:
3.4 - 1.5 = 1.9 GB

This is 1.9 / 3.7 ~= 0.51 = 51% of the codebase size.

Total size of src/test + test is:
1.5 + 0.022 = 1.522 GB

This is
1.522 / 3.7 ~= 0.41 = 41% of the codebase size.

We have a size ratio of:
1.522 / 1.9 ~= 0.8

i.e. test code is 80% of the size of the source code.

What is the size of the code that is not source or tests ?

3.7 - 3.4 - 0.022 = 0.278 GB

This is 0.278 / 3.7 ~= 0.075 = 7.5% of the codebase size.
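
The du -h figures are rounded, and the sums above mix units by hand (22M is treated as 0.022 GB). The same bookkeeping with everything in one unit is sketched below. size_shares is a hypothetical name, and the inputs are simply the rounded figures from above expressed in MB (3.7G as 3700, and so on), not fresh measurements; on a real checkout the four values could instead come from du -sk.

```shell
# Split a total size into source / test / other shares.
# Arguments (all in the same unit): total, src, src/test, test
size_shares() {
    awk -v total="$1" -v src="$2" -v srctest="$3" -v toptest="$4" 'BEGIN {
        code  = src - srctest           # src without src/test
        tests = srctest + toptest       # src/test plus the top-level test dir
        other = total - src - toptest   # everything outside src and test
        printf "source: %d (%.1f%%)\n", code,  100 * code / total
        printf "tests: %d (%.1f%%)\n",  tests, 100 * tests / total
        printf "other: %d (%.1f%%)\n",  other, 100 * other / total
        printf "test-to-source ratio: %.2f\n", tests / code
    }'
}

size_shares 3700 3400 1500 22
```

The three shares (51.4%, 41.1%, 7.5%) and the ratio (0.80) match the hand calculation above.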

Question: What does the layout of the codebase look like ?

admin@horizon bitcoin % tree -d -L 1
.
├── autom4te.cache
├── build-aux
├── build_msvc
├── ci
├── contrib
├── depends
├── doc
├── share
├── src
└── test

11 directories

admin@horizon bitcoin % tree -d -L 2
.
├── autom4te.cache
├── build-aux
│   └── m4
├── build_msvc
│   ├── bench_bitcoin
│   ├── bitcoin-cli
│   ├── bitcoin-qt
│   ├── bitcoin-tx
│   ├── bitcoin-util
│   ├── bitcoin-wallet
│   ├── bitcoind
│   ├── libbitcoin_cli
│   ├── libbitcoin_common
│   ├── libbitcoin_consensus
│   ├── libbitcoin_crypto
│   ├── libbitcoin_node
│   ├── libbitcoin_qt
│   ├── libbitcoin_util
│   ├── libbitcoin_wallet
│   ├── libbitcoin_wallet_tool
│   ├── libbitcoin_zmq
│   ├── libleveldb
│   ├── libminisketch
│   ├── libsecp256k1
│   ├── libtest_util
│   ├── libunivalue
│   ├── msbuild
│   ├── test_bitcoin
│   └── test_bitcoin-qt
├── ci
│   ├── lint
│   ├── retry
│   └── test
├── contrib
│   ├── completions
│   ├── debian
│   ├── devtools
│   ├── guix
│   ├── init
│   ├── linearize
│   ├── macdeploy
│   ├── message-capture
│   ├── qos
│   ├── seeds
│   ├── shell
│   ├── signet
│   ├── testgen
│   ├── tracing
│   ├── verify-binaries
│   ├── verify-commits
│   ├── windeploy
│   └── zmq
├── depends
│   ├── builders
│   ├── hosts
│   ├── packages
│   └── patches
├── doc
│   ├── design
│   ├── man
│   ├── policy
│   └── release-notes
├── share
│   ├── examples
│   ├── pixmaps
│   ├── qt
│   └── rpcauth
├── src
│   ├── bench
│   ├── common
│   ├── compat
│   ├── config
│   ├── consensus
│   ├── crc32c
│   ├── crypto
│   ├── index
│   ├── init
│   ├── interfaces
│   ├── ipc
│   ├── kernel
│   ├── leveldb
│   ├── logging
│   ├── minisketch
│   ├── node
│   ├── obj
│   ├── policy
│   ├── primitives
│   ├── qt
│   ├── rpc
│   ├── script
│   ├── secp256k1
│   ├── support
│   ├── test
│   ├── univalue
│   ├── util
│   ├── wallet
│   └── zmq
└── test
    ├── cache
    ├── functional
    ├── fuzz
    ├── lint
    ├── sanitizer_suppressions
    └── util

105 directories

By increasing the -L argument one step at a time until the tree output stopped changing, I found that the deepest directory level is 7.

The same result, obtained directly with find:

find . -type d | awk -F'/' '{print NF-1}' | sort -nr | head -n 1
7
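
One difference between the two methods: find . also walks hidden directories such as .git, which tree skips by default. Here both methods agree on 7, but in general the find result could reflect Git's internal layout rather than the working tree. A variation that skips .git is sketched below; max_dir_depth is a hypothetical name, and the demonstration uses a throwaway directory rather than the Bitcoin checkout.

```shell
# Deepest directory level below a root, ignoring the .git directory.
# $1: root directory (defaults to the current directory)
max_dir_depth() {
    ( cd "${1:-.}" && find . -path ./.git -prune -o -type d -print ) \
        | awk -F'/' '{ d = NF - 1; if (d > max) max = d } END { print max + 0 }'
}

# Demonstration on a throwaway directory (hypothetical layout)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/a/b/c" "$demo_dir/.git/w/x/y/z"
max_dir_depth "$demo_dir"   # a/b/c is 3 levels deep; .git is skipped
rm -rf "$demo_dir"
```

The depth convention is the same as in the command above: the root counts as level 0, and each path separator adds one level.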



👉 Any thoughts ? You can write a reply to this article on Tela Network:
tela.network/log/ARTICLE_ID_GOES_HERE

📩 Contact StJohn Piano on Tela:
tela.app/id/stjohn_piano/5db830

☕️ Follow StJohn Piano on LinkedIn:
linkedin.com/in/stjohnpiano
