Understanding BIP39

March 23, 2023

@trizin

BIP39, or Bitcoin Improvement Proposal 39, is a widely adopted standard for generating and managing mnemonic phrases or seed phrases in cryptocurrency wallets. Seed phrases are used to create a hierarchy of private and public keys, which allow users to access and manage their cryptocurrencies securely.

Generating a seed phrase

The first step in generating a seed phrase is creating entropy. Entropy provides randomness and security. The higher the entropy, the more secure the seed phrase, but the number of words in the phrase increases as well. The table below shows the relationship between the number of words and the entropy size:

Words   Entropy size (bits)
12      128
18      192
24      256
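The word counts in the table follow from simple arithmetic: BIP39 appends one checksum bit for every 32 bits of entropy (the checksum is explained in the steps below) and encodes each word as 11 bits. A minimal sketch of that calculation, where the function name `word_count` is only illustrative:

```python
def word_count(entropy_bits: int) -> int:
    """Number of words in a BIP39 phrase for a given entropy size in bits."""
    if entropy_bits % 32 != 0:
        raise ValueError("entropy size must be a multiple of 32 bits")
    checksum_bits = entropy_bits // 32          # one checksum bit per 32 entropy bits
    return (entropy_bits + checksum_bits) // 11  # each word encodes 11 bits


for bits in (128, 192, 256):
    print(bits, "bits ->", word_count(bits), "words")
```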

The following example walks through creating a 12-word seed phrase.

  1. Generate 128 bits of entropy.

         import secrets

         n_bits = 128  # entropy size in bits
         n_bytes = n_bits // 8
         entropy = secrets.token_bytes(n_bytes)
         # Binary string representation, 8 bits per byte
         entropybits = "".join(f"{x:08b}" for x in entropy)

  2. Generate the checksum and append it to the end of the entropy to create the final entropy. The checksum is the first checksum_size bits (entropy size divided by 32) of the SHA-256 hash of the entropy.

         import hashlib

         checksum_size = n_bits // 32  # 4 bits for 128-bit entropy
         digest = hashlib.sha256(entropy).digest()
         hashbits = "".join(f"{x:08b}" for x in digest)
         checksum = hashbits[:checksum_size]
         final_entropy_bits = entropybits + checksum

  3. Load the BIP39 word list into memory. The official English wordlist (2048 words, one per line) is published alongside the BIP39 specification; the code below assumes it has been saved as bip39-wordlist.txt.

         with open("bip39-wordlist.txt", "r") as wordlist:
             # Strip the trailing newline from every word
             words = [word.strip() for word in wordlist]

  4. Split the final entropy bits into 11-bit chunks, convert each chunk to an integer, and use it as an index to look up the corresponding word in the word list.

         seedphrase = []
         for i in range(0, n_bits + checksum_size, 11):
             chunk = final_entropy_bits[i : i + 11]
             word_index = int(chunk, 2)  # value between 0 and 2047
             seedphrase.append(words[word_index])
         print(" ".join(seedphrase))

In this case, the 128 bits of entropy plus the 4-bit checksum give a final entropy of 132 bits, which splits into twelve 11-bit chunks (132 / 11 = 12). Each chunk selects one of the 2048 words in the list, so the seed phrase consists of 12 words.

Similarly, following the same steps with 256 bits of entropy adds an 8-bit checksum, for a final entropy of 264 bits. Split into 11-bit chunks, this yields a 24-word seed phrase.
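The same steps can be written once for any entropy size. The sketch below uses integer arithmetic instead of bit strings and returns word indices, so it does not need the wordlist file. The function name `entropy_to_indices` is illustrative, and the accepted sizes are the ones the BIP39 specification allows (which also include 160 and 224 bits, giving 15 and 21 words).

```python
import hashlib
import secrets


def entropy_to_indices(entropy: bytes) -> list[int]:
    """Map entropy bytes to BIP39 word indices (each between 0 and 2047)."""
    n_bits = len(entropy) * 8
    if n_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128, 160, 192, 224 or 256 bits")
    checksum_size = n_bits // 32
    digest = hashlib.sha256(entropy).digest()
    # Keep only the first checksum_size bits of the hash (at most 8 bits)
    checksum = digest[0] >> (8 - checksum_size)
    # Append the checksum bits to the entropy as one big integer
    combined = (int.from_bytes(entropy, "big") << checksum_size) | checksum
    n_words = (n_bits + checksum_size) // 11
    # Read 11-bit groups from the most significant end
    return [(combined >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]


indices = entropy_to_indices(secrets.token_bytes(32))  # 256 bits of entropy
print(len(indices))  # number of words in the phrase
```

With the `words` list loaded as in step 3, the phrase would be `" ".join(words[i] for i in indices)`.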

Deriving crypto wallets from a seed phrase

To derive the private and public keys, the seed phrase is first converted into a 512-bit binary seed. BIP39 specifies PBKDF2 with HMAC-SHA512 for this step, using the seed phrase as the password, the string "mnemonic" followed by an optional passphrase as the salt, and 2048 iterations. The process is deterministic: the same seed phrase and passphrase always produce the same seed, and therefore the same keys.
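A minimal sketch of the phrase-to-seed step using only the Python standard library. BIP39 defines this step as PBKDF2-HMAC-SHA512 with 2048 iterations; the function name `mnemonic_to_seed` and the example phrase are illustrative and not part of the original walkthrough.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte (512-bit) seed from a seed phrase."""
    # BIP39 normalizes both inputs to Unicode NFKD before hashing
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


# Example phrase for illustration only; never use a published phrase for real funds
phrase = " ".join(["abandon"] * 11 + ["about"])
seed = mnemonic_to_seed(phrase)
print(len(seed) * 8, "bit seed:", seed.hex())
```

Note that this step does not check the phrase against the wordlist or its checksum. Any string produces a seed, so wallets usually validate the phrase separately.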

The resulting binary seed is then used to generate a tree of private and public keys, typically following the BIP32 standard. A master key is derived from the seed, and each child key is derived from its parent key and an index number. This hierarchical structure allows a practically unlimited number of unique keys to be created from a single seed phrase.
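To make the hierarchy concrete, here is a hedged sketch of how BIP32 derives a master private key from a seed and then a hardened child private key from a parent. It covers private keys only: public keys and non-hardened children require elliptic curve arithmetic on secp256k1, which is left out. The function names and the example seed are illustrative.

```python
import hashlib
import hmac

# Order of the secp256k1 curve group, as used by BIP32
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 0x80000000  # indices at or above this value are "hardened"


def master_key(seed: bytes) -> tuple[bytes, bytes]:
    """Return (master private key, master chain code) for a seed."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    # Simplification: BIP32 also rejects a key that is 0 or >= the curve order
    return digest[:32], digest[32:]


def hardened_child(parent_key: bytes, parent_chain: bytes, index: int) -> tuple[bytes, bytes]:
    """Derive a hardened child private key and chain code from a parent."""
    if not HARDENED <= index < 2**32:
        raise ValueError("hardened indices are in the range 2**31 .. 2**32 - 1")
    data = b"\x00" + parent_key + index.to_bytes(4, "big")
    digest = hmac.new(parent_chain, data, hashlib.sha512).digest()
    tweak = int.from_bytes(digest[:32], "big")
    child = (tweak + int.from_bytes(parent_key, "big")) % SECP256K1_N
    # BIP32 declares these rare cases invalid; the caller should use the next index
    if tweak >= SECP256K1_N or child == 0:
        raise ValueError("invalid child key for this index")
    return child.to_bytes(32, "big"), digest[32:]


seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # example seed only
m_key, m_chain = master_key(seed)
child_key, child_chain = hardened_child(m_key, m_chain, HARDENED + 0)  # path m/0'
print(child_key.hex())
```

Each call takes the parent key, the parent chain code, and an index, so repeating it with different indices and at different depths produces the tree of keys described above.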