by Dirk Merkel

Bitcoin for beginners, Part 2: Bitcoin as a technology and network

Dec 06, 2011 | 14 mins

Bitcoin transactions and the block chain

Dirk Merkel continues his introduction to Bitcoin with a look at the Bitcoin network as a system. He briefly explains the mechanics of transactions, blocks, and the block chain, as well as the Bitcoin wallet. He also discusses double-spending and Bitcoin mining, two controversial aspects of Bitcoin.

Having seen how Bitcoin is used (in Part 1) we’re ready to look at some of its underlying concepts as a technology and network. This article introduces the mechanics of Bitcoin’s core components: transactions, blocks, and the block chain. I also demystify controversial aspects of Bitcoin such as encryption (added in v0.4 of the standard client), the problem of double-spending, and how and why Bitcoin mining happens. Finally, I use Bitcoin Block Explorer to extract some data that I then use to gauge the overall health of the Bitcoin economy today.

To get started, consider Figure 1, which illustrates the relationship between the block chain, individual blocks, and transactions, as well as the structure of each component. I discuss the components below.

Figure 1. The Bitcoin network system

Transactions

Bitcoin uses public-key cryptography for signing transactions. In the Bitcoin marketplace, coins are transferred between users by reassigning them from one key pair to another. First, an amount of coins is associated with a given user via a key pair. The public key serves as an address to which Bitcoins can be sent, similar to an account number in traditional banking (strictly speaking, the address is a hash derived from the public key). The private key is like an ATM PIN that can be used both to access the funds and to authorize their transfer to another user. The private key holder is the only one who can transfer or spend the Bitcoins associated with a given public key. Whereas your bank can reset your ATM PIN, losing your private keys results in an irreversible loss of the funds.

Someone using the Bitcoin marketplace who wants to transfer Bitcoins to another user must first associate the recipient’s address (public key) with the desired number of Bitcoins. This tells the Bitcoin network where the Bitcoins will go. Next, he signs the transaction with his private key, which must correspond to the public key at which he originally received the Bitcoins. This signing process assures the Bitcoin network that the transaction was initiated by the current owner of the Bitcoins being traded. Finally, the sender broadcasts the signed transaction to the network, where it is propagated from one node to the next and integrated into the Bitcoin block chain (discussed below).

Public key cryptography

Public-key cryptography uses key pairs, a public and a private one, that can be generated by a user at any time. As the name implies, the public key gets distributed, while the private one remains in the possession of a single person or entity. The private key is meant to be kept safe and secure by the owner. The private key can be used to sign messages such that anybody with the key pair’s corresponding public key can verify with certainty that only the holder of the private key could have signed the message.
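As a rough illustration of the sign-and-verify idea described above, here is a minimal Python sketch. Bitcoin itself signs with ECDSA, which Python's standard library does not provide, so the sketch swaps in textbook RSA with deliberately tiny primes. The constants and the function names `sign` and `verify` are illustrative assumptions, and nothing here is secure or Bitcoin-compatible.

```python
import hashlib

# Toy textbook-RSA key pair built from tiny primes (insecure, illustration only).
P, Q = 61, 53
N = P * Q                               # modulus, part of both keys
E = 17                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (needs Python 3.8+)

def digest(message: bytes) -> int:
    """Reduce a SHA-256 digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes, private_exponent: int = D) -> int:
    """Only the holder of the private exponent can produce this value."""
    return pow(digest(message), private_exponent, N)

def verify(message: bytes, signature: int, public_exponent: int = E) -> bool:
    """Anyone holding the public exponent and modulus can check a signature."""
    return pow(signature, public_exponent, N) == digest(message)

if __name__ == "__main__":
    message = b"pay 5 BTC to the merchant"
    signature = sign(message)
    print(verify(message, signature))   # signature made with the matching key
```

With a modulus this small, forging a signature is trivial; the point is only the asymmetry between who can sign and who can verify.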

The wallet

The Bitcoin wallet is a local data file that holds the keys controlling all of your Bitcoins. Each wallet maintains a listing of transactions and a reserve of 100 unused key pairs, which can be used to receive and send future funds. The wallet also stores housekeeping items such as the client version number and your preferences. Without the wallet (and especially its key pairs) you cannot send or receive funds, and if you lose your wallet you lose all of your Bitcoins. Needless to say, it’s a good idea to have a good backup strategy for your wallet.

Backing up the wallet is made easier by the fact that it consists of a single data file, the location of which depends on your OS. The same rules for backing up any sensitive data apply here: create multiple backups, at least one of which should be offsite.

Encryption of Bitcoin wallets was introduced in version 0.4 of the standard client. Prior versions stored wallets in the clear, which came as a surprise to many users considering the potentially valuable content.

Next we’ll look at the block chain, which is both a record of the Bitcoin economy and a pipeline for moving, as well as generating, Bitcoins.

The block chain

Bitcoin transactions are bundled together into blocks, which are then linked together in a chain that captures the complete history of the Bitcoin economy, from the oldest to the newest block. The block chain is a living record of which Bitcoins belong to which address. It is possible to verify every transaction from the creation of the Bitcoin economy to the present by traversing the block chain — something we’ll explore toward the end of this article.

A block header consists of six fields, at least one of which changes with each successive hash attempt. The fields are as follows:

  1. Client software version number
  2. Hash of the previous block; the last-known block in the block chain
  3. The Merkle root of the transactions in the body of the block (a 256-bit hash produced by hashing all transactions in the body of the block)
  4. A timestamp
  5. A 256-bit number that is shared by all nodes, called the target
  6. A counter that increases with each hash, called the nonce
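Field 3 in the list above can be sketched as follows. This assumes the commonly documented Bitcoin construction: transaction hashes are combined pairwise with double SHA-256, and the last hash is duplicated whenever a level has an odd number of entries. The function names and the sample transaction bytes used to exercise them are made up for illustration.

```python
import hashlib
from typing import List

def sha256d(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(tx_hashes: List[bytes]) -> bytes:
    """Fold a list of 32-byte transaction hashes into a single 32-byte root."""
    if not tx_hashes:
        raise ValueError("a block contains at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])     # odd count: pair the last hash with itself
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

if __name__ == "__main__":
    hashes = [sha256d(tx) for tx in (b"tx-a", b"tx-b", b"tx-c")]
    print(merkle_root(hashes).hex())
```

Because the root depends on every transaction hash, changing any transaction in the body changes the header, and therefore the block's hash.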

At any given time, all nodes on the network may compete to create the next block. A block is only considered valid if the double SHA256 hash of the block header is below a 256-bit number (the target) which is shared by all nodes. The lower the target, the harder it is to find a valid block. Nodes search for a hash below the target by increasing a counter, or nonce, which slightly modifies the block header and therefore produces a completely different hash. The network adjusts the target such that a new block gets created roughly every 10 minutes.
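The search described above can be sketched in a few lines of Python. This is a toy under stated assumptions: the 76-byte header prefix is placeholder data rather than a real header, the function name `mine` is hypothetical, and the target is set far higher than the real network's so that the loop finishes almost immediately.

```python
import hashlib
import struct
from typing import Optional, Tuple

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash applied to block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header_without_nonce: bytes, target: int,
         max_nonce: int = 2**32) -> Optional[Tuple[int, int]]:
    """Try nonces in order; return (nonce, hash value) for the first valid one."""
    for nonce in range(max_nonce):
        header = header_without_nonce + struct.pack("<L", nonce)
        value = int.from_bytes(sha256d(header), "little")
        if value < target:              # valid only if the hash is below the target
            return nonce, value
    return None                         # nonce space exhausted without success

if __name__ == "__main__":
    toy_target = 2**244                 # about 1 hash in 4,096 qualifies
    print(mine(b"toy header".ljust(76, b"\x00"), toy_target))
```

Halving the target doubles the expected number of attempts, which is how lowering the target makes blocks harder to find.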

Because it takes many attempts to find a hash that makes a valid block, the computational cost associated with creating a new block isn’t trivial. As of mid-September 2011, the target required an average of 7,539,609,386,691,347 hash attempts to solve a new block. As a reward for finding and creating a valid block, the user receives 50 newly created Bitcoins. Thus this process is responsible for gradually releasing Bitcoins into the economy, currently at an average rate of 300 BTC per hour (six blocks per hour at 50 BTC each).
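The figures in this paragraph can be tied together with a little arithmetic. The sketch below assumes that blocks arrive exactly every 10 minutes, which is only the network's long-run goal; the variable names are mine, and the implied hash rate is a back-of-the-envelope estimate rather than a measured value.

```python
# Figures quoted in the text.
HASHES_PER_BLOCK = 7_539_609_386_691_347   # average attempts per block, mid-September 2011
BLOCK_INTERVAL_SECONDS = 10 * 60           # the interval the network aims for
BLOCK_REWARD_BTC = 50

blocks_per_hour = 3600 // BLOCK_INTERVAL_SECONDS
btc_per_hour = blocks_per_hour * BLOCK_REWARD_BTC

# If the whole network needs that many hashes every 10 minutes, its combined
# speed must be roughly this many hashes per second.
implied_hashes_per_second = HASHES_PER_BLOCK / BLOCK_INTERVAL_SECONDS

# Chance that any single hash attempt yields a valid block.
success_probability = 1 / HASHES_PER_BLOCK

print(blocks_per_hour, btc_per_hour)
print(f"{implied_hashes_per_second:.3e} hashes per second")
print(f"{success_probability:.3e} chance per attempt")
```

Under that assumption the network as a whole was computing on the order of 12.6 trillion hashes per second.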

Bitcoin storage

You might think that storing the complete Bitcoin economy’s transaction history on your hard drive would take up too much space. Bitcoin creator Nakamoto has actually estimated that, thanks to Moore’s Law, advances in storage and processing capacity will outpace the needs of the Bitcoin network and local clients.

Understanding how the block chain works is important if you want to do more than exchange coins in the Bitcoin economy. It’s also essential to understanding Bitcoin’s solution to double-spending — a problem that must be addressed by any real-world economy.

Double-spending in the Bitcoin network

Double-spending occurs when an owner tries to spend his funds twice. This is less of a problem with physical currency because once someone spends money by giving it to someone else, he cannot give it away again (that is, unless he makes a copy of it, which is illegal). A traditional bank solves this problem by record keeping: the bank keeps a ledger and adjusts your balance up or down depending on the inflow or outflow of money. Once money has left your account, you cannot spend it again. In other words, a check drawn on the account will clear if sufficient funds are present; otherwise, the check will bounce.

The Bitcoin economy faces the same problem. What is to prevent a user from promising the same Bitcoins, in the same amount, to two or more merchants at once? Bitcoin’s solution to this problem is a peer-to-peer distributed timestamp server, which works in conjunction with the idea of chained proofs of work.

How Bitcoin regulates transactions

Let’s assume a user tries to double-spend her funds. Assume that with Transaction A, she spends her funds by paying one merchant and with Transaction B she sends the same funds to a different merchant. All new transactions are broadcast to the network, which poses this user’s first challenge: she cannot broadcast both transactions to the same network nodes. Doing so would give the nodes all the information needed to detect her double-spending and reject one of the transactions. For a given pair of transactions trying to double-spend the same Bitcoins, any given node will retain and propagate the transaction it received first and ignore the one received second. So she’ll send each transaction to a different set of nodes. Each node set will work on creating the next block in the block chain for its given transaction. Once a block is found, it will be broadcast to other nodes as part of the block chain.
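The first-seen rule from this paragraph can be sketched as a small bookkeeping structure. The `Transaction` and `Node` classes, and the idea of naming coins with plain strings, are simplifications invented for this example; a real node tracks specific transaction outputs and validates signatures as well.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class Transaction:
    txid: str                    # hypothetical transaction identifier
    spends: Tuple[str, ...]      # identifiers of the coins being spent
    recipient: str

class Node:
    """A node that keeps the first transaction it sees for each coin."""

    def __init__(self) -> None:
        self.first_seen: Dict[str, str] = {}   # coin id -> txid that spent it first

    def receive(self, tx: Transaction) -> bool:
        """Return True if the transaction is kept and relayed, False if ignored."""
        for coin in tx.spends:
            if self.first_seen.get(coin, tx.txid) != tx.txid:
                return False                    # conflicts with an earlier spend
        for coin in tx.spends:
            self.first_seen[coin] = tx.txid
        return True

if __name__ == "__main__":
    tx_a = Transaction("A", ("coin-1",), "merchant-1")
    tx_b = Transaction("B", ("coin-1",), "merchant-2")
    node = Node()
    print(node.receive(tx_a), node.receive(tx_b))
```

A second node that happens to hear Transaction B first would make the opposite choice, which is why the two transactions can each survive for a while in different parts of the network.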

Now, it’s possible for two or more alternative versions of the block chain to exist. In this case, we could have a last block that contained Transaction A and an alternative chain where the last block contained Transaction B. You could picture this as a branch in the block chain. In this case, the deciding rule is that the longest chain wins. Each node set will continue extending what it perceives to be the longest chain. Even though alternate branches may grow at the same rate for some time, one branch will eventually outpace the other. When that happens, more and more nodes will join the longer branch and eventually the shorter branch will be abandoned and die off. Returning to our example, this race to create the longest branch ensures that only one transaction will have found its way into the main block chain, whereas the other transaction died off with the abandoned branch.
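The deciding rule above is simple enough to state as code. In this sketch a branch is just a list of block labels and length is counted in blocks, which follows the description in this article; the function name and the labels are made up for illustration.

```python
from typing import List

def choose_branch(current: List[str], candidate: List[str]) -> List[str]:
    """Switch only when the candidate branch is strictly longer."""
    return candidate if len(candidate) > len(current) else current

if __name__ == "__main__":
    shared = ["block-1", "block-2"]
    branch_a = shared + ["block-3a (contains Transaction A)"]
    branch_b = shared + ["block-3b (contains Transaction B)"]

    view = choose_branch(branch_a, branch_b)   # tie: the node stays on branch A
    branch_b = branch_b + ["block-4b"]
    view = choose_branch(view, branch_b)       # branch B is now longer: switch
    print(view[-1])
```

Once a node switches, the transactions that existed only in the abandoned branch are no longer part of its view of the block chain.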

The time and CPU power required to find a valid block is considered a proof-of-work, meaning that each block functions as a discrete proof. While a significant amount of computing power was needed to find the next block, its validity can be checked with a simple hash operation. Recipients typically wait for a certain number of blocks to be created that incorporate the transaction before they consider it confirmed. For the original Bitcoin client, this number was 6, but it is up to each recipient how many blocks to require before accepting a transaction as confirmed.

For our user to double-spend Bitcoins, she would have to wait until enough blocks were created for the recipient to be satisfied that the transaction was valid. After receiving the merchandise or service, she would then change her transaction by indicating a different recipient. But now she would have to redo the proof-of-work for the block containing that transaction and for every block added after it, because each subsequent block incorporates a hash of the previous one. Solving all those proofs-of-work and overtaking the original branch would require tremendous computing power, more than the rest of the network combined.
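To see why changing one transaction forces every later block to be redone, consider this simplified chain. It leaves out proof-of-work entirely and uses a single SHA-256 over made-up payload strings, so it only demonstrates the hash linking; all names in it are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional

GENESIS_PREV = "0" * 64

def block_hash(prev_hash: str, payload: str) -> str:
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def build_chain(payloads: List[str]) -> List[Dict[str, str]]:
    chain, prev = [], GENESIS_PREV
    for payload in payloads:
        current = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": current})
        prev = current
    return chain

def first_invalid(chain: List[Dict[str, str]]) -> Optional[int]:
    """Return the index of the first block whose links do not check out."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        recomputed = block_hash(block["prev"], block["payload"])
        if block["prev"] != prev or recomputed != block["hash"]:
            return index
        prev = block["hash"]
    return None

if __name__ == "__main__":
    chain = build_chain(["pay merchant-1", "other txs", "more txs"])
    chain[0]["payload"] = "pay merchant-2"          # rewrite history
    print(first_invalid(chain))                     # block 0 no longer matches
    chain[0]["hash"] = block_hash(chain[0]["prev"], chain[0]["payload"])
    print(first_invalid(chain))                     # now block 1's link is broken
```

Repairing one block simply moves the break to the next one. In the real network each repair also costs a full proof-of-work, while honest nodes keep extending the original branch.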

Bitcoin mining

Mining for Bitcoins is the term used for clients trying to find the next valid block. Since this can be a rather lucrative activity, many users try to participate. The most CPU-intensive, and thus most time-consuming, part of that process is continuously calculating the SHA256 hash. It turns out that GPUs are roughly 50 to 100 times more efficient at this operation than CPUs. As a result, users have assembled specialized hardware consisting of banks of GPUs that do nothing but crunch through the SHA256-hashing algorithm. These setups are sometimes called mining rigs. Each hashing attempt can be thought of as a lottery ticket. The more hashes a user calculates, the higher their likelihood of solving the next block and collecting the coveted 50 Bitcoins.

Because individual users without specialized hardware have little hope of solving the next block, users have begun to pool their computing resources. Once a block has been solved, the reward is split among the participants of a mining pool. The amount of Bitcoins awarded to each participant typically depends on how many hashes he or she has contributed to the solving attempt.
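A proportional split of the kind described above might look like the following. Actual pools use a variety of payout schemes and usually deduct fees, so treat the strictly proportional rule, the function name `split_reward`, and the member names as assumptions made for this sketch.

```python
from fractions import Fraction
from typing import Dict

def split_reward(hashes_contributed: Dict[str, int],
                 reward_btc: int = 50) -> Dict[str, Fraction]:
    """Divide the block reward in proportion to each member's hash count."""
    total = sum(hashes_contributed.values())
    if total <= 0:
        raise ValueError("no hashes were contributed")
    return {member: Fraction(reward_btc) * count / total
            for member, count in hashes_contributed.items()}

if __name__ == "__main__":
    payouts = split_reward({"alice": 600, "bob": 300, "carol": 100})
    for member, amount in payouts.items():
        print(member, float(amount))
```

Using `Fraction` keeps the shares exact, so the payouts always add up to the full reward regardless of rounding.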

Measuring transaction activity on the Bitcoin network

All transactions are public on Bitcoin, so it’s not hard to find out the volume or activity of transactions on the Bitcoin network. Bitcoin Block Explorer is a web-based interface that I used to browse the block chain. Because the network is transparent, I was able to drill down to the contents of individual blocks, and even look at the contents of some individual transactions.

Over the course of four hours on a Wednesday afternoon (UTC) in September 2011 I gathered the following statistics: A total of 20 blocks were created, which means that 1,000 new Bitcoins were generated. (That is, each person or group credited with first finding one of 20 new blocks was awarded 50 BTC.) On average, a block was found every 12.35 minutes, though at one point two blocks were found within one minute of each other. The last block that was found during that period was the 144,391st to be added to the end of the block chain. On average, each block contained 63.85 transactions, with the smallest number of transactions in a block being 8 and the largest being 119. In terms of BTC amounts, the largest number of Bitcoins that changed hands in a block was 7,301.66 while the smallest was 457.83. The average transaction size was therefore 60.5 Bitcoins.

In four hours, close to 80 thousand (77,261.71) BTC were traded. Lastly, the storage space required by the various blocks ranged from 3.97 KB to 67.35 KB, with the average being 27.87 KB. That means that the data required to describe all activity in the Bitcoin economy during those four hours can be stored in just over half a megabyte of disk space.
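The derived figures in the last two paragraphs follow directly from the averages. This short check uses only the numbers quoted above; the variable names are mine.

```python
# Figures quoted in the text for the four-hour sample.
blocks_found = 20
reward_per_block_btc = 50
avg_transactions_per_block = 63.85
total_btc_traded = 77_261.71
avg_block_size_kb = 27.87

new_coins = blocks_found * reward_per_block_btc
total_transactions = blocks_found * avg_transactions_per_block
avg_transaction_btc = total_btc_traded / total_transactions
total_storage_kb = blocks_found * avg_block_size_kb

print(new_coins)                        # newly generated BTC
print(round(total_transactions))        # transactions in the sample
print(round(avg_transaction_btc, 1))    # average BTC per transaction
print(round(total_storage_kb, 1))       # KB needed for all 20 blocks
```

Twenty blocks averaging 27.87 KB come to about 557 KB, which is where the half-megabyte figure comes from.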

More about Bitcoin Block Explorer

Bitcoin Block Explorer displays data about the Bitcoin economy in near real-time with a delay of up to two minutes. The site also supports an HTTP-based query interface for several metrics, such as getblockcount (number of blocks in the longest chain), totalbc (the total number of Bitcoins in circulation) and latesthash (the latest block hash). I used this interface to do some quick calculations from a Bash command line. For instance, as shown in Figure 2, I used curl to retrieve the total amount of Bitcoins and block count, divide one by the other using bc, and display the output.

Figure 2. Bitcoin Block Explorer from the command line

The results in Figure 2 confirm that there are exactly 50 BTC for each block in the main block chain. That is because the reward for finding a block has been 50 BTC since the onset of the economy, although that number will decrease over time.
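For readers who prefer a script to a shell one-liner, the same division could be sketched in Python. The query names `totalbc` and `getblockcount` come from the text above, but everything else is an assumption: `BASE_URL` is a placeholder rather than the site's real address, each query is assumed to return a bare number as plain text, and the `fetch` parameter exists only so the function can be tried without network access.

```python
from decimal import Decimal
from typing import Callable
from urllib.request import urlopen

BASE_URL = "https://blockexplorer.example/q"   # placeholder, not a real endpoint

def http_get(url: str) -> str:
    """Fetch a URL and return its body as stripped text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8").strip()

def btc_per_block(fetch: Callable[[str], str] = http_get,
                  base_url: str = BASE_URL) -> Decimal:
    """Divide the total number of Bitcoins by the number of blocks."""
    total_btc = Decimal(fetch(f"{base_url}/totalbc"))
    block_count = Decimal(fetch(f"{base_url}/getblockcount"))
    return total_btc / block_count

if __name__ == "__main__":
    # Canned responses standing in for the live service (made-up sample values).
    canned = {
        f"{BASE_URL}/totalbc": "7219550",
        f"{BASE_URL}/getblockcount": "144391",
    }
    print(btc_per_block(fetch=canned.__getitem__))
```

`Decimal` is used instead of floating point so that a result of exactly 50 is not blurred by rounding.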

Something else that you can do with the Bitcoin Block Explorer is to display the so-called genesis block — the first block in the main block chain. The genesis block contains only a single transaction, namely the 50 BTC reward for solving the block.
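The genesis block also makes a convenient test case for the header structure described earlier. The sketch below packs the six header fields into 80 bytes and double-hashes them. The field values are the widely published ones for the genesis block, not values given in this article, so treat them as an assumption and compare the result with what Block Explorer shows for block 0. The helper names and the compact-target decoding in `bits_to_target` are my own simplifications.

```python
import hashlib
import struct

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def header_hash(version: int, prev_hash: str, merkle_root: str,
                timestamp: int, bits: int, nonce: int) -> str:
    """Serialize the six header fields (little-endian) and double-hash them."""
    header = struct.pack(
        "<L32s32sLLL",
        version,
        bytes.fromhex(prev_hash)[::-1],     # hashes are stored byte-reversed
        bytes.fromhex(merkle_root)[::-1],
        timestamp,
        bits,
        nonce,
    )
    return sha256d(header)[::-1].hex()      # reversed again for display

def bits_to_target(bits: int) -> int:
    """Expand the compact 32-bit header encoding into the full target."""
    exponent, mantissa = bits >> 24, bits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))

# Assumed values: the commonly published genesis block header.
GENESIS = dict(
    version=1,
    prev_hash="00" * 32,
    merkle_root="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)

if __name__ == "__main__":
    digest = header_hash(**GENESIS)
    print(digest)
    print(int(digest, 16) < bits_to_target(GENESIS["bits"]))
```

Because the genesis block has no predecessor, its previous-block field is all zeros, and its single transaction means the Merkle root is simply that transaction's hash.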

Conclusion to Part 2: Get ready for BitCoinJ

We’ve spent the past two articles getting to know Bitcoin, both from a user’s perspective and as a network system. In Part 3, we’ll do some hands-on work with Bitcoin. For that, we’ll use BitCoinJ, a Java implementation of the Bitcoin protocol and client. BitCoinJ is open source and aims to be more accessible and lightweight than the original C++ client. We’ll leverage BitCoinJ to create small sample applications that allow us to explore the data structures and network interactions involved in Bitcoin.

Dirk Merkel is the CTO at VivanTech Inc. He has been developing software in a variety of languages for over 25 years and has been getting paid for it for over 15 years. In his spare time, he likes to learn about new technologies and ruin perfectly good open-source projects by submitting unsolicited patches. He also writes about technology, software development, and architecture. He lives in San Diego with his lovely wife and two wonderful daughters. Dirk can be reached at dmerkel@vivantech.com.