Protocol documentation
Sources:
- Original Bitcoin client source
Type names used in this documentation are from the C99 standard.
Common standards
Hashes
Usually, when a hash is computed within bitcoin, it is computed twice. Most of the time SHA-256 hashes are used; however, RIPEMD-160 is also used when a shorter hash is desirable (for example, when creating a bitcoin address).
Example of double-SHA-256 encoding of string "hello":
hello
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 (first round of sha-256)
9595c9df90075148eb06860365df33584b75bff782a510c6cd4883a419833d50 (second round of sha-256)
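As a rough illustration (not taken from the original client), the double hash can be reproduced with Python's standard hashlib module. The helper name sha256d is made up for this sketch. Note that the second round hashes the raw 32-byte digest of the first round, not its hexadecimal text; run as is, the sketch should reproduce the two digests listed above.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Apply SHA-256 twice: the second round hashes the raw digest of the first."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


# Reproduce the "hello" example: one round, then two rounds.
first_round = hashlib.sha256(b"hello").hexdigest()
second_round = sha256d(b"hello").hex()
print(first_round)
print(second_round)
```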
For bitcoin addresses (RIPEMD-160) this would give:
hello
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 (first round is sha-256)
b6a9c8c230722b7c748331a8b450f05566dc7d0f (with ripemd-160)
Merkle Trees
Merkle trees are binary trees of hashes. Merkle trees in bitcoin use Double SHA-256, and are built up like so:
hash(a) = sha256(sha256(a))
hash(a) hash(b) hash(c)
hash(hash(a)+hash(b)) hash(hash(c)+hash(c))
hash(hash(hash(a)+hash(b))+hash(hash(c)+hash(c)))
At each level the hashes are paired up; when a level has an odd number of elements, the last element is _duplicated_.
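The construction above can be sketched in a few lines of Python. This is only an illustration under two assumptions: "+" means concatenation of the raw 32-byte digests, and the names dhash and merkle_root are hypothetical helpers, not functions from the original client.

```python
import hashlib


def dhash(data: bytes) -> bytes:
    """hash(a) = sha256(sha256(a)), as defined above."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(items: list) -> bytes:
    """Reduce a list of byte strings (a, b, c, ...) to a single root hash."""
    if not items:
        raise ValueError("at least one item is required")
    level = [dhash(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of elements: duplicate the last one before pairing.
            level.append(level[-1])
        level = [dhash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"a", b"b", b"c"])
print(root.hex())
```

With three elements the loop runs twice, matching the two pairing rows shown above.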
Signatures
Bitcoin uses the Elliptic Curve Digital Signature Algorithm (ECDSA) to sign transactions.
For ECDSA the secp256k1 curve from http://www.secg.org/collateral/sec2_final.pdf is used.
Public keys (in scripts) are given as 04 <x> <y>, where x and y are 32-byte strings representing the coordinates of a point on the curve. Signatures use DER encoding to pack the r and s components into a single byte stream (because this is what OpenSSL produces by default).
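A minimal sketch of reading such a key is given below. It assumes the coordinates are big-endian integers and uses the secp256k1 parameters (field prime P, curve equation y^2 = x^3 + 7) from the SEC 2 document referenced above. The function names are hypothetical, and the sketch performs no signature handling.

```python
# secp256k1 field prime and curve constant (y^2 = x^3 + 7 mod P).
P = 2**256 - 2**32 - 977
B = 7


def parse_uncompressed_pubkey(raw: bytes):
    """Split a 65-byte 04 <x> <y> key into integer coordinates (x, y)."""
    if len(raw) != 65 or raw[0] != 0x04:
        raise ValueError("expected 0x04 followed by two 32-byte coordinates")
    x = int.from_bytes(raw[1:33], "big")
    y = int.from_bytes(raw[33:65], "big")
    return x, y


def is_on_curve(x: int, y: int) -> bool:
    """Check that (x, y) satisfies the curve equation modulo P."""
    return 0 <= x < P and 0 <= y < P and (y * y - (x * x * x + B)) % P == 0
```

A client would typically reject a key whose point fails the is_on_curve check before attempting any signature verification.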
Transaction Verification
See also: OP_CHECKSIG
The first transaction of a block is usually the generating transaction, which does not include any "in" transaction and generates bitcoins (from fees, for example) that are usually received by whoever solved the block containing it. Such a transaction is called a "coinbase transaction" and is accepted by bitcoin clients without any need to execute scripts, provided there is only one per block.
If a transaction is not a coinbase, each of its inputs references the hash of a previous transaction and the index of that transaction's output being used as input. The script from the in part of this transaction is executed. Then the script from the out part of the referenced transaction is executed. It is considered valid if the top element of the stack is true.
Addresses
A bitcoin address is in fact the hash of an ECDSA public key, computed this way:
Version = 1 byte of 0 (zero); on the test network, this is 1 byte of 111
Key hash = Version concatenated with RIPEMD-160(SHA-256(public key))
Checksum = 1st 4 bytes of SHA-256(SHA-256(Key hash))
Bitcoin Address = Base58Encode(Key hash concatenated with Checksum)
The Base58 encoding used is home-made and has some peculiarities. In particular, leading zero bytes are preserved: each one is kept as a single zero digit (the character "1") when the conversion happens.
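The last three steps can be sketched as follows. The sketch starts from the 20-byte RIPEMD-160(SHA-256(public key)) value, because RIPEMD-160 is not guaranteed to be available in every build of Python's hashlib. The alphabet shown is the commonly used Base58 alphabet (an assumption, since it is not listed here), and both function names are hypothetical.

```python
import hashlib

# Base58 alphabet: digits and letters without 0, O, I and l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as Base58, keeping each leading zero byte as one '1'."""
    num = int.from_bytes(data, "big")
    digits = ""
    while num > 0:
        num, rem = divmod(num, 58)
        digits = ALPHABET[rem] + digits
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return ALPHABET[0] * leading_zeros + digits


def address_from_hash160(hash160: bytes, version: int = 0) -> str:
    """Build an address from RIPEMD-160(SHA-256(public key))."""
    if len(hash160) != 20:
        raise ValueError("expected a 20-byte hash")
    key_hash = bytes([version]) + hash160
    checksum = hashlib.sha256(hashlib.sha256(key_hash).digest()).digest()[:4]
    return base58_encode(key_hash + checksum)


print(address_from_hash160(bytes(20)))
```

Passing version=111 would give the test network form of the same key hash.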
Common structures
Almost all integers are encoded in little endian. Only IP addresses and port numbers are encoded in big endian.
Message structure
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | magic | uint32_t | Magic value indicating message origin network, and used to seek to next message when stream state is unknown |
12 | command | char[12] | ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected) |
4 | length | uint32_t | Length of payload in number of bytes |
4 | checksum | uint32_t | First 4 bytes of sha256(sha256(payload)) (not included in version or verack) |
? | payload | uchar[] | The actual data |
The version and verack messages do not have a checksum; for these messages, the payload starts 4 bytes earlier.
Known magic values:
Network | Magic value | Sent over wire as |
---|---|---|
main | 0xD9B4BEF9 | F9 BE B4 D9 |
testnet | 0xDAB5BFFA | FA BF B5 DA |
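A sketch of how a message could be framed according to the tables above is shown below. The function name build_message and its with_checksum switch are assumptions made for illustration; the caller has to disable the checksum for version and verack.

```python
import hashlib
import struct

MAGIC_MAIN = 0xD9B4BEF9
MAGIC_TESTNET = 0xDAB5BFFA


def build_message(command: str, payload: bytes = b"",
                  magic: int = MAGIC_MAIN, with_checksum: bool = True) -> bytes:
    """Prefix a payload with magic, NULL padded command, length and checksum."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command does not fit in 12 bytes")
    # "<" selects little endian without padding; "12s" NULL pads the command.
    header = struct.pack("<I12sI", magic, name, len(payload))
    if with_checksum:
        digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
        header += digest[:4]
    return header + payload


# verack carries no payload and, like version, no checksum.
verack = build_message("verack", with_checksum=False)
print(verack.hex())
```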
Variable length integer
An integer can be encoded in a variable number of bytes, depending on the value it represents, to save space. Variable length integers always precede an array/vector of a type of data that may vary in length.
Value | Storage length | Format |
---|---|---|
< 0xfd | 1 | uint8_t |
<= 0xffff | 3 | 0xfd + uint16_t |
<= 0xffffffff | 5 | 0xfe + uint32_t |
- | 9 | 0xff + uint64_t |
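The table translates fairly directly into code. The following sketch assumes the multi-byte forms are little endian, like the other integers in the protocol; encode_var_int and decode_var_int are hypothetical names.

```python
import struct


def encode_var_int(n: int) -> bytes:
    """Encode n using the shortest form allowed by the table above."""
    if n < 0:
        raise ValueError("var_int cannot be negative")
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xff" + struct.pack("<Q", n)
    raise ValueError("value does not fit in 64 bits")


def decode_var_int(data: bytes, offset: int = 0):
    """Return (value, offset of the first byte after the var_int)."""
    prefix = data[offset]
    if prefix < 0xFD:
        return prefix, offset + 1
    fmt, size = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[prefix]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, offset + 1 + size
```

Returning the new offset from the decoder makes it easy to continue reading the array that follows the integer.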
Variable length string
A variable length string is stored as a variable length integer followed by the string itself.
Field Size | Description | Data type | Comments |
---|---|---|---|
? | length | var_int | Length of the string |
? | string | char[] | The string itself (can be empty) |
Network address
When a network address is needed somewhere, this structure is used. This protocol and structure support IPv6, but note that the original client currently only supports IPv4 networking.
Field Size | Description | Data type | Comments |
---|---|---|---|
8 | services | uint64_t | same service(s) listed in version? |
16 | IPv6/4 | char[16] | IPv6 address. Network byte order. The original client only supports IPv4 and only reads the last 4 bytes to get the IPv4 address. However, the IPv4 address is written into the message as a 16 byte IPv4-mapped IPv6 address (12 bytes 00 00 00 00 00 00 00 00 00 00 FF FF, followed by the 4 bytes of the IPv4 address). |
2 | port | uint16_t | port number, network byte order |
Hexdump example of Network address structure
0000 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0010 00 00 FF FF 0A 00 00 01 20 8D ........ .

Network address:
 01 00 00 00 00 00 00 00 - 1 (NODE_NETWORK? see services listed under version command)
 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 - IPv6: ::ffff:10.0.0.1 or IPv4: 10.0.0.1
 20 8D - Port 8333
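The 26 bytes of this example can be rebuilt with the sketch below. It only handles IPv4 addresses, mirroring the behaviour described for the original client; pack_net_addr and unpack_net_addr are hypothetical names.

```python
import ipaddress
import struct

IPV4_MAPPED_PREFIX = b"\x00" * 10 + b"\xff\xff"


def pack_net_addr(services: int, ipv4: str, port: int) -> bytes:
    """Serialize services (little endian), mapped IPv4 and port (big endian)."""
    ip_bytes = IPV4_MAPPED_PREFIX + ipaddress.IPv4Address(ipv4).packed
    return struct.pack("<Q", services) + ip_bytes + struct.pack(">H", port)


def unpack_net_addr(raw: bytes):
    """Return (services, ipv4, port), reading only the last 4 address bytes."""
    if len(raw) != 26:
        raise ValueError("a network address is 26 bytes long")
    (services,) = struct.unpack_from("<Q", raw, 0)
    ipv4 = str(ipaddress.IPv4Address(raw[20:24]))
    (port,) = struct.unpack_from(">H", raw, 24)
    return services, ipv4, port


print(pack_net_addr(1, "10.0.0.1", 8333).hex())
```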
Inventory Vectors
Inventory vectors are used for notifying other nodes about objects they have or data which is being requested.
Inventory vectors consist of the following data format:
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | type | uint32_t | Identifies the object type linked to this inventory |
32 | hash | char[32] | Hash of the object |
The object type is currently defined as one of the following possibilities:
Value | Name | Description |
---|---|---|
0 | ERROR | Any data with this number may be ignored |
1 | MSG_TX | Hash is related to a transaction |
2 | MSG_BLOCK | Hash is related to a data block |
Other Data Type values are considered reserved for future implementations.
Block Headers
Block headers are sent in a headers packet in response to a getheaders message.
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | uint32_t | Block version information, based upon the software version creating this block |
32 | prev_block | char[32] | The hash value of the previous block this particular block references |
32 | merkle_root | char[32] | The reference to a Merkle tree collection which is a hash of all transactions related to this block |
4 | timestamp | uint32_t | A timestamp recording when this block was created (Limited to 2106!) |
4 | bits | uint32_t | The calculated difficulty target being used for this block |
4 | nonce | uint32_t | The nonce used to generate this block… to allow variations of the header and compute different hashes |
1 | txn_count | uint8_t | Number of transaction entries, this value is always 0 |
Message types
version
When a node creates an outgoing connection, it will immediately advertise its version. The remote node will respond with its version. No further communication is possible until both peers have exchanged their version.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | uint32_t | Identifies protocol version being used by the node |
8 | services | uint64_t | bitfield of features to be enabled for this connection |
8 | timestamp | uint64_t | standard UNIX timestamp in seconds |
26 | addr_me | net_addr | The network address of the node emitting this message |
version >= 106 | |||
26 | addr_you | net_addr | The network address seen by the node emitting this message (i.e., the address of the receiving node) |
8 | nonce | uint64_t | Node random unique id. This id is used to detect connections to self |
? | sub_version_num | var_str | Secondary Version information (0x00 if string is 0 bytes long) |
version >= 209 | |||
4 | start_height | uint32_t | The last block received by the emitting node |
If the emitter of the packet has version >= 209, a "verack" packet shall be sent if the version packet was accepted.
The following services are currently assigned:
Value | Name | Description |
---|---|---|
1 | NODE_NETWORK | This node can be asked for full blocks instead of just headers. |
Hexdump example of version message (note the message header for this version message does not have a checksum):
0000 F9 BE B4 D9 76 65 72 73 69 6F 6E 00 00 00 00 00 ....version.....
0010 55 00 00 00 9C 7C 00 00 01 00 00 00 00 00 00 00 U....|..........
0020 E6 15 10 4D 00 00 00 00 01 00 00 00 00 00 00 00 ...M............
0030 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 ................
0040 DA F6 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0050 00 00 00 00 FF FF 0A 00 00 02 20 8D DD 9D 20 2C .......... ... ,
0060 3A B4 57 13 00 55 81 01 00 :.W..U...

Message header:
 F9 BE B4 D9 - Main network magic bytes
 76 65 72 73 69 6F 6E 00 00 00 00 00 - "version" command
 55 00 00 00 - Payload is 85 bytes long
 - No checksum in version message

Version message:
 9C 7C 00 00 - 31900 (version 0.3.19)
 01 00 00 00 00 00 00 00 - 1 (NODE_NETWORK services)
 E6 15 10 4D 00 00 00 00 - Mon Dec 20 21:50:14 EST 2010
 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 DA F6 - Sender address info - see Network Address
 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 02 20 8D - Recipient address info - see Network Address
 DD 9D 20 2C 3A B4 57 13 - Node random unique ID
 00 - "" sub-version string (string is 0 bytes long)
 55 81 01 00 - Last block sending node has is block #98645
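The 85-byte payload of this example can be assembled with the following sketch. It follows the field table and the version thresholds (106 and 209) given above. The function names are hypothetical, and the sub-version string is assumed to be shorter than 0xfd bytes so that its var_int length fits in one byte.

```python
import ipaddress
import struct


def pack_net_addr(services: int, ipv4: str, port: int) -> bytes:
    """26-byte network address: services, IPv4-mapped IPv6 address, port."""
    ip_bytes = b"\x00" * 10 + b"\xff\xff" + ipaddress.IPv4Address(ipv4).packed
    return struct.pack("<Q", services) + ip_bytes + struct.pack(">H", port)


def build_version_payload(version: int, services: int, timestamp: int,
                          addr_me: bytes, addr_you: bytes, nonce: int,
                          sub_version: bytes = b"", start_height: int = 0) -> bytes:
    """Serialize the version payload, adding fields according to `version`."""
    payload = struct.pack("<IQQ", version, services, timestamp) + addr_me
    if version >= 106:
        if len(sub_version) >= 0xFD:
            raise ValueError("sketch only supports short sub-version strings")
        payload += addr_you + struct.pack("<Q", nonce)
        payload += bytes([len(sub_version)]) + sub_version
    if version >= 209:
        payload += struct.pack("<I", start_height)
    return payload


payload = build_version_payload(
    version=31900, services=1, timestamp=0x4D1015E6,
    addr_me=pack_net_addr(1, "10.0.0.1", 0xDAF6),
    addr_you=pack_net_addr(1, "10.0.0.2", 8333),
    nonce=int.from_bytes(bytes.fromhex("DD9D202C3AB45713"), "little"),
    start_height=98645,
)
print(len(payload), payload.hex())
```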
verack
The verack message is sent in reply to version for clients >= 209. This message consists of only a message header with the command string "verack".
Hexdump of the verack message:
0000 F9 BE B4 D9 76 65 72 61 63 6B 00 00 00 00 00 00 ....verack......
0010 00 00 00 00 ....

Message header:
 F9 BE B4 D9 - Main network magic bytes
 76 65 72 61 63 6B 00 00 00 00 00 00 - "verack" command
 00 00 00 00 - Payload is 0 bytes long
addr
Provides information on known nodes of the network. Non-advertised nodes should be forgotten after typically 3 hours.
Payload (maximum payload length: 1000 bytes):
Field Size | Description | Data type | Comments |
---|---|---|---|
1+ | count | var_int | Number of address entries |
30x? | addr_list | (uint32_t + net_addr)[] | Address of other nodes on the network. version < 209 will only read the first one. The uint32_t is a timestamp (see note below). |
Note: Starting version 31402, addresses are prefixed with a timestamp. If no timestamp is present, the addresses should not be relayed to other peers, unless it is indeed confirmed they are up.
Hexdump example of addr message:
0000 F9 BE B4 D9 61 64 64 72 00 00 00 00 00 00 00 00 ....addr........
0010 1F 00 00 00 ED 52 39 9B 01 E2 15 10 4D 01 00 00 .....R9.....M...
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF ................
0030 FF 0A 00 00 01 20 8D ..... .

Message Header:
 F9 BE B4 D9 - Main network magic bytes
 61 64 64 72 00 00 00 00 00 00 00 00 - "addr"
 1F 00 00 00 - payload is 31 bytes long
 ED 52 39 9B - checksum of payload

Payload:
 01 - 1 address in this message

Address:
 E2 15 10 4D - Mon Dec 20 21:50:10 EST 2010 (only when version is >= 31402)
 01 00 00 00 00 00 00 00 - 1 (NODE_NETWORK service - see version message)
 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 - IPv4: 10.0.0.1, IPv6: ::ffff:10.0.0.1 (IPv4-mapped IPv6 address)
 20 8D - port 8333
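Going the other way, the payload of this example can be decoded with the sketch below. It treats every address as IPv4 (only the last 4 address bytes are read), and the with_timestamp flag is a made-up parameter standing in for the "version >= 31402" condition.

```python
import ipaddress
import struct


def read_var_int(data: bytes, offset: int):
    """Decode a variable length integer; return (value, next offset)."""
    prefix = data[offset]
    if prefix < 0xFD:
        return prefix, offset + 1
    fmt, size = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[prefix]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, offset + 1 + size


def parse_addr_payload(payload: bytes, with_timestamp: bool = True) -> list:
    """Return one dict per address entry found in an addr payload."""
    count, offset = read_var_int(payload, 0)
    entries = []
    for _ in range(count):
        timestamp = None
        if with_timestamp:
            (timestamp,) = struct.unpack_from("<I", payload, offset)
            offset += 4
        (services,) = struct.unpack_from("<Q", payload, offset)
        ip = str(ipaddress.IPv4Address(payload[offset + 20:offset + 24]))
        (port,) = struct.unpack_from(">H", payload, offset + 24)
        offset += 26
        entries.append({"timestamp": timestamp, "services": services,
                        "ip": ip, "port": port})
    return entries


example = bytes.fromhex("01" "E215104D" "0100000000000000"
                        "00000000000000000000FFFF0A000001" "208D")
print(parse_addr_payload(example))
```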
inv
Allows a node to advertise its knowledge of one or more objects. It can be received unsolicited, or in reply to getblocks.
Payload (maximum payload length: 50000 bytes):
Field Size | Description | Data type | Comments |
---|---|---|---|
? | count | var_int | Number of inventory entries |
36x? | inventory | inv_vect[] | Inventory vectors |
getdata
getdata is used in response to inv to retrieve the content of specific objects. It is usually sent after receiving an inv packet, once already-known elements have been filtered out.
Payload (maximum payload length: 50000 bytes):
Field Size | Description | Data type | Comments |
---|---|---|---|
? | count | var_int | Number of inventory entries |
36x? | inventory | inv_vect[] | Inventory vectors |
getblocks
Returns an inv packet containing the list of blocks starting at hash_start, up to hash_stop or 500 blocks, whichever comes first. To receive the next block hashes, one needs to issue getblocks again with the last known hash.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | uint32_t | for some reason, the protocol version |
1+ | start count | var_int | number of block locator hash entries |
32+ | block locator hashes | char[32] | block locator object. Newest back to genesis block (dense to start, but then sparse) |
32 | hash_stop | char[32] | hash of the last desired block. Set to zero to get as many blocks as possible (500) |
getheaders
Returns a headers packet containing the headers for blocks starting at hash_start, up to hash_stop or 2000 blocks, whichever comes first. To receive the next block headers, one needs to issue getheaders again with the last known hash. The getheaders command is used by thin clients to quickly download the blockchain where the contents of the transactions would be irrelevant (because they are not ours).
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
1+ | start count | var_int | number of hash_start entries |
32+ | hash_start | char[32] | hash of the last known block of the emitting node |
32 | hash_stop | char[32] | hash of the last desired block. Set to zero to get as many blocks as possible (2000) |
tx
tx describes a bitcoin transaction, in reply to getdata.
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | uint32_t | Transaction data format version |
1+ | tx_in count | var_int | Number of Transaction inputs |
41+ | tx_in | tx_in[] | A list of 1 or more transaction inputs or sources for coins |
1+ | tx_out count | var_int | Number of Transaction outputs |
8+ | tx_out | tx_out[] | A list of 1 or more transaction outputs or destinations for coins |
4 | lock_time | uint32_t | The block number or timestamp at which this transaction is locked, or 0 if the transaction is always locked. A non-locked transaction must not be included in blocks, and it can be modified by broadcasting a new version before the time has expired (replacement is currently disabled in Bitcoin, however, so this is useless). |
TxIn consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
36 | previous_output | outpoint | The previous output transaction reference, as an OutPoint structure |
1+ | script length | var_int | The length of the signature script |
? | signature script | uchar[] | Computational Script for confirming transaction authorization |
4 | sequence | uint32_t | Transaction version as defined by the sender. Intended for "replacement" of transactions when information is updated before inclusion into a block. |
The OutPoint structure consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | hash | char[32] | The hash of the referenced transaction. |
4 | index | uint32_t | The index of the specific output in the transaction. The first output is 0, etc. |
The Script structure consists of a series of pieces of information and operations related to the value of the transaction.
(Structure to be expanded in the future… see script.h and script.cpp for more information)
The TxOut structure consists of the following fields:
Field Size | Description | Data type | Comments |
---|---|---|---|
8 | value | uint64_t | Transaction Value |
1+ | pk_script length | var_int | Length of the pk_script |
? | pk_script | uchar[] | Usually contains the public key as a Bitcoin script setting up conditions to claim this output. |
Example tx message:
000000 F9 BE B4 D9 74 78 00 00 00 00 00 00 00 00 00 00 ....tx..........
000010 02 01 00 00 E2 93 CD BE 01 00 00 00 01 6D BD DB .............m..
000020 08 5B 1D 8A F7 51 84 F0 BC 01 FA D5 8D 12 66 E9 .[...Q........f.
000030 B6 3B 50 88 19 90 E4 B4 0D 6A EE 36 29 00 00 00 .;P......j.6)...
000040 00 8B 48 30 45 02 21 00 F3 58 1E 19 72 AE 8A C7 ..H0E.!..X..r...
000050 C7 36 7A 7A 25 3B C1 13 52 23 AD B9 A4 68 BB 3A .6zz%;..R#...h.:
000060 59 23 3F 45 BC 57 83 80 02 20 59 AF 01 CA 17 D0 Y#?E.W... Y.....
000070 0E 41 83 7A 1D 58 E9 7A A3 1B AE 58 4E DE C2 8D .A.z.X.z...XN...
000080 35 BD 96 92 36 90 91 3B AE 9A 01 41 04 9C 02 BF 5...6..;...A....
000090 C9 7E F2 36 CE 6D 8F E5 D9 40 13 C7 21 E9 15 98 .~.6.m...@..!...
0000A0 2A CD 2B 12 B6 5D 9B 7D 59 E2 0A 84 20 05 F8 FC *.+..].}Y... ...
0000B0 4E 02 53 2E 87 3D 37 B9 6F 09 D6 D4 51 1A DA 8F N.S..=7.o...Q...
0000C0 14 04 2F 46 61 4A 4C 70 C0 F1 4B EF F5 FF FF FF ../FaJLp..K.....
0000D0 FF 02 40 4B 4C 00 00 00 00 00 19 76 A9 14 1A A0 ..@KL......v....
0000E0 CD 1C BE A6 E7 45 8A 7A BA D5 12 A9 D9 EA 1A FB .....E.z........
0000F0 22 5E 88 AC 80 FA E9 C7 00 00 00 00 19 76 A9 14 "^...........v..
000100 0E AB 5B EA 43 6A 04 84 CF AB 12 48 5E FD A0 B7 ..[.Cj.....H^...
000110 8B 4E CC 52 88 AC 00 00 00 00 .N.R......

Message header:
 F9 BE B4 D9 - main network magic bytes
 74 78 00 00 00 00 00 00 00 00 00 00 - "tx" command
 02 01 00 00 - payload is 258 bytes long
 E2 93 CD BE - checksum of payload

Transaction:
 01 00 00 00 - version

Inputs:
 01 - number of transaction inputs

Input 1:
 6D BD DB 08 5B 1D 8A F7 51 84 F0 BC 01 FA D5 8D - previous output (outpoint)
 12 66 E9 B6 3B 50 88 19 90 E4 B4 0D 6A EE 36 29
 00 00 00 00
 8B - script is 139 bytes long
 48 30 45 02 21 00 F3 58 1E 19 72 AE 8A C7 C7 36 - signature script (scriptSig)
 7A 7A 25 3B C1 13 52 23 AD B9 A4 68 BB 3A 59 23
 3F 45 BC 57 83 80 02 20 59 AF 01 CA 17 D0 0E 41
 83 7A 1D 58 E9 7A A3 1B AE 58 4E DE C2 8D 35 BD
 96 92 36 90 91 3B AE 9A 01 41 04 9C 02 BF C9 7E
 F2 36 CE 6D 8F E5 D9 40 13 C7 21 E9 15 98 2A CD
 2B 12 B6 5D 9B 7D 59 E2 0A 84 20 05 F8 FC 4E 02
 53 2E 87 3D 37 B9 6F 09 D6 D4 51 1A DA 8F 14 04
 2F 46 61 4A 4C 70 C0 F1 4B EF F5
 FF FF FF FF - sequence

Outputs:
 02 - 2 Output Transactions

Output 1:
 40 4B 4C 00 00 00 00 00 - 0.05 BTC (5000000)
 19 - pk_script is 25 bytes long
 76 A9 14 1A A0 CD 1C BE A6 E7 45 8A 7A BA D5 12 - pk_script
 A9 D9 EA 1A FB 22 5E 88 AC

Output 2:
 80 FA E9 C7 00 00 00 00 - 33.54 BTC (3354000000)
 19 - pk_script is 25 bytes long
 76 A9 14 0E AB 5B EA 43 6A 04 84 CF AB 12 48 5E - pk_script
 FD A0 B7 8B 4E CC 52 88 AC

Locktime:
 00 00 00 00 - lock time
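The field tables for tx, TxIn, OutPoint and TxOut can be turned into a small parser. The sketch below is an illustration only: it works on the payload (everything after the message header), does not interpret the scripts, and uses hypothetical function names.

```python
import struct


def read_var_int(data: bytes, offset: int):
    """Decode a variable length integer; return (value, next offset)."""
    prefix = data[offset]
    if prefix < 0xFD:
        return prefix, offset + 1
    fmt, size = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[prefix]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, offset + 1 + size


def parse_tx(data: bytes, offset: int = 0):
    """Parse one transaction; return (tx dict, offset after the transaction)."""
    (version,) = struct.unpack_from("<I", data, offset)
    offset += 4
    n_in, offset = read_var_int(data, offset)
    inputs = []
    for _ in range(n_in):
        prev_hash = data[offset:offset + 32]            # OutPoint hash
        (prev_index,) = struct.unpack_from("<I", data, offset + 32)
        offset += 36
        script_len, offset = read_var_int(data, offset)
        script = data[offset:offset + script_len]       # signature script
        offset += script_len
        (sequence,) = struct.unpack_from("<I", data, offset)
        offset += 4
        inputs.append({"prev_hash": prev_hash, "prev_index": prev_index,
                       "script": script, "sequence": sequence})
    n_out, offset = read_var_int(data, offset)
    outputs = []
    for _ in range(n_out):
        (value,) = struct.unpack_from("<Q", data, offset)
        offset += 8
        script_len, offset = read_var_int(data, offset)
        pk_script = data[offset:offset + script_len]
        offset += script_len
        outputs.append({"value": value, "pk_script": pk_script})
    (lock_time,) = struct.unpack_from("<I", data, offset)
    offset += 4
    tx = {"version": version, "inputs": inputs,
          "outputs": outputs, "lock_time": lock_time}
    return tx, offset
```

Returning the end offset lets the same function be reused to read the consecutive transactions of a block message.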
block
The block message is sent in response to a getdata message which requests transaction information from a block hash.
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | uint32_t | Block version information, based upon the software version creating this block |
32 | prev_block | char[32] | The hash value of the previous block this particular block references |
32 | merkle_root | char[32] | The reference to a Merkle tree collection which is a hash of all transactions related to this block |
4 | timestamp | uint32_t | A timestamp recording when this block was created (Limited to 2106!) |
4 | bits | uint32_t | The calculated difficulty target being used for this block |
4 | nonce | uint32_t | The nonce used to generate this block… to allow variations of the header and compute different hashes |
? | txn_count | var_int | Number of transaction entries |
? | txns | tx[] | Block transactions, in format of "tx" command |
The SHA256 hash that identifies each block (and which must have a run of 0 bits) is calculated from the first 6 fields of this structure (version, prev_block, merkle_root, timestamp, bits, nonce, and standard SHA256 padding, making two 64-byte chunks in all) and not from the complete block. To calculate the hash, only two chunks need to be processed by the SHA256 algorithm. Since the nonce field is in the second chunk, the first chunk stays constant during mining and therefore only the second chunk needs to be processed. However, a Bitcoin hash is the hash of the hash, so two SHA256 rounds are needed for each mining iteration.
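The computation described above can be sketched as follows. The function names are hypothetical, and prev_block and merkle_root are assumed to be passed in the same byte order in which they appear on the wire. The second function illustrates the "constant first chunk" remark by hashing the first 64 bytes once and reusing that state for every nonce, relying on the copy() method of hashlib objects.

```python
import hashlib
import struct


def pack_block_header(version: int, prev_block: bytes, merkle_root: bytes,
                      timestamp: int, bits: int, nonce: int) -> bytes:
    """Serialize the first 6 fields of a block (80 bytes in total)."""
    if len(prev_block) != 32 or len(merkle_root) != 32:
        raise ValueError("prev_block and merkle_root must be 32 bytes each")
    return struct.pack("<I32s32sIII", version, prev_block, merkle_root,
                       timestamp, bits, nonce)


def block_hash(header: bytes) -> bytes:
    """Hash of the hash of the 80-byte header."""
    if len(header) != 80:
        raise ValueError("a block header is 80 bytes long")
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def block_hash_with_midstate(midstate, tail: bytes, nonce: int) -> bytes:
    """Same result as block_hash, reusing the state of the first 64 bytes.

    midstate: hashlib.sha256 object already fed header[:64]
    tail:     header[64:76] (end of merkle_root, timestamp, bits)
    """
    inner = midstate.copy()
    inner.update(tail + struct.pack("<I", nonce))
    return hashlib.sha256(inner.digest()).digest()
```

In a mining loop, midstate would be created once per candidate block with hashlib.sha256(header[:64]) and block_hash_with_midstate called for each nonce.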
headers
The headers packet returns block headers in response to a getheaders packet.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
? | count | var_int | Number of block headers |
81x? | headers | block_header[] | Block headers |
getaddr
The getaddr message sends a request to a node asking for information about known active peers to help with identifying potential nodes in the network. The response to receiving this message is to transmit an addr message with one or more peers from a database of known active peers. The typical presumption is that a node is likely to be active if it has been sending a message within the last three hours.
No additional data is transmitted with this message.
checkorder
This message is used for IP Transactions, to ask the peer if it accepts such transactions and allow it to look at the content of the order.
It contains a CWalletTx object.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
Fields from CMerkleTx | |||
? | hashBlock | ||
? | vMerkleBranch | ||
? | nIndex | ||
Fields from CWalletTx | |||
? | vtxPrev | ||
? | mapValue | ||
? | vOrderForm | ||
? | fTimeReceivedIsTxTime | ||
? | nTimeReceived | ||
? | fFromMe | ||
? | fSpent |
submitorder
Confirms an order has been submitted.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | hash | char[32] | Hash of the transaction |
? | wallet_entry | CWalletTx | Same payload as checkorder |
reply
Generic reply for IP Transactions.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | reply | uint32_t | reply code |
Possible values:
Value | Name | Description |
---|---|---|
0 | SUCCESS | The IP Transaction can proceed (checkorder), or has been accepted (submitorder) |
1 | WALLET_ERROR | AcceptWalletTransaction() failed |
2 | DENIED | IP Transactions are not accepted by this node |
ping
The ping message is sent primarily to confirm that the TCP/IP connection is still valid. An error in transmission is presumed to be a closed connection and the address is removed as a current peer. No reply is expected as a result of this message being sent nor any sort of action expected on the part of a client when it is used.
alert
An alert is sent between nodes to send a general notification message throughout the network. If the alert can be confirmed with the signature as having come from the core development group of the Bitcoin software, the message is suggested to be displayed for end-users. Attempts to perform transactions, particularly automated transactions through the client, are suggested to be halted. The text in the Message string should be relayed to log files and any user interfaces.
Payload:
Field Size | Description | Data type | Comments |
---|---|---|---|
? | message | var_str | System message which is coded to convey some information to all nodes in the network |
? | signature | var_str | A signature which can be confirmed with a public key verifying that it is Satoshi (the originator of Bitcoins) who has "authorized" or created the message |
The signature is to be compared to this ECDSA public key:
04fc9702847840aaf195de8442ebecedf5b095cdbb9bc716bda9110971b28a49e0ead8564ff0db22209e0374782c093bb899692d524e9d6a6956e7c5ecbcd68284
(hash) 1AGRxqDa5WjUKBwHB9XYEjmkv1ucoUUy1s
Source: [1]
Scripting
See script.
Wireshark dissector
A dissector for Wireshark is being developed at https://github.com/blueCommand/bitcoin-dissector