German researchers have discovered that unknown persons are using Bitcoin’s blockchain to store and link to child abuse imagery, potentially putting cryptocurrencies in jeopardy. Not only Bitcoin but also Ethereum and IOTA are in danger due to "smart contracts".
Researchers from RWTH Aachen University, Germany, found that around 1,600 files were stored in Bitcoin’s blockchain. Of these files, at least eight were of sexual content, including one thought to be an image of child abuse and two that contain 274 links to child abuse content, 142 of which link to dark web services.
“Our analysis shows that certain content, e.g., illegal pornography, can render the mere possession of a blockchain illegal,” the researchers wrote. “Although court rulings do not yet exist, legislative texts from countries such as Germany, the UK, or the USA suggest that illegal content such as [child abuse imagery] can make the blockchain illegal to possess for all users.”
A Quantitative Analysis of the Impact of Arbitrary Blockchain Content on Bitcoin
The Full Report (DOWNLOAD PDF):
Abstract
Blockchains primarily enable credible accounting of digital events, e.g., money transfers in cryptocurrencies. However, beyond this original purpose, blockchains also irrevocably record arbitrary data, ranging from short messages to pictures. This does not come without risk for users as each participant has to locally replicate the complete blockchain, particularly including potentially harmful content.
We provide the first systematic analysis of the benefits and threats of arbitrary blockchain content. Our analysis shows that certain content, e.g., illegal pornography, can render the mere possession of a blockchain illegal. Based on these insights, we conduct a thorough quantitative and qualitative analysis of unintended content on Bitcoin’s blockchain.
Although most data originates from benign extensions to Bitcoin’s protocol, our analysis reveals more than 1600 files on the blockchain, over 99 % of which are texts or images. Among these files there is clearly objectionable content such as links to child pornography, which is distributed to all Bitcoin participants.
With our analysis, we thus highlight the importance for future blockchain designs to address the possibility of unintended data insertion and protect blockchain users accordingly.
Data Insertion Methods for Bitcoin
Beyond intended recording of financial transactions, Bitcoin’s blockchain also allows for injection of non-financial data, either short messages via special transaction types or even complete files by encoding arbitrary data as standard transactions. We first briefly introduce Bitcoin transactions and subsequently survey methods available to store arbitrary content on the blockchain via transactions.
Bitcoin transactions transfer funds between a payer (sender) and a payee (receiver), who are identified by public-private key pairs. Payers announce their transactions to the Bitcoin network. The miners then publish these transactions in new blocks using their computational power in exchange for a fee. These fees vary, but averaged 215 satoshi per Byte during August 2017 [4] (1 satoshi = 10^-8 bitcoin).
Each transaction consists of several input scripts, which unlock funds of previous transactions, and of several output scripts, which specify who receives these funds. To unlock funds, input scripts contain a signature for the previous transaction generated by the owner of the funds. To prevent malicious scripts from causing excessive transaction verification overheads, Bitcoin uses transaction script templates and expects peers to discard non-compliant scripts.
2.1 Low-level Data Insertion Methods
We first survey the efficiency of the low-level data insertion methods w.r.t. insertable payload and costs per transaction (Table 1). To this end, we first explain our comparison methodology, before we detail i) intended data insertion methods (OP_RETURN and coinbase), ii) utilization of non-standard transactions, and iii) manipulation of standard transactions to insert arbitrary data.
Comparison Methodology. We measure the payload per transaction (PpT), i.e., the number of non-financial Bytes that can be added to a single standard-sized transaction (≤ 100 000 B). Costs are given as the minimum and maximum costs per Byte (CpB) for the longest data chunk a transaction can hold, and for inserting 1 B. Costs are inflicted by paying transaction fees and possibly burning currency (at least 546 satoshi per output script), i.e., making it unspendable.
For our cost analysis we assume Bitcoin’s market price of 4748.25 USD as of August 31st, 2017 [14] and the average fees of 215 satoshi per Byte as of August 2017 [4]. Note that high variation of market price and fees results in frequent changes of presented absolute costs per Byte. Finally, we rate the overall efficiency of an approach w.r.t. insertion of arbitrary-length content. Intuitively, a method is efficient if it allows for easy insertion of large payloads at low costs.
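As a rough illustration of the two cost components, the quoted fee rate and the minimum burned amount can be converted into US cents at the quoted market price. This is a minimal sketch: the constant and function names are chosen here for illustration, and only the numeric values come from the text.

```python
SATOSHI_PER_BTC = 10**8
USD_PER_BTC = 4748.25        # market price on August 31st, 2017, as quoted above
FEE_SAT_PER_BYTE = 215       # average fee in August 2017, as quoted above
BURN_SAT_PER_OUTPUT = 546    # minimum amount burned per output script


def satoshi_to_cents(satoshi: float) -> float:
    """Convert an amount in satoshi to US cents at the assumed market price."""
    return satoshi / SATOSHI_PER_BTC * USD_PER_BTC * 100


fee_cents_per_byte = satoshi_to_cents(FEE_SAT_PER_BYTE)
burn_cents_per_output = satoshi_to_cents(BURN_SAT_PER_OUTPUT)

print(f"fee per transaction byte: {fee_cents_per_byte:.2f} ct")    # 1.02 ct
print(f"burned per output script: {burn_cents_per_output:.2f} ct")  # 2.59 ct
```

The fee applies to every Byte of the transaction, not only to payload Bytes, so the effective cost per payload Byte is always somewhat higher than the raw fee rate.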
OP_RETURN. This special transaction template allows attaching one small data chunk to a transaction and thus provides a controlled channel to annotate transactions without negative side effects. E.g., in typical implementations peers increase performance by caching spendable transaction outputs and OP_RETURN outputs can safely be excluded from this cache. However, data chunk sizes are limited to 80 B per transaction.
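To make the OP_RETURN template concrete, the following sketch assembles the raw bytes of such an output script. The opcode values (0x6a for OP_RETURN, 0x4c for OP_PUSHDATA1) are the standard Bitcoin Script encodings; the function name `op_return_script` is hypothetical, and the sketch builds only the script, not a complete signed transaction.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
MAX_OP_RETURN_PAYLOAD = 80  # size limit stated in the text


def op_return_script(payload: bytes) -> bytes:
    """Build an output script of the form: OP_RETURN <push payload>."""
    if len(payload) > MAX_OP_RETURN_PAYLOAD:
        raise ValueError("payload exceeds the 80 B OP_RETURN limit")
    if len(payload) <= 75:
        # Pushes of up to 75 bytes use a single length byte as the opcode.
        push = bytes([len(payload)])
    else:
        # Longer pushes need OP_PUSHDATA1 followed by a one-byte length.
        push = bytes([OP_PUSHDATA1, len(payload)])
    return bytes([OP_RETURN]) + push + payload


print(op_return_script(b"hello").hex())  # 6a0568656c6c6f
```

Because the script starts with OP_RETURN, the output is provably unspendable, which is why peers can leave it out of their cache of spendable outputs.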
Coinbase. In Bitcoin, each block contains exactly one coinbase transaction, which introduces new currency into the system to incentivize miners to dedicate their computational power to maintain the blockchain. The input script of coinbase transactions is up to 100 B long and consists of a variable-length field encoding the new block’s position in the blockchain [9]. Stating a larger size than the overall script length allows placing arbitrary data in the resulting gap. This method is inefficient as only active miners can use it, and only to insert small data chunks.
Non-standard Transactions. Transactions can deviate from the approved transaction templates [48] via their output scripts as well as input scripts. In theory, such transactions can carry arbitrarily encoded data chunks. Transactions using non-standard output scripts can carry up to 96.72 KiB at comparably low costs. However, they are inefficient as miners ignore them with high probability. Yet, non-standard output scripts occasionally enter the blockchain if miners insufficiently check them (cf. Section 4.2).
In contrast, non-standard input scripts are only required to match their respective output script. Hence, input scripts can be altered to carry arbitrary data if their semantics are not changed, e.g., by using dead conditional branches. This makes non-standard input scripts slightly better suited for large-scale content insertion than non-standard output scripts.
Standard Financial Transactions. Even standard financial transactions can be (mis)used to insert data using mutable values of output scripts.
There are four approved templates for standard financial transactions: Pay to public-key (P2PK) and pay to public-key hash (P2PKH) transactions send currency to a dedicated receiver, identified by an address derived from her private key, which is required to spend any funds received [48].
Similarly, multi-signature (P2MS) transactions require m out of n private keys to authorize payments. Pay to script hash (P2SH) transactions refer to a script instead of keys to enable complex spending conditions [48], e.g., to replace P2MS [10]. The respective public keys (P2PK, P2MS) and script hash values (P2PKH, P2SH) can be replaced with arbitrary data as Bitcoin peers cannot verify their correctness before they are referenced by a subsequent input script. While this method can store large amounts of content, it involves significant costs: In addition to transaction fees, the user must burn bitcoins as she replaces valid receiver identifiers with arbitrary data (i.e., invalid receiver identities), making the output unspendable.
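The replacement of hash values with arbitrary data can be sketched for the P2PKH case as follows. The script layout (OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG) is the standard P2PKH template; the helper names and the zero padding of the final chunk are assumptions made for illustration and do not describe any particular insertion tool.

```python
from typing import List

CHUNK_SIZE = 20  # a P2PKH output script holds one 20-byte hash value


def p2pkh_script(hash160: bytes) -> bytes:
    """Standard P2PKH output script around a 20-byte value."""
    if len(hash160) != CHUNK_SIZE:
        raise ValueError("P2PKH expects exactly 20 bytes")
    # OP_DUP OP_HASH160 <push 20> ... OP_EQUALVERIFY OP_CHECKSIG
    return bytes([0x76, 0xA9, 0x14]) + hash160 + bytes([0x88, 0xAC])


def embed_in_p2pkh(data: bytes) -> List[bytes]:
    """Split data into 20-byte chunks and place each where the hash belongs."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    # Assumption: the last chunk is padded with zero bytes to 20 B.
    return [p2pkh_script(chunk.ljust(CHUNK_SIZE, b"\x00")) for chunk in chunks]


scripts = embed_in_p2pkh(b"arbitrary non-financial data stored on-chain")
print(len(scripts), "output scripts of", len(scripts[0]), "bytes each")
```

Peers accept such outputs because the 20 bytes are indistinguishable from a real public-key hash. Since no key pair matching that "hash" is known, the funds sent to each output are effectively burned.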
Using multiple outputs enables PpTs ranging from 57.34 KiB (P2PKH) to 96.70 KiB (P2SH inputs) at CpBs from 1.03 ct to 1.87 ct. As they behave similarly w.r.t. data insertion, we collectively refer to all standard financial transactions as P2X in the following. P2SH scripts also allow for efficient data insertion into input scripts as P2SH input scripts are published with their redeem script. Due to miners’ verification of P2SH transactions, transactions are not discarded if the redeem script is not template-compliant (but the overall P2SH transaction is).
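The P2PKH payload figure can be approximated with a back-of-the-envelope calculation. This is a reconstruction under stated assumptions, not the authors' published methodology. It assumes 34 B per P2PKH output (8 B value, 1 B script length, 25 B script), of which 20 B are payload, and it assumes about 160 B of fixed transaction overhead for the version, one signed input, the counters, and the lock time.

```python
MAX_STANDARD_TX_BYTES = 100_000   # standard-size limit stated in the text
P2PKH_OUTPUT_BYTES = 34           # 8 B value + 1 B script length + 25 B script
PAYLOAD_PER_OUTPUT = 20           # bytes of arbitrary data per output
ASSUMED_OVERHEAD_BYTES = 160      # assumption: fixed fields plus one signed input

FEE_SAT_PER_BYTE = 215            # as quoted in the text
BURN_SAT_PER_OUTPUT = 546         # as quoted in the text
USD_PER_BTC = 4748.25             # as quoted in the text

# How many data-carrying outputs fit into one standard-sized transaction?
outputs = (MAX_STANDARD_TX_BYTES - ASSUMED_OVERHEAD_BYTES) // P2PKH_OUTPUT_BYTES
payload_bytes = outputs * PAYLOAD_PER_OUTPUT
tx_bytes = ASSUMED_OVERHEAD_BYTES + outputs * P2PKH_OUTPUT_BYTES

# Total cost: fees on every transaction byte plus the amount burned per output.
cost_satoshi = tx_bytes * FEE_SAT_PER_BYTE + outputs * BURN_SAT_PER_OUTPUT
cents_per_payload_byte = cost_satoshi / payload_bytes / 10**8 * USD_PER_BTC * 100

print(f"payload per transaction: {payload_bytes / 1024:.2f} KiB")
print(f"cost per payload byte:   {cents_per_payload_byte:.2f} ct")
```

Under these assumptions the sketch yields roughly 57.34 KiB at about 1.87 ct per Byte, which is in line with the P2PKH payload and the upper end of the CpB range quoted above. A different overhead assumption shifts the result slightly.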
We now survey different services that systematically leverage the discussed data insertion methods to add larger amounts of content to the blockchain.
2.2 Content Insertion Services
Content insertion services rely on the low-level data insertion methods to add content, i.e., files such as documents or images, to the blockchain. We identify four conceptually different content insertion services and present their protocols.
CryptoGraffiti. This web-based service [30] reads and writes messages and files from and to Bitcoin’s blockchain. It adds content via multiple P2PKH output scripts within a single transaction, storing up to 60 KiB of content. To retrieve previously added content, CryptoGraffiti scans for transactions that either consist of at least 90 % printable characters or contain an image file.
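The 90 % printable-character heuristic can be sketched as a simple filter over the bytes extracted from a transaction's outputs. The definition of "printable" used here (Python's `string.printable`, i.e., ASCII letters, digits, punctuation, and whitespace) and the function name are assumptions; the service's actual character set and its image detection are not specified in the text and are omitted.

```python
import string

# Assumption: "printable" means printable ASCII including whitespace.
PRINTABLE_BYTES = set(string.printable.encode("ascii"))


def looks_like_text(payload: bytes, threshold: float = 0.9) -> bool:
    """Return True if at least `threshold` of the bytes are printable."""
    if not payload:
        return False
    printable = sum(1 for byte in payload if byte in PRINTABLE_BYTES)
    return printable / len(payload) >= threshold


print(looks_like_text(b"Hello, blockchain!"))   # True
print(looks_like_text(bytes(range(20))))        # False
```

A genuine public-key hash is effectively random, so only about a third of its bytes fall into the printable range on average, which is why such a threshold separates inserted text from ordinary payments reasonably well.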
Satoshi Uploader. The Satoshi Uploader [56] inserts content using a single transaction with multiple P2X outputs. The inserted data is stored together with a length field and a CRC32 checksum to ease decoding of the content.
P2SH Injectors. Several services [35] insert content via slightly varying P2SH input scripts. They store chunks of a file in P2SH input scripts. To ensure file integrity, the P2SH redeem scripts contain and verify hash values of each chunk.
Apertus. This service [29] allows fragmenting content over multiple transactions using an arbitrary number of P2PKH output scripts. Subsequently, these fragments are referenced in an archive stored on the blockchain, which is used to retrieve and reassemble the fragments. The chosen encoding optionally allows augmenting content with a comment, file name, or digital signature.
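Returning to the Satoshi Uploader's framing, the role of the length field and CRC32 checksum can be sketched as follows. The exact layout used here (4-byte little-endian length, then 4-byte little-endian CRC32, then the data) and the function names are assumptions made for illustration; the text only states that a length field and a CRC32 checksum accompany the data.

```python
import struct
import zlib

HEADER = struct.Struct("<II")  # assumed layout: length, then CRC32


def frame(data: bytes) -> bytes:
    """Prefix data with its length and CRC32 so a decoder can find and check it."""
    return HEADER.pack(len(data), zlib.crc32(data) & 0xFFFFFFFF) + data


def unframe(blob: bytes) -> bytes:
    """Recover the data from concatenated output payloads, ignoring padding."""
    length, checksum = HEADER.unpack(blob[:HEADER.size])
    data = blob[HEADER.size:HEADER.size + length]
    if len(data) != length or zlib.crc32(data) & 0xFFFFFFFF != checksum:
        raise ValueError("length or checksum mismatch: not a valid upload")
    return data


# Payloads are padded to fill the last output, so the decoder must cope with
# trailing bytes that are not part of the file.
blob = frame(b"example file contents") + b"\x00" * 7
print(unframe(blob))  # b'example file contents'
```

The length field tells the decoder where the file ends inside the fixed-size output slots, and the checksum lets it reject transactions whose outputs merely happen to resemble a header.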
To conclude, Bitcoin offers various options to insert arbitrary, non-financial data. These options range from small-scale data insertion methods exclusive to active miners to services that allow any user to store files of arbitrary length. This wide spectrum of options for data insertion raises the question of which benefits and risks arise from storing content on Bitcoin’s blockchain.