# Burning Bitcoins for Censorship Resistance

15835 words (63 pages) Dissertation Proposal in Information Technology Proposal

Disclaimer: This work has been submitted by a student. This is not an example of the work produced by our Dissertation Proposal Service. You can view samples of our professional work here.

Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UK Essays.

BURNING BITCOINS FOR CENSORSHIP RESISTANCE

iii

Page

LIST OF TABLES : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : v

LIST OF FIGURES : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : vi

ABSTRACT : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : vii

CHAPTER 1. INTRODUCTION : : : : : : : : : : : : : : : : : : : : : : : : 1

1.1 Scope : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 4

1.2 Contribution : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 4

CHAPTER 2. BACKGROUND AND RELEVANT LITERATURE : : : : : 6

2.1 Existing censorship circumvention techniques : : : : : : : : : : : : : 6

2.1.1 DynaWeb : : : : : : : : : : : : : : : : : : : : : : : : : : : : 7

2.1.2 UltraReach : : : : : : : : : : : : : : : : : : : : : : : : : : : 8

2.1.3 Message in the bottle : : : : : : : : : : : : : : : : : : : : : : 9

2.1.4 Tor : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 10

2.2 Public Key Steganography : : : : : : : : : : : : : : : : : : : : : : : 12

2.3 Bootstraping Mechanism : : : : : : : : : : : : : : : : : : : : : : : : 14

2.3.1 TELEX : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 14

2.3.2 Cirripedde : : : : : : : : : : : : : : : : : : : : : : : : : : : : 15

2.4 Bitcoin Network : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 17

2.4.1 Bitcoin Blockchain : : : : : : : : : : : : : : : : : : : : : : : 17

2.4.2 Bitcoin Transactions : : : : : : : : : : : : : : : : : : : : : : 19

2.5 Cryptography Assumptions : : : : : : : : : : : : : : : : : : : : : : 25

2.5.1 Cryptographic Hash Function : : : : : : : : : : : : : : : : : 26

2.5.2 Di_e-Hellman Assumption : : : : : : : : : : : : : : : : : : : 26

CHAPTER 3. CONCEPT : : : : : : : : : : : : : : : : : : : : : : : : : : : : 28

3.1 System Model : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 28

3.2 Threat Model : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 28

3.3 Goals : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 29

3.4 Design : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 30

CHAPTER 4. FRAMEWORK AND METHODOLOGY : : : : : : : : : : : 32

4.1 Storing Data in Bitcoin Transaction : : : : : : : : : : : : : : : : : : 32

4.2 Encryption Scheme : : : : : : : : : : : : : : : : : : : : : : : : : : : 35

4.3 Implementation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 36

CHAPTER 5. SECURITY ANALYSIS : : : : : : : : : : : : : : : : : : : : : 41

iv

Page

CHAPTER 6. FUTURE WORK : : : : : : : : : : : : : : : : : : : : : : : : 43

6.1 Associating with existing anti-censorship systems : : : : : : : : : : 43

6.2 Other crypto-currency systems : : : : : : : : : : : : : : : : : : : : : 44

CHAPTER 7. SUMMARY : : : : : : : : : : : : : : : : : : : : : : : : : : : 46

BIBLIOGRAPHY : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 47

v

LIST OF TABLES

2.1 List of opcodes [53] : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 22

4.1 MultiSig Transaction: Parameter 2 [43] : : : : : : : : : : : : : : : : : : 33

4.2 MultiSig Transaction: Parameter 1 [43] : : : : : : : : : : : : : : : : : : 34

4.3 Comparison between data storing/hiding techniques : : : : : : : : : : : 37

vi

LIST OF FIGURES

1.1 Simple illustration of Censorship Circumvention : : : : : : : : : : : : : 2

2.1 Bitcoin Blockchain : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 18

2.2 Bitcoin Transaction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 21

3.1 Protocol design : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 30

4.1 Transaction rate per second : : : : : : : : : : : : : : : : : : : : : : : : 39

vii

ABSTRACT

This thesis proposes a new technique to circumvent Internet censorship by using the bitcoin network. It provides a bootstrapping mechanism for existing circumvention tools, that enable access to the restricted information and communication over the Internet, at the cost of “burning” bitcoins. It deploys the concept of public key steganography to hide data in the bitcoin transactions and make them look as legitimate transactions hence, making it impossible for the censor to detect. It aims at designing an e_cient cryptographic scheme to store messages in the bitcoin blockchain with a strong focus on space e_ciency and implement the scheme to design protocol for steganographic communication.

1

CHAPTER 1. INTRODUCTION

The Article 19 of the Universal Declaration of Human rights adopted by the

United Nations General Assembly in the year 1948 states that – \Everyone has the

right to freedom of opinion and expression; this right includes freedom to hold

opinions without interference and to seek, receive and impart information and ideas

through any media and regardless of frontiers” [1]. Today, the invasion of Internet

over the information landscape has completely changed the way we seek, receive and

impart information. And with this growth of Internet there have been attempts

made to control the use of it. The authority who controls the Internet and the

censorship of its content can be wide ranging driven by various factors including

social, political or economic. A prominent example of political inuence is the

censorship of Internet in various countries of the Arab League after the events of

Arab Spring [2]. Meanwhile, China has been a strong propagator of Internet

censorship on its citizens since 1994 and had deployed a very e_ective tool called the

“Great Firewall of China” [3] for that purpose. Free exchange of information and

ideas through Internet is also a_ected by the religious ideologies of various

authorities. For example, in Saudi Arabia all Internet tra_c _rst goes through a

government run _lter which, according to the country’s Internet Services Unit,

blocks all material of an “o_ensive or harmful nature to the society, and which

violate the tenants of the religion or societal norms” [4]. Apart from this, blocking

various tools and applications of the Internet that can be used to assist users in

accessing and sharing sensitive material is common in many countries [5], this

and many more.

Despite all these restrictions imposed on the free use of Internet, there exist

numerous technologies that can be used under di_erent circumstances to circumvent

2

Figure 1.1. Simple illustration of Censorship Circumvention

The _gures shows the various components involved the process of censorship

circumvention.There is a user inside the censored network which is trying to access a

blocked destination with the help of a forwarder.

the censorship. Some of the simple and trivial methods include { cached pages,

mirror and archive sites, Web to E{mail services, RSS aggregators, Alternate IP

addresses and Domain names, Proxy websites, VPNs, etc.. Other forms of

sophisticated and more complex tools for circumventing Internet censorship include

encrypted tunnels and proxies, such as Dynaweb [6], Ultrasurf [7] and Tor [8].

Figure 1.1 gives a simple demonstration of how a censorship circumvention works.

Here, we have a user whose tra_c is being monitored and blocked by a censor. The

user attempts to access a blocked destination by _rst establishing a connection with

a forwarder, which is not blocked by the censor and can be accessed by the user.

While these designs can be quite e_ective at sneaking client connections past

the censor, these systems inevitably lead to a cat-and-mouse game in which the

censor attempts to discover and block these services [9]. A classic example of this

problem is the conict between TOR and the Great Firewall of China, which in

numerous ways, appropriately represents the larger world of censorship and

circumvention. A series of counter actions began when Tor- a tool initially meant to

3

be used for providing anonymity, became an e_ective tool for circumvention; And as

a result, it was being used to evade the Great Firewall; The _rewall in return,

blocked the Tor website and the servers that were part of its network. Tor

responded by using mirrors [10], bridges [11] and secret entry servers [12]; the

_rewall went a step ahead and started identifying Tor by protocol features. Tor

deployed protocol encapsulation; the _rewall started active scanning for Tor servers

to which Tor responded by introducing protocols immune to scanning [13].

To overcome the above mentioned problem, this thesis proposes a technique

to circumvent Internet censorship using the Bitcoin Network and public key

steganography which is based on providing a bootstrapping mechanism for existing

circumvention techniques. A similar technique has been previously used by a

circumvention tool called Telex [9]. Telex operates as an end-to-middle proxy that

has no IP address and is located withing the network infrastructure,e.g. in an ISP,

and uses the public key steganography to bypass the censor. Bitcoin maintains a

public append-only log called the block chain, which is essentially stored by every

full Bitcoin node in the world. This makes transactions in the Bitcoin blockchain

network comes with (economic) cost for the censor. Furthermore, data written to

the blockchain cannot be removed, because it is stored and replicated by a lot of

Bitcoin nodes.

This thesis proposes to design a cryptographic scheme that can be used to

store messages in the bitcoin blockchain. Keeping in mind the lack of space to store

data in the block chain, this scheme will give a strong emphasis on space-e_ciency.

Since the transaction involve actual money and cost fees, this scheme will also

inculcate an interesting trade-o_, i.e. burning bitcoins to get additional storage per

transaction. The fact that the bitcoin transactions use the secp256k1 [14] signatures

that are made by using ECDSA [15] cryptographic formula will help this scheme to

be further used to design protocols for steganographic communication over the

blockchain. This steganographic communication can be used in particular for

4

systems so that they cannot be detected by the censor and do not come under under

the threat of getting blocked.

1.1 Scope

This thesis proposes to design a cryptographic scheme that can be used to

store messages in the bitcoin blockchain. Keeping in mind the lack of space to store

data in the block chain, this scheme will give a strong emphasis on space-e_ciency.

Since the transaction involve actual money and cost fees, this scheme will also

inculcate an interesting trade-o_, i.e. burning bitcoins to get additional storage per

transaction. The fact that the bitcoin transactions use the secp256k1 [14] signatures

that are made by using ECDSA [15] cryptographic formula will help this scheme to

be further used to design protocols for steganographic communication over the

blockchain. This steganographic communication can be used in particular for

systems so that they cannot be detected by the censor and do not come under under

the threat of getting blocked.

1.2 Contribution

Michael Carl Tschantz, Sadia Afroz and Vern Paxson in their work

SoK:Towards Grounding Censorship Circumvention in Empiricism claim

that”system designers tend to emphasize censor capabilities that may become

important in the future, but not seen in practice today, with little assessment on

actual detection techniques used by current censors” [13]. Based on their research it

can be concluded that the attacks carried out by real censors target the process of

discovering and setting up channels , however, researchers tend to align their

5

research on channel usage. Also, the means adapted by the real sensors include

cheap passive monitoring or active probing, contrary to it, the researchers have been

digging deep into complex monitoring and tra_c manipulations. In simple words,

censors(almost) only try to block identity distribution (e.g. bridge addresses, public

keys)and connection establishment but academic work often focuses on all the other

complicated aspects. This indeed provides a very strong motivation for our project

as our bootstrapping mechanism aims at providing a very e_ective solution to the

problem of setting up a channel for censorship free exchange of information over the

Internet.

6

CHAPTER 2. BACKGROUND AND RELEVANT LITERATURE

2.1 Existing censorship circumvention techniques

There is a list of 7 dimensions for an anti-censorship de_ned in ‘A Taxonomy

of Internet Censorship and Anti Censorship’ [16]. These include:

_ Availability: The anti-censorship technology should be readily accessible to

the users.

_ User friendliness: This is an often under-explored dimension [16] as many a

times we tend to ignore the large population of users that are not

technology-savvy.

_ Veri_ability: the user should be able to verify that the software that they are

using is not a monitoring tool.

_ Scope: modes of communication that are being covered.

_ Security: The success of an anti-censorship technology is highly dependent on

the level of security it provides.

_ Deniability: the user should be able to deny his involvement, if caught.

_ Performance: throughput and delay caused by the use of this technology.

Apart from these seven dimensions, the goals of an anti-censorship

technology should include unobservability and undetectability. Undetectability of

an item of interest (IOI) from an attacker’s perspective means that the attacker

cannot su_ciently distinguish whether it exists or not [17]. Unobservability of an

item of interest (IOI) means:

7

_ Undetectability of the IOI against all subjects uninvolved in it and

_ Anonymity of the subject(s) involved in the IOI even against the other

subject(s) involved in that IOI [17].

Anonymity of a subject from an attacker’s perspective means that the attacker

cannot su_ciently identify the subject within a set of subjects, the anonymity

set [17].

In this Section, we discuss some of the most prominent tools that are build

upon these dimensions and goals and are being used to circumvent Internet

censorship. These include Dynaweb, UltraReach, Tor and TELEX. We look at how

these existing tools work and what are their advantages and disadvantages over each

other.

2.1.1 DynaWeb

DynaWeb FreeGate is an HTTP proxying tool designed speci_cally for

circumvention use in China. It is a product of Dynamic Internet Technology Inc.

(DIT). DIT was founded originally in 2001 to provide email delivery services to

China for U.S. government agencies and NGOs. In 2002, DIT started to provide

anti-censorship services under the framework of DynaWeb and became a top

contender of the Great Firewall penetration e_ort [18]. The DynaWeb project hosts

its own collection of proxy servers and uses a variety of proprietary methods to keep

its proxy servers unblocked in the face of aggressive, ongoing attempts to block

them. DynaWeb FreeGate provides an easy to use interface that successfully

bypasses most types of _ltering in the face of active blocking attempts. Today

DynaWeb o_ers the widest range of options for users to access Internet freely, and

supports more than 50 million web hits per day on average from Chinese users

alone [18]. It is a web-based portal and requires no use of any software, nor any

changes to the settings on the user’s system. The DynaWeb needs to work in a very

dynamic way in order to avoid the interference by Chinese authorities. They keep a

8

continuous watch on the Dynaweb’s portal websites and block them as soon as they

are identi_ed. DynaWeb also tackles the issue of IP blocking and DNS hijacking by

maintaining hundreds of mirror sites at any time, with each one of them having a

di_erent IP and DNS domain name. To keep users connected to such a dynamic

infrastructure, DynaWeb has a variety of channels to keep users updated. For

example, a user can send a message to one of DynaWeb’s instant messenger (IM)

portals. Similar things are being done with emails. By these many, dynamic

channels, DynaWeb outsmarts any attempt to collect all DynaWeb addresses by the

censors, because each user receives only a (di_erent) subset of DynaWeb’s

addresses [18]. Having said that, DynaWeb is de_nelty not the perfect tools

available to us as it has a very poor performance for image heavy sites but okay for

simpler text oriented sites. Most importantly, the DynaWeb tool uses an error prone

encryption mechanism that fails to encrypt some of its tra_c, allowing snooping by

the network and making the tool vulnerable to keyword block _lters. [19]

2.1.2 UltraReach

UltraReach has been developing anticensorship technologies since the year

2002. It uses its unique Global Internet Freedom platform to expand its o_erings

and its anticensorship tool is called UltraSurf. UltraSurf also uses HTTP proxying

and is designed speci_cally for circumvention use in China. More commonly known

by its Chinese name ‘Wujie’- which means borderless, it has become very popular

among the Internet users in China. UltraSurf provides an easy to use interface that

is able to access all content in the face of active blocking attempts. UltraSurf’s

performance is the best of any tool tested in _ltering countries, the only tool to

display okay speed for both image heavy and simple, text oriented sites [19].

UltraSurf has its own criteria for blocking o_ensive material and for that purpose

uses its own _lters. UltraSurf can be recommended for widespread use as the best

9

performing of all the tested tools, though users concerned about anonymity should

be warned to disable browser support for active content.

UltraSurf has evolved from an enduring conict between GreatFirewall and

UltraReach and is considered to be one of the most robust tools available today.

The freely available software has been analyzed, mutilated and spoofed, and the

supporting network infrastructure has been constantly attacked. Without doubt,

these facTors have accelerated UltraSurf reaching its level of sophistication and

fame [18]. The latest version of UtraSurf, deploys a complex proxy system which

provides thorough transparency and uses a high level encryption but can only be

used on Microsoft Internet Explorer Platform. It has strong support for load

balancing and fault tolerance, and it even employs a decoying mechanism to thwart

any tracing e_ort of its communication with its infrastructure. [18]

2.1.3 Message in the bottle

“Message In A Bottle: Sailing Past Censorship” [20] follows a very di_erent

approach to circumvent Internet censorship. If A is a user inside the censored

network and B is the blocked destination it wants to communicate to, then the

protocol is designed in such a way that A knows the least about B and there is no

rendezvous point between them. This is done to prevent the censor from discovering

the entry points, which is a very fundamental problem faced by existing

censorship-resistant tools. The protocol uses the concepts of blog pings [21] and

digital photo metadata [22] to circumvent censorship.

Blog ping is a mechanism which is used to notify the server of updated or

new content on a blog. These pings are messages sent to the centralized network

server from a blog. Introduced in 2001, its popularity has been on a rise and had

lead to the development of various protocols that compete for performance and

scalability [20]. Some of these include FriendFeed’s SUP [23] , Google’s

PubSubHubbub [24] and rssCloud [25]. Blog pings provide an e_ective way for the

10

search engines to index new content in real time. Hence, the use of pings helps the

blog writers increase their exposure. As a result, almost all the blogging platforms

support this mechanism. Digital Photo Metadata is information that Digital

cameras have embedded within each photo. This metadata keeps records of a

variety of information, including the condition of the shot, GPS coordinates, user

components, The message in the bottle (MIAB) protocol works as follows [20]:

_ A authors a blog post of arbitrary content, trying to keep the content as

innocuous as possible.

_ A snaps one or more photos to include in the post.

_ A uses the MIAB client software to embed a message M into the photos. The

message is hidden using shared-key steganography [26]. The shared key is

identi_ed as Ks and is chosen arbitrarily by A.

_ Ks is encrypted with the public key of B Kp. This cipher-text is hidden in the

photos’ metadata, using tags that appear inconspicuous to the Censor.

_ A publishes the blog post, containing the processed photos. Alice can choose

the blog arbitrarily, provided it supports blog pings.

_ The blog emits a ping to some ping servers.

_ Meanwhile, B is moniToring some of the ping servers, looking for

steganographic content encrypted with his public key. Within minutes, he

discovers A’s post, and decrypts the message.

_ Bob reads the message, and acts upon its content.

2.1.4 Tor

Tor is an anonymity tool that uses randomized, encrypted re-routing to hide

the combination of the source and destination of each data packet from all parts of

11

the network, including the individual proxies. Tor randomly assigns a set of routers

to each connection. No single Tor proxy knows the origin, the destination, or even

what other proxies are involved in the network other than the ones immediately

connected to it. Even though the primary purpose of the tool is to provide

anonymity for the user, the tool also is an e_ective circumvention tool, since hiding

of source and destination and encrypting the content e_ectively prevents _ltering.

The Tor re-routing servers are hosted by volunteers. Anyone can host a Tor server {

the anonymity of the connection relies not on the trustworthiness of the individual

servers but rather on the network design, which prevents a given router from

knowing both the origin and the destination or even which other routers it would

need to cooperate with to get that information. Tor provides an easy to use

interface that allows the user to bypass all tested forms of _ltering, though the

performance for international connections is very poor for both image intensive and

simpler text oriented sites [19]. Tor provides strong anonymity only if the user is

careful to submit data to HTTPS protected servers. Tor can be recommended for

These tools have so far managed to stay ahead of the sensors. However, there is a

worry that such systems will not be able to withstand continued research and

development on the part of censors (e.g., Sybil attacks for proxy IP discovery) [9].

The Great Firewall of China has been making continuous e_orts to stop Tor

services in China. It took the _rst action against Tor in the year 2008, when it

blocked the Tor website- “www.Torproject.org” [27]. In late 2009, the Great

Firewall also blocked the IP addresses of direcTory authorities and relays [28] by

using very simple measures – _rstly, download the list of all relays, which is publicly

available, and add their addresses to a blacklist and then, dynamically block any

responded to this by introducing Bridges [11] and Pluggable tranports [29]. Tor

Bridges are Tor relays that have not been listed in the public direcTory. The ISP

will mostly target and _lter the connection to all the known Tor relays and probably

12

won’t be able block the bridges as these are not listed in a public direcTory. A

typical bridge entry consists of three parts: IP address of the bridge, Port number

on which it is operating, and a unique identi_er of the bridge, often known as the

_ngerprint of the bridge. Even though Tor bridges have provided a good alternative

when the Tor relays have been blocked by the censor. But over the last few years,

the continuous research and e_orts on the part of censors have enabled them to _nd

new ways to block the bridges as well. They usually do this by installing special

boxes at ISPs that peek into network tra_c, often called deep Packet

Inspection [30], and detect Tor; when Tor is detected they block the tra_c ow [11].

The Great Firewall went after the bridges, enumerating them from the centralized

bridge database one by one [31]. In order to tackle suck level of sophisticated

censorship, Tor went a step ahead and started manipulating all the tra_c between

the user and the _rst hop by using Pluggable Transports. These manipulate the

tra_c in such a way that it is not identi_ed as Tor connection and is able to trick

the censor/ISP. As a result, the censor cannot decide if the tra_c belongs to a Tor

connection and is less likely to block it. However, even the use of pluggable

transports is not the perfect solution as these are also vulnerable to detection. In

the past, obfs and obfs2 were promoted as safe transports but these are now

deprecated and were replaced by obfs3, scramblesuit, fte, and obfs4 [11].

Sadly, pluggable transports are not immune to detection, if a censor is given

enough time. In the past, obfs and obfs2 were promoted as safe transports. These

are now deprecated and were replaced by obfs3, scramblesuit, fte, and obfs4.

2.2 Public Key Steganography

Steganography refers to the problem of sending messages hidden in \innocent

looking” communications over a public channel so that an adversary eavesdropping

on the channel cannot even detect the presence of the hidden messages [32].

13

De_nition 2.2.1 Public-key Steganography: A public-key stegosystem is a triple of

probabilistic algorithms S = (SG, SE, SD). SG(1k) generates a key pair (PK, SK) 2

PK x SK. SE takes a (public) key PK 2 PK, a string m 2 f0; 1g(the hiddentext),

and a message history h. SE also has access to a channel oracle for some channel C,

which can sample from Ch for any h. SE(PK, m, h) returns a sequence of documents

s1; s2; :::; sl (the stegotext) from the support of Clh

. SD takes a (secret) key SK 2 SK,

a sequence of documentss1; s2; :::; sl, and a message history h, and returns a

hiddentext m. Additionally, for every polynomial p there must exist a negligible _

such that

8m 2 f0; 1gp(k) : Pr(PK;SK) SG(1k)[SD(SK; SE(PK; m; h); h) = m] _ 1 _(k),

where the randomization is also over any coin tosses of SE, SD, SG and the oracle

to Ch.

The de_nition above is the _rst important contribution in this _eld, which can be

traced back to the model for steganography by Cachin [33] and some similar

models [34{36]. However, these frameworks were restricted in a way that performing

steganography between two parties was not possible without a shared key. Hopper,

Langford, and von Ahn [37] came up with a theoretical framework that was based

on computational security that addressed only the shared-key settings. Later

Hopper and von Ahn [32] came up with a formal framework to prove that public key

steganography is practically possible. They show that the public key steganographic

key exchange is possible under the Decisional Di_e-Hellman [38] assumption and

provide a protocol that is secure in the random oracle model against adversaries

that have access to a decoding oracle. The de_nition given by them is stated below.

De_nition 2.2.2 Stegosystem: A steganographic protocol, or stegosystem, is a pair

of probabilistic algorithms S = (SE, SD). SE takes a key K 2 f0; 1gk, a string m

2 f0; 1g* (the hiddentext), a message history h, and an oracle M(h) which samples

blocks according to a channel distribution Cbh

. SEM(K, m, h) returns a sequence of

blocks c1 k c2 k ::: k cl (the stegotext) from the support of Cl_b

h . SD takes a key K, a

14

sequence of blocks c1 k c2 k ::: k cl, a message history h, and an oracle M(h), and

returns a hiddentext m. There must be a polynomial p(k) > k such that SEM and

SDM also satisfy the relationship:

8m; jmj < p(k) : Pr(SDM(K; SEM(K; m; h); h) = m) _ 2

3 ,

where the randomization is over any coin tosses of SEM; SDM; and M.

2.3 Bootstraping Mechanism

As mentioned earlier, our scheme aims to provide a bootstrapping

mechanism for anti-censorship systems. Here we have a look at two systems –

Telex [9] and Cirripede [39] which use the concept of stegnography and provide an

e_ective mechanism to allow access to blocked websites by working at the ISP level.

2.3.1 TELEX

Telex operates in the network infrastructure at any ISP between the censor’s

network and non-blocked portions of the Internet. Unlike following the typical

\end-to-end” approach, it is an \end-to-middle” proxy with no IP address, located

within the network infrastructure. Clients invoke the proxy by using public-key

steganography to \tag” otherwise ordinary sessions destined for uncensored

websites [9]. Telex applies this idea to a widely used encrypted transport, such as

TLS, and carefully avoiding observable deviations from the behavior of non-proxied

connections and can construct a service that allows users to robustly bypass

network-level censorship without being detected. To test functionality of Telex on a

real censor’s network, a Telex client was ran on a PlanetLab [40] node located in

Beijing and attempted connections to each of the Alexa top 100 websites [41] using

the model Telex station located at the University of Michigan. As a control, _rst

these sites were loaded without using Telex and apparent censorship behavior for 17

of them was noted, including 4 from the top 10: facebook.com, youtube.com,

15

blogspot.com and twitter.com. The blocking techniques that were observed included

forged RST packets, false DNS results, and destination IP black holes, which are

consistent with previous _ndings [42] . Telex was able to load all 100 sites [9].

2.3.2 Cirripedde

Cirripede is an anti-censorship system that is used for unobservable

communication with the blocked destinations. Similar to Telex, it is also deployed

at the ISPs. It works by intercepting the connections to the allowed destinations,

and then redirecting it to the blocked destination requested by the user. The

concept of steganograhy is used here as well and the data is encoded in such a way

that it is impossible for the censor to distinguish these communications from normal

ones, without having the master secret key. The use of public key cryptography also

ensures that that there is no need to share any secret information with the

Cirripede user. It is a scalable system, which is designed to handle large volumes of

tra_c and at the same time ensuring minimal overhead on ISPs and not disrupting

the day to day tra_c. This allows Cirripede proxies to be strategically deployed at

central locations, making access to Cirripede very di_cult to block. The following

are the main components of the Cirripede anti-censorship system [39]:

1. Client(C): The user who wants to access a covert destination(CD).

2. Warden ISP: Network provider hosting C.

5. participating ISP: an ISP who participates in Cirripede by deecting network

tra_c to Cirripede servers. This ISP must be on the (forward) network path

from C to OD.

16

6. Cirripede registration server (RS): it is used for registering clients who wish to

avail Cirripede service.

7. Cirripede service proxy (SP): : a server that connects with CD and proxies

communications with it over the overt communication stream between C and

OD.

8. Deecting router (DR): an Internet router, owned by a participating ISP, that

deects tra_c from a registered client C to the service proxy SP.

The entire process that Cirripede follows consists of three major operations [39]:

1. Client registration: Before a client starts using Cirripede, he need to register

_rst with Cirripede. The client uses a covert channel in TCP headers to show

its willingness to register and establish a secret key. This is done because the

tra_c is constantly monitored by the warden. The key generated is shared

with Cirripede and is used later for communication. DR mirrors part of all

TCP tra_c it sees to the registration server RS, which detects the covert

registration message and instructs DR to redirect HTTPS tra_c from C to

the service proxy SP.

2. Cover connection:C makes a TCP connection with the overt destination OD,

which is deected by DR to SP.

3. Covert communication: C performs a TLS handshake with OD, which SP

interposes upon. After the handshake, SP disconnects from OD and takes over

the TLS connection, which is used to tunnel communication between C and

CD.

17

2.4 Bitcoin Network

In this section, we have a look at how the bitcoin network operates. We shed

a light on two of its most important components- the bitcoin blockchain and the

bitcoin transactions. Out of the two, the later one is of utmost signi_cance for us.

2.4.1 Bitcoin Blockchain

The blockchain is Bitcoin’s public ledger, which is record of transactions

maintained in an orderly manner with timestamps. Each full node in the Bitcoin

network independently stores a block chain containing only blocks validated by that

node. The nodes are considered to be in consensus all of them have the same blocks

in their block chain, The validation rules these nodes follow to maintain consensus

are called consensus rules. This section describes many of the consensus rules used

by Bitcoin Core. Every block must include one or more transactions. The _rst one

of these transactions must be a coinbase transaction, also called a generation

transaction, which should collect and spend the block reward (comprised of a block

subsidy and any transaction fees paid by transactions included in this block). All

transactions, including the coinbase transaction, are encoded into blocks in binary

raw transaction format. The raw transaction format is hashed to create the

transaction identi_er (txid). From these txids, a merkle tree is constructed by

pairing each txid with one other txid and then hashing them together.If there are an

odd number of txids, the txid without a partner is hashed with a copy of itself.The

resulting hashes themselves are each paired with one other hash and hashed

together. Any hash without a partner is hashed with itself. The process repeats

until only one hash remains, the merkle root [43].A Merkle tree is a hash based data

structure that is a generalization of the hash list. It is a tree structure in which each

leaf node is a hash of a block of data, and each non-leaf node is a hash of its

children. Typically, Merkle trees have a branching factor of 2, meaning that each

node has up to 2 children [44]. In case of blockchain, the leaves are almost always

18

Figure 2.1. Bitcoin Blockchain

The _gure illustrates a simpli_ed version of a block chain. A block of one or more

new transactions is collected into the transaction data part of a block. Copies of each

transaction are hashed, and the hashes are then paired, hashed, paired again, and

hashed again until a single hash remains, the merkle root of a merkle tree [43]

transactions from a single block. The root node of a merkle tree is called a merkle

root, a descendant of all the hashed pairs in the tree. Block headers must include a

valid merkle root descended from all transactions in that block [45].

Proof-of-work: Bitcoin uses the concept of proof-of-work as a requirement

for adding new sets of transactions to the transaction database. Proof-of-work is

basically data that is di_cult to produce, in terms of cost and time, but easy to

verify. Bitcoin uses the Hashcash [46] proof of work system. Hashcash proofs of

work are used in Bitcoin for block generation. In order for a block to be accepted by

network participants, miners must complete a proof of work which covers all of the

data in the block. The di_culty of this work is adjusted so as to limit the rate at

which new blocks can be generated by the network to one every 10 minutes. Due to

19

the very low probability of successful generation, this makes it unpredictable which

worker computer in the network will be able to generate the next block [47]. The

proof of work used in Bitcoin takes advantage of the apparently random nature of

cryptographic hashes. A good cryptographic hash algorithm converts arbitrary data

into a seemingly-random number. If the data is modi_ed in any way and the hash

re-run, a new seemingly-random number is produced, so there is no way to modify

the data to make the hash number predictable [43]. New blocks will only be added

to the block chain if their hash is at least as challenging as a di_culty value

expected by the consensus protocol.

2.4.2 Bitcoin Transactions

A bitcoin is a chain of signed messages. Each owner transfers the coin to the

next by digitally signing a hash of the previous transaction and the public key of the

next owner and adding these to the end of the coin. A payee can verify the

signatures to verify the chain of ownership [48]. Each transaction is constructed out

of several parts which enable both simple direct payments and complex

transactions. Each transaction has at least one input and one output. Each input

spends the bitcoins paid to a previous output. Each output then waits as an

Unspent Transaction Output (UTXO) until a later input spends it [43]. As shown

in Fig 2.1, a transaction consists of four main parts:

_ Version: There exist di_erent set of rules on which a transaction is validated.

Version is a four-byte number, pre_xed to the transaction, which indicates the

peers which set of rules to follow for that particular transaction. This is useful

in the way thay new rules can be created for future transactions without

invalidating the previous transactions.

_ Inputs: refers to the data that is fed into the transaction. It further has three

di_erent _eld:

20

1. Outpoint: A data structure used to refer to a particular transaction

output, consisting of a 32-byte TXID and a 4-byte output index number

(vout) [49]

2. Signature Script: Data generated by a spender which sets the conditions

that must be ful_lled for the satoshis to be spent. This is always used as

variables to satisfy a pubkey script. Signature Scripts are called scriptSig

in code. [50]

3. Sequence Number: It is a number that was intended to allow an

uncon_rmed time-locked transaction to be updated before being _nalized.

However, currently its only use is to disable locktime in a transaction. [51]

A number intended to allow uncon_rmed time-locked transactions to be

updated before being _nalized; not currently used except to disable

locktime in a transaction

_ Outputs: The ouput of an transaction consists of two _elds:

1. Value _eld: It speci_es the amount of satoshis intended to be transfer via

the transaction.

2. Pubkey Script: A script included in outputs which sets the conditions

that must be ful_lled for those satoshis to be spent. Data for ful_lling the

conditions can be provided in a signature script. Pubkey Scripts are

called a scriptPubKey in code. [52]

_ Locktime: Locktime indicates the earliest time a transaction can be added to

the blockchain. This allows the users to create transactions which are

time-locked and only become valid in the future. Hence, proving the signers a

scope to cancel the transaction if they change their minds.

A user must _rst generate a private/public key pair before a transaction cab

be created. Bitcoin uses the Elliptic Curve Digital Signature Algorithm

(ECDSA) [15] with secp256k1 [14] curve. Secp256k1 private keys are 256 bits of

21

Figure 2.2. Bitcoin Transaction

The _gure above shows the structure of a bitcoin transaction, consisting of four main

parts- version, inputs, outputs and locktime. The outputs of one transactions are

used as inputs for the next transaction. [43]

random data. A copy of that data is deterministically transformed into a secp256k1

public key. Because the transformation can be reliably repeated later, the public

key does not need to be stored [43]. After this, the public key is cryptographically

hashed. There is no need to store this public key hash as well, as it can also be

reliably repeated later. The hash shortens and obfuscates the public key, making

manual transcription easier and providing security against unanticipated problems

which might allow reconstruction of private keys from public key data at some later

point [43]. Public key hashes are always sent encoded as Bitcoin addresses, which

are base58-encoded strings containing an address version number, the hash, and an

error-detection checksum to catch typos. The address can be transmitted through

any medium, including one-way mediums which prevent the spender from

communicating with the receiver, and it can be further encoded into another

format [43].

22

Table 2.1.

List of opcodes [53]

Opcode Description

OP RIPEMD160 The input is hashed using RIPEMD-160.

OP SHA1 The input is hashed using SHA-1.

OP SHA256 The input is hashed using SHA-256.

OP HASH160 The input is hashed twice: _rst with

SHA-256 and then with RIPEMD-160.

OP HASH256 The input is hashed two times with SHA-256.

OP CODESEPARATOR All of the signature checking words will only

match signatures to the data after the most

recently-executed OP CODESEPARATOR.

OP CHECKSIG The entire transaction’s outputs, inputs, and

script are hashed. The signature used by

OP CHECKSIG must be a valid signature

for this hash and public key. If it is, 1 is

returned, 0 otherwise.

OP CHECKSIGVERIFY Same as OP CHECKSIG, but OP VERIFY

is executed afterward.

OP CHECKMULTISIG Compares the _rst signature against each

public key until it _nds an ECDSA match.

OP CHECKMULTISIGVERIFY Same as OP CHECKMULTISIG, but

OP VERIFY is executed afterward.

Bitcoin transactions use a scripting system. Much like Forth [54], these

scripts are also very simple, stack-based and are processed from left to right. A

script is recorded with each and every transaction and is nothing but a list of

instructions that describe how the next person wanting to spend the bitcoins being

transferred can gain access to them [53]. The script included in a bitcoin

transaction restricts the future spending of the bitcoins by asking the spender to

provide the following two things:

1. A public key, which is basically the destination address, after it is hashed.

2. A signature as a proof of the private key corresponding to the public key just

provided.

23

The scripting system incorporated by bitcoin transaction works in a very exible

way, allowing the spender to change the required perimeters(for a successful

transaction) within the script. A transaction is valid if nothing in the combined

script triggers failure and the top stack item is true (non-zero). The party who

originally sent the Bitcoins now being spent, dictates the script operations that will

occur last in order to release them for use in another transaction. The party

wanting to spend them must provide the input(s) to the previously recorded script

that results in those operations occurring last leaving behind true (non-zero) [53]. A

script uses various ‘words’, commonly known as opcodes to perform actions. A list

of commonly used cryto opcodes is presented in table 2.1. Based on the type of

script, there exist four di_erent types of standard bitcoin transaction:

_ Pay-To-Public-Key-Hash (P2PKH): The most commonly used

transaction type is the common form of the standard Pay-To-Public-Key-Hash

(P2PKH) transaction type. When spending a UTXO, a signature script is

created, which is a collection of data parameters which satisfy the conditions

placed in the previous output’s public key script. Public key scripts and

signature scripts then combine secp256k1 public keys and signatures with

conditional logic, creating a programmable authorization mechanism [43]. The

public key hash ,i.e the address always starts with 1. The format of the

transaction is as follows:

Pubkey script: OP DUP OP HASH160 <PubKeyHash> OP EQUALVERIFY

OP CHECKSIG

Signature script: <sig> <pubkey>

_ Pay To Script Hash (P2SH): Instead of sending sending transactions to

public key hash, these allow transactions to be sent to a script hash, which is

similar to public key hash but starts with 3. To spend bitcoins sent via P2SH,

the recipient must provide a script matching the script hash and data which

makes the script evaluate to true. Using P2SH, you can send bitcoins to an

24

address that is secured in various unusual ways without knowing anything

about the details of how the security is set up. You just send bitcoins to the

34-character P2SH address. The recipient might need the signatures of several

people to spend these bitcoins, or a password might be required, or the

requirements could be completely unique [55]. Each of the standard pubkey

scripts can be used as a P2SH redeem script, but in practice only the multisig

pubkey script makes sense until more transaction types are made

standard [43]. The format of the transaction is as follows:

Pubkey script: OP HASH160 <Hash160(redeemScript)> OP EQUAL

Signature script: <sig> [sig] [sig…] <redeemScript>

_ Multisig: Multisig or Multi-signature transactions are those which require

more than one key to authorize a transaction. In multisig pubkey scripts,

called m-of-n, m is the minimum number of signatures which must match a

public key; n is the number of public keys being provided [43]. The format of

the transaction is as follows:

Pubkey script: <m> <A pubkey> [B pubkey] [C pubkey…] <n>

OP CHECKMULTISIG

Signature script: OP 0 <A sig> [B sig] [C sig…]

_ Null Data: These transactions are used for the purpose of adding arbitrary

data to a provable unspendable public key script so that the full nodes don’t

have to store any data in their database. Current rules by consensus allow null

data outputs up to the maximum allowed pubkey script size of 10,000 bytes

provided they follow all other consensus rules, such as not having any data

pushes larger than 520 bytes [43]. The format of the transaction is as follows:

Pubkey Script: OP RETURN <0 to 40 bytes of data>

There is no signature script for null data transactions as these are

unspendable.

25

2.5 Cryptography Assumptions

De_nition 2.5.1 Negligible Function: We call a function _ : N ! R negligible if

for every positive polynomial p(_) there exists an N such that for all n > N,

_(n) < 1

p(n)

A one-way function is a function that is easy to compute but hard to

invert. However, there are several de_nition of a one-way function.

De_nition 2.5.2 Strong one-way function: A function f : f0; 1g*!f0; 1g* is called

(strongly) one-way if the following two conditions hold:

1. Easy to compute: There exists a (deterministic) polynomial-time algorithm A

such that on input x algorithm A outputs f (x) (i.e., A(x) = f (x)).

2. Hard to invert:For every probabilistic polynomial-time algorithm A, every

positive polynomial p(_), and all su_ciently large n’s,

Pr[A'(f(Up),1n) 2 f 1(f(Un))] < 1

p(n)

De_nition 2.5.3 Weak one-way function: A function f : f0; 1g*!f0; 1g* is called

a weakly one-way if the following two conditions hold:

1. Easy to compute: There exists a (deterministic) polynomial-time algorithm A

such that on input x algorithm A outputs f (x) (i.e., A(x) = f (x)).

2. Slightly Hard to invert: There exists a polynomial p(_) such that for every

probabilistic polynomial-time algorithm A and all su_ciently large n’s,,

Pr[A'(f(Up),1n) =2 f 1(f(Un))] > 1

p(n)

26

2.5.1 Cryptographic Hash Function

A cryptographic Hash function(h: X !Y) is a function that maps a message

of an arbitrary length(x) to a m-bit output(y) with the following security

requirements:

_ Pre-image resistant (one-way): if given y2Y it is computationally

infeasible to _nd a value x2X such that h(x) = y

_ 2nd Pre-image resistant (weak collision resistant): if given x2X it is

computationally infeasible to _nd a value x’2X, such that x’6=x and h(x’) =

h(x)

_ Collision resistant (strong collision resistant): if it is computationally

infeasible to _nd two distinct values x’,x 2 X, s.t. h(x’) = h(x)

2.5.2 Di_e-Hellman Assumption

The security of Di_e-Hellman [38] is based on three hard problems:

_ Discrete Log Problem: Let G be a multiplicative group and for g 2 G, let

hgi be a cyclic subgroup generated by g. The discrete logarithm problem for

group G is stated as:

Given g 2 G and a 2 hgi, compute an integer x such that gx= a

_ Computational Di_e-Hellman problem is stated as:

Given g 2 G and (ga,gb), where (a, b) 2 hgi, compute gab

_ Decision Di_e-Hellman problem is stated as:

Given g 2 G, distinguish (ga,gb,gab) from (ga,gb,gc), where (a, b, a) 2 hgi and

are chosen randomly and independently.

27

In addition to these three problems, these is a new class of problems de_ned

by Okamoto and Pointcheval [56], called the Gap problem. The Gap Di_e

Hellman problem is stated as:

Given g 2 G and (ga,gb), compute gab with the help of Decision Di_e Hellman

28

CHAPTER 3. CONCEPT

3.1 System Model

Bitcoin network is a decentralized payment system that maintains a public

ledger in a distributed manner. The ledger is maintained by participants, whom we

call \miners” and whose identity is anonymous. These miners execute a protocol

that maintains and extends a distributed data structure { block chain. It is required

for the miners to solve a \proof of work” which essentially amounts to brute-forcing

a hash inequality based on SHA-256, in order to generate new blocks for block

chain. Miners receive bitcoins as rewards for maintaining the block chain. Apart

from miners, there are other participants who use the bitcoin network to spend or

receive bitcoins. Our systems consists of the miners and users, inside and outside of

the censored network.

3.2 Threat Model

The adversary in this case is the censor that represents the authority that

holds the power to inhibit access to online information and communication. This

can be achieved by IP and DNS blacklisting as well as blocking connections based on

the observed content.Bitcoin maintains a public append-only log (the blockchain),

which is essentially stored by every full Bitcoin node in the world. This makes

transactions in the Bitcoin blockchain di_cult to censor under the assumption that

censor. Hence we assume that this economic costs provides some motivation for the

censor for connecting to the internet all. Furthermore, it is ensured that the censor

29

has no control over the data in the block chain as data written to the block chain

cannot be removed, because it is stored and replicated by a lot of Bitcoin nodes.

Along with that, the censor is in control of the entire infrastructure of the

network and has the ability to eavesdrop, block, alter or inject tra_c in between.

Although the censor is subjected to realistic constraints while performing such

activities. Another general assumption taken into account is that the end hosts

within the network completely abide to their user’s commands and the censor has

no control over them. Also, the supremacy of the censor is limited outside the scope

of the network and it has absolutely no control over any external entities including

the bitcoin network and bitcoin tansactions.

3.3 Goals

Our protocol should satisfy the following properties:

Space-e_cient Given the fact that Bitcoin transactions have a very less space to

store any arbitrary data, the protocol should allow to store and send as much useful

information as possible in a bitcoin transaction.

Con_dentiality As per the de_nition by ISO “Con_dentiality is a characteristic

that applies to information. To protect and preserve the con_dentiality of

information means to ensure that it is not made available or disclosed to

unauthorized entities. In this context, entities include both individuals and

processes” [57]. In accordance with this de_nition, our protocol should be able to

store and transmit data in such a way that the transaction looks like any other

bitcoin transaction and and the data in the transaction should not be available or

disclosed to the censor.

Availability The sensor should not be able to deny the ow of information through

this protocol. Since the protocol stores the data in the bitcoin transactions, it

becomes practically impossible for the censor to block our information from passing

through it.

30

Figure 3.1. Protocol design

The _gure illustrates the working of our protocol in context to Figure 1.1. The user

here aims to reach a blocked destination with the help of an anti censorship system.

By hiding the data in the bitcoin transactions the user is able to bypass the censor

and further establish a connection to an anti censorship system(in this case Tor).

Undetectability The data should be hidden in such a way that the censor is not

able to distinguish between and normal transaction and transaction storing the data.

Easy to use The protocol should not require the user to perform tasks that are

di_cult to execute and hinder with normal network or system operations.

3.4 Design

The protocol needs to be designed keeping in mind all the constraints

imposed by the threat model and also ful_lling all the goals mentioned in the

previous section. Figure 3.1 shows an overview of the design as the steps proceed as

follows:

_ The user, inside the censored network, uses the bitcoin network and creates a

new bitcoin transaction.

_ This transaction contains the data that is needed to establish a connection to

an anti censorhip system. The data is encrypted and then stored in the

31

transaction by the user. The details of storing data have been explained in

section 4.1.

_ This transaction is now on the blockchain and is publicly accessible.

_ There is forwarder outside the censored network, which extracts the hidden

data from the transaction and decrypts it.

_ The forwarded then uses the credentials of the TOR bridge to create a

connection to it.

32

CHAPTER 4. FRAMEWORK AND METHODOLOGY

In this Chapter, we discuss the three steps on which our protocol is based

upon. First, we ananlyze the bitcoin transaction to search for possible _elds for

storing/hiding our data. Then, we explain the encryption algorithm that we will use

to encrypt our data and analyze how it suits for our purpose. At last, we provide a

detailed comparison of the four techniques that can be used to send our data.

4.1 Storing Data in Bitcoin Transaction

As shown in _g 3.2, the bitcoin transaction consists of an input and an

output _eld. The input is something that it receives from the previous transactions

but the output is something which it generates itself. Hence, this output _eld

provides us with the best options to store or hide our data as a part of the bitcoin

transaction. In this section, we try to explore the _elds in a bitcoin transaction that

can o_er us a space e_cient and a con_dential way of storing our data.

1. The value _eld: Every bitcoin transaction has a value _eld where the

amount of bitcoins(BTC) being spent is speci_ed. The length of this _eld is 8

bytes. Although very small, this space can be used to store some arbitrary

data. The data in value _eld is stored as signed integer type with width of

exactly 64 bits with no padding bits and using 2’s complement for negative

values,i.e.int64 t. It is possible to use 3 bytes from this 8 byte _eld, if we are

willing to burn 0.16 BTC (aprox. $117)1. This is the maximum amount one will have to burn to get the space of 3 bytes. 1Coversion rate of 1 BTC =$732.60 as on 11/28/2016

33

2. Public Key: Every bitcoin transaction has the bitcoin address of the

replace this address with some arbitrary data that looks similar to the

address, the transaction will be considered a valid transaction, however, the

bitcoins will not be received by any one. The actual size of the key is 65 bytes,

compressed to a size of 33 bytes.

Bitcoin address = version + RIPEMD-160(SHA-256( Public Key )) +

checksum

In a standard “Pay-to-Pubkey hash” transaction, the key is hashed using

RIPEMD 160 (SHA 256) to base58 check encoding. The length of the hashed

key is 25 bytes, where the _rst byte is always 1 and the last four bytes are the

checksum. This gives us a total of 20 bytes to store data.

3. MultiSig transactions:A multisig transaction, referred to as m-of-n,

requires a minimum of m signatures to match from n public keys that are

provided. If we have m valid keys to match with m signatures, then we can

use the remaining (n-m) keys to store our data. The key size here is 33 bytes

as it is not hashed rather they are in hexadecimal form. Therefore, using the

most basic 2-of-3 multisig transaction provides us with a storage space of 33

bytes. Here is an example of multisig transaction from the bitcoin develepor’s

guide: [43]

Table 4.1.

MultiSig Transaction: Parameter 2 [43]

Name Type Presence Description

Keys or

array Required

(exactly 1)

An array of strings with

each string being a public

Keys or

string Required

(1 or more)

A public key against which

signatures will be checked.

34

Table 4.2.

MultiSig Transaction: Parameter 1 [43]

Name Type Presence Description

Required Number

(int)

Required

(exactly 1)

The minimum (m) number

of signatures required to

spend this m-of-n multisig

script

Multisig outputs have two parameters, the minimum number of signatures

required (m) and the number of public keys to use to validate those

signatures. This is called m-of-n, and in this case we’ll be using 2-of-3. First

we need to create three P2PKH [58] addresses, but in this case and address

cannot be used to create an output thus, we need to provide full public key.

Then a multisig transaction is initiated using 2 parameters – the number (m)

of signatures required and a list of addresses or public keys as described in

table 4.1 and 4.2

4. Null-data transactions: In 2013, the bitcoin developers introduced an

entirely di_erent type of transaction into the blockchain. This new type of

transaction called the “Nulldata transaction” provided an option to add

additional data to the bitcoin blockchain by incurring the basic transaction

fee. It uses the script OP RETURN which accepts an input of 40 bytes. The

format of the transaction is as follows:

Pubkey Script: OP RETURN <0 to 40 bytes of data>

This storage space of 40 bytes can be used to add any arbitrary data into the

blockchain and this data is completely secured just like the data in other

standard transactions. However, what separates such transactions from other

standard transactions is that the output of transactions using OP RETURN is

unspendable. Hence, the OP RETURN output bytes from these transactions are

35

not stored in the UTXO set and the value in the amount _eld of these

transactions is set to zero.

4.2 Encryption Scheme

The fact that we have very less space to store our data, creates the need to

deploy a very space e_cient encryption scheme with short ciphertexts. The

public-key encryption schemes with the shortest ciphertexts are typically those

where the ciphertext contains a group element (i.e.,some point on an elliptic curve)

and a random-looking bitstring in the length of the message. Hiding an elliptic

curve point in a random-looking bitstring is not trivial. But as Bitcoin transactions

already contain elliptic curve points, it becomes easy for us to hide our elliptic curve

point there. This plays to our advantage as we do not need to put in additional

e_ort to hide the group element. One such space e_cient encryption scheme that

caters to our needs is the “Miniature CCA2 PK Encryption” proposed by Xavier

Boyen. [59] This encryption scheme has a very simple algebraic structure and is

secure under the Gap Di_e-Hellman assumption in the random-oracle model. This

scheme uses ElGamal one-time pads, homomorPhically related to the same secret

decryption key, to blind the message twice and hence generates a ciphertext having

no explicit redundancy. This is because it allows to construct the second key from

the _rst key without having to include any inforamtion about it. [59]. the

encryption scheme is as follows:

Let _ and   be two random oracles,

_ be collision resistant function,

_ 2 N be an arbitrary security parameter,

G = hUi be a cyclic prime-order group, generated by U, of prime order p,

such that 22_ 1 < p < 22_+1,

Fp be the _nite _eld of size p,

F_p

= Fp nf0g denote its multiplicative group of order p – 1,

36

M = f0; 1gl be the set of all bit strimgs of length l, for any _xed l _ 2_

_: M! F_p

be an arbitrary injection or collision resistant hash function.

: G _ G !M be two cryptograPhic hash functions.

Key generation: Draw a secret random exponent s 2 F_p

, and calculate V  Us

The public key is Pk   (U,V) 2 G2.

The private key is Sk   s 2 F_p

.

Encryption: Given Pk and plaintext Msg 2M, pick a randomizer r 2\$ F_p

, and let,

A   Vr

B    (A)_Msg

C   Vr=_(B)

D   Ur=_(B)

E   _(D,C) _ B

Decryption: Given Sk and a ciphertext Ctx = (_D

; _ C), check that 1 6= _D

2 G, and

let

_ C   _D

Sk

_B

_E

_ _(_D

; _ C)

_ A   _ C_(_B

)

_M

_B

_  (A_)

The decrypted plaintext is _ Msg = _M

2M

4.3 Implementation

As mentioned previously, this protocol to store data in bitcoin transactions

acts as a bootstrapping mechanism for existing circumvention techniques. Section

4.1 discussed various techniques to store data in a bitcoin transaction. In this

section, we expands over to discuss the scenarios of practical implementations of the

mentioned data storing techniques and try to answer the question that which data

storing technique provides us with the most e_ective and e_cient way of storing

data in the transactions.

37

In section 2.3, we de_ned the parameters of evaluation for our protocol.

Based on all those parameters, we present a detailed analysis and comparison of

each data storing technique. This comparison will not only allow us to highlight the

pros and cons of each technique but will also be helpful in determining the best and

worst case scenarios for using a particular technique.

Table 4.3.

Comparison between data storing/hiding techniques

Data Storing

Technique

Storage

Space

Detectability Storing

Data

Implement-

ation

Value/Amount

_eld

Less Di_cult to

detect

Easy Easy

Public key Good Di_cult to

detect

Moderately

Di_cult

Di_cult

MultiSig

Transaction

Good/

Excellent

Di_cult to

detect

Easy Easy

Null-Data

Transaction

Excellent Can be

suspicious

Very Easy Moderately

Di_cult

The table above provides a comparison between the data storing techniques

and uses the following parameters for comparison- the amount of storage space

available, how di_cult it is for the sensor to detect hidden data, the ease of storing

data and practical implementation of the technique or ease of sending/receiving

data. Here, the second parameter does not indicate the ability of sensor to extract

the hidden data from the transaction but its ability to infer if the transaction is

being used for hiding data. Before we discuss the practical implementation of these

techniques, we would like to discuss the two possible scenarios that will be used.

After we have successfully stored our data in the transaction, an important part of

this protocol would be to make that data available to the forwarder so it can

facilitate the connection to TOR bridge. One way of doing it would be to make the

forwarder the recipient of our bitcoins. This way, we can directly send the data to

it, without much complications. The other way, where its not possible to send data

directly, would be to initiate a transaction and publish it to the blockchain. Then

38

the forwarder will have to scan through all the transaction in the block chain to _nd

the transaction with the hidden data, we call this as the indirect way. An important

thing to mention here is that we limit our study to the process of hiding and

sending data. The process of scanning and _nding transactions is beyond our scope.

Storing data in the amount/value _eld of the transaction provides us with a

very easy of achieving our goal. The data is stored as a part of the value of bitcoins

to be spent hence, resulting in a transaction that makes it makes it impossible for

the sensor to di_erentiate our transaction from a standard transaction. Another

bene_t this technique provides is the ease of storing data as we just need to add our

data bits to the amount _eld. As far its implementation is concerned , it would

follow the direct approach wherein the forwarder will be the recipient in the

transaction. This way the forwarder will have an easy access to the transaction and

hence, can extract the hidden data from it. This technique convincingly satis_es

three parameters of evaluation but it falls short when it comes to the storage space

it provides. It provides us with a very less space that too with spending a

signi_cantly large amount of bitcoins.

The second option for hiding data is in the public key _eld of the

transaction. As mentioned before, it provides us with a storage space of 20 bytes

which is signi_cantly more than the previous scheme. However, process of storing

data is much more complicated than the other schemes we discuss. Since we are

storing our data as the address, we need to ensure that the address we produce is a

valid address else it will not be accepted by the client. The term “valid” here means

that the address that is being generated should adhere to the proper format of the

address as mentioned in section 4.1,i.e The Bitcoin network only veri_es that an

address is in the right form, length, and has the right checksum when “validating

it”. The existence of the address is not a criteria to be considered as the address

would be considered as valid as long as it follows the proper format. If the

generated address does not exists then any coins sent to the address cannot be spent

and are e_ectively destroyed. For sending the data, we will have to follow the

39

Figure 4.1. Transaction rate per second

The _gure shows the number of bitcoin transactions that are generated per second.

Such a high number of transactions makes it di_cult to send data in the public key

_eld [60].

indirect method in this case. Since we are using the address for hiding the data we

cannot make forwarder the recipient of the transaction. Therefore, the forwarder

needs to scan through all the transaction in the block chain to _nd the transaction.

Figure 4.1 shows the rate of transactions that are being generation per second.

Here, we can see that the average is about 2-4 transactions per second and with

transactions being added at such an enormous rate the task of scanning and _nding

becomes really tedious for the forwarder.

Multisig transactions o_er a very interesting way of storing data. As

mentioned before, these transactions use a criteria of matching m out of n public

keys to validate a transaction. This can be exploited in a way that we use m valid

keys to ensure that the transaction does take place and use the remaining keys(n-m)

to store our data. This provides us with a very good space of 33 bytes if we perform

a 2-of-3 transaction. Which is more than what the previous schemes over. This

space can be further increased if we use a higher order of transaction, e.g 3-of-5 , in

40

which we get 66 bytes of data. However, in the present scenario, 2-of-3 transaction

is the one which is used commonly. In fact, many wallets do not even support any

other multisig transaction except 2-of-3. If in future, the use of higher order multisig

transaction increases then this scheme would be very useful for our purpose , but for

now it can o_er only 33 bytes of data, which still is much better compared to other

options. This also comes with the ease of storing data since the keys are stored as

raw decimals , they do not need to adhere to any correct format as was the case

with public key addresses. Even sending data is easier with this option as it follows

the direct approach and the forwarder will be the recipient in the transaction.

Storing data in null data transactions is perhaps the most convenient and a

space e_ective way to achieve our goal. This method provides us with a storage

space of 40bytes surpassing the space provided by all the other methods. Not only

that, the process of storing data is as simple and straight forward as it could be.

This is so because these transactions were created for the purpose of storing

arbitrary data in the block chain. However, this characteristic of such transactions

proves to be both advantageous as well as disadvantageous for us. It provides us

with an easy to to store data but at the same time makes it more suspicious to the

sensor that some data is being sent through the bitcoin transactions. Considering

the fact that the ratio of null data transactions to the total number of bitcoin

transactions is very less, it becomes easier for the sensor to look out for null data

transactions and a sudden increase in the number of such transactions is bound to

get the sensor suspicious. This scheme also follows the indirect way of sending data.

But since the number of null data transactions is relatively much less, it makes it

easier for the forwarder to scan and _nd the relevant transactions.

41

CHAPTER 5. SECURITY ANALYSIS

In this section, we analyze the security of our proposed bootstrapping

mechanism under the threat model described in section 2.2. We start with the

security of the bitcoin network. Being one of the most prominent distributed crypto

currency network, it has maintained strong security record over all these years. But

we know that no system is perfect and many security aws have been found in the

bitcoin network as well. Bad actors are someone who are inevitable on any network

therefore, the security model of the bitcoin network does not exclude them and

depends entirely on computation through proof of work.

With respect to our threat model, we are more concerned about the security

of data in the bitcoin transactions. In tranactions, Bitcoin uses the Elliptic Curve

Digital Signature Algorithm (ECDSA) with the secp256k1 curve; secp256k1 private

keys are 256 bits of random data. A copy of that data is deterministically

transformed into an secp256k1 public key [43]. Because the transformation can be

reliably repeated later, the public key does not need to be stored. The public key is

then cryptographically hashed. This pubkey hash can also be reliably repeated

later, so it also does not need to be stored. The hash shortens and obfuscates the

public key, making manual transcription easier and providing security against

unanticipated problems which might allow reconstruction of private keys from

public key data at some later point [43]. Even in the signature script, An secp256k1

signature is made by using the ECDSA cryptographic formula to combine certain

transaction data with sender’s private key. This secp256k1 signature doesn’t just

prove the sender controls his private key; it also makes the non-signature-script

parts of his transaction tamper-proof so sender can safely broadcast them over the

peer-to-peer network. [43]

42

Considering the scenario where the sensor is somehow able to get hold of the

information that we had hidden in the transaction, we analyze the security of

encryption algorithm that we have used to see how our data is protected from the

sensor. The security of the ‘Miniature CCA2 PK Encryption’ scheme has been

proven and is a very standard sense of IND-CCA2 security or in simple words it is

indistinguishable under an adaptive chosen-ciphertext. The author in the original

work has proved that “This crypto system is IND-CCA2 secure in the random

oracle model, provided that GAP di_e-Hellman assumption holds in G.” [59]

Almost all CCA2-secure cryptosystems that are based on redundancy,

perform an integrity check while decrypting the data so as to secure againt any

active attacks. Since the Miniature CCA2 PK Encryption scheme does not include

any redundancy, the ciphetext is not authenticated in this case. As a result, if any

ciphertext gets tampered with, it will not be able to detect that and would decrypt

to a useless plaintext. However, the security mechanisms implied in the bitcoin

network ensure that the transaction data is fully secure against any external

tampering. Hence, covering up for a major drawback that this encryption scheme

has and making it a perfect _t for our protocol.

43

CHAPTER 6. FUTURE WORK

Our work proposes a bootstrapping mechanism for existing anti censorship

system that face a very fundamental problem of adversary-resistant communication

bootstrapping. Despite having, what looks like an e_cient scheme, we are yet to

prove its practical e_ciency and e_ectiveness. So the _rst and foremost things that

needs to be accomplished is the practical implementation of the mechanism. The

processes of storing data and later extracting it from the bitcoin transactions might

seem easy on paper but could actually cause some concerns when it is actually

being done. Once this has been achieved we need move on to the next step of

associating it with existing anti censorship networks.

6.1 Associating with existing anti-censorship systems

Once we are successfully able to store and extract data from bitcoin

transactions, the next step in the implementation process is to connect to an

existing anti censorship system. Considering the present scenario, one of the better

candidates for this would be Tor. Adversary-resistant communication bootstrapping

is a fundamental problem faced by Tor. Censoring regimes actively harvest and

block published Tor entry points and bridge nodes. More recently, some authorities

have resorted to reactive probing of the destination hosts of outbound encrypted

tra_c to identify unpublished Tor nodes [61].

Connecting to a Tor bridge requires some essential information. TOR bridge

entry consists of three di_erent elements – IP address of the bridge, Port Number,

Fingerprint of the bridge. It looks like as following:

Here,

44

111.111.11.11 is the IP address of the Bridge,

000 is the Port Number,

‘4352e58420e68f5e40bf7c74faddccd9d1349413’ is the unique identi_er or the

_ngerprint of the bridge.

In order to connect to a TOR bridge, we need to know the _rst 2 elements,i.e

the IP address and port number. The size of the IP address is 32 bits and the port

number is a 16 bit unsigned integer. Hence the total amount of space needed to for

these two elements to be stored in a bit coin transaction would be 48 bits. Once

these credentials are know, these can be encrypted using the encryption algorithm

explained in section 4.2 and then the data can be stored and sent through a bitcoin

transaction using one of the methods.

6.2 Other crypto-currency systems

Bitcoin was the _rst crypto-currency to be introduced and since then it has

maintained its supremacy other currencies in all aspects. Bitcoin is a very

sophisticated system and its _rm platform allows its use beyond currency. It has

been proven to be fairly secure, has the biggest market capitalization, has a massive

fanbase for its true decentralization and is established and accepted by thousands of

merchants. However, the things that concerns us the most is its ever increasing

value. The value of bitcoin has been increasing and is expected to continue to rise in

future, at a high rate due to multiple reasons [62]:

_ Bitcoin value increases over time by design.

_ Fiat currency fatigue.

_ Big Business hasn’t jumped onto the Bitcoin bandwagon yet.

_ Cash is leaving the scene and will be replaced by digital payments anyway.

_ The Global Reserve Currency keeps losing value, which inates Bitcoin value,

and many more.

45

The high popularity of bitcoins provide an ideal setup for our scheme.

However, our scheme involves spending bitcoins to get storage space in the

transaction. And as the value of bitcoins keep rising, the users will end up loosing a

lot of money if they opt to use our scheme. This leads to possibility of transporting

our scheme to other crypto-currencies like Ethereum [63] and Ripple [64].

46

CHAPTER 7. SUMMARY

In this thesis, we try to propose a bootstrapping mechanism for the existing

censorship circumventing tools that is unique in many ways. We divert ourselves

from the existing pattern of research going on in this _eld to develop a technique

that gives more emphasis on the process of setting up a channel that is really

di_cult for the censor to detect. We analyse various _elds inside a bitcoin

transaction and came out with four possible ways of hiding our data in the

transaction – In the Value/Amount _eld, in the public key, through a MultiSig

transaction and through null data transaction. These four approaches were

evaluated on the parameters that we de_ned at the start, which are – space

e_ciency, detectability, ease of storing data and ease of sending/receiving data.

Each approach comes with its own advantages and disadvantages and at this point

it is very di_cult to conclude that which of them it best among all. As it clearly

depends on the situation we are trying to cater to – If one needs a very less space ,

he can use the value _eld approach, on the other hand, if one needs a lot of space ,

he will go for null data transaction approach. By proposing these techniques we

have delivered our goal of developing an e_ective bootstrapping mechanism that will

very useful in connecting to already existing circumvention tools. Having said that,

our proposal is still far from perfect and will need some more intensive research to

be practically implemented.

47

BIBLIOGRAPHY

[1] Universal declaration of human rights.

http://www.un.org/en/universal-declaration-human-rights.

[2] Wikipedia. Arab spring. https://en.wikipedia.org/wiki/ArabSpring.

[3] Wikipedia. Great _rewall. https://en.wikipedia.org/wiki/GreatFirewall.

[4] Barclay Ballard. Internet censorship: The worst o_enders.

https://betanews.com/2015/01/28/internet-censorship-the-worst-o_enders/.

[5] Robert Faris and Nart Villeneuve. Measuring global internet _ltering. 2008.

[6] Dynaweb. http://www.dongtaiwang.com/home en.php.

[7] Ultrasurf. http://ultrasurf.us.

[8] Tor. https://www.torproject.org.

[9] E. Wustrow, S. Wolchok, I. Goldberg, and J. Halderman. Telex :anticensorship

in network infratructure. 2011.

[10] Tor:mirrors. https://www.torproject.org/getinvolved/mirrors.html.en.

[11] Tor:bridges. https://www.torproject.org/docs/bridges.

[12] Tor:hidden service protocol. https://www.torproject.org/docs/hidden-services.

[13] Sadia Afroz Michael Carl Tschantz and Vern Paxson. Sok:towards grounding

censorship circumvention in empiricism. 2016.

[14] Secp256k1. https://en.bitcoin.it/wiki/Secp256k1.

48

[15] Wikipedia. Ecdsa. https://en.wikipedia.org/wiki/Elliptic Curve DSA.

[16] Mung Chiang Christopher S. Leberknight and Felix Ming Fai Wong. A

taxonomy of censors and anti-censors: Part i{impacts of internet censorship.

International Journal of E-Politics, (3(2)):52{64, 2012.

[17] Andreas P_tzmann and Tu Marit Hansen. A terminology for talking about

privacy by data minimization: Anonymity, unlinkability, undetectability,

unobservability, pseudonymity, and identity management. 2010.

[18] Global Internet Freedom Consortium. Defeat internet censorship:overview of

[19] H. Roberts, E. Zuckerman, and J. Palfrey. 2007 circumvention landscape

report. 2009.

[20] Christopher Kruegel Luca Invernizzi and Giovanni Vigna. Message in a bottle:

Sailing past censorship. Proceedings of the 29th Annual Computer Security

Applications Conference, pages 39{48, 2013.

[21] Wikipedia. Ping (blogging). https://en.wikipedia.org/wiki/Ping (blogging).

[22] Douglas Hackney. Digital photography: Meta data overview. 2008.

[26] N. Provos and P. Honeyman. Hide and seek: An introduction to

steganography. Security Privacy, IEEE, pages 32 { 44, 2003.

[27] Anonymous. Tor partially blocked in china.

https://blog.torproject.org/blog/tor-partially-blocked-china, 2009.

49

[28] A.Lewman. Torproject.org blocked by gfw in china: Sooner or later?

https://blog.torproject.org/blog/torprojectorg-blocked-gfw-china-sooner-or-later,

2008.

[29] Tor:pluggable transports.

https://www.torproject.org/docs/pluggable-transports.html.en.

[30] What is deep inspection?

http://www.ranum.com/security/computersecurity=editorials=deepinspect=:

[31] A.Lewman. China blocking tor: Round two. ”

https://blog.torproject.org/blog/china-blocking-tor-round-two, 2010.

[32] L. v. Ahn and N. J. Hopper. Public-key steganography. 2004.

[33] C. Cachin. An information-theoretic model for steganography. 1998.

[34] T. Mittelholzer. An information-theoretic approach to steganography and

watermarking. 2000.

[35] J. A. O’Sullivan, P. Moulin, and J. M. Ettinger. Information-theoretic analysis

of steganography. 1998.

[36] J. Zollner, H. Federrath, H. Klimant, A. Pftizmann, R. Piotraschke, and

A. West_eld. Modeling the security of steganographic systems. 1998.

[37] J. Langford N. Hopper and L. V. Ahn. Provably secure steganography. 2002.

[38] Whit_eld Di_e and Martin E. Hellman. New directions in cryptography. IEEE

Transactions on Information Theory, VOL. IT-22(No.6), 1976.

[39] Matthew Caesar Amir Houmansadr, Giang T. K. Nguyen and Nikita Borisov.

Cirripede: Circumvention infrastructure using router redirection with plausible

deniability. Proceedings of the 18th ACM conference on Computer and

communications security, pages Pages 187{200, 2011.

50

[40] Planetlab. https://www.planet-lab.org/.

[41] Alexa topsites. https://www.planet-lab.org/.

[42] G.I.F.Consortium. China great _rewall revealed.

http://www.internetfreedom.org/_les/WhitePaper/ChinaGreatFirewallRevealed.pdf.

[43] Bitcoin developer guide. https://bitcoin.org/en/developer-guide.

[44] Merkle tree. https://brilliant.org/wiki/merkle-tree/.

[45] Merkle root. https://bitcoin.org/en/glossary/merkle-root.

[46] Hashcash. https://en.bitcoin.it/wiki/Hashcash.

[47] Proof of work. https://en.bitcoin.it/wiki/Proof of work.

[48] S. Nakamoto. A peer-to-peer electronic cash system. 2008.

[49] Outpoint. https://bitcoin.org/en/glossary/outpoint.

[50] Signature script. https://bitcoin.org/en/glossary/signature-script.

[51] Sequence number(transactions).

https://bitcoin.org/en/glossary/sequence-number.

[52] Pubkey script. https://bitcoin.org/en/glossary/pubkey-script.

[53] Script. https://en.bitcoin.it/wiki/ScriptScripts.

[54] Wikipedia. Forth (programming language).

https://en.wikipedia.org/wiki/Forth(programminglanguage):

[55] Pay to script hash. https://en.bitcoin.it/wiki/Pay to script hash.

[56] Tatsuaki Okamoto and David Pointcheval. The gap-problems: a new class of

problems for the security of cryptographic schemes. International Workshop on

Practice and Theory in Public Key Cryptography, page pages 104{118., 2001.

51

[57] Archive of plain english iso 27001 2005 de_nitions.

http://www.praxiom.com/iso-27001-de_nitions.htmCon_dentiality, 2005.

[59] Xavier Boyen. Miniature cca2 pk encryption :tight security without

redundancy. 2007.

[60] Bitcoin stats. https://blockchain.info/.

[61] Phillip Porras Vinod Yegneswaran Zachary Weinberg Jeroen Massar William

Allen Simpson Paul Vixie Patrick Lincoln, Ian Mason and Dan Boneh.

Bootstrapping communications into an anti-censorship system. 2nd USENIX

Workshop on Free and Open Communications on the Internet, 2012.

[62] Evander Smart. 5 reasons why bitcoin value must increase in future.

https://cointelegraph.com/news/5-reasons-why-bitcoin-value-must-increase-in-future,

2016.

[63] Wikipedia. Ethereum. https://en.wikipedia.org/wiki/Ethereum.

[64] Wikipedia. Ripple (payment protocol).

https://en.wikipedia.org/wiki/Ripple(paymentprotocol):

## Related Services

View all

### DMCA / Removal Request

If you are the original writer of this essay and no longer wish to have the essay published on the UK Essays website then please:

Prices from
###### £28

Undergraduate 2:2 • 250 words • 7 day delivery

Delivered on-time or your money back

(10 Reviews)