People have always copied things. In the past, most items of value were physical objects. Patent law and economies of scale meant that small scale copying of physical objects was usually uneconomic, and large-scale copying (if it infringed) was stoppable using policemen and courts. Today, things of value are increasingly less tangible: often they are just bits and bytes or can be accurately represented as bits and bytes. The widespread deployment of packet-switched networks and the huge advances in computers and codec-technologies has made it feasible (and indeed attractive) to deliver such digital works over the Internet. This presents great opportunities and great challenges. The opportunity is low-cost delivery of personalized, desirable high-quality content. The challenge is that such content can be distributed illegally. Copyright law governs the legality of copying and distribution of such valuable data, but copyright protection is increasingly strained in a world of programmable computers and high-speed networks.
The dramatic rise in the efficiency of the darknet can be traced back to the general technological improvements in these infrastructure areas. At the same time, most attempts to fight the darknet can be viewed as efforts to deprive it of one or more of the infrastructure items. Legal action has traditionally targeted search engines and, to a lesser extent, the distribution network. As we will describe later in the paper, this has been partially successful. The drive for legislation on mandatory watermarking aims to deprive the darknet of rendering devices. We will argue that watermarking approaches are technically flawed and unlikely to have any material impact on the darknet. Finally, most content protection systems are meant to prevent or delay the injection of new objects into the darknet. Based on our first assumption, no such system constitutes an impenetrable barrier, and we will discuss the merits of some popular systems.
We see no technical impediments to the darknet becoming increasingly efficient (measured by aggregate library size and available bandwidth). However, the darknet, in all its transport-layer embodiments, is under legal attack. In this paper, we speculate on the technical and legal future of the darknet, concentrating particularly, but not exclusively, on peer-to-peer networks.
The rest of this paper is structured as follows. Section 2 analyzes different manifestations of the darknet with respect to their robustness to attacks on the infrastructure requirements described above and speculates on the future development of the darknet. Section 3 describes content protection mechanisms, their probable effect on the darknet, and the impact of the darknet upon them. In sections 4 and 5, we speculate on the scenarios in which the darknet will be effective, and how businesses may need to behave to compete effectively with it.
2 The Evolution of the Darknet
We classify the different manifestations of the darknet that have come into existence in recent years with respect to the five infrastructure requirements described and analyze weaknesses and points of attack.
As a system, the darknet is subject to a variety of attacks. Legal action continues to be the most powerful challenge to the darknet. However, the darknet is also subject to a variety of other common threats (e.g. viruses, spamming) that, in the past, have lead to minor disruptions of the darknet, but could be considerably more damaging.
In this section we consider the potential impact of legal developments on the darknet. Most of our analysis focuses on system robustness, rather than on detailed legal questions. We regard legal questions only with respect to their possible effect: the failure of certain nodes or links (vertices and edges of the graph defined above). In this sense, we are investigating a well known problem in distributed systems.
2.1 Early Small-Worlds Networks
Prior to the mid 1990s, copying was organized around groups of friends and acquaintances. The copied objects were music on cassette tapes and computer programs. The rendering devices were widely-available tape players and the computers of the time – see Fig. 1. Content injection was trivial, since most objects were either not copy protected or, if they were equipped with copy protection mechanisms, the mechanisms were easily defeated. The distribution network was a “sneaker net” of floppy disks and tapes (storage), which were handed in person between members of a group or were sent by postal mail. The bandwidth of this network – albeit small by today’s standards – was sufficient for the objects of the time. The main limitation of the sneaker net with its mechanical transport layer was latency. It could take days or weeks to obtain a copy of an object. Another serious limitation of these networks was the lack of a sophisticated search engine.
There were limited attempts to prosecute individuals who were trying to sell copyrighted objects they had obtained from the darknet (commercial piracy). However, the darknet as a whole was never under significant legal threat. Reasons may have included its limited commercial impact and the protection from legal surveillance afforded by sharing amongst friends.
The sizes of object libraries available on such networks are strongly influenced by the interconnections between the networks. For example, schoolchildren may copy content from their “family network” to their “school network” and thereby increase the size of the darknet object library available to each. Such networks have been studied extensively and are classified as “interconnected small-worlds networks.”  There are several popular examples of the characteristics of such systems. For example, most people have a social group of a few score of people. Each of these people has a group of friends that partly overlap with their friends’ friends, and also introduces more people. It is estimated that, on average, each person is connected to every other person in the world by a chain of about six people from which arises the term “six degrees of separation”.
These findings are remarkably broadly applicable (e.g. ,). The chains are on average so short because certain super-peers have many links. In our example, some people are gregarious and have lots of friends from different social or geographical circles..
We suspect that these findings have implications for sharing on darknets, and we will return to this point when we discuss the darknets of the future later in this paper.
The small-worlds darknet continues to exist. However, a number of technological advances have given rise to new forms of the darknet that have superseded the small-worlds for some object types (e.g. audio).
2.2 Central Internet Servers
By 1998, a new form of the darknet began to emerge from technological advances in several areas. The internet had become mainstream, and as such its protocols and infrastructure could now be relied upon by anyone seeking to connect users with a centralized service or with each other. The continuing fall in the price of storage together with advances in compression technology had also crossed the threshold at which storing large numbers of audio files was no longer an obstacle to mainstream users. Additionally, the power of computers had crossed the point at which they could be used as rendering devices for multimedia content. Finally, “CD ripping” became a trivial method for content injection.
The first embodiments of this new darknet were central internet servers with large collections of MP3 audio files. A fundamental change that came with these servers was the use of a new distribution network: The internet displaced the sneaker net – at least for audio content. This solved several problems of the old darknet. First, latency was reduced drastically.
Secondly, and more importantly, discovery of objects became much easier because of simple and powerful search mechanisms – most importantly the general-purpose world-wide-web search engine. The local view of the small world was replaced by a global view of the entire collection accessible by all users. The main characteristic of this form of the darknet was centralized storage and search – a simple architecture that mirrored mainstream internet servers.
Centralized or quasi-centralized distribution and service networks make sense for legal online commerce. Bandwidth and infrastructure costs tend to be low, and having customers visit a commerce site means the merchant can display adverts, collect profiles, and bill efficiently. Additionally, management, auditing, and accountability are much easier in a centralized model.
However, centralized schemes work poorly for illegal object distribution because large, central servers are large single points of failure: If the distributor is breaking the law, it is relatively easy to force him to stop. Early MP3 Web and FTP sites were commonly “hosted” by universities, corporations, and ISPs. Copyright-holders or their representatives sent “cease and desist” letters to these web-site operators and web-owners citing copyright infringement and in a few cases followed up with legal action . The threats of legal action were successful attacks on those centralized networks, and MP3 web and FTP sites disappeared from the mainstream shortly after they appeared.
2.3 Peer-to-Peer Networks
The realization that centralized networks are not robust to attack (be it legal or technical) has spurred much of the innovation in peer-to-peer networking and file sharing technologies. In this section, we examine architectures that have evolved. Early systems were flawed because critical components remained centralized (Napster) or because of inefficiencies and lack of scalability of the protocol (gnutella) . It should be noted that the problem of object location in a massively distributed, rapidly changing, heterogeneous system was new at the time peer-to-peer systems emerged. Efficient and highly scalable protocols have been proposed since then ,.
Napster was the service that ignited peer-to-peer file sharing in 1999 . There should be little doubt that a major portion of the massive (for the time) traffic on Napster was of copyrighted objects being transferred in a peer-to-peer model in violation of copyright law. Napster succeeded where central servers had failed by relying on the distributed storage of objects not under the control of Napster. This moved the injection, storage, network distribution, and consumption of objects to users.
However, Napster retained a centralized database  with a searchable index on the file name. The centralized database itself became a legal target . Napster was first enjoined to deny certain queries (e.g. “Metallica”) and then to police its network for all copyrighted content. As the size of the darknet indexed by Napster shrank, so did the number of users. This illustrates a general characteristic of darknets: there is positive feedback between the size of the object library and aggregate bandwidth and the appeal of the network for its users.
The next technology that sparked public interest in peer-to-peer file sharing was Gnutella. In addition to distributed object storage, Gnutella uses a fully distributed database described more fully in . Gnutella does not rely upon any centralized server or service – a peer just needs the IP address of one or a few participating peers to (in principle) reach any host on the Gnutella darknet. Second, Gnutella is not really “run” by anyone: it is an open protocol and anyone can write a Gnutella client application. Finally, Gnutella and its descendants go beyond sharing audio and have substantial non-infringing uses. This changes its legal standing markedly and puts it in a similar category to email. That is, email has substantial non-infringing use, and so email itself is not under legal threat even though it may be used to transfer copyrighted material unlawfully.
2.4 Robustness of Fully Distributed Darknets
Fully distributed peer-to-peer systems do not present the single points of failure that led to the demise of central MP3 servers and Napster. It is natural to ask how robust these systems are and what form potential attacks could take. We observe the following weaknesses in Gnutella-like systems:
Lack of anonymity
2.4.1 Free Riding
Peer-to-peer systems are often thought of as fully decentralized networks with copies of objects uniformly distributed among the hosts. While this is possible in principle, in practice, it is not the case. Recent measurements of libraries shared by gnutella peers indicate that the majority of content is provided by a tiny fraction of the hosts . In effect, although gnutella appears to be a peer-to-peer network of cooperating hosts, in actual fact it has evolved to effectively be another largely centralized system – see Fig. 2. Free riding (i.e. downloading objects without sharing them) by many gnutella users appears to be main cause of this development. Widespread free riding removes much of the power of network dynamics and may reduce a peer-to-peer network into a simple unidirectional distribution system from a small number of sources to a large number of destinations. Of course, if this is the case, then the vulnerabilities that we observed in centralized systems (e.g. FTP-servers) are present again. Free riding and the emergence of super-peers have several causes:
Peer-to-peer file sharing assumes that a significant fraction of users adhere to the somewhat post-capitalist idea of sacrificing their own resources for the “common good” of the network. Most free-riders do not seem to adopt this idea. For example, with 56 kbps modems still being the network connection for most users, allowing uploads constitutes a tangible bandwidth sacrifice. One approach is to make collaboration mandatory. For example, Freenet  clients are required to contribute some disk space. However, enforcing such requirements without a central infrastructure is difficult.
Existing infrastructure is another reason for the existence of super-peers. There are vast differences in the resources available to different types of hosts. For example, a T3 connection provides the combined bandwidth of about one thousand 56 kbps telephone connections.
2.4.2 Lack of Anonymity
Users of gnutella who share objects they have stored are not anonymous. Current peer-to-peer networks permit the server endpoints to be determined, and if a peer-client can determine the IP address and affiliation of a peer, then so can a lawyer or government agency. This means that users who share copyrighted objects face some threat of legal action. This appears to be yet another explanation for free riding.
There are some possible technological workarounds to the absence of endpoint anonymity. We could imagine anonymizing routers, overseas routers, object fragmentation, or some other means to complicate the effort required by law-enforcement to determine the original source of the copyrighted bits. For example, Freenet tries to hide the identity of the hosts storing any given object by means of a variety of heuristics, including routing the object through intermediate hosts and providing mechanisms for easy migration of objects to other hosts. Similarly, Mnemosyne  tries to organize object storage, such that individual hosts may not know what objects are stored on them. It is conjectured in  that this may amount to common-carrier status for the host. A detailed analysis of the legal or technical robustness of these systems is beyond the scope of this paper.
In light of these weaknesses, attacks on gnutella-style darknets focus on their object storage and search infrastructures. Because of the prevalence of super-peers, the gnutella darknet depends on a relatively small set of powerful hosts, and these hosts are promising targets for attackers.
Darknet hosts owned by corporations are typically easily removed. Often, these hosts are set up by individual employees without the knowledge of corporate management. Generally corporations respect intellectual property laws. This together with their reluctance to become targets of lawsuits, and their centralized network of hierarchical management makes it relatively easy to remove darknet hosts in the corporate domain.
While the structures at universities are typically less hierarchical and strict than those of corporations, ultimately, similar rules apply. If the .com and .edu T1 and T3 lines were pulled from under a darknet, the usefulness of the network would suffer drastically.
This would leave DSL, ISDN, and cable-modem users as the high-bandwidth servers of objects. We believe limiting hosts to this class would present a far less effective piracy network today from the perspective of acquisition because of the relative rarity of high-bandwidth consumer connections, and hence users would abandon this darknet. However, consumer broadband is becoming more popular, so in the long run it is probable that there will be adequate consumer bandwidth to support an effective consumer darknet.
The obvious next legal escalation is to bring direct or indirect (through the affiliation) challenges against users who share large libraries of copyrighted material. This is already happening and the legal threats or actions appear to be successful . This requires the collaboration of ISPs in identifying their customers, which appears to be forthcoming due to requirements that the carrier must take to avoid liability  and, in some cases, because of corporate ties between ISPs and content providers. Once again, free riding makes this attack strategy far more tractable.
It is hard to predict further legal escalation, but we note that the DMCA (digital millennium copyright act) is a far-reaching (although not fully tested) example of a law that is potentially quite powerful. We believe it probable that there will be a few more rounds of technical innovations to sidestep existing laws, followed by new laws, or new interpretations of old laws, in the next few years.
All attacks we have identified exploit the lack of endpoint anonymity and are aided by the effects of free riding. We have seen effective legal measures on all peer-to-peer technologies that are used to provide effectively global access to copyrighted material. Centralized web servers were effectively closed down. Napster was effectively closed down. Gnutella and Kazaa are under threat because of free rider weaknesses and lack of endpoint anonymity.
Lack of endpoint anonymity is a direct result of the globally accessible global object database, and it is the existence of the global database that most distinguishes the newer darknets from the earlier small worlds. At this point, it is hard to judge whether the darknet will be able to retain this global database in the long term, but it seems seems clear that legal setbacks to global-index peer-to-peer will continue to be severe.
However, should Gnutella-style systems become unviable as darknets, systems, such as Freenet or Mnemosyne might take their place. Peer-to-peer networking and file sharing does seem to be entering into the mainstream – both for illegal and legal uses. If we couple this with the rapid build-out of consumer broadband, the dropping price of storage, and the fact that personal computers are effectively establishing themselves as centers of home-entertainment, we suspect that peer-to-peer functionality will remain popular and become more widespread.
2.5 Small Worlds Networks Revisited
In this section we try to predict the evolution of the darknet should global peer-to-peer networks be effectively stopped by legal means. The globally accessible global database is the only infrastructure component of the darknet that can be disabled in this way. The other enabling technologies of the darknet (injection, distribution networks, rendering devices, storage) will not only remain available, but rapidly increase in power, based on general technological advances and the possible incorporation of cryptography. We stress that the networks described in this section (in most cases) provide poorer services than global network, and would only arise in the absence of a global database.
In the absence of a global database, small-worlds networks could again become the prevalent form of the darknet. However, these small-worlds will be more powerful than they were in the past. With the widespread availability of cheap CD and DVD readers and writers as well as large hard disks, the bandwidth of the sneaker net has increased dramatically, the cost of object storage has become negligible and object injection tools have become ubiquitous. Furthermore, the internet is available as a distribution mechanism that is adequate for audio for most users, and is becoming increasingly adequate for video and computer programs. In light of strong cryptography, it is hard to imagine how sharing could be observed and prosecuted as long as users do not share with strangers.
In concrete terms, students in dorms will establish darknets to share content in their social group. These darknets may be based on simple file sharing, DVD-copying, or may use special application programs or servers: for example, a chat or instant-messenger client enhanced to share content with members of your buddy-list. Each student will be a member of other darknets: for example, their family, various special interest groups, friends from high-school, and colleagues in part-time jobs (Fig. 3). If there are a few active super-peers – users that locate and share objects with zeal – then we can anticipate that content will rapidly diffuse between darknets, and relatively small darknets arranged around social groups will approach the aggregate libraries that are provided by the global darknets of today. Since the legal exposure of such sharing is quite limited, we believe that sharing amongst socially oriented groups will increase unabated.
Small-worlds networks suffer somewhat from the lack of a global database; each user can only see the objects stored by his small world neighbors. This raises a number of interesting questions about the network structure and object flow:
What graph structure will the network have? For example, will it be connected? What will be the average distance between two nodes?
Given a graph structure, how will objects propagate through the graph? In particular, what fraction of objects will be available at a given node? How long does it take for objects to propagate (diffuse) through the network?
Questions of this type have been studied in different contexts in a variety of fields (mathematics, computer science, economics, and physics). A number of empirical studies seek to establish structural properties of different types of small world networks, such as social networks  and the world-wide web . These works conclude that the diameter of the examined networks is small, and observe further structural properties, such as a power law of the degree distribution , A number of authors seek to model these networks by means of random graphs, in order to perform more detailed mathematical analysis on the models ,,, and, in particular, study the possibility of efficient search under different random graph distributions ,. We will present a quantitative study of the structure and dynamics of small-worlds networks in an upcoming paper, but to summarize, small-worlds darknets can be extremely efficient for popular titles: very few peers are needed to satisfy requests for top-20 books, songs, movies or computer programs. If darknets are interconnected, we expect the effective introduction rate to be large. Finally, if darknet clients are enhanced to actively seek out new popular content, as opposed to the user-demand based schemes of today, small-worlds darknets will be very efficient.
3 Introducing Content into the Darknet
Our analysis and intuition have led us to believe that efficient darknets – in global or small-worlds form — will remain a fact of life. In this section we examine rights-management technologies that are being deployed to limit the introduction rate or decrease the rate of diffusion of content into the darknet.
3.1 Conditional Access Systems
A conditional-access system is a simple form of rights-management system in which subscribers are given access to objects based (typically) on a service contract. Digital rights management systems often perform the same function, but typically impose restrictions on the use of objects after unlocking.
Conditional access systems such as cable, satellite TV, and satellite radio offer little or no protection against objects being introduced into the darknet from subscribing hosts. A conditional-access system customer has no access to channels or titles to which they are not entitled, and has essentially free use of channels that he has subscribed or paid for. This means that an investment of ~$100 (at time of writing) on an analog video-capture card is sufficient to obtain and share TV programs and movies. Some CA systems provide post-unlock protections but they are generally cheap and easy to circumvent.
Thus, conditional access systems provide a widely deployed, high-bandwidth source of video material for the darknet. In practice, the large size and low cost of CA-provided video content will limit the exploitation of the darknet for distributing video in the near-term.
The same can not be said of the use of the darknet to distribute conditional-access system broadcast keys. At some level, each head-end (satellite or cable TV head-end) uses an encryption key that must be made available to each customer (it is a broadcast), and in the case of a satellite system this could be millions of homes. CA-system providers take measures to limit the usefulness of exploited session keys (for example, they are changed every few seconds), but if darknet latencies are low, or if encrypted broadcast data is cached, then the darknet could threaten CA-system revenues.
We observe that the exposure of the conditional access provider to losses due to piracy is proportional to the number of customers that share a session key. In this regard, cable-operators are in a safer position than satellite operators because a cable operator can narrowcast more cheaply.
3.2 DRM Systems
A classical-DRM system is one in which a client obtains content in protected (typically encrypted) form, with a license that specifies the uses to which the content may be put. Examples of licensing terms that are being explored by the industry are “play on these three hosts,” “play once,” “use computer program for one hour,” etc.
The license and the wrapped content are presented to the DRM system whose responsibility is to ensure that:
The client cannot remove the encryption from the file and send it to a peer,
The client cannot “clone” its DRM system to make it run on another host,
The client obeys the rules set out in the DRM license, and,
The client cannot separate the rules from the payload.
Advanced DRM systems may go further.
Some such technologies have been commercially very successful – the content scrambling system used in DVDs, and (broadly interpreted) the protection schemes used by conditional access system providers fall into this category, as do newer DRM systems that use the internet as a distribution channel and computers as rendering devices. These technologies are appealing because they promote the establishment of new businesses, and can reduce distribution costs. If costs and licensing terms are appealing to producers and consumers, then the vendor thrives. If the licensing terms are unappealing or inconvenient, the costs are too high, or competing systems exist, then the business will fail. The DivX “DVD” rental model failed on most or all of these metrics, but CSS-protected DVDs succeeded beyond the wildest expectations of the industry.
On personal computers, current DRM systems are software-only systems using a variety of tricks to make them hard to subvert. DRM enabled consumer electronics devices are also beginning to emerge.
In the absence of the darknet, the goal of such systems is to have comparable security to competing distribution systems – notably the CD and DVD – so that programmable computers can play an increasing role in home entertainment. We will speculate whether these strategies will be successful in the Sect. 5.
DRM systems strive to be BOBE (break-once, break everywhere)-resistant. That is, suppliers anticipate (and the assumptions of the darknet predict) that individual instances (clients) of all security-systems, whether based on hardware or software, will be subverted. If a client of a system is subverted, then all content protected by that DRM client can be unprotected. If the break can be applied to any other DRM client of that class so that all of those users can break their systems, then the DRM-scheme is BOBE-weak. If, on the other hand, knowledge gained breaking one client cannot be applied elsewhere, then the DRM system is BOBE-strong.
Most commercial DRM-systems have BOBE-exploits, and we note that the darknet applies to DRM-hacks as well. The CSS system is an exemplary BOBE-weak system. The knowledge and code that comprised the De-CSS exploit spread uncontrolled around the world on web-sites, newsgroups, and even T-shirts, in spite of the fact that, in principle, the Digital Millennium Copyright Act makes it a crime to develop these exploits.
A final characteristic of existing DRM-systems is renewability. Vendors recognize the possibility of exploits, and build systems that can be field-updated.
It is hard to quantify the effectiveness of DRM-systems for restricting the introduction of content into the darknet from experience with existing systems. Existing DRM-systems typically provide protection for months to years; however, the content available to such systems has to date been of minimal interest, and the content that is protected is also available in unprotected form. The one system that was protecting valuable content (DVD video) was broken very soon after compression technology and increased storage capacities and bandwidth enabled the darknet to carry video content.
The DRM-systems described above can be used to provide protection for software, in addition other objects (e.g. audio and video). Alternatively, copy protection systems for computer programs may embed the copy protection code in the software itself.
The most important copy-protection primitive for computer programs is for the software to be bound to a host in such a way that the program will not work on an unlicensed machine. Binding requires a machine ID: this can be a unique number on a machine (e.g. a network card MAC address), or can be provided by an external dongle.
For such schemes to be strong, two things must be true. First, the machine ID must not be “virtualizable.” For instance, if it is trivial to modify a NIC driver to return an invalid MAC address, then the software-host binding is easily broken. Second, the code that performs the binding checks must not be easy to patch. A variety of technologies that revolve around software tamper-re
Cite This Work
To export a reference to this article please select a referencing style below: