“Breaking Up a Digital Monopoly”
Communications of the ACM, June 2023, Vol. 66 No. 6, Pages 38-41
By Micah D. Beck, Terry R. Moore
“How to decompose a vertically integrated digital monopoly to enable competitive services based on a shared data structure.”
The dominating power of today’s global data monopolies, most prominently Google, Facebook, and Amazon, has alarmed people around the world. Governments are seeking ways to rein in such monopolies and establish reasonable conditions for competition in the services they offer. Their business models (such as targeted advertising) also raise major issues of personal security and privacy.⁹ Thus, measures that control their tendencies toward monopoly may help to address the threats they pose to civil and political liberties. In this Viewpoint, we propose a regulatory strategy that addresses the naturally monopolistic nature of these services by isolating the core functions of acquired data collection and management. Acquired data is data derived from the discourse of society at large, so the public retains a legitimate ownership interest in it. As described in this Viewpoint, our proposal requires companies to compete by innovation rather than through monopolistic control over data.
The dominant data platforms can arguably be characterized as natural monopolies within their respective types of service. According to Richard Posner’s classic account, “If the entire demand within a relevant market can be satisfied at lowest cost by one firm rather than by two or more, the market is a natural monopoly, whatever the actual number of firms in it.” He goes on to say that in such a market the firms will tend to “… shake down to one through mergers or failures.” Among producers, the costs of entry, such as necessary infrastructure investment, lead to large economies of scale when there are few producers, and this tends to give an advantage to the largest supplier in an industry. This phenomenon is captured by the concept of subadditivity, which is the basis for the modern theory of natural monopoly (see the accompanying sidebar). Among consumers, services that benefit from strong network effects also tend to dominate over time. Familiar examples of natural monopolies include public utilities such as water services, the electricity grid, and telecommunications.
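The subadditivity condition can be made concrete with a toy cost function. The sketch below assumes a hypothetical cost of the form C(q) = F + c·q, with a fixed entry cost F and a constant marginal cost c; the numbers are illustrative and not drawn from any real market. It checks, for one particular split of demand, that a single firm serving the whole market is cheaper than two firms sharing it.

```python
def cost(quantity: float, fixed_cost: float = 100.0, marginal_cost: float = 2.0) -> float:
    """Hypothetical cost of producing `quantity` units: fixed entry cost plus constant marginal cost."""
    if quantity <= 0:
        return 0.0  # a firm that produces nothing never enters and pays nothing
    return fixed_cost + marginal_cost * quantity


def is_subadditive_split(total: float, share: float) -> bool:
    """True if one firm producing `total` is cheaper than two firms splitting it by `share`."""
    q1 = total * share
    q2 = total - q1
    return cost(total) < cost(q1) + cost(q2)


# One firm serving 50 units versus two firms serving 25 units each.
single = cost(50)            # 100 + 2 * 50
split = cost(25) + cost(25)  # 2 * (100 + 2 * 25)
print(single, split, is_subadditive_split(50, 0.5))
```

In this toy model the second firm duplicates the fixed entry cost without lowering anyone's marginal cost, which is the intuition behind treating high entry costs as a driver of natural monopoly.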
In the case of the acquired data monopolies that societies struggle with today, there is a second structural factor. Such services are based on the collection and management of information intended for a wide or unrestricted audience. They reap huge profits by exploiting a remarkable ability to monetize this data at scale. We call the digital content that these platforms build on “acquired data” to indicate that it is either collected from unrestricted public sources (for example, Web pages or street cameras) or provided by users who relinquish ownership in order to have it managed and distributed to others. We introduce the term acquired data to help distinguish it from surveillance data, which is collected from users without their explicit consent or agreement. An example of surveillance data is user information derived from keystrokes during data entry or from tracking of online behavior using third-party cookies. A third category is inferred data, which can be derived from published content through statistical correlation. An example of inferred data is the identity of the author of an anonymous article, determined from their use of words. Importantly, and in contrast to information collected solely through surveillance or inference, sharing what we term acquired data does not necessarily raise privacy or security issues, since it is by definition either intentionally made public or submitted for publication.
We contend that the public retains a legitimate ownership interest in acquired data even when users have assigned their rights to the distributor. This is akin to the idea that a contract entered into under duress is not necessarily enforceable. When the means of distribution is monopolistically controlled, users who have few other means to express themselves publicly may be coerced into accepting unfair conditions. Moreover, while there is no explicit cost to users who hand over their content, there are implicit ones. One implicit cost is the required juxtaposition of content with advertisements sold by the distribution service. Another may be that access is provided only within a pay-to-play “walled garden.”
Our proposal rests on the notion that distribution of public discourse and other acquired content serves the common good of the community of content providers and consumers. Treating it as a private asset of the distribution service does not. The main goal of decomposing acquired data monopolies is to ensure the distribution of such content at reasonable cost, both implicit and explicit. This may require overcoming the naturally monopolistic nature of such a service through regulation.
We argue that one important aspect of the tendency to natural monopoly among acquired data services is the generic nature of the acquired data structure created in the collection and management phase. Multiple Web crawlers indexing the same publicly visible Web pages, or cameras surveying the same streets, will generate broadly equivalent data structures. Another factor is the cost of entry into the market for this phase. In the case of Web search, for example, the output of the content acquisition phase accumulates in a data structure called a “search index,” built by Web crawlers. Google reports that its search index has on the order of 10^18 entries, which is certainly many petabytes in storage size, making the cost of creation, maintenance, and access a barrier to entry. Similarly, Facebook builds a massive “social graph” from contributed posts and other interactions with members who voluntarily exchange their “content” for unspecified distribution services. In the case of a social media service like Facebook, another salient factor lies in the network effects of having a single large social media provider.
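A back-of-the-envelope calculation suggests why an index of this scale is a barrier to entry. The sketch below assumes an index on the order of 10^18 entries and a few hypothetical average sizes per entry; the per-entry sizes are assumptions chosen for illustration, not reported figures.

```python
ENTRIES = 10**18             # assumed order of magnitude for the number of index entries
BYTES_PER_PETABYTE = 10**15  # decimal petabyte


def index_size_petabytes(entries: int, bytes_per_entry: int) -> float:
    """Raw storage in petabytes, ignoring compression, replication, and overhead."""
    return entries * bytes_per_entry / BYTES_PER_PETABYTE


# Hypothetical per-entry sizes; real index layouts are not public in this detail.
for bytes_per_entry in (1, 8, 64):
    size = index_size_petabytes(ENTRIES, bytes_per_entry)
    print(f"{bytes_per_entry} B/entry -> {size:,.0f} PB")
```

Even at a single byte per entry the raw total is about a thousand petabytes, before any replication for availability or query throughput.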
Historically, companies faced with a standardized alternative to their core service have responded defensively. Adopting a more generic data management layer that can support competitive forms of distribution turns their proprietary asset into a commodity. Classical examples are the telephone system and local area network vendors. In these cases, standards such as POSIX and the Internet were initially dismissed and their adoption was refused. Some famous cases resulted in the near or complete extinction of the companies that resisted them. In other cases, these standards were adopted, or similar proprietary versions were created to maintain market segmentation. The largest corporations in the world now base their information ecosystems on such standards.
Our proposal is to break up acquired data monopolies vertically, leveraging the hourglass design principle to create durable but weak common services that can be regulated as public utilities. Whether our society can come to terms with social media and acquired data depends on our willingness to assert a public interest in public discourse, as well as the slow but hopefully inexorable advantage of scalable standards over monopolistic strategies.
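The vertical decomposition can be pictured as a narrow shared interface with competing services layered above it. The following is a minimal sketch under stated assumptions, not a design taken from the proposal: the names `SharedIndex`, `InMemoryIndex`, `oldest_first`, and `newest_first` are hypothetical, and the single `lookup` operation stands in for whatever deliberately weak common service a regulator might define.

```python
from typing import Protocol


class SharedIndex(Protocol):
    """Hypothetical narrow 'waist': the only operation competing services rely on."""

    def lookup(self, term: str) -> list[str]: ...


class InMemoryIndex:
    """Stand-in for a regulated common service that holds acquired data."""

    def __init__(self, documents: dict[str, str]) -> None:
        self._postings: dict[str, set[str]] = {}
        for doc_id, text in documents.items():
            for word in text.lower().split():
                self._postings.setdefault(word, set()).add(doc_id)

    def lookup(self, term: str) -> list[str]:
        # Returns matching document ids in sorted order, with no ranking policy.
        return sorted(self._postings.get(term.lower(), set()))


def oldest_first(index: SharedIndex, term: str) -> list[str]:
    """Competitor A: presents results in ascending id order (ids begin with a year)."""
    return sorted(index.lookup(term))


def newest_first(index: SharedIndex, term: str) -> list[str]:
    """Competitor B: presents the same shared data in descending id order."""
    return sorted(index.lookup(term), reverse=True)


index = InMemoryIndex({
    "2021-a": "public data monopoly",
    "2023-b": "public utilities and data",
})
print(oldest_first(index, "data"))
print(newest_first(index, "data"))
```

Both services read the same underlying structure and differ only in what they build on top of it, which is where competition by innovation would take place in this picture.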
About the Authors:
Micah D. Beck is an associate professor at the Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA.
Terry R. Moore is an associate director (retired) at Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA.