A Radical Proposal for Protecting Privacy: Halt Industry’s Use of ‘Non-Content’
Editor’s Note: An expanded version of this article is now available in the Colorado Technology Law Journal as “Reversing Privacy Risks: Strict Limitations on the Use of Communications Metadata and Telemetry Information.”
Ten years ago, when the Guardian story broke that the National Security Agency (NSA) was collecting domestic communications metadata in bulk, the public expressed grave consternation. Then-President Obama sought to calm the situation by explaining, “Nobody is listening to your telephone calls,” but that did not help matters. What Obama didn’t address was that the NSA didn’t need to listen to the content of people’s phone calls to learn a great deal about them. Even without content, communications metadata can reveal a person’s interests, activities, and behaviors. Comments by former NSA General Counsel Stewart Baker a few months later gave the game away. “Metadata absolutely tells you everything about somebody’s life,” he said. “If you have enough metadata you don’t really need content …. [It’s] sort of embarrassing how predictable we are as human beings.”
The initial U.S. government response to the Snowden disclosures was to hold on to the bulk metadata program. Two reports, however, recommended ending it: the 2013 presidential review group report “Liberty and Security in a Changing World” and the 2014 Privacy and Civil Liberties Oversight Board report on the Section 215 telephone records program. Both proposed that the government instead seek such records from service providers under already-existing legal authorities. The 2015 USA FREEDOM Act implemented that proposal. But within several years, the NSA found problems in its collection and deleted three years’ worth of records. The new program did not survive, and no attempt was made to revive the old one: By the late 2010s, the NSA had concluded that bulk metadata collection was not worth the effort and, despite Trump administration interest in continuing it, ended the program.
The failure of this surveillance program does not mean that call detail records (CDRs), which list the caller, callee, time, date, and duration of each call, lack value. Shortly after the Snowden disclosures, three Stanford researchers showed that CDRs combined with publicly available information could reveal the nature of a person’s heart problems, an intent to start growing marijuana, and much else that is personal. That social connections, whether traced through CDRs, email, or other communications, are remarkably revelatory is not surprising. Indeed, such connections are much of how social media companies determine who your friends are and what you are likely to “like.”
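To make concrete how little it takes, here is a minimal sketch, not the Stanford team’s actual method, of mining CDRs; the record format, names, and numbers are all hypothetical:

```python
# A minimal sketch of CDR mining. Every field name and record below is
# hypothetical; no content of any call is used, only who called whom.
from collections import Counter
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CDR:
    caller: str
    callee: str      # in practice a number, resolvable via public directories
    start: datetime
    duration_s: int

records = [
    CDR("alice", "cardiology-clinic", datetime(2023, 5, 1, 9, 15), 540),
    CDR("alice", "pharmacy", datetime(2023, 5, 1, 11, 2), 180),
    CDR("alice", "cardiology-clinic", datetime(2023, 5, 8, 9, 10), 420),
]

# Repeated contacts, joined with public knowledge of who owns each number,
# sketch a medical history without a single word of content.
contacts = Counter(r.callee for r in records)
print(contacts.most_common())  # [('cardiology-clinic', 2), ('pharmacy', 1)]
```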
The Snowden disclosures revealed government use of communications metadata. But the extent to which the private sector uses both communications metadata (the CDRs of phone calls and the packet header information of internet communications) and software and device telemetry (information from smartphones and other devices reporting details about the functioning of software and sensors) is largely unknown to consumers. Users have little idea that information whose collection and use they do not control can reveal highly personal details. In the just-published “Reversing Privacy Risks: Strict Limitations on the Use of Communications Metadata and Telemetry Information,” Patricia Vargas Leon and I propose some radical limits on private-sector collection and use of communications metadata and software and device telemetry. It’s time to restrict the use of this data to the purposes for which it was collected, along with a handful of others, such as tracking people’s locations during public emergencies (cell site location information is extremely useful for determining where groups of people have fled during an earthquake or other disaster).
Private-sector collection of communications metadata is over a century old, dating to when AT&T began storing data on customer calls. This precursor of CDRs was used for billing. AT&T collected other data too: volume of traffic served and traffic denied, usage of trunk systems, and time to obtain a dial tone. The company used this information for such purposes as improving quality of service and planning for future business needs.
Communications metadata could serve other purposes as well. With the arrival of cell phones in the 1990s, AT&T found a new use for CDRs: helping to track fraudulent activity. By then, AT&T was not the only phone company, and other firms saw other opportunities in the data. MCI, for example, began offering customers discounts based on calling patterns it learned from their CDRs. Seemingly innocuous from the vantage point of 2023, such use of personal data marked a radical departure in service providers’ business practices. Private information collected for the delivery of content began to be used not to deliver contracted services but for companies’ own commercial purposes. Yet no one asked users whether the companies could take the information needed to connect their calls and then use it to market cheaper rates for calls to those same friends and family.
The 1990s marked the beginning of rapid change in communication technologies, including the shift to cell phones and, later, smartphones. Widespread adoption of the latter meant that internet service providers (ISPs), operating systems, websites, apps, and advertisers could learn a user’s location while simultaneously tracking their interests, and even what the user was doing at that moment. Smartphones carry sensors, such as accelerometers, gyroscopes, magnetometers, and proximity detectors, that enable the device to properly orient its screen, provide real-time location on a mapping application, and so on. But when sensor information leaves the device, ISPs, operating systems, websites, apps, and advertisers can, depending on the data collected, infer user activity: how a user is traveling (walking, biking, in a car or bus, on a train), whether two users are in the same vehicle, whether two users use the same networks (an indication of frequenting the same locations, possibly including work or home), and the like. The result is highly targeted advertising: By aggregating information from multiple apps and advertisers, data brokers know who a user is, what they’re doing, and much, much more.
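How much sensor data does such an inference need? Not much. The sketch below, with thresholds and samples that are purely illustrative rather than drawn from any actual advertising system, guesses a mode of travel from accelerometer readings alone:

```python
# Illustrative only: classify motion from the variance of acceleration
# magnitude (in g). Rhythmic footfalls make walking "noisy"; a car at
# cruising speed is comparatively smooth. Thresholds are hypothetical.
import statistics

def classify_motion(accel_magnitudes_g, walk_var=0.05, vehicle_var=0.005):
    var = statistics.variance(accel_magnitudes_g)
    if var > walk_var:
        return "walking"
    if var > vehicle_var:
        return "vehicle"
    return "stationary"

# One second of made-up samples at 10 Hz:
print(classify_motion([1.0, 1.3, 0.8, 1.4, 0.7, 1.2, 0.9, 1.3, 0.8, 1.1]))
# -> "walking"
```

Real systems are far more sophisticated, fusing gyroscope and magnetometer streams as well, but the principle is the same: motion sensors alone betray activity.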
These shifts helped usher in the current age of “surveillance capitalism” and its ubiquitous targeted advertising. What’s slipped in without public acknowledgement or understanding is this use of so-called non-content: communications metadata and software and device telemetry. Few consumers—or policy- or lawmakers—realize the extent of personal information revealed through this data that users unknowingly supply. The result is a privacy invasion that users have no ability to control.
When a user shares a news story on Facebook, views a video on TikTok, types a destination into a mapping application, or enters the dates of her period into a fertility app, she knows she is sharing information that can be used to create a portrait of who she is, how she spends her time, and even whether she would be likely to pay back a loan. Thoughtfully or not, she is consciously trading personal information for a service. The same is not the case for metadata and telemetry. Short of not communicating, the user has no control over what metadata accompanies a call or internet transaction. As I noted, that information can be revelatory indeed. If a user is making a Voice over IP (VoIP) call (a call in which the entire communication is carried over an IP network rather than cellular or traditional phone lines), the carrier may learn some of the content of the call through the communications metadata even though the communication itself is encrypted. And aggregating communications metadata can reveal a remarkable amount about a user’s activities and interests. In 2021, a Federal Trade Commission staff report noted that the information ISPs collect could be shared with “property managers, bail bondsmen, bounty hunters, or those who would use it for discriminatory purposes.”
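To illustrate the VoIP point with one hedged example: many voice codecs stop or shrink packets during silence, so an observer who sees only encrypted packets’ arrival times can recover who spoke when. The sketch below assumes such silence suppression; the timestamps are invented:

```python
# Hypothetical illustration: recover the rhythm of a conversation from
# packet arrival times alone, assuming the codec suppresses packets
# during silence. No decryption of any payload is involved.
def talk_spurts(packet_times_s, gap_s=0.3):
    """Group packet arrival times into spurts of continuous speech."""
    spurts, start, prev = [], packet_times_s[0], packet_times_s[0]
    for t in packet_times_s[1:]:
        if t - prev > gap_s:          # a gap means the speaker went quiet
            spurts.append((start, prev))
            start = t
        prev = t
    spurts.append((start, prev))
    return spurts

# One packet every 20 ms, a one-second pause, then more speech:
times = [i * 0.02 for i in range(50)] + [2.0 + i * 0.02 for i in range(25)]
print(talk_spurts(times))  # roughly [(0.0, 0.98), (2.0, 2.48)]
```

Published research has gone considerably further, recovering spoken languages and even phrases from the sizes of encrypted VoIP packets.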
Many users know to shut off the collection of GPS location data if they want their destination private. But few users are aware that data from a combination of the accelerometers, gyroscopes, and magnetometers on their smartphones can still track them not only to the medical building, but also inside, revealing whether they went to the dermatologist’s office or the abortion clinic. Or that those technologies could also reveal whether two people left the bar together and ended up in the same bedroom later in the evening.
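The technique such tracking relies on is pedestrian dead reckoning: counting steps with the accelerometer and taking headings from the magnetometer and gyroscope. A minimal sketch, with a hypothetical stride length and invented step data:

```python
# Pedestrian dead reckoning, greatly simplified: advance a position from
# step counts and compass headings, no GPS involved. All inputs invented.
import math

def dead_reckon(segments, stride_m=0.7, start=(0.0, 0.0)):
    """segments: list of (step_count, heading_degrees) pairs."""
    x, y = start
    for steps, heading_deg in segments:
        h = math.radians(heading_deg)
        x += steps * stride_m * math.sin(h)  # east-west component
        y += steps * stride_m * math.cos(h)  # north-south component
    return x, y

# Through the lobby, left down a corridor, stop at a particular door:
print(dead_reckon([(20, 0), (15, 270), (5, 0)]))
# ~ (-10.5, 17.5): about 10.5 m west and 17.5 m north of the entrance
```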
It’s hypothetically possible, but are communications metadata and software and device telemetry actually used this way? Companies keep the methods they use to target ads close to the chest, but there are strong hints that metadata and telemetry play a role. In 2020, the Norwegian Consumer Council contracted with a cybersecurity company, Mnemonic, to examine the data flows from 10 popular smartphone apps. Mnemonic found that MyDays, an app women use to track their periods, passed “detailed GPS location data, WiFi access point data, cell tower data, and Bluetooth properties” to the data collector Places, allowing Places to determine a phone’s location to a particular floor within a building.
Invasive? Without question. Potentially harmful? Absolutely. And it is not just apps collecting potentially harmful information. The FTC report raised concerns about the data that ISPs were collecting on their users. Vargas Leon and I also found a number of industry patents that use smartphone metadata and telemetry to do such things as monitor a user’s network accesses to recommend social networks (AT&T Mobility), monitor their location and activity to know when to send them an update (eBay), or track who travels near them day after day to recommend exchanging contact information with “someone you may know” (Facebook).
The Netherlands has already acted on a related problem. A 2018 data protection impact assessment found that the telemetry in an enterprise version of Microsoft Office used by the Dutch government included personal data, with no documentation of what was collected and no way for users to control it. The government informed Microsoft that it would stop using the product unless the situation was rectified, and Microsoft responded by adding user controls. The government is acting similarly on educational software, observing that compliance with Dutch requirements also means compliance with General Data Protection Regulation (GDPR) requirements. Some U.S. high-tech companies are complying with the Dutch requirements.
But the solution that worked for educational software in the Netherlands is unlikely to be effective for smartphone metadata and telemetry. The problem is that the usual method of protecting user privacy, reliance on the Fair Information Practice Principles of notice of data collection and choice regarding that collection, will not work here. First, each communication, whether downloading a web page, sending an email or a text, or conducting a VoIP call, consists of multiple packets whose communications metadata enables delivery of the content. If the user were to prevent other uses of that metadata, she would have to do so for each and every communication. We know how that story ends: Instead of weighing each communication’s metadata use individually, the consumer will quickly default to approving all uses. The second, and more problematic, issue is the consumer’s inability to understand how metadata and telemetry can build a profile of her. After all, knowing which way a phone is facing or how quickly it is changing direction does not sound like particularly private information. But that, plus information from other sensors, can reveal where the user is, what she’s doing, whether there is another person with her, and even who that person is.
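The first point is easy to see at the packet level. Every IPv4 packet exposes its header in the clear even when the payload is encrypted; the sketch below hand-builds a 20-byte header (using reserved documentation addresses) and reads back what any carrier on the path can see:

```python
# Build and parse a bare IPv4 header. The addresses come from the
# reserved documentation ranges; this is an illustration, not a capture.
import struct

IPV4_FMT = "!BBHHHBBH4s4s"  # fixed 20-byte header layout
header = struct.pack(
    IPV4_FMT,
    0x45,                     # version 4, header length 5 words
    0, 40, 0x1234, 0,         # type of service, total length, id, flags/fragment
    64, 6, 0,                 # TTL, protocol (6 = TCP), checksum (unset here)
    bytes([192, 0, 2, 10]),   # source address
    bytes([198, 51, 100, 7])  # destination address
)

_, _, length, _, _, ttl, proto, _, src, dst = struct.unpack(IPV4_FMT, header)
print(".".join(map(str, src)), "->", ".".join(map(str, dst)),
      "protocol", proto, "length", length)
# Who talked to whom, over what protocol, and how much data moved:
# visible to every network on the path, no decryption required.
```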
Vargas Leon and I recommend much stronger action. Following the spirit of consumer protection laws such as those requiring that cars have seatbelts, we urge that regulations or legislation limit, with narrow exceptions, the uses of metadata and telemetry information to the purposes for which they were designed: delivery of content and a better user experience on the device (or, in the case of augmented or virtual reality, for only those purposes off the device). We recommend allowing use for investigating fraud; for ensuring security, including device and user identification for security purposes only; and for modeling to understand future business needs. These purposes are analogous to the business uses to which AT&T put metadata before the 1990s. We would then allow two further purposes. First, for a limited period during a public health emergency, such data could be used to report public movement in aggregate, as sketched below. Second, the data could be used for public or peer-reviewed research projects in the public interest, such as urban planning, with appropriate de-identification so that personal information is not exposed.
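What might that aggregate-only emergency reporting look like? Our article does not prescribe a mechanism, but one hedged sketch, with a hypothetical suppression threshold, is to release movement counts only for areas where enough distinct devices are present:

```python
# Hypothetical aggregate-only reporting: count distinct devices per area
# and suppress any area below a minimum group size, so that no
# individual's movement can be singled out.
def aggregate_movement(pings, min_group=50):
    """pings: iterable of (device_id, area) pairs -> {area: device count}."""
    devices_per_area = {}
    for device_id, area in pings:
        devices_per_area.setdefault(area, set()).add(device_id)
    return {area: len(devs) for area, devs in devices_per_area.items()
            if len(devs) >= min_group}

# Where people fled after an earthquake, reported in aggregate only:
pings = [(f"dev{i}", "shelter-north") for i in range(120)] + \
        [(f"dev{i}", "shelter-east") for i in range(30)]
print(aggregate_movement(pings))  # {'shelter-north': 120}; small group suppressed
```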
These recommendations may sound radical, but that’s only the case if one’s vision of reality is limited to the past two decades. What has actually been radical is the past 20 years of private-sector acquisition and use of data that individuals do not knowingly supply and whose utilization consumers are unable to control. Our proposal is carefully crafted; it does not dismantle the internet advertising economy but, instead, limits itself to the situation where users are unable to prevent use. Thus, we are proposing a way of resetting the privacy balance so that consumers might actually exercise meaningful privacy control over data they cannot help supplying.
There are a number of ways to go about this. One good vehicle would be greater action by the FTC, which has shown interest in this direction with its report on ISPs. That route carries risks: To be effective, and to avoid a U-turn under a later administration, legislative action would need to bolster the agency’s authority and budget on privacy issues. Another route would be the American Data Privacy and Protection Act, currently languishing in Congress but with interest on both sides of the aisle.
Privacy is essential for life, liberty, and the pursuit of happiness. It is time to restore it to Americans.