There was no lack of data security scandals in 2018. Shortly after Facebook admitted to a massive security breach that affected the data of over 50 million users, Google announced that it would shut down its Google Plus service after a software flaw exposed user profiles to outside developers. With companies in every sector racing to acquire and monetize user data, privacy concerns have moved to the center of public scrutiny. How do companies take advantage of user data, and what repercussions of data monetization should users and regulators be aware of?
Selling user data enables technology startups to provide customers with free services. For example, some free mobile apps require access to users’ address books and camera rolls. App developers then sell users’ contacts and photos to online data brokers. These brokers receive data from up to hundreds of mobile apps and website cookies; each provides insight into a different aspect of users’ digital activities. Data brokers then create comprehensive profiles of each consumer and sell these profiles to advertising companies and other interested parties. Unsurprisingly, a famous saying has arisen: “If you don’t have to pay for a product, then you yourself are the product.”
Selling user data is also prevalent among large technology companies. This becomes a particular concern when tech giants combine their databases to probe deeper into consumers’ private lives. According to Bloomberg, Google bought users’ credit card transaction data from Mastercard to track whether users make offline purchases after seeing online ads. Microsoft, Adobe and SAP have also recently announced the Open Data Initiative. Under this initiative, businesses can consolidate the customer data they hold across the three companies’ platforms, which massively expands their knowledge of their consumers.
Users may not even notice that their data is being collected. For example, many shopping malls offer free Wi-Fi services. Even if shoppers don’t connect to the in-store Wi-Fi, as long as Wi-Fi is turned on, their cellphones will constantly send out probe requests to look for available networks nearby. A probe request contains a unique identifier of the cellphone (its MAC address) and can also reveal the names of Wi-Fi networks the cellphone has used in the past. Therefore, by installing in-store Wi-Fi, shopping malls can trace shoppers’ routes by recording which access points picked up each phone and when. London’s subway system once adopted this strategy to track subway traffic during peak hours. In response, many people turned off their Wi-Fi in order not to be tracked. Regardless, the moral of the story is clear: data collection is everywhere.
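To make the mechanism concrete, here is a minimal sketch in Python of how sightings of a phone's identifier at different access points could be stitched into a route. The sightings, the location names and the function name are all hypothetical, and the sketch assumes each phone keeps the same MAC address throughout its visit.

```python
from collections import defaultdict

# Hypothetical log of probe requests heard by in-store access points:
# (MAC address, location of the access point, time in seconds).
sightings = [
    ("aa:bb:cc:00:00:01", "entrance", 100),
    ("aa:bb:cc:00:00:02", "entrance", 110),
    ("aa:bb:cc:00:00:01", "food court", 400),
    ("aa:bb:cc:00:00:01", "food court", 460),
    ("aa:bb:cc:00:00:02", "bookstore", 500),
    ("aa:bb:cc:00:00:01", "electronics", 900),
]


def reconstruct_routes(sightings):
    """Group sightings by MAC address and order them by time.

    Consecutive sightings at the same location are collapsed, so each
    route lists the places a device passed through, in order.
    """
    routes = defaultdict(list)
    for mac, location, _timestamp in sorted(sightings, key=lambda s: s[2]):
        if not routes[mac] or routes[mac][-1] != location:
            routes[mac].append(location)
    return dict(routes)


routes = reconstruct_routes(sightings)
for mac, route in routes.items():
    print(mac, "visited:", " -> ".join(route))
```

Note that the shopper never has to join the network for a route to appear; the identifier alone is enough to link the sightings.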
Companies usually justify their acquisition of private data by claiming they have made users anonymous. They say that user privacy is not violated because all personal data has been pseudonymized, i.e., personally identifiable information (PII) such as names and credit card numbers has been replaced by pseudonyms. However, research suggests that re-identifying individuals from pseudonymized data is not as difficult as people may think.
Let’s begin with an “anonymous” collection of payment data, where each credit card number is represented by a pseudonym (e.g., some randomly generated letters). If we know at least four pieces of information about a person (e.g., he went to a particular grocery store on Monday, a bakery on Tuesday, a shopping mall on Saturday and a pharmacy on Sunday), research shows that there is a 90% chance that we can precisely identify that person, because it is unlikely that someone else would appear in the same four places on the same dates. Consequently, we can deduce this individual’s entire credit card transaction history by searching transactions under the same pseudonym. In the same vein, even if websites and mobile apps pseudonymize their user profiles before selling them to online data brokers, data brokers may find it easy to de-anonymize these profiles and re-identify users’ private information.
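The re-identification steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction records, pseudonyms and function names are invented, and real datasets are far larger and noisier.

```python
from collections import defaultdict

# Hypothetical pseudonymized payment data: (pseudonym, place, day).
transactions = [
    ("xkqz", "grocery store", "Mon"),
    ("xkqz", "bakery", "Tue"),
    ("xkqz", "bookstore", "Wed"),
    ("xkqz", "shopping mall", "Sat"),
    ("xkqz", "pharmacy", "Sun"),
    ("pwlm", "grocery store", "Mon"),
    ("pwlm", "bakery", "Tue"),
    ("pwlm", "gym", "Sat"),
    ("tjrb", "shopping mall", "Sat"),
    ("tjrb", "pharmacy", "Sun"),
]

# The four facts we happen to know about one person.
known_points = {
    ("grocery store", "Mon"),
    ("bakery", "Tue"),
    ("shopping mall", "Sat"),
    ("pharmacy", "Sun"),
}


def matching_pseudonyms(transactions, known_points):
    """Return the pseudonyms whose records contain every known point."""
    visits = defaultdict(set)
    for pseudonym, place, day in transactions:
        visits[pseudonym].add((place, day))
    return sorted(p for p, seen in visits.items() if known_points <= seen)


def transaction_history(transactions, pseudonym):
    """Return every (place, day) recorded under one pseudonym."""
    return [(place, day) for p, place, day in transactions if p == pseudonym]


matches = matching_pseudonyms(transactions, known_points)
if len(matches) == 1:
    # A single match means the pseudonym no longer hides the person:
    # the whole history under it can now be attributed to them.
    print(matches[0], transaction_history(transactions, matches[0]))
```

With only one known point, such as the Monday grocery trip, two pseudonyms still match in this toy data; each additional point narrows the set of candidates, which is why a handful of points can be enough.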
Although data protection remains a tough battle, legislative deliberations on privacy have been gradually catching up. In 2012, Congress successfully stopped Apple’s app developers from sharing users’ unique device identification numbers (UDID) with third parties. In recent months, proposals to strengthen data protection have emerged from both sides of the aisle. Democratic Congressman Ro Khanna proposed an Internet Bill of Rights that covers a wide range of topics, including net neutrality and the “right to be forgotten.” Republican Senator John Thune, chairman of the Senate Committee on Commerce, Science, and Transportation, openly called for putting consumer data protection into law. Still, there is currently no single, overarching federal law that lays the foundation for nationwide personal data regulation. Instead, industry-specific laws and regulations govern data practices for each sector. Various federal and state laws proposed under different Congressional terms sometimes contradict each other, creating additional challenges for law enforcement.
We also wait to see what role the Supreme Court will play in data protection. In City of Ontario v. Quon, then-Justice Anthony Kennedy explained why the Supreme Court refrained from drawing sweeping conclusions in technology cases. He noted that the role of technology and the corresponding social expectation of privacy were still rapidly evolving, so the Court should act with caution when it comes to technology. Will the Supreme Court continue to show restraint in the years to come? Everyone is waiting to find out.