The Big Data Bind

July 18, 2018
Daniel M. Gerstein

The use of genealogy websites to find the alleged Golden State Killer, Cambridge Analytica’s use of Facebook data to develop targeted ads for the 2016 presidential campaign, and the loss of privacy resulting from the sharing of information on social media bring into focus some of the unintended consequences of the collection, storage, and proliferation of personal information. The use of data in novel and unexpected ways pits users’ demand for privacy against their desire to take advantage of the many benefits today’s technology has to offer.

Increasingly, people willingly give up personal information and leave digital footprints that can be used in new and innovative ways, for purposes other than those intended. In some cases, personal privacy and even security are compromised. Personal data is sold as a commodity. Targeted “fake news” and advertisements are generated that infringe on users’ online space. Users’ daily movements and online activities are tracked and become part of digital profiles. Some personal data even makes it to the dark web. As one account puts it, “The sale of stolen personally identifiable information is a growing industry on the dark web.”

For users and policymakers, this sometimes murky digital landscape sets up a series of choices: if too much personal information is surrendered, privacy could be at risk; too little, and the benefits of emerging data technology could be lost. Striking the right balance could be the key to setting policy to maintain security while encouraging the safe deployment of existing technologies and the development of new ones.

By using free services such as Facebook, Twitter, and Google, users enter into grand bargains laid out in “terms of use,” in which they consent to the use of their data for secondary purposes in exchange for access to the sites and the benefits that can be derived from social media and search functions. Such secondary uses may anger average citizens when their personal information is compromised through big data applications, but informed users are willing to make the tradeoff nonetheless.

At the same time, though, promising opportunities could be squandered if regulations are too strict or users become too guarded with their information. For example, future autonomous vehicles – which have the potential to improve safety, ease traffic congestion, reduce the cost of transportation, and alter entire industries and car ownership trends – will likely depend on the constant exchange of location and position data with other vehicles, along with information on passenger requests and preferences and on traffic trends.

Personalized medicine, which could lead to improved quality of life, depends on understanding the relationship between the human genome and the signals that regulate gene expression. It also depends on voluminous data sets comprising hundreds of thousands to millions of human DNA sequences. Armed with such knowledge, researchers could develop targeted drug delivery and perhaps gene therapies that contribute to the elimination of certain diseases.

Smart cities offer opportunities to create efficiencies for governments, industry, and individuals by eliminating redundancies, anticipating requirements, and improving city management strategies. But smart cities also rely on information technology networks that continuously exchange information by linking sensors, flow models, and decision support tools. These networks support improved transportation, improved utility distribution, crime prevention, and traffic control, to name a few applications.

To be clear, concerns about personal data privacy are not new, but they have become more pronounced as the magnitude of the potential loss of privacy comes into view. Congress and several states are taking up the issue and crafting legislation that would impose new regulations on companies that collect user data. The European Union has already acted with its General Data Protection Regulation, which took effect in May 2018 and requires companies to change how they handle personal data as a condition of serving users in the EU.

Understanding the limits on information sharing could be a first step. Identifying what data should be shared and placing limits on how long it can be accessed could be important. Limits on the transmission and use of data by third parties could also be considered.

A balanced approach would be beneficial. To give some hypothetical examples: Autonomous vehicle data could be shared to ensure safety on the road, but not for developing personal profiles. DNA information could be used for assisting in personalized medicine, but not for making insurance coverage decisions. Smart city technologies could provide greater efficiency in people’s daily lives, but not be used to infringe on individual privacy.

Ultimately, limits on collecting, storing, and proliferating personal information could be imposed, but only with an understanding of the full costs and benefits of the use of such information. Limits on unintended uses of personal data should be balanced against the potential societal benefits of using this data. No doubt, difficult issues will be identified and will need to be addressed along the way. The first steps are to:

Understand the acceptable limits of the uses of data; and
Establish mechanisms to ensure that boundaries are not violated.

Given recent trends, the dialogue on these questions should begin in earnest.

Daniel M. Gerstein works at the nonprofit, nonpartisan RAND Corporation. He formerly served as the undersecretary (acting) and deputy undersecretary in the Science and Technology Directorate of the Department of Homeland Security from 2011 to 2014. Gerstein’s latest book, “The Story of Technology: How We Got Here and What the Future Holds,” was published in October 2019.