The digital world runs on data. In almost every industry, artificial intelligence (AI) now sits at the center of information processing, fueling everything from predictive analytics to smart automation. AI has matured to the point where it can churn through mountains of data to recognize patterns, draw connections and even make judgment calls that previously required human input. But this leap forward hasn’t come without cost — particularly to privacy.

Today’s AI systems aren’t just automating workflows or summarizing data. They’re capable of something deeper: data entanglement. That’s when AI takes data from multiple, disparate sources, pulls it all together and draws out insights that aren’t apparent in any single dataset alone. The integration may be automated or manual. The AI tool may “pull” data from sources you have authorized it to access, or it may grab data from “public” sources. Often this “public” data was created long before the capabilities of AI were well known. For example, putting up a “public” photo on MySpace (ask your grandparents) does not imply consent to feed that photo into facial recognition or AI deepfake generation systems.

Data integration is where things get legally complicated. Privacy regulations have long relied on the idea that data exists in isolated “silos,” each with its own protections, but AI smashes through those boundaries. What happens when privacy laws built for isolated datasets meet AI that thrives on bringing it all together?

Data Entanglement: The Erosion of Privacy in the Digital Age

First, let’s define data entanglement. Simply put, data entanglement happens when data from multiple sources gets mixed together in AI systems, making it impossible to isolate or delete any one piece without tearing the whole model apart. Consider a mundane example: I hate generating legal bills. I have to either record everything I did on a particular day or try to remember what I did a week or a month ago. In reconstructing what I did (including what I spent, where I went, who I met with, etc.), I rely on digital data stored in multiple locations. Say you’re using an AI system to generate legal invoices. This AI might need access to everything from your call logs, emails and document edits to your GPS location data, even picking up data from wearable devices like smart glasses or health monitors. Each of these sources represents a “silo” under privacy law, but as the AI pulls them together, they stop existing as isolated entities.
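To make the mechanics concrete, here is a minimal sketch in Python. The field names and sample records are hypothetical, but it shows how records that live in separate silos can be cross-referenced into a single billing profile, and why the silo boundaries stop mattering once that happens.

```python
# A purely illustrative sketch of "data entanglement": records that live in
# separate silos (call logs, calendar events, GPS pings) get merged into one
# combined profile for invoice generation. Field names and data are hypothetical.
from datetime import date

# Three "silos," each governed by its own access rules in theory.
call_logs = [{"day": date(2024, 5, 6), "client": "Acme", "minutes": 22}]
calendar_events = [{"day": date(2024, 5, 6), "client": "Acme", "event": "deposition prep"}]
gps_pings = [{"day": date(2024, 5, 6), "place": "Acme HQ", "hours_on_site": 3.5}]

def build_billing_profile(day):
    """Cross-reference every silo for a given day into a single entry."""
    profile = {"day": day}
    for record in call_logs + calendar_events + gps_pings:
        if record["day"] == day:
            profile.update({k: v for k, v in record.items() if k != "day"})
    return profile

# The merged entry now reveals more than any single silo did on its own:
# who you met, where you were, and for how long.
print(build_billing_profile(date(2024, 5, 6)))
```

Nothing in that merged profile remembers which silo a given field came from, which is the whole point of the entanglement problem.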

The beauty of AI is that it can analyze data from each source, cross-reference it and produce conclusions that a human analyst might not think to draw. The danger, however, is that once all this data is entangled, the privacy rights tied to each original data silo become ineffective. Take the GDPR, for instance: one of its core principles is data minimization (Article 5(1)(c)), which mandates that data controllers collect only what’s “necessary” for a specified purpose. But when you give AI access to such varied data, that principle quickly breaks down. AI thrives on data diversity, not minimalism, and the broader and deeper the data pool, the more valuable the AI becomes.

The very structure of privacy law presumes data will stay in its lane. It assumes that a phone record stays in a phone record database, an email in an email server, and a GPS log in a mapping service. But AI connects these dots, constructing detailed personal profiles that are much harder to protect — and that’s before we even consider self-surveillance.

Self-Surveillance: How Our Gadgets Turn Us Into Data Generators

Modern technology has transformed each of us into walking data sources. Every time we strap on a smartwatch, enable location tracking or interact with a smart home device, we’re collecting and contributing to the enormous pool of data that feeds AI. The more we integrate wearable devices, smart assistants, and IoT systems into our lives, the more personal information becomes available. By default, this data can be compiled, cross-referenced and analyzed, enabling AI systems to build an intricate view of our actions, habits and routines. In effect, we are conducting surveillance on ourselves.

The law has tried to adapt to surveillance, but it’s still in the dark ages when it comes to self-surveillance. Privacy statutes, including the GDPR, try to limit data collection by requiring companies to get consent and be transparent about data usage. Yet, as individuals choose to use these devices, they often consent without understanding the potential for data aggregation. Once that data is fed into AI systems, it’s fair game for integration with other data streams. This moves us toward a situation where privacy becomes theoretical. You may have agreed to let your fitness tracker collect health data, but you never consented to the inferences an AI might draw when that health data combines with your browsing history or purchase patterns.

This dilemma is most visible in the right to be forgotten (GDPR Article 17), which promises individuals the ability to delete their data. But with AI-driven data entanglement, that right becomes elusive. Once data is absorbed and intertwined with other data in an AI model, separating or deleting it is practically impossible. This isn’t a minor oversight — it’s a fundamental failure of the current privacy framework to deal with what AI can do.

Purpose Limitation: AI’s Shifting Goals and the Trouble With Transparency

The GDPR and CCPA emphasize purpose limitation as a way to prevent data misuse. Under purpose limitation (GDPR Article 5(1)(b)), data controllers are supposed to use data only for its specified, original purpose. Yet AI doesn’t operate with a singular purpose in mind; it’s built to evolve based on new data, finding connections as they emerge. This means the original purpose of data processing is rarely static. For instance, an AI trained to analyze employee productivity data might later be repurposed to assess broader organizational behaviors, utilizing the same data to draw insights far beyond its original scope.
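A rough sketch of why this is so hard to enforce technically: in the toy Python below (hypothetical field names and records), the declared purpose is nothing more than a label attached at collection time, and nothing in the system prevents the same records from feeding a second, unanticipated analysis later.

```python
# Hedged illustration of purpose limitation's weakness: the "purpose" tag is
# just metadata. Both functions read the same records; no technical barrier
# keeps the data inside its originally declared scope.
employee_records = [
    {"name": "A. Rivera", "emails_sent": 140, "badge_hours": 44, "purpose": "productivity review"},
    {"name": "B. Chen",   "emails_sent": 95,  "badge_hours": 51, "purpose": "productivity review"},
]

def productivity_report(records):
    # The original, declared purpose: a simple productivity metric.
    return {r["name"]: round(r["emails_sent"] / r["badge_hours"], 2) for r in records}

def attrition_risk_flags(records):
    # A later, secondary use the "purpose" tag does nothing to prevent:
    # the same data, repurposed to flag long-hours employees as flight risks.
    return {r["name"]: r["badge_hours"] > 45 for r in records}

print(productivity_report(employee_records))
print(attrition_risk_flags(employee_records))
```

The only thing stopping the second use is policy, not code, and policy is exactly what entangled AI systems strain.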

Moreover, as AI systems learn and adapt, they often use the data they process to improve their models — leading to unintended, secondary applications. Consider United States v. Jones (2012), where the Supreme Court held that attaching a GPS tracker to monitor a suspect’s movements was a Fourth Amendment search, with concurring justices warning that long-term location tracking upends reasonable expectations of privacy. Now imagine that in an AI context, where every movement, every call and every transaction gets entangled. Once AI’s applications expand to unanticipated uses, enforcing purpose limitation becomes exceedingly difficult. How can we realistically define a purpose for data usage if the AI system itself is designed to discover new uses?

A similar challenge exists with transparency. Transparency mandates in privacy law assume data processing can be explained to users in a clear, straightforward way. However, the black-box nature of AI makes transparency challenging, if not impossible. Most users are unaware that the data they feed into an AI system doesn’t remain static but becomes an evolving, potentially uncontrollable part of the model’s foundation. The complexities of AI often make it impossible for even the developers to fully understand how specific data points influence an outcome, let alone explain it to users.

Data Minimization and the Demand for Big Data in AI

Data minimization, another cornerstone of GDPR, requires organizations to limit data collection to what’s strictly necessary. The premise is that data collection should be as “light” as possible, reducing exposure and minimizing the risk of misuse. Yet AI fundamentally clashes with this principle, as its very function relies on consuming large and diverse datasets. For AI, the more data, the better. Every additional data point strengthens its ability to make accurate predictions and refine its insights.

AI-driven tools, like predictive text models, image recognition systems, or financial forecasting algorithms, typically draw on datasets that include everything from browsing history and purchase records to location data and communication logs. In the absence of this data depth, AI simply doesn’t function at full capacity. And it’s here that AI’s demand for big data cuts against the grain of data minimization, exposing the gaps in laws like GDPR that were designed with a far narrower view of data collection. AI’s need for data isn’t incidental — it’s fundamental, raising a question privacy law hasn’t been able to answer: How do we reconcile AI’s hunger for data with laws built to minimize it?

Contractual Limitations and AI’s Unpredictable Data Use

Most companies seek to manage data usage through contractual limitations, specifying terms within service agreements or end-user licenses that define what’s permitted and prohibited. While effective for simpler data uses, these contractual guardrails begin to erode in AI applications. For instance, a service agreement might permit an AI tool to process customer feedback, yet it may fail to specify what happens when that feedback is used for secondary insights unrelated to customer service. And when a contract says data may be used for “quality assurance,” what does that actually cover?

Some years ago, when people called “411” for directory assistance, phone companies got the great idea of charging a dollar for each 411 call. Google then added a new service called “GOOG-411.” From your cell phone, you could call 1-800-466-4411 (GOOG-411) and ask for, say, “Sal’s Pizza” in The Bronx. The service would tell you there were three “Sal’s Pizzas” near you and offer to connect you (for free) to any of them. Kewl. Of course, any time someone offers you something for free and that thing is useful, you are not the customer — you are the product. To get connected to Sal’s Pizza, you are telling Google your name, phone number and location. You are telling them that you like pizza and that you like a specific kind of pizza. You are also telling them that you like pizza at 4:45 in the afternoon – minutes after a Sal’s Pizza ad on the radio (remember the radio?). You are also telling them that you are willing to travel 1.7 miles for Sal’s Pizza, but not 2.2 miles. And, with data entanglement, you are also telling them that, in addition to liking pizza, you enjoy Yankees baseball and White Castle hamburgers (in short, that you are from The Bronx). You may also be sharing other biographical data – all voluntarily. Kind of.

But you are also sharing a voice exemplar. So Google could collect your voice and sell that voice exemplar to banks for use in authentication systems. Or use it to train a voice recognition program. Or use it to generate an AI doppelganger of you. In short, it’s really tough to know what data is being collected and why. And even if Google doesn’t use the collected data for those purposes, some AI program with access to that data might.

The issue isn’t just one of legal technicality — it’s practical. As AI tools evolve, secondary and tertiary uses become inevitable, yet most contracts are silent on how these emergent uses fit within the original permissions granted. This silence leaves a vacuum that can expose users to unanticipated privacy intrusions and data misuse. With data entanglement, it’s not only challenging to specify limits on data processing within the terms of use but nearly impossible to ensure those limits are observed once the data is embedded in a model. Courts have hinted at similar concerns in cases like Carpenter v. United States (2018), where the Court recognized that new technologies fundamentally alter our understanding of privacy expectations.

Data Ownership and Entanglement: A Sticking Point for Privacy Law

Entanglement challenges the very concept of data ownership. Once AI ingests and processes a piece of data, that data is no longer an isolated unit but part of an interconnected web. Removing one data point from this entangled structure risks compromising the entire model. This poses a huge problem for data deletion rights, a cornerstone of GDPR compliance. The “right to be forgotten” cannot function effectively in a system where deleting one piece of data requires altering an entire model that might serve hundreds or thousands of users.

Even more troubling, data entanglement creates residual influence, meaning that even if data is removed, its influence remains embedded in the AI’s algorithms and predictions. Privacy laws were never built to handle this situation. The legal system’s current understanding of data ownership — the idea that personal data is something individuals control and can withdraw from public or private databases — becomes meaningless in the context of AI-driven entanglement.
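To see why, consider a deliberately tiny Python example. The “model” here is just an average, a toy stand-in for real model weights, but the pattern is the same: deleting a record at the storage layer does nothing to the parameters already derived from it; only retraining without the record removes its imprint.

```python
# Illustrative sketch of "residual influence" with toy data: removing a record
# from the source database does not remove its imprint from a model that was
# already trained on it; only retraining from scratch does.

def fit_mean(values):
    """A deliberately tiny 'model': the mean of the training values."""
    return sum(values) / len(values)

database = [10.0, 12.0, 11.0, 95.0]   # the last record belongs to one individual
model = fit_mean(database)            # train once: the outlier drags the mean up
print(f"model trained with the record:   {model:.2f}")

database.remove(95.0)                 # honor a deletion request at the storage layer
print(f"model after deleting the record: {model:.2f}")   # unchanged: influence persists

retrained = fit_mean(database)        # only full retraining "forgets" the record
print(f"model retrained from scratch:    {retrained:.2f}")
```

Scale that up from one average to billions of parameters shared across thousands of users, and the practical cost of honoring a single deletion request becomes obvious.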

New Ways of Thinking About Privacy

The data revolution driven by AI has immense potential, but it comes at a profound cost to privacy. Through data entanglement, AI merges previously siloed data, challenging long-standing privacy principles and creating a web of interconnected information that current laws simply cannot regulate. Data minimization, purpose limitation, transparency — these foundational privacy rules start to look archaic when faced with the demands of modern AI.

The future of privacy in the AI age demands more than patchwork updates to existing laws. To address the fundamental issues AI poses, regulators will need to rethink privacy from the ground up, developing frameworks that recognize and adapt to the realities of data entanglement. Until then, as data continues to fuel AI’s evolution, the fight for privacy will become more complex and more urgent than ever before.
