Personal Data

Any information relating to an identified or identifiable natural person.

Also: Personal information · PII (US usage)

Personal data under GDPR Article 4(1) is "any information relating to an identified or identifiable natural person." The definition has three components, each broader than people often assume:

  • Any information. Not just structured database fields. A photograph, a voice recording, an email, a chat transcript, a click stream, a CCTV frame, a vehicle registration plate, an IP address: any of these can qualify.
  • Relating to. The information has to relate to a person. Information about an object (a building's floor plan) usually doesn't qualify; information about a building that includes identifying details about its occupants does.
  • Identified or identifiable. A name is the obvious case. But "identifiable" extends to any combination of attributes that allows identification, directly or indirectly, including by combining with other available data.

Hard cases

Several categories of data are routinely contested:

  • IP addresses. Breyer (CJEU 2016) established that dynamic IP addresses are personal data when held by an entity (such as a website operator) with the legal means to obtain identifying information from a third party (such as an ISP).
  • Pseudonymised data. Pseudonymisation is a security measure, not a statutory escape hatch. As long as the data can be re-identified by combining it with the key, it remains personal data of the data subject.
  • Anonymised data. Genuinely anonymous data is outside GDPR's scope. The standard is high: the Article 29 Working Party's Opinion 05/2014 on anonymisation techniques tests a dataset against three risks, namely singling out an individual, linking records that relate to the same individual, and inferring information about an individual. Most "anonymised" datasets fail at least one of these tests.
  • Aggregate data. "Average revenue per user across our customer base" is not personal data. "Revenue for each customer in our top 10" — depending on whether the customers are companies or individuals, and whether they're identifiable — may be.
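  • Inferred data. Predictions, scores, segments. The CJEU's Nowak judgment (C-434/16, 2017) held that an examiner's comments on a candidate's answers are the candidate's personal data, which supports treating opinions and assessments about a person as personal data of that person. The earlier YS and Others judgment (2014) is sometimes cited for this point, but it held that the legal analysis at issue was not itself personal data.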

Why the boundary matters

Whether something is personal data determines whether GDPR applies at all. Vendors that claim a particular data flow is "anonymised" or "aggregated" should be asked to explain their methodology. Often the answer reveals that the data is pseudonymised at best, in which case the controller's GDPR obligations still apply.
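One first-pass question to put to such a methodology is whether any record is unique on its indirect identifiers. The sketch below, in Python, counts how many rows share each combination of quasi-identifiers, in the style of a k-anonymity check. The column names, sample rows, and function names are hypothetical. It addresses only the singling-out risk, so a dataset that passes is not thereby shown to be anonymous.

```python
from collections import Counter

# Hypothetical "anonymised" export: names removed, indirect identifiers kept.
rows = [
    {"postcode": "SW1A", "birth_year": 1980, "gender": "F"},
    {"postcode": "SW1A", "birth_year": 1980, "gender": "F"},
    {"postcode": "SW1A", "birth_year": 1975, "gender": "M"},
    {"postcode": "EC1V", "birth_year": 1990, "gender": "F"},
    {"postcode": "EC1V", "birth_year": 1990, "gender": "F"},
]
QUASI_IDENTIFIERS = ["postcode", "birth_year", "gender"]


def group_sizes(rows, quasi_identifiers):
    """Count the rows sharing each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group (the k of k-anonymity), or 0 if empty."""
    return min(group_sizes(rows, quasi_identifiers).values(), default=0)


def unique_rows(rows, quasi_identifiers):
    """Return the rows whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


k = smallest_group_size(rows, QUASI_IDENTIFIERS)
at_risk = unique_rows(rows, QUASI_IDENTIFIERS)
```

With this sample, the third row is the only one with its combination of postcode, birth year, and gender, so that person can be singled out even though no name appears in the export.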

Related terms