Data has become an inseparable part of day-to-day functioning. Organizations across industries collect users’ personal information for numerous reasons, such as providing services, security, verification, selling it, or understanding how their consumers behave online to improve customer experience. Many refer to data as the new oil, and PII is the ultimate jackpot, which is why data breaches have become increasingly common. According to Security Magazine, data breaches rose by 10% in 2021 alone, and the three most targeted forms of data were names, social security numbers, and addresses, all of which are forms of PII.
Understanding PII Terminology
PII stands for Personally Identifiable Information. To understand this term, we need to focus on the word “identifiable” and understand how data can identify a person. For the sake of simplicity, let’s focus on two ways it can be done:
- The first way is a direct (also called linked) reference, in which the data provides an unmistakable link to an individual identity—for example, full address, SSN, passport information, photo ID, and more.
- The second way is an indirect (also called linkable) reference, in which a single piece of information by itself doesn’t provide a link to an identity, but several pieces of information combined will provide an unmistakable link to an individual identity. Example: A study has found that date of birth, gender, and zip code can be combined to identify over 60% of US citizens.
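To make the idea of linkable data concrete, here is a minimal Python sketch that measures how many records in a dataset become unique once several indirect identifiers are combined. The records, the field names (`dob`, `gender`, `zip`), and the `unique_fraction` helper are all hypothetical and exist only for illustration; they are not taken from the study mentioned above.

```python
from collections import Counter

# Hypothetical toy records; field names are assumptions for illustration.
records = [
    {"dob": "1990-01-01", "gender": "F", "zip": "10001"},
    {"dob": "1990-01-01", "gender": "M", "zip": "10001"},
    {"dob": "1985-06-15", "gender": "F", "zip": "10001"},
    {"dob": "1985-06-15", "gender": "F", "zip": "10001"},
]


def unique_fraction(rows, fields):
    """Return the fraction of rows whose combination of `fields` is unique."""
    if not rows:
        return 0.0
    # Build one key per row from the chosen indirect identifiers.
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    # A row is re-identifiable here if nobody else shares its key.
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


print(unique_fraction(records, ["zip"]))
print(unique_fraction(records, ["dob", "gender", "zip"]))
```

In this toy dataset, the zip code alone singles out nobody, while the combination of all three fields singles out half of the records. The same check can be run against real data to see which field combinations deserve extra protection.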
As privacy is a domain that evolves rapidly, so does the definition of PII. As such, it is important to keep track of definition changes and court rulings that provide further guidance.
PII is a broad domain that can be broken into many subdomains, each with its specific regulations. For example, protected health information (PHI) is the demographic information, medical histories, tests and laboratory results, mental health conditions, insurance information, and other data that a healthcare professional collects to identify an individual and determine appropriate care. The Health Insurance Portability and Accountability Act (HIPAA) of 1996 is the primary law that oversees the use of, access to, and disclosure of PHI in the United States.
Diving further into the PII subdomains can help you understand the complexity of this domain, the number of regulations, and the knowledge that is required to answer what seems to be a simple question: “What is PII?”
Each of the sensitive data types mentioned above has a lot of monetary value to criminals, as it can be resold or exploited in many ways, such as for insurance fraud, identity theft, and impersonation, making it an attractive target for hackers. In this article, we’ll focus on the broader definition of PII.
Why Collecting Data Is Necessary – And Dangerous
Data is often referred to as the new oil or the new gold. We believe that data is power: the power to better understand your customers’ needs and habits, the power to build better products and offer better personalization, and the power to build more accurate ML/AI models. Regardless of how we perceive data, we can all agree that it has changed the world and will continue to do so for years to come.
From businesses to government agencies, organizations of all kinds must collect and store user data for various reasons. While the increase in data collection has added convenience to our day-to-day lives and to how we run our businesses, it has also created a new valuable commodity that motivates cybercrime. Over the past decade, cyberattacks have nearly doubled. Today, there is an industry consensus that data breaches are inevitable. As such, companies must prepare accordingly and protect their most vulnerable and sensitive assets.
Verizon’s 2022 Data Breach Investigations Report (DBIR) reveals that 71% of breaches in large organizations are motivated by financial or personal gain, and this statistic rises to 96% for organizations of all sizes. PII is an especially valuable commodity; a single piece of linked PII is enough to steal a user’s identity, commit financial fraud, or blackmail a user with personal information. Despite its sensitivity, PII is also one of the most necessary types of data to collect. Details such as SSNs are used to verify users’ identities when they open bank accounts, apply for a loan, or undergo background checks.
Privacy Regulations and PII
The rising concerns over the necessity of storing private data while facing growing threats have led to the development of regulations designed to protect consumer privacy. Over 130 countries have legislation in place to regulate consumers’ privacy and safeguard their PII. The best known of these regulations is Europe’s GDPR (General Data Protection Regulation).
GDPR is widely regarded as the gold standard, and while privacy regulations have a lot in common in how they define PII and enforce its protection, there are also differences. As such, understanding which privacy regulations apply to you is crucial.
Examples of differences:
- The GDPR definition applies to an “identifiable natural person”, whereas the California Consumer Privacy Act (CCPA) definition covers information that can be linked “..with a particular consumer or household.”
- PIPL (the Chinese Personal Information Protection Law) refers to the impact on an individual’s dignity as part of the PII definition.
- In India, the long-awaited report for recommendations to the Personal Data Protection Bill, 2019 argues that it is impossible to distinguish between personal and non-personal information in mass data collection or transport. Because of this, the recommendations also have clauses applicable to non-personal information.
GDPR and PII
The European Union adopted the GDPR in 2016, and it is still considered one of the gold standards for regulatory compliance. The regulation places strict rules and requirements on how companies that operate in the EU or handle the data of EU citizens need to manage their PII. This includes delineating the precautions companies must take to ensure their users’ data remains safe from hackers. In addition, the regulation includes the right to be forgotten (RTBF), which requires companies to delete EU citizens’ data upon request, and the data subject access request (DSAR), which gives citizens the right to know what personal information the organization is collecting and storing, and how it’s being used.
The GDPR protects a broad spectrum of data, from basic PII such as name, address, and ID numbers (like SSN) to web data such as IP addresses, cookie data, and user location. Failure to comply with GDPR standards can result in harsh fines and legal action.
NIST PII Standards
Although the US doesn’t have an all-encompassing regulation like the GDPR (though there are state-specific regulations like the CCPA in California), NIST (the National Institute of Standards and Technology) has created a Guide to Protecting the Confidentiality of PII that can serve as a guideline for PII security. Many federal agencies use the guide, and its definition of PII includes:
- Names (full name, maiden name, mother’s maiden name, or alias)
- Address information, both physical and online (such as email address)
- Personally identifiable numbers, including SSN, passport number, or credit card number
- Physical characteristics, including images (particularly of the face or other identifiable features), fingerprints, and other biometric data
- Individual information that can be used with other information to identify an individual (such as race, religion, weight, D.O.B., place of birth, zip code, or the like)
While NIST guidance is not legally binding in the way the GDPR is, failing to follow it can result in reputational damage and legal action from parties injured by that failure.
Which PII Should I Protect?
PII can be divided into two categories:
- Linked PII – Information that can be directly linked to an individual, such as:
  - Social security or I.D. number
  - Full name
  - Phone number
  - Driver’s license
  - Passport information
  - Email address
  - Medical or financial information
This is far from the complete list, and many companies require at least some if not all of the above information.
- Linkable PII – Includes indirect identifiers that alone cannot identify a person, but a combination can. Some examples include:
  - Place of birth
While it’s important to know the difference between the two types of identifiers (linked and linkable), both represent personal data. Privacy regulations apply to them equally, so both must be protected. With the vast amounts of data passing through our systems today, it’s essential to discover and distinguish the most sensitive data, which has the most potential to cause damage if intercepted. If your organization wants to improve its protection methods for the PII it collects, you’ll need to prioritize your work and start with linked PII before linkable PII, regardless of the solution you use.
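As a rough illustration of discovering and prioritizing PII fields, the Python sketch below tags field names with a category and sorts them so that directly identifying fields come first. The `FIELD_CATEGORIES` mapping, the field names, and the ordering (linked, then linkable, then unclassified) are assumptions made for this example; a real inventory would come from your own data catalog and risk assessment.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    LINKED = 0    # identifies a person directly
    LINKABLE = 1  # identifies a person only in combination with other data
    UNKNOWN = 2   # not classified yet; needs manual review


# Hypothetical mapping from field names to categories (an assumption).
FIELD_CATEGORIES = {
    "ssn": Sensitivity.LINKED,
    "full_name": Sensitivity.LINKED,
    "phone_number": Sensitivity.LINKED,
    "passport_number": Sensitivity.LINKED,
    "email": Sensitivity.LINKED,
    "place_of_birth": Sensitivity.LINKABLE,
    "date_of_birth": Sensitivity.LINKABLE,
    "zip_code": Sensitivity.LINKABLE,
}


def prioritize(fields):
    """Sort field names by category, then alphabetically within a category."""
    return sorted(
        fields,
        key=lambda name: (FIELD_CATEGORIES.get(name, Sensitivity.UNKNOWN), name),
    )


print(prioritize(["zip_code", "email", "favorite_color", "ssn"]))
```

Fields that the mapping does not know about land at the end of the list, which makes it easy to spot data that still needs to be classified.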
How To Protect PII
Does being compliant with regulations ensure your data is protected? Although regulatory compliance is essential and can create a foundation for data security, it is not necessarily a sufficient strategy by itself. A holistic security and privacy risk assessment should be the foundation for identifying potential posture gaps that compromise sensitive information. Privacy compliance assessment should be a subset of the overall risk assessment process.
Data minimization (i.e., collecting only what is needed) is a key practice that shrinks your PII footprint and, with it, the overall effort required to protect PII. Collecting unnecessary data only makes the protection challenge harder. Obsolete data should also be deleted: implement measures that enforce your retention policies by automatically deleting data after a predetermined period. That leaves you with only the data you need, reducing both the amount of data you must protect and the associated risk. The PII you keep should be concentrated in one secure location.
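Minimization and retention can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the `collected_at` field, the 365-day retention period, and the helper names `minimize` and `purge_expired` are hypothetical, and a production system would apply the same logic inside its database or storage layer rather than on in-memory lists.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)  # assumed retention period, not a legal rule


def minimize(record, allowed_fields):
    """Keep only the fields the service actually needs."""
    return {key: value for key, value in record.items() if key in allowed_fields}


def purge_expired(records, now=None, retention=RETENTION):
    """Return only the records that are still inside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


# Hypothetical usage with timezone-aware timestamps.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"email": "a@example.com", "collected_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"email": "b@example.com", "collected_at": datetime(2022, 6, 1, tzinfo=timezone.utc)},
]
kept = purge_expired(records, now=now)
print([r["email"] for r in kept])
```

Running a purge like this on a schedule is one way to make sure the retention policy is enforced automatically instead of relying on manual cleanup.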
Privacy and security solutions such as encryption and tokenization can also mitigate the risk and increase data protection. One effective, emerging method for protecting PII is storing it in a PII vault. A PII vault allows you to store all your sensitive data in one highly secure location, making it accessible only to parties with access permission and pseudonymizing the data to prevent outsiders from connecting it to any PII.
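The tokenization idea behind a vault can be shown with a toy Python class. This is a simplified sketch, not how any particular vault product works: the `PIIVault` class, its method names, and the `tok_` prefix are invented for illustration, and the in-memory dictionary stands in for what would be an encrypted, audited data store in practice.

```python
import secrets


class PIIVault:
    """Toy in-memory vault: swaps PII values for random tokens."""

    def __init__(self):
        self._store = {}          # token -> original value
        self._authorized = set()  # callers allowed to read original values

    def grant(self, caller):
        """Give a caller permission to detokenize."""
        self._authorized.add(caller)

    def tokenize(self, value):
        """Store the value and return a random token that reveals nothing."""
        token = "tok_" + secrets.token_hex(16)
        self._store[token] = value
        return token

    def detokenize(self, token, caller):
        """Return the original value, but only to authorized callers."""
        if caller not in self._authorized:
            raise PermissionError(f"{caller} may not read PII")
        return self._store[token]


vault = PIIVault()
vault.grant("billing-service")
token = vault.tokenize("123-45-6789")

# Other systems store only the token, never the SSN itself.
print(token.startswith("tok_"))
print(vault.detokenize(token, "billing-service"))
```

Because the token is random rather than derived from the value, a system that leaks only tokens exposes no PII, and access to the real values can be controlled in one place.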
PII Protection Remains a Priority
Data protection, in general, is critical, but investing an equal amount of security resources into every piece of data wastes both money and time. Data security should be prioritized around the most at-risk information, and that’s PII. As breaches have come to be seen as nearly inevitable and the black market for stolen data continues to grow, PII protection will likely remain a hot-button issue for the foreseeable future. Focusing your protection measures on PII allows you to keep up with growing cyber threats and ensure your users’ privacy remains secure.