The Right to be Forgotten is defined as “the right to silence on past events in life that are no longer occurring.” It allows individuals to have information, videos, or photographs about themselves deleted from certain internet records so that they cannot be found by search engines.
As so many different compliance regulations roll out across the world, it’s important to understand the requirements from an organizational perspective as well as differences between regulations that may slightly shift those requirements. While there are many regulations that present the Right to be Forgotten option in some capacity, for the purposes of this blog we are going to focus on the European Union General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).
First, let’s do some definitions.
What is General Data Protection Regulation (GDPR)?
GDPR is a far-reaching compliance regulation laid out by the European Union that went
- Organizations are broken up into two categories: those who control and collect the personal information (Controllers) and those third-party groups that may analyze or store the data for the Controllers (Processors). Each group has a different, but similar, set of requirements associated with them.
- Any person currently in the European Union has the right to know who has their personal information and what they are doing with it.
- Very strict regulations are in place to control how often and by what methods that personal data moves across geographic borders.
- Applicable organizations need to appoint a Data Protection Officer (DPO) internally to manage all oversight regarding the GDPR.
- A covered person can request to have a copy of their personal data delivered to them, to correct any personal information that the Controller has, or to request that all personal information be deleted.
- Covered users need to provide their explicit consent for data to be collected or processed.
- Data breaches need to be reported to the appropriate regulatory authority within 72 hours of identification. Data breaches carry the risk of fines based on severity.
- Controllers and Processors are required to follow the best practices of Privacy by Design, meaning that the intent is always to keep content safe and secure rather than to respond reactively.
What is the California Consumer Privacy Act (CCPA)?
The CCPA is a relatively new compliance regulation being implemented by California that aligns relatively closely to the GDPR. It’s a trailblazing regulation, since California is the first state in the US to introduce a privacy regulation in a country that has no far-reaching federal regulation. This makes sense, however, as California has a GDP larger than most countries. Some of the major impacts of the CCPA include:
- For-profit organizations over a certain size have new regulations they have to follow regarding the control of the personal information for impacted users. The impacted consumers are those who are identified as California residents who are currently physically located either in or outside of California.
- Businesses must notify consumers of what type of personal information is being collected on them, what that information is being used for, and whether or not that information is being sold to third parties.
- Businesses must provide a public way, via their website or a toll-free number, for consumers to opt out of the sale of their personal information.
- Unless there is a different regulatory reason to retain it, businesses must delete the personal information collected on a consumer upon request.
- A business must provide a listing of personal information collected in the past twelve months upon request.
- Encryption and pseudonymization are strongly encouraged, as a breach of data that has not been protected by either process risks fines.
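To make the pseudonymization point above a little more concrete, here is a minimal sketch of one common approach: replacing an identifier with a keyed hash. The function name `pseudonymize`, the sample record, and the key are all hypothetical and only for illustration; neither regulation prescribes this particular technique, and a real deployment would keep the key in a proper secrets store.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed token that cannot be reversed without the key."""
    # Normalize so that trivially different spellings map to the same token.
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical key and record, for illustration only.
key = b"replace-with-a-secret-from-a-key-vault"
record = {"email": "Jane.Doe@example.com", "plan": "premium"}

# Keep the non-identifying fields, swap the email for its token.
safe_record = {**record, "email": pseudonymize(record["email"], key)}
```

Because the same input and key always produce the same token, records can still be joined on the token without exposing the underlying email address.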
What is the Right to be Forgotten, Right to Erasure, and Right to Delete?
While there are many overlaps between these regulations, one glaring requirement is around the ability to request that an organization delete all data associated with a person. The GDPR refers to this as the Right to Erasure, an advancement of a pre-GDPR phrase called Right to be Forgotten. The CCPA is less formal about naming conventions, but it is generally referred to as the Right to Delete.
While the GDPR doesn’t necessarily give any timelines, the CCPA states that this request is only relevant to the previous twelve months. Regardless, this request means that all collected personal information (barring exceptions) must be removed from the organization in a timely manner, and that confirmation must be provided to the data subject/consumer. We encourage you to read other blog posts from STEALTHbits for clarification on what constitutes personal information per regulation.
How do I Comply with the CCPA Right to Delete and the GDPR Right to Erasure?
The biggest challenge with remaining compliant is generally around locating the information in an environment to ensure that the consumer right is being implemented appropriately. While there are a few ways this can be done, I want to open with the biggest requirement: finding all content repositories in an environment.
In a perfect world, content is stored in controlled storage locations that are well classified and use a well-structured taxonomy. These locales are purpose-driven, and no personal information sits outside of these repositories. In the real world, that’s frequently not the case. As the need for content collaboration increases, many organizations find themselves victims of Shadow IT. This results in multiple different content collaboration platforms being stood up within a network or being utilized in the cloud. New platforms without oversight and governance lead to inappropriate information, even personal information, being stored in an unacceptable location, creating a risk of a compliance breach. Locating and understanding all of these platforms, including the unknown ones, is the first challenge with any regulation.
Once the data has been located, the next step is to identify which of it is relevant personal information, that is, the information that relates to specific consumers and data subjects. There are multiple ways that this data can be tracked down, but for our purposes we’ll talk about the two most common.
Full-Text Indexing
Full-Text indexing is the process by which a system reviews all unstructured or structured data within a specific scope, indexes this content to a location (a flat file, a database, a similar cache, etc.), and searches through this index when looking for specific phrases. These processes come out of the box in a lot of systems, including SharePoint, Windows Servers, multiple cloud providers, etc. Full-Text indexing has a few pros and cons.
- Can search for exact phrases within documents that can potentially return faster than if files were reviewed individually
- This is also a resource intensive process that requires a dedicated amount of hardware and other infrastructure to support in volume.
- Large amounts of space are taken up to index content that may be unrelated to the major use case involving identifying content that needs to be purged
- Indexing content takes a large portion of the storage space of the actual unstructured and structured data itself, usually 20 to 40% depending on the method.
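To illustrate the idea, here is a minimal sketch of a full-text index in Python: a simple inverted index that maps each word to the documents containing it. This is a toy under stated assumptions, not how SharePoint or any other product implements indexing; the function names and sample documents are hypothetical, and the search matches all words of a phrase in any order rather than the exact phrase.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search_all(index: dict[str, set[str]], phrase: str) -> set[str]:
    """Return documents that contain every word of the phrase, in any order."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


# Hypothetical documents, for illustration only.
docs = {
    "hr/offer.txt": "Offer letter for Jane Doe, start date pending",
    "sales/notes.txt": "Call with John Smith about renewal",
    "legal/nda.txt": "NDA signed by Jane Doe and John Smith",
}
index = build_index(docs)
print(sorted(search_all(index, "jane doe")))
# prints ['hr/offer.txt', 'legal/nda.txt']
```

Notice that the index stores every word of every document, which is exactly why the storage and resource costs listed above add up when most of that content has nothing to do with a deletion request.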
Pattern Matching and Recognition
Pattern Matching and Recognition is the process of looking through data for specific words and phrases, or for text that follows a specific pattern; regular expressions (RegEx) are a great way to do this. Going this way also has its pros and cons.
- The compliance regulations focus heavily on the concept of identities. These identities generally follow specific formats that can be matched with the right expressions.
- Knowing specific words or phrases allows for bulk searching for those terms in multiple locales
- Targeted searches take comparatively fewer resources than full-text indexing
- Retention of matches within content is targeted to only relevant information rather than targeting all potential verbiage regardless of relevancy
- If patterns, words, and phrases are not identified ahead of time, new scans against live content will need to be initiated
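As a rough sketch of what pattern matching looks like in practice, the Python snippet below scans text for two kinds of identifiers with regular expressions. The patterns (a simplified email format and a US Social Security number format) and the `find_identifiers` helper are illustrative assumptions; real identifiers vary by country and format, and production patterns need far more care.

```python
import re

# Illustrative patterns only; real-world identifiers need stricter validation.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_identifiers(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs, grouped by pattern."""
    found = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((name, match.group()))
    return found


sample = "Contact jane.doe@example.com; SSN on file is 123-45-6789."
print(find_identifiers(sample))
# prints [('email', 'jane.doe@example.com'), ('us_ssn', '123-45-6789')]
```

Only the matches are retained, not the full text, which is the resource advantage over indexing everything. The trade-off is the one noted above: a new pattern means a new scan.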
Regardless of the direction people go in, having a constructive method to identify information for targeted deletion requests is important. Knowing where these terms are is also useful for when data subjects and consumers request exports of their data!
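Here is a minimal sketch of how one located set of files could serve both request types. It assumes personal data lives in plain `.txt` files under a single folder and that deleting the whole file is acceptable; both are simplifications, and every function name here is hypothetical.

```python
from pathlib import Path


def locate_subject_files(root: Path, terms: list[str]) -> list[Path]:
    """Return text files under root that mention any of the subject's terms."""
    lowered = [term.lower() for term in terms]
    matches = []
    for path in sorted(root.rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if any(term in text for term in lowered):
            matches.append(path)
    return matches


def export_subject_data(paths: list[Path]) -> dict[str, str]:
    """Collect file contents so a copy can be delivered to the data subject."""
    return {str(p): p.read_text(encoding="utf-8", errors="ignore") for p in paths}


def delete_subject_data(paths: list[Path]) -> list[str]:
    """Delete the located files and return their paths as a confirmation log."""
    deleted = []
    for p in paths:
        p.unlink()
        deleted.append(str(p))
    return deleted
```

The same `locate_subject_files` result feeds either path: pass it to `export_subject_data` for an access request, or to `delete_subject_data` for an erasure request, keeping the returned list as the confirmation record.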
How Can STEALTHbits Help?
STEALTHbits offers a variety of solutions to help users with their compliance efforts, including the following major benefits:
- Host Discovery to identify the different platforms within the network that may contain various unstructured and structured data repositories to ensure as comprehensive platform coverage as possible.
- Sensitive Data Discovery capabilities where STEALTHbits solutions can analyze known content repositories for certain patterns or specific words that match in-built criteria based on personal identities across multiple different regulations.
- Downstream Actions that can be taken in an automated fashion based on the results found when analyzing the sensitive data, including the relocation, securing, or deletion of data deemed personally identifiable.
Overall these compliance regulations mean that organizations need to identify personal information within a data context and plan for destruction, provisioning, or securing that content for end-users in a manner that is as automated and touch-free as possible.
For a free trial or a demo of STEALTHbits, please contact us today!
As a VP of Product Strategy at STEALTHbits, Ryan is responsible for the vision and strategy of their Data Access Governance solutions. Ryan has a tenure of thirteen years in the technology space across multiple different areas. Prior to joining STEALTHbits, he most recently served as the Director of Product Management at Metalogix Software, helping to lead them to acquisition by Quest Software. He has also previously held positions in R&D, Presales Engineering, and Technical Support.