Data Mapping in the age of GDPR – Unknown Application Workflows

When the enemy is already inside

Security breaches are a fact of life. Employees click on links in phishing emails, web applications get compromised, weak passwords get guessed, and insiders misuse their privileges. In fact, internal actors play a role in one out of every four breaches according to the 2017 Data Breach Investigations Report from Verizon (http://www.verizonenterprise.com/verizon-insights-lab/dbir/). Once the enemy is within the external defenses, it is critical to protect internal data and business operations. Not surprisingly, the same report recommends that organizations “only keep data on a need to know basis,” and the previous year’s report lists “know your data” among its most important recommendations (http://www.verizonenterprise.com/resources/reports/rp_dbir-2016-executive-summary_xg_en.pdf). Knowing your data allows you:

  1. to protect what is critical for the business using the most efficient defenses, and
  2. to restrict access to data and other IT assets on a need-to-know basis.

Knowing where critical data is stored and how it is transferred is a requirement of security standards. GDPR may be the newest regulation to enforce this level of accountability for data, but it is far from the first. The PCI DSS standard, which has been around since the early 2000s, requires scoping of cardholder data environments (the IT areas where credit card data is stored or transferred), and version 3 of the standard requires diagramming the data flows.

Unknown unknowns

Unfortunately, this type of IT analysis is easier said than done accurately. IT staff make undocumented changes, employees create shadow IT environments, and applications behave in complex ways: they create unexpected data flows or store information in unmanaged internal databases and other locations. Most importantly, however, IT knowledge gets lost when staff members leave or forget parts of the “tribal knowledge.” The result is a situation in which most of the information about the applications, data, and data transfers is known, but nobody knows what is unknown. One manifestation of this problem is the presence of unused IT assets in every corporate IT environment: they linger because nobody knows, or is sure enough, that these servers are unused.

(For example, 10%-20% of servers are typically identified as unused during cloud migrations; see AWS Global Head of Enterprise Strategy Stephen Orban’s comments at the end of this post: https://www.linkedin.com/pulse/6-strategies-migrating-applications-cloud-stephen-orban/). It is not a coincidence that we refer to cloud migrations here. Cloud migrations and data center consolidations require not only that the known servers and applications be migrated but that every server and every application dependency be mapped. Migration teams have to go after the last 10%-20%-30%-40%-… percent that is unknown. This stands in stark contrast to security-related mapping of applications, data, and data flows, where typically only the known (and partially incorrect) information gets collected, and any errors and omissions do not manifest themselves until attackers exploit them, if they ever do.

Discover first – ask later

The only sure way to discover these unknown unknowns is to map out the operation of every server and every network flow in an organization. This is the goal of the modelizeIT application topology mapping system. The system automatically identifies logical groups of software components working together, and the data flows among them, that form environments. It does so even with no input from the IT staff; only afterwards is information from the IT staff used in the analysis. The system also pre-classifies the data and data flows, filtering out most of them as irrelevant for security analysis while keeping those that could potentially carry sensitive data. As a result, applications, data, and data flows get identified even if they are unknown to or forgotten by the IT staff. As the next step, the StealthAUDIT data analysis functions take over and examine the identified, pre-filtered data sources to classify the data more deeply, whether it contains personally identifiable information or credit card numbers.
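To make the pre-classification step concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption for this post, not modelizeIT’s actual implementation or API: the flow records, the port lists, and the `preclassify_flows` function are all hypothetical, chosen only to show the idea of filtering out flows that cannot carry sensitive payloads while keeping everything else in scope.

```python
# Hypothetical sketch of the "discover first, ask later" pre-classification
# step: every observed network flow is kept for deep data classification
# unless its protocol makes it clearly irrelevant to security analysis.

# Ports whose traffic rarely carries business data (DNS, NTP, SNMP).
IRRELEVANT_PORTS = {53, 123, 161}
# Ports typical of databases and file transfers: strong candidates for deep
# classification (e.g. scanning for PII or credit card numbers).
SENSITIVE_CANDIDATE_PORTS = {1433, 1521, 3306, 5432, 445, 21}

def preclassify_flows(flows):
    """Split discovered flows into candidates for deep data classification
    and flows filtered out as irrelevant to security analysis."""
    candidates, filtered = [], []
    for flow in flows:
        if flow["dst_port"] in IRRELEVANT_PORTS:
            filtered.append(flow)
        else:
            # Unknown protocols stay in scope alongside the obvious database
            # ports: the point of the approach is that nothing gets dropped
            # just because nobody documented it.
            candidates.append(flow)
    return candidates, filtered

# Illustrative discovered flows, including one undocumented service.
flows = [
    {"src": "app01", "dst": "db01", "dst_port": 5432},     # PostgreSQL
    {"src": "app01", "dst": "ns1", "dst_port": 53},        # DNS lookup
    {"src": "batch7", "dst": "legacy9", "dst_port": 8199}, # unknown service
]
candidates, filtered = preclassify_flows(flows)
print(len(candidates), len(filtered))  # → 2 1
```

Note that the filter is deliberately conservative: only flows positively known to be irrelevant are dropped, so an undocumented service on an unfamiliar port remains a candidate for the deeper content classification described above.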


Dr. Joukov is the CEO of modelizeIT Inc. The application mapping software created by modelizeIT Inc is used by large corporations to make their information technology environments more secure, reliable, efficient, and better aligned with business needs, and to migrate to the cloud more safely and quickly.

Prior to that, Dr. Joukov conducted scientific research at the IBM T.J. Watson Research Center, where his work was recognized by IBM and multiple other organizations for its business, research, and strategy impact. At IBM Research he also chaired the Professional Interests Community on Storage Systems.

Dr. Joukov’s PhD research results were published at OSDI, the top operating systems research venue, and FAST, the top storage research venue.

Overall, Dr. Joukov has published 39 research papers. Five of his research works won best paper awards, including one from EuroSys, the top European systems venue.

Prof. Joukov shares his unique experience by teaching advanced classes at NYU and Columbia University.

IEEE, the world’s largest technical professional organization, recognized Dr. Joukov’s chairmanship of the IEEE Computer Society Committee and Community on Operating Systems with an award for excellent service.

Website: www.modelizeIT.com
