Comparing Dynamic Data Masking Proxy to Data Privacy Vault


Gil Dabah

CEO & Co-founder

October 25, 2024


In today's data-driven world, safeguarding sensitive customer information in the backend system is paramount. As companies struggle with increasing regulatory compliance and the ever-evolving threat landscape, two primary approaches have emerged: dynamic data masking proxies and data privacy vaults. This blog post will delve into the pros and cons of each solution, helping you make an informed decision for your organization's data protection needs.

Understanding the threats to stored sensitive data

To effectively protect customer data, we must identify potential threats. We'll start by analyzing the environment where data is stored and accessed. Consider a simple web/mobile application: it might use serverless functions (like AWS Lambda) or traditional servers (like EC2 or containers) to run the code. A database, typically storing user information (which often includes Personally Identifiable Information or PII), would be central to such an application.

Even a basic architecture, comprising a login box, application code, a database, and additional services (such as text messaging, email, or AI), represents a common scenario for many web and mobile applications. So we will assume that’s the case for this analysis.

Database threats

Databases pose significant risks, especially when storing sensitive customer data in plaintext. This exposes data to anyone with database access. Additionally, the lack of logs for forensics and monitoring, a common default setting in cloud platforms like AWS and GCP, makes it difficult to detect and respond to data breaches. Compromised credentials and insecure backups further exacerbate these threats. The risk is compounded when multiple databases hold sensitive data.

Storing sensitive customer data in plaintext: Exposes data to anyone with database access.
No logs for forensics or monitoring: Makes it difficult to detect and respond to data breaches.
Compromised credentials: Can grant attackers access to protected data.
Insecure backups: Pose a significant risk of data exposure.

Every additional database that holds sensitive data widens this exposure further.

Application code threats

At the code level, compromised servers can potentially access all data, especially if there are no safeguards in place. The direct connection between the application and the SQL database, without proper validation, exposes the data to various threats. SQL injection attacks remain a persistent risk, as developers may not implement adequate filtering mechanisms to prevent malicious input from affecting database queries. Additionally, the ability to retrieve all data without pagination through a single API call can pose a security risk. Even with filtering in place, a compromised database can still pose significant risks. A fully compromised server can bypass application-level security measures and directly access and manipulate data, highlighting the importance of comprehensive application security.

Compromised servers: Can access all data if not properly secured.
Direct connection to SQL database: Exposes data to potential attacks.
SQL injection: Malicious input can manipulate database queries.
Unrestricted data retrieval: A single API call can expose all data. The notorious “select * where 1=1”…
Vulnerabilities in code: Will always expose servers to outsiders.

A security-by-design layer is required because patching vulnerabilities is never enough when there are zero-days (security vulnerabilities not yet known to the vendor). A good threat model should assume that no software is bug-free, and therefore add an extra data protection layer that limits the damage a bug can cause.

To focus on the specific advantages and disadvantages of data privacy vaults and dynamic data masking proxies, we won't delve into other security threats such as compromised identities, supply chain attacks, code vulnerabilities, or logging PII in plaintext. It's important to acknowledge that these threats are prevalent and will significantly impact data security. While we often see organizations implement measures like secret managers or identity and access management (IAM) to protect credentials and secure direct database access, stealing data is still possible. So we need to protect data differently.

These are a few of the threats we should consider in a ‘traditional architecture’ with no security-by-design at all.

Data privacy vault

A data privacy vault operates by storing data in its own dedicated database, where every field is automatically encrypted individually (field-level encryption). This eliminates the need for developers to manage encryption keys or understand complex cryptographic algorithms. Designed with developer ease-of-use in mind, the vault aims to minimize data exposure by centralizing and protecting sensitive information. It is also API-based, which gives developers more control and makes it tech-stack agnostic.

Sensitive data is protected inside the vault, while other personal data can stay in the SQL database. This introduces a data protection layer of defense for the sensitive data.
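To make that split concrete, here is a minimal sketch. It assumes a hypothetical in-memory `ToyVault` class and an illustrative choice of sensitive fields; it is not any vendor's API. The application database keeps the non-sensitive columns plus an opaque reference, and the sensitive fields live only in the vault. A real vault would also encrypt each field and enforce access control, which this toy leaves out.

```python
import secrets

SENSITIVE_FIELDS = {"ssn", "email"}  # assumption: chosen for illustration only


class ToyVault:
    """Hypothetical stand-in for a data privacy vault (no encryption here)."""

    def __init__(self):
        self._objects = {}

    def store(self, fields):
        object_id = secrets.token_hex(16)  # opaque, random reference
        self._objects[object_id] = dict(fields)
        return object_id

    def read(self, object_id, field):
        return self._objects[object_id][field]


def split_record(record, vault):
    """Send sensitive fields to the vault; return the row for the app database."""
    sensitive = {k: v for k, v in record.items() if k in SENSITIVE_FIELDS}
    row = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    row["vault_id"] = vault.store(sensitive)
    return row


vault = ToyVault()
db_row = split_record(
    {"id": 1, "plan": "pro", "email": "ada@example.com", "ssn": "123-45-6789"},
    vault,
)
```

With this shape, a dump of the application database exposes only `id`, `plan`, and a random `vault_id`; reading the SSN requires a separate call to the vault.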

Dynamic data masking proxy

A dynamic data masking (DDM) proxy operates as an intermediary between your application code and the database, positioned at the network level. It intercepts SQL protocol traffic and masks specific data fields based on predefined policies. This masking process occurs in real-time, preventing sensitive information from being exposed to unauthorized users.
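The policy-driven masking described above can be sketched roughly as follows. The policy table, role names, and masking functions are all hypothetical, and the sketch works on already-parsed result rows rather than on the SQL wire protocol that a real proxy would intercept.

```python
def mask_email(value):
    """Keep the first character and the domain, e.g. a***@example.com."""
    local, _, domain = value.partition("@")
    return local[:1] + "***@" + domain


def mask_last4(value):
    """Hide everything except the last four characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


# Hypothetical policy: column name -> masking function.
POLICY = {"email": mask_email, "ssn": mask_last4}
UNMASKED_ROLES = {"support_admin"}  # assumption: roles allowed to see raw values


def apply_masking(rows, role):
    """Mask policy columns in result rows unless the role is exempt."""
    if role in UNMASKED_ROLES:
        return rows
    return [
        {col: POLICY[col](val) if col in POLICY else val for col, val in row.items()}
        for row in rows
    ]


rows = [{"id": 1, "email": "ada@example.com", "ssn": "123-45-6789"}]
masked = apply_masking(rows, role="analyst")
```

Note that the rows themselves are unchanged in storage; only the view returned to the `analyst` role is transformed.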

Now suppose you have a functioning application and want to enhance its data security. You're faced with the decision of whether to adopt a data privacy vault or a dynamic data masking proxy.

What introducing a proxy looks like. Using a reverse proxy in front of the DB might work too.

Advantages of a data privacy vault

Developer-friendly: The API-based solution provides developers with full control over data management.
Strong encryption: Field-level encryption ensures robust protection of sensitive data, mitigating risks associated with storing data in plaintext.
Searchable encrypted data: Functionality is preserved as you can still search over encrypted data.
Data theft prevention: Vaults can leverage zero-trust principles (as discussed in our JWT documentation) to prevent data theft even if credentials or the web application are compromised.
Enhanced security and privacy: Vaults provide comprehensive security and privacy controls, giving you greater visibility and control over your sensitive data.
Data isolation: Vaults can isolate and segregate sensitive data from other systems, reducing the risk of unauthorized access or data breaches.
Regulatory compliance: Many vaults are designed to comply with data privacy regulations like GDPR and CCPA, simplifying compliance efforts by offering features like data retention policies, localization, traceability, and DSAR (Data Subject Access Request) support.
Detailed logging: Activity logs provide visibility into all data interactions.
SaaS deployment: Offloading data to a SaaS vault can help reduce compliance and maintenance burdens and minimize risks.
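How a vault implements search over encrypted data varies by product. One common approach for exact-match lookups is a blind index: a keyed hash (HMAC) of the normalized value is stored next to the ciphertext, so a query can be matched without decrypting anything. The sketch below is an illustration of that general technique, not a description of any particular vault; the key, the toy table, and the ciphertext placeholders are assumptions.

```python
import hashlib
import hmac

INDEX_KEY = b"demo-only-index-key"  # assumption: in practice a managed secret


def blind_index(value):
    """Keyed hash of the normalized value; equal inputs give equal indexes."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(INDEX_KEY, normalized, hashlib.sha256).hexdigest()


# Toy "table": each row keeps a ciphertext placeholder and its blind index.
table = [
    {"email_ciphertext": "<encrypted blob 1>", "email_idx": blind_index("ada@example.com")},
    {"email_ciphertext": "<encrypted blob 2>", "email_idx": blind_index("bob@example.com")},
]


def find_by_email(email):
    """Exact-match search that never decrypts a row."""
    target = blind_index(email)
    return [row for row in table if hmac.compare_digest(row["email_idx"], target)]


matches = find_by_email("  Ada@Example.com ")
```

A blind index only supports equality lookups, and it reveals which rows share the same value, so it suits high-cardinality fields such as email addresses better than low-cardinality ones.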

And, naturally, a few disadvantages:

Data migration: Migrating existing data to the vault can be a complex and potentially risky process, especially for large datasets. Without proper CI/CD, test automation, and a staging environment, this transformation is going to be very challenging.
Code changes: Integrating the vault into your application requires modifying code to interact with the new data source. While this is generally manageable, especially when using an ORM, it can be more challenging for complex applications.
PII identification: Determining which fields contain personally identifiable information (PII) can be a challenge, but resources like guides and legal counsel can assist in this process. It's important to note that this is a consideration for both vaults and proxies.
Performance: Vaults generally have a minimal impact on performance with the field-level encryption in place. However, encrypting all personal data might not be optimal. By prioritizing encryption for the most sensitive PII that isn't heavily used for computations, you can achieve a balance between security and performance. Less sensitive personal data should still be stored in a traditional database.
Vendor lock-in: Relying on a third-party vault vendor can create vendor lock-in, potentially limiting flexibility and increasing costs in the long run.

While vendor lock-in can be a concern for both data privacy vaults and dynamic data masking proxies, the impact can vary. Proxies may be perceived as easier to remove, as shutting them down generally doesn't require data migration or code changes. However, with vaults, migrating data to a different solution can be complex and time-consuming, as discussed in the previous points.
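The data migration step listed among the vault's disadvantages is usually done in small, verifiable batches rather than in one pass. Below is a minimal sketch under stated assumptions: an in-memory SQLite table stands in for the application database, a plain dict stands in for the vault, and the table and column names are made up for illustration.

```python
import secrets
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, ssn TEXT, vault_id TEXT)")
db.executemany(
    "INSERT INTO users (id, ssn) VALUES (?, ?)",
    [(1, "123-45-6789"), (2, "987-65-4321"), (3, "555-12-3456")],
)

vault = {}  # hypothetical stand-in for the vault's storage


def migrate_batch(batch_size):
    """Move one batch of plaintext values into the vault; return rows migrated."""
    rows = db.execute(
        "SELECT id, ssn FROM users WHERE vault_id IS NULL LIMIT ?", (batch_size,)
    ).fetchall()
    for user_id, ssn in rows:
        vault_id = secrets.token_hex(16)
        vault[vault_id] = ssn
        # Verify the read-back before removing the plaintext copy.
        if vault[vault_id] != ssn:
            raise RuntimeError(f"verification failed for user {user_id}")
        db.execute(
            "UPDATE users SET ssn = NULL, vault_id = ? WHERE id = ?",
            (vault_id, user_id),
        )
    db.commit()
    return len(rows)


migrated = 0
while (n := migrate_batch(batch_size=2)) > 0:
    migrated += n
```

Because each batch only selects rows that have no `vault_id` yet, the loop can be interrupted and restarted, which is the property you want to rehearse in a staging environment before touching production.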

Advantages of a dynamic data masking proxy

Easy deployment: Proxies are generally straightforward to deploy, requiring minimal setup, but do require lots of testing to make sure nothing gets broken.
Flexible configuration: You can easily configure which fields to mask, how to mask them, and when the masking should occur.
No code changes (weeeee): Proxies can be implemented without modifying your application's code, providing a quick and non-disruptive solution.
Avoids data migration (wooooo): Unlike vaults, proxies don't require data migration, reducing complexity and potential risks.
Cost-effective: Proxies can be a relatively cost-effective solution compared to vaults, especially for organizations with limited budgets.

And some limitations:

Increased TCO: Implementing a proxy adds another component to your infrastructure, potentially increasing costs and maintenance efforts. While SaaS options exist, routing database traffic outside your own network is likely to add latency.
Performance impact: Inline proxies can slow down network traffic and introduce some latency, especially challenging in high-traffic environments. However, strategies like limiting the proxy's scope to specific IP addresses or ports can mitigate this impact when working with modern cloud platforms that enable this.
Incomplete data protection: Even with a proxy in place, data remains vulnerable if the database itself is compromised, because direct access to the database bypasses the proxy's protection. Moreover, a proxy only transforms data on its way out; it offers nothing that helps with complying with privacy regulations or applying stronger security techniques.

An attacker (or anybody with infrastructure access, to be fair) lurking in your VPC or production network can still access the database directly, and hence the data.

Limited security: While proxies can help reduce exposure, they don't address underlying security vulnerabilities. Additionally, protocol tricks or advanced attacks can potentially circumvent the proxy's masking capabilities. Side note: similar research on HTTP proxies has shown attacks that open huge security holes. The key point to remember is that proxies try to pass all data through to preserve functionality, even when they can’t parse it well, and that is where a bypass can be found.
Compatibility issues: Proxies may not always be compatible with the latest database features or application updates, potentially limiting their effectiveness. So releasing a new feature might be stalled until the proxy supports it.
Scaling challenges: As your application grows and new microservices are added, configuring them to go through the proxy can increase complexity.
Troubleshooting difficulties: Proxies can introduce additional complexity into troubleshooting database issues, as they add another layer to the environment.
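Two of the limitations above, direct database access and masking bypasses, can be illustrated with a toy. The sketch assumes an in-memory SQLite database and a hypothetical `ToyMaskingProxy` that masks by column name. Real proxies parse queries far more carefully, so the alias trick here only shows the general shape of a bypass, not a flaw in any specific product.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
db.execute("INSERT INTO users VALUES (1, 'ada@example.com')")


class ToyMaskingProxy:
    """Hypothetical proxy: forwards queries and masks the email column by name."""

    def __init__(self, conn):
        self._conn = conn

    def query(self, sql):
        cursor = self._conn.execute(sql)
        columns = [d[0] for d in cursor.description]
        return [
            tuple("***" if col == "email" else val for col, val in zip(columns, row))
            for row in cursor.fetchall()
        ]


proxy = ToyMaskingProxy(db)
SQL = "SELECT id, email FROM users"

via_proxy = proxy.query(SQL)            # masked view
direct = db.execute(SQL).fetchall()     # plaintext: the proxy is not in the path
aliased = proxy.query("SELECT email AS contact FROM users")  # name-based rule misses it
```

In both the direct and the aliased case the plaintext email comes back, because the data at rest was never protected in the first place.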

To Summarize

While dynamic data masking proxies may seem attractive for existing applications due to their ease of implementation, they offer limited security compared to data privacy vaults. Vaults provide comprehensive security and privacy features, but they require more significant upfront effort, including data migration and code changes.

DevSecOps teams may initially favor proxies due to their transparent nature and avoidance of code changes. However, many organizations are concerned about the potential for proxies to introduce network latency and performance issues. While the no-code solution is appealing, it's important to address the risk of developers being blamed for bugs caused by the proxy, as it can alter the underlying behavior of the application. A proxy might give a false sense of security in certain circumstances!

For new applications, a vault is generally the preferred choice. However, migrating existing applications to a vault can be challenging, depending on the complexity of the codebase.

Considering the increasing frequency of data breaches and the growing importance of data privacy regulations, transitioning to a vault is the right strategic long-term decision. The key is to determine the optimal timing and approach for your specific organization.

As an interim measure, a proxy can be a valuable tool to address immediate data protection needs while gaining stakeholder buy-in and preparing for a future vault implementation. By demonstrating the benefits of improved data security and compliance, organizations can build a stronger case for investing in a vault solution going forward.

What do you think?

About the author

Gil Dabah

CEO & Co-founder

Gil is a software ninja who loves both building software (companies too) and breaking code. He is renowned for his prowess in security research, including notable exploits of the Microsoft Windows kernel that have earned him unusually high bounty awards. He has written a couple of very successful open source libraries, and he likes to speak publicly at conferences.
