
Is Database Access Broken by Design?

Dan Nguyen
Jascha Beste
Nils Borrmann

Creating Awareness for Overlooked Insecurities in Data Practice

A digital vault as an analogy to a database, treasuring the most valuable assets of an organization. Created with Stable Diffusion.

This was our first blog article, originally published on LinkedIn.

In today's digital age, database access remains a weak point for many organizations. Think of databases as the vaults storing a company's most valuable assets - personal data (PII), proprietary data, crypto keys, and other sensitive information. Yet the way most organizations handle database access is far from secure.




Past Data Breaches Highlight Importance of Secure Data Access

Historically, the compromise of credentials and privileged accounts has been the most prominent cause of notable data breaches. Google researchers found that 41% of compromises in 2022 were attributable to weak passwords, while Mandiant found that credential theft as an entry vector rose from 9% to 14% between 2021 and 2022 [1].

In 2018 the Marriott hotel group detected a data breach that serves as a case in point: Hackers gained access to privileged accounts and were able to extract ~500m customer records from the booking database. The breach began in 2014 but wasn't discovered until 2018, and Marriott was subsequently issued a $24m GDPR fine [2].

But it's not always malicious intent that wreaks havoc: In 2017 GitLab experienced an 18h outage that impacted ~5000 projects worldwide after an engineer executed a command on the wrong database cluster [3]. Or consider the FAA pilot-alerting system outage in 2023 that led to a nationwide ground stop, delaying 34k flights in the US. Engineers accidentally corrupted the live database, forcing the pilot-alerting system to go offline overnight [4].

The Root Cause: Insecure Practices

Throughout our conversations with engineers and CTOs, and in our own experience as (cybersecurity) engineers, we've noticed fundamental flaws in how organizations provide access to data, flaws that expose them to great risk. We've witnessed immature and insecure practices both in formalized processes and in day-to-day operations. The problem is most acute when organizations rely solely on the basic tooling that many (SQL) databases offer. Below are the most common flaws that we've observed:

  1. Authentication Lapses: The absence of Single Sign-On or SAML means users need multiple credentials
  2. Shared Accounts: Often, multiple users operate under a single account, making it difficult to trace actions back to individuals
  3. Missing Credential Rotation: Credentials that aren't periodically rotated increase the window of opportunity for malicious actors to exploit potentially compromised, leaked, or outdated credentials
  4. Neglected Account Decommissioning: When account decommissioning is treated as an afterthought (e.g., when an employee is leaving the company), it results in overlooked shadow and unmanaged accounts, which can be exploited by hackers
  5. Process Circumvention: Some users might bypass standard processes (like using JIRA for access requests) in favor of shortcuts, leading to unmonitored activities
  6. Improper Audit Logs: Without correctly implemented audit logs and monitoring, tracing database activities becomes challenging
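Several of these flaws can be detected with a simple periodic check. The following is a minimal sketch, not a complete audit tool: the `DbAccount` model, the `audit_accounts` helper, and the 90-day rotation interval are all assumptions made for illustration, and a real check would pull this data from the database and the HR or identity system.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical rotation policy; the interval is an assumption, not a recommendation.
MAX_CREDENTIAL_AGE = timedelta(days=90)


@dataclass
class DbAccount:
    name: str
    owners: list[str]    # people who know the credentials
    last_rotated: date   # date of the last password/key rotation


def audit_accounts(accounts, active_employees, today):
    """Return (account name, finding) tuples for flaws 2, 3 and 4 above."""
    findings = []
    for acc in accounts:
        if len(acc.owners) > 1:
            findings.append((acc.name, "shared account"))
        if today - acc.last_rotated > MAX_CREDENTIAL_AGE:
            findings.append((acc.name, "rotation overdue"))
        if not any(owner in active_employees for owner in acc.owners):
            findings.append((acc.name, "orphaned account"))
    return findings


accounts = [
    DbAccount("app_admin", ["alice", "bob"], date(2023, 1, 10)),
    DbAccount("carol_ro", ["carol"], date(2023, 9, 1)),
    DbAccount("dave_rw", ["dave"], date(2023, 8, 20)),  # dave has left the company
]
findings = audit_accounts(
    accounts,
    active_employees={"alice", "bob", "carol"},
    today=date(2023, 10, 1),
)
for name, finding in findings:
    print(f"{name}: {finding}")
```

Here `app_admin` is flagged as shared and overdue for rotation, and `dave_rw` is flagged as orphaned because its only owner is no longer an active employee.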

Compounding the risks is the absence of rigorous QA in database operations. A typical Software Development Life Cycle (SDLC) involves many steps that can prevent errors from reaching production, including CI/CD pipelines, code tests, and code reviews. However, once access is granted, databases can be freely manipulated in live environments.
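To illustrate what even a lightweight guard rail for manual changes could look like, here is a minimal sketch using Python's built-in `sqlite3` module. The `guarded_update` helper and the `users` table are hypothetical; the idea is simply to run the change in a transaction and roll it back when it touches a different number of rows than the engineer expected.

```python
import sqlite3


def guarded_update(conn, sql, params, expected_rows):
    """Run a data change in a transaction and roll it back if the number of
    affected rows differs from what the engineer expected."""
    cur = conn.execute(sql, params)
    if cur.rowcount != expected_rows:
        conn.rollback()
        return False
    conn.commit()
    return True


# In-memory demo database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO users (status) VALUES (?)",
    [("active",), ("active",), ("trial",)],
)
conn.commit()

# Meant to touch one row, but the WHERE clause matches two: rolled back.
ok_bad = guarded_update(
    conn, "UPDATE users SET status = 'disabled' WHERE status = ?", ("active",), 1
)

# Targeted by primary key: exactly one row, committed.
ok_good = guarded_update(
    conn, "UPDATE users SET status = 'disabled' WHERE id = ?", (3,), 1
)

disabled = conn.execute(
    "SELECT COUNT(*) FROM users WHERE status = 'disabled'"
).fetchone()[0]
print(ok_bad, ok_good, disabled)
```

A check like this does not replace code review or testing, but it shows that some of the safety nets taken for granted in the SDLC can be applied to ad-hoc database changes as well.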

Manual Data Access: More Frequent and Risky Than Assumed

Digging deeper into the challenges associated with data access, we've pinpointed the most common scenarios in which it occurs.

We're intentionally excluding BI/data science and programmatic data access scenarios from our discussion. While these domains present distinct security challenges, they benefit from more mature processes and tooling:

  • BI and data science: Frequently facilitated through data lakes, which store a copied subset of the live data, often anonymized or pseudonymized to ensure data privacy. Data lakes are usually accessed through analytics tools, such as PowerBI, Tableau, and Python libraries
  • Programmatic data access: Refers to the systematic ways software applications interact with databases, including cloud databases and SQL databases. Key methodologies include using dedicated APIs or SQL queries. Authentication frequently uses username/password, API tokens or client certificates

What we want to highlight are manual database access scenarios, as we think that the existing procedures around authentication and authorization are fundamentally flawed. Below are the most common situations in which manual database access occurs:

  • Incident resolution and 3rd-level support: The ideal scenario involves engineers using specialized tools to troubleshoot and resolve issues. However, what we see is that tools are often unavailable or insufficient, especially for complex incidents. In such cases, engineers require direct production access to diagnose and fix bugs, or probe into issues reported by users
  • Feature Development: While developing new features, engineers often need to understand the current data distribution (e.g., for performance evaluations) or validate assumptions about the data. To facilitate this, engineers are either provided access to a replica database or granted direct access to the live database
  • Ad-hoc access for reporting: Business analysts sometimes require direct access to production databases when the data in the BI data lakes is insufficient, out of date, or has changed in a way that breaks reporting dashboards

In short, manual database access occurs a lot more frequently than many realize and carries most of the problems we highlighted before. Let's take a closer look at how organizations manage and orchestrate manual database access.

Database Access Management: The Current Landscape

Through our research, we've identified a spectrum of approaches to managing production database access, ranging from early-stage approaches to those adopted by mature organizations:

  1. No controls: The norm in young and unregulated companies, where access is freely granted upon request and credentials are often provided unencrypted and shared amongst employees
  2. Access review process in place: While there are ticketing systems like JIRA workflows in place to review, approve, and document access requests, they often lack strict enforcement. Moreover, procedures of account decommissioning and credential rotation are commonly neglected and treated as a secondary concern
  3. Ops team as intermediaries: Some companies deploy dedicated ops teams to serve as intermediaries. They usually either create and provide access to data replicas or data lakes, which add to storage expenses, or handle data requests themselves. This reliance on the ops team creates a bottleneck, preventing engineers from engaging in DevOps practices. Furthermore, due to their elevated access privileges, these team members can become prominent targets for cyberattacks
  4. Internal ad-hoc solutions: While developed in-house and tailored to the organization's needs, these solutions can consume significant developer resources for development and maintenance. We've come across a range of inventive methods, from GitOps workflows to using AWS Lambdas for granting access. Although these methods effectively address specific security challenges, they might not cover all aspects comprehensively
  5. Commercial software solutions: In 2023, the market for commercial access management and data security solutions is estimated to be worth $25bn. These solutions typically offer features like RBAC, SSO, dynamic secret generation, and auditing [5]. However, they can be challenging to integrate, lack flexibility in customization, and often carry a hefty price tag
  6. Holistic custom platforms: We've seen mature (tech) companies developing comprehensive platforms for database access with features like time-limited access and column/row restrictions. Additionally, some have implemented review and approval mechanisms at the query level, upholding the "four-eyes principle" for enhanced security
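The query-level "four-eyes principle" combined with time-limited access can be sketched in a few lines. This is an illustrative model under our own assumptions, not the implementation of any particular platform: the `QueryRequest` class, its fields, and the one-hour validity window are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class QueryRequest:
    author: str
    sql: str
    approved_by: Optional[str] = None
    expires_at: Optional[datetime] = None

    def approve(self, reviewer: str, now: datetime, ttl: timedelta) -> None:
        # Four-eyes principle: the author may not approve their own request.
        if reviewer == self.author:
            raise PermissionError("author cannot approve their own request")
        self.approved_by = reviewer
        self.expires_at = now + ttl

    def may_execute(self, now: datetime) -> bool:
        # Executable only after approval and before the approval expires.
        return (
            self.approved_by is not None
            and self.expires_at is not None
            and now < self.expires_at
        )


t0 = datetime(2023, 10, 1, 12, 0)
req = QueryRequest("alice", "UPDATE orders SET state = 'refunded' WHERE id = 42")

before_approval = req.may_execute(t0)

try:
    req.approve("alice", t0, timedelta(hours=1))
    self_approval_allowed = True
except PermissionError:
    self_approval_allowed = False

req.approve("bob", t0, timedelta(hours=1))
within_window = req.may_execute(t0 + timedelta(minutes=30))
after_window = req.may_execute(t0 + timedelta(hours=2))
print(before_approval, self_approval_allowed, within_window, after_window)
```

The request cannot be executed before a second person approves it, the author cannot act as their own reviewer, and the approval lapses once the time window has passed.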

In conclusion, while the landscape of database access management is diverse and ever-changing, commercial solutions haven't provided the needed balance between security and developer experience and productivity. Given their complexity and pricing, it's no surprise that many small to medium-sized organizations lean towards building in-house solutions and processes.

Balancing Productivity and Security

The challenge facing businesses as they grow is speed versus risk: Grant extensive production access, thereby embracing the modern ethos of "You build it, you run it", but risk enlarging the attack surface. Conversely, strictly locking down access to the production environment might bolster security but slows down the pace of development and incident resolution. Such stringent measures, though appearing safe, aren't completely secure. High-privilege accounts still exist and pose a risk, and the potential for innocent mistakes remains.

In our observations, it's common for startups to begin with a more trust-based and liberal approach to production access, to allow for agility and rapid development in the early stages. However, as these companies scale and mature, there's a shift towards more stringent access controls, often driven by regulatory requirements.

In hyper-growth startups in particular, growth often outpaces the ability to implement mature processes. As a result, there can be a period where these companies find themselves in a transitional phase, moving from the unrestricted environment of a startup to a more structured and compliant operational model.

A Call to Action

Having witnessed these shortcomings first hand at previous organizations, we are creating Kviklet, an open-source solution inspired by the most mature workflows we have seen. We believe that a community-driven tool can bridge current gaps and improve security and developer experience at the same time.

If these challenges resonate with you and you have insights into how database access should be managed, get in touch with us and let's co-create a tool that benefits the entire DevOps community!


References

  1. Google, "How leaders can reduce risk by shutting down security theater", Sep 2023
  2. CSO Online, "Marriott data breach FAQ: How did it happen and what was the impact?", Feb 2020
  3. GitLab, "Postmortem of database outage of January 31", Feb 2017
  4. Federal Aviation Administration, "FAA NOTAM Statement", Jan 2023
  5. Gartner, "Gartner Identifies Three Factors Influencing Growth in Security Spending", Oct 2022
  6. https://kviklet.dev