Data as Code: Exploring Data and Database Portability

By Sean Scott

Elevator Pitch

Data as Code eliminates the bottlenecks of legacy environments by democratizing the data stack! Organizations don’t have to compromise data quality against speed and convenience when developers can work independently and responsibilities shift away from dedicated database teams!

Description

Databases are traditionally roadblocks to application development. Data refreshes are time-consuming and require downtime and coordination across teams. Legacy environments must balance the risk of stale data against the time, effort, and inconvenience of reverting databases to baselines. Even modern agile shops struggle with this task, engineering elaborate processes or investing in complex solutions for data management. The primary obstacle? Traditional databases merge software, configuration, and data, creating a bottleneck around niche skills unique to database administrators.

Organizations eliminate that dependency by shifting to a container model and delivering Data as Code. Containers separate database software from its data. Decoupling data eliminates the need for specialized skills and makes data just another artifact in the development lifecycle. Individuals and teams can create and refresh data on demand, version it alongside application code, and distribute it as code or objects. Data as Code reduces the uncertainty introduced by data drift and allows organizations to experience greater development velocity, consistent and reliable testing, and improved application quality!

Notes

Data as Code is a natural evolution of the shift to “as code” seen over the past decade. Code and infrastructure are already part of the CI/CD pipeline. Data and databases are the last holdouts still clinging to legacy paradigms. Leveraging the same container technology now ubiquitous for application development decouples data and configuration from database software and eliminates reliance on the limited and specialized skill sets held by database administrators. With Data as Code, data becomes an asset living alongside application code and, like code, is managed, versioned, and distributed through the same channels.

Just as Infrastructure as Code delivers repeatable and portable deployments for application environments and improves the overall quality and pace of the development lifecycle, Data as Code delivers similar benefits and eliminates the bottlenecks inherent in legacy methods. It democratizes the data stack, empowering developers to work independently and shifting responsibilities away from dedicated database administrators. Untethering data from database teams means organizations no longer have to compromise data quality against speed and convenience.

After introducing Data as Code, I demonstrate how developers and database teams can adopt it to reimagine how data is managed and presented to test, development, and QA workflows.