Infrastructure As Data

By Tobias Macey

Elevator Pitch

Programmable infrastructure lets us drive our deployments with code, but that ignores the inherent statefulness of our environments. In addition to software principles, we should be focusing on the abstractions and domain models that allow us to apply principles of data evolution to our deployments.


Building and maintaining infrastructure has become easier and faster with the introduction of technologies such as server virtualization, cloud platforms, container orchestrators, and configuration management frameworks. This has led to the mantra of “infrastructure as code”. Unfortunately, this abstraction is leaky and ignores the inherent statefulness of already-deployed services.

One of the main challenges of working with infrastructure as code is that refactoring is inevitable, and every refactoring has to interact with the current state of our environments. Adding and removing capacity can be fairly painless because you can largely ignore your current state. The real difficulty arises when you need to evolve or modify existing systems, especially when your data model isn’t structured in a way that is easy to extend.

In this talk we will explore the pain points that arise when new or updated demands on your infrastructure invalidate the assumptions you made about your requirements. By doing some up-front design and establishing a domain model that will grow with your business, we can avoid some of the incidental complexity that prevents us from evolving our infrastructure in a clean and sustainable manner.


I am the manager and team lead of technical operations at MIT Open Learning, which has allowed me to design, build and evolve our infrastructure automation systems over the past 3 1/2 years. This has been a time of growth and increased responsibility for our team, requiring us to adapt and expand our capabilities to encompass new systems and applications. Throughout this we have been able to maintain the same core code base, but it has required several phases of refactoring and revisiting the assumptions in our domain model. This presentation will allow me to share my experiences guiding us through these problems and provide a new framing for how we think about managing our programmable infrastructure.

Previous conference presentations that I have given are:

  • DevOps Days Boston 2017 - Open Sourcing Your Infrastructure

  • Open edX Conference 2018 - Openly Deploying Open edX at MIT Open Learning

  • DevOps Days Boston 2018 - DevOps For Data Engineers