From ACID to CAP and back again: Making S3 reliable

By Baruch Sadogursky

Elevator Pitch

Using hybrid storage for an artifact repository has clear benefits: the artifacts live on the filesystem, and the metadata lives in a fast, reliable, indexed database. But how do you couple eventually consistent storage like S3 with the strong consistency model that your CI requires?

Description

Hybrid storage, like the one used in Artifactory, has clear benefits: artifacts are stored on the filesystem in a layout-independent manner, while all the metadata is kept in a fast, reliable, indexed database. But what if you want to leverage a cheap, extremely robust, clustered, and highly available storage solution like AWS S3 and make it work with the metadata and other features of Artifactory? How do you couple inherently eventually consistent storage like S3 with the strong consistency model that your CI process requires? This talk is the story of our journey to find harmony between ACID and CAP. We will review the challenges of building a reliable, atomic system on top of eventually consistent storage, and how we solved them in Artifactory for both standalone and clustered active-active architectures.

Notes

This is the story of a strange matchup: we made S3 ACID. Why, and how?