Chaos Engineering - why breaking things should be practised.

By Adrian Hornsby

Elevator Pitch

Ever wondered how companies that deliver global services, such as Amazon or Netflix, architect and test their software systems? If you are curious and want to learn how they do it, this session is for you!

Description

With the rise of microservices and large-scale distributed architectures, software systems have grown increasingly complex and hard to understand. Adding to that complexity, the velocity of software delivery has also increased dramatically, making failures harder to predict and contain. While the cloud allows for high availability, redundancy and fault tolerance, no single component can guarantee 100% uptime. We therefore have to understand availability and, above all, learn how to design architectures with failure in mind. And since failures have become more and more chaotic in nature, we must turn to chaos engineering to identify failures before they become outages. In this talk, I will dive deep into availability, reliability and large-scale architectures, and introduce chaos engineering, a discipline that promotes breaking things on purpose in order to learn how to build more resilient systems.

Notes

This is a talk I have given at AWS Meetups in the Nordics and will be giving at the AWS London Summit in May 2018. The slides are here: https://www.slideshare.net/hornsby/chaos-engineering-why-breaking-things-should-be-practised-93761039

I also write about the topic on my blog: https://medium.com/@adhorn

I am happy to give the same talk or a modified version of it to fit the audience.