Introduction to data streaming

By Nicolas Frankel

Elevator Pitch

Data streaming is the new black: with some introduction into its background and a fun demo, I’ll show what it consists of and how one can leverage it.

Description

While “software is eating the world”, those who are able to best manage the huge mass of data will emerge out on the top.

The batch processing model has been faithfully serving us for decades. However, it might have reached the end of its usefulness for all but some very specific use-cases. As the pace of businesses increases, most of the time, decision makers prefer slightly wrong data sooner, than 100% accurate data later. Stream processing - or data streaming - exactly matches this usage: instead of managing the entire bulk of data, manage pieces of them as soon as they become available.

In this talk, I’ll define the context in which the old batch processing model was born, the reasons that are behind the new stream processing one, how they compare, what are their pros and cons, and a list of existing technologies implementing the latter with their most prominent characteristics. I’ll conclude by describing in detail one possible use-case of data streaming that is not possible with batches: display in (near) real-time all trains in Switzerland and their position on a map. I’ll go through the all the requirements and the design. Finally, using an OpenData endpoint and the Hazelcast platform, I’ll try to impress attendees with a working demo implementation of it.