Sipping from a firehose: Parallel ETL in Raku

By Steven Lembark

Elevator Pitch

‘Hypers & Gathers & Takes, Oh my!’ was a quick overview of Raku’s general language facilities for ETL. This talk looks more closely at ETL requirements, with more description of the parallel make and lazy gather that make Raku so effective for ETL work in general.

Description

ETL is about getting data from there to here. In today’s world that means handling lots of data, usually in parallel and usually on tight timelines. Raku’s parallel make and lazy gather form the basis of a nice pattern for general ETL, separating the low-level tools for extracting and munging the input data from the higher-level constructs for actually processing it. This talk looks at some fundamental ETL input varieties (lines, chunks) and how to use Raku as the glue for managing their import via flat files or data channels.

Notes

People seemed to like the original talk; this one shifts from the ‘what’ of processing to more specific instruction on the ‘how’ and ‘why’.