Screen scraping lets you extract data that you can’t get any other way. We’ll tour the Go packages available for scraping content and examine a case study: extracting metrics from an old modem’s status page to monitor the device and visualize its performance.
Someday you’ll come across data on a page that you want to use, but you won’t find an API. Maybe the API is private, maybe none exists, or maybe the one on offer is so byzantine that you’re desperate for any alternative. When all else fails, there’s always screen scraping. Go is a perfectly capable language for this task, and I’ll show you the packages that make it possible. We’ll also look at a practical example: using screen scraping to collect metrics from a cable modem and visualize them in Grafana. When we’re done, any data that you can browse to in Chrome will also be available to your Go applications.
The idea for this talk came from a side project to monitor my cable modem: I was convinced there was an issue on the cable company’s side and wanted evidence to prove it. Unfortunately, all I had was a status page that looks like this, and I needed to scrape metrics out of it. That led me to explore screen scraping in Go, and I wrote a utility to scrape those metrics and visualize them.