Exploring Abstract Syntax Trees by writing a linter with Python's ast module

By Yashvardhan Kukreja

Elevator Pitch

One of the most crucial steps in compiling any programming language is building the Abstract Syntax Tree. In this talk, we’ll explore that step from both a theoretical and a practical standpoint by writing a linter with Python’s ast module.

Description

Intro

Computers don’t understand Python, Java or Go. They only understand machine code, yet we programmers never write massive streams of ones and zeros to create a program. So how does our code, written in fancy languages like Python, get interpreted or compiled by our computers?

That process is complex and involves multiple steps, most of which vary from language to language. But one step is common to all of them: understanding the structure of the code being compiled. That has to be done efficiently and reliably, given how diverse languages are.

Hence, to represent the structure of the source code in a form that is efficient to process, compilers scan the source code and build a tree-like representation of it called the Abstract Syntax Tree.
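To make the idea concrete, here is a minimal sketch of what that tree looks like for a single line of Python, using the standard library’s ast module. The one-line program and the variable names are made up for illustration; the exact text printed by ast.dump varies between Python versions, so no output is shown for it.

```python
import ast

# A made-up, one-line program used only for illustration.
source = "total = price * 2"

# Parse the source text into an Abstract Syntax Tree.
tree = ast.parse(source)

# The root is a Module node; its body holds one node per top-level statement.
statement = tree.body[0]

print(type(statement).__name__)           # Assign
print(type(statement.value).__name__)     # BinOp
print(type(statement.value.op).__name__)  # Mult

# Dump the whole tree as text to see every node and field.
print(ast.dump(tree))
```

The assignment, the multiplication and its operands each become a node, and the nesting of those nodes mirrors the structure of the code rather than its exact text.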

Now you can see how crucial an Abstract Syntax Tree is, and it should be apparent how complicated it can get given how diverse today’s languages are. Don’t believe me? Search for it online and you’ll stumble across countless pieces of algorithmic and systems-y jargon and feel lost.

But fear not

This talk explains Abstract Syntax Trees in a simple and intuitive manner by writing a linter from scratch. We have all used linters to keep our code nice, clean and consistently written. But behind the scenes, linters face the same problem of understanding the structure of the code they plan to “lint”. Linters are therefore one of the most common use cases for Abstract Syntax Trees, and a nice gateway to understanding in practice how ASTs work.

About this talk

In this talk, I’ll begin by talking about the compilation process and how Abstract Syntax Trees fit into it. I’ll then dig further into ASTs and explain their granular details so that we all end up with a shared vocabulary around them, which will help us exchange ideas conveniently. I’ll do all of this while running things interactively in a Python interpreter beside me, so as to explain everything as comprehensively as possible. Finally, since theory alone is not enough, I’ll write a toy linter from scratch using Python’s ast module and the knowledge gained in the session. This will show in practice how our newly gained knowledge of ASTs fits a mainstream yet complicated use case: linters.
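As a rough preview of the kind of toy linter this could lead to, here is a minimal sketch built on ast.NodeVisitor. The class name ToyLinter, the lint helper, the sample code and the two rules (bare except clauses and functions without docstrings) are all assumptions chosen for illustration, not necessarily the linter built in the talk.

```python
import ast


class ToyLinter(ast.NodeVisitor):
    """Hypothetical toy linter that collects (line, message) pairs."""

    def __init__(self):
        self.problems = []

    def visit_ExceptHandler(self, node):
        # A bare `except:` has no exception type attached to its handler node.
        if node.type is None:
            self.problems.append((node.lineno, "bare except"))
        self.generic_visit(node)

    def visit_FunctionDef(self, node):
        # Flag functions whose body does not start with a docstring.
        if ast.get_docstring(node) is None:
            self.problems.append(
                (node.lineno, f"function '{node.name}' has no docstring")
            )
        # Keep walking so nested nodes are checked too.
        self.generic_visit(node)


def lint(source):
    """Parse the source, run every rule, and return problems sorted by line."""
    linter = ToyLinter()
    linter.visit(ast.parse(source))
    return sorted(linter.problems)


# Made-up sample code that breaks both rules.
SAMPLE = '''\
def load(path):
    try:
        return open(path).read()
    except:
        return None
'''

for line, message in lint(SAMPLE):
    print(f"line {line}: {message}")
# line 1: function 'load' has no docstring
# line 4: bare except
```

Each rule is a visit_<NodeType> method, so adding a rule means adding a method for the node type it cares about. The sample code is only parsed, never executed.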

Notes

Technical Requirements
- Basics of Python programming
- Desire to learn cool stuff (not-so-technical requirement :) )

I believe that I am a suitable person to speak about this topic for two reasons:
- My technical expertise in the field: I have long worked as a software engineer with languages like Python and GoLang around cloud-native technologies, which has given me the technical skills and experience to speak confidently about such topics and address any questions around them.
- My experience with imparting knowledge: I have been sharing knowledge in various ways with large audiences for several years. I have spoken at large technical conferences such as GrafanaCon 2021, GitOpsCon 2022, PyCon 2021, AWS Community Days 2020 and PyCon 2019 on various complicated technical topics. Moreover, I have hosted technical workshops for students at my college. Finally, I regularly write technical blogs.

For these reasons, I am confident that I can teach a complicated topic like Abstract Syntax Trees as intuitively as possible without making anyone yawn at any point during my talk :D