Getting Started in Programming Languages

A student in Northeastern’s Fundamentals of Computing course, in which I’m a TA, recently asked me how to learn what the “state of the art” is in the study of programming languages. It’s a long road from Fundamentals to reading and understanding today’s research, but this post is my answer to how to start down that path.

First, a distinction: to study programming languages, many people will point you to a book on compilers or interpreters. While those topics make up a large part of the field, they are only one piece of the puzzle. The heart of (the study of) programming languages is formal semantics: the idea that one can mathematically specify the meaning of expressions and constructs in a programming language to avoid the ambiguity that otherwise arises from plain-English descriptions. Equipped with a formal semantics, one can develop multiple techniques to run programs (compilers, interpreters, hybrid approaches like JITs, etc.), new kinds of optimizations, and manual and automatic techniques to find bugs or prove properties of programs, all while remaining faithful to the intended definition of the programming language. I recommend first learning how to build and run your own language, then moving on to formal semantics to get a solid foundation for understanding current research.

With that in mind, here are some resources for the new (or not so new) computer scientist wishing to learn more about programming languages:

Essentials of Programming Languages: EOPL, as it’s better known, introduces readers to the internal workings of programming languages by describing small programming languages and creating an interpreter for each one. The book is very hands-on, with lots of exercises for the reader to modify the interpreters with new features. It touches on the ideas of reasoning about languages and formal semantics, but mostly sticks to the interpreter-as-semantics approach.
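To give a flavor of the interpreter-as-semantics idea, here is a minimal sketch in Python (not taken from EOPL, which uses Scheme). The tiny language, the class names (Num, Var, Add, Let), and the evaluate function are all made up for illustration. The point is only that the interpreter itself pins down what each construct means, including details like how an inner let shadows an outer one.

```python
from dataclasses import dataclass
from typing import Union

# A hypothetical expression language: numbers, variables, addition,
# and `let` bindings. These names are illustrative, not from EOPL.

@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Let:
    name: str
    bound: "Expr"
    body: "Expr"

Expr = Union[Num, Var, Add, Let]


def evaluate(expr, env=None):
    """Evaluate `expr` in `env`, a dict mapping variable names to ints."""
    env = {} if env is None else env
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Var):
        if expr.name not in env:
            raise NameError(f"unbound variable: {expr.name}")
        return env[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Let):
        # Evaluate the bound expression first, then the body in an
        # extended environment. Copying the dict keeps the binding
        # local to the body (lexical scoping).
        value = evaluate(expr.bound, env)
        return evaluate(expr.body, {**env, expr.name: value})
    raise TypeError(f"not an expression: {expr!r}")


# let x = 2 in x + (let x = 10 in x)
program = Let("x", Num(2), Add(Var("x"), Let("x", Num(10), Var("x"))))
print(evaluate(program))  # 12: the outer x is 2, the inner x is 10
```

Adding a feature to a language like this (say, subtraction or conditionals) means adding a class and a case to evaluate, which is the style of exercise the book is built around.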

Jonathan Turner’s Reading List: Turner is an engineer on Mozilla’s Rust team and recently posted his reading list for getting up to speed on programming languages. The list starts with some resources on how to build interpreters and compilers, then points to more academic material.

Types and Programming Languages: Better known as TAPL (rhymes with “apple”), this book has a solid introduction to formal semantics in its first few chapters and would be my pick for a starting point on the subject. The remainder of the book deals with type systems, which form only one part of programming languages, but it’s the canonical reference if you’re looking to learn about types.

Semantics Engineering with PLT Redex: The PhD-level programming languages course here at Northeastern uses the Redex book, and I found it to be a good introduction. The tool itself (Redex) is a great way to experiment with semantics, including reduction relations (roughly, the part of the semantics that says how the program runs) and type systems. You could use this book as a substitute for TAPL (at least for learning the basics of formal semantics), or you could use Redex to experiment with the languages described in TAPL.
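If "reduction relation" sounds abstract, the following rough sketch may help. It is plain Python, not Redex, and the term encoding (ints as values, tuples tagged "add" and "if0") and the function names step and trace are my own assumptions for illustration. Each call to step performs exactly one rewrite, so a program "runs" by being rewritten repeatedly until it becomes a value.

```python
# Terms in a hypothetical toy language:
#   an int                is a value
#   ("add", e1, e2)       adds two subterms
#   ("if0", c, t, e)      picks t if c reduces to 0, otherwise e

def is_value(term):
    return isinstance(term, int)


def step(term):
    """Return the term after one reduction step, or None if no rule applies."""
    if is_value(term) or not isinstance(term, tuple) or not term:
        return None
    tag = term[0]
    if tag == "add" and len(term) == 3:
        _, left, right = term
        if not is_value(left):
            # Reduce inside the left operand first (left-to-right order).
            nxt = step(left)
            return None if nxt is None else ("add", nxt, right)
        if not is_value(right):
            nxt = step(right)
            return None if nxt is None else ("add", left, nxt)
        return left + right
    if tag == "if0" and len(term) == 4:
        _, cond, then, other = term
        if not is_value(cond):
            nxt = step(cond)
            return None if nxt is None else ("if0", nxt, then, other)
        return then if cond == 0 else other
    return None  # unknown form: the term is stuck


def trace(term):
    """Collect every term visited while reducing `term` as far as possible."""
    steps = [term]
    nxt = step(term)
    while nxt is not None:
        steps.append(nxt)
        nxt = step(nxt)
    return steps


for t in trace(("add", ("add", 1, 2), ("if0", 0, 4, 5))):
    print(t)
# ('add', ('add', 1, 2), ('if0', 0, 4, 5))
# ('add', 3, ('if0', 0, 4, 5))
# ('add', 3, 4)
# 7
```

Redex lets you write rules like these declaratively and then explore or test them, rather than hand-coding the traversal as done here.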

10PL: This is a list compiled by Northeastern’s PL faculty of (roughly) ten academic papers that they think every student in PL should know. Not all of them are PL papers themselves, and they don’t form a full foundation on their own, but they form a kind of “great books” list for PL. Benjamin Pierce, the author of TAPL, has a similar list, with a slightly more type-heavy and theoretical bent.

That list is more than enough to get you started. I omitted resources for learning about formal methods and software engineering, two fields that overlap heavily with PL, but you may be interested in learning about them, too. For more information, I recommend talking to students or faculty in PL at your school, joining (or starting) a PL reading group, or eventually even applying to grad school if you’re so inclined.

Good luck!