600.428: Compiler Writing
Synopsis:
Introduction to compiler design, including lexical analysis, parsing, syntax-directed translation, symbol tables, run-time environments, and program analysis. Students are required to write a compiler as a course project.
Prerequisites:
600.226 (Data Structures).
Enrollment:
Limited to approximately 15 students.
Description:
The subject of "compilers" traditionally covers the process by which some source language, such as "C", is converted to machine code for execution. This includes everything from how to read the source file (lexical analysis and parsing) to figuring out what the source code means (semantics) to making it fast (optimization) to figuring out how to run it (register allocation and code generation).
For this semester, I have chosen to adapt this course somewhat in the direction of program analysis. Instead of dealing with optimization, register allocation, and code generation, we will view the "intermediate form" of the program as a data set for exploratory analysis, and examine ways in which this analysis might be usefully performed to find bugs or validate properties of programs. This isn't really as "untraditional" as it sounds, since optimizers and code generators view the program in very much the same way that we will. Some questions we want to be able to ask by the end of the course:
-
I have some sequence of steps that my program is supposed to do, and some other sequences that it is not. Does it actually do them? (Control Flow analysis)
-
There is a particular set of well defined states that my object is supposed to be in over its lifetime. Does the expected evolution of state actually happen? (Typestate analysis)
-
There is some theorem about my program that I would like to formally verify. Why is this hard to do?
Readings:
Required:
Appel: Modern Compiler Implementation in Java, 2nd edition (Cambridge)