Basics
- In its most basic form, a compiler takes as input a program written in some language and produces as output an equivalent program.
- An interpreter takes as input an executable specification and produces as output the result of executing the specification.
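The contrast between the two definitions can be sketched on a toy language. Everything here is an assumption made for illustration: the tuple encoding of expressions, the made-up stack machine, and the names `interpret`, `compile_to_stack`, and `run_stack`.

```python
# Hypothetical mini-language: an expression is a number, a variable name,
# or a tuple (op, left, right) with op in {"+", "*"}.

def interpret(expr, env):
    """Interpreter: takes the program and produces the result of executing it."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    a, b = interpret(left, env), interpret(right, env)
    return a + b if op == "+" else a * b


def compile_to_stack(expr):
    """Compiler: takes the program and produces an equivalent program,
    here a list of instructions for a made-up stack machine."""
    if isinstance(expr, (int, float)):
        return [("push", expr)]
    if isinstance(expr, str):
        return [("load", expr)]
    op, left, right = expr
    code = compile_to_stack(left) + compile_to_stack(right)
    return code + [("add",) if op == "+" else ("mul",)]


def run_stack(code, env):
    """Executes the compiled program; stands in for the target machine."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        elif instr[0] == "load":
            stack.append(env[instr[1]])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack.pop()


program = ("+", "a", ("*", 2, "b"))
env = {"a": 1, "b": 3}
print(interpret(program, env))         # 7
print(compile_to_stack(program))
# [('load', 'a'), ('push', 2), ('load', 'b'), ('mul',), ('add',)]
print(run_stack(compile_to_stack(program), env))  # 7
```

The interpreter returns a value directly, while the compiler returns another program that yields the same value once something executes it.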
Principles of Compilation
- The compiler must preserve the meaning of the program being compiled.
- The compiler must improve the input program in some discernible way.
Structure of a Compiler
- Front End: focuses on understanding the source-language program.
- Back End: focuses on mapping programs to the target machine.
- The front end encodes the source program in a structure for later use by the back end. This structure is called an intermediate representation, or IR.
- At each point in compilation, the compiler has one definitive IR for the code being compiled.
- Using an IR allows us to place more phases between the front end and the back end.
- Optimizer: takes an IR as input and produces a semantically equivalent IR as its output. Its aim is to improve the IR in some way.
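As a minimal sketch of what an IR-to-IR phase can look like, the pass below performs constant folding on an assumed tree-shaped IR. The tuple encoding and the names `fold_constants` and `evaluate` are hypothetical, and constant folding is only one of many possible improvements.

```python
# Assumed IR: a number, a variable name, or a tuple (op, left, right)
# with op in {"+", "*"}.

def fold_constants(ir):
    """Return an IR with the same meaning in which constant subtrees are precomputed."""
    if not isinstance(ir, tuple):
        return ir  # a number or a variable name: nothing to improve
    op, left, right = ir
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right if op == "+" else left * right
    return (op, left, right)


def evaluate(ir, env):
    """Reference evaluator, used only to check that meaning is preserved."""
    if isinstance(ir, (int, float)):
        return ir
    if isinstance(ir, str):
        return env[ir]
    op, left, right = ir
    a, b = evaluate(left, env), evaluate(right, env)
    return a + b if op == "+" else a * b


before = ("*", "a", ("+", 2, ("*", 3, 4)))
after = fold_constants(before)
print(after)                                                   # ('*', 'a', 14)
print(evaluate(before, {"a": 5}) == evaluate(after, {"a": 5}))  # True
```

The input and output are both in the same IR, which is what lets such a pass sit between the front end and the back end without either one changing.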
Translation Overview
Aim: to generate a translation for the following code
The Front End
The front end determines if the input code is well formed, in terms of both syntax and semantics.
Checking Syntax
- The compiler compares the input program’s structure against a formal definition of the language
- This requires a formal definition of the language, a mechanism to test the input against that definition, and a plan for handling illegal input.
- Grammar: the source language is a set of strings, usually infinite, defined by a finite set of rules called a grammar.
- Scanner: takes a stream of characters and converts it into a stream of classified words.
- Next, the compiler tries to match the stream of categorized words against the rules that specify syntax for the input language.
- Parsing: the process of automatically finding a derivation, that is, a proof that a statement is valid under the grammar.
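The scanner and parser steps above can be sketched for a toy expression language. The token categories, the grammar, and the names `scan` and `Parser` are assumptions for illustration, not the definition of any particular source language. Illegal input is handled here by raising `SyntaxError`, which is one possible plan among several.

```python
import re

# Assumed word categories for a toy expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/()]"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Scanner: turn a stream of characters into a list of (category, word) pairs."""
    words = []
    for match in TOKEN_RE.finditer(text):
        category, word = match.lastgroup, match.group()
        if category == "SKIP":
            continue
        if category == "ERROR":
            raise SyntaxError(f"illegal character {word!r} at position {match.start()}")
        words.append((category, word))
    return words


class Parser:
    """Recursive-descent parser for the assumed grammar:
         Expr   -> Term   { ("+" | "-") Term }
         Term   -> Factor { ("*" | "/") Factor }
         Factor -> NUMBER | NAME | "(" Expr ")"
    """

    def __init__(self, words):
        self.words = words
        self.pos = 0

    def peek(self):
        return self.words[self.pos] if self.pos < len(self.words) else ("EOF", "")

    def advance(self):
        word = self.peek()
        self.pos += 1
        return word

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "EOF":
            raise SyntaxError(f"unexpected word {self.peek()[1]!r}")
        return tree

    def expr(self):
        tree = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            tree = (op, tree, self.term())
        return tree

    def term(self):
        tree = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            tree = (op, tree, self.factor())
        return tree

    def factor(self):
        category, word = self.advance()
        if category == "NUMBER":
            return int(word)
        if category == "NAME":
            return word
        if (category, word) == ("OP", "("):
            tree = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return tree
        raise SyntaxError(f"unexpected word {word!r}")


words = scan("a + 2 * b")
print(words)
# [('NAME', 'a'), ('OP', '+'), ('NUMBER', '2'), ('OP', '*'), ('NAME', 'b')]
print(Parser(words).parse())
# ('+', 'a', ('*', 2, 'b'))
```

A successful parse returns a tree that records which rules were applied, so the tree itself serves as evidence of a derivation. Input such as `a + * b` matches no rule and is rejected with `SyntaxError`.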
Intermediate Representations
- The final task of the front end is to output an IR. For our statement above, the IR becomes