Basics

  • At the most basic definition, the compiler takes as input a program written in some language and produces as its output an equivalent program.
  • An interpreter takes as input an executable specification and produces as output the result of executing the specification.

Principals of Compilation

  • The compiler must preserve the meaning of the program being compiled.
  • The compiler must improve the input program in some discernible way.

Structure of Compiler

  • Front End: focuses on understanding the source-language program

  • Back End: focuses on mapping programs to the target machine

  • The front end encodes the source program in a structure for the later use by the back end. This is called intermediate representation or IR.

    • At each point, one representation will have a definite IR.
    • IR allows us to have more phases between the front end and the back end
  • Optimizer: Takes in an IR, and produces a semantically equivalent IR as it’s output. It’s aim is to improve the IR in some way

Translation Overview

Aim: to generate a translation for the following code

The Front End

The front end determines if the input code is well formed, in terms of both syntax and semantics.

Checking Syntax

  • The compiler compares the input program’s structure against a formal definition of the language
  • This requires a formal and appropriate definition, a mechanism to test the input against the definition and a plan to handle illegal input
  • Grammar: the source language is a set, usually infinite, of strings defined by some finite set of rules.
  • Scanner: Scanner takes a stream of characters and converts it into a stream of classified words.
  • Next, the compiler tries to match the stream of categorized words against the rules that specify syntax for the input language.
  • Parsing: The process of automatically finding derivations (proof that a statement is valid) is called parsing.

Intermediate Representations

  • The final task of the front end is to output an IR. For our statement above, the IR becomes

The Optimizer