
01 Introduction

Language Translators

You should know by this stage :/

If you don't, refer to this.

Programming Language Processing

flowchart LR
a(( )) -->
|Source<br/>Program| Preprocessor -->
|Modified<br/>Source<br/>Program| Compiler -->
|Target<br/>Assembly<br/>Code| Assembler -->
|Relocatable<br/>Machine<br/>Code| Linker/Loader -->
|Target<br/>Machine<br/>Code| b(( ))

Compiler outputs assembly code, as it is easier to

  • produce as output
  • debug

The linker resolves external memory addresses, where code in one file may refer to a location in another file.
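
As a rough illustration of what "resolving external addresses" means, here is a minimal sketch of a toy linker. The object-file layout (a dict with `name`, `size`, `defs` and `refs`) and the function name `link` are assumptions made up for this example; real object formats such as ELF are far richer.

```python
def link(objects, base=0):
    """Lay object files out consecutively and resolve cross-file references.

    Each object is a hypothetical dict:
      name: file name, size: bytes of code,
      defs: {symbol: offset inside this file},
      refs: symbols this file uses but does not define.
    """
    symbols = {}
    address = base
    # Pass 1: give every defined symbol an absolute address.
    for obj in objects:
        for sym, offset in obj["defs"].items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol {sym!r}")
            symbols[sym] = address + offset
        address += obj["size"]
    # Pass 2: resolve each external reference against the global table.
    resolved = {}
    for obj in objects:
        for sym in obj["refs"]:
            if sym not in symbols:
                raise ValueError(f"undefined symbol {sym!r} in {obj['name']}")
            resolved[(obj["name"], sym)] = symbols[sym]
    return symbols, resolved


objects = [
    {"name": "main.o", "size": 32, "defs": {"main": 0}, "refs": ["helper"]},
    {"name": "util.o", "size": 16, "defs": {"helper": 4}, "refs": []},
]
symbols, resolved = link(objects)
print(symbols)   # absolute address of every symbol
print(resolved)  # address each external reference now points to
```

Here `main.o` refers to `helper`, which lives in `util.o`; the sketch places `util.o` right after `main.o` and patches the reference with the resulting address.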

Stages of Compiler

flowchart LR

a[/Input/] -->
|Char<br/>Stream| la

subgraph Analysis/Front End
la["Lexical<br/>Analysis<br/>(Scanning)"] -->
|Token<br/>Stream| sya["Syntax<br/>Analysis<br>(Parsing)"] -->
|Syntax<br/>Tree| sea[Semantic<br/>Analysis]
end

sea --> |Syntax<br/>Tree| icg

subgraph Synthesis/Back End
icg[Intermediate<br/>Code<br/>Generation] -->
|Intermediate<br/>Representation| mico[Machine-Independent<br/>Code Optimization] -->
|Intermediate<br/>Representation| c[Code<br/>Generator] -->
|Target<br/>Machine<br/>Code| mdco[Machine-Dependent<br/>Code Optimization]
end

mdco --> o[/Output/]

| Stage | Input | Task |
|---|---|---|
| Lexical Analysis / Scanning | Source program | - Group characters into lexemes (meaningful sequences)<br/>- Generate a token for every lexeme<br/>- Access/update symbol table<br/><br/>Secondary<br/>- Strip comments and whitespace (blanks, newlines, tabs)<br/>- Keep track of line numbers for errors<br/>- Macro expansion |
| Syntax Analysis / Parsing | Tokens | - Check if the structure follows the [context-free] grammar of the language<br/>- Create a tree representation of the grammatical structure of the token stream |
| Semantic Analysis | Syntax tree<br/>Symbol table | - Check semantic consistency with the language definition<br/>- Gather type information and save it in the syntax tree/symbol table<br/>- Type checking: each operator has matching operands<br/>- Label checking<br/>- Keyword misuse<br/>- Flow control checking (no break outside a loop)<br/>- Type conversions, called coercions |
| Intermediate Code Generation | Parse tree from semantic analyzer | Generate a program in a low-level/machine-like intermediate representation |
| Code Optimization | Intermediate code | Improve the code so that the target code uses fewer resources |
| Code Generation | Intermediate representation | - Produce the target language (machine/assembly code)<br/>- Choose registers and memory locations for the variables in the program |
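
The lexical analysis tasks listed above can be sketched in a few lines. This is a minimal scanner for a made-up toy language, not how any particular compiler does it: the keyword set, the token names and the `first_line` attribute are all assumptions for the example.

```python
import re

KEYWORDS = {"if", "else", "while", "return"}  # assumed keyword set

# Order matters: comments must be tried before the "/" operator.
TOKEN_SPEC = [
    ("SKIP", r"[ \t]+|//[^\n]*"),          # blanks, tabs, line comments
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"==|<=|>=|!=|[-+*/=<>(){};]"),
    ("NEWLINE", r"\n"),
    ("MISMATCH", r"."),                    # anything else is an error
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for the given source string."""
    tokens, symbol_table, line = [], {}, 1
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1                      # track line numbers for errors
        elif kind == "SKIP":
            continue                       # strip whitespace and comments
        elif kind == "MISMATCH":
            raise SyntaxError(f"line {line}: unexpected character {lexeme!r}")
        else:
            if kind == "ID":
                if lexeme in KEYWORDS:
                    kind = "KEYWORD"
                else:
                    # First sighting of an identifier: add it to the table.
                    symbol_table.setdefault(lexeme, {"first_line": line})
            tokens.append((kind, lexeme, line))
    return tokens, symbol_table


tokens, table = tokenize("fi (a == b) // typo\nx = 42")
for token in tokens:
    print(token)
```

Note that the misspelled `fi` comes out as an ordinary `ID` token: the scanner has no way of knowing that `if` was intended.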


Error Detection & Reporting

At every phase, if any error is identified, it is reported and handled

Tasks

  • Report the presence of errors clearly & accurately. One error can mask another & cause correct code to look faulty.
  • Recover from each error quickly enough to detect subsequent errors
  • Add minimal overhead to processing of correct programs

Types of Errors

| Type | Meaning | Example |
|---|---|---|
| Lexical | Misspelled identifier/keyword | fi (a == b)<br/>(fi could be an identifier, a misspelled keyword (if), or a function name, but lexical analysis treats it as an identifier) |
| Syntax | Statement not following language rules | Missing ;<br/>Arithmetic expression with unbalanced parentheses |
| Semantic | | Divide by 0<br/>Operation with incompatible operand types<br/>Wrong number of array indices |
| Logical | No rules broken, but incorrect logic | Using < instead of <=<br/>Infinite recursive call |
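
To get a feel for when the different kinds of error surface, the sketch below feeds small snippets to CPython. Keep in mind that Python is dynamically typed, so errors that a statically typed compiler would report during semantic analysis (divide by 0 on constants, incompatible operand types) only appear at run time here. The helper name `classify` is made up for this example.

```python
def classify(src):
    """Report at which point (if any) CPython rejects the snippet."""
    try:
        code = compile(src, "<example>", "exec")  # scanning + parsing
    except SyntaxError:
        return "syntax error (rejected before execution)"
    try:
        exec(code, {})
    except (ZeroDivisionError, TypeError) as exc:
        return f"run-time error: {type(exc).__name__}"
    return "no error reported"


print(classify("x = (1 + 2"))           # unbalanced parenthesis
print(classify("x = 1 / 0"))            # divide by 0
print(classify("x = 1 + 'a'"))          # incompatible operand types
print(classify("x = sum(range(1, 5))")) # meant 1..5 inclusive: logical error
```

The last snippet runs without complaint even though it sums 1..4 instead of 1..5; no phase of any translator can detect that the logic is not what the programmer meant.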

Symbol-Table

Data structure (usually hash table - for efficiency) containing a record for each identifier (variables, constants, functions) with fields for the attributes of the identifier

It is accessed at every phase of compiler.

  • Scanner, parser, and semantic analyzer put names of identifiers in symbol table.
  • The semantic analyzer stores more information (e.g. types) in the table.
  • The intermediate code generator, code optimizer and code generator use information in symbol table to generate appropriate code.

Contains

  • Attributes of variables: name, type, scope, etc.
  • Attributes of procedure names, which provide info about
      • number and types of its arguments
      • method of passing each argument (call by value/reference)
      • type returned
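
A minimal sketch of such a table, assuming a stack of hash tables (Python dicts) with one dict per scope. The class name, method names and attribute keys (`kind`, `type`, `params`, `returns`) are illustrative choices, not a standard layout.

```python
class SymbolTable:
    """Stack of per-scope dicts; the innermost scope is last."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attrs):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"redeclaration of {name!r}")
        scope[name] = {"name": name, "level": len(self.scopes) - 1, **attrs}
        return scope[name]

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"undeclared identifier {name!r}")


table = SymbolTable()
table.declare("count", kind="variable", type="int")
table.declare(
    "largest",
    kind="procedure",
    params=[("a", "int", "by value"), ("b", "int", "by value")],
    returns="int",
)
table.enter_scope()
table.declare("count", kind="variable", type="float")  # shadows the outer one
inner_type = table.lookup("count")["type"]
table.exit_scope()
outer_type = table.lookup("count")["type"]
print(inner_type, outer_type)
```

Dict lookups are hash-table lookups, which is why each `declare` and each per-scope check in `lookup` is cheap even for large programs.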

Passes

Several phases are sometimes combined into a single 'pass'.

A pass reads an input file, processes it, and writes an output file.

Normal Passes in Compilers

  • Front-end phases are combined into a pass
  • Code optimization is an optional pass
  • Back-end phase can be made into a pass
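
The three passes above can be mimicked with Python's own `ast` module and the built-in `compile`. This is only a sketch under that assumption: in-memory objects stand in for the files a real pass would read and write, and the function names (`front_end`, `optimize`, `back_end`, `run_passes`) and the tiny constant folder are made up for this example.

```python
import ast
import operator

_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace e.g. 2 * 3 + 4 by 10 when both operands are numeric constants."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        fold = _FOLDABLE.get(type(node.op))
        left, right = node.left, node.right
        if (fold and isinstance(left, ast.Constant) and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))):
            folded = ast.Constant(value=fold(left.value, right.value))
            return ast.copy_location(folded, node)
        return node


def front_end(source):
    return ast.parse(source)                  # scanning + parsing in one pass


def optimize(tree):
    return ast.fix_missing_locations(ConstantFolder().visit(tree))


def back_end(tree):
    return compile(tree, "<passes>", "exec")  # tree -> bytecode


def run_passes(source, optimise=True):
    tree = front_end(source)
    if optimise:                              # the optional pass
        tree = optimize(tree)
    env = {}
    exec(back_end(tree), env)
    return env


print(run_passes("x = 2 * 3 + 4")["x"])
print(ast.dump(optimize(front_end("x = 2 * 3 + 4"))))
```

Skipping the optimization pass changes the intermediate tree but not the result, which is what makes it optional.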

Misc

Compilation Examples

C

cc gx.c
objdump -d a.out

Java

These commands show how your class file is treated: javac compiles the source to bytecode, and javap -c disassembles that bytecode.

javac File.java
javap -c File.class

Java is cross-platform, as the bytecode executes on a stack machine (the JVM) rather than directly on the hardware.
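
To show what executing on a stack machine looks like, here is a minimal sketch of one, written in Python for brevity. The instruction names (`push`, `add`, `mul`) are invented for the example and are not real JVM opcodes.

```python
def run(program):
    """Execute a list of (opcode, *args) tuples on an operand stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack.pop()


# 2 + 3 * 4 in postfix order: operands are pushed, operators pop them.
program = [("push", 2), ("push", 3), ("push", 4), ("mul",), ("add",)]
result = run(program)
print(result)
```

Because the instructions only talk about an abstract stack and never about registers, the same program can run on any machine that has an interpreter for it.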

Python

python file.py
python -m dis file.py
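
The standard-library `dis` module can also be used from inside a program. A small sketch follows; the function `add` is just an example, and the exact instructions printed differ between Python versions, so no output is shown.

```python
import dis


def add(a, b):
    return a + b


dis.dis(add)  # prints the bytecode of add, one instruction per line

# The same instructions are available as objects for inspection.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```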

Android SDK

How does it show how your Java program will work on a mobile, when the mobile uses the ARM architecture but your laptop is usually x86?

This is because the Java program is cross-platform, and the simulator simulates execution of the program as if it were executed on an ARM processor.

Last Updated: 2023-01-25 ; Contributors: AhmedThahir
