01 Introduction
Language Translators¶
You should know by this stage :/
If you don't, refer to this.
Programming Language Processing¶
flowchart LR
a(( )) -->
|Source<br/>Program| Preprocessor -->
|Modified<br/>Source<br/>Program| Compiler -->
|Target<br/>Assembly<br/>Code| Assembler -->
|Relocatable<br/>Machine<br/>Code| Linker/Loader -->
|Target<br/>Machine<br/>Code| b(( ))
The compiler outputs assembly code because assembly is easier to
- produce as output
- debug
The linker resolves external memory addresses, where code in one file refers to a location in another file.
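To make address resolution concrete, here is a minimal sketch of a toy linker. The `ObjectFile` layout, the instruction words, and the file and symbol names are all hypothetical and chosen only for illustration; real object formats are far richer.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical relocatable object file (not a real format)."""
    name: str
    code: list                                    # instruction words; None marks a hole to patch
    defines: dict = field(default_factory=dict)   # symbol -> offset inside this file
    refs: dict = field(default_factory=dict)      # index of a hole in code -> symbol it needs


def link(objects):
    """Lay the files out back to back, then patch every external reference."""
    # Step 1: give each file a base address and build a global symbol table.
    base, symbols = 0, {}
    for obj in objects:
        for sym, offset in obj.defines.items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol: {sym}")
            symbols[sym] = base + offset
        base += len(obj.code)

    # Step 2: copy the code, replacing each hole with the resolved address.
    image = []
    for obj in objects:
        for i, word in enumerate(obj.code):
            if i in obj.refs:
                sym = obj.refs[i]
                if sym not in symbols:
                    raise ValueError(f"undefined symbol: {sym}")
                word = symbols[sym]
            image.append(word)
    return image, symbols


# main.o calls a routine that is defined in math.o
main_o = ObjectFile("main.o", code=["CALL", None, "HALT"],
                    defines={"main": 0}, refs={1: "square"})
math_o = ObjectFile("math.o", code=["MUL", "RET"], defines={"square": 0})

image, symbols = link([main_o, math_o])
print(image)    # ['CALL', 3, 'HALT', 'MUL', 'RET']
print(symbols)  # {'main': 0, 'square': 3}
```

The hole in `main.o` is filled with address 3 because `math.o` is placed right after the three words of `main.o`.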
Stages of Compiler¶
flowchart LR
a[/Input/] -->
|Char<br/>Stream| la
subgraph Analysis/Front End
la["Lexical<br/>Analysis<br/>(Scanning)"] -->
|Token<br/>Stream| sya["Syntax<br/>Analysis<br>(Parsing)"] -->
|Syntax<br/>Tree| sea[Semantic<br/>Analysis]
end
sea --> |Syntax<br/>Tree| icg
subgraph Synthesis/Back End
icg[Intermediate<br/>Code<br/>Generation] -->
|Intermediate<br/>Representation| mico[Machine-Independent<br/>Code Optimization] -->
|Intermediate<br/>Representation| c[Code<br/>Generator] -->
|Target<br/>Machine<br/>Code| mdco[Machine-Dependent<br/>Code Optimization]
end
mdco --> o[/Output/]
Stage | Input | Task |
---|---|---|
Lexical Analysis/Scanning | Source program | - Group characters into lexemes (meaningful sequences)<br/>- Generate a token for every lexeme<br/>- Access/update the symbol table<br/>Secondary tasks:<br/>- Strip comments and whitespace (blanks, newlines, tabs)<br/>- Keep track of line numbers for error messages<br/>- Macro expansion |
Syntax Analysis/Parsing | Tokens | - Check that the structure follows the [context-free] grammar of the language<br/>- Create a tree representation of the grammatical structure of the token stream |
Semantic Analysis | Syntax tree, symbol table | - Check semantic consistency with the language definition<br/>- Gather type information and save it in the syntax tree/symbol table<br/>- Type checking: each operator has matching operands<br/>- Label checking<br/>- Keyword misuse<br/>- Flow-control checking (no `break` outside a loop)<br/>- Type conversions, called coercions |
Intermediate Code Generation | Syntax tree from the semantic analyzer | Generate the program in a low-level/machine-like intermediate representation |
Code Optimization | Intermediate code | Improve the code so that the target code uses fewer resources |
Code Generation | Intermediate representation | - Produce the target language (machine/assembly code)<br/>- Choose registers and memory locations for the variables in the program |
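The first stages can be sketched for a tiny assignment language. This is an illustrative toy, not how a production compiler is organised: the token names, the grammar, the `Parser` class and the `t1`, `t2` temporaries are all assumptions made for the example, and semantic analysis is skipped entirely.

```python
import re

# Lexical analysis: one named group per token kind (names are hypothetical).
TOKEN_SPEC = [("NUM", r"\d+"), ("ID", r"[A-Za-z_]\w*"),
              ("OP", r"[+\-*/=()]"), ("SKIP", r"\s+")]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def scan(text):
    """Group characters into (kind, lexeme) tokens, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    """Syntax analysis by recursive descent for the toy grammar:
    stmt -> ID '=' expr ; expr -> term (('+'|'-') term)* ;
    term -> factor (('*'|'/') factor)* ; factor -> NUM | ID | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else ("EOF", "")

    def take(self, lexeme=None):
        kind, text = self.peek()
        if lexeme is not None and text != lexeme:
            raise SyntaxError(f"expected {lexeme!r}, got {text!r}")
        self.i += 1
        return kind, text

    def statement(self):
        kind, name = self.take()
        if kind != "ID":
            raise SyntaxError(f"expected identifier, got {name!r}")
        self.take("=")
        tree = ("=", name, self.expr())
        if self.peek()[0] != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.take()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek()[1] in ("*", "/"):
            op = self.take()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, text = self.take()
        if kind in ("NUM", "ID"):
            return text
        if text == "(":
            node = self.expr()
            self.take(")")
            return node
        raise SyntaxError(f"unexpected token {text!r}")


def gen_tac(tree):
    """Intermediate code generation: syntax tree -> three-address code."""
    code, counter = [], 0

    def walk(node):
        nonlocal counter
        if isinstance(node, str):          # leaf: identifier or number
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {l} {op} {r}")
        return temp

    _, target, expr = tree
    code.append(f"{target} = {walk(expr)}")
    return code


tokens = scan("position = initial + rate * 60")
tree = Parser(tokens).statement()
tac = gen_tac(tree)
print(tree)  # ('=', 'position', ('+', 'initial', ('*', 'rate', '60')))
print(tac)   # ['t1 = rate * 60', 't2 = initial + t1', 'position = t2']
```

Because `term` is called from inside `expr`, the multiplication binds tighter than the addition, which is why `rate * 60` is computed first.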
Error Detection & Reporting¶
At every phase, any error that is identified is reported and handled.
Tasks¶
- Report the presence of errors clearly & accurately. One error can mask another & cause correct code to look faulty.
- Recover from each error quickly enough to detect subsequent errors
- Add minimal overhead to processing of correct programs
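As a rough illustration of these tasks, the sketch below shows a scanner that reports each bad character with its line number and then carries on, so later errors are still found. Deleting the offending character is a very simple form of panic-mode recovery; the token set and the function name `scan_with_recovery` are assumptions for this example.

```python
import re

# Hypothetical token set for a tiny language; anything else is a lexical error.
TOKEN_RE = re.compile(
    r"(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[+\-*/=();])|(?P<NL>\n)|(?P<WS>[ \t]+)"
)


def scan_with_recovery(text):
    """Return (tokens, errors); never stop at the first error."""
    tokens, errors = [], []
    pos, line = 0, 1
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            # Report clearly, then recover by skipping one character.
            errors.append(f"line {line}: unexpected character {text[pos]!r}")
            pos += 1
            continue
        if m.lastgroup == "NL":
            line += 1                      # track line numbers for messages
        elif m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group(), line))
        pos = m.end()
    return tokens, errors


source = "x = 4 @ 2;\ny = x $ 1;\n"
tokens, errors = scan_with_recovery(source)
for message in errors:
    print(message)
# line 1: unexpected character '@'
# line 2: unexpected character '$'
```

Correct input never enters the error branch, so the recovery logic adds almost no cost to the processing of valid programs.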
Types of Errors¶
Type | Meaning | Example |
---|---|---|
Lexical | Misspelled identifier/keyword | `fi (a == b)`<br/>(`fi` could be an identifier, a misspelled keyword (`if`), or a function name, but the lexical analyzer treats it as an identifier) |
Syntax | Statement does not follow the language's grammar rules | Missing `;`<br/>Arithmetic expression with unbalanced parentheses |
Semantic | Statement is syntactically valid but violates the language's semantic rules | Divide by 0<br/>Operation on incompatible operand types<br/>Wrong number of array indices |
Logical | No rules broken, but incorrect logic | Using `<` instead of `<=`<br/>Infinite recursive call |
Symbol-Table¶
A data structure (usually a hash table, for efficiency) containing a record for each identifier (variables, constants, functions), with fields for the attributes of that identifier.
It is accessed in every phase of the compiler.
- Scanner, parser, and semantic analyzer put names of identifiers in symbol table.
- The semantic analyzer stores more information (e.g. types) in the table.
- The intermediate code generator, code optimizer and code generator use information in symbol table to generate appropriate code.
Contains¶
- Attributes of variables: name, type, scope, etc.
- Attributes of procedure names, which provide info about
    - number and types of its arguments
    - method of passing each argument (call by value/reference)
    - type returned
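A symbol table along these lines can be sketched as a stack of hash tables, one per scope. The `Symbol` fields and the method names below are assumptions for illustration; real compilers store many more attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    """One record per identifier (field names are hypothetical)."""
    name: str
    kind: str                                    # "variable" or "procedure"
    type: str                                    # variable type, or return type of a procedure
    params: list = field(default_factory=list)   # (type, passing mode) for each argument


class SymbolTable:
    def __init__(self):
        self.scopes = [{}]          # stack of dicts; index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, symbol):
        scope = self.scopes[-1]
        if symbol.name in scope:
            raise KeyError(f"redeclaration of {symbol.name}")
        scope[symbol.name] = symbol

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare(Symbol("rate", "variable", "float"))
table.declare(Symbol("area", "procedure", "float",
                     [("float", "value"), ("float", "reference")]))

table.enter_scope()
table.declare(Symbol("rate", "variable", "int"))   # shadows the global rate
inner = table.lookup("rate").type
table.exit_scope()
outer = table.lookup("rate").type

print(inner, outer)                       # int float
print(len(table.lookup("area").params))   # 2
```

Each dictionary lookup takes constant time on average, which is the efficiency argument for using a hash table.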
Passes¶
Several phases are sometimes combined into a single "pass".
A pass reads an input file, processes it, and writes an output file.
Normal Passes in Compilers¶
- Front-end phases are combined into a pass
- Code optimization is an optional pass
- Back-end phases can be combined into a pass
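The arrangement of passes can be sketched as three functions, each consuming the previous one's output. In-memory lists stand in for the intermediate files, and the instruction format, the mnemonics and the function names are all hypothetical. The optimization pass here only does constant folding.

```python
import operator

FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL"}


def front_end_pass(source):
    """Pass 1: turn lines of the form 'dst = a op b' into IR tuples."""
    ir = []
    for line in source.strip().splitlines():
        dst, _, a, op, b = line.split()
        ir.append((op, dst, a, b))
    return ir


def optimization_pass(ir):
    """Optional pass: fold operations whose operands are both constants."""
    out = []
    for op, dst, a, b in ir:
        if op in FOLD and a.isdigit() and b.isdigit():
            out.append(("mov", dst, str(FOLD[op](int(a), int(b))), None))
        else:
            out.append((op, dst, a, b))
    return out


def back_end_pass(ir):
    """Final pass: emit assembly-like text for a made-up machine."""
    asm = []
    for op, dst, a, b in ir:
        if op == "mov":
            asm.append(f"MOV {dst}, {a}")
        else:
            asm.append(f"{MNEMONIC[op]} {dst}, {a}, {b}")
    return asm


def compile_source(source, optimize=True):
    ir = front_end_pass(source)
    if optimize:                    # the optimization pass can be skipped
        ir = optimization_pass(ir)
    return back_end_pass(ir)


source = """
t1 = 60 * 60
t2 = rate * t1
"""
print(compile_source(source))                  # ['MOV t1, 3600', 'MUL t2, rate, t1']
print(compile_source(source, optimize=False))  # ['MUL t1, 60, 60', 'MUL t2, rate, t1']
```

Both calls produce working output; the optimized one replaces a multiplication with a constant load.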
Misc¶
Compilation Examples¶
C¶
Java¶
This command shows how your class file is treated
It is cross-platform, as the class file executes on a stack machine (the JVM) rather than directly on the hardware.
Python¶
Android SDK¶
How does it show how your Java program will work on a mobile device, when mobile devices use the ARM architecture but your laptop usually uses the x86 architecture?
This is because a Java program is cross-platform, and the simulator simulates execution of the program as if it were running on an ARM processor.