cse340 project 3 type checking the goal of this project is to give you
Search for question
Question
CSE340
Project 3: Type Checking
The goal of this project is to give you experience in Hindley-Milner type checking.
We begin by introducing the grammar of our language which is based on the previous project with additional
constructs. Then we will discuss the semantics of our language and type checking rules. Finally, we will go
over a few examples and formalize the expected output.
1. Lexical Specification
Here is the list of tokens that your lexical analyzer should recognize (the new tokens are listed first):
INT = "int"
REAL = "real"
BOOL
"bool"
TRUE = "true"
FALSE = "false"
IF = "if"
WHILE = "while"
SWITCH = "switch"
CASE = "case"
NOT = "!"
PLUS = "+"
MINUS = ""
MULT = "*"
DIV = "/"
GREATER = ">"
LESS = "<"
GTEQ = ">=”
LTEQ = "<="
NOTEQUAL = "<>"
LPAREN = = " ("
RPAREN = ")"
NUM = (pdigit
digit*) + 0
REALNUM = NUM "." digit
PUBLIC = "public"
PRIVATE = "private"
EQUAL = 66-99
COLON = "."
COMMA=","
SEMICOLON = “;”
LBRACE = "{"
RBRACE = ""
digit*
ID = letter (letter + digit)* 2. Grammar
Here is the grammar for our input language:
program
global_vars
global_vars
→ global_vars body
→ ε
→ var_decl_list
var_decl_list → var_decl
var_decl_list → var_decl var_decl_list
var_decl
var_list
var_list
type_name
→ var_list COLON type_name SEMICOLON
→ ID
→ ID COMMA var_list
INT
→ REAL
stmt
→ stmt stmt_list
type_name
type_name
→ BOOL
body
→ LBRACE stmt_list RBRACE
stmt_list
stmt_list
stmt
stmt
stmt
stmt
expression
→ assignment_stmt
→ if_stmt
→ while_stmt
→ switch_stmt
assignment_stmt → ID EQUAL expression SEMICOLON
expression
expression
→ primary
→ binary_operator expression expression
→ unary_operator expression
unary_operator → NOT
binary_operator → PLUS I MINUS | MULT | DIV
binary_operator → GREATER | LESS I GTEQ | LTEQ | EQUAL I NOTEQUAL
primary
→ ID
→ NUM
primary
primary
REALNUM
primary
→ TRUE
primary
→ FALSE
if_stmt
while_stmt
switch_stmt
case_list
case_list
case
→ IF LPAREN expression RPAREN body
WHILE LPAREN expression RPAREN body
→ SWITCH LPAREN expression RPAREN LBRACE case_list RBRACE
→ case
→ case case_list
→ CASE NUM COLON body 3. Language Semantics
3.1. Types
The language has three built-in types: int, real, and bool.
3.2. Variables
Programmers can declare variables either explicitly or implicitly.
Explicit variables are declared in an var_list of a var_decl.
A variable is declared implicitly if it is not declared explicitly but it
the program body.
appears
in
Example
Consider the following program written in our language:
x: int;
y: bool;
{
y = x;
Z =
10%;
*
W = z 5%;
}
"
This program has four variables declared: x, y, z and W, with x and y explicitly declared and Z
and W implicitly declared.
3.3. Type System
Our language uses structural equivalence for checking type equivalence. Implicit types will be
inferred from the usage (in a simplified form of Hindley-Milner type inference).
Here are all the type rules/constraints that your type checker will enforce (constraints are labeled for
reference):
C1: The left hand side of an assignment should have the same type as its right hand side.
C2: The operands of a binary operator (GTEQ, PLUS, MINUS, MULT, DIV,
GREATER, LESS, LTEQ, EQUAL and NOTEQUAL) should have the same type (it
can be any type).
C3: The operand of a unary operator (NOT) should be of type bool. C4: Condition of if and while statements should be of type bool.
C5: The expression that follows the switch keyword in switch_stmt should be of type int.
The type of expression binary_operator opl op2 is the same as the type of opl and
op2 if operator is PLUS, MINUS, MULT, or DIV. Note that op1 and op2 must have the
same type due to C2.
The type of expression binary_operator op1 op2 is bool if operator is GREATER, LESS,
GTEQ, LTEQ, EQUAL, or NOTEQUAL.
The type of unary_operator op is bool.
NUM constants are of type int.
REALNUM constants are of type real.
true and false values are of type bool.
4. Output
There are two scenarios:
There is a type error in the input program
•
There are no type errors in the input program
4.1. Type Error
If any of the type constraints (listed in the Type System section above) is violated in the input
program, then the output of your program should be:
TYPE MISMATCH <line_number> <constraint>
Where <line_number> is replaced with the line number that the violation occurs and <constraint> should
be replaced with the label of the violated type constraint (possible values are C1 through C5). Note that
you can assume that anywhere a violation can occur it will be on a single line. 4.2. No Type Error
If there are no type errors in the input program, then you should output type information for all
variables in the input program in the order they appear in the program. There are two cases:
• If the type of the variable is determined to be one of the built-in types, then output one line in the
following format:
<variable>: <type> #
where <variable> should be replaced by the variable name and <type> should be replaced by
the type of the variable.
• If the type of the variable could not be determined to be one of the built-in types, then you need to
list all variables that have the same type as the target variable and mark all of them as printed (so
as to not print a separate entry for those later). You should output one line in the following format:
<variable_list>: ? #
where <variable_list> is a comma-separated list of variables that have the same type in the order
they appear in the program.
5. Examples
Given the following:
a, b: int;
a = < b 2;
}
The output will be the following:
TYPE MISMATCH 3 C1
This is because the type of <b 2 is bool, but a is of type int which is a violation of C1./n/n