CSC 330 Lecture Notes Week 2
Intro to Programming Language Translation
Intro to JFlex
Major Module | In | Out |
Lexical Analyzer | Source Code | Token Stream |
Parser | Token Stream | Parse Tree, Symbol Table |
Code Generator | Parse Tree, Symbol Table | Object Code |
Figure 1: Compiler Versus Translator.
program
    var X1, X2: integer;    { integer var declaration }
    var XR1: real;          { real var declaration }
begin
    XR1 := X1 + X2 * 10;    { assignment statement }
end.
Figure 2: Parse Tree for Sample Program.
PUSH X2
PUSH 10
MULT
PUSH X1
ADD
PUSH @XR1
STORE
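To see how the stack code above computes XR1 := X1 + X2 * 10, it can help to run it on a tiny interpreter. The following is only a sketch under assumptions that the notes do not state: the class name StackMachineSketch is hypothetical, PUSH @name is taken to push the address of a variable, STORE is taken to pop an address and then a value, and all numbers are modeled as doubles for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interpreter for the stack code above; not part of the course code. */
public class StackMachineSketch {

    /** Runs a whitespace-separated program against the given variable store. */
    public static void run(String program, Map<String, Double> memory) {
        Deque<Object> stack = new ArrayDeque<>();
        String[] words = program.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            switch (words[i]) {
                case "PUSH": {
                    String operand = words[++i];
                    if (operand.startsWith("@")) {
                        stack.push(operand.substring(1));        // address: push the variable name
                    } else if (memory.containsKey(operand)) {
                        stack.push(memory.get(operand));         // variable: push its current value
                    } else {
                        stack.push(Double.parseDouble(operand)); // numeric literal
                    }
                    break;
                }
                case "MULT": {
                    double right = (Double) stack.pop();
                    double left = (Double) stack.pop();
                    stack.push(left * right);
                    break;
                }
                case "ADD": {
                    double right = (Double) stack.pop();
                    double left = (Double) stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "STORE": {
                    // Assumed operand order: address on top, value beneath it.
                    String address = (String) stack.pop();
                    double value = (Double) stack.pop();
                    memory.put(address, value);
                    break;
                }
                default:
                    throw new IllegalArgumentException("Unknown instruction: " + words[i]);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Double> memory = new HashMap<>();
        memory.put("X1", 3.0);   // sample values chosen for illustration
        memory.put("X2", 4.0);
        memory.put("XR1", 0.0);
        run("PUSH X2 PUSH 10 MULT PUSH X1 ADD PUSH @XR1 STORE", memory);
        System.out.println(memory.get("XR1")); // prints 43.0
    }
}
```

Note that the multiplication is emitted before the addition, which is how the generated code reflects the higher precedence of * in the source expression.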
Symbol | Class | Type |
X1 | var | integer |
X2 | var | integer |
XR1 | var | real |
See the PDF version of the notes for the formatted figure.
program      ::= PROGRAM block '.'
block        ::= decls BEGIN stmts END
decls        ::= [ decl { ';' decl } ]
decl         ::= typedecl | vardecl | procdecl
typedecl     ::= TYPE identifier '=' type
type         ::= identifier | ARRAY '[' integer ']' OF type
vardecl      ::= VAR vars ':' type
vars         ::= identifier { ',' identifier }
procdecl     ::= prochdr ';' block
prochdr      ::= PROCEDURE identifier '(' formals ')' [ ':' identtype ]
formals      ::= [ formal { ';' formal } ]
formal       ::= identifier ':' identifier
stmts        ::= stmt { ';' stmt }
stmt         ::= | assmntstmt | ifstmt | proccallstmt | compoundstmt
assmntstmt   ::= designator ':=' expr
ifstmt       ::= IF expr THEN stmt [ ELSE stmt ]
proccallstmt ::= identifier '(' exprlist ')'
compoundstmt ::= BEGIN stmts END
expr         ::= integer | real | char | designator
               | identifier '(' exprlist ')'
               | expr relop expr | expr addop expr | expr multop expr
               | unyop expr | '(' expr ')'
designator   ::= identifier { '[' expr ']' }
exprlist     ::= [ expr { ',' expr } ]
relop        ::= '<' | '>' | '=' | '<=' | '>=' | '<>'
addop        ::= '+' | '-' | OR
multop       ::= '*' | '/' | AND
unyop        ::= '+' | '-' | NOT
public static final int DIVIDE = 18;
public static final int CHAR = 37;
public static final int SEMI = 19;
public static final int INT = 35;
public static final int ARRAY = 3;
public static final int LESS = 27;
public static final int MINUS = 17;
. . .
auxiliary code -- typically not used except for imports
%%
pattern definitions -- including internally-used methods in %{ ... %}
%%
lexical rules
x | the character "x" |
"x" | an "x", even if x is an operator. |
\x | an "x", even if x is an operator. |
[xy] | the character x or y. |
[x-z] | the characters x, y or z. |
[^x] | any character but x. |
. | any character but newline. |
^x | an x at the beginning of a line. |
<y>x | an x when Lex is in start condition y. |
x$ | an x at the end of a line. |
x? | an optional x. |
x* | 0,1,2, ... instances of x. |
x+ | 1,2,3, ... instances of x. |
x|y | an x or a y. |
(x) | an x. |
x/y | an x but only if followed by y (CAREFUL). |
{xx} | the translation of xx from the definitions section. |
x{m,n} | m through n occurrences of x. |
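Several rows of the table above use notation that java.util.regex happens to share, so they can be tried out directly in Java. The sketch below is illustrative only: the class name PatternTableSketch is hypothetical, and the JFlex-specific forms ("x" quoting, x/y lookahead, {xx} macros and <y>x start conditions) are left out because Java regular expressions write them differently or do not have them.

```java
import java.util.regex.Pattern;

/** Hypothetical demo of table rows whose notation also works in java.util.regex. */
public class PatternTableSketch {

    /** True when the whole input matches the pattern, as a rule must match a whole lexeme. */
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[x-z]", "y"));      // true:  y lies in the range x..z
        System.out.println(matches("[^x]", "x"));       // false: any character but x
        System.out.println(matches("ab?", "a"));        // true:  the b is optional
        System.out.println(matches("a*", ""));          // true:  zero instances allowed
        System.out.println(matches("a+", ""));          // false: at least one instance required
        System.out.println(matches("a|b", "b"));        // true:  an a or a b
        System.out.println(matches("a{2,3}", "aaaa"));  // false: four is more than three
        System.out.println(matches(".", "\n"));         // false: any character but newline
    }
}
```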
Each lexical rule has the form
    pattern action
where a pattern is written in the regular expression language and an action is a Java statement, most typically a compound statement enclosed in "{" and "}" braces.
/** The symbol number of the token being represented */
public int sym;

/** The left and right character positions in the source file
    (or alternatively, the line and column number). */
public int left, right;

/** The auxiliary value of a token such as an identifier string,
    or numeric token value. */
public Object value;
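To make the three fields above concrete, here is a minimal self-contained sketch of how a token might be packaged. Everything in it is an assumption for illustration: the nested Symbol class is a stand-in with the same three fields, the newSym signature shown here is hypothetical (a helper inside a generated lexer would more likely read the position and text from the lexer's own state), and only the token number 35 for INT is taken from the constants listed earlier.

```java
/** Hypothetical stand-in for a token object with the three fields described above. */
public class SymbolSketch {

    public static class Symbol {
        public int sym;          // symbol number of the token
        public int left, right;  // here used as line and column
        public Object value;     // auxiliary value, e.g. identifier text or a number

        public Symbol(int sym, int left, int right, Object value) {
            this.sym = sym;
            this.left = left;
            this.right = right;
            this.value = value;
        }
    }

    public static final int INT = 35;  // same number as INT in the constants listed earlier

    /** Hypothetical helper: bundles a token number, a position and a value into one Symbol. */
    public static Symbol newSym(int tokenId, int line, int column, Object value) {
        return new Symbol(tokenId, line, column, value);
    }

    public static void main(String[] args) {
        // The integer literal 10, assumed for illustration to sit at line 5, column 22.
        Symbol token = newSym(INT, 5, 22, Integer.valueOf(10));
        System.out.println(token.sym + " " + token.left + " " + token.right + " " + token.value);
    }
}
```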
jflex pascal.jflex
javac PascalLexer.java PascalLexerTest.java sym.java
cup pascal-tokens.cup
jflex pascal.jflex
javac PascalLexer.java PascalLexerTest.java sym.java
java PascalLexerTest $*
begin    { return newSym(sym.BEGIN); }
BEGIN    { return newSym(sym.BEGIN); }
integer         {/* action for keyword integer */}
{identifier}    {/* action for identifiers, including integer */}
begin    { ... }
. . .
":="     { return newSym(sym.ASSMNT); }
":"      { return newSym(sym.COLON); }
. . .
begin    { ... }
. . .
":"      { return newSym(sym.COLON); }
":="     { return newSym(sym.ASSMNT); }
. . .
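JFlex picks the rule that matches the longest piece of input, and when two rules match the same length it picks the one listed first. The sketch below imitates that selection with java.util.regex to show what it means for the rules above: the order of ":" and ":=" does not change the result, while the order of the keyword rule and the identifier rule does matter. The class name, the token names INTEGER and IDENT, and the identifier pattern are hypothetical, and this is not how JFlex is implemented internally (it generates a DFA).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical imitation of longest-match, first-rule-wins token selection. */
public class LongestMatchSketch {

    /** Tokenizes input with ordered rules: rule[0] is a token name, rule[1] a regex. */
    public static List<String> tokenize(String[][] rules, String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++;  // skip blanks between tokens
                continue;
            }
            String bestName = null;
            int bestLength = 0;
            for (String[] rule : rules) {
                Matcher m = Pattern.compile(rule[1]).matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() - pos > bestLength) {
                    bestName = rule[0];
                    bestLength = m.end() - pos;
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("No rule matches at position " + pos);
            }
            tokens.add(bestName + "(" + input.substring(pos, pos + bestLength) + ")");
            pos += bestLength;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[][] assignFirst = {
            {"ASSMNT", ":="}, {"COLON", ":"},
            {"INTEGER", "integer"}, {"IDENT", "[a-z][a-z0-9]*"}
        };
        String[][] colonFirst = {
            {"COLON", ":"}, {"ASSMNT", ":="},
            {"INTEGER", "integer"}, {"IDENT", "[a-z][a-z0-9]*"}
        };
        // Both orderings print [IDENT(x), ASSMNT(:=), INTEGER(integer)]
        System.out.println(tokenize(assignFirst, "x := integer"));
        System.out.println(tokenize(colonFirst, "x := integer"));
        // A longer identifier beats the keyword: [IDENT(integers)]
        System.out.println(tokenize(colonFirst, "integers"));
    }
}
```

If the IDENT rule were listed before the INTEGER rule, the tie on the input integer would go to IDENT and the keyword rule would never fire, which is why keyword rules are placed ahead of the general identifier rule.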