CSC 330 Lecture Notes Week 2
Intro to Programming Language Translation
Intro to JFlex
| Major Module | In | Out |
| Lexical Analyzer | Source Code | Token Stream |
| Parser | Token Stream | Parse Tree, Symbol Table |
| Code Generator | Parse Tree, Symbol Table | Object Code |
Figure 1: Compiler Versus Translator.
program
var X1, X2: integer; { integer var declaration }
var XR1: real; { real var declaration }
begin
XR1 := X1 + X2 * 10; { assignment statement }
end.
Figure 2: Parse Tree for Sample Program.
PUSH X2
PUSH 10
MULT
PUSH X1
ADD
PUSH @XR1
STORE
| Symbol | Class | Type |
| X1 | var | integer |
| X2 | var | integer |
| XR1 | var | real |
See the PDF version of the notes for the formatted figure.
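To make the object code for the sample program concrete, here is a minimal Java sketch of a stack machine that executes it. The class name `StackMachineSketch`, the instruction semantics (in particular that STORE pops an address and then a value), and the sample values for X1 and X2 are assumptions made for illustration; the notes do not define the target machine.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interpreter for the stack code above; not part of the course tools. */
class StackMachineSketch {

    /** Runs a whitespace-separated instruction string against a memory map. */
    static void run(String code, Map<String, Double> memory) {
        Deque<Object> stack = new ArrayDeque<>();
        String[] words = code.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            switch (words[i]) {
                case "PUSH": {
                    String operand = words[++i];
                    if (operand.startsWith("@")) {
                        stack.push(operand.substring(1));      // address: push the variable name
                    } else if (Character.isDigit(operand.charAt(0))) {
                        stack.push(Double.valueOf(operand));   // literal constant
                    } else {
                        stack.push(memory.get(operand));       // variable: push its current value
                    }
                    break;
                }
                case "MULT": {
                    double right = (Double) stack.pop();
                    double left = (Double) stack.pop();
                    stack.push(left * right);
                    break;
                }
                case "ADD": {
                    double right = (Double) stack.pop();
                    double left = (Double) stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "STORE": {
                    // Assumed semantics: the address is on top, the value just below it.
                    String address = (String) stack.pop();
                    Double value = (Double) stack.pop();
                    memory.put(address, value);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown instruction: " + words[i]);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Double> memory = new HashMap<>();
        memory.put("X1", 3.0);   // sample values, chosen only for this demonstration
        memory.put("X2", 4.0);
        run("PUSH X2 PUSH 10 MULT PUSH X1 ADD PUSH @XR1 STORE", memory);
        System.out.println("XR1 = " + memory.get("XR1"));
    }
}
```

With the sample values X1 = 3 and X2 = 4, the multiplication is done before the addition, matching the precedence in the source expression X1 + X2 * 10, and the program prints XR1 = 43.0.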
program ::= PROGRAM block '.'
block ::= decls BEGIN stmts END
decls ::= [ decl { ';' decl } ]
decl ::= typedecl | vardecl | procdecl
typedecl ::= TYPE identifier '=' type
type ::= identifier | ARRAY '[' integer ']' OF type
vardecl ::= VAR vars ':' type
vars ::= identifier { ',' identifier }
procdecl ::= prochdr ';' block
prochdr ::= PROCEDURE identifier '(' formals ')' [ ':' identtype ]
formals ::= [ formal { ';' formal } ]
formal ::= identifier ':' identifier
stmts ::= stmt { ';' stmt }
stmt ::= | assmntstmt | ifstmt | proccallstmt | compoundstmt
assmntstmt ::= designator ':=' expr
ifstmt ::= IF expr THEN stmt [ ELSE stmt ]
proccallstmt ::= identifier '(' exprlist ')'
compoundstmt ::= BEGIN stmts END
expr ::= integer | real | char | designator |
identifier '(' exprlist ')' | expr relop expr |
expr addop expr | expr multop expr | unyop expr |
'(' expr ')'
designator ::= identifier { '[' expr ']' }
exprlist ::= [ expr { ',' expr } ]
relop ::= '<' | '>' | '=' | '<=' | '>=' | '<>'
addop ::= '+' | '-' | OR
multop ::= '*' | '/' | AND
unyop ::= '+' | '-' | NOT
public static final int DIVIDE = 18;
public static final int CHAR = 37;
public static final int SEMI = 19;
public static final int INT = 35;
public static final int ARRAY = 3;
public static final int LESS = 27;
public static final int MINUS = 17;
. . .
auxiliary code -- typically not used except for imports
%%
pattern definitions -- including internally-used methods in %{ ... %}
%%
lexical rules
| x | the character "x" |
| "x" | an "x", even if x is an operator. |
| \x | an "x", even if x is an operator. |
| [xy] | the character x or y. |
| [x-z] | the characters x, y or z. |
| [^x] | any character but x. |
| . | any character but newline. |
| ^x | an x at the beginning of a line. |
| <y>x | an x when Lex is in start condition y. |
| x$ | an x at the end of a line. |
| x? | an optional x. |
| x* | 0,1,2, ... instances of x. |
| x+ | 1,2,3, ... instances of x. |
| x|y | an x or a y. |
| (x) | an x. |
| x/y | an x but only if followed by y (CAREFUL). |
| {xx} | the translation of xx from the definitions section. |
| x{m,n} | m through n occurrences of x |
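Several of the pattern forms in the table can be tried out directly in Java, since `java.util.regex` shares the basic operators. The sketch below is only an approximation of JFlex behavior: the JFlex-specific forms ("x" quoting, x/y trailing context, {xx} macro references, and start conditions) have no identical spelling in `java.util.regex` and are left out. The class name `PatternTableSketch` and the identifier pattern are assumptions for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical demo of the operators that JFlex and java.util.regex share. */
class PatternTableSketch {

    /** True if the whole input string matches the regular expression. */
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[xy]", "y"));        // true: x or y
        System.out.println(matches("[x-z]", "w"));       // false: w is outside x..z
        System.out.println(matches("[^x]", "a"));        // true: any character but x
        System.out.println(matches(".", "\n"));          // false: . excludes newline
        System.out.println(matches("x?", ""));           // true: optional x
        System.out.println(matches("x*", "xxx"));        // true: zero or more x
        System.out.println(matches("x+", ""));           // false: needs at least one x
        System.out.println(matches("x|y", "y"));         // true: x or y
        System.out.println(matches("x{2,3}", "xxxx"));   // false: only 2 through 3 allowed
        System.out.println(matches("\\+", "+"));         // true: escaped operator

        // An assumed definition of a Pascal-style identifier: a letter, then letters or digits.
        System.out.println(matches("[A-Za-z][A-Za-z0-9]*", "XR1"));   // true
    }
}
```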
Each lexical rule has the form
pattern action
where a pattern is in the regular expression language and an action is a Java statement, most typically a compound statement in "{", "}" braces.
/** The symbol number of the token being represented */
public int sym;
/** The left and right character positions in the source file
(or alternatively, the line and column number). */
public int left, right;
/** The auxiliary value of a token such as an identifier string,
or numeric token value. */
public Object value;
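The rule actions in these notes call a helper named newSym to build a token from these fields. The notes do not show its body, so the following is only a sketch of what such a helper might look like. `TokenSketch` is a stand-in with the same four fields so that the example compiles on its own, and `moveTo` is a hypothetical substitute for the position tracking a generated scanner does itself (in a JFlex specification, the position would typically come from yyline and yycolumn, enabled with %line and %column).

```java
/** Stand-in token class with the fields listed above (hypothetical name). */
class TokenSketch {
    public int sym;            // symbol number of the token
    public int left, right;    // position information: here, line and column
    public Object value;       // auxiliary value, or null if the token has none

    TokenSketch(int sym, int left, int right, Object value) {
        this.sym = sym;
        this.left = left;
        this.right = right;
        this.value = value;
    }
}

/** Hypothetical holder for newSym; in a JFlex spec these methods would sit in %{ ... %}. */
class NewSymSketch {
    static final int SEMI = 19;   // values taken from the sym listing in these notes
    static final int INT = 35;

    private int line;     // stand-ins for the scanner's own position counters
    private int column;

    /** Hypothetical helper that simulates the scanner advancing through the input. */
    void moveTo(int line, int column) {
        this.line = line;
        this.column = column;
    }

    /** Token with no auxiliary value, such as a keyword or punctuation mark. */
    TokenSketch newSym(int tokenId) {
        return newSym(tokenId, null);
    }

    /** Token carrying a value, such as an identifier string or a numeric value. */
    TokenSketch newSym(int tokenId, Object value) {
        return new TokenSketch(tokenId, line, column, value);
    }

    public static void main(String[] args) {
        NewSymSketch lexer = new NewSymSketch();
        lexer.moveTo(4, 18);
        TokenSketch t = lexer.newSym(INT, Integer.valueOf(10));
        System.out.println(t.sym + " at " + t.left + ":" + t.right + " value " + t.value);
    }
}
```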
jflex pascal.jflex
javac PascalLexer.java PascalLexerTest.java sym.java

cup pascal-tokens.cup
jflex pascal.jflex
javac PascalLexer.java PascalLexerTest.java sym.java
java PascalLexerTest $*
begin { return newSym(sym.BEGIN); }
BEGIN { return newSym(sym.BEGIN); }
integer {/* action for keyword integer */}
{identifier} {/* action for identifiers, including integer */}
begin { ... }
. . .
":=" { return newSym(sym.ASSMNT); }
":" { return newSym(sym.COLON); }
. . .
begin { ... }
. . .
":" { return newSym(sym.COLON); }
":=" { return newSym(sym.ASSMNT); }
. . .
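The rule listings above turn on how a JFlex-generated scanner chooses among competing rules: it takes the longest match, and when two rules match the same length it takes the rule listed first. The Java sketch below simulates that selection with `java.util.regex` so both cases can be checked. It is not how JFlex is implemented (JFlex compiles the rules into a DFA), and the names `LongestMatchSketch`, `INTEGER`, and `IDENT`, along with the identifier pattern, are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical simulation of longest-match, first-rule-wins selection. */
class LongestMatchSketch {

    /** One lexical rule: a token name plus its compiled pattern. */
    static final class Rule {
        final String name;
        final Pattern pattern;

        Rule(String name, String regex) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
        }
    }

    /** Builds an ordered rule list from alternating name, regex arguments. */
    static List<Rule> rules(String... pairs) {
        List<Rule> list = new ArrayList<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            list.add(new Rule(pairs[i], pairs[i + 1]));
        }
        return list;
    }

    /** Returns the name of the rule that wins at the start of input, or null if none match. */
    static String firstToken(List<Rule> rules, String input) {
        String winner = null;
        int best = 0;
        for (Rule rule : rules) {
            Matcher m = rule.pattern.matcher(input);
            // Only a strictly longer match replaces the winner, so ties keep the earlier rule.
            if (m.lookingAt() && m.end() > best) {
                best = m.end();
                winner = rule.name;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        String ident = "[A-Za-z][A-Za-z0-9]*";   // assumed identifier pattern

        // Different lengths: ":=" wins over ":" in either order.
        System.out.println(firstToken(rules("COLON", ":", "ASSMNT", ":="), ":= 1"));
        System.out.println(firstToken(rules("ASSMNT", ":=", "COLON", ":"), ":= 1"));

        // Same length: the order of the rules decides.
        System.out.println(firstToken(rules("INTEGER", "integer", "IDENT", ident), "integer"));
        System.out.println(firstToken(rules("IDENT", ident, "INTEGER", "integer"), "integer"));
    }
}
```

The first two calls both print ASSMNT, so the relative order of the ":" and ":=" rules does not matter. The last two print INTEGER and then IDENT, which is why a keyword rule has to appear before the general identifier rule.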