Chapter 2 Exercises

2.1 Consider the following context-free grammar

S -> S S + S S * a

a. show how the string aa+a* can be generated by this grammar.

ans: S -> SS*
Sa*
SS+ a *
a a + a *

b. Constuct a parse tree for this string

ans:
S
/ \
S S *
/ \
S S + a

a a

c. What language is generated by this grammar? Justify you answer

ans: The language defined by this grammar consists of two (2) consecutive a separated by plus and minus signs.


2.2 What language is generated by the following grammar? In each case justify you answer.

a. S -> 0 S 1 0 1

The language that is generated by this gramma is compose of digits number one (1) and zero (0).


b. S -> + S S - S S a

The language that is generated by this grammar is compose of a's separated by plus and minus signs.

c. S -> S ( S ) S є

The language that is generated by this grammar is compose of є.

d. S -> a S b S b S a S є

The language that is generated by this grammar is compose of a, b, and є.


e. S -> a S + S S S S * ( S )

The language that is generated by this grammar is compose of infinite consecutive S with plus (+) and * signs.

2.3

Group Mini-Project I

ICS 40 Compiler Design Principles

Group Mini-Project I

LEXICAL ANALYSIS

Handed out: July 2, 2008

PURPOSE

As a first step in the compiler project you are to develop a lexical analyzer for a mini-programming language, MINI-L (adapted from Aho, Sethi & Ullman, 1986), using the Jflex scanner generator (Klien, 2004). The lexical analyzer should take as input a source program written in MINI-L, and it should return the tokens in the input stream. Information about identifiers found should be also written into a symbol table. The lexical analyzer should also create a listing file, giving line numbers and error messages. When an error is encountered, the lexical analyzer should indicate it by a message.

TOKEN SPECIFICATION

In MINI-L a comment is introduced by ``--'' and goes to the end of the line. All tokens are separated by blanks, tabs, newlines, delimiters, operators or comments. A valid identifier begins with a letter and may be followed by additional letters, digits or underscores.

The tokens and the information that should be returned are summarized in the attached table. If the token is a reserved word, a delimiter or an operator, then a scalar should be returned indicating that token. For instance, for the character string ``beginprogram'', the scalar BEGPROGSYM is returned. If the token is an identifier, 2 items are returned: IDENTIFIER as token number and the character string of the identifier. If the token is an integer constant, then beside the scalar indicating the token to be a constant, the numerical value should be returned.

MATERIAL TO BE HANDED IN

On the due date, you are to submit both printed and softcopies of the: (a) Jflex specification (defining the MINI-L token set), (b) Java source code of the analyzer (generated by Jflex), (c) Java byte code of the lexical analyzer (compiled by Javac), and (d) test results (use attached sample MINI-L program) as part of the DOCUMENTATION of your mini-project.

You will also be required to DISCUSS and DEMONSTRATE your project in class.

This mini-project is due on: Midterm Exam Week

MINI-L TOKEN TABLE

TOKEN

SYMBOL

VALUE

STRING

identifier

IDENTIFIER

-

name

Integer constant

NUMBER

value

-

program

PROGSYM

-

-

beginprogram

BEGPROGSYM

-

-

endprogram

ENDPROGSYM

-

-

array

ARRAYSYM

-

-

of

OFSYM

-

-

integer

INTSYM

-

-

if

IFSYM

-

-

then

THENSYM

-

-

else

ELSESYM

-

-

endif

ENDIFSYM

-

-

while

WHILESYM

-

-

loop

LOOPSYM

-

-

endloop

ENDLOOPSYM

-

-

read

READSYM

-

-

write

WRITESYM

-

-

eof

EOFSYM

-

-

not

NOTSYM

-

-

and

ANDSYM

-

-

or

ORSYM

-

-

true

TRUESYM

-

-

false

FALSESYM

-

-

;

SEMISYM

-

-

,

COMMASYM

-

-

:

COLONSYM

-

-

:=

ASSIGNSYM

-

-

=

EQUALSYM

-

-

(

LPARSYM

-

-

)

RPARSYM

-

-

<>

NOTEQSYM

-

-

<

LTSYM

-

-

>

GTSYM

-

-

<=

LESYM

-

-

>=

GESYM

-

-

+

PLUSSYM

-

-

-

MINUSSYM

-

-

*

TIMESSYM

-

-

/

DIVSYM

-

-


MINI-L EXAMPLE

Below is an example that shows a sample program and the output of the lexical analyzer

SAMPLE PROGRAM
--------------
-- This program prints the even numbers from 0 .. 10
program even;
i : integer;
beginprogram
     i:=0;
     while i <= 10 loop
            write i;
            i := i + 2;
     endloop;
endprogram
 
OUTPUT OF LEXICAL ANALYZER
--------------------------
PROGSYM
IDENTIFIER even
SEMISYM
IDENTIFIER i
COLONSYM
INTSYM
SEMISYM
BEGPROGSYM
IDENTIFIER i
ASSIGNSYM
NUMBER 0
SEMISYM
WHILESYM
IDENTIFIER i
LESYM
NUMBER 10
LOOPSYM
WRITESYM
IDENTIFIER i
SEMISYM
IDENTIFIER i
ASSIGNSYM
IDENTIFIER i
PLUSSYM
NUMBER 2
SEMISYM
ENDLOOPSYM
SEMISYM
ENDPROGSYM