Lab 3 - Scanner Self-Test
1. Objectives
The objectives of this exercise are
- to have you to test and turn in the first full draft of your scanner (if you have it done)
- to give you first-hand experience with parsing
- to give you experience with turning formal language definitions in extended Backus-Naur form (EBNF) into context-free grammar (CFG) form.
2. Reflecting on Your Scanner
Examine your project through a team discussion that investigates whether all team members have been following good programming methodology. In specific, unless a deviation was cleared ahead of time with the instructor, the scanner should have the following form, based on sound programming principles for the chosen implementation language. For instance, if the implementation language is object oriented, one way to view scanner design is:
- The scanner could be a single static class (because there is no need to have multiple scanner objects)
- The scanner class coould have the following public static methods
- getToken (method names can be different as long as they are clear)
- getLexeme
- getLineNumber
- getColumnNumber
or it could implement tokens as objects of a token class with these public methods, and when the parser calls the scanner for the next token, the scanner could create a new token object with all fields properly filled, and a reference to that token object could be returned to the parser.
The scanner class should also:
- Have a dispatcher that skips white space and then calls individual private FSA methods for individual tokens based on the first character of the token, without consuming that token.
- Ensure that the individual private FSA scanners for individual tokens have the pre- and post-conditions specified that state that the file pointer is pointing to the first character of the token to be scanned when the FSA is called, and that the file pointer is pointing to the first character after the token when the FSA returns. The postcondition should also specify that the token, lexeme, line number, and column number have all been set appropriately.
- Each FSA has been programmed exactly according to the specifications given, either:
- A while loop contains a case statement that switches on the current state, and each case of the case statement has another case statement (or if statement) that determines the next state based on the current input symbol, or
- A goto structure that branches to the next state based on the current symbol being consumed from the input
For different implementation language paradigms a different approach may be necessary.
3. To Do
3-1. Testing (Optional)
If you are far enough along that your scanner first draft is running, exchange scanners with a different team.
Run the tests you tested your scanner with on theirs. Discuss any problems you see in your tests with the other team. Teams can interact to share information about what might be fixed.
3-2. Parsing (Preparing for Understanding Parsing)
Click here to bring up the Web-based grammar animator. As with the prior fsa animator you may need to use Firefox and you may have to set your Java security setting to High for the applet to run and set the link to this page as an Exception. Use the grammar animator to do the following exercises.
For this exercise, you will be parsing the string
a * b + c * b
- Bring up the grammar called Arithmetic Exp #1. To do this, click the "Access Grammar Tools Menu" button in the animation window. This brings up another list; click "select a different grammar" from the menu. Finally click on the desired grammar. You can now begin working.
Using this grammar, each team member should construct a different parse tree for the expression a * b + c * b. Capture the different parse trees as an image file and then proceed to the next problem.
- Note, to capture on a PC, you may need to do the following:
- put your cursor over the pane containing the grammar applet.
- hold down the <ALT> key and hit the Print Screen key. This puts the image in the Internet Explorer window onto the clipboard.
- On a Mac, use the Grab utility to capture whatever you need from the screen.
- Save the captured images for inclusion in your lab report.
- Complete the same exercise using the grammar called Arithmetic Exp #2. In this case team members should work together to obtain a parse tree. Capture and save the parse tree and then proceed to the next problem.
- Complete the same exercise with the grammar called Arithmetic Exp #3. In this case team members should work together to obtain a parse tree. Capture and the parse tree and then proceed to the next problem.
3-2-2. μPascal
Bring up the grammar called microPascal #1 in the applet. Use it to parse the following program. Notice that this grammar has the same properties as the "better" grammar from the previous exercise. You can proceed easily through the parse, knowing exactly which rule to apply at any step by just looking at the next token in the program that has not yet appeared in the parse tree.
begin read(Num); Num := Num + 5 * Num; write(Num); end.
The entire tree will likely fit in the display pane, but position the tree in the pane so that the root of the tree is in view and centered, and capture and save that portion.
3-3. Working with EBNF and CFGs
Bring up the EBNF definition of an older version of μPascal. (You may need to right click—control click on Mac—and open the link in a new window.) Use it to do the following. Type your answers in a word processing program. Remember that some of the ways you convert EBNF to grammar rules will result in the rules having the LL(1) property whereas other ways will not. Since we are going to focus on top down parsing based on LL(1) grammars you should think about this as you do the conversions. However, it is not a requirement that you make the rules you create conform to LL(1) properties, because we haven't yet discussed how to do this in general.
3-3-1. Do Just 1, 3, and 7 below.
- Convert the EBNF rule for StatementSequence into equivalent CFG rules.
- Convert the EBNF rule for IfStatement into equivalent CFG rules.
- Convert the EBNF rule for WhileStatement into equivalent CFG rules.
- Convert the EBNF rule for ForStatement into equivalent CFG rules.
- Convert the EBNF rule for SimpleExpression into equivalent CFG rules.
- Use the EBNF rules to determine whether the statement X := -A * B is legal.
- Use the EBNF to determine whether the statement X := -A * -B is legal.
5. Turn In
Create a single PDF file from all parts of this exercise. Submit this report to the Lab 3 dropbox. Include a cover page that has the following information.
- Lab 3
- Group number
- Today's date
- Your team names and e-mail addresses
Include in your PDF document, with identifying headers:
- A brief summary of your reflections from section 2.
- How the team's scanner worked that you tested from section 3-1 (if you did)
- The screen captures from section 3-2-2.
- The screen capture from section 3-2-3.
- The EBNF rules you converted to CFG rules in section 3-3-1. List the EBNF rule in each case followed by the equivalent CFG rule(s).