Milestone 2 - Starting the Scanner
Due at the Beginning of Your Next Lab Period
Objectives
The objectives of this assignment are:
- determining the overall structure of the scanner
- settling on a standard approach to token recognition by way of FSA's
- establishing team member assignments
To Turn In From Milestone 1
Milestone 1 covered forming teams, selecting an implementation language, choosing a team leader, establishing a versioning system for team use, and other logistics. This was accomplished in lab, so there is nothing more to turn in.
To Do This Week for Milestone 2
Prepare to Implement A Driver Program
The team leader is to implement and test a driver prototype, named mp for μPascal, to ensure that:
- mp can be written in your chosen implementation language on whatever platform you are using for project development and recompiled to run on esus
- mp can be invoked with a command line parameter that is the name of an input text file with the extension .mp, as in
mp tokenTest.mp
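As a rough illustration of the two checks above, here is a minimal driver sketch in Java. Java is only an assumption about the implementation language, and the class name Mp and the helper checkArgs are hypothetical names, not part of the assignment. A Java driver would be started as java Mp tokenTest.mp, so a small wrapper script named mp would be needed to get the exact command line shown above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical driver prototype; the class and method names are illustrative only.
final class Mp {
    // Returns null when the arguments are acceptable, otherwise a message to print.
    static String checkArgs(String[] args) {
        if (args.length != 1) {
            return "usage: mp <file>.mp";
        }
        if (!args[0].endsWith(".mp")) {
            return "mp: input file must have the extension .mp";
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String problem = checkArgs(args);
        if (problem != null) {
            System.err.println(problem);
            System.exit(1);
        }
        // Read the whole source file at once; a real scanner might read incrementally.
        String source = Files.readString(Path.of(args[0]));
        System.out.println("read " + source.length() + " characters from " + args[0]);
    }
}
```

Keeping the argument check in its own method makes it easy to test the prototype without touching the file system.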
Fix on a Structure for your Program
The team must do the following:
- Settle on a scanner interface through which the driver (eventually the parser) accesses the items it needs. For example, you might decide that the driver should be able to call the scanner with Scanner.getToken(), Scanner.getLexeme(), Scanner.getLine(), and Scanner.getColumn(). Or you may decide that Scanner.getToken() is all that is needed, in which case the scanner returns a pointer to a token object that has all of the attributes in it (token name, lexeme, line number, and column number).
- Decide on an implementation strategy for the scanner, such as
  - Utilizing a static scanner class with static private methods, one for each FSA. Since there will be only one scanner, a static class is a valid approach in this instance (you won't need to create multiple instances of the scanner).
  - Ensuring that each call from the parser to the scanner to get the next token always causes the dispatcher of the scanner to be invoked, which in turn will call the private token methods implemented internally with FSA's.
  - Using a procedural approach rather than an object-oriented approach, if your language supports one.
  - Using any other good programming principles you agree on. The only restriction is that each team member gets a set of tokens for which FSA's can be designed in a standard fashion, as specified in the bullets of the next section.
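One possible shape for these decisions is sketched below, assuming Java and the single-method interface in which getToken() returns a token object. Everything here is hypothetical: the class is called MpScanner only to avoid a clash with java.util.Scanner, the token kinds are a tiny made-up subset, open() is an invented way to hand the source text to the scanner, and the two FSA methods are simplified placeholders rather than the real μPascal token definitions.

```java
// Hypothetical token kinds; the real list comes from the language definition.
enum TokenType { IDENTIFIER, INTEGER_LIT, EOF, ERROR }

// One object carrying the four attributes named in the text.
final class Token {
    final TokenType type;
    final String lexeme;
    final int line;
    final int column;

    Token(TokenType type, String lexeme, int line, int column) {
        this.type = type;
        this.lexeme = lexeme;
        this.line = line;
        this.column = column;
    }
}

// Static scanner: there is only one, so all state and methods are static.
final class MpScanner {
    private static String source = "";
    private static int pos = 0;      // plays the role of the file pointer
    private static int line = 1;
    private static int column = 1;

    private MpScanner() {}

    static void open(String text) {
        source = text;
        pos = 0;
        line = 1;
        column = 1;
    }

    // Dispatcher: skip white space, peek at one character, pick an FSA method.
    static Token getToken() {
        while (pos < source.length() && Character.isWhitespace(source.charAt(pos))) {
            advance();
        }
        if (pos >= source.length()) {
            return new Token(TokenType.EOF, "", line, column);
        }
        char c = source.charAt(pos);
        if (Character.isLetter(c)) return scanIdentifier();
        if (Character.isDigit(c)) return scanIntegerLit();
        // Unknown character: report it and consume it so scanning can continue.
        Token bad = new Token(TokenType.ERROR, String.valueOf(c), line, column);
        advance();
        return bad;
    }

    // Moves the pointer one character forward and keeps line/column in step.
    private static void advance() {
        if (source.charAt(pos) == '\n') {
            line++;
            column = 1;
        } else {
            column++;
        }
        pos++;
    }

    // Placeholder FSA: letters followed by letters or digits (simplified).
    private static Token scanIdentifier() {
        int start = pos, startLine = line, startColumn = column;
        while (pos < source.length() && Character.isLetterOrDigit(source.charAt(pos))) {
            advance();
        }
        return new Token(TokenType.IDENTIFIER, source.substring(start, pos), startLine, startColumn);
    }

    // Placeholder FSA: one or more digits.
    private static Token scanIntegerLit() {
        int start = pos, startLine = line, startColumn = column;
        while (pos < source.length() && Character.isDigit(source.charAt(pos))) {
            advance();
        }
        return new Token(TokenType.INTEGER_LIT, source.substring(start, pos), startLine, startColumn);
    }
}
```

The driver would call MpScanner.open(text) once and then call MpScanner.getToken() repeatedly until it receives a token of type EOF. With the multi-method interface instead, getToken() would return only the token name and the other attributes would be exposed through separate static getters.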
Prepare to Implement Individual FSA's for Token Scanning
To prepare for individual FSA token scanning, the team is to determine a standard implementation approach that all team members follow when implementing their assigned individual FSA's for scanning tokens. It is important that the FSA's follow a standard implementation for a few reasons:
- it makes the task of reviewing and scoring the team project much easier
- for the same reason, it is simply good team programming practice: in real life, a program written to a clear, standard method is much easier for a maintainer to read, modify, and extend
Thus, the team must do the following:
- Determine whether the chosen implementation programming language has a "goto" or equivalent statement. If so, the individual FSA's for each token should be implemented with the goto statement for moving between states, as described in class.
- If the programming language does not have a goto-equivalent statement, study the case (switch) statement in the language to see whether it can be used to follow the example given in the hypertextbook at Contents > Topics > Scanning > Section 1 > Page 3. Notice in that example that there is an outer case statement to select the current state of the FSA, and that inside each case of the outer case statement there is a different inner case (switch) statement that sets the next state based on the next character in the input file. You need to determine:
  - Whether the case statement in your programming language can switch on an enumerated type (the state names of an FSA should be implemented as elements of an enumerated type). This should be possible in any programming language, but if it is not in yours you will need to implement the outer case statement with an if-then-elseif-...-elseif-else construct.
  - Whether the case statement in your programming language can switch on individual characters. If not, you will need to implement the inner case statement with an if-then-elseif-...-elseif-else construct. You are allowed to use (and should use) built-in library routines for detecting whether a character is a letter (e.g., with method isLetter()) or a digit. If there is no such built-in function, you should write one yourself. You should not include a long string of if-statement checks to determine whether a character is an "a" or a "b" or...or a "z" etc.
- Standardize among team members how the "others" branches on the modified FSA tokens are to be handled (e.g., in the default case of the switch statement or the else case of an if-then-elseif-...-elseif-else construct)
- Establish the FSA precondition that the dispatcher writer will leave the file pointer at the position of the first character of the token to be handled by that FSA
- Establish the FSA postcondition that the FSA implementer will ensure that the file pointer is at the first character following the lexeme that is matched for a token when the FSA is done.
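To make the nested-case layout and the two conditions concrete, here is a sketch of a single FSA in Java, a language without goto. It is not the hypertextbook example. It assumes, purely for illustration, a machine that matches either an integer literal (digits) or a fixed-point literal (digits, a period, digits); the names NumberFsa, Match, INTEGER_LIT, and FIXED_LIT are hypothetical. The outer switch selects on an enumerated state, the inner switch or test selects on the current character, and every "others" branch ends the machine.

```java
final class NumberFsa {
    // State names as elements of an enumerated type.
    private enum State { START, INT_PART, SEEN_DOT, FRACTION, DONE }

    // Result of one run: what was matched and where the pointer ends up.
    static final class Match {
        final String kind;    // "INTEGER_LIT", "FIXED_LIT", or "ERROR" (hypothetical names)
        final String lexeme;
        final int next;       // index of the first character after the lexeme

        Match(String kind, String lexeme, int next) {
            this.kind = kind;
            this.lexeme = lexeme;
            this.next = next;
        }
    }

    // Precondition: start is the position of the first character of the token.
    // Postcondition: Match.next is the position of the first character after the lexeme.
    static Match scan(String src, int start) {
        State state = State.START;
        int pos = start;
        int lastAccept = -1;   // pointer position after the longest lexeme accepted so far
        String kind = "ERROR";

        while (state != State.DONE) {
            // '\0' stands in for end of input; it always lands in an "others" branch.
            char c = pos < src.length() ? src.charAt(pos) : '\0';
            switch (state) {                      // outer case: current state
                case START:
                    if (Character.isDigit(c)) {
                        state = State.INT_PART; pos++; lastAccept = pos; kind = "INTEGER_LIT";
                    } else {
                        state = State.DONE;       // others
                    }
                    break;
                case INT_PART:
                    switch (c) {                  // inner case: current character
                        case '.':
                            state = State.SEEN_DOT; pos++;
                            break;
                        default:
                            if (Character.isDigit(c)) { pos++; lastAccept = pos; }
                            else { state = State.DONE; }   // others
                            break;
                    }
                    break;
                case SEEN_DOT:
                    if (Character.isDigit(c)) {
                        state = State.FRACTION; pos++; lastAccept = pos; kind = "FIXED_LIT";
                    } else {
                        state = State.DONE;       // others: the period is not part of the lexeme
                    }
                    break;
                case FRACTION:
                    if (Character.isDigit(c)) { pos++; lastAccept = pos; }
                    else { state = State.DONE; }  // others
                    break;
                default:
                    state = State.DONE;
                    break;
            }
        }
        if (lastAccept < 0) {
            return new Match("ERROR", "", start); // nothing matched; pointer left untouched
        }
        return new Match(kind, src.substring(start, lastAccept), lastAccept);
    }
}
```

For the input 12.5+x starting at position 0, this sketch reports FIXED_LIT with lexeme 12.5 and leaves the pointer at the plus sign. For 12.. it falls back to INTEGER_LIT with lexeme 12 and leaves the pointer at the first period, which is how the postcondition is met after the machine has read one character too many. How an FSA should behave when nothing matches is a separate decision for the team; leaving the pointer untouched is only one option.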
To Turn In
Turn in a PDF file to the Milestone 2 dropbox with
- a cover page that contains your team names and numbers
- a content page that contains four sections, each with a relevant title, that discuss:
  - A description of the way you intend to implement the scanner, including:
    - whether it is a static class, a non-static class, a procedure, etc.
    - the private and/or public attributes (such as lexeme, columnNumber, etc.)
    - the public methods
    - the private methods
  - A description of your team's standard approach for implementing FSA's according to the discussion above
  - The team member assignments for implementing the scanner components, including:
    - overall scanner structure
    - dispatcher
    - specific tokens (FSA's)