Milestone 3 - Completing the Scanner

Due at the beginning of your next lab period


Each team member is to help complete your team's scanner in your chosen implementation programming language. The scanner should be written following the outline covered in class. If your team would like to consider a variation, you must first clear it with the instructor.

Public Scanner Methods

The driver (main method and eventual parser) must have access to the following methods:

openFile

Remember that your program must run on esus regardless of the platform on which it is being developed. In this case, the main program or method should be the driver, which should start executing with something like the following Linux command line:

mp source-code-file.pas

where source-code-file is a command-line parameter that is the filename (actually the path name of that file relative to the directory in which the executable mp exists) of the text file the scanner will be scanning.
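As a minimal sketch of how a driver might pick up the command-line parameter, the Python fragment below reads the path and loads the file. The names `open_file` and `main`, the choice to read the whole file at once, and the wording of the messages are assumptions made for illustration; they are not prescribed by the assignment.

```python
import sys


def open_file(path):
    """Hypothetical openFile: read the entire source file into a string."""
    with open(path, "r") as source:
        return source.read()


def main(argv):
    """Hypothetical driver entry point; argv mirrors sys.argv."""
    # argv[0] is the program name, argv[1] should be the source file path.
    if len(argv) != 2:
        print("usage: mp source-code-file.pas", file=sys.stderr)
        return 1
    try:
        text = open_file(argv[1])
    except OSError as err:
        print(f"mp: cannot open {argv[1]}: {err}", file=sys.stderr)
        return 1
    print(f"read {len(text)} characters from {argv[1]}")
    return 0
```

A real driver would end with a call such as `sys.exit(main(sys.argv))` and would hand the text to the scanner instead of printing a character count.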

getToken

This method is to return the next token in the input file. Tokens can best be implemented as an enumerated type.
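One possible way to represent tokens as an enumerated type is sketched below in Python. The token names are hypothetical placeholders; your team's actual token list should come from the language definition used in class.

```python
from enum import Enum, auto


class Token(Enum):
    """Hypothetical token kinds; the real list will be much longer."""
    IDENTIFIER = auto()
    INTEGER_LIT = auto()
    ASSIGN = auto()
    SEMICOLON = auto()
    EOF = auto()
    ERROR = auto()
```

With this representation, `Token.IDENTIFIER.name` gives a printable name for the token file, and tokens can be compared with `is` or `==`.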

getLexeme

This method is to return the current lexeme, namely the string that was matched when the current token was scanned.

getLineNumber

This method is to return the number of the line on which the first character of the lexeme for the current token was scanned.

getColumnNumber

This method is to return the column number of the first character of the lexeme for the current token.
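Line and column numbers are easiest to report if the scanner updates them every time it consumes a character. The sketch below shows one way to do this in Python. The class name `SourceReader`, its methods, and the choice of 1-based line and column numbers are assumptions for illustration only.

```python
class SourceReader:
    """Hypothetical helper that tracks the position of the next character."""

    def __init__(self, text):
        self.text = text
        self.pos = 0      # index of the next unread character
        self.line = 1     # assumed 1-based line number
        self.column = 1   # assumed 1-based column number

    def peek(self):
        """Return the next character without consuming it ('' at end of file)."""
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def advance(self):
        """Consume one character and update the line and column counters."""
        ch = self.peek()
        if ch == "":
            return ""
        self.pos += 1
        if ch == "\n":
            self.line += 1
            self.column = 1
        else:
            self.column += 1
        return ch
```

A scanner built on such a helper could save `line` and `column` just before running a finite state automaton, so that getLineNumber and getColumnNumber report the position of the first character of the lexeme.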

Driver Output

Design the driver to repeatedly retrieve tokens from the scanner and print a token file. The token file is to be a standard text file with one line per scanned token, containing the following information in this order and spaced so that the output is easy to read, as given below.

where token 1 is the first token scanned, line number 1 is the number of the line on which the token was scanned, column number 1 is the column on that line where the token begins, and lexeme 1 is the lexeme corresponding to the token, and so on for each line. (Notice that scanner errors are not handled at this time.) For example, the first two lines of the output file might read
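As a rough sketch of the layout just described (not the course's official sample), the Python function below formats one line of the token file in the order token, line number, column number, lexeme. The function name, the field widths, and the token name `PROGRAM` in the usage note are all assumptions.

```python
def format_token_line(token_name, line, column, lexeme):
    """Format one token-file line with fixed-width columns (widths assumed)."""
    # Token name left-aligned in 16 columns, numbers right-aligned in 6,
    # then two spaces and the lexeme exactly as it was scanned.
    return f"{token_name:<16}{line:>6}{column:>6}  {lexeme}"


def format_token_file(records):
    """Join (token_name, line, column, lexeme) tuples into token-file text."""
    return "\n".join(format_token_line(*record) for record in records) + "\n"
```

For instance, `format_token_line("PROGRAM", 1, 1, "program")` pads the hypothetical token name to 16 columns so that the numbers line up from row to row.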

Scanner Design

You must implement the scanner according to the design given in the lecture unless you have cleared a variation with the instructor. The scanner should have a standard dispatcher that first skips white space and then examines (but does not consume) the first non-white-space character to select the proper finite state automaton for scanning the token.
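The dispatcher idea can be sketched in Python as follows. This is only an illustration under assumptions: the function names, the string labels returned for each automaton, and the rule for which characters start which kind of token are all hypothetical and should be replaced by the design from lecture.

```python
def skip_white_space(text, pos):
    """Return the index of the first non-white-space character at or after pos."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def select_fsa(text, pos):
    """Skip white space, then peek at one character to choose an automaton.

    Returns a (label, position) pair; the character at position is examined
    but not consumed, so the chosen automaton starts on it.
    """
    pos = skip_white_space(text, pos)
    if pos >= len(text):
        return "eof", pos
    ch = text[pos]
    if ch.isalpha() or ch == "_":   # assumed identifier start characters
        return "identifier", pos
    if ch.isdigit():
        return "number", pos
    if ch == "'":
        return "string", pos
    return "symbol", pos
```

In a full scanner the labels would be replaced by calls to the corresponding automaton methods, each of which begins at the returned position.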

Implement each finite state automaton (augmented for practicality) as a separate method.

The precondition for each finite state automaton is to be:

The postcondition for each finite state automaton is to be:

Each finite state automaton is to use one of the structures given as options in class, and each team member is to implement their FSAs in the selected fashion so that all of the code has the same appearance.
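One common way to code a finite state automaton as a separate method is a loop over an explicit state variable, remembering the last accepting state so the scanner can back up. The Python sketch below does this for unsigned integer and fixed-point literals. It is not necessarily one of the structures presented in class, and the token names, the entry assumption (the position is on the first character of the token), and the exit assumption (the returned position is just past the lexeme) are hypothetical.

```python
def is_digit(ch):
    """True for the ASCII digits 0-9 only; False for '' (end of input)."""
    return "0" <= ch <= "9"


def scan_number(text, pos):
    """Scan an integer or fixed-point literal starting at text[pos].

    Returns (token_name, lexeme, new_pos), where new_pos is just past the lexeme.
    """
    start = pos
    state = 0
    last_accept = None  # (end position, token name) of last accepting state
    while True:
        ch = text[pos] if pos < len(text) else ""
        if state in (0, 1) and is_digit(ch):
            state = 1                      # one or more digits: accepting
            pos += 1
            last_accept = (pos, "INTEGER_LIT")
        elif state == 1 and ch == ".":
            state = 2                      # digits then '.': not accepting
            pos += 1
        elif state in (2, 3) and is_digit(ch):
            state = 3                      # digits '.' digits: accepting
            pos += 1
            last_accept = (pos, "FIXED_LIT")
        else:
            break                          # no transition on this character
    if last_accept is None:
        raise ValueError("scan_number called on a non-digit character")
    end, token_name = last_accept          # back up to the last accepting state
    return token_name, text[start:end], end
```

Backing up matters for input such as `12..15`, where the automaton reads the first `.` before discovering that only `12` forms a number.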

Notes

In Pascal,

By the time this assignment is done, your scanner should be completely functional, ready to work with your parser (to come).

Your scanner should also:

Optional

Produce a source listing of the program you are scanning so that the program looks just as it was entered by the programmer. As errors occur, they should be noted by inserting an error line right below the source line with the appropriate error message and a mark (^) pointing to the start of the problem. You can extend your scanner to do this, but it is not a requirement.
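A minimal sketch of such an error line in Python is shown below. The function name, the 1-based column convention, and the placement of the message after the mark are assumptions; the sketch also assumes the source line contains no tab characters before the error, since tabs would throw off the alignment.

```python
def listing_with_error(source_line, column, message):
    """Return the source line followed by an error line marking the column."""
    # Pad with spaces so the caret sits under the 1-based column.
    marker = " " * (column - 1) + "^ " + message
    return source_line + "\n" + marker
```

For a hypothetical error at column 8 of the line `x := 3 $ 4;`, the caret is printed directly under the `$`.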

Special requirements

To Turn In