Lab 1 - Scanning I
Objectives
The objectives of this laboratory exercise are
- to ensure that your team has settled on a version control system
- to ensure that your team has chosen a common IDE
- to ensure that all team members have Google accounts configured for document sharing and remote meetings via Hangouts
- to have each team member install the Free Pascal Compiler to use as a reference during the project
- to give you the opportunity to compile a real Pascal program with options that allow you to view the compiled code
- to acquaint you with the definition of µPascal
- to give you experience with EBNF
- to help you get started with your scanner implementation.
To Do
Create a Google doc that is shared with all team members. Your answers for this exercise are to be recorded in this doc. To keep answers consistent across teams, use Linux for these assignments.
Part 1 - Working with an ASCII Input File
Remember that the scanner module of a compiler reads the source code of the program to be compiled as a stream of characters from a text file. This part of the exercise is designed to help you better understand what a text file is. The table below is a dump of an ASCII text file that contains a Pascal program. Note the following:
- Use this ASCII table to decipher the text file
- Each pair of hexadecimal digits represents a single ASCII character
- The left column is to be ignored; those numbers are simply offsets into the file, counting bytes (they advance by 20 for every 16 bytes shown, so they are written in octal, not hexadecimal)
- The blank spaces between groups of four hexadecimal digits are there only for ease of reading; ignore them
0000000 7270 676f 6172 206d 7266 6465 6f28 7475
0000020 7570 2974 0a3b 200a 6220 6765 6e69 200a
0000040 2020 7720 6972 6574 6e6c 2728 6948 202c
0000060 6874 7369 6920 2073 7246 6465 2927 0a3b
0000100 2020 6e65 2e64 000a
0000107
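If reading the pairs strictly left to right gives scrambled text, the likely cause is byte order. The sketch below assumes each group of four hexadecimal digits is a 16-bit word stored low byte first (little-endian), and that a final line holding only an offset gives the file length in octal. The function name `decode_od_words` and the sample dump are made up for illustration; the sample is deliberately not the lab's file.

```python
def decode_od_words(dump):
    """Decode a word-oriented hex dump into ASCII text.

    Assumptions (not stated in the lab handout): each 4-digit group is a
    16-bit little-endian word, the first field on a row is an octal byte
    offset, and a row with only an offset gives the total file length.
    """
    data = bytearray()
    length = None
    for line in dump.strip().splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 1:
            # Trailing offset-only row: total number of bytes, in octal.
            length = int(fields[0], 8)
            continue
        for word in fields[1:]:  # fields[0] is the offset column
            # Low byte comes first in the file, so emit it first.
            data += int(word, 16).to_bytes(2, "little")
    if length is not None:
        del data[length:]  # drop the padding byte of an odd-length file
    return data.decode("ascii")


sample = "0000000 6548 6c6c 006f\n0000005"
print(decode_od_words(sample))
```

Decoding by hand first and then checking with a script like this is a reasonable way to confirm your reading of the ASCII table.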
- Type in the following Pascal program and save it as a text file on Google Drive with the name program1.pas:
program Tester(input, output);
var
  I: Integer;
begin {tester}
  Writeln;
  Writeln;
  Write('Please enter an integer value for I: ');
  Read(I);
  I := I + 1;
  Writeln('The current value of I is ', I:0);
  Writeln;
  Writeln;
end. {tester}
- Read about the Unix utility od using the man pages.
- Determine how to convert the above program into a hexadecimal dump.
- Record in your Google doc the od command, with flags, that performs this conversion.
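One way to sanity-check whatever od prints is to produce a dump independently. The sketch below is not a replacement for od and does not reproduce any particular od flag; it simply lists the bytes of a file one per column, in file order, with octal offsets. The function name `hex_bytes` is hypothetical.

```python
def hex_bytes(data, width=16):
    """Return a simple hex dump of `data`: octal offset, then one
    two-digit hexadecimal value per byte, `width` bytes per row."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        row = " ".join(f"{byte:02x}" for byte in chunk)
        lines.append(f"{offset:07o} {row}")
    # Final row: total length in octal, mirroring the dump in Part 1.
    lines.append(f"{len(data):07o}")
    return "\n".join(lines)


print(hex_bytes(b"begin\n  Writeln;\nend.\n"))
```

To apply it to your own file, read the file in binary mode (for example `open("program1.pas", "rb").read()`) and pass the result to `hex_bytes`.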
Part 2 - Install a Pascal Compiler
For the compiler project, it will help to have a Pascal compiler installed on your own computer for reference. It does not matter which one. If you don't have a Pascal compiler, the recommended one is the Free Pascal Compiler, found at freepascal.org. Download and install it if needed. Once you have access to a Pascal compiler, do the following:
- Discover how to compile the program from Part 1 (program1.pas) while keeping the assembly file generated by the compiler, with the source lines inserted into it.
- Include in your Google doc the Linux Pascal compiler command, including flags, you used to accomplish this.
Part 3 - Working with the Tokens for µPascal
Open the formal grammar for µPascal in a spreadsheet. As a team, try to determine all of the items in the definition that should be considered tokens. List these in your Google doc. Don't peek at the official list of tokens until you have a reasonable list of your own, because this is part of your learning process. Once you have finished, compare your list with the official list and find the places where they differ. Discuss these differences until you understand why the official list of tokens is what it is.
Part 4 - Designing an FSA for a Particular Token
- Consider the following English descriptions for two tokens:
  - A float_literal is a string that has either:
    - one or more digits followed by an exponent, or
    - one or more digits, followed by a decimal point, followed by one or more digits, followed by an optional exponent.
    - In either case, the exponent, if present, is formed by the letter e or E, followed by an optional + or -, followed by one or more digits.
  - An identifier must start with a letter or underscore, and following that may contain letters, digits, and the underscore character. An identifier may not end with an underscore character and may not have two underscore characters in a row.
- Write down in your doc regular expressions for float_literal and identifier.
- Write a regular expression for white_space. Include this in your doc.
- Construct a finite state automaton for float_literal. Use the fsa animator tool below to do this. When you are finished, screen capture the fsa and include it in your doc.
NOTE: You may have to fiddle with security issues on your computer to allow a Java applet to run (particularly on a Mac).
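To see how a regular expression and a finite state automaton describe the same set of strings, the sketch below handles only the exponent part described above (the letter e or E, an optional + or -, then one or more digits). It is an illustration under that reading of the description, not the official µPascal definition; the state names and function names are made up.

```python
import re

# Regular expression for the exponent alone: e or E, optional sign, digits.
EXPONENT_RE = re.compile(r"[eE][+-]?[0-9]+")


def char_class(ch):
    """Map a character to the input class used in the transition table."""
    if ch in "eE":
        return "e"
    if ch in "+-":
        return "sign"
    if ch in "0123456789":
        return "digit"
    return "other"


# (state, input class) -> next state; a missing entry means "reject".
TRANSITIONS = {
    ("start", "e"): "after_e",
    ("after_e", "sign"): "after_sign",
    ("after_e", "digit"): "digits",
    ("after_sign", "digit"): "digits",
    ("digits", "digit"): "digits",
}
ACCEPTING = {"digits"}


def fsa_accepts(text):
    """Run the table-driven automaton over the whole string."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING


def regex_accepts(text):
    """Accept only if the regular expression matches the whole string."""
    return EXPONENT_RE.fullmatch(text) is not None


for candidate in ["e5", "E+10", "e-03", "e", "e+", "5", "e5x", "ee5"]:
    print(candidate, fsa_accepts(candidate), regex_accepts(candidate))
```

The same pattern (classify the character, look up the next state, check for an accepting state at the end) extends to the automaton you draw for float_literal.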
Part 5 - Preparing for Milestone 2
Delegate responsibilities for implementing the scanner (see milestone 2), which is due next week, and begin design and implementation of the scanner. The team leader should be responsible for coding the dispatcher and the routine for skipping white space. The individual fsa scanning stubs should be divided evenly among the remaining team members. Be sure that each individual has at least one "interesting" fsa to complete. Also, include the new definitions of identifier and float_literal from Part 4 as tokens in the scanner (these will be modifications to the µPascal definition).
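As a starting point for dividing the work, the sketch below shows one possible shape for the dispatcher, the white-space skipper, and two fsa stubs. Everything in it is an assumption for illustration: the function names, the token names (`identifier`, `integer_literal`, `eof`, `error`), and the choice of Python are not taken from milestone 2, and the stubs are intentionally simpler than the real token definitions.

```python
import string

LETTERS = string.ascii_letters
DIGITS = string.digits
WHITE_SPACE = " \t\r\n"


def skip_white_space(text, pos):
    """Advance past blanks, tabs, and line ends; return the new position."""
    while pos < len(text) and text[pos] in WHITE_SPACE:
        pos += 1
    return pos


def scan_digits(text, pos):
    """Stub fsa: one or more digits (to be extended for float_literal)."""
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    return ("integer_literal", text[pos:end], end)


def scan_word(text, pos):
    """Stub fsa: letters, digits, underscores (underscore rules not yet enforced)."""
    end = pos
    while end < len(text) and (text[end] in LETTERS + DIGITS or text[end] == "_"):
        end += 1
    return ("identifier", text[pos:end], end)


def dispatch(text, pos):
    """Skip white space, then pick an fsa based on the next character."""
    pos = skip_white_space(text, pos)
    if pos >= len(text):
        return ("eof", "", pos)
    ch = text[pos]
    if ch in DIGITS:
        return scan_digits(text, pos)
    if ch in LETTERS or ch == "_":
        return scan_word(text, pos)
    return ("error", ch, pos + 1)  # no fsa claims this character yet


def tokens(text):
    """Call the dispatcher repeatedly until the end of input."""
    pos, found = 0, []
    while True:
        kind, lexeme, pos = dispatch(text, pos)
        found.append((kind, lexeme))
        if kind == "eof":
            return found


print(tokens("I  42\n_x"))
```

Because each fsa takes the text and a start position and returns a token plus the position after it, team members can write and test their stubs independently of the dispatcher.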
Part 6 - Turning Work In
Create a Google doc cover sheet that you can use for each submission from now on. The cover sheet is to have on it in order:
- Your group number
- Today's date
- Your team members' names and email addresses
For this assignment, attach the cover sheet to the Google doc you created for this lab and submit the report as a PDF file to the D2L dropbox.