CS 115 Program 5 File Filter Spring 2001

Due Date: Friday, April 27, 2001

You will write a program that will act as a "filter" on a file. It will input a file as a stream of characters, perform some action on each character and output the characters to another file.

Phase I

Write a program that asks for a filename and opens that file for input. It then asks for a second filename and opens that file for output. If either open fails, the program should print a message and abort its run. If both opens succeed, copy each character from the input file to the output file. After the whole file has been processed, report the number of lines in the input file. Lines are determined by the presence of a newline character. You only need one function, main, for this phase. If you wish, you can write more.

Example interaction with the user:

Enter the name of your file:  myfile.txt
Enter the name for the output file: output.dat

Done.
There were 15 lines in the file.

and the file output.dat would exist, as an exact copy of myfile.txt.

Phase II

Write a void function which accepts one character as a parameter, encodes it as described below and returns the encoded character through a second (reference) parameter.

The encoding is done with a "substitution" algorithm as follows: An array of characters is set up as an encoding string. It is 26 characters long and it contains all the uppercase letters of the alphabet, in any order desired. This can be done by initialization at its declaration. The letter that is in the first element of the array, the zeroth position, is the letter that is substituted for an 'A'. The letter that is in the second element of the array, the position indexed by 1, is the letter that is substituted for a 'B'. And so on. The letter in the last position in the string is substituted for a 'Z'.

For example: if the "encoding string" were "DEFGHABCIJKPQRMNOLTSXYZUVW" then the word "CAT" would become "FDS", because C is the third letter in the alphabet and F is the third letter in the string, A is the first letter in the alphabet and D is the first letter in the string, and T is the twentieth letter in the alphabet and S is the twentieth letter in the string.

It is possible to calculate the substitution character from the input character. You do NOT want to do 26 different if statements! Think about using ASCII codes and typecasts. Consider what the expression int('A') gives you. This function will be pretty short.
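
As a small illustration of that hint, the sketch below turns an uppercase letter into its position in the alphabet and uses that position to index the example encoding string. The names KEY, alphabetIndex and substitute are made up for this sketch, and it assumes a character set such as ASCII in which 'A' through 'Z' have consecutive codes.

```cpp
#include <string>

// The example encoding string from the text above.
const std::string KEY = "DEFGHABCIJKPQRMNOLTSXYZUVW";

// Position of an uppercase letter in the alphabet: 'A' -> 0, 'B' -> 1,
// ..., 'Z' -> 25. Assumes consecutive character codes (true for ASCII).
int alphabetIndex(char upper) {
    return int(upper) - int('A');
}

// Looks up the substitute for an uppercase letter.
// The caller must pass only 'A'..'Z'; nothing else is checked here.
char substitute(char upper) {
    return KEY[alphabetIndex(upper)];
}
```

With this key, substitute('C') is 'F', substitute('A') is 'D' and substitute('T') is 'S', which matches the "CAT" to "FDS" example.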

For characters that are uppercase letters of the alphabet, follow the procedure described above. For lowercase letters, convert them to uppercase and then proceed as before. (There is a function that will convert them to uppercase.) For anything else that is not alphabetic, just return the same value as was input.

Call your encoder function from your main function, with the character you read in from the input file. The encoded character is what you should write out to the output file. You should only have to add a few lines to your main function to do this.

Example:
The input file is 

	this is a Test of
	the Emergency Broadcast system.
	12345 (123) 456-7987
	And some more text.

The output file is

	SCIT IT D SHTS MA
	SCH HQHLBHRFV ELMDGFDTS TVTSHQ.
	12345 (123) 456-7987
	DRG TMQH QMLH SHUS.

Phase III

Write another function which will count the frequency of each letter of the alphabet in the stream of characters coming from the input file. That is, the function will keep track of how many A's, how many B's, how many C's and so on that it saw in the file.

For example, for the same input file above, after the file has been processed, the counters can be displayed as:

The Frequency of letters in myfile.txt is:
A  4    B  1    C  2    D  2
E  9    F  1    G  1    H  2
I  2    J  0    K  0    L  0
M  4    N  2    O  4    P  0
Q  0    R  3    S  7    T  8
U  0    V  0    W  0    X  1
Y  2    Z  0 

meaning there were 4 A's, 1 B, 2 C's, 2 D's and so on.

Again, you do not need 26 different if statements or 26 different counting statements. You will need 26 different counters, which should all be initialized to zero. You can use the same idea that worked in Phase II to do this job too. Think of an array of counters. The very first one in the array is the counter for 'A's, the second one is the counter for 'B's and so on. Consider the expression int(charvar) - int('A'). The counter function will be very short.
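
A minimal sketch of such a counter is shown below. The names countLetter and NUM_LETTERS are hypothetical. Counting lowercase letters together with their uppercase forms is an assumption taken from the sample table, where for example the 9 E's include both 'E' and 'e'.

```cpp
#include <cctype>

const int NUM_LETTERS = 26;

// Hypothetical counter: adds one to the counter that belongs to `ch`
// if `ch` is a letter, and does nothing otherwise.
// counts[0] is the counter for 'A', counts[1] for 'B', and so on.
void countLetter(char ch, int counts[]) {
    char upper = static_cast<char>(
        std::toupper(static_cast<unsigned char>(ch)));
    if (upper >= 'A' && upper <= 'Z') {
        ++counts[int(upper) - int('A')];
    }
}
```

The caller owns the array and is responsible for setting all 26 counters to zero before the first call, for instance with int counts[NUM_LETTERS] = {0};.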

Call your counter function from the main function. It should have arguments of the current character under consideration (from the input file, before it gets encoded) as well as the array of counters.

After the input file is processed, display the counts on the screen. Use manipulators such as setw to format the counts into a neat table like the one above. Remember that you will have to include iomanip.h (named <iomanip> in standard C++). The display code can go in main or in another function. It is not part of the counter function.
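
The table layout can be sketched with setw as follows. The function name printCounts, the choice of four entries per row and the field width of 3 are assumptions read off the sample table, not requirements. The sketch uses the standard header name <iomanip>; older compilers may need iomanip.h instead.

```cpp
#include <iomanip>
#include <iostream>

// Hypothetical display function: prints the 26 counters four to a row,
// each as the letter followed by its count right-aligned in 3 columns.
void printCounts(std::ostream& out, const int counts[]) {
    const int NUM_LETTERS = 26;
    const int PER_ROW = 4;
    for (int i = 0; i < NUM_LETTERS; ++i) {
        out << char('A' + i) << std::setw(3) << counts[i];
        if (i % PER_ROW == PER_ROW - 1 || i == NUM_LETTERS - 1) {
            out << '\n';    // end of a row, or end of the table
        } else {
            out << "    ";  // gap between entries on the same row
        }
    }
}
```

Passing std::cout as the first argument displays the table on the screen. Note that setw applies only to the next item written, so it has to be repeated for every count.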

So the complete interaction with the user for Phase III would be:

Enter the name of your file:  myfile.txt
Enter the name for the output file: output.dat

Done.
There were 4 lines in the file.

The Frequency of letters in myfile.txt is:
A  4    B  1    C  2    D  2
E  9    F  1    G  1    H  2
I  2    J  0    K  0    L  0
M  4    N  2    O  4    P  0
Q  0    R  3    S  7    T  8
U  0    V  0    W  0    X  1
Y  2    Z  0 

and the file output.dat would exist, as an encoded copy of myfile.txt.

Testing:
For Phase I, you will want to show what happens with a big file, a little file, an empty file, a file that does not exist, files with lots of lines, files with blank lines, a file that has only one line. Be creative!

For Phase II try different encoding strings. What happens if you run it on a file that has no alphabetic characters in it?

For Phase III, it would be sensible to use the same test files you used for Phase I. Also try files that contain only one letter of the alphabet, and try both upper and lower case.

Design
Write a pseudocode design for Phase I at detail Level 2. This will be discussed in a designated recitation, and you will be responsible for having it in readable form at that time. Please read the documentation standard. As you can see from looking at the grading sheet, we will be looking to see how you meet these standards. Every function needs a description of its purpose and its parameters. Local variables need documentation too.

As described in the documentation standard, turn in the following, neatly stapled, in this order: