UKSPC Practice Problems » Dynamic Dictionary Coding

Input file: ddc.in
Output file: ddc.out

A common method of data compression, "dictionary coding", involves replacing words in a text with numbers indicating their positions in a dictionary. Static dictionary coding, in which the dictionary is known in advance, can be problematic because the dictionary must be available in order to understand the coded text. Dynamic dictionary coding avoids this problem by deriving the dictionary from the text to be compressed. The text is processed from beginning to end, starting with an empty dictionary. Whenever a word is encountered that is already in the dictionary, it is replaced by a number giving its position in the dictionary (counting from 1, as in the sample output). Whenever a word is encountered that is not in the dictionary, it appears as-is in the compressed text and is added to the end of the dictionary.

You are to implement dynamic dictionary coding. See the input and output specifications below.
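
The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference solution: the function name encode_words is hypothetical, and the 1-based numbering is an assumption taken from the sample output.

```python
def encode_words(words):
    """Encode a sequence of words using a dictionary built on the fly."""
    positions = {}  # word -> position in the dictionary (assumed 1-based)
    encoded = []
    for word in words:
        if word in positions:
            # Known word: emit its dictionary position instead of the word.
            encoded.append(str(positions[word]))
        else:
            # New word: emit it unchanged and append it to the dictionary.
            positions[word] = len(positions) + 1
            encoded.append(word)
    return encoded


print(encode_words("the cat chased the rat while".split()))
# ['the', 'cat', 'chased', '1', 'rat', 'while']
```

A hash map keyed by word makes each lookup constant time on average, so the text is encoded in a single pass.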

Input Specification

The first line of the input file contains a positive integer giving the number of sets of text to be compressed. Each set of text consists of several lines made up of lowercase letters and spaces only. You may assume that no word is longer than 20 letters, that no input line is longer than 80 characters, and that there are no more than 100 input lines. The sets of text are separated by a single blank line and are to be compressed individually.
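
One possible way to split such an input into its sets of text is sketched below. The helper name split_sets is hypothetical, and the sketch assumes that a separator line is completely empty, so a line containing only spaces would be treated as text.

```python
def split_sets(text):
    """Split the raw input into a list of sets, each a list of lines."""
    lines = text.splitlines()
    expected = int(lines[0])  # number of sets announced on the first line
    sets, current = [], []
    for line in lines[1:]:
        if line == "":
            # An empty line closes the set collected so far, if any.
            if current:
                sets.append(current)
                current = []
        else:
            current.append(line)
    if current:
        sets.append(current)
    return sets[:expected]


sample = "2\nthe cat\nthe rat\n\na dog\n"
print(split_sets(sample))
# [['the cat', 'the rat'], ['a dog']]
```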

Output Specification

The output file should contain each set of input text compressed using dynamic dictionary coding as described above. Line breaks and spacing must be preserved exactly as in the input file, and the sets of compressed text must be separated by a single blank line.
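
Because spacing has to survive unchanged, one option is to substitute words in place instead of splitting and rejoining each line. The sketch below uses re.sub with a replacement function for that purpose. The name compress_set is hypothetical, and the code assumes that each set starts with an empty dictionary and that positions count from 1, as the sample output suggests.

```python
import re


def compress_set(lines):
    """Compress the lines of one set, leaving all whitespace untouched."""
    positions = {}  # word -> 1-based dictionary position, shared by all lines

    def replace(match):
        word = match.group(0)
        if word in positions:
            return str(positions[word])
        positions[word] = len(positions) + 1
        return word

    # Only runs of lowercase letters are rewritten; spaces are left as-is.
    return [re.sub(r"[a-z]+", replace, line) for line in lines]


for line in compress_set([
    "the cat chased the rat while",
    "the dog chased the cat into the rat house",
]):
    print(line)
# the cat chased 1 rat while
# 1 dog 3 1 2 into 1 4 house
```

The dictionary is created once per set and shared across its lines, which is why "the" on the second line is already known.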

Sample Input

1
the cat chased the rat while
the dog chased the cat into the rat house

Sample Output

the cat chased 1 rat while
1 dog 3 1 2 into 1 4 house