Cats Claw
Computer-Assisted Technology Service
Computational Linguist's Automated Workbench

Analyzing principal parts

Instructions are below. If you use results from this tool, please acknowledge its creators: Raphael Finkel and Gregory Stump at the University of Kentucky. Send suggestions and comments to Raphael Finkel.

You can enter a chart of paradigms (not a plat) here and get back a plat ready for hand-tuning. One column of the chart should be labelled CONJ and another Gloss.
sample chart (Dakota)

Or enter your language's plat here:
or upload it from this file:
Assume input is in Latin encoding (instead of UTF-8)
Bit pattern of MPSs to force as distillations
Compute static principal parts | Quick version, if many principal parts
Compute adaptive principal parts
Compute dynamic principal parts
Group inflection classes
Compute center of inflection classes
Compute inflection-class predictabilities
Compute full predictabilities
empty set doesn't predict (that's what the book says)
Compute predictiveness
Compute MPS entropies
Compute inflection-class entropies
Weight entropies based on this key (DUPLICATES for inflection-class type frequency)
Compute surface forms
Compare surface forms to this file of ground truth
Build table of surface forms
Compute plat of stem referrals
Number of distillations in patterns
Expand plat based on stem referrals
Replace plat with the stem referrals
Stratify inflection classes by number of principal parts
Make the output more verbose
Show what inflection classes have the same signature
Restrict the plat to inflection classes with given keys
Restrict the plat to the MPS names with given substrings
Restrict the output to inflection classes with given keys
Output a data file restricted by given keys and MPS substrings


A language plat is presented as follows:

% any comments you want, running to the end of the line
% blank lines are acceptable

ABBR 1 1S1C % shorthand used in templates

IC       1sg 1pl 2sg 2pl
TEMPLATE 1Ao 1As 1At 1At
conj1    a   a   n   an
conj2    e   en  n   en
conj3    i   i   in  in
conj4    i   i   i   i
conj5    u   u   n   un  
conj6    u   un  un  un  

% continuation lines of the above are acceptable; just start
% them with the same word (such as IC or conj5).

STEM 1 present % just a comment describing the stem
STEM 2 past 
STEM 3 future
STEM 4 optative
STEM 5 frequentative

REFER conj1 2 -> 1
REFER conj2 2, 3 -> 1
REFER conj3 2-3 -> 1; 5 -> 4 % if there were 5 stems

LEXEME eat   conj1 1:ede   3:edu   4:ede  5:edr
LEXEME see   conj2 1:vida  4:vide  5:vidu
LEXEME sleep conj3 1:dorma 4:dormu

CLASS obstruent b d f g h k p s t v z ʒ

SANDHI a o | => o % edeao → edeo ; the '|' symbol means end of word
SANDHI a i o => io % dormaio → dormio ; spaces are segment boundaries 
SANDHI n [:obstruent:] => $1 om $1 % dormaont → dormaitomt

KEYS conj1 short FREQ=3
KEYS conj2 short FREQ=1
KEYS conj3 long  animate   FREQ=10
KEYS conj4 long  inanimate FREQ=2
KEYS conj5 long  deitic    FREQ=1
KEYS conj6 long  demonic   FREQ=1

ANALYZE ANY=edeao 1pl=edas 2pl=edeint
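
The REFER and SANDHI lines in the sample plat can be read mechanically. The sketch below is only an illustration in Python, not the tool's own code. It makes three assumptions: a referred stem is an exact copy of the stem it points to, sandhi rules are applied once each in the order listed, and `|` anchors a rule at the end of the word. The function names (`parse_refer`, `fill_stems`, `parse_sandhi`, `apply_sandhi`) are hypothetical, and rules that use a CLASS such as `[:obstruent:]` or a back-reference such as `$1` are not handled.

```python
import re


def parse_refer(line):
    """Parse 'REFER conj3 2-3 -> 1; 5 -> 4' into ('conj3', {2: 1, 3: 1, 5: 4})."""
    body = line.split("%", 1)[0]              # drop the trailing comment
    _keyword, ic, spec = body.split(None, 2)  # 'REFER', class name, referral clauses
    referrals = {}
    for clause in spec.split(";"):            # clauses are separated by ';'
        sources, target = clause.split("->")
        for item in sources.split(","):       # sources may be '2, 3' or a range '2-3'
            low, _, high = item.strip().partition("-")
            for stem in range(int(low), int(high or low) + 1):
                referrals[stem] = int(target)
    return ic, referrals


def fill_stems(stems, referrals):
    """Copy each missing stem from its referral target (assumed semantics)."""
    filled = dict(stems)
    for stem, target in referrals.items():
        filled.setdefault(stem, filled[target])
    return filled


def parse_sandhi(line):
    """Parse 'SANDHI a o | => o' into a (compiled regex, replacement) pair."""
    body = line.split("%", 1)[0].strip()      # drop the trailing comment
    left, right = body[len("SANDHI"):].split("=>")
    segments = left.split()                   # spaces are segment boundaries
    at_word_end = bool(segments) and segments[-1] == "|"
    if at_word_end:
        segments = segments[:-1]
    pattern = "".join(re.escape(s) for s in segments)
    if at_word_end:
        pattern += "$"                        # '|' means end of word
    return re.compile(pattern), "".join(right.split())


def apply_sandhi(rules, form):
    """Apply each rule once, in the order given (assumed ordering)."""
    for regex, replacement in rules:
        form = regex.sub(lambda match: replacement, form)
    return form


ic, referrals = parse_refer("REFER conj3 2-3 -> 1; 5 -> 4 % if there were 5 stems")
sleep_stems = fill_stems({1: "dorma", 4: "dormu"}, referrals)

rules = [
    parse_sandhi("SANDHI a o | => o % edeao → edeo"),
    parse_sandhi("SANDHI a i o => io % dormaio → dormio"),
]
print(sleep_stems)
print(apply_sandhi(rules, "edeao"), apply_sandhi(rules, "dormaio"))
```

Under these assumptions, the lexeme sleep needs only stems 1 and 4 to be listed, because stems 2 and 3 are filled from stem 1 and stem 5 from stem 4. The two literal sandhi rules reproduce the rewrites given in the plat comments (edeao → edeo, dormaio → dormio).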