Evaluation — ``selkie.nlp.dp.eval`` *********************************** The following functions are in the module selkie.nlp.dp.eval:: >>> from selkie.dp.eval import * >>> from selkie import ex >>> from selkie.dep import conll_sents evaluate -------- This is the main function. It takes a parser, a list of sentences with gold pgovrs and proles, and prints out evaluation information. The parser should place its output in the govr and role slots, not pgovr and prole. One may specify ``excludepunc=False`` to count punctuation tokens. (They are ignored by default.) One may provide ``output=`` *stream* to specify an output stream other than stdout:: >>> evaluate(parser, sents) ispunc ------ The function ispunc() returns True if all the characters in the given string have a Unicode category beginning with "P":: >>> ispunc('.') True >>> ispunc('Dr.') False eval_sent --------- The function eval_sent() evaluates a single sentence. Its arguments are *pred* and *truth*. It considers the govrs and roles of the predicted sentence, but the pgovrs and proles of the true sentence. (A projective dependency parser can produce non-projective output if it ever fails to attach a word, so the output of even a projective dependency parser is stored in the govr/role slots rather than the pgovr/prole slots.) The outputs are *las*, *uas*, *la*, *n*, where *las* is the number of words that have the correct govr and role, *uas* is the number of words that have the correct govr, *la* is the number of words that have the correct role, and *n* is the number of words. Nota bene: these are counts, not proportions. Note also that *n* will be less than the length of the sentence. The length of the sentence includes the root token (position 0), which is never included in *n*. Also, by default, punctuation tokens are ignored. (One can cause them to be counted by specifying ``excludepunc=False``:: >>> pred = next(conll_sents(ex.depsent3_pred)) >>> gold = next(conll_sents(ex.depsent3_gold)) >>> eval_sent(pred, gold) (2, 3, 2, 4) >>> eval_sent(pred, gold, excludepunc=False) (3, 4, 3, 5) compare ------- The function compare() prints out a detailed comparison of a predicted and a gold sentence:: >>> compare(pred, gold) 1 This G R 2 subj 2 subj 2 is G R 0 mv 0 mv 3 a 2 pt 4 det 4 test G 2 obj 2 prednom 5 * . 2 obj 2 prednom LAS: 2 4 0.5 UAS: 3 4 0.75 LA: 2 4 0.5 Punctuation tokens are marked with '\*' in the second column. Tokens marked 'G' contribute to the UAS score, tokens marked 'R' contribute to the LA score, and tokens marked 'G R' contribute to the LAS score.