Checklist
From Icelandic Parsed Historical Corpus (IcePaHC)
Manual checking only
1. Have you checked the POS tags?
a. If "það" appears, is it "ES" or "PRO"?
2. Have you checked lemmas?
3. Before beginning to parse a new file, right after preprocessing, use a text editor to check:
b. "er" for C or BEPI
c. "mikill", "margur", "lítill" are Q
d. "að"
e. "svo"
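
The text-editor search in item 3 can also be approximated with a short script. The following is only a sketch, not part of the project's tooling. It assumes the preprocessed file already holds bracketed leaves of the form (TAG word-lemma); the names flag_leaves, WATCH_WORDS and WATCH_LEMMAS are hypothetical. It lists every leaf whose word form or lemma is on the checklist, so the tag can be inspected by hand.

```python
import re

# Assumption: each leaf looks like "(TAG word-lemma)", e.g. "(PRO-N það-það)".
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")

# Word forms and lemmas named in the checklist above.
WATCH_WORDS = {"það", "er", "að", "svo"}
WATCH_LEMMAS = {"mikill", "margur", "lítill"}

def flag_leaves(text):
    """Return (tag, word, lemma) for every leaf that needs a manual look."""
    hits = []
    for tag, token in LEAF.findall(text):
        # Split on the last hyphen; leaves without a hyphen have no lemma.
        word, sep, lemma = token.rpartition("-")
        if not sep:
            word, lemma = token, ""
        if word.lower() in WATCH_WORDS or lemma.lower() in WATCH_LEMMAS:
            hits.append((tag, word, lemma))
    return hits

sample = "(IP-MAT (NP-SBJ (PRO-N það-það)) (BEPI er-vera) (ADJP (Q-N margt-margur)))"
for tag, word, lemma in flag_leaves(sample):
    print(tag, word, lemma)
```

The script only collects candidates; deciding whether "það" should be ES or PRO, or "er" C or BEPI, remains a manual judgement.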
Sanity checked by automatic queries
1. Does every finite (non-gapped) IP have a subject?
2. Are all argument NPs (i.e. NP-SBJ, NP-OB1, NP-OB2) dominated by IP?
3. Does every CP dominate an IP?
4. Are all IP-level constituents of an appropriate type for the IP?
(e.g. if a bare ADJP occurs at the IP level, the main verb must be a copular verb such as VERA, VERÐA or HEITA)
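
As an illustration of what checks 2 and 3 amount to, here is a minimal sketch. It is not the project's actual query set. It assumes each tree is a labelled bracketing such as (IP-MAT (NP-SBJ ...) ...) with no unlabelled wrapper node, and it reads "dominate" as immediate dominance, which is an assumption about the intended meaning. The function names parse, subtrees and check are hypothetical. Checks 1 and 4 are left out because they depend on project-specific lists of finite IP labels and copular verbs.

```python
import re

ARGUMENTS = ("NP-SBJ", "NP-OB1", "NP-OB2")

def parse(text):
    """Parse one bracketed tree into nested [label, child, ...] lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # terminal string
                pos += 1
        pos += 1                      # consume ")"
        return [label] + children

    return node()

def subtrees(tree):
    """Yield the tree and every non-terminal below it."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, list):
            yield from subtrees(child)

def check(tree):
    """Return a list of messages for violations of checks 2 and 3."""
    problems = []
    for t in subtrees(tree):
        label = t[0]
        kids = [c[0] for c in t[1:] if isinstance(c, list)]
        # Check 3 (assumed immediate): a CP needs an IP daughter.
        if label.startswith("CP") and not any(k.startswith("IP") for k in kids):
            problems.append(f"{label} has no IP daughter")
        # Check 2 (assumed immediate): argument NPs sit directly under an IP.
        if not label.startswith("IP"):
            for k in kids:
                if k.startswith(ARGUMENTS):
                    problems.append(f"{k} is under {label}, not IP")
    return problems

bad = "(CP-REL (NP-OB1 (N-A bók-bók)))"
for message in check(parse(bad)):
    print(message)
```

Prefix matching on labels means that variants such as CP-THT or an indexed NP-SBJ-1 are covered without listing them separately.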