Sanity checks

From Icelandic Parsed Historical Corpus (IcePaHC)

To run the sanity checks on a file:

  • cd into "icecorpus/parsing"
  • run the command:
./sanity.sh filename
  • IMPORTANT: give filename WITHOUT THE .psd EXTENSION here
  • Open filename.sanity.psd in the same directory and search for ZZZ. Fix the parse at each result.
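
To see at a glance how many flagged spots remain, the search for ZZZ can also be done from the terminal. The following is only a sketch: list_zzz is a hypothetical helper name, not part of icecorpus, and it assumes the flags appear literally as the string ZZZ in filename.sanity.psd.

```shell
# Hypothetical helper (not part of icecorpus): print every line of the
# sanity output that contains a ZZZ flag, prefixed with its line number.
list_zzz() {
  # Accept the base name with or without a trailing .psd extension.
  base="${1%.psd}"
  # grep -n prints "LINE:TEXT" for each matching line; it prints nothing
  # and returns a non-zero status when no ZZZ is left.
  grep -n 'ZZZ' "$base.sanity.psd"
}
```

After running ./sanity.sh filename, calling list_zzz filename in the same directory lists the lines that still need fixing.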

For Brynhildur and Hulda:

  • cd into "Dropbox/corpus"
  • run the command:
./lemma.sh filename
  • If you get the error "bash: ./lemma.sh: Permission denied", right-click lemma.sh, select Properties/Permissions, and tick the box "Allow executing file as program"
  • IMPORTANT: give filename WITHOUT THE .psd EXTENSION here
  • Open filename.sanity.psd in the same directory and search for ZZZ. Fix the parse at each result.
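
The "Permission denied" error can also be fixed from the terminal instead of through the file manager. A minimal sketch, assuming a standard Linux shell; ensure_executable is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: give the current user execute permission on a
# script and confirm that the change took effect.
ensure_executable() {
  # chmod u+x adds the execute bit for the file's owner;
  # [ -x FILE ] then checks that the file can actually be executed.
  chmod u+x "$1" && [ -x "$1" ] && echo "$1 is now executable"
}
```

From Dropbox/corpus, run ensure_executable lemma.sh once; after that ./lemma.sh filename should start normally.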

To open CorpusDraw and find only sentences with ZZZ:

Create the query file zzz.q with the following contents (a copy already exists in Dropbox/corpus):

node: $ROOT
query: (*ZZZ* exists)

Now open a terminal window and type:

CD zzz.q filename.sanity.psd

Start fixing the parse.