Punctuation

From Icelandic Parsed Historical Corpus (IcePaHC)

Only three punctuation marks appear to be used at the tag (head) level (the count is unconfirmed): " , .

If a sentence ends with an abbreviation that already carries its own period, an extra period can be added as a separate punctuation token (depending on the context, e.g. when the next token starts with a capital letter):

'var þá ofan á það lagt 60 rd.'

( (IP-MAT (BEDI var-vera)
	  (ADVP-TMP (ADV þá-þá))
	  (PP (ADV ofan-ofan)
	      (P á-á)
	      (NP (PRO-A það-það)))
	  (VAN lagt-leggja)
	  (NP-SBJ (NUM-N 60-sextíu) (NS-N rd.-ríxdalur))
	  (. .-.)))
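
The pattern in the tree above can be checked mechanically. The following is a minimal Python sketch, not part of the IcePaHC tooling: the names LEAF, leaves and has_extra_period are hypothetical, and it assumes that every terminal node has the shape (TAG form-lemma) with the lemma following the last hyphen. It lists the terminals of a bracketed tree and reports whether an abbreviation ending in a period is followed by a separate sentence-final period.

```python
import re

# Hypothetical helper names; assumes every terminal looks like "(TAG form-lemma)".
# Matches a terminal node such as "(NS-N rd.-ríxdalur)": one tag, one token.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")

EXAMPLE = """
( (IP-MAT (BEDI var-vera)
          (ADVP-TMP (ADV þá-þá))
          (PP (ADV ofan-ofan)
              (P á-á)
              (NP (PRO-A það-það)))
          (VAN lagt-leggja)
          (NP-SBJ (NUM-N 60-sextíu) (NS-N rd.-ríxdalur))
          (. .-.)))
"""


def leaves(tree):
    """Return (tag, form, lemma) triples for every terminal node in the tree."""
    result = []
    for tag, token in LEAF.findall(tree):
        # Split on the last hyphen, so "rd.-ríxdalur" gives ("rd.", "ríxdalur").
        form, _, lemma = token.rpartition("-")
        result.append((tag, form, lemma))
    return result


def has_extra_period(tree):
    """True if the last terminal is tagged "." and the one before it ends in a period."""
    terminals = leaves(tree)
    if len(terminals) < 2:
        return False
    (_, prev_form, _), (last_tag, _, _) = terminals[-2], terminals[-1]
    return last_tag == "." and prev_form.endswith(".")


if __name__ == "__main__":
    for tag, form, lemma in leaves(EXAMPLE):
        print(tag, form, lemma)
    print(has_extra_period(EXAMPLE))
```

For the example sentence, the last two terminals are the abbreviation rd. (tagged NS-N) and the added period (tagged .), so has_extra_period returns True. The check only describes what is in a tree; it does not decide whether the extra period should be added, which the text above leaves to context.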