Difference between revisions of "Nonstructural labels"

From Icelandic Parsed Historical Corpus (IcePaHC)

Latest revision as of 15:24, 12 August 2014

CODE

See also the PPCME2 and PPCEME documentation.

When unsure of a parse, add a comment (COM stands for comment):

(CODE ({COM:unsure_of_parse}))
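
As a minimal sketch, a comment node of this shape could be generated by a small helper. The function name `code_comment` is hypothetical, and joining words with underscores is an assumption inferred from the example above, not a documented rule.

```python
def code_comment(text):
    """Format free text as a CODE node holding a COM comment.

    Assumption: the comment must be a single token, so whitespace is
    replaced by underscores, as in the example (CODE ({COM:unsure_of_parse})).
    """
    token = "_".join(text.split())
    if not token:
        raise ValueError("comment text must not be empty")
    # Brackets and braces inside the comment would break the tree syntax.
    if any(ch in token for ch in "(){}"):
        raise ValueError("comment text must not contain brackets or braces")
    return "(CODE ({COM:" + token + "}))"


print(code_comment("unsure of parse"))
```

Rejecting brackets and braces up front keeps the generated node from unbalancing the tree it is inserted into.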

Foreign language passages

See also the PPCME2 and PPCEME documentation.

Foreign-language passages of more than one word are labeled with the language name (e.g. LATIN). If the passage forms its own clause, there are two options.

1) If the passage is direct speech or, e.g., a prayer, it is parsed as QTP, which immediately dominates LATIN (which in turn immediately dominates the FWs).

2) If the passage is not direct speech, it is parsed as LATIN at the clause level (immediately dominating the FWs).

Rule of thumb: if a word is not found in the Icelandic Dictionary, tag it as FW (foreign word).

( (LATIN (FW Assumptio-assumptio) (FW sancte-sancte) (FW Marie-marie)))
	  (IP-MAT=1 (CONJ en-en)
		    (PP (P í-í)
			(NP (OTHER-D öðru-annar) (N-D lífi-líf)))
		    (VB veita-veita)
		    (NP-OB2 (PRO-D oss-ég))
		    (NP-OB1 (ADJR-D meiri-mikill)
			    (N-D dýrð-dýrð)
			    (PP (P en-en)
				(CP-CMP (WNP-4 0)
					(C 0)
					(IP-SUB (NP-OB1 *T*-4)
						(NP-SBJ (PRO-N vér-ég))
						(MDPS KUNNIM-KUNNA)
						(ADVP-TMP (ADV nú-nú))
						(VB biðja-biðja))))))
	  (, .-.)
	  (NP-PRN (D-N SÁ-SÁ)
		  (D-N inn-inn)
		  (ADJ-N sami-samur)
		  (NPR-N Jesús-jesús)
		  (NPR-N Kristur-kristur)
		  (, ,-,)
		  (CP-REL (WNP-1 0)
			  (C ER-ER)
			  (IP-SUB (NP-SBJ *T*-1)
				  (PP (P með-með)
				      (NP (N-D feður-faðir)
					  (CONJP (CONJ og-og)
						 (NP (ADJ-D helgum-helga) (N-D anda-andi)))))
				  (VBPI (VBPI lifir-lifa) (CONJ og-og) (VBPI ríkir-ríkja)))))
	  (LATIN (FW per-per) (FW omnia-omnia) (FW secula-secula) (FW seculorum-seculorum))
	  (. .-.)))
	  (ADVP-TMP-RSP (ADV þá-þá))
	  (VBPI fer-fara)
	  (NP-SBJ (PRO-N hann-hann))
	  (PP (P til-til)
	      (LATIN (FW templum-templur) (FW Domini-dominur)))
	  (IP-INF-PRP (TO að-að)
		      (VB bera-bera)
		      (ADVP-DIR (ADV þar-þar))
		      (NP-OB1 (N-A reykelsi-reykelsi)))
	  (. .-.)))
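
Foreign-language nodes in trees like those above can be checked mechanically. The following is a rough sketch, not part of the corpus tooling: `parse_trees`, `iter_nodes` and `foreign_passages` are hypothetical helper names, and the reader assumes well-balanced brackets with `form-lemma` leaves. It collects the word forms under each LATIN node and reports whether every child is tagged FW.

```python
def parse_trees(text):
    """Parse bracketed trees into nested lists such as ['FW', 'per-per']."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced opening bracket")
    return stack[0]


def iter_nodes(node):
    """Yield every labelled node (a list whose first item is a string)."""
    if isinstance(node, list):
        if node and isinstance(node[0], str):
            yield node
        for child in node:
            yield from iter_nodes(child)


def foreign_passages(trees, label="LATIN"):
    """Return (word_forms, all_children_are_FW) for each node with `label`."""
    passages = []
    for tree in trees:
        for node in iter_nodes(tree):
            if node[0] != label:
                continue
            children = node[1:]
            words = [
                child[1].rsplit("-", 1)[0]  # keep the form, drop the lemma
                for child in children
                if isinstance(child, list) and len(child) == 2 and child[0] == "FW"
            ]
            passages.append((words, len(words) == len(children)))
    return passages


sample = "( (LATIN (FW Assumptio-assumptio) (FW sancte-sancte) (FW Marie-marie)))"
print(foreign_passages(parse_trees(sample)))
```

The same function works when the LATIN node is embedded, for example inside a PP, because `iter_nodes` walks the whole tree.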

META

See also the PPCME2 and PPCEME documentation.

Used, e.g., in chapter headings that are part of the author's text (and not the editor's).

FRAG

Fragments are grammatical utterances that consist of at least two constituents. The utterance, however, does not contain enough material to construct an IP.

REP

See the PPCHE documentation: http://www.ling.upenn.edu/hist-corpora/annotation/disfluencies.htm#rep

QTP

Quotation phrase

See also the PPCME2 and PPCEME documentation.

QTP and FRAG do not immediately dominate (idom) arguments of verbs. They can, however, immediately dominate NP-VOC, NP-ADV ...

When yes or no is an argument of a verb, it is parsed as QTP (and tagged as INTJ).
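
The restriction that QTP and FRAG do not immediately dominate verb arguments could be checked along the following lines. This is only a sketch under stated assumptions: trees are represented as hypothetical `(label, children)` tuples, `ARGUMENT_LABELS` is an illustrative set taken from the argument labels that appear in the examples on this page (it may not be complete), and the leaves are placeholders rather than corpus data.

```python
# Assumption: an illustrative, possibly incomplete set of argument labels.
ARGUMENT_LABELS = ("NP-SBJ", "NP-OB1", "NP-OB2")
RESTRICTED_PARENTS = ("QTP", "FRAG")


def misplaced_arguments(node):
    """Return (parent, child) label pairs where QTP or FRAG
    immediately dominates a node whose label marks a verb argument."""
    label, children = node
    problems = []
    for child in children:
        if not isinstance(child, tuple):
            continue  # a terminal string such as "form-lemma"
        child_label = child[0]
        # startswith also catches indexed labels such as NP-SBJ-1
        if label in RESTRICTED_PARENTS and child_label.startswith(ARGUMENT_LABELS):
            problems.append((label, child_label))
        problems.extend(misplaced_arguments(child))
    return problems


# Hypothetical trees with placeholder leaves, not taken from the corpus.
allowed = ("QTP", [("NP-VOC", [("N-N", ["form-lemma"])]),
                   ("INTJ", ["form-lemma"])])
flagged = ("FRAG", [("NP-SBJ", [("PRO-N", ["form-lemma"])]),
                    ("NP-ADV", [("N-A", ["form-lemma"])])])

print(misplaced_arguments(allowed))
print(misplaced_arguments(flagged))
```

Only immediate children are compared against the parent label, which mirrors the "immediately dominate" wording of the rule; arguments nested deeper, for example inside an embedded IP, are not flagged.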

Reference

See the PPCME2 and PPCEME documentation.