RP

From Icelandic Parsed Historical Corpus (IcePaHC)
Revision as of 12:36, 12 May 2011 by Joelcw (Talk | contribs) (UNDIR)

Following the English corpora, which in turn follow the Brown corpus, we tag a certain set of words as "RP" whenever they do not take a complement.

The Icelandic equivalents to the above list of words (see link) include at least the following (this list is in progress): Á, AF, AÐ, FRAM, FRÁ, FYRIR, INN, Í, NIÐUR, MEÐ, OF, TIL, UM, UPP, ÚR, ÚT, VIÐ, YFIR.

(Including MEÐ and VIÐ does not follow PPCME2, but follows the Early New High German corpus and the Penn Yiddish corpus.)

Note that Á is tagged RP when it does not take a complement, but P elsewhere. Where Á modifies a PP, it is parsed as an RP specifier of PP where possible.
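As a rough illustration of the basic rule only, the check can be sketched in Python as below. The function name `basic_rp_tag` and the boolean `has_complement` are hypothetical and not part of any corpus tooling. The sketch assumes the annotator already knows whether the word takes a complement, and it does not model any exceptions described elsewhere on this page.

```python
# Words from the (in-progress) list above that may be tagged RP.
RP_CANDIDATES = {
    "Á", "AF", "AÐ", "FRAM", "FRÁ", "FYRIR", "INN", "Í", "NIÐUR",
    "MEÐ", "OF", "TIL", "UM", "UPP", "ÚR", "ÚT", "VIÐ", "YFIR",
}


def basic_rp_tag(word, has_complement):
    """Return "RP" if the basic rule applies, otherwise None.

    None means "this rule does not decide the tag"; it does not mean
    that any particular other tag is correct.
    """
    if word.upper() in RP_CANDIDATES and not has_complement:
        return "RP"
    return None


print(basic_rp_tag("upp", has_complement=False))
print(basic_rp_tag("upp", has_complement=True))
```

The first call returns "RP", and the second returns None because the word takes a complement.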

Note that the following are tagged ADV, projecting an ADVP-DIR, ADVP-LOC, or ADVP-TMP: AFTUR, ÁFRAM, BURT, EFTIR, HJÁ, INNAN, INNI, FRAMAN, FRAMMI, OFAN, NEÐAN, NIÐRI, UNDAN, UNDIR (but see below), UPPI, UTAN, ÚTI.

These are generally clause (IP)-level constituents, unless they appear in the specifier of a PP, as in the following examples:

(PP (RP Upp-upp)
    (P af-af)
    (NP (N-D héraði-hérað)
        (D-D því-það)))

(PP (RP Fram-fram)
    (P-D af-af)
    (NP (N-D dal-dalur) (D-D þessum-þessi)))
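To check this configuration mechanically, a bracketed tree can be read into nested lists and searched for RP daughters of PP that precede the P head. The following is a minimal Python sketch. The helpers `parse` and `rp_specifiers` are hypothetical names, not part of the corpus tools, and the sketch assumes a single well-balanced tree whose leaves have the form-lemma shape shown above.

```python
EXAMPLE = """
(PP (RP Fram-fram)
    (P-D af-af)
    (NP (N-D dal-dalur) (D-D þessum-þessi)))
"""


def parse(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # leaf token, e.g. "Fram-fram"
                pos += 1
        pos += 1  # consume the closing bracket
        return [label] + children

    return node()


def rp_specifiers(tree):
    """Collect leaves of RP nodes that sit in a PP before its P head."""
    found = []
    label, children = tree[0], tree[1:]
    if label == "PP":
        for child in children:
            if not isinstance(child, list):
                continue
            if child[0].split("-")[0] == "P":
                break  # reached the head; later RPs are not specifiers
            if child[0] == "RP":
                found.append(child[1])
    for child in children:
        if isinstance(child, list):
            found.extend(rp_specifiers(child))
    return found


print(rp_specifiers(parse(EXAMPLE)))
```

An unbalanced tree makes `parse` raise an IndexError, which can serve as a crude bracket check on hand-edited examples.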

Note that if a particle constitutes the *only* complement of a preposition, then it is exceptionally tagged ADV and projects an ADVP. This is in order to maintain the generalization that prepositions take phrasal complements. See e.g. FRÁ in Treatment of individual words.

Usually, ADVPs do not dominate RPs (see here).

UNDIR

UNDIR is tagged RP *only* where the locative meaning is impossible in context, as in the following cases:

1) in combination with the verb STANDA when it means "understand" (not when it literally means "to stand underneath something").

2) in combination with GANGA meaning "undergo" (not when it literally means "to go underneath something").

3) in UNDIRÞRYKKJA

4) in UNDIRORPINN

5) in UNDIRVÍSA
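The five cases above amount to a small lookup, sketched here in Python. The function `undir_tag` and its `sense` argument are hypothetical: whether the locative meaning is possible is a judgment the annotator makes, so the sense has to be supplied by hand. The fallback to ADV follows the ADV list earlier on this page, and uses of UNDIR with a complement are not modelled.

```python
# Compounds in which UNDIR is tagged RP (cases 3 to 5 above).
NONLOCATIVE_COMPOUNDS = {"UNDIRÞRYKKJA", "UNDIRORPINN", "UNDIRVÍSA"}

# Verbs that combine with UNDIR in a non-locative sense (cases 1 and 2).
NONLOCATIVE_VERB_SENSES = {"STANDA": "understand", "GANGA": "undergo"}


def undir_tag(host, sense=None):
    """Return "RP" for the listed non-locative cases, otherwise "ADV".

    host  -- the compound containing UNDIR, or the verb it combines with
    sense -- the annotator's reading of the verb, e.g. "understand"
    """
    host = host.upper()
    if host in NONLOCATIVE_COMPOUNDS:
        return "RP"
    if NONLOCATIVE_VERB_SENSES.get(host) == sense and sense is not None:
        return "RP"
    return "ADV"


print(undir_tag("standa", sense="understand"))
print(undir_tag("standa", sense="stand underneath"))
```

The first call returns "RP" and the second returns "ADV", since the literal reading of STANDA leaves the locative meaning available.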

Other particles

RPX

See here.

FP

See here.