Treatment of individual words
AÐEINS, tag as FP (focus particle), cf. English "only".
AF HVERJU 'why', parse (WPP (P af) (WNP (WPRO-D hverju)))
AFTAN can be ADV or P, depending on whether it takes a complement or not.
AFVEGA is tagged ADV.
ALLA REIÐU, parse NP-TMP.
ALLEINASTA, ALLEINA is generally used as a focus particle and accordingly tagged FP, as with English ONLY in focus particle use.
ALLTAF is ADVP-TMP. When written ALLT AF, tag it as ADV21 allt ADV22 af, projecting ADVP-TMP.
ALLT AF LÉTTA, "af létta" is a PP and the lemma of létta is létti.
ALLT TIL + NP (e.g. "allt til enda veraldar"), the whole thing is a PP immediately dominating Q-N allt:
(PP (Q-N allt-allur) (P til-til) (NP (ADJS-G yðsta-ytri) (NP-POS (N-G jarðar-jörð)) (N-G enda-endi)))
ALLS can be an ADV or P. ALLS is always tagged P when it introduces a CP-ADV, in which case it has a meaning akin to "þar sem" or "af því að".
ALÞING(I) is tagged NPR rather than N.
ANGRA is *not* treated as having an accusative subject. So in sentences like "Hún angraði þau" (= "She angered them"), "Hún" is parsed as the NP-SBJ. This analysis is based on the intuitions of modern speakers, but note that it may not be the correct analysis for *all* texts and periods of the language.
ANNAÐHVORT is tagged OTHER+WPRO. It should be left without a case tag only when it functions as a correlative conjunction.
ANNAÐ TVEGGJA, tag as NP-ADV
ANNAR, in most cases is tagged as OTHER; AÐRIR as OTHERS. However, if ANNAR clearly means the ordinal number "second" in context, it is tagged ADJ as all ordinal numbers are.
ANNARS, when it occurs alone, without being assigned the genitive case, parse as NP-ADV.
ANNO Latin "year", used for dates, is tagged FW.
ARNA/ARNI the swearing-expletive-like element, as in "skituna þá arna", is simply tagged N.
AUK can be P or ADV, depending on whether it takes a complement or not.
ÁÐUR is ADV when it occurs alone, projecting an ADVP-TMP. But when it introduces an adverbial clause alone, it is a P. When ÁÐUR introduces a comparative clause (which has an adverbial function) along with "en", see ÁÐUR EN below.
ÁÐUR EN, ÁÐUR EN AÐ: ÁÐUR is tagged ADVR and projects an ADVP-TMP. The EN is a P and frequently takes a CP-CMP complement; see the documentation there (cf. also FYRR EN, FYRR EN AÐ). Note that this construction sometimes occurs without the EN in older texts.
ÁLÍKA is tagged ADVR when it introduces a comparative clause.
ÁN, this preposition assigns genitive in Modern Icelandic, but in Old Icelandic, such as Homiliubok (12th century), Thorlakur (13th century), Alexander (13th/14th century), Marta (14th century), Bandamenn (15th century), Ector (15th century) and Judit (15th century), it sometimes assigns dative.
Á BAK VIÐ, the whole expression is a PP; the PP headed by VIÐ is the complement of the noun BAK. In the cases where the preposition Á is missing, BAK VIÐ, parse it nevertheless as a PP with an empty preposition. This is similar to the Á MÓTI expression (which takes an NP-COM complement).
BAKA is treated as NS-G in the PP til baka.
BARA is tagged FP.
BÁÐIR, tagged as Q.
BRAUT undeclined "braut" indicating motion away ("abroad"), either occurring alone or inside a PP, is tagged ADV. Any declined forms or otherwise clearly nominal forms are tagged N.
BRÁÐ, as in Í BRÁÐ, Í BRÁÐINA, is a noun projecting an NP.
BRÁTT, tag as ADVP-TMP.
BURTSÉÐ is parsed as BURT$ $SÉÐ and tagged as ADV VAN.
BÚA in the periphrastic perfect construction VERA BÚINN AÐ X (where X is some VB forming an IP-INF with AÐ), BÚA is tagged VBN, not VAN.
DAGLEGANA, and some other adverbs ending with "lega+na", are temporal (cf. also NÝLEGANA)
DÆMI is N, and "til dæmis" can project a FRAG in cases such as the one below:
( (QTP (" ") (PP (P Af-af) (NP (PRO-D því-það) (, ,-,) (CP-QUE-SPE-PRN (WNP-1 (WPRO-N hvað-hver)) (C 0) (IP-SUB-SPE (NP-ADV-XXX *T*-1) (NP-SBJ (PRO-N þér-þú)) (VBPI veljið-velja) (NP-OB2 (PRO-D yður-þú)) (NP-OB1 (ADJ-A góða-góður) (NS-A vini-vinur) (, ,-,) (FRAG (PP (P t.-til) (NP (N-G d.-dæmi))) (NP (N-A herra-herra) (NPR-A Þorlák-þorlák)))))))) (. .-.) (" ")))
EF TIL VILL, e.t.v. 'maybe, perhaps'
(NP-TMP (OTHER-A Annan-annar) (N-A dag-dagur) (ADV eftir-eftir))
EIGIN, tag as ADJ.
EINMITT is generally tagged ADV, but it may also have a focus particle use, and so the tagging convention may be revised (to FP) in later versions of the corpus.
EINN, usually tagged as ONE. However, if it means "alone" in a copular clause (e.g. "Jón var þar einn"), it is tagged ADJ. Also, it can be tagged FP following the English corpora in the following case:
"When ONE means ONLY, ALONE and follows the noun or pronoun it focuses or when it follows NOT in the meaning NOT ONLY, it is treated as a focus particle (FP)."
EINN can also be in the plural, in which case it is tagged ONES.
EINNIG, EINNINN 'also', tag as ALSO
EINS, meaning 'alike' as in ekki fór eins fyrir honum og henni, tag as ADJ. Otherwise, ADVR. In the EINS OG construction (a type of comparative) or any other comparative construction (see CP-CMP), "eins" is tagged ADVR. See EINS OG in ADJP#ADJ_heads_of_ADJP and ADJP#ADVR_heads_of_ADJP. Also, "eins" is ADVR in "undir eins".
EINSKONAR and margskonar, etc: Q+N-G, projecting NP-POS.
EINUNGIS is tagged FP.
EITTHVAÐ is tagged ONE+Q.
ELLEGAR, "otherwise", is tagged ADV and projects ADVP.
ENDA, usually tagged as ADV, but it can be CONJ in cases where it clearly conjoins clauses. ENDA can also be tagged P, but *only* where it *clearly* introduces a subordinate clause of the CP-ADV type; in this latter case ENDA usually means something like "on the condition that", and it introduces a CP-ADV without a C node but with V-to-C movement of a conditional verb: "enda sé hann svo lítillátur..."
ENGINN, tagged Q
ER When it means 'which, that' it is a complementizer of a relative clause (CP-REL). When ER means "when", there are two possibilities (1) if there is no antecedent we take it to be a C as before, projecting a CP-ADV with no wh-word (as in the CP-ADV complement of "þegar") (2) if there is a temporal antecedent it introduces a CP-REL clause. Rarely, but on occasion ER can also be a complementizer projecting a CP-THT clause. (Check the latter parse when you find it in the corpora, as there may be confusion on this point).
ETC as in the English corpora, "etc" is tagged FW. It can appear at the clause level as FW in some cases, though it generally functions as an adverb phrase there.
EYKT is tagged N and projects NP-TMP. It means "half past 3 o'clock".
FEIKN meaning "a great quantity" is tagged N and usually projects a NP-MSR.
FIRRUM is tagged ADV, projecting ADVP-TMP, like FORÐUM.
FJARRI, tag either as ADVR or ADJR and lemmatize as FJÆR. The superlative of FJARRI is FJARST (ADVS) or FJARSTUR (ADJS) (both lemmatized as FJÆR); the superlative of FJÆR is FJÆRST (ADVS) or FJÆRSTUR (ADJS).
FJÓRÐUNGUR is tagged N, like HUNDRUÐ (not like HÁLFUR).
FRAMVEGIS, o.s.frv., og svo framvegis, etc: tagged ADV. These words can project an ADVP which can be coordinated with any category, as in the examples below. Note that when "svo" appears, it is attached at the level of CONJ, not inside the ADVP headed by FRAMVEGIS.
(PP-1 (P about) (NP (NP (NS Males)) (CONJP (CONJ and) (NP (NS Females))) (, ,) (CONJP (CONJ and) (ADVP (ADV so)) (ADVP (ADV forth)))))
(NP-PRN-1 (NP (N-D stöðuglyndi-stöðuglyndi) (CONJ og-og) (N-D sparsemi-sparsemi) (IP-INF-PRP (TO að-að) (VB passa-passa) (RP upp-upp) (PP (P á-á) (NP (N-A heilsu-heilsa) (NP-POS (PRO-A sína-sinn)))))) (CONJP (CONJ og-og) (ADVP (ADV svo-svo)) (ADVP (ADV framvegis-framvegis)))) (. .-.)))
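The Icelandic examples in these guidelines write each leaf as a tag followed by form-lemma. As a reading aid, here is a minimal Python sketch that lists the leaves of such a tree. The function name `leaves` and the regular expression are illustrative assumptions, not part of any corpus tooling, and the sketch does not treat empty categories such as traces specially.

```python
import re

# A leaf has the shape "(TAG token)": a tag, one space, and a token that
# contains no spaces or parentheses.
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")

def leaves(tree):
    """Return (tag, form, lemma) triples for the leaves of a bracketed tree."""
    result = []
    for tag, token in LEAF.findall(tree):
        if "-" in token:
            # Split on the last hyphen so hyphens inside the form survive.
            form, _, lemma = token.rpartition("-")
        else:
            # No lemma attached, as in the English example above.
            form, lemma = token, None
        result.append((tag, form, lemma))
    return result

tree = "(CONJP (CONJ og-og) (ADVP (ADV svo-svo)) (ADVP (ADV framvegis-framvegis)))"
print(leaves(tree))
# [('CONJ', 'og', 'og'), ('ADV', 'svo', 'svo'), ('ADV', 'framvegis', 'framvegis')]
```

The output makes the attachment visible at the leaf level: "svo" and "framvegis" are separate ADV leaves, each in its own ADVP under the CONJP.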
FREMI, tag as ADV
FYRIRGEFA when it takes 2 arguments, the dative (usually animate) one is NP-OB2, and the accusative one (the sin to be forgiven) is NP-OB1.
FYRIR OFAN, parse as recursive PPs
FYRST, tagged as P introducing CP-ADV when the meaning is 'since', as in I will do it since you won't. When it is a temporal adverb, it is tagged ADVS (though there may be some inconsistency about whether it is tagged ADV or ADVS in the corpus).
FYRSTA unlike the English corpora, FYRSTA is tagged ADJ (not ADJS, i.e. not superlative), projecting an NP, in the PP í fyrstu 'at first' but not as ADV in that case. A strong argument for not doing it as in the English corpora is that FYRSTA can have a determiner, cf. í fyrstunni. For the ordinal number form FYRSTA as in í fyrsta skipti 'for the first time', see FYRSTI. For the temporal adverb parallel to English first, see FYRST.
(PP (P í) (NP (ADJ-D fyrstu)))
(PP (P í) (NP (ADJ-D fyrstu$) (D-D $nni))))
FYRSTI, the ordinal number FYRSTI is tagged ADJ.
GAMAN is tagged as an N by default. However, when modified by an adverb it is treated as an adjective.
GERA is tagged DO, DODI, DOPI, etc., in all meanings. GERAST, however, is not tagged DO; GERAST is VB in the meanings "to happen" and "to become" in which case it takes a predicate. However, it is wise to include both DO and VB in searches for GERAST. See also Lemmatization.
GJÖRSVOVEL (subject to revision) we split this up into VBPI, ADV, and ADV, and parse normally. In this way, "við bara gjörsvovel og veiðum hann" (= "we just go ahead and catch him") will be split into two matrix tokens, and it will be necessary to search for "(VBDI gjör$)" in order to find such examples.
GIFTA can be a double object verb, taking NP-OB1 and NP-OB2.
GIFTAST where this verb takes a single object, who is the person that the subject is marrying, that object is NP-OB2.
GÆR as in "í gær" is tagged N-A.
HANDA projects NP in 'til handa honum' but PP in 'handa honum'
HÁTTUR, see MEÐ SAMA HÆTTI
HEILL tagged ADJ.
HEILSA takes an NP-OB1 object, unlike ÞAKKA.
HEIM, HEIMA is tagged ADV. This is not like "home" in the English corpora, which is N. HEIMA is different because unlike English "home", HEIMA is generally not used as a noun (except in the construction "að eiga heima").
HEIMKOMINN "heim" is split off as an ADV projecting an ADVP-DIR, and "kominn" is VBN.
HELDRI is ADJR, see HELSTUR for ADJS.
HELDUR is ADVR; where it means "but", we assume there is a silent "but" or no conjunction. HELDUR can occasionally be tagged FP, especially if it appears to participate in the NEG...BUT construction. In the FP use, HELDUR can be translated as English "only" (in the NEG...BUT construction, NEG and NEMA together mean "only"). Click on the link for more information, as well as for examples of the HELDUR EN ("rather than") construction.
Please note that the NEG...BUT construction is *not* the same (does not have the same meaning as) NEG inside of a conjunction with HELDUR, which also occurs.
HELSTUR is tagged ADJS. HELST can also occur as an ADVS.
HINN is tagged D. It is still tagged D even when it is used in the meaning 'other'.
HINUMEGIN meaning "on the other side" is tagged N-D, and it usually takes an NP-POS complement, as in "hinumegin árinnar" (="on the other side of the river"). It projects an NP-ADV.
HÉR 'here' is usually tagged as ADVP-LOC. When it does not have a locative meaning, as in hér eftir 'from now on', it is only tagged as ADVP.
HUNDRAÐ and other quantity words that can occur in the plural (e.g. ÞÚSUND, TYLFT) are tagged N or NS, rather than NUM. This is following the English corpora guidelines for the plurals of such quantity words.
HVAÐ, sometimes WADV, as in HVAÐ ER ÞETTA MIKIÐ? (similar to HVE MIKIÐ ER ÞETTA?).
HVAÐA, as in Hvaða fólk er þetta, is tagged WD.
HVAÐAN AF, WPP like English whereto
HVAÐVETNA, HOTVETNA, or HVERSVETNA, usually tagged as Q, projecting an ADVP-LOC. However, in older texts, it can also be a wh-word.
HVAR, meaning 'where' tag as WADV
HVARVETNA, HORVETNA, when it does not introduce a CP, it is tagged as WADV projecting ADVP-LOC.
HVERIGUR is sometimes tagged WD.
HVER meaning 'each' (as in HVER ANNAR 'each other') is tagged Q, but WPRO when it means 'who' (interrogative pronoun or relative pronoun).
HVERGI meaning "nowhere" is tagged Q+ADV, it projects ADVP-LOC or ADVP-DIR. HVERGI can also be a quantifier, meaning 'every (one)'.
HVERT is tagged as WADV.
HVÍ introduces CP-QUE. It usually means 'why' and is tagged WADV. However, in older Icelandic, it sometimes is the dative form of HVAÐ 'what' (as in "Hví sætir það?"). Then it is tagged WPRO-D.
HVÍLÍKUR can be tagged SUCH, or it can be WD, in which case it functions as the wh-word counterpart to words like SLÍKUR and ÞVÍLÍKUR.
HVOR, as in hvor hjá öðrum 'each with the other'
HVORKI, HVORTKI meaning "neither" and used with NÉ (= "nor") is tagged CONJ.
HVORT usually tagged WQ, in which case it means "whether" and introduces an indirect question. Occasionally it can be WPRO and project a WNP, as in the case of English "whether" when it means "which of two", and HVORT is tagged WPRO in the expression "hvort sem er" (see CP-FRL).
HVORTVEGGJA, HVORTTVEGGJA, and hvorirtveggja, hvorartveggja, etc., are tagged Q+NUM. HVORTVEGGJA is sometimes used as a correlative conjunction, similar to ANNAÐHVORT ... EÐA.
INNANTIL, utantil, útifyrir, etc. are tagged ADV and generally project an ADVP-LOC.
JAFNFRAMT is tagged ADVR+P.
JAFNFÆTIS is tagged ADVR+ADV and frequently has an NP-CMP sister.
KANNSKI 'perhaps, maybe', tagged ADV. KANN SKE is tagged (ADV (ADV21 KANN) (ADV22 SKE))
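The numbered-tag pattern for a single word written as two can be generated mechanically. The following Python sketch reproduces the bracketing given for KANN SKE and for SEM AÐ; the helper name `split_word` is an assumption made for illustration, and only the two-part case documented here is covered.

```python
def split_word(tag, first, second):
    """Bracket one word written as two: (TAG (TAG21 first) (TAG22 second))."""
    return f"({tag} ({tag}21 {first}) ({tag}22 {second}))"

print(split_word("ADV", "KANN", "SKE"))     # (ADV (ADV21 KANN) (ADV22 SKE))
print(split_word("C", "sem-sem", "að-að"))  # (C (C21 sem-sem) (C22 að-að))
```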
KONAR, tag N-G. It is usually modified by a quantifier, cf. ALLS KONAR, and projects NP-POS. When written as one word, ALLSKONAR is tagged Q+N-G; similarly EINSKONAR is tagged Q+N-G.
KRING, KRINGUM 'around, round', as in round the edges of the flowers, is tagged P:
(PP (P kringum) (NP (PRO-A hana)))
As in the case of MILLI, KRING(UM) always projects a PP, even if it is sometimes intransitive.
(IP-SUB (ADVP-TMP *T*-2) (NP-SBJ (NPR-N Pétur-pétur)) (VBDI ferðaðist-ferðast) (PP (P um-um) (PP (P kring-kring))) (IP-INF-PRP (NP-OB1 (Q-G allra-allur)) (TO að-að) (VB vitja-vitja)))))
( (IP-MAT (CONJ og-og) (NP-SBJ (Q-N allar-allur) (NS-N ekkjur$-ekkja) (D-N $nar-hinn)) (VBDI flykktust-flykkjast) (ADVP-LOC (ADV utan-utan)) (PP (P um-um) (PP (P kring-kring) (NP (PRO-A hann-hann)))) (IP-PPL (VAG grátandi-gráta))))
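The rule that KRING(UM) always projects a PP can be checked mechanically. Below is a minimal Python sketch under stated assumptions: `parse` and `kring_outside_pp` are hypothetical helper names, the parser handles only simple labelled subtrees (not the unlabelled outer wrapper of a full corpus token), and the last input is an invented ill-formed tree used only to show a violation.

```python
import re

def parse(text):
    """Parse a labelled bracketing into nested (label, children) tuples.

    A leaf is (tag, [token]); an internal node is (label, [subtree, ...]).
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                       # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # terminal string
                pos += 1
        pos += 1                       # consume ")"
        return (label, children)

    return node()

def kring_outside_pp(tree):
    """Return True if a P leaf with lemma kring/kringum is not directly under a PP."""
    label, children = tree
    for child in children:
        if isinstance(child, str):
            continue
        child_label, grandchildren = child
        is_leaf = len(grandchildren) == 1 and isinstance(grandchildren[0], str)
        if is_leaf:
            lemma = grandchildren[0].rpartition("-")[2]
            if (child_label == "P" and lemma in ("kring", "kringum")
                    and not label.startswith("PP")):
                return True
        elif kring_outside_pp(child):
            return True
    return False

transitive = parse("(PP (P um-um) (PP (P kring-kring) (NP (PRO-A hann-hann))))")
intransitive = parse("(PP (P um-um) (PP (P kring-kring)))")
invented_bad = parse("(ADVP (P kring-kring))")   # hypothetical violation

print(kring_outside_pp(transitive))    # False
print(kring_outside_pp(intransitive))  # False
print(kring_outside_pp(invented_bad))  # True
```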
KRINGUR is a noun, as in í krók og (í) kring
LANGTUM, parsed as NP-MSR.
LENGI is ADV, as in "hann var þar lengi" (lit. he was there long, i.e. 'he stayed there for a long time'). However, LENGI, like LANGUR, frequently projects an NP-MSR. Not to be confused with forms of LANGUR.
LIFA (verb, meaning live), often takes NP-MSR. "He lived 80 years."
LIFANDI, tag VAG
LÍFS, usually used as a predicate, "Hann er lífs" 'He is alive'. LÍFS projects an NP-POS which projects NP-PRD. With verbs like KOMAST, as in "Hann komst þaðan lífs", LÍFS projects NP-ADV.
LÍKA 'also', tag as ALSO; can exceptionally be ADVR and license a CP-CMP in Oddur Gottskálksson's New Testament.
LÍKT is ADVR when it licenses a comparative.
LÍTILL, LÍTIÐ 'little, not much', tag as ADV in "Þær þekktust lítið", cf. "Þær þekktust vel". Otherwise it is usually Q, QR, or QS, parallel to MIKIÐ provided it cannot be replaced by "smár" in the given usage. Anywhere that LÍTILL can be replaced by "smár", "smærri", or "smæstur", it is tagged ADJ, ADJR, or ADJS as appropriate. In cases in doubt (as to the precise meaning of the word in context), the default is Q (or QR, QS). See also NP-MSR.
MARGUR, tagged Q
MEÐAN, Á MEÐAN, MEÐAN Á, Á MEÐAN Á; usually tagged P, projecting a PP and taking a CP-ADV complement, as with English "while". When MEÐAN takes no complement, it is tagged ADV and projects ADVP (this is the parse whether or not MEÐAN is the complement of another preposition). When MEÐAN takes no complement and occurs without Á, it generally projects an ADVP-TMP.
MEÐFERÐ is tagged N and frequently projects an NP-ADV, as in "En-en kóngar-kóngur þeir-sá sem-sem þú-þú hefir-hafa meðferðar-meðferð eru-vera mínir-minn fangar-fangi" (usually "meðferðis" in the modern language).
MEGIN tagged N. In "öðru megin", MEGIN is tagged N-D and the phrase projects NP-ADV.
MIKIÐ, MIKILL This is tagged as a quantifier, Q, QR (for MEIRA), or QS, when it cannot be replaced by "stór", "stærri", "stærstur". If it can be replaced by "stór", then it is tagged ADJ, ADJR, or ADJS as appropriate. In cases in doubt (as to the precise meaning of the word in context), the default is Q (or QR, QS). See also NP-MSR.
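The substitution test for MIKILL can be written as a small decision procedure. This Python sketch assumes the annotator has already judged whether "stór" (or "stærri", "stærstur") could replace the word in context; the function name, the degree labels, and the use of None for "in doubt" are illustrative choices, not part of the annotation scheme.

```python
Q_TAGS = {"positive": "Q", "comparative": "QR", "superlative": "QS"}
ADJ_TAGS = {"positive": "ADJ", "comparative": "ADJR", "superlative": "ADJS"}

def tag_mikill(degree, replaceable_by_stor):
    """Pick the tag for a form of MIKILL.

    replaceable_by_stor is True, False, or None when the annotator is in doubt.
    """
    if replaceable_by_stor is True:
        return ADJ_TAGS[degree]
    # Clear quantifier reading, or the default when the meaning is in doubt.
    return Q_TAGS[degree]

print(tag_mikill("comparative", False))  # QR
print(tag_mikill("positive", True))      # ADJ
print(tag_mikill("superlative", None))   # QS
```

The same test, with "smár" in place of "stór", is described for LÍTILL, which additionally has an ADV use that this sketch does not cover.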
MITT can be the neuter form of the ADJ "miður", but when it means "in the middle" and does not agree with some argument in number and case, we tag it ADV and it projects an ADVP-LOC.
When MITT is a measure phrase, it is ADJ-A and projects NP-MSR.
MIÐUR as in "því miður":
(ADVP (NP-MSR (PRO-D því-það)) (ADVR miður-miður))
MISKUNNA takes NP-OB1.
See also NP-MSR.
MJÖG is tagged ADV. When it occurs alone and means "much" or "a lot", it projects NP-MSR.
MÓTI as in á móti honum is tagged N, even when it means "facing", e.g. "hvor á móti öðrum". In this construction, "móti" is considered a dative N taking an NP-COM, analogously to English "side"; see the PPCME2/PPCEME guidelines on Complements of N and NP-COM. When the preposition is missing, parse the whole phrase as a PP with a silent P.
NÉ is tagged CONJ, like English "nor"
NEI 'no', tag as INTJ.
NEMA tagged as P, analogously to English "except". NEMA can occasionally be tagged FP, especially if it appears to participate in the NEG...BUT construction. In the FP use, NEMA can be translated as English "only" (in the NEG...BUT construction, NEG and NEMA together mean "only"). See also HELDUR and EN.
NOKKUR usually Q. But when NOKKUÐ means "quite" or "somewhat", it is tagged ADV.
NÓGU, as in "nógu góður", is tagged ADVR.
NÓGUR, GNÓGUR is tagged ADJR, and it frequently licenses an IP-INF-DEG or CP-DEG.
NÁLÆGT is generally tagged ADJ and frequently projects an ADJP-LOC, like NÆR. It can also be ADV.
NÆR, NÆRRI, NÆSTUM, are either ADV or ADJ. See the discussion in ADVP#Complements_of_ADVP and ADJP#Complements_of_ADJ. Note that NÆR can sometimes mean 'when', in which case it is ADV. NÆR can also be the wh-word "when" in Vídalínspostilla, in which (exceptional) case it is tagged WADV.
NÆST meaning "next" in "því næst" and "þessu næst" is tagged ADVS.
NÆSTUR when it is morphologically an adjective is tagged ADJS, whether prenominal or postnominal.
OGSVO can be tagged ALSO, if it has that meaning.
OF tagged as ADVR when it means 'too' as in 'too much', P when it takes a complement, and RP otherwise (the last case applies often when of is a word that has no obvious meaning in Old Icelandic)
RAUNAR, tagged as ADV.
SAKIR or SÖKUM is always tagged as NS, projecting an NP. It frequently projects a PP with a silent preposition, like MÓTI.
SAMUR 'same', tag as ADJ
SANNLEGA, two of these, SANNLEGA, SANNLEGA, are parsed as one ADVP, cf. English, TRULY, TRULY (Bible)
SEINN usually an ADV projecting an ADVP-TMP, when it is a clausal modifier denoting the time that something happened. It can also be tagged ADJ when it modifies a noun. See the PPCME2/PPCEME guidelines on EARLY and LATE.
SEM is always tagged as a complementizer, C. This is true even for comparatives such as feitur sem svín, 'as fat as a pig'; see CP-CMP for a full discussion. This is not the same as the treatment of "as" in the PPCME2, PPCEME, and in general, our treatment of comparatives differs somewhat. All comparatives are treated as clausal, i.e. involving a CP-CMP, in Icelandic.
When SEM introduces an adverbial clause, it is still a complementizer, and it simply projects a CP-ADV in which it occupies the C position.
SEM AÐ is treated as a single C, like: (C (C21 sem-sem) (C22 að-að)).
SÉRHVER is tagged Q, just like the quantifier usage of HVER.
SÍÐAN, is usually an ADV projecting ADVP-TMP. It can also be a preposition (Ég hef verið hér síðan í gær 'I have been here since yesterday') which can introduce CP-ADV (Ég hef verið þreyttur síðan þú komst heim 'I have been tired since you came home')
SÍÐASTA is tagged ADJ (not ADJS, i.e. not superlative), projecting an NP, in the PP að síðustu(nni) 'at last'
SÍÐUR is tagged QR when it means "less", and it projects an ADVP in the PP "að síður".
SJÁLFUR is tagged PRO, and is parsed as an NP-PRN when it modifies another pronoun, parallel to emphatic "himself", "herself", etc., in the English corpora.
SKIPA, when it means 'appoint' it takes an NP-SPR.
SKÖMMU, as in SKÖMMU SÍÐAR, is NP-MSR.
SMÁR, tagged ADJ. But "smám saman":
(NP-MSR (ADV Smám) (ADV saman))
SODAN, SODDAN, SVODDAN tagged SUCH.
SPYRJA, 'ask' takes ((NP-SBJ NOM) SPYR (NP-OB2 ACC) (NP-OB1 GEN))
STRAX is ADVP-TMP, it sometimes introduces CP-CMP, as in STRAX OG, STRAX SEM.
SVO is tagged as ADVR when it is a degree adverb, e.g. when it modifies an adjective or another adverb or occurs in a svo...að clause. As with English "so", when "svo" is not used in a degree sense (ADVR) or as a preposition (P), SVO is tagged ADV. In its adverbial (ADV) use, SVO can generally be paraphrased by "þannig" or "á þá leið" (in English, IN THAT WAY).
STADDUR Although this is etymologically the participle of "steðja", it is generally used as an adjective and we tag it ADJ in most cases. On the very rare occasion that this parse is unlikely, STADDUR can also be tagged VAN or VBN.
STÓRUM when it means "much" is tagged ADV and projects an NP-MSR.
SUMUR, SUMIR, tagged as Q.
SUNDUR is ADV, projecting an ADVP complement in Í SUNDUR
TÍÐUM is NS-D and projects NP-TMP, like STUNDUM.
TUGUR is tagged N, and it frequently occurs within an NP-MSR (though this need not always be true).
TRÁSS can be a P taking a dative complement, meaning "even though", like German trotz or Danish.
TVENNUR, TVISVAR, ÞRENNUR, etc. are tagged NUM, analogously to "once", "twice", "thrice" in English. These words are not tagged for case when they occur alone, but they are tagged for case if they appear inside a larger NP (e.g. "tvisvar sinnum").
UMKRINGIS is tagged ADV projecting ADVP-LOC when it does not take a complement. When it does, it is tagged as P. This is similar to the convention for UMHVERFIS. When UMKRINGIS is written in two words, it is parsed as two Ps, similar to UM KRING (see KRINGUM).
UMHVERFIS When it does not have a complement, it is tagged ADV and projects an ADVP-LOC. When UMHVERFIS does take a complement, as in "umhverfis borgina", it is tagged P.
UNDIR is a P. In expressions like "undir eins", "undir hádegi", or "undir eins og CP-CMP", UNDIR is still tagged P and takes a PP complement. See also PP.
UNNVÖRPUM is NS-D with lemma UNNVARP, projecting NP-ADT.
UNS 'until', tag as P with CP-ADV as sister.
UTAN can be ADV or P, depending on whether it takes a complement or not. It can also be FP if it occurs in the NEG...BUT construction, like NEMA.
VAKIUÐR, tag VAN.
VERÐA is tagged with its own tag, RD (RDDI, RDPI, RDDS, RDPS, etc.), in all uses. This is because, like BE, it has auxiliary and non-auxiliary uses in Icelandic. This matches the treatment of "werden" in Caitlin Light's Early New High German corpus.
VETTUGI is tagged Q, meaning "nothing", "naught", as in "meta e-n að vettugi".
VINSÆLL is an ADJ, even in sentences like "Hann var vinsæll af öllum mönnum" 'He was popular by every man'
VIRKA takes a small clause (IP-SMC) when it takes a predicate: "Þetta virkar ótrúverðugt".
VOÐA is tagged as an adverb, as in voða lítið.
YFIR is tagged P when it takes a complement and projects a PP, and RP when it does not take a complement. It can also be a degree modifier, like English OVER, in which case it is tagged ADVR: "yfir tvær þúsundir manns".
ÝMIST meaning "variously" is tagged Q without case and projecting an ADVP.
ÖLDUNGIS can be FP or ADV, depending on meaning.
ÖNDVERÐUR is ADJ. It projects an ADJP, even when it is inside a PP.
ÖRSKOT is tagged N, but it can project an NP-ADV in many contexts.
ÖÐRUVÍSI is tagged ADV. There may be some inconsistencies currently on this point, but it is easy to find.
ÞAÐ The object pronoun ÞAÐ is always tagged PRO. The subject pronoun ÞAÐ is tagged as PRO in any syntactic context where it *never* disappears under subject-finite-verb inversion. In Icelandic, when the finite verb fronts over the subject under V-to-C movement in matrix clauses or embedded topicalizations, truly non-expletive ÞAÐ will still surface in subject position, but truly expletive ÞAÐ will disappear. In any syntactic context in which ÞAÐ would disappear under subject-verb inversion, it is tagged ES (in accordance with Caitlin Light's Early New High German corpus). Such contexts include, at least: weather expressions, impersonals, and existentials.
ÞAÐ is tagged ES even in contexts where there is inter-speaker or inter-text (or diachronic) variation with regard to whether it disappears.
See Expletives also, for *exp*, which is the empty category corresponding to "(ES ÞAÐ)". Note that when ÞAÐ disappears under verb-movement, "(NP-SBJ *exp*)" is only inserted in the sentence if there is no other possible subject (e.g. it is not inserted in subject-postposition constructions).
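The PRO versus ES decision for ÞAÐ reduces to two questions. The Python sketch below encodes them; the function and parameter names are assumptions made for illustration, and the judgment about whether ÞAÐ would disappear under inversion still has to come from the annotator.

```python
def tag_thad(is_subject, may_disappear_under_inversion):
    """Tag ÞAÐ as PRO or ES.

    may_disappear_under_inversion should be True whenever ÞAÐ would be dropped
    under subject-verb inversion in this context, including contexts where
    speakers, texts, or periods vary.
    """
    if not is_subject:
        return "PRO"          # the object pronoun is always PRO
    return "ES" if may_disappear_under_inversion else "PRO"

print(tag_thad(is_subject=False, may_disappear_under_inversion=False))  # PRO
print(tag_thad(is_subject=True, may_disappear_under_inversion=True))    # ES
print(tag_thad(is_subject=True, may_disappear_under_inversion=False))   # PRO
```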
ÞAR Á MEÐAL, treated like ÞAR Á MILLI
ÞAR Á MILLI, the second PP immediately dominates a P and a trace of the R-pronoun "þar".
ÞEGAR, 'when', is tagged as P when it introduces an adverbial clause, CP-ADV (i.e. in the case that there is no antecedent for the "when"-clause). So þegar Norðmenn tóku ..., 'when Norwegians took ...', is (PP (P þegar) (CP-ADV (C 0) (IP-SUB (NP-SBJ Norðmenn) (VBDI tóku) (...)))).
However, ÞEGAR is tagged WADV when it introduces an indirect question or a relative clause, just as with English WHEN in the PPCME2/PPCEME.
ÞEGAR can also be tagged ADV, projecting an ADVP-TMP or an ADVP where it unambiguously means "immediately" or "promptly".
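The three uses of ÞEGAR can be summarised as a lookup, together with a helper that wraps an adverbial clause in the PP structure described above. This is a minimal Python sketch: the use labels and the names `THEGAR_TAGS`, `tag_thegar`, and `thegar_clause` are hypothetical, chosen only for illustration.

```python
THEGAR_TAGS = {
    "adverbial_clause": "P",       # no antecedent; takes a CP-ADV complement
    "indirect_question": "WADV",
    "relative_clause": "WADV",
    "immediately": "ADV",          # projects ADVP-TMP or ADVP
}

def tag_thegar(use):
    """Return the tag for ÞEGAR given an annotator-supplied use label."""
    return THEGAR_TAGS[use]

def thegar_clause(ip_sub):
    """Wrap an already bracketed IP-SUB in the PP headed by ÞEGAR."""
    return f"(PP (P þegar) (CP-ADV (C 0) {ip_sub}))"

print(tag_thegar("relative_clause"))   # WADV
print(thegar_clause("(IP-SUB ...)"))   # (PP (P þegar) (CP-ADV (C 0) (IP-SUB ...)))
```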
ÞEYGI Icel. '(þó) eigi', '(though) not', tag (ADV þ$) (NEG $eygi).
ÞVÍ is normally a dative pronoun, PRO. However, when it is a wh-word, meaning "hví" ("why"), it is tagged WADV. Where ÞVÍ means "because" by itself, or appears to function as an adverb by itself roughly meaning "therefore", it is still tagged PRO: see CP-THT for more information on how these constructions are parsed.
ÞVÍSA, tag as D-D.
ÞVÍLÍKUR is tagged SUCH, in the same way as SLÍKUR.