Naija SUD Guidelines
This page outlines various features and annotation conventions useful for the annotation of Naija.
Cleft sentences and questions
Cleft sentences are an extremely common construction in Naija, making the comp:cleft
relation particularly important for the annotation of this language. The basic cleft construction in Naija includes the phrase na im (it’s him), followed by a verb phrase, though a number of variants exist. The following provides several such examples.
The comp:cleft
relation is also used in questions containing interrogative words such as who or where. In such cases, the wh-word is annotated as the root, and is connected to the verb via a comp:cleft
relation.
Be, dey, na and the zero copula
Dey
The term dey in Naija performs two primary roles. The first is that of a copula. In these instances, dey is annotated as a verb and is connected to its complement with a comp:pred
relation, as in the examples below.
This word is also used as an auxiliary verb which marks the imperfective aspect. In these cases, dey is annotated as an auxiliary and is connected to the following verb with a comp:aux
relation.
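The two roles of dey can be summarised as a small lookup. This is only an illustrative sketch: the names DEY_ANNOTATION and annotate_dey are hypothetical, and the tags VERB and AUX are assumed to be the labels corresponding to "verb" and "auxiliary" in the text above.

```python
# Hypothetical summary table: role of "dey" -> tag and relation to its dependent.
# VERB/AUX are assumed tag labels for "verb" and "auxiliary".
DEY_ANNOTATION = {
    "copula": {"upos": "VERB", "relation": "comp:pred"},
    "imperfective": {"upos": "AUX", "relation": "comp:aux"},
}


def annotate_dey(role):
    """Return the tag of dey and the relation linking it to what follows."""
    try:
        return DEY_ANNOTATION[role]
    except KeyError:
        raise ValueError(f"unknown role for dey: {role!r}")


print(annotate_dey("copula"))
print(annotate_dey("imperfective"))
```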
Be and na
In addition to dey, Naija contains two other words that can function as copulas: be and na. Like dey, be is annotated as a verb, and is connected to the subject via a subj
relationship and to the predicate via the comp:pred
relationship. We also treat na in a similar fashion, though it is tagged as a particle rather than a verb.
Zero copula
However, the copula is not always needed to link subjects to their predicates. In cases where no copula is present, the predicate is connected to its subject via a subj
relationship.
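The difference between an overt copula and the zero copula can be sketched as follows. This is a rough illustration, not an official tool: the function name copula_edges, the placeholder tokens, and the (head, relation, dependent) triple layout are assumptions, as is the choice of the copula (or, without one, the predicate) as head.

```python
# Assumed tags for the three copulas described above (na is a particle).
COPULA_UPOS = {"dey": "VERB", "be": "VERB", "na": "PART"}


def copula_edges(subject, predicate, copula=None):
    """Return (head, relation, dependent) triples for a predicative clause."""
    if copula is None:
        # Zero copula: the predicate is linked directly to its subject.
        return [(predicate, "subj", subject)]
    # Overt copula: it is linked to both the subject and the predicate.
    return [(copula, "subj", subject), (copula, "comp:pred", predicate)]


# Placeholder tokens, not real Naija examples.
print(copula_edges("SUBJECT", "PREDICATE", copula="be"))
print(copula_edges("SUBJECT", "PREDICATE"))
```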
Compounds and phrasal verbs
Our annotation of Naija makes frequent use of the compound
relation. In our annotation system, this relation is systematically applied to relationships between two nouns in which one of them acts as a form of modifier. In this sense, compound
functions much like the mod
relation, except that it links two nouns together rather than a noun to an adjective.
The compound
relation is also used for some combinations of nouns and adjectives, such as dry cleaner, which are considered fixed expressions whose meaning cannot be directly understood from their constituent parts.
The subtype compound:prt
is also used to connect the components of various phrasal verbs inherited from English.
Please note that other languages might use the compound
relation in a more limited set of contexts, if at all. For a more general overview of this relation, please consult the dedicated page.
Numbers and dates
Numbers composed of more than one word, such as five hundred or six thousand, are primarily chained together with the flat
relation. If the number contains the coordinating conjunction and, such as in one hundred and one, the integer directly preceding the coordinating conjunction is connected to the one directly following it with a conj:coord
relation.
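The chaining rule for multi-word numbers can be sketched in a few lines. The function name number_edges and the index-based triples are hypothetical, and the sketch assumes that each token is attached to the numeral before it. The attachment of and itself is not described above, so it is left out.

```python
def number_edges(tokens):
    """Return (head_index, relation, dependent_index) triples for a number."""
    edges = []
    prev = None        # index of the last numeral seen
    after_and = False  # True when "and" separates prev from the current token
    for i, tok in enumerate(tokens):
        if tok.lower() == "and":
            after_and = True  # the edge for "and" itself is not produced here
            continue
        if prev is not None:
            relation = "conj:coord" if after_and else "flat"
            edges.append((prev, relation, i))
        prev = i
        after_and = False
    return edges


print(number_edges(["five", "hundred"]))
print(number_edges(["one", "hundred", "and", "one"]))
```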
If the number contains a decimal, the point is marked as a noun and is integrated into the number with a simple flat
relation.
If numerals are listed in a sequence, such as in telephone numbers, the constituents are chained together with the conj:coord
relation.
Note that references to radio stations which use this format nevertheless contain a flat
relation. This is because we consider that the frequency number effectively functions as a title.
When annotating dates, the mod:appos
relation is used to connect the month to the numerical day. Meanwhile, the year is connected to the month using the mod
relation.
Multi-word placenames and organizations
In Naija, multi-word placenames and organizations are currently annotated with a simple flat
relation, though their constituents retain their typical parts of speech.
Titles and honorifics
Honorifics such as Mister or President are connected to the names they precede with a simple flat
relation.
However, this is not the case when a title is connected to a determiner or otherwise modified in some way. In these cases, a mod:appos
relation is used.
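The choice between the two relations can be written as a simple decision. This is a minimal sketch with a hypothetical function name and hypothetical boolean flags; it only restates the rule given above.

```python
def title_relation(has_determiner=False, is_modified=False):
    """Pick the relation linking a title or honorific to the name."""
    if has_determiner or is_modified:
        # A title with a determiner or another modifier is an apposition.
        return "mod:appos"
    # A bare honorific such as Mister or President is simply flat.
    return "flat"


print(title_relation())
print(title_relation(has_determiner=True))
```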
Official multi-word titles such as Minister of Foreign Affairs are treated as titles (see here for a detailed guide). The head of the title is given an ExtPos
of PROPN
.