Naija SUD Guidelines
This page outlines various features and annotation conventions useful for the annotation of Naija.
Cleft sentences and questions
Cleft sentences are an extremely common construction in Naija, making the comp:cleft relation particularly important for the annotation of this language. The basic cleft construction in Naija includes the phrase na im (it’s him), followed by a verb phrase, though a number of variants exist. The following provides several such examples.
The comp:cleft relation is also used in questions containing interrogative words such as who or where. In such cases, the wh-word is annotated as the root and is connected to the verb via a comp:cleft relation.
Be, dey, na and the zero copula
The term dey in Naija performs two primary roles. The first is that of a copula. In these instances, dey is annotated as a verb and is connected to its complement with a comp:pred relation, as in the examples below.
This word is also used as an auxiliary verb which marks the imperfective aspect. In these cases, dey is annotated as an auxiliary and is connected to the following verb with a
Be and na
In addition to dey, Naija contains two other words that can function as copulas: be and na. Like dey, be is annotated as a verb, and is connected to the subject via a subj relation and to the predicate via a comp:pred relation. We also treat na in a similar fashion, though it is tagged as a particle rather than a verb.
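The copula pattern described above can be sketched in Python. This is only an illustration: the `Edge` type and the `copula_edges` function are hypothetical names, the tokens `SUBJECT` and `PREDICATE` are placeholders rather than real Naija words, and the sketch assumes that the copula is the head of both relations.

```python
from typing import List, NamedTuple

class Edge(NamedTuple):
    head: str
    dependent: str
    relation: str

# Part of speech of each copula as described in the text:
# dey and be are verbs, na is a particle.
COPULA_UPOS = {"dey": "VERB", "be": "VERB", "na": "PART"}

def copula_edges(subject: str, copula: str, predicate: str) -> List[Edge]:
    """Build the two edges of a copular clause (assumption: copula is the head)."""
    if copula not in COPULA_UPOS:
        raise ValueError(f"not a known copula: {copula!r}")
    return [
        Edge(head=copula, dependent=subject, relation="subj"),
        Edge(head=copula, dependent=predicate, relation="comp:pred"),
    ]

# Placeholder tokens, not real Naija words.
edges = copula_edges("SUBJECT", "be", "PREDICATE")
```

Only the part-of-speech tag differs between the copulas in this sketch; the two relations are the same for dey, be and na.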
However, the copula is not always needed to link subjects to their predicates. In cases where no copula is present, the predicate is connected to its subject via a
Compounds and phrasal verbs
Our annotation of Naija makes frequent use of the compound relation. In our annotation system, this relation is systematically applied to pairs of nouns in which one acts as a modifier of the other. In this sense, compound functions much like the mod relation, except that it links two nouns together rather than a noun to an adjective.
The compound relation is also used in some relations between nouns and adjectives, such as dry cleaner, which are considered fixed expressions whose meaning cannot be directly understood from their constituent parts.
In addition, compound:prt is used to connect the components of various phrasal verbs inherited from English.
Please note that other languages might use the compound relation in a more limited set of contexts, if at all. For a more general overview of this relation, please consult the dedicated page.
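The choice between compound and mod for noun modifiers, as described in this section, can be summarised in a small Python sketch. The function name `noun_modifier_relation` and the `is_fixed_expression` flag are hypothetical. Deciding whether an expression such as dry cleaner counts as fixed is left to the annotator and is not modelled here.

```python
def noun_modifier_relation(head_upos: str, dep_upos: str,
                           is_fixed_expression: bool = False) -> str:
    """Pick the relation linking a noun head to its modifier.

    Assumption: only the noun-noun and noun-adjective cases discussed
    in the guidelines are covered; anything else raises an error.
    """
    if head_upos != "NOUN":
        raise ValueError("this sketch only covers noun heads")
    if dep_upos == "NOUN":
        # Two nouns, one modifying the other.
        return "compound"
    if dep_upos == "ADJ":
        # Fixed noun-adjective expressions are treated as compounds;
        # ordinary adjectival modifiers take mod.
        return "compound" if is_fixed_expression else "mod"
    raise ValueError(f"unsupported modifier part of speech: {dep_upos!r}")
```

For example, `noun_modifier_relation("NOUN", "ADJ", is_fixed_expression=True)` returns `"compound"`, while the same call without the flag returns `"mod"`.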
Numbers and dates
Numbers composed of more than one word, such as five hundred or six thousand, are primarily chained together with the flat relation. If the number contains the coordinating conjunction and, such as in one hundred and one, the integer directly preceding the coordinating conjunction is connected to the one directly following it with a
If the number contains a decimal, the point is marked as a noun and is integrated into the number with a simple
If numerals are listed in a sequence, such as in telephone numbers, the constituents are chained together with the
Note that references to radio stations which use this format nevertheless contain a flat relation. This is because we consider that the frequency number effectively functions as a title.
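The flat chaining used for multi-word numbers such as five hundred can be sketched as follows. The function name `flat_chain` is hypothetical, and the sketch assumes that each word attaches to the word immediately before it; the guidelines say only that the words are chained together, so this direction is an assumption.

```python
from typing import List, Tuple

def flat_chain(tokens: List[str]) -> List[Tuple[int, int, str]]:
    """Return (head_index, dependent_index, relation) triples.

    Assumption: each token depends on the token immediately before it,
    so the first token heads the whole chain.
    """
    return [(i, i + 1, "flat") for i in range(len(tokens) - 1)]

# "five hundred" yields a single flat edge from token 0 to token 1.
number_edges = flat_chain(["five", "hundred"])
```

A single-word number produces no edges, since there is nothing to chain.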
When annotating dates, the mod:appos relation is used to connect the month to the numerical day. Meanwhile, the year is connected to the month using the
Multi-word placenames and organizations
In Naija, multi-word placenames and organizations are currently annotated with a simple flat relation, though their constituents retain their typical parts of speech.
Titles and honorifics
Honorifics such as Mister or President are connected to the names they precede with a simple
However, this is not the case when a title is connected to a determiner or otherwise modified in some way. In these cases, a mod:appos relation is used.
Official multi-word titles such as Minister of Foreign Affairs are treated as titles (see here for a detailed guide). The head of the title is given an