SUD is a Surface-syntax Universal Dependencies scheme. SUD follows the Surface syntax criteria (favoring functional heads) and can be automatically converted to the UD scheme.
This page describes the universal principles used in SUD.
Separate pages describe language-specific usage of SUD for French and Naija.
SUD differs from UD in several general principles. The key differences between SUD and UD as well as a table summarizing the most frequent correspondences may be consulted here.
The other layers of annotations follow the UD guidelines. Please refer to UD for these aspects:
- Tokenization and word segmentation
- POS tags (single document)
- Features (single document)
Specific SUD relations
SUD has 4 specific syntactic relations and a few extended relations:
SUD shares a number of syntactic relations with UD, which are listed below (with links to the related UD pages):
Note, however, that some of these relations are used differently in UD and SUD. Namely, the relations
reparandum are only used when analysing written texts. When analysing oral texts, we use instead the relations
conj:dicto respectively. The same goes for
parataxis, which is used differently in SUD when analysing oral texts. The details are explained in the section below.
Relations specific to SUD used when analysing oral texts
- paradigmatical lists
- macrosyntactic relations
SUD deep features
In SUD, dependency relations are designed to describe syntactic surface relations.
Information related to deep syntax or semantics is given on dependencies through deep features, which are extensions to the dependency label introduced by the
The main deep features are:
Particular linguistic phenomena
For each linguistic phenomenon below, there is an explanation of how SUD takes it into account.
- Idioms and titles
- Light verb constructions
- Comparative, superlative and consecutive constructions
- Compounds and flats
Usages of the ExtPos feature
In SUD, the External POS (ExtPos) feature is used to designate multi-word units which together behave like a certain part of speech, even though none of their constituents carry that part of speech.
It can also be used for cases where the internal POS of a token differs from its usage.
A detailed description of the ExtPos feature and its usages can be found here and on the Idioms and titles page.
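As an illustration of how a tool might use this feature, here is a minimal Python sketch. It assumes that ExtPos is stored in the FEATS column of a CoNLL-U file, and that a consumer wants the "effective" part of speech of a token: the ExtPos value when present, and the token's own UPOS otherwise. The function names parse_feats and external_pos are hypothetical, and the sample tokens are invented for illustration rather than taken from any SUD treebank.

```python
from typing import Dict


def parse_feats(feats: str) -> Dict[str, str]:
    """Parse a CoNLL-U FEATS string such as 'ExtPos=ADV|Number=Sing' into a dict."""
    if feats in ("_", ""):
        # '_' marks an empty FEATS column in CoNLL-U.
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def external_pos(upos: str, feats: str) -> str:
    """Return the ExtPos value when present, otherwise the token's own UPOS."""
    return parse_feats(feats).get("ExtPos", upos)


# Hypothetical (UPOS, FEATS) pairs, used only to exercise the helper.
print(external_pos("ADP", "ExtPos=ADV"))    # ADV
print(external_pos("NOUN", "Number=Sing"))  # NOUN
print(external_pos("PUNCT", "_"))           # PUNCT
```

Falling back to UPOS keeps the helper usable on tokens that carry no ExtPos at all, which is presumably the common case.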
Analyzing phenomena specific to spoken language
SUD is also used to annotate oral corpora. Spoken language is distinct from written texts in several ways, which can sometimes make it more difficult to analyze. Below, we propose analyses of several such phenomena.