SUD Guidelines

SUD is a Surface-syntax Universal Dependencies scheme. SUD follows surface-syntax criteria (favoring functional heads) and can be automatically converted to the UD scheme.

This page describes the universal principles used in SUD.

Separate pages describe the language-specific use of SUD for French and Naija.

General Principles

SUD differs from UD in several general principles. The key differences between SUD and UD as well as a table summarizing the most frequent correspondences may be consulted here.

The other layers of annotations follow the UD guidelines. Please refer to UD for these aspects:

Specific SUD relations

SUD has 4 specific syntactic relations and a few extended relations:

SUD shares a number of syntactic relations with UD. They are listed below (each with a link to the corresponding UD page): vocative, compound, dislocated, discourse, appos, det, clf, conj, cc, flat, parataxis, orphan, goeswith, reparandum, punct.

However, we must stress that some of these relations are used differently in UD and SUD. Namely, the relations appos, conj and reparandum are only used when analysing written texts. When analysing oral texts, we instead use the relations conj:appos, conj:coord and conj:dicto respectively. Likewise, parataxis is used differently in SUD when analysing oral texts. The details are explained in the section below.
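The written/oral correspondence stated above can be summarised as a small lookup table. The following Python sketch is only an illustration: the names `ORAL_EQUIVALENT` and `to_oral_relation` are hypothetical and not part of any SUD tool, and parataxis is left out because its oral usage differs in ways that a simple renaming does not capture.

```python
# Relation used for written texts -> relation used for oral texts,
# exactly as listed in the paragraph above (illustrative only).
ORAL_EQUIVALENT = {
    "appos": "conj:appos",
    "conj": "conj:coord",
    "reparandum": "conj:dicto",
}


def to_oral_relation(label: str) -> str:
    """Return the oral-text counterpart of a relation.

    Relations without a listed counterpart are returned unchanged.
    """
    return ORAL_EQUIVALENT.get(label, label)


if __name__ == "__main__":
    for rel in ("appos", "conj", "reparandum", "det"):
        print(rel, "->", to_oral_relation(rel))
```

A plain dictionary is enough here because the correspondence is one-to-one for these three relations.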

Relations specific to SUD used when analysing oral texts

SUD deep features

In SUD, dependency relations are designed to describe surface-syntactic relations. Information related to deep syntax or semantics is attached to dependencies as deep features, which are extensions to the dependency label introduced by the @ symbol.

The main deep features are: @agent, @caus, @expl, @lvc, @pass, @relcl, @tense, @x.
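Because a deep feature is appended to the dependency label after the @ symbol, a label can be split into its surface relation and its deep feature with ordinary string handling. The sketch below is a minimal illustration, not an official SUD utility: `SudLabel` and `split_deep_feature` are hypothetical names, and the label `subj@pass` is used only as an example input.

```python
from typing import NamedTuple, Optional


class SudLabel(NamedTuple):
    relation: str          # surface relation, e.g. "subj"
    deep: Optional[str]    # deep feature without the "@", or None


def split_deep_feature(label: str) -> SudLabel:
    """Split a dependency label at the first '@' symbol."""
    relation, sep, deep = label.partition("@")
    # sep is empty when the label carries no deep feature.
    return SudLabel(relation, deep if sep else None)


if __name__ == "__main__":
    print(split_deep_feature("subj@pass"))
    print(split_deep_feature("det"))
```

Keeping the two parts separate lets a tool compare surface relations without being affected by the deep annotation.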

Particular linguistic phenomena

For each linguistic phenomenon below, there is an explanation of how SUD takes it into account.

Usages of the ExtPos feature

In SUD, the External POS (ExtPos) feature is used to designate multi-word units which together behave like a certain part of speech, even though none of their constituents carries that part of speech. It can also be used for cases where the internal POS of a token differs from its usage. A detailed description of the ExtPos feature and its usages can be found here and on the Idioms and titles page.
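One way to read this is that a token's externally visible part of speech is its ExtPos value when the feature is present, and its own POS otherwise. The Python sketch below illustrates that reading under two assumptions: that ExtPos is stored in a CoNLL-U style feature string (`Name=Value` pairs separated by `|`, with `_` for no features), and that the helper names `parse_feats` and `effective_pos` are hypothetical. The sample values are made up for illustration.

```python
def parse_feats(feats: str) -> dict:
    """Parse a CoNLL-U style feature string into a dictionary."""
    if feats in ("", "_"):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def effective_pos(upos: str, feats: str) -> str:
    """Return ExtPos if the token has it, otherwise its internal POS."""
    return parse_feats(feats).get("ExtPos", upos)


if __name__ == "__main__":
    # Hypothetical token whose internal POS is NOUN but which heads
    # a unit behaving like an adverb.
    print(effective_pos("NOUN", "ExtPos=ADV"))
    print(effective_pos("NOUN", "Number=Sing"))
```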

Analyzing phenomena specific to spoken language

SUD is also used to annotate oral corpora. Spoken language is distinct from written texts in several ways, which can sometimes make it more difficult to analyze. Below, we propose analyses of several such phenomena.