SUD data • Version 2.12

Version 2.12 of SUD data was released in May 2023.

The full SUD 2.12 release contains 244 corpora. UD 2.12 has 245 corpora, but one of them cannot be released in an SUD version because its CC license includes the ND (NoDerivatives) clause.

Download all corpora

Download the full set of 244 SUD corpora: sud-treebanks-v2.12.tgz.
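Once the archive has been downloaded, the set of corpora it contains can be checked against the figure above. The following is a minimal sketch, not an official tool: it assumes the archive is available locally, and that each corpus sits in a directory whose name starts with `SUD_` (as in the corpus names on this page). The internal layout of the archive is an assumption, and the helper names `corpus_names` and `list_corpora` are hypothetical.

```python
import sys
import tarfile
from pathlib import PurePosixPath


def corpus_names(member_names):
    """Return the sorted, de-duplicated path components starting with 'SUD_'.

    Assumption: each corpus is stored in a directory named 'SUD_<Language>-<Name>'.
    """
    names = set()
    for member in member_names:
        for part in PurePosixPath(member).parts:
            if part.startswith("SUD_"):
                names.add(part)
                break  # only the outermost 'SUD_' component identifies the corpus
    return sorted(names)


def list_corpora(archive_path):
    """Open a local gzip-compressed tar archive and list its corpus directories."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return corpus_names(archive.getnames())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an example): python list_corpora.py sud-treebanks-v2.12.tgz
    found = list_corpora(sys.argv[1])
    print(f"{len(found)} corpora found")
    for name in found:
        print(name)
```

Separating the name extraction from the archive handling keeps the first function usable on any list of paths, for instance a directory listing of an already extracted release.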

Native SUD corpora

The table below lists the 8 native SUD corpora. For each of them, the corresponding UD version is obtained by automatic conversion.

Corpus Files Grew-match
SUD_Beja-NSC 2.12/latest 2.12/latest
SUD_Chinese-PatentChar 2.12/latest 2.12/latest
SUD_French-GSD 2.12/latest 2.12/latest
SUD_French-ParisStories 2.12/latest 2.12/latest
SUD_French-Rhapsodie 2.12/latest 2.12/latest
🆕 SUD_French-Sequoia 2.12/latest 2.12/latest
SUD_Naija-NSC 2.12/latest 2.12/latest
SUD_Zaar-Autogramm 2.12/latest 2.12/latest

Conversion from UD

Access to each corpus

The table below gives access to the Grew-match query system for each corpus.

Corpus Grew-match
Abaza-ATB [Query] [Relations]
Afrikaans-AfriBooms [Query] [Relations]
Akkadian-PISANDUB [Query] [Relations]
Akkadian-RIAO [Query] [Relations]
Akuntsu-TuDeT [Query] [Relations]
Albanian-TSA [Query] [Relations]
Amharic-ATT [Query] [Relations]
Ancient_Greek-Perseus [Query] [Relations]
Ancient_Greek-PROIEL [Query] [Relations]
Ancient_Hebrew-PTNK [Query] [Relations]
Apurina-UFPA [Query] [Relations]
Arabic-NYUAD [Query] [Relations]
Arabic-PADT [Query] [Relations]
Arabic-PUD [Query] [Relations]
Armenian-ArmTDP [Query] [Relations]
Armenian-BSUT [Query] [Relations]
Assyrian-AS [Query] [Relations]
Bambara-CRB [Query] [Relations]
Basque-BDT [Query] [Relations]
Beja-NSC (Native) [Query] [Relations]
Belarusian-HSE [Query] [Relations]
Bengali-BRU [Query] [Relations]
Bhojpuri-BHTB [Query] [Relations]
🆕 Bororo-BDT [Query] [Relations]
Breton-KEB [Query] [Relations]
Bulgarian-BTB [Query] [Relations]
Buryat-BDT [Query] [Relations]
Cantonese-HK [Query] [Relations]
Catalan-AnCora [Query] [Relations]
Cebuano-GJA [Query] [Relations]
Chinese-CFL [Query] [Relations]
Chinese-GSD [Query] [Relations]
Chinese-GSDSimp [Query] [Relations]
Chinese-HK [Query] [Relations]
Chinese-PatentChar (Native) [Query] [Relations]
Chinese-PUD [Query] [Relations]
Chukchi-HSE [Query] [Relations]
Classical_Chinese-Kyoto [Query] [Relations]
Coptic-Scriptorium [Query] [Relations]
Croatian-SET [Query] [Relations]
Czech-CAC [Query] [Relations]
Czech-CLTT [Query] [Relations]
Czech-FicTree [Query] [Relations]
Czech-PDT [Query] [Relations]
Czech-PUD [Query] [Relations]
Danish-DDT [Query] [Relations]
Dutch-Alpino [Query] [Relations]
Dutch-LassySmall [Query] [Relations]
English-Atis [Query] [Relations]
🆕 English-ESLSpok [Query] [Relations]
English-EWT [Query] [Relations]
🆕 English-GENTLE [Query] [Relations]
English-GUM [Query] [Relations]
English-GUMReddit [Query] [Relations]
English-LinES [Query] [Relations]
English-PUD [Query] [Relations]
English-Pronouns [Query] [Relations]
Erzya-JR [Query] [Relations]
Estonian-EDT [Query] [Relations]
Estonian-EWT [Query] [Relations]
Faroese-OFT [Query] [Relations]
Faroese-FarPaHC [Query] [Relations]
Finnish-FTB [Query] [Relations]
Finnish-PUD [Query] [Relations]
Finnish-TDT [Query] [Relations]
Finnish-OOD [Query] [Relations]
French-FQB [Query] [Relations]
French-GSD (Native) [Query] [Relations]
French-ParTUT [Query] [Relations]
French-PUD [Query] [Relations]
French-Sequoia (Native) [Query] [Relations]
French-ParisStories (Native) [Query] [Relations]
French-Rhapsodie (Native) [Query] [Relations]
Frisian_Dutch-Fame [Query] [Relations]
Galician-CTG [Query] [Relations]
Galician-TreeGal [Query] [Relations]
German-GSD [Query] [Relations]
German-HDT [Query] [Relations]
German-LIT [Query] [Relations]
German-PUD [Query] [Relations]
Gothic-PROIEL [Query] [Relations]
Greek-GDT [Query] [Relations]
🆕 Greek-GUD [Query] [Relations]
Guajajara-TuDeT [Query] [Relations]
Guarani-OldTuDeT [Query] [Relations]
Hebrew-HTB [Query] [Relations]
Hebrew-IAHLTwiki [Query] [Relations]
Hindi-HDTB [Query] [Relations]
Hindi-PUD [Query] [Relations]
Hittite-HitTB [Query] [Relations]
Hungarian-Szeged [Query] [Relations]
Icelandic-PUD [Query] [Relations]
Icelandic-Modern [Query] [Relations]
Icelandic-IcePaHC [Query] [Relations]
Indonesian-GSD [Query] [Relations]
Indonesian-PUD [Query] [Relations]
Indonesian-CSUI [Query] [Relations]
Irish-Cadhan [Query] [Relations]
Irish-IDT [Query] [Relations]
Irish-TwittIrish [Query] [Relations]
Italian-ISDT [Query] [Relations]
Italian-MarkIT [Query] [Relations]
Italian-ParTUT [Query] [Relations]
Italian-PoSTWITA [Query] [Relations]
Italian-TWITTIRO [Query] [Relations]
Italian-ParlaMint [Query] [Relations]
Italian-PUD [Query] [Relations]
Italian-Valico [Query] [Relations]
Italian-VIT [Query] [Relations]
Japanese-BCCWJ [Query] [Relations]
Japanese-BCCWJLUW [Query] [Relations]
Japanese-GSD [Query] [Relations]
Japanese-GSDLUW [Query] [Relations]
Japanese-PUD [Query] [Relations]
Japanese-PUDLUW [Query] [Relations]
Javanese-CSUI [Query] [Relations]
Kaapor-TuDeT [Query] [Relations]
Kangri-KDTB [Query] [Relations]
Karelian-KKPP [Query] [Relations]
Karo-TuDeT [Query] [Relations]
Kazakh-KTB [Query] [Relations]
Khunsari-AHA [Query] [Relations]
Kiche-IU [Query] [Relations]
Komi_Permyak-UH [Query] [Relations]
Komi_Zyrian-IKDP [Query] [Relations]
Komi_Zyrian-Lattice [Query] [Relations]
Korean-GSD [Query] [Relations]
Korean-Kaist [Query] [Relations]
Korean-PUD [Query] [Relations]
Kurmanji-MG [Query] [Relations]
🆕 Kyrgyz-KTMU [Query] [Relations]
Latin-ITTB [Query] [Relations]
Latin-LLCT [Query] [Relations]
Latin-Perseus [Query] [Relations]
Latin-PROIEL [Query] [Relations]
Latin-UDante [Query] [Relations]
Latvian-LVTB [Query] [Relations]
Ligurian-GLT [Query] [Relations]
Lithuanian-ALKSNIS [Query] [Relations]
Lithuanian-HSE [Query] [Relations]
Livvi-KKPP [Query] [Relations]
Low_Saxon-LSDC [Query] [Relations]
Madi-Jarawara [Query] [Relations]
🆕 Maghrebi_Arabic_French-Arabizi [Query] [Relations]
Makurap-TuDeT [Query] [Relations]
Malayalam-UFA [Query] [Relations]
Maltese-MUDT [Query] [Relations]
Manx-Cadhan [Query] [Relations]
Marathi-UFAL [Query] [Relations]
Mbya_Guarani-Dooley [Query] [Relations]
Mbya_Guarani-Thomas [Query] [Relations]
Moksha-JR [Query] [Relations]
Munduruku-TuDeT [Query] [Relations]
Naija-NSC (Native) [Query] [Relations]
Nayini-AHA [Query] [Relations]
Neapolitan-RB [Query] [Relations]
Nheengatu-CompLin [Query] [Relations]
North_Sami-Giella [Query] [Relations]
Norwegian-Bokmaal [Query] [Relations]
Norwegian-Nynorsk [Query] [Relations]
Old_Church_Slavonic-PROIEL [Query] [Relations]
Old_East_Slavic-Birchbark [Query] [Relations]
Old_East_Slavic-RNC [Query] [Relations]
Old_East_Slavic-Ruthenian [Query] [Relations]
Old_East_Slavic-TOROT [Query] [Relations]
Old_French-SRCMF [Query] [Relations]
🆕 Old_Irish-DipSGG [Query] [Relations]
🆕 Old_Irish-DipWBG [Query] [Relations]
Old_Russian-RNC [Query] [Relations]
Old_Russian-TOROT [Query] [Relations]
Old_Turkish-Tonqq [Query] [Relations]
Persian-Seraji [Query] [Relations]
Persian-PerDT [Query] [Relations]
Pomak-Philotis [Query] [Relations]
Polish-LFG [Query] [Relations]
Polish-PDB [Query] [Relations]
Polish-PUD [Query] [Relations]
Portuguese-Bosque [Query] [Relations]
Portuguese-PetroGold [Query] [Relations]
Portuguese-PUD [Query] [Relations]
Romanian-ArT [Query] [Relations]
Romanian-Nonstandard [Query] [Relations]
Romanian-RRT [Query] [Relations]
Romanian-SiMoNERo [Query] [Relations]
Russian-GSD [Query] [Relations]
Russian-PUD [Query] [Relations]
Russian-SynTagRus [Query] [Relations]
Russian-Taiga [Query] [Relations]
Sanskrit-UFAL [Query] [Relations]
Sanskrit-Vedic [Query] [Relations]
Scottish_Gaelic-ARCOSG [Query] [Relations]
Serbian-SET [Query] [Relations]
Sinhala-STB [Query] [Relations]
Skolt_Sami-Giellagas [Query] [Relations]
Slovak-SNK [Query] [Relations]
Slovenian-SSJ [Query] [Relations]
Slovenian-SST [Query] [Relations]
Soi-AHA [Query] [Relations]
South_Levantine_Arabic-MADAR [Query] [Relations]
Spanish-AnCora [Query] [Relations]
Spanish-GSD [Query] [Relations]
Spanish-PUD [Query] [Relations]
Swedish-LinES [Query] [Relations]
Swedish-PUD [Query] [Relations]
Swedish_Sign_Language-SSLC [Query] [Relations]
Swedish-Talbanken [Query] [Relations]
Swiss_German-UZH [Query] [Relations]
Tagalog-TRG [Query] [Relations]
Tagalog-Ugnayan [Query] [Relations]
Tamil-TTB [Query] [Relations]
Tamil-MWTT [Query] [Relations]
Tatar-NMCTT [Query] [Relations]
Teko-TuDeT [Query] [Relations]
Telugu-MTG [Query] [Relations]
Thai-PUD [Query] [Relations]
Tupinamba-TuDeT [Query] [Relations]
Turkish-Atis [Query] [Relations]
Turkish-BOUN [Query] [Relations]
Turkish-FrameNet [Query] [Relations]
Turkish-GB [Query] [Relations]
Turkish-IMST [Query] [Relations]
Turkish-Kenet [Query] [Relations]
Turkish-PUD [Query] [Relations]
Turkish-Penn [Query] [Relations]
Turkish-Tourism [Query] [Relations]
Turkish_German-SAGT [Query] [Relations]
Ukrainian-IU [Query] [Relations]
Umbrian-IKUVINA [Query] [Relations]
Upper_Sorbian-UFAL [Query] [Relations]
Urdu-UDTB [Query] [Relations]
Uyghur-UDT [Query] [Relations]
Vietnamese-VTB [Query] [Relations]
Warlpiri-UFAL [Query] [Relations]
Welsh-CCG [Query] [Relations]
Western_Armenian-ArmTDP [Query] [Relations]
Western_Sierra_Puebla_Nahuatl-ITML [Query] [Relations]
Wolof-WTB [Query] [Relations]
Xavante-XDT [Query] [Relations]
Xibe-XDT [Query] [Relations]
Yakut-YKTDT [Query] [Relations]
Yoruba-YTB [Query] [Relations]
Yupik-SLI [Query] [Relations]
Zaar-Autogramm (Native) [Query] [Relations]