The Surrey Morphology Group have produced a large number of freely accessible typological databases that cover a wide range of morphological phenomena.
Latest news and database accessibility
- Two new databases will join the existing suite of resources in early 2015: an ERC-funded database on morphological complexity in a typological perspective and an AHRC/ESRC-funded database on morphological complexity in the Oto-Manguean languages of Mexico.
- To ensure their long-term sustainability, the existing databases are undergoing an infrastructure and design update, to be completed in early 2015. Consequently, some of the databases are hosted at a temporary location and, for the time being, existing bookmarks/searches to these resources may not resolve to the correct page. However, you can still access all of the databases using the links below.
The term 'syncretism' refers to the phenomenon whereby a single form fulfils two or more different functions within the inflectional morphology of a language. This database reports on a sample of 30 languages.
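As a rough illustration of this definition, the following Python sketch groups the cells of an inflectional paradigm by form and reports any form that fills more than one cell. The Latin noun 'bellum' (vocative omitted) is used only as a familiar example; the data structure and the function name `find_syncretisms` are assumptions made for this sketch and do not reflect the database's own format or search interface.

```python
from collections import defaultdict


def find_syncretisms(paradigm):
    """Return every form that fills two or more cells, with the cells it fills.

    `paradigm` maps a (case, number) cell to a word form. This layout is a
    hypothetical one chosen for illustration, not the database's schema.
    """
    cells_by_form = defaultdict(list)
    for cell, form in paradigm.items():
        cells_by_form[form].append(cell)
    # Keep only forms shared by at least two cells.
    return {form: cells for form, cells in cells_by_form.items() if len(cells) > 1}


# Latin 'bellum' (war), second-declension neuter; vocative omitted.
BELLUM = {
    ("nominative", "singular"): "bellum",
    ("accusative", "singular"): "bellum",
    ("genitive", "singular"): "belli",
    ("dative", "singular"): "bello",
    ("ablative", "singular"): "bello",
    ("nominative", "plural"): "bella",
    ("accusative", "plural"): "bella",
    ("genitive", "plural"): "bellorum",
    ("dative", "plural"): "bellis",
    ("ablative", "plural"): "bellis",
}

syncretic = find_syncretisms(BELLUM)
for form, cells in syncretic.items():
    print(form, cells)
```

Note that a check like this only detects identity of form; deciding whether a given identity counts as syncretism in the linguistic sense still requires analysis of the language concerned.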
Person syncretism occurs when two or more person values are represented by a single form in the inflectional paradigm for agreement with an argument on verbs. This database reports on a sample of 111 languages.
Agreement is the expression of grammatical information in the ‘wrong place’: a relation that can be described in terms of controllers, targets, domains, categories and conditions. This database reports on a sample of 15 languages.
Suppletion is a morphological phenomenon where different inflectional forms of the same sign are maximally regular in their semantics, yet maximally irregular in form. This database reports on a sample of 34 languages.
Deponency describes mismatches between morphology and morphosyntax. A mismatch occurs where the word form is used in some function incompatible with its normal function. There are two deponency databases:
- Deponency - typological database (searchable by features)
- Deponency - cross-linguistic database (100 languages, searchable by language name)
The notion of 'short term morphosyntactic change' can be used to characterise changes in the use of forms over a short period of time, even when the forms themselves have changed relatively little. There are six short term morphosyntactic change (STMC) databases covering different change phenomena in Russian over a 200-year period:
- STMC - Case of modifier in phrases with ‘two’, ‘three’, ‘four’
- STMC - Case assignment on predicate nouns
- STMC - Predicative adjectives
- STMC - Predicate agreement with quantified expressions
- STMC - Case assignment on direct objects of negated transitive verbs
- STMC - Predicate agreement with conjoined noun phrases
The term 'defectiveness' refers to gaps in inflectional paradigms — specifically, gaps which do not appear to follow from natural restrictions imposed by meaning or function. There are two defectiveness databases:
- Defectiveness - typological database (searchable by features)
- Defectiveness - cross-linguistic database (100 languages, searchable by language name)
Periphrasis reveals how the construction of meaning in language is apportioned between morphology ('bright' and 'brighter') and syntax ('intelligent' and 'more intelligent'). This database reports on a sample of 19 languages.
Features are fundamental components of linguistic description. They have proven invaluable for grammatical analysis and have a major role in contemporary linguistics.
Possessive morphology marking owners or custodians may be used as a source of subject-indexing marking actors or agents in the languages surrounding the Bougainville region of Papua New Guinea.
Examples from fieldwork and select recordings on the SENĆOŦEN language of the Saanich people collected from community elders are organised according to their role as sentences, verbs or roots.
The online version of the Archi-Russian-English dictionary contains sound files for every word form of the lexeme, digital pictures of culturally significant objects, idioms and example sentences with interlinear glossing.
The databases can be accessed without cost. Users agree not to pass on the databases to third parties and to properly acknowledge the database as the source of information in publications or manuscripts that make use of its data. When citing information obtained from a database query, please give the authors, the date, the name of the database and the address of the database homepage (all of this information is available from the database homepage). It is also a good idea to give the date you last accessed the database.
- A dictionary of the Archi (Daghestanian) language
- Archi language page
- Archi wiki
- Northwest Solomonic
- Agreement Bibliography
- Canonical Typology Bibliography
- Network Morphology Bibliography
- Slavonic Number Bibliography
- Slavonic Agreement
- Suppletion Bibliography
- Short Term Morphosyntactic Change Bibliography
- Syncretism Bibliography
- Download the Excel spreadsheet of nouns from the Uppsala corpus (product of the ESRC project 'Number Use in Language'). For each of the approx. 5440 noun lexemes the spreadsheet includes information on frequency, animacy (modified version of the Smith-Stark hierarchy), as well as counts for occurrences of particular case and number combinations.
There is a readme file for the spreadsheet.
- Diachronic datasets of Slavonic colour terms: East Slavonic, South Slavonic, West Slavonic.
- Datasets associated with the ESRC project 'Paradigms in Use'.
DATR Fragments & theorem dumps
- DATR fragments from the ESRC projects 'A DATR Theory of Russian Morphology' and 'The Theory of Network Morphology' are available from the Sussex DATR Archive.
- The Semantics of Gender in Mayali: Partially Parallel Systems and Formal Implementation.
- Dalabon pronominal prefixes and the typology of syncretism: a Network Morphology analysis.
- DATR analyses of typologically distinct deponent paradigms (from the ESRC project 'Extended deponency').
- Theorems from polish15.dtr.
- View our statistical model of plural proportions from the Uppsala corpus (from the ESRC project 'Number Use in Language').