Web resources

Databases

The Surrey Morphology Group have produced a large number of freely accessible typological databases that cover a wide range of morphological phenomena.

Latest news and database accessibility

  • New database on morphological complexity, funded by the European Research Council, available here.
  • Coming soon: a new database on morphological complexity in the Oto-Manguean languages of Mexico, funded by the AHRC and ESRC.
  • To ensure their long-term sustainability, the existing databases are undergoing an infrastructure and design update, to be completed in early 2015. Consequently, some of the databases are hosted at a temporary location and, for the time being, existing bookmarks and searches for these resources may not resolve to the correct page. However, you can still access all of the databases using the links below.

Morphological complexity

This database records examples of what we have identified as morphological complexity, which we define as the morphologically-conditioned deviation between inflectional forms and the inflectional features they realize. This is manifested both within the paradigm (e.g. as syncretism or patterns of stem alternation) and across sets of lexemes (as inflection classes and lexically-conditioned allomorphy). The database covers the language families as given by Ethnologue, and provides further depth for particular hotbeds of complexity.

Syncretism database

The term 'syncretism' refers to the phenomenon whereby a single form fulfils two or more different functions within the inflectional morphology of a language. This database reports on a sample of 30 languages.

Person syncretism database

Person syncretism occurs when two or more person values are represented by a single form in the inflectional paradigm for agreement with an argument on verbs. This database reports on a sample of 111 languages.

Agreement database

Agreement is the expression of grammatical information in the 'wrong place': a relation that can be described in terms of controllers, targets, domains, categories and conditions. This database reports on a sample of 15 languages.

Suppletion database

Suppletion is a morphological phenomenon where different inflectional forms of the same sign are maximally regular in their semantics, yet maximally irregular in form. This database reports on a sample of 34 languages.

Deponency databases

Deponency describes mismatches between morphology and morphosyntax. A mismatch occurs where the word form is used in some function incompatible with its normal function. There are two deponency databases:

Short term morphosyntactic change databases

The notion of 'short term morphosyntactic change' can be used to characterise changes in the use of forms over a short period of time, even when the forms themselves have changed relatively little. There are six short term morphosyntactic change (STMC) databases covering different change phenomena in Russian over a 200-year period:

Defectiveness databases

The term 'defectiveness' refers to gaps in inflectional paradigms - specifically, gaps which do not appear to follow from natural restrictions imposed by meaning or function. There are two defectiveness databases:

Periphrasis database

Periphrasis reveals how the construction of meaning in language is apportioned between morphology ('bright' and 'brighter') and syntax ('intelligent' and 'more intelligent'). This database reports on a sample of 19 languages.

Features inventory

Features are fundamental components of linguistic description. They have proven invaluable for grammatical analysis and have a major role in contemporary linguistics.

Turning owners into actors database

Possessive morphology marking owners or custodians may be used as a source of subject-indexing marking actors or agents in the languages surrounding the Bougainville region of Papua New Guinea.

Saanich verb database

Examples from fieldwork and select recordings on the SENOŦEN language of the Saanich people collected from community elders are organised according to their role as sentences, verbs or roots.

A dictionary of the Archi (Daghestanian) language

The online version of the Archi-Russian-English dictionary contains sound files for every word form of the lexeme, digital pictures of culturally significant objects, idioms and example sentences with interlinear glossing.

The databases can be accessed without cost. Users agree not to pass on the databases to third parties and to properly acknowledge the database as the source of information in publications or manuscripts that make use of its data. When citing information obtained from a database query, please mention the authors, date, and the name of the database and address of the database homepage (all of this information is available from the database homepage). It is also a good idea to give the date you last accessed the database.

Language resources

Annotated bibliographies

Datasets

  • Download the Excel spreadsheet of nouns from the Uppsala corpus (product of the ESRC project 'Number Use in Language'). For each of the approx. 5440 noun lexemes the spreadsheet includes information on frequency, animacy (modified version of the Smith-Stark hierarchy), as well as counts for occurrences of particular case and number combinations.
    There is a readme file for the spreadsheet.
  • Diachronic datasets of Slavonic colour terms: East Slavonic, South Slavonic, West Slavonic.
  • Datasets associated with the ESRC project 'Paradigms in Use'.

DATR Fragments & theorem dumps

Statistical Models

  • View our statistical model of plural proportions from the Uppsala corpus (from the ESRC project 'Number Use in Language').

Page Created: Tuesday 4 May 2010 16:46:46 by t00356
Last Modified: Monday 8 June 2015 10:31:01 by pe0007