Paradigms in Use

Description

The relationship between frequency of use and grammar in natural language is poorly understood. To understand it better, we propose to examine textual frequency distributions in Russian, a language that encodes a reasonable number of grammatical distinctions in its word forms. We have already developed, for other purposes, a precise, computationally verified hierarchical model of Russian morphology. In this project we propose to take the next logical step: using this model to determine how far distinct categorizations within it correspond to differences in use in Russian texts. To achieve this we will look at a specific phenomenon that we have already investigated cross-linguistically: syncretism, or grammatical ambiguity, where one form can have multiple functions. The major new element in this project is a corpus-based investigation of the relationship between frequency of use and syncretism. There are different types of syncretism, and we have reflected this by locating them at different points in the hierarchy of our formal model. This makes syncretism an ideal phenomenon through which to investigate the more general and harder question of the relationship between textual frequency and grammar.

Appendix 1: Outputs

Datasets

Tiberius, Carole. 2005. Grammatical functions, inflectional class and textual frequency in Russian nominals.
Dataset available here

The following articles and presentations have arisen wholly or in part from the project:

Publications

Baerman, Matthew, Dunstan Brown and Greville G. Corbett. 2005. The syntax-morphology interface: a study of syncretism. Cambridge University Press [Monograph in the Cambridge Studies in Linguistics series]

Brown, Dunstan. Forthcoming. Peripheral Functions and Overdifferentiation: the Russian Second Locative. To appear in Russian Linguistics.

Brown, Dunstan. Forthcoming. Morphological Typology. Invited submission in Jae Jung Song (ed.) Handbook of Linguistic Typology. Oxford: Oxford University Press.

Brown, Dunstan. 2005. Declension and conjugation. In D. Alan Cruse, Franz Hundsnurscher, Michael Job and Peter Rolf Lutzeier (eds.) Lexicology: An international handbook on the nature and structure of words and vocabularies. Volume 2. New York: Walter de Gruyter. 1646-1655.

Brown, Dunstan, Carole Tiberius and Greville Corbett. (submitted). The Alignment of Form and Function: Corpus-Based Evidence. Submitted to International Journal of Corpus Linguistics.

Brown, Dunstan, Carole Tiberius and Greville Corbett. 2004. Inflectional Syncretism and Corpora. In Silvia Hansen-Schirra, Stephan Oepen and Hans Uszkoreit (eds.) Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora. 29 August. Coling 2004. Geneva. 11-18.

Evans, Roger, Carole Tiberius, Dunstan Brown and Greville Corbett. 2003. A large-scale inheritance-based morphological lexicon for Russian. In Tomaž Erjavec and Duško Vitas (eds.) Proceedings of the Workshop on Morphological Processing of Slavic Languages, 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17. EACL: Budapest. 9-16.

[Also available online as NLTG Technical Report at http://www.itri.bton.ac.uk/techreports/index.html#ITRI-03-02]


Presentations

Baerman, Matthew, Dunstan Brown, Marina Chumakina, Greville Corbett, Andrew Hippisley and Carole Tiberius. 2004. The Surrey Typological Databases. Paper presented at the Annual Meeting of the LTRC Network, Freie Universität Berlin, 2-3 October, 2004.

Brown, Dunstan. 2003a. Peripheral functions: the Russian Second Locative. Paper presented at the Workshop on Paradigm Irregularities, Manchester University, 10-12 April, 2003.

Brown, Dunstan. 2003b. Inflectional syncretism and Network Morphology. Paper presented at the First York-Essex Morphology Meeting (YEMM), University of York, 29-30 November, 2003.

Brown, Dunstan, Greville Corbett and Carole Tiberius. 2003. The asymmetry of syncretism: how theory plays out in a corpus. Paper presented at the Autumn Meeting of the LAGB. Oxford, 4-6 September, 2003.

Brown, Dunstan and Carole Tiberius. 2004. Syncretism and textual frequency in Russian. Paper presented at ITRI seminar. ITRI, University of Brighton, United Kingdom. 10 June, 2004.

Brown, Dunstan and Carole Tiberius. 2005a. On the adequacy of frequency as a predictor of syncretism: an example from Russian. Paper presented at SOAS Linguistics Department Seminar, SOAS, London, 3 May, 2005.

Brown, Dunstan and Carole Tiberius. 2005b. Frequency and the alignment of form and function. Paper presented at the LAGB annual meeting. Cambridge University, 31 August – 3 September, 2005.

Brown, Dunstan, Carole Tiberius and Greville Corbett. 2005. Theoretical morphology viewed from corpus linguistics: the example of Russian syncretism. Paper presented at British Association of Slavonic and East European Studies annual conference, Cambridge University, 2-4 April, 2005.

Evans, Roger, Dunstan Brown and Carole Tiberius. 2003. DATR as a corpus analysis tool for Russian. Paper presented at Morphology Meeting. University of Surrey. United Kingdom. 12 February, 2003.

Evans, Roger, Carole Tiberius, Dunstan Brown and Greville Corbett. 2003a. A large-scale inheritance-based morphological lexicon for Russian. Paper presented at the Workshop on Morphological Processing of Slavic Languages, 10th Conference of the European Chapter of the Association for Computational Linguistics, April 13. EACL: Budapest, Hungary.

Evans, Roger, Carole Tiberius, Dunstan Brown and Greville Corbett. 2003b. Russian Lemmatisation with DATR. Paper presented at SLOVKO 2003: second international seminar on computer treatment of Slavonic languages. Bratislava, Slovakia. 24-25 October, 2003. [Available as NLTG Technical Report at http://www.itri.bton.ac.uk/techreports/index.html#ITRI-03-23]

Tiberius, Carole, Dunstan Brown and Greville Corbett. 2003. Ambiguity in Russian morphology. Paper presented at Corpus Linguistics 2003. Lancaster University, United Kingdom. 30 March, 2003. [Abstract in Dawn Archer, Paul Rayson, Andrew Wilson and Tony McEnery (eds.) Proceedings of Corpus Linguistics 2003. University Centre for Computer Corpus Research on Language Technical Papers Volume 16 – Special issue. Lancaster. 790]