A Computer Implementation of Russian Derivational Morphology

represented in DATR

 

Project funded by the Leverhulme Trust (F.242M)

directed by Professor G. Corbett and Dr. N. Fraser

University of Surrey

Research fellow: Andrew Hippisley

Completed 1995

 

SUMMARY:

The project was designed to produce a formally explicit account of a substantial fragment of the word-stock of Russian, encoded in such a way that it could be computationally checked. The formalism chosen was DATR (Evans and Gazdar 1989), a computable lexical knowledge representation language that captures lexical knowledge in terms of default inheritance structures. The area under investigation was morphology (the structure of words) and, in particular, derivational morphology (the building of new words from old). A significant number of publications have followed, as well as numerous presentations (see section 7, Outputs). The project was significantly strengthened through cross-fertilisation of ideas with a parallel ESRC project on Russian inflectional morphology, resulting in two joint publications: Brown and Hippisley (1994) and Brown et al. (1996).

1. Ground covered

1.1. a substantial fragment of Russian derivational morphology in machine readable form

This is the goal around which the whole project revolved; its achievement, on both the qualitative and the quantitative level, marked the formal end of the project.

 

The qualitative objective has been met by:

 

(i) representing productive derivation formally;

 

(ii) focusing on suffixation (the main operation in Russian derivation);

 

(iii) analysing nouns and adjectives (which represent most derivatives).

 

On the quantitative level, we have compiled a lexicon of approximately 4,000 words (derivatives and their bases) whose forms are correctly generated by the DATR theory (see section E.2). This satisfies one of the outcomes promised in the proposal.

 

1.2. a hierarchical organisation of the facts

The English suffix -ness productively derives nouns from adjectives, as in good > goodness, whereas -th is limited to warmth and a few others. This can be expressed in DATR by making -ness available as a default to all adjectives. Exceptions like -th can be made to override the default. The facts of Russian are dealt with similarly in our DATR theory, though they are somewhat more complex in their detail.
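The default-and-override idea in the -ness/-th example can be sketched outside DATR. The following Python fragment is only an illustration of default inheritance, not the project's DATR theory; the class `Node`, the attribute name `noun_suffix` and the function `abstract_noun` are hypothetical names introduced for this sketch.

```python
class Node:
    """A lexical node that inherits any fact it does not state itself."""

    def __init__(self, name, parent=None, **facts):
        self.name = name
        self.parent = parent
        self.facts = facts

    def get(self, attribute):
        # Most specific information wins: look locally first,
        # then climb the inheritance chain.
        node = self
        while node is not None:
            if attribute in node.facts:
                return node.facts[attribute]
            node = node.parent
        raise KeyError(attribute)


# Default stated once, at the word-class node (hypothetical attribute name).
ADJECTIVE = Node("Adjective", noun_suffix="ness")

# 'good' says nothing about its suffix, so it inherits the default.
good = Node("good", parent=ADJECTIVE, stem="good")

# 'warm' states its own suffix, overriding the default.
warm = Node("warm", parent=ADJECTIVE, stem="warm", noun_suffix="th")


def abstract_noun(adjective):
    """Build the deadjectival noun from stem plus inherited or local suffix."""
    return adjective.get("stem") + adjective.get("noun_suffix")


print(abstract_noun(good))
print(abstract_noun(warm))
```

The exception costs one extra fact on a single lexical entry, while every regular adjective needs no statement about its suffix at all.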

 

1.3. new insights into specific problem areas of Russian derivational morphology

(i) A DATR account of rival affixes forms the basis of Hippisley (1995b). Usually several affixes fulfil the same function. The selection of the correct rival suffix may depend on the word-class of the base. Russian -tel´ and -n´ik both create person nouns, but the former attaches to verbs and the latter to nouns. For example, učit´ 'teach' is a verb, hence the person noun will be učitel´ 'teacher', but sapog(a) 'boot' is a noun, and therefore the person noun is sapožn´ik 'cobbler'. This is encoded in our DATR account by means of nodes representing the major word-classes, from which bases are marked to inherit, hence capturing in a natural way Aronoff's (1976) Unitary Base Hypothesis. Alternatively, selection may depend on the semantics of the base. The relational adjective of učitel´ 'teacher' is učitel´sk(ij) but that of šum 'noise' is šumov(oj), because -sk attaches to bases which have the semantic feature 'person'.
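The two selection criteria just described, word-class of the base and semantics of the base, can be illustrated with a small Python sketch. It is not taken from the DATR fragment: the `Lexeme` class and both functions are hypothetical, only suffixes are selected (stem changes such as g > ž in sapožn´ik are not modelled), and treating -ov as the alternative to -sk rests solely on the single example šumov(oj).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    form: str
    word_class: str                    # "verb" or "noun"
    features: frozenset = frozenset()  # semantic features, e.g. {"person"}


# Person-noun suffix chosen by the word-class of the base.
PERSON_SUFFIX_BY_CLASS = {"verb": "-tel´", "noun": "-n´ik"}


def person_suffix(base):
    """Select the person-noun suffix from the base's word-class."""
    return PERSON_SUFFIX_BY_CLASS[base.word_class]


def relational_suffix(base):
    """Select -sk for bases marked 'person'; -ov otherwise (assumption)."""
    return "-sk" if "person" in base.features else "-ov"


ucit = Lexeme("učit´", "verb")
sapog = Lexeme("sapog(a)", "noun")
ucitel = Lexeme("učitel´", "noun", frozenset({"person"}))
sum_ = Lexeme("šum", "noun")

print(person_suffix(ucit), person_suffix(sapog))
print(relational_suffix(ucitel), relational_suffix(sum_))
```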

 

(ii) A derivative's semantics is composed of the semantics of the deriving word and the function of the derivation; for example, lopat(a) means 'spade', and lopatk(a) 'little spade', where 'diminutive' is realised by -k. However, lopatk(a) also has the unpredictable meaning 'shoulder-blade'. The diminutive meaning is the default, and the 'semantic drift' is captured by encoding the additional meaning lexically.
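A minimal Python sketch of this division of labour follows: the compositional meaning is computed by rule, and any drifted meaning is listed on the lexical entry. The names `LEXICAL_MEANINGS` and `diminutive_meanings` are hypothetical, and the sketch says nothing about how the DATR theory itself stores glosses.

```python
# Unpredictable meanings are listed per derivative (lexically encoded).
LEXICAL_MEANINGS = {
    "lopatk(a)": ["shoulder-blade"],
}


def diminutive_meanings(derivative, base_gloss):
    """Return the default compositional gloss plus any listed extras."""
    compositional = f"little {base_gloss}"  # base semantics + 'diminutive'
    return [compositional] + LEXICAL_MEANINGS.get(derivative, [])


print(diminutive_meanings("lopatk(a)", "spade"))
```

A derivative with no entry in the table receives only the default diminutive reading.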

 

(iii) A productive derivational pattern is one that is operative at the current time. For example, the productive derivation of person nouns is in -n´ik, as in sapog(a) 'boot' > sapožn´ik 'cobbler' (see (i) above). In our account, productive patterns are naturally accommodated as high-level defaults in the inheritance hierarchy. Some nouns, however, do not conform to the productive pattern. For example, rib(a) 'fish' conforms to a rival (unproductive) pattern where person derivation is marked by -ak, i.e. ribak 'fisherman'. Here we mark the alternative derivation as part of rib(a)'s lexical characteristics, thereby overriding the default in favour of more specific information. Thus we account, in terms of defaults and overrides, for the way in which the putative derivative *ribn´ik has been blocked by the already existing ribak.
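Blocking as default override can be sketched in Python with the standard library's `collections.ChainMap`, which consults a lexeme's own entry before falling back to shared defaults. This is an illustration under our own assumptions, not the DATR encoding; `DEFAULTS`, `LEXICON` and `lookup` are hypothetical names.

```python
from collections import ChainMap

# High-level default: the productive person-noun suffix.
DEFAULTS = {"person_suffix": "-n´ik"}

# Lexical entries state only what is idiosyncratic.
LEXICON = {
    "sapog(a)": {"gloss": "boot"},
    "rib(a)": {"gloss": "fish", "person_suffix": "-ak"},  # lexical override
}


def lookup(lexeme, attribute):
    """Search the lexeme's own entry first, then the shared defaults."""
    return ChainMap(LEXICON[lexeme], DEFAULTS)[attribute]


print(lookup("sapog(a)", "person_suffix"))
print(lookup("rib(a)", "person_suffix"))
```

Because the lookup returns exactly one suffix per lexeme, the override on rib(a) means the default -n´ik is never consulted for it, which is the sense in which *ribn´ik is blocked.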

 

2. Personnel

Directors: Prof. G.G. Corbett, Dr. N. Fraser.

Research Fellow: Mr A. Hippisley.

Advisors (and speciality): Prof. G. Gazdar (DATR), Dr. A. Spencer (Morphology), Prof. A. Timberlake (Russian phonology).

Visitors: Tore Nesset (Oslo), Prof. J. Nørgård-Sørensen (Copenhagen), Prof. D. Wunderlich (Düsseldorf).

 

3. Conclusions

The project has successfully achieved the goals defined at its outset.

 

3.1. formal account of Russian derivation

This has been achieved by the production of a substantial fragment of Russian derivational morphology encoded in DATR, as outlined in 1.1.

 

3.2. linguistically significant generalisations about Russian

Since there is rarely a one-to-one correspondence between meaning and affix, a lexeme-based approach elegantly captures the situation in Russian. Affixes do not contain meaning as such; rather, a certain interpretation is predicted by their presence in a lexeme. Hence the lexeme is the minimal linguistic sign. This has been the theme of presentations before and after the close of the project (see section C).

 

3.3. generalisations about language: the interaction of inflection and derivation

Exploring how derivational and inflectional morphology interface with each other has led to publication on Russian expressive derivation (Hippisley 1995a; Hippisley forthcoming). We have compared our approach to that of the parallel ESRC project on inflection:

 

• The derivation hierarchy 'feeds' the inflection hierarchy, reflecting the fact that in a word's structure, derivation precedes inflection: in t´iran-stv(o) 'tyranny', -stv is derivational and (o) inflectional (marking singular nominative).

 

• In exceptional circumstances, the inflectional hierarchy feeds the derivational hierarchy by providing it with formal gender, which is an inflectional property, as demonstrated in Fraser & Corbett (1995), Corbett & Fraser (2000) and Hippisley (1996).
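The usual direction of feeding, derivation before inflection, can be pictured as function composition. The Python sketch below covers only the t´iran-stv(o) example; the function names are hypothetical, the base stem t´iran is inferred from the segmentation given above, and the exceptional case in which inflection supplies gender to derivation is not modelled.

```python
def derive_abstract_noun(base_stem):
    """Derivational step: attach -stv to the base stem."""
    return base_stem + "stv"


def inflect_nominative_singular(derived_stem):
    """Inflectional step: attach the ending (o) to the derived stem."""
    return derived_stem + "o"


def realise(base_stem):
    """Derivation feeds inflection: the derived stem is inflection's input."""
    return inflect_nominative_singular(derive_abstract_noun(base_stem))


print(realise("t´iran"))
```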

 

4. Significance

practical

The machine-readable account of Russian derivation and the accompanying substantial lexicon make an important contribution to the field of natural language processing, all the more so because they concern one of the world's major languages.

theoretical

Most work on DATR to date relates to inflection; our contribution is one of the few derivational accounts. Moreover, ours is the first lexeme-based (as opposed to morpheme-based) account of which we are aware (see 3.2.).

future work

• Based in part on work done in this project, we are now developing the theory of Network Morphology, for which an ESRC grant has been awarded.

• Under the British Council-DAAD ARC scheme, we will be sharing findings in morphology with a project at Düsseldorf.

• An ESRC funded seminar series 'Frontiers of Research on Morphology' has been set up with Surrey as co-ordinator; in that context, results from this project provide one focus for debate in the group, which includes researchers in morphology from the Universities of Brighton, Essex, London (SOAS) and Sussex.

 

5. Evaluation

5.1. successful aspects

• considerable size and scope of fragment of Russian derivational morphology

• coherent account of key derivational issues

• demonstration of a lexeme-based approach and its significance for linguistics generally

• derivation ~ inflection interface (successful collaboration with ESRC project)

• document recording the significance of DATR for linguists, promised as an outcome (see section E.2).

 

5.2. less successful aspects

Because of the magnitude of this subject, it was inevitable that we would miss some areas; for example, we have not fully investigated the role of prefixation and of mutations in derivation.

 

6. References

Aronoff, Mark. 1976. Word Formation in Generative Grammar. Cambridge, Mass: MIT Press.

Brown, D. and Hippisley, A. 1994. Conflict in Russian Genitive Plural Assignment: A Solution Represented in DATR. Journal of Slavic Linguistics 2, 1 (winter - spring): 48-76.

Brown, D., Corbett, G., Fraser, N., Hippisley, A. and Timberlake, A. 1996. Russian Noun Stress and Network Morphology. Linguistics 34, 53-107.

Corbett, G. and Fraser, N. 2000. Default genders. In: Barbara Unterbeck, Matti Rissanen, Terttu Nevalainen & Mirja Saari (eds) Gender in Grammar and Cognition (Trends in Linguistics: Studies and Monographs 124). Berlin: Mouton de Gruyter, 55-97. Reprinted 2002 in the Mouton Jubilee collection “Mouton Classics: From Syntax to Cognition: From Phonology to Text”, volume I, 297-339.

Evans, Roger and Gazdar, Gerald. 1989. Inference in DATR. Proceedings of the 4th Conference of the European Chapter of the Association for Computational Linguistics, 66-71. Manchester, England.

Fraser, N. and Corbett, G. 1995. Gender, animacy and declensional class assignment: a unified account for Russian. In: Geert Booij and Jaap van Marle (eds) Yearbook of Morphology 1994, 123-50. Dordrecht: Kluwer.

Hippisley, A. 1995a. Nasledovanie po umolčaniju i slovoobrazovanie: Èkspressivnaja derivacija v russkom jazyke, predstavlennaja v DATR/Expressive derivation in Russian represented in DATR. [Abstract] In: A. E. Kibrik, I. M. Kobozeva, A. I. Kuznecova, T. B. Nazarova (eds) Lingvistika na isxode XX veka: itogi i perspektivy: Tesisy meždunarodnoj konferencii: tom I, 1995, 524-526. Moscow: Filologičeskij fakultet MGU imeni M. V. Lomonosova.

Hippisley, A. 1995b. Default Inheritance and Russian Word-Formation: an Account of Russian Denominal Adjectives Represented in DATR. Under consideration by Canadian Slavonic Papers.

Hippisley, A. 1996. Russian Expressive Derivation: a Network Morphology Account. Slavonic and East European Review 74: 201-222.

Švedova, N. Ju. et al. (eds) 1980. Russkaja grammatika vol. I. Moscow: AN SSSR.

Tixonov, A. N. 1985. Slovoobrazovatel´nyj slovar´ russkogo jazyka. Moskva: Russkij jazyk.

Zaliznjak, A. A. 1977. Grammatičeskij slovar´ russkogo jazyka. Moscow: Russkij jazyk.

 

7. Outputs

For the published outputs, see below and for the DATR fragments, go to the Sussex DATR pages and search for ‘Hippisley’ (there are several entries).

 

Publications (in chronological order)

Corbett, G. and Fraser, N. 1993. Network Morphology: a DATR account of Russian nominal inflection. Journal of Linguistics 29: 113-42.

Brown, D. and Hippisley, A. 1994. Conflict in Russian Genitive Plural Assignment: A Solution Represented in DATR. Journal of Slavic Linguistics 2, 1 (winter - spring): 48-76.

Corbett, G. 1994. Systems of Grammatical Number in Slavonic. Slavonic and East European Review, 72, 2. 201 - 17.

Brown, D., Corbett, G., Fraser, N., Hippisley, A. and Timberlake, A. 1996. Russian Noun Stress and Network Morphology. Linguistics 34, 1.

Corbett, G. and Fraser, N. 1995. Vyčislitel´naja lingvistika i tipologija/Computational linguistics meets typology. [Abstract] In: A. E. Kibrik, I. M. Kobozeva, A. I. Kuznecova, T. B. Nazarova (eds) Lingvistika na isxode XX veka: itogi i perspektivy: Tesisy meždunarodnoj konferencii: tom I, 256-258. Moscow: Filologičeskij fakultet MGU imeni M. V. Lomonosova.

Fraser, N. and Corbett, G. 1995. Gender, animacy and declensional class assignment: a unified account for Russian. In: Geert Booij and Jaap van Marle (eds) Yearbook of Morphology 1994, 123-50. Dordrecht: Kluwer.

Hippisley, A. 1995a. Nasledovanie po umolčaniju i slovoobrazovanie: Èkspressivnaja derivacija v russkom jazyke, predstavlennaja v DATR/Expressive derivation in Russian represented in DATR. [Abstract] In: A. E. Kibrik, I. M. Kobozeva, A. I. Kuznecova, T. B. Nazarova (eds) Lingvistika na isxode XX veka: itogi i perspektivy: Tesisy meždunarodnoj konferencii: tom I, 1995, 524-526. Moscow: Filologičeskij fakultet MGU imeni M. V. Lomonosova.

Hippisley, A. 1996. Network Morphology and Russian Expressive Derivation. Slavonic and East European Review 74, 2.