The Features project, begun at the University of Surrey in November 2004, aims to deepen the knowledge of the linguistic concept 'feature' by bringing together typological research on the content of features with formal work on their behaviour. One of the objectives of the project has been to produce an inventory of morphosyntactic features, listing features proposed, with sources pointing to the decisive evidence. The inventory was envisaged as a stepping-stone for asking: What can be a feature? What features occur across different components? How do features interact? What potential features, as inferred from the patterns of the occurring features, appear to be missing from the feature inventory? The typological approach was meant to help ensure that any proposed feature theory could meet the range of diversity found in natural language.
Below you will find the inventory of features which we have found to be morphosyntactic, and a list of others which we investigated but for whose morphosyntactic status we found no evidence. The catalogue of the various types and uses of features has constituted the basis for our theoretical conceptualisation of the notion 'feature'. It has helped us demonstrate the type of features on which linguistic theory can legitimately call and the implications of adopting different theoretical perspectives on features while using them for the same descriptive goals. Many interim results of the project have been presented at conferences, invited talks and lectures, and in a number of published papers (see our Outputs page for a detailed list). A monograph on Grammatical Features is in preparation.
We will be very glad to receive comments or corrections regarding the Inventory of morphosyntactic features proposed on these pages. If you would like to cite the material provided here, we have included information about how to do it at the bottom of this webpage. Thank you.
The project has been supported by the ESRC (grant number RES-051-27-0122). This support is gratefully acknowledged.
– Anna Kibort & Greville Corbett
|What are morphosyntactic features|
|Inventory of morphosyntactic features|
|Our world-wide query about features|
|Grammatical Features website|
|Conventions for glossing feature values|
by Anna Kibort
Inflected words show variation in form. The different forms are correlated with meanings or functions which we label as 'features'. However, not all features that are identified through inflectional morphology are morphosyntactic. The most basic definition of a morphosyntactic feature is a feature which is relevant to syntax. For a feature to be 'relevant to syntax' means that it is involved in either syntactic agreement or government. Gender, number, and person are involved in agreement in a large number of languages; therefore they are typical morphosyntactic features. However, while in many familiar languages the feature 'tense' encodes regular semantic distinctions, it is not required by the syntax through the mechanisms of either agreement or government: syntax is not sensitive to the tense value of the verb. Therefore, many familiar instances of the feature 'tense' are morphosemantic, but not morphosyntactic.
Features and values
In discussion of features, labels such as 'gender', 'person', or 'tense' are often used to refer both to the value of the feature and to the feature as such. For example, the term 'gender' is used both for the particular classes of nouns (so, a language may have two or more genders) and for the whole grammatical category (so, a language may or may not have the category of gender). Similarly, we refer to an 'inventory of features' (meaning categories, or features as such), while at the same time we talk about 'feature checking' or 'unification of features' in syntax (meaning checking or unifying feature specifications, i.e. feature values). However, it is important to remember the distinction between 'features' and their 'values' while attempting to construct any taxonomy or typology of features, because the characteristics or behaviour of the feature as such will not be the same as the characteristics of a feature value.
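The distinction between a feature and its values can be illustrated with a minimal Python sketch. The class name `Feature`, the helper functions, and the three-gender language are hypothetical and serve only to show that questions about the category and questions about its values are separate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A feature 'as such' (a grammatical category), e.g. gender."""
    name: str
    values: frozenset  # the values available in one particular language


# Hypothetical three-gender language, for illustration only.
GENDER = Feature("gender", frozenset({"masculine", "feminine", "neuter"}))


def has_feature(language_features, name):
    """Does the language have the category at all? (a question about the feature)"""
    return any(f.name == name for f in language_features)


def count_values(feature):
    """How many 'genders' does the language have? (a question about the values)"""
    return len(feature.values)


# A feature specification pairs a feature with ONE of its values;
# this is the kind of object that 'checking' or 'unification' operates on.
specification = (GENDER.name, "feminine")
```

In this layout, "the language has gender" is a statement about `Feature` objects, while "the language has three genders" is a statement about the contents of `values`.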
The relationship between the concept of 'gender' and the concepts 'masculine', 'feminine', 'neuter'; or between the concept 'case' and the concepts 'nominative', 'accusative', 'genitive', etc., has been referred to with the following pairs of terms (based on Carstairs-McCarthy 1999:266-267, expanded):
Following Zwicky (1985), we use the terms 'feature' and 'value', which correspond to Matthews's (1972/1991) terms 'category' and 'property/feature', respectively. Although the concepts 'masculine', 'feminine', 'neuter'; or the concepts 'nominative', 'accusative', 'genitive', etc., are all 'values', further questions can be asked about the relationship between them. One question concerns the partitioning of the feature space in general between the available values (see, for example, attempts to arrive at definitions of gender and number values for an ontology of linguistic description), another concerns the structuring within the values available for a particular feature in a particular language (see, for example, the structuring of gender values discussed in Corbett 1991, or the structuring of number values discussed in Corbett 2000).
Establishing an inventory of features and values for a language can be a complex issue. An example of a language in which justifying a feature requires careful analysis is Archi (Daghestanian), and the feature in question is person. This language has no phonologically distinct forms realising person and the standard description of Archi does not involve the feature person (only gender and number). However, the agreement patterns in Archi indicate that this language does require us to recognise a person feature, even though it is a non-autonomous one (Chumakina, Kibort & Corbett 2007).
There are also many examples in the literature of argumentation regarding the number of values for various features in different languages. In an inferential-realisational approach to inflectional morphology, which we adopt after Stump (2001), we identify the realisations of feature values by establishing a paradigm correlating inflected forms with morphosyntactic properties. The cells in a lexeme's paradigm are regarded as pairings of a stem with a morphosyntactic property (or a morphosyntactic property set), to yield an inflected word form which is the realisation of the pairing. Examples of how to establish the paradigm for case in Russian can be found in Zaliznjak (1973) and Comrie (1986). Following on from the work of Schenker (1955) and Zaliznjak (1964), Corbett (1991: Chapter 6) describes principles for determining the number of gender values in a language, giving sample analyses of gender in Romanian, Telugu, Lak, Tamil, Chibemba, Slovene, Tsova-Tush, and other languages.
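The idea of a paradigm cell as a pairing of a stem with a morphosyntactic property set can be sketched in Python. This is only an illustration under stated assumptions: the forms are a transliterated fragment of the singular of Russian kniga 'book', while the names `STEM`, `EXPONENTS`, `realise`, and `paradigm` and the dictionary layout are hypothetical and are not taken from Stump (2001) or the other sources cited.

```python
# Stem of the lexeme (transliterated Russian kniga 'book').
STEM = "knig"

# Hypothetical realisation rules: property set -> exponent.
# Only three singular cells are listed; the full paradigm is larger.
EXPONENTS = {
    frozenset({("case", "nominative"), ("number", "singular")}): "a",
    frozenset({("case", "accusative"), ("number", "singular")}): "u",
    frozenset({("case", "genitive"), ("number", "singular")}): "i",
}


def realise(stem, properties):
    """Return the word form that realises the pairing <stem, property set>."""
    return stem + EXPONENTS[frozenset(properties.items())]


def paradigm(stem):
    """Map every listed cell (property set) to its inflected form."""
    return {cell: stem + exponent for cell, exponent in EXPONENTS.items()}


print(realise(STEM, {"case": "accusative", "number": "singular"}))
```

The sketch presupposes that the analyst has already decided which values to posit; it does not address how the number of values is established.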
Agreement and government
Both agreement and government are concepts that are necessary to describe inflectional morphology. Both involve specifying, or determining, a feature value on an element in the clause. In the case of agreement we call this element 'target', and in the case of government we call it 'governee'. In both agreement and government the demand for the specific feature value comes from elsewhere (i.e. not from the target or the governee): it comes from a 'controller' (in the case of agreement), or from a 'governor' (in the case of government). In this way, agreement and government 'share the characteristic of being syntactic relations of an asymmetric type' (Corbett 2006:8). However, while a controller of agreement bears the feature value it requires of its target (this has been referred to as 'feature matching'), a governor does not bear the feature value it requires of its governee (we can refer to this as 'feature branding'). Despite this general principle, note that agreement mismatches may occur for various reasons (see Corbett 2006: Chapter 5), and a governor may have the relevant feature specification coincidentally.
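The contrast between 'feature matching' and 'feature branding' can be made concrete with a small Python sketch. The dictionaries standing in for lexical entries, the key `requires`, and the function names are assumptions made for illustration; they do not represent any particular formal framework.

```python
def agree(controller, target, feature):
    """Agreement ('feature matching'): the target receives the value
    that the controller itself bears."""
    updated = dict(target)
    updated[feature] = controller[feature]
    return updated


def govern(governor, governee, feature):
    """Government ('feature branding'): the governor demands a value
    that it need not bear itself."""
    updated = dict(governee)
    updated[feature] = governor["requires"][feature]
    return updated


# Hypothetical entries, for illustration only.
noun = {"form": "N", "gender": "feminine", "number": "singular"}
adjective = {"form": "A"}
preposition = {"form": "P", "requires": {"case": "genitive"}}

adjective = agree(noun, adjective, "gender")         # value copied from the controller
governed_noun = govern(preposition, noun, "case")    # value imposed by the governor
```

Both functions are asymmetric in the sense described above: the value flows from controller to target or from governor to governee, never the other way round. Only in `agree` does the source element carry the value itself.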
Both agreement and government can apply to more than one element in the clause simultaneously, resulting in multiple occurrence of the same feature specification in the domain. In agreement, we find that an element may control a set of targets in the clause (and beyond). In government, we find that an element typically governs a unit consisting of one or more elements. The most familiar example of government of a feature over a unit is the assignment of case to (the elements within) a noun phrase. When a noun and its adjectival modifier are in the same case, it is because the case value is imposed on both simultaneously. Corbett (2006:133-135) discusses the possibility of viewing this type of feature multirepresentation as agreement and concludes that it 'will not count as canonical agreement, if we take seriously the issue of asymmetry'. If one accepts the view of syntax which is based on the notion of constituency, when the noun and its modifier are a constituent 'it follows that we have matching of features within the noun phrase resulting from government (rather than agreement in case)'. Note that the same analysis holds for languages which allow more than one case to be stacked. This is characteristic of many Australian languages including Kayardild (Evans 1995) whose multiple case marking has been offered for consideration as agreement (Evans 2003). Following the heuristic developed for the feature inventory, some types of multiply represented cases in Kayardild have now been re-classified as instances of government of case over a noun phrase (specifically, this analysis applies to the multiple case marking of some relational, adnominal, and verbalising cases in Kayardild, as well as to the multiple marking of associating case, which is a type of case governed by the nominalised verb over the nominal phrase it heads; for details see Evans 1995 and 2003, and for discussion and reanalysis see Kibort forthcoming).
Multirepresentation of morphosemantic features
Apart from agreement and government, we find one other source of simultaneous inflectional marking of the same information on more than one element in the domain: semantic choice. We find that the same feature value may be assigned on the basis of semantics to several elements which are members of a constituent or an 'informational unit', e.g. a noun phrase, verb phrase, verbal complex, or the clause. In this situation, multiple elements will be expressing the same value of a morphosemantic feature simultaneously. The 'rule' that determines which elements have to bear particular inflections is found in the lexicon in the form of a generalisation over relevant parts of speech or a subclass within a part of speech. The clearest examples of a feature value which is semantically imposed on several elements simultaneously are: semantic case imposed on a noun phrase, or a verbal feature imposed on (the elements making up) a verbal complex.
It is also possible that simultaneous marking of the same information on more than one element in the clause could be due to a semantic or pragmatic choice made at each element individually for the same semantic or pragmatic reason ('what's once true stays true'). In other words, the multirepresentation of a feature value in the clause could be due to coinciding individual semantics of the elements bearing the feature value. A clear example of this type of multirepresentation of a feature value would be a feature of respect whose marking could be justified semantically on every element on which it appears. Corbett (2006:137) remarks: 'There are ... languages where the existence of multiple honorifics suggests an agreement analysis, but where it is not clear that this is justified. It may be argued that each honorific is determined on pragmatic grounds (and that they agree only in the sense that they are being used in the same pragmatic circumstances).'
Finally, some semantically justified multiple marking of information in the domain may, arguably, not be an expression even of a morphosemantic feature. This applies to instances such as the so-called 'negative concord', where the principal marker of information (negation) is there or not, and when it is there, it requires the presence of the second negation marker. Arguably, the phenomenon does not qualify to be a feature because the 'positive' polarity is not information that can be assigned to a value - it is, rather, simply lack of information. Corbett (2006:29) suggests that where the selection of additional information requires simply that it has to be repeated somewhere else in the clause, such instances can be termed 'concord'. (In order to determine whether such phenomena are indeed not features, or whether they are perhaps less canonical features, we would have to analyse them within a canonical framework. This work is in preparation.)
The distinction between multirepresentation of inflectional information which is due to agreement or government, and multirepresentation of inflectional information which is due to semantic choice, makes it possible to classify the remaining problematic phenomena in Kayardild (Evans 1995; 2003) as instances of multirepresentation of morphosemantic features (of semantic case, and tense-aspect-mood-polarity), but not as agreement phenomena. Hence, Kayardild modal case is a component of the tense-aspect-mood-polarity (TAMP) marking, with the particular TAMP value selected for the clause for semantic reasons, and Kayardild complementising case is a type of semantic case assigned to a noun phrase for semantic reasons. Multirepresentation of case values in Kayardild is due to the generalisation in the lexicon that nominal elements have to bear all cases they are assigned (in general, 'case suffixes appear on all words over which they have semantic or syntactic scope', Evans 1995:103). Furthermore, Kayardild multiple TAMP inflection on elements bearing verbalising case is due to a particular value of TAMP having been selected for the clause following a semantic choice, and to the requirement that TAMP be marked on all elements of the verbal complex. Similarly, multiple TAMP inflection on elements in a verbal group/complex is due to a particular value of TAMP having been selected for the clause following a semantic choice, and to the requirement that TAMP be marked on all elements of the verbal group (verbal complex). These conclusions are consistent with the widely held view that tense, aspect, mood, and polarity are features of the clause (rather than being assigned to lexical items). Therefore, tense, aspect, mood, and polarity have not been included in the inventory of morphosyntactic features (for more discussion see Kibort forthcoming).
Inherent versus contextual features
Having established the inventory of values to draw from, and identified the feature value on the element we are analysing, we can compare the sources of feature specifications found on different elements. The feature value may arise from within the element itself, in which case it is inherent, or it may be determined by some other element, in which case it is contextual. The inherent versus contextual distinction was first proposed in the description of inflection (Zwicky 1986a; Anderson 1982; Booij 1994, 1996). Roughly, contextual inflection is 'dictated by syntax', while inherent inflection is 'not required by the syntactic context, although it may have syntactic relevance' (Booij 1996:2). This classification is not absolute but relative to particular word classes. Corbett suggests that the inherent versus contextual distinction can be applied to features in general, specifically, that it 'concerns the feature in relation to where it is realized' (2006:123). Thus, a contextual feature can be defined as 'dictated by syntax', while an inherent feature can be defined as 'not required by the syntactic context (for the particular item), although it may have syntactic relevance'.
Inherent feature specification can be thought of as expressing information that logically belongs to, or arises from within the element on which it is found, while contextual feature specification can be thought of as expressing information that logically originates outside the element on which it is found (in agreement, we call this information 'displaced', and in government, we can refer to this information as a 'brand mark'). Thus, features found on controllers of agreement are inherent features, while features found on agreement targets and on governees are contextual features.
Examples of contextual features of agreement:
Examples of contextual features of government:
Assignment of a feature value
The term 'assignment' with respect to feature values was first used by Corbett (1991) in his discussion of mechanisms for allotting nouns to different genders. Native speakers have the ability to 'work out' the gender of a noun, and models of this ability have been called 'gender assignment systems'. So far, the concept of 'assignment of a feature value' has not been used outside gender. For some features, there may not be as much to say as for gender. However, using the concept of 'assignment' with respect to the values of all features is necessary to be able to compare the features.
Therefore, for this typology of features, I have adopted the following definitions:
A phylogenetic tree of feature specifications
We can now construct a phylogenetic tree for the different types of feature specifications identified so far. The tree includes the inherent versus contextual distinction, and within the contextual assignment, distinguishes between feature values determined through agreement and feature values determined through government:
A further question that can be asked about inherently assigned feature values is whether the value is lexically supplied to the element, or whether it has been selected from a range of available values. The following example will illustrate the distinction: both gender and number values are inherently assigned to nouns, they logically 'belong to' nouns and 'originate from within' nouns when they are used to demand matching agreement on targets. But they are different in that a gender value is typically fixed to a particular noun, while a number value is typically not fixed, but selected from a set of options. Furthermore, since inherent feature values are 'not dictated by syntax', they can also be found on elements other than controllers of agreement. Examples include features such as tense, aspect, mood, polarity, transitivity, evidentiality, voice, topic, focus, and other nominal and verbal features which can be expressed through inflectional morphology in various languages. For the majority of these features, the feature value found on the element is a value selected from a range of values available in the language. So, for example, an inflectional tense marker on a verb can be regarded as expressing an inherently assigned value of the feature tense, selected from a range of options. Thus, the phylogenetic tree of feature specifications can be extended to include the fixed versus selected distinction within inherent feature specification:
Finally, one more distinction can be made within both types of inherently assigned feature values: that between formal and semantic assignment. Frequently, instances of the assignment of a fixed or a selected feature value to an element can be identified as either formally or semantically determined. In many instances the formal and semantic type of assignment coincide. Therefore, this distinction can be considered an option within the phylogenetic tree, to enable the choice when the two methods of feature value assignment do not coincide, but a value is nevertheless assigned on the basis of one of these methods. This distinction, between semantic versus formal criteria in the assignment of an inherent feature value to an element, corresponds to a distinction proposed independently elsewhere, that of semantic versus syntactic agreement (see, for example, Corbett 2006:155-165):
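As a rough illustration, the distinctions drawn in this section (inherent versus contextual, fixed versus selected, formal versus semantic, agreement versus government) can be laid out as a nested Python structure. The node labels come from the prose; the dictionary layout and the helper `paths` are assumptions of this sketch and are not a reproduction of the original diagram.

```python
# Nested dictionary mirroring the distinctions described in the text.
# A value of None marks a leaf, i.e. one type of feature specification.
FEATURE_SPECIFICATION_TREE = {
    "inherent": {
        "fixed": {"formal": None, "semantic": None},
        "selected": {"formal": None, "semantic": None},
    },
    "contextual": {
        "agreement": None,
        "government": None,
    },
}


def paths(tree, prefix=()):
    """List every root-to-leaf path through the tree."""
    result = []
    for label, subtree in tree.items():
        if subtree is None:
            result.append(prefix + (label,))
        else:
            result.extend(paths(subtree, prefix + (label,)))
    return result


for path in paths(FEATURE_SPECIFICATION_TREE):
    print(" > ".join(path))
```

Treating the formal versus semantic split as an obligatory layer is a simplification; the text presents it as an option to be used when the two methods of assignment do not coincide.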
Defining a feature
I can now take a different perspective on features and compare 'features as such', rather than ways of assigning feature values to elements. I want to distinguish between features which are relevant to syntax (morphosyntactic features), and those which are not (morphosemantic features). Also, I want to be able to relate purely morphological features to the other two types. In order to do this, I define a feature as follows:
A morphosyntactic feature is a feature whose values are involved in either agreement or government. Since agreement requires the presence of the controller which is specified for the feature value it imposes on the target, the values of a morphosyntactic feature may be contextual (when found on targets and governees) or inherent (when found on controllers of agreement). Hence, a morphosyntactic feature is a set of values which have all of the realisation options identified in the phylogenetic tree available to them:
A morphosemantic feature is a feature whose values are not involved in agreement or government, but are inherent only. That is, the elements on which the values are found are not controllers of agreement. Because it is not involved in either agreement or government, a morphosemantic feature is not relevant to syntax. Hence, a morphosemantic feature is a set of values which have the following realisation options available to them:
An example of a morphosemantic feature is tense in many familiar languages where it encodes regular semantic distinctions, but it is not required by the syntax through the mechanisms of either agreement or government.
A morphosemantic feature may be marked more than once in the phrase or clause. We distinguish such multiple marking of a morphosemantic feature from agreement. Agreement requires systematic co-variance of controllers and targets (Corbett 2006). Multiple marking of a morphosemantic feature does not fall under agreement, but is better analysed as an additional piece of information that is marked in more than one place in the clause. The inclusion of such information depends wholly on the speaker who chooses to express the given meaning or function, and is independent of syntactic requirements. Sometimes the lack of this information does not imply that the phrase or clause is marked for the negative value of the feature in question (in such instances we refer to the multiple marking of the information, when it is added, as 'concord').
A morphological feature is a feature whose values are not involved in agreement or government, and are inherent only. Furthermore, the values of a morphological feature do not co-vary with semantic functions (even though there may be instances of free formal variation between values of a morphological feature). Hence, a morphological feature is a set of values which have the following realisation options available to them:
Morphological features have a role only in morphology (hence the notion of 'morphology-free syntax'). An example of a morphological feature is inflectional class (a 'declensional class', or a 'conjugation'). Morphological features can be arbitrary; they may have to be specified for individual lexical items, hence they are instances of lexical features. Alternatively, they may be predictable, to varying extents, from phonological and/or semantic correlations. That is, given the phonology or semantics of a given lexical item, it may be possible to assign its morphological feature by an assignment rule, rather than having to specify it in the lexicon (Corbett 2006:122-123; for more on morphological features, see Corbett & Baerman 2006).
The definitions given in the sections above correspond to canonical morphosyntactic, morphosemantic, and morphological features. No feature in any language has values which are consistently assigned in the permitted ways across all relevant elements. However, in a given language, we recognise the feature as morphosyntactic if its values are involved in either government or agreement for any set of elements. In turn, a morphosemantic feature in a given language is a feature which is inherent only; that is, there are no elements (word classes or lexemes) for which it is contextual.
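The language-specific criterion can be summarised as a small decision procedure. The Python sketch below is one possible reading of the definitions, not a tool used by the project; the function name, its parameters, and the string labels are hypothetical.

```python
def classify_feature(realisations, covaries_with_meaning):
    """Classify a feature in one language.

    realisations: set of ways in which the feature's values are found to be
        assigned across elements, drawn from
        {"inherent", "agreement", "government"}.
    covaries_with_meaning: True if the values co-vary with semantic functions.
    """
    # Involvement in agreement or government for ANY set of elements suffices.
    if realisations & {"agreement", "government"}:
        return "morphosyntactic"
    # Inherent only, but semantically contrastive.
    if covaries_with_meaning:
        return "morphosemantic"
    # Inherent only, with no semantic co-variation (e.g. inflectional class).
    return "morphological"


print(classify_feature({"inherent", "agreement"}, True))   # gender
print(classify_feature({"inherent"}, True))                # tense in many familiar languages
print(classify_feature({"inherent"}, False))               # inflectional class
```

The three calls correspond to examples given in the text: gender, tense in many familiar languages, and inflectional class.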
In our search for possible morphosyntactic features, we have found at least one language (for each feature) in which the following features can be morphosyntactic:
However, while some of these are typical features of agreement or government and occur very commonly, others only rarely play a role in syntax. The following map shows how they share out the load of work in syntax between them. The features which we have not found to participate in either agreement or government are morphosemantic features:
Gender is an indisputable morphosyntactic feature, since it is required for agreement. The realisation of the value for gender on the target is the canonical instance of the need for a syntactic rule of agreement (Corbett 2006:126).
Gender is an inherent feature of nouns, and a contextual feature (determined through agreement) for any other elements that have to agree with the nouns in this feature (e.g. adjectives, verbs, etc.). Typically, gender is lexically supplied and its value is fixed for the noun. However, on some nouns (multi-gendered nouns such as English baby, and hybrid nouns such as Russian vrač 'doctor') gender can be a semantically selected feature, where one gender value is selected from a set of options. Therefore, the lexical entries of nouns in a gendered language must specify either that the noun has a fixed gender value or that it is capable of taking on different gender values as dictated by the semantics.
Some lexical oppositions which correspond to semantic distinctions similar to gender, but which are not instantiations of the morphosyntactic feature of gender, include semantically contrasting lexical items (as in Kanuri, Nilo-Saharan, Nigeria - which doesn't have a gender system but does have lexical items for 'boy' versus 'girl', for example), or lexical derivations (e.g. in English, the class of nouns ending in -tion).
We have not found instances of gender as a feature of government. There are instances of a gender value found on elements which normally have to agree in gender with a controller, but the controller is absent. This can happen, for example, with predicate adjectives used in nominalisations (as in 'being happy'), or when predicate adjectives are complements of infinitives (i.e. they occur in uncontrolled, or arbitrarily controlled, infinitivals). An example comes from Polish, where adjectives obligatorily express gender and number. When a predicate adjective is a complement of an infinitive, it has to appear in instrumental case, singular number, and masculine/neuter gender. Masculine and neuter forms of adjectives are syncretic in the instrumental.
Since the adjective has to express gender and number, in a situation where there is no controller that could dictate its gender and number, it shows 'default agreement', which is typically 'third person singular neuter'. Hence, instances of gendered adjectives which have no controller are not instances of government but of default agreement.
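The logic of default agreement can be sketched as a fallback: the adjective takes its values from a controller when one is present and otherwise receives default values. The Python sketch below assumes the Polish default values described above (singular number, masculine/neuter gender); the names `DEFAULTS` and `adjective_features` are hypothetical.

```python
# Default values assumed for the Polish predicate adjective described above.
DEFAULTS = {"number": "singular", "gender": "masculine/neuter"}


def adjective_features(controller=None):
    """Return the gender and number values realised on the adjective.

    With a controller, the values are matched (agreement).
    Without one, the defaults are used (default agreement, not government).
    """
    if controller is None:
        return dict(DEFAULTS)
    return {feature: controller[feature] for feature in DEFAULTS}


print(adjective_features())
print(adjective_features({"number": "plural", "gender": "feminine"}))
```

No governor appears anywhere in the function, which reflects the point that the controller-less case is treated as a special case of agreement.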
A terminology note on gender and nominal classification
Gender is perhaps the only feature whose values, when found on controllers of agreement (i.e. the gender values inherently assigned to nouns), typically have no overt expression in the majority of languages which have the gender feature. There are a few languages where gender is marked on every noun (Bantu languages, also Berber, especially North Berber - Kabyle, Tashelhit, Tamazight) or on most nouns with the exception of nouns referring to humans (many Arawak languages of South America, e.g. Baniwa and Tariana; in these languages genders are also used in other classificatory functions, such as numeral classifiers; Alexandra Aikhenvald, personal communication). However, languages marking genders on nouns are a small minority. Perhaps the fact that gender value is not marked on nouns may be related to the fact that gender is also a feature in which inherent assignment of the value is predominantly fixed - that is, typically, most nouns in languages that have the feature of gender have only one, fixed value of gender. Furthermore, the inherent value of gender on the noun is assigned to it on the basis of some specific criteria, usually a combination of semantic and formal criteria. Therefore, in fact, the gender of a noun in a gendered language need not even be specified in the noun's lexical entry, since it can be derived from other information - semantic, morphological, or phonological.
Because of these characteristics, the term 'gender' is most commonly used to refer to classes of nouns within a language which are 'reflected in the behaviour of associated words' (Hockett 1958:231). This is also the definition adopted by Corbett (1991), who argues that in order to define gender we have to refer to the targets of agreement in gender, which allow us to justify the classification of nouns into genders. In this typology, it has been possible to retain the special status of gender. However, the position of gender within the typology needs to be clarified in order to enable comparisons with other features.
As defined in Hockett (1958) and Corbett (1991), gender is exclusively a feature of agreement. Hence, the feature is referred to as 'gender' in a language if it concerns the classification of the nominal inventory of the language, but only if the inherently assigned gender values found on nouns are matched by contextually assigned gender values found on targets of agreement in gender. If a language has a system of nominal classification expressed through inflectional morphology, but the feature of nominal classification does not participate in agreement, it does not qualify as 'gender'. With respect to syntax, the status of such a feature is similar to the status of tense in most of the familiar languages: an inflectionally marked feature such as tense expresses a semantic or formal distinction, but is not relevant to syntax for the purposes of agreement or government. Syntax does not need to know the value of the inflectional noun classifier or inflectionally marked tense. Therefore, the distinction between inflectional noun classification and gender is that, while the former can only be a morphosemantic feature, gender can only be a morphosyntactic feature.
I will first consider nominal number. It is a morphosyntactic feature if it participates in agreement (or government) in the language, regardless of whether it is expressed on the controller (the noun - as in the majority of languages where number is inflectional; or the noun phrase as such - as in Farsi) or not. If number is not found affecting other elements of the clause, it can only be regarded as a morphosemantic feature in the language.
Nominal number is inherent to nouns, and contextual to all other elements in the clause which express number due to agreement. On some nouns, number is lexically supplied - this is the case with nouns which have one lexically determined number value that they impose on the agreeing elements (e.g. English health, trousers). In other cases, where the nouns of a given language can be associated with different number values available in this language, number is semantically selected. In such languages, number (both inherent and contextual) is typically regarded as an inflectional feature if it is obligatory. However, in number systems with general number, number can be seen as derivational (Corbett 1999). Alternatively, all inherent number (i.e. number marked on nouns themselves, and on the nominal phrases as such) could be regarded as derivational, while all contextual number (i.e. number marked on other elements of the clause, through agreement) - as inflectional.
Verbal number, which is an inherent feature of verbs, is typically derivational (Corbett 1999:3, after Durie 1986; Mithun 1988a, 1988b) and it appears to be a morphosemantic, but not a morphosyntactic feature, as no agreement effects of verbal number have yet been found.
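The criterion applied so far to gender and number can be summarised as a small decision procedure. The following Python sketch is only an illustration of that criterion; the record type `FeatureProfile`, its field names and the function `classify` are our own hypothetical names, not part of any published formalisation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureProfile:
    """What has been observed about one feature in one language (hypothetical record)."""
    name: str
    morphologically_marked: bool  # the feature has morphological exponence
    in_agreement: bool            # its values are matched on targets of agreement
    in_government: bool           # its values are imposed by a governor


def classify(profile: FeatureProfile) -> str:
    """Apply the criterion from the text: agreement or government makes a feature morphosyntactic."""
    if not profile.morphologically_marked:
        return "no morphological feature"
    if profile.in_agreement or profile.in_government:
        return "morphosyntactic"
    return "morphosemantic"


# Nominal number in a language where other elements of the clause agree in number.
nominal_number = FeatureProfile("nominal number", True, True, False)
# Verbal number: no agreement effects have been found so far.
verbal_number = FeatureProfile("verbal number", True, False, False)

print(classify(nominal_number))  # morphosyntactic
print(classify(verbal_number))   # morphosemantic
```

The sketch deliberately ignores the inflectional versus derivational distinction, which the text treats as a separate question.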
One contemporary language has been reported not to have the category of number at all. It is Pirahã (the only remaining member of the Mura family; spoken along the Maici River in Amazonas, Brazil), which does not appear to have any plural forms, even in the pronouns. The ways of expressing the notion of plurality are by conjoining, the associative/comitative postposition, and various quantifiers. But there does not seem to be a grammatical feature of number (Corbett 2000:50-51, after Everett 1986, 1997). A similar situation has been reported for two ancient languages: Kawi (Old Javanese), and Classical Chinese.
As with gender, we have not found instances of number as a feature of government. And similarly, there are instances of a number value found on elements which normally have to agree in number with a controller, but the controller is absent. The example that was cited above for gender also illustrates the assignment of a default number value. In Polish, when a predicate adjective is used in nominalisation (as in 'being happy') or a predicate adjective is a complement of an infinitive, it has to appear in instrumental case, singular number, and masculine/neuter gender:
Since the adjective has to express gender and number, in a situation where there is no controller that could dictate its gender and number, it shows 'default agreement', which is typically 'third person singular neuter'. Hence, instances of adjectives expressing number, which have no controller for number, are not instances of government but of default agreement in number.
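The difference between ordinary agreement and default agreement can be pictured as a fallback rule. The sketch below is a minimal illustration under our own assumptions: the function `target_values` and the dictionary layout are hypothetical, and the default values are simply those mentioned above for the Polish predicate adjective (singular number, masculine/neuter gender).

```python
from typing import Optional

# Default values for a target with no controller, as described in the text for Polish.
DEFAULTS = {"number": "singular", "gender": "masculine/neuter"}


def target_values(controller: Optional[dict]) -> dict:
    """Return the number and gender values realised on an agreement target.

    With a controller, the target copies the controller's values (agreement).
    Without one, it falls back on the defaults (default agreement).
    """
    values = {}
    for feature, default in DEFAULTS.items():
        if controller is not None and feature in controller:
            values[feature] = controller[feature]
        else:
            values[feature] = default
    return values


# A hypothetical feminine plural controller: ordinary agreement.
print(target_values({"number": "plural", "gender": "feminine"}))
# No controller (e.g. an adjective that is the complement of an infinitive): default agreement.
print(target_values(None))
```

Nothing in the second call plays the role of a governor, which is why such cases are treated as default agreement rather than government.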
The feature of person has often been assumed to be universal (Forchheimer 1953:1; Greenberg 1963:31,96; Benveniste 1971:225; Wierzbicka 1976, 1996; Zwicky 1977:715; Ingram 1978), and the claims have varied from a reference to a rather vague 'expression of person' (Benveniste) or 'the system of person' (Forchheimer) to specific remarks about the universal existence of 'distinct first and second singular independent pronouns' (Greenberg), 'pronominal categories involving at least three persons and two numbers' (Greenberg), or the 'morphosyntactic categorisation of person' (Zwicky).
From the point of view adopted here, the (cognitive) category of person exists in a language if it is possible to make a distinction between at least two of the basic participants in a speech act. This is achieved, for example, by allowing self-reference or reference to the addressee. Such reference can be made with the conventional use of any type of noun, or by using some special words that lexicalise the meanings of 'speaker (1)' and 'addressee (2)'. However, the morphosyntactic feature of person can be posited for the language only if this feature participates in agreement (or government) in the language. The morphosyntactic feature of person reflects the grammaticalisation of the category of person in the language. The existence of personal pronouns, without any influence of the category of person on decisions regarding agreement, is not sufficient to posit the person feature for the language, since the pronouns may be lexicalised meanings for the participants of the speech act.
Person as a morphosyntactic feature is typically a feature of agreement. When it is found on controllers of agreement, it is an inherent feature, and when it is found on targets of agreement - it is a contextual feature. The controllers of agreement in person are linguistic elements that express syntactic arguments - these are typically nouns or pronouns, but may also be pronominal affixes (for detailed discussion of pronominal affixes, and the diagnostics for distinguishing pronominal affixes from agreement markers, see Corbett 2003, also summarised in Corbett 2006:99-112). Note that, on this account, pro-drop phenomena (where, in terms of syntax, it may be arguable whether a given element represents an argument or does not fill the argument slot) are not treated uniformly, but need to be analysed on a language-by-language basis, according to the criteria for identifying agreement in person.
In languages where free pronouns have a generic pronominal root, typically invariant across all person-number values, with person affixes attached, the pronouns have a 'selected' feature specification for person, since the person value is selected from a range of options. Etymologically, the generic pronominal root in such languages is often the word for person, body, self or the verb 'to be' or 'exist' (Siewierska 2004:19). In languages where pronouns do not inflect for person in this way, their inherent person value can be regarded as 'fixed' (lexically supplied). It is assumed that noun phrases have inherent third person value, and that they get this specification by default (Corbett 2006:132). There are interesting cases in which this default can be overridden, involving syntactic contexts similar to apposition ('[we] women are...', etc., and the use of the vocative case when the person specification in these contexts is second singular or plural).
We have not come across instances of person being a feature of government.
Case is a feature that expresses a syntactic and/or semantic function of the element that carries the particular case value. A case value may be assigned to an element in one of three ways: contextually through government (commonly), contextually through agreement (rarely), and inherently. For every language, it is possible to work out the case assignment system, that is, the set of rules that determine how the syntax or semantics assign case values to the elements that can carry them. Depending on which types of assignment are used, two types of cases can be distinguished: abstract (or, grammatical) cases, and concrete (or, semantic) cases.
Abstract, or grammatical, cases in the given language are those cases whose values are assigned contextually. Case values can be assigned contextually either through government (typically by a verb or a preposition), or through agreement (e.g. in constructions with predicate nominals - nouns and adjectives - as in Polish or Slovene). Examples:
(1) nominative and accusative case values assigned by the verb, e.g. Polish:
(2) accusative case value assigned by the preposition, e.g. Polish:
(3) genitive case value on the predicate adjective matches the genitive case value of the quantified noun of the subject noun phrase, e.g. Polish (example adapted from Corbett 2006:134, who cited from Dziwirek 1990:147):
It is important to note that the generally accepted definition of case as expressing a relationship of the dependent noun to its head (Blake 1994:1; Haspelmath forthcoming p.2) corresponds to abstract cases assigned through government (i.e. as in examples 1 and 2). One of the main functions of abstract (grammatical) cases is to express grammatical relations - however, in some languages grammatical relations can be purely syntactic. In such languages the formal evidence identifies grammatical relations, not morphosyntactic cases.
Concrete, or semantic, cases in the given language, whether non-spatial or spatial cases, are those cases whose values are assigned to the elements inherently, i.e. without a governor or a controller. In instances of semantic case, the case value is imposed on the element only due to the semantics. The value may be selected from a range of available semantic values. For example, apart from assigning core grammatical relations to the principal participants and assigning case values to the nouns expressing them, the speaker may choose to use additional nominals in the same clause to express additional information. The choice of case value for such oblique nominals is similar to the choice made by the speaker between, say, the singular or plural value of number, which is also assigned to the nominal inherently.
On this view of grammatical versus semantic case, predicate-less utterances (e.g. labels, titles and other instances of citation forms which are not part of connected discourse) in which nominals carry a default case (e.g. the nominative) can be considered instances of semantic case, rather than grammatical case: a grammatical nominative would have to be imposed on the element by a governing verb, and these utterances contain none.
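The division between abstract and concrete cases follows directly from the mode of assignment. The sketch below restates this in Python; the enumeration `Assignment` and the function `case_type` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Assignment(Enum):
    """The three ways in which a case value may be assigned, as listed in the text."""
    GOVERNMENT = "contextual, through government"
    AGREEMENT = "contextual, through agreement"
    INHERENT = "inherent"


def case_type(assignment: Assignment) -> str:
    """Contextually assigned values are abstract cases; inherently assigned ones are concrete."""
    if assignment in (Assignment.GOVERNMENT, Assignment.AGREEMENT):
        return "abstract (grammatical)"
    return "concrete (semantic)"


# Accusative required by a preposition: government.
print(case_type(Assignment.GOVERNMENT))  # abstract (grammatical)
# Genitive on a predicate adjective matching the quantified noun: agreement.
print(case_type(Assignment.AGREEMENT))   # abstract (grammatical)
# Nominative on a label or title, with no governor or controller: inherent.
print(case_type(Assignment.INHERENT))    # concrete (semantic)
```

Note that the classification applies to a particular assignment, not to a case value as such: on this view the same nominative form can be grammatical in a clause and semantic in a citation form.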
When it has morphological exponence, definiteness is most commonly a morphosemantic feature which does not participate in agreement or government. However, we have come across an instance where we need to posit definiteness as a morphosyntactic feature.
In German, in order to describe nominal inflection, we need gender, number, and case. However, in order to describe adjectival inflection, after separating out gender, number and case, we are still left with three different adjectival paradigms, referred to as 'strong', 'mixed' and 'weak'. An adjective inflecting according to the strong paradigm shows full agreement features. The following is an example listing the strong paradigm for gut 'good' (all examples from Corbett 2006:95-96, and the discussion follows in part Zwicky 1986b):
The mixed paradigm, exemplified below, shows partially reduced agreement. It shares some forms with the strong paradigm (these are marked '(S)'), and some with the weak paradigm (these are marked '(W)'). The remaining forms (unmarked) are shared across all three paradigms:
Finally, the following is the weak paradigm for the same adjective. The weak paradigm shows reduced agreement:
Corbett notes that, as we progress from the strong paradigm to the weak, in each there are fewer distinct inflections (five in the strong, four in the mixed, and two in the weak). 'However, the sets of cells which are distinguished in the strong paradigm are not simply collapsed: the weak paradigm has different forms for the feminine singular and the plural, which are identical in the strong paradigm' (2006:96).
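The counts of distinct inflections can be checked mechanically. The sketch below encodes the adjectival endings as they are given in standard reference grammars of German (it is our own encoding, not a copy of Corbett's tables), with columns for masculine, neuter and feminine singular and for the plural. The names `STRONG`, `MIXED`, `WEAK`, `distinct_endings` and `form` are hypothetical.

```python
COLUMNS = ("m", "n", "f", "pl")  # masculine, neuter, feminine singular; plural

# Endings by case, in the column order above.
STRONG = {
    "nom": ("er", "es", "e", "e"),
    "acc": ("en", "es", "e", "e"),
    "gen": ("en", "en", "er", "er"),
    "dat": ("em", "em", "er", "en"),
}
MIXED = {
    "nom": ("er", "es", "e", "en"),
    "acc": ("en", "es", "e", "en"),
    "gen": ("en",) * 4,
    "dat": ("en",) * 4,
}
WEAK = {
    "nom": ("e", "e", "e", "en"),
    "acc": ("en", "e", "e", "en"),
    "gen": ("en",) * 4,
    "dat": ("en",) * 4,
}


def distinct_endings(paradigm: dict) -> set:
    """Collect the set of different endings used anywhere in a paradigm."""
    return {ending for row in paradigm.values() for ending in row}


def form(stem: str, paradigm: dict, case: str, column: str) -> str:
    """Build one inflected form, e.g. form('gut', STRONG, 'nom', 'm')."""
    return stem + paradigm[case][COLUMNS.index(column)]


for name, paradigm in (("strong", STRONG), ("mixed", MIXED), ("weak", WEAK)):
    print(name, len(distinct_endings(paradigm)))  # strong 5, mixed 4, weak 2

# Nominative: feminine singular and plural coincide in the strong paradigm only.
print(form("gut", STRONG, "nom", "f"), form("gut", STRONG, "nom", "pl"))  # gute gute
print(form("gut", WEAK, "nom", "f"), form("gut", WEAK, "nom", "pl"))      # gute guten
```

The last two lines illustrate, for the nominative, the point made in the quotation: the weak paradigm separates cells which the strong paradigm treats alike.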
Therefore, we have to treat the choice of one of the paradigms as a choice of one of three distinct options, perhaps values of a feature. What dictates the choice of the paradigm for the adjective is the type of element in the determiner position in the phrase. The choice of the adjectival paradigm correlates with the choice of the determiner in the following way:
The correlation can be understood in terms of definiteness, even though there is no unique marker of definiteness in German on adjectives - instead, definiteness is expressed through the choice of the determiner and the selection of inflectional endings on the adjective. One way of analysing definiteness in German nominal phrases would be to see it as assigned to the whole phrase together with the (optional) determiner. However, we still have to account for the selection of the adjectival paradigm, and the observed correlations suggest strongly that we should recognise a morphosyntactic feature. It is not completely clear, however, whether we are dealing with agreement or government.
Zwicky (1986b:984-987) analyses it as government: the determiners govern the feature of definiteness on the adjectives by requiring the selection of a particular type of adjectival paradigm. Two difficulties arise. First, if it is definiteness that is the governed feature, we should not expect to find its values on the governors, yet the articles themselves appear to express a value of definiteness. Second, apart from saying that the particular determiners require the selection of the particular adjectival paradigms, it is difficult to characterise this feature in terms of its values. The best characterisation that can be given is: the definite articles govern the 'weak', or 'reduced' value of the feature of definiteness, and the indefinite articles govern the 'mixed', or 'partially reduced' value of the feature of definiteness. It appears that the appropriate assessment of this analysis has to be postponed until we have a theory of syntactic government comparable to the theory of canonical agreement proposed by Corbett (2006).
The alternative view, which is adopted here tentatively (and is open to further argumentation), is to analyse the correlations as agreement: there is systematic covariance between the controllers (the two types of determiners: definite and indefinite) and the targets (the adjectives). The exponence of definiteness on the adjectives is non-autonomous, but it is expressed through the selection of the required adjectival paradigm. In each case, the result is a particular pattern of distribution of information relevant to the concept of definiteness throughout the phrase. It seems easy to accept that definite and indefinite articles themselves express a value of definiteness (and act as controllers of agreement on adjectives), even though, on this view, we also have to accept the fact that German adjectives agree in number and gender with one controller (the noun), but in definiteness with another (the determiner). Comments on this issue will be very welcome.
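To make the two-controller analysis concrete, the following sketch computes the specification of a German attributive adjective from two separate sources. It is a tentative illustration only: the function `adjective_specification` and the dictionary layout are hypothetical, and the association of the strong paradigm with the absence of a determiner is taken from standard descriptions of German rather than from the discussion above.

```python
from typing import Optional

# Paradigm realising each definiteness value of the determiner.
# None stands for a phrase with no determiner (an assumption based on standard descriptions).
PARADIGM_BY_DETERMINER = {
    "definite": "weak",
    "indefinite": "mixed",
    None: "strong",
}


def adjective_specification(noun: dict, determiner: Optional[str]) -> dict:
    """Combine agreement with the noun (gender, number) and with the determiner (definiteness)."""
    return {
        "gender": noun["gender"],      # controller: the noun
        "number": noun["number"],      # controller: the noun
        "definiteness": determiner,    # controller: the determiner, if any
        "paradigm": PARADIGM_BY_DETERMINER[determiner],
    }


mann = {"gender": "masculine", "number": "singular"}
print(adjective_specification(mann, "definite")["paradigm"])    # weak
print(adjective_specification(mann, "indefinite")["paradigm"])  # mixed
print(adjective_specification(mann, None)["paradigm"])          # strong
```

On the government analysis the same table would be read differently: the determiner would impose a paradigm value without itself carrying a value of the feature.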
The fact that the values of the proposed feature may not always correspond to (in)definiteness semantically does not pose a problem for giving the feature the label 'definiteness'. In a parallel way, what is labelled as 'gender' often does not correspond to a semantically assigned gender or class. The feature of 'gender' does have a semantic core, or basis, but there are few languages with purely semantically assigned gender values. Similarly, definiteness in Germanic has some semantic basis, but we do not necessarily expect it to be semantically assigned throughout.
'Respect', or 'address', is one of the overt linguistic expressions of politeness. It indicates the speaker's social relation and attitude to the addressee and sometimes also to third persons (Corbett 2006:137). While investigating respect, we need to first distinguish respect as a condition on other features from respect as a feature in its own right. In many languages respect is expressed through the conventional use of certain other features, usually number, sometimes person. In Russian, for example, respect is shown by the speaker through the use of the plural form of the second person (vy 'you (PL)') even when addressing a single interlocutor. In Polish, to show respect, the speaker addresses the interlocutor in the 3rd person (singular or plural), using the common nouns/nominal phrases such as 'lady/madam' (pani), 'gentleman/sir' (pan), 'ladies' (panie), 'gentlemen' (panowie), or 'lady/ies and gentleman/men' (państwo) to refer to the addressee/s. The agreements with the polite pronouns or nouns are then syntactic or semantic, depending on the target (Corbett 1983:51-56; 2006:230-233).
Such manifestations of respect will not be regarded as a morphosyntactic feature. Not only is there no unique exponent of it, but more importantly there is no independent reason (such as, for example, the outcome of resolution rules) for needing a respect feature in Russian or Polish. Furthermore, the variability of the agreement according to target suggests that respect in these languages is a condition on agreement rather than a morphosyntactic feature.
Another expression of respect is through honorifics - words or expressions that convey esteem or respect and are used in addressing or referring to a person. In languages with multiple honorifics, these have been sometimes analysed in terms of agreement, thus making respect a morphosyntactic feature. However, this is not justified in the languages where each honorific can be determined on pragmatic grounds and they agree only in the sense that they tend to be used in the same pragmatic circumstances (Corbett 2006:137).
Positing a respect feature is justified when the grammar needs to refer to it directly. A clear example is found in Muna, an Austronesian language spoken on Muna Island (off the southeast coast of Sulawesi). Here we need to specify respect independently of other features (notably number); furthermore, marking on the verb co-varies with the pronoun. The following example (from van den Berg 1989:51,82, cited in Corbett 2006:138) shows the number and politeness markers found on the verb kala 'go' referring to the 2nd person singular and plural (ihintu is a free pronoun which is used for emphasis and distinguishes neutral and polite forms):
Corbett (2006:138) lists three other languages for which the respect feature is needed: Maithili (an Indo-Iranian language spoken in India and Nepal) which has a particularly large set of respect values; Bavarian German, where the agreement forms for polite agreement have become differentiated from the original 3rd person forms; and Tamil, where there is a distinct form for agreement with honorifics.
We have consulted many expert colleagues about the status of grammatical categories in various languages. We have also organised a series of seminars to discuss several of the key features, as well as the theoretical and computational uses of features (see our Outputs page for a list of seminars organised within the Features project). Finally, we posted a world-wide query regarding morphosyntactic features. We would like to extend our thanks to all who responded to our query posted on the LAGB circulation list (13 Feb 2007) and the LINGUIST List (14 Feb 2007), with valuable comments, data and references:
A website dedicated to Grammatical Features, including morphosemantic features, can now be found at http://www.features.surrey.ac.uk.
The website has been undertaken as an extension to the Features project. It contains detailed descriptions (already totalling about 50,000 words), discussion, examples, and an extensive, carefully compiled bibliography pertaining to many of the key features. Currently, the website is offered as a set of articles, but it is envisaged that it will turn into a live and ever-growing resource, open for contributions, which I hope will eventually become part of the 'Linguistics Web 2.0'.
In our own transcriptions of data, we follow the Leipzig Glossing Rules ('Conventions for interlinear morpheme-by-morpheme glosses'). Data cited from other sources may follow other conventions.
How to cite the articles on this webpage: