
Background

Research on human categorization in cognitive science, psychology, linguistics, anthropology, and philosophy has shown that in many cases, categories cannot be properly characterized by a set of necessary and sufficient conditions; i.e. category membership is usually not an all-or-nothing phenomenon but rather a matter of degree, and people can judge the ``typicality'' and/or degree of membership of potential category members in a consistent way (see e.g. [Lakoff 1987][Rosch 1978][Berlin \& Kay 1969][Wittgenstein 1953]). This applies not only to so-called natural categories, which correspond roughly to nouns in natural languages, but also to many categories used in science, e.g. biology [Lakoff 1987]. Perhaps the only exceptions are mathematical categories which are explicitly defined by necessary and sufficient conditions. However, even the latter might be seen as abstractions of real-world categories that would not necessarily be definable in the same way, or as being constructed on the basis of such abstractions, cf. [Aleksandrov 1956]. Some of the ideas about graded category membership have been formalized as fuzzy set theory [Zadeh 1971], which has found applications in various areas of control theory and AI, for instance [Kosko 1992].
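The notion of graded membership that fuzzy set theory formalizes can be illustrated with a minimal sketch; the Gaussian membership function and the focal value below are illustrative assumptions of mine, not part of any cited model:

```python
import math

def membership(value, focus, width):
    """Graded membership of a stimulus value in a category,
    modeled (for illustration only) as a Gaussian around the
    category's focal value: 1.0 at the focus, falling off
    smoothly toward 0 for distant values."""
    return math.exp(-((value - focus) / width) ** 2)

# A hypothetical "red" category with its focus at hue 0 degrees:
def red(hue):
    return membership(hue, focus=0.0, width=30.0)

red(0.0)    # prime example: membership 1.0
red(25.0)   # borderline example: intermediate membership
red(120.0)  # clear non-member: membership near 0
```

Unlike a classical characteristic function, which returns only 0 or 1, this function assigns every stimulus a degree of membership, so "prime examples", "borderline examples", and "non-members" all fall out of a single continuous measure.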

The concepts of embodiment [Lakoff 1987], symbol grounding [Harnad 1990], and situated cognition [Suchman 1988] are related to issues in categorization. Embodiment is the notion that the shape or extension of categories (and hence the meaning of symbols representing categories, in a referential-type semantics) is in part determined by the physiology of the organism doing the categorization. For instance, Berlin and Kay hypothesize that the universality of basic color category foci can be explained in terms of the underlying neurophysiological mechanisms of color perception, which are the same for all people, regardless of language [Berlin \& Kay 1969].

Grounding is concerned with how symbols and their meanings are grounded in categorization and perception of the environment the organism operates in. Harnad's fundamental claim is that the symbols of ``traditional'' symbolic AI systems are only meaningful to a human observer, and not to the system itself. The meaning to a human observer arises from the fact that the symbols are systematically interpretable and have a meaning assigned to them via an external semantic mapping or model. Such a semantic model is an explanatory device, but it does not play any role in the system's internal functioning. In contrast, Harnad claims that symbols representing categories to people (like nouns in natural languages) are meaningful because they are connected to the world in a causal and non-arbitrary way, via perception. We can say that the human symbols are embodied, while machine symbols are not.

I essentially agree with Harnad's analysis. Since categories are an essential part of natural language semantics, the analysis implies that models of natural language semantics must take physiology and perception, or in general, the nature of the cognitive mechanism underlying categorization, into account. This in turn implies that a traditional model-theoretic approach to natural language semantics is inadequate. Model theory, as used in logic or modern linguistics, starts with the assumption of a (real or possible) domain consisting of discrete individuals, properties, and relations, corresponding, via a static semantic mapping, to the constant and relation symbols of the language of interest. Such models have nothing to say about how this correspondence might be established or maintained in the first place; in essence, they define away the central problem of semantics. Since I take natural language semantics to be part of the domain of the general study of intelligence, we may contrast the symbol grounding or embodiment view with the ``traditional'' symbolic AI view that intelligence (or ``mental functions'') can be studied in the abstract, without reference to the organism displaying it or the mechanism implementing it (e.g. [Newell 1979]).

Situatedness holds (among other things) that ``Communication ... is not a symbolic process that happens to go on in real-world settings, but a real-world activity in which we make use of language to delineate the collective relevance of a shared environment'' [Suchman 1988, p. 180]. Again, the environment or the ``real world'' is seen as the grounding for language, which implicitly requires perception to be taken into account.

One particular area of natural language semantics where embodiment, grounding, and situatedness seem to play an important role is that of color terms. In their anthropological and linguistic work in the late sixties, Berlin and Kay [Berlin \& Kay 1969] were looking for semantic universals in the domain of color terms, hoping to refute the Sapir-Whorf hypothesis, which claims that there are no semantic universals and that each language performs the coding of experience into language in a unique and arbitrary manner. The latter is of course in direct opposition to theories of embodied semantics, since these do predict the existence of universals, based on the observation that all people share a common physiology, regardless of the language they speak or the culture they live in. Berlin and Kay found that there are indeed semantic universals in the domain of color, particularly in the extensions of what they called ``basic color terms''. When they asked native speakers of widely differing languages to identify (1) the best examples and (2) the boundaries of basic color categories on a chart of color samples, they found that (1) the best examples (foci) of basic color categories are the same within small tolerances for speakers of any language that has (the equivalent of) the basic color term in question, and (2) there is a hierarchy of languages with respect to how many and which basic color terms they possess, such that, roughly speaking, a language that has n basic color terms has all the basic color terms of any language with fewer basic color terms, and any two languages with the same number of basic color terms have the same ones (with respect to their extensions). It is apparent from these results that (basic) color categories are characterized by graded membership functions, with some colors clearly being non-members, some being prime examples, some being borderline examples, and with other degrees of membership in between.
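The implicational hierarchy among basic color terms can be sketched as a check on term inventories. The stage groupings below follow Berlin and Kay's published evolutionary sequence, but the sample inventories and the function itself are illustrative constructions of mine:

```python
# Berlin & Kay's evolutionary sequence of basic color terms,
# grouped by stage: a language at a later stage has all the
# terms of every earlier stage.
STAGES = [
    {"black", "white"},
    {"red"},
    {"green", "yellow"},   # either may appear first; both present by stage IV
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "grey"},
]

def consistent_with_hierarchy(terms):
    """True if a language's basic-color-term inventory fits the
    hierarchy: once some stage is incompletely represented, no
    later stage may contribute any terms."""
    seen_gap = False
    for stage in STAGES:
        present = stage & terms
        if seen_gap and present:
            return False
        if present != stage:
            seen_gap = True
    return True

# Hypothetical inventories:
consistent_with_hierarchy({"black", "white", "red"})    # True
consistent_with_hierarchy({"black", "white", "blue"})   # False: skips red
```

On this encoding, a language with fewer basic color terms always has a subset of the terms of any language further along the sequence, which is the regularity described above.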

[Kay \& McDaniel 1978] made the first attempt to explain these results on the basis of neurophysiological findings in color perception, i.e. to specify how basic color categories are embodied, or to specify a theory of symbol grounding in the domain of colors. They used a fuzzy set model in which they interpreted (stylized versions of) neural response functions as characteristic functions of fuzzy sets representing color categories. Their model is interesting: it explains some of Berlin and Kay's data, and it certainly deserves respect for being the first to attempt to explain the connection between natural language semantics and physiology, but it does leave several questions unanswered, as I will discuss in a later chapter. To my knowledge, no adequate model of color term semantics has so far been computationally defined or implemented.
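A sketch in the spirit of this proposal: stylized response functions are read as fuzzy characteristic functions, and composite categories are built with fuzzy set operations. The triangular curves, the peak wavelengths, and the use of plain min for intersection are placeholder assumptions of mine, not actual neurophysiological data or Kay and McDaniel's exact operations:

```python
def triangular(x, left, peak, right):
    """Stylized unimodal response function: 0 outside
    [left, right], rising linearly to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Fuzzy membership in two primary categories over a 1-D
# wavelength axis (nm); curve shapes and peaks are illustrative.
def yellow(wl): return triangular(wl, 520.0, 575.0, 610.0)
def red(wl):    return triangular(wl, 580.0, 630.0, 700.0)

# Standard fuzzy intersection (min) yields a derived category
# such as "orange" where the red and yellow responses overlap.
def orange(wl): return min(yellow(wl), red(wl))
```

The point of the sketch is structural: primary categories come directly from response functions, while derived categories are obtained compositionally, so graded membership propagates from perception into the lexicon.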

In this dissertation I attempt to define a computational model of color perception and color naming, i.e. a semantic model of (basic) color terms grounded in color perception, that is partly based on neurophysiological data and that can explain Berlin and Kay's and other relevant linguistic and anthropological data. In particular, the model attempts to explain the graded nature of color categories and the universality of color foci, and it allows an artificial cognitive agent to name color samples, point out examples of named colors in its environment, and select objects from its environment, specified by color name. Such an agent can participate in an experiment like Berlin and Kay's, and its performance will be consistent with human performance. An implementation of the computational model is presented, as are some experimental results.
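The naming and selection behaviors such an agent needs can be sketched abstractly. The category names, the one-dimensional "hue" world, and the membership functions below are toy placeholders standing in for the model's perceptually grounded membership functions:

```python
def name_color(sample, categories):
    """Name a color sample with the category in which it has the
    highest graded membership."""
    return max(categories, key=lambda name: categories[name](sample))

def point_out(samples, name, categories, threshold=0.5):
    """Select from the environment the samples that are acceptable
    examples of the named color (membership above threshold)."""
    mu = categories[name]
    return [s for s in samples if mu(s) >= threshold]

# Toy 1-D hue world with two hypothetical categories:
cats = {
    "red":   lambda h: max(0.0, 1.0 - abs(h - 0) / 60),
    "green": lambda h: max(0.0, 1.0 - abs(h - 120) / 60),
}
name_color(10, cats)                         # "red"
point_out([0, 30, 100, 120], "green", cats)  # [100, 120]
```

Note that both behaviors reduce to queries against the same graded membership functions, which is what lets a single grounded representation support naming, pointing out examples, and selecting objects by color name.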
