Research on human categorization in cognitive science, psychology, linguistics, anthropology, and
philosophy has shown that in many cases, categories cannot be properly
characterized by a set of necessary and sufficient conditions; i.e.
category membership is usually not an all-or-nothing phenomenon but rather
a matter of degree, and people can judge the ``typicality'' and/or degree
of membership of potential category members in a consistent way (see e.g.
[Lakoff 1987][Rosch 1978][Berlin \& Kay 1969][Wittgenstein 1953]). This applies not only to
so-called natural categories, which correspond roughly to nouns in
natural languages, but also to many categories used in science, e.g.
biology [Lakoff 1987]. Perhaps the only exceptions are mathematical
categories which are explicitly defined by necessary and sufficient
conditions. However, even the latter might be seen as abstractions of
real-world categories that would not necessarily be definable in the same
way, or as being constructed on the basis of such abstractions, cf.
[Aleksandrov 1956]. Some of the ideas about graded category membership have been
formalized as fuzzy set theory [Zadeh 1971], which
has found applications in various areas of control theory and AI, for
instance [Kosko 1992].
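As a minimal sketch of how fuzzy set theory expresses graded membership, the following Python fragment represents a fuzzy set by its membership function, which maps each element to a degree in [0, 1], and combines sets with the standard max, min, and complement operators. The category `tall`, its height thresholds, and the helper names are invented for illustration only and are not taken from the cited literature.

```python
from typing import Callable

# A fuzzy set is represented by its membership function:
# element -> degree of membership in [0, 1].
Membership = Callable[[float], float]


def ramp(low: float, high: float) -> Membership:
    """Membership rising linearly from 0 at `low` to 1 at `high`.

    The linear shape is an assumption made for this illustration.
    """
    def mu(x: float) -> float:
        if x <= low:
            return 0.0
        if x >= high:
            return 1.0
        return (x - low) / (high - low)
    return mu


def fuzzy_union(a: Membership, b: Membership) -> Membership:
    # Standard fuzzy union: pointwise maximum.
    return lambda x: max(a(x), b(x))


def fuzzy_intersection(a: Membership, b: Membership) -> Membership:
    # Standard fuzzy intersection: pointwise minimum.
    return lambda x: min(a(x), b(x))


def fuzzy_complement(a: Membership) -> Membership:
    # Standard fuzzy complement: one minus the degree of membership.
    return lambda x: 1.0 - a(x)


# Hypothetical category: "tall", with heights in centimetres.
tall = ramp(160.0, 190.0)
not_tall = fuzzy_complement(tall)
# Unlike in classical set theory, a set and its complement can overlap.
borderline = fuzzy_intersection(tall, not_tall)

for height in (150.0, 175.0, 200.0):
    print(height, tall(height), borderline(height))
```

The point of the sketch is only that membership is a degree rather than a yes/no answer, so clear non-members, prime examples, and borderline cases all fall out of one function.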
The concepts of embodiment [Lakoff 1987], symbol grounding
[Harnad 1990], and situated cognition
[Suchman 1988] are related to issues in categorization.
Embodiment is the notion that the shape or extension of categories (and hence the meaning of symbols
representing categories, in a referential type semantics) is in part
determined by the physiology of the organism doing the
categorization.
For instance, Berlin and Kay hypothesize that
the universality of basic color category foci can be explained in terms of
the underlying neurophysiological mechanisms of color perception, which are
the same for all people, regardless of language [Berlin \& Kay 1969].
Grounding is concerned with how symbols and their meanings are
grounded in categorization and perception of the environment the organism
operates in. Harnad's fundamental claim is that the symbols of ``traditional''
symbolic AI systems are only meaningful to a human observer, and not
to the system itself. The meaning to a human observer arises
from the fact that the symbols are systematically interpretable and
have a meaning assigned to them via an external semantic mapping or
model. Such a semantic model is an explanatory device, but it does not play
any role in the system's internal functioning. In contrast, Harnad claims
that symbols representing categories to people (like nouns in natural
languages) are meaningful because they are connected to the world in a
causal and non-arbitrary way, via perception.
We can say that human symbols are
embodied, while machine symbols are not.
I essentially agree with Harnad's analysis. Since categories are an
essential part of natural language semantics, the analysis implies that
models of natural language semantics must take physiology and perception,
or in general, the nature of the cognitive mechanism underlying
categorization, into account. This in turn implies that a traditional
model-theoretic approach to natural language semantics is inadequate. Model
theory, as used in logic or modern linguistics, starts with the assumption
of a (real or possible) domain consisting of discrete individuals,
properties, and relations, corresponding via a static semantic
mapping with constant and relation symbols in the language of interest.
These models have nothing to say about how such a relation might be
established or maintained in the first place. In essence this defines the
central problem of semantics away. Since I take natural language
semantics to be part of the domain of the general study of intelligence, we
may contrast the symbol grounding or embodiment view with the
``traditional'' symbolic AI view that intelligence (or ``mental
functions'') can be studied in the abstract, without reference to the
organism displaying it, or the mechanism implementing it
(e.g. [Newell 1979]).
Situatedness holds (among other things) that ``Communication ... is not a symbolic process that happens to go on in real-world settings, but a real-world activity in which we make use of language to delineate the collective relevance of a shared environment'' [Suchman 1988, p. 180]. Again, the environment or the ``real world'' is seen as the grounding for language, which implicitly requires perception to be taken into account.
One particular area of natural language semantics where embodiment,
grounding, and situatedness seem to play an important role is that of color
terms. In their anthropological and linguistic work in the late sixties,
Berlin and Kay [Berlin \& Kay 1969] were looking for semantic universals in the
domain of color terms, hoping to refute the Sapir-Whorf hypothesis, which
claims that there are no semantic universals, and that each language
performs the coding of experience into language in a unique and arbitrary
manner. The latter is of course in direct opposition to theories of
embodied semantics, since these do predict the existence of universals,
based on the observation that all people share a common physiology,
regardless of the language they speak or the culture they live in. Berlin
and Kay found that there are indeed semantic universals in the domain of
color, particularly in the extensions of what they called ``basic color
terms''.
When they asked native speakers of widely differing languages to
identify (1) the best examples and (2) the boundaries of basic
color categories on a chart of color samples, they found that (1) the best
examples (foci) of basic color categories are the same within small
tolerances for speakers of any language that has (the equivalent of) the
basic color term in question, and (2) there is a hierarchy of languages
with respect to how many and which basic color terms they possess, such
that, roughly speaking, a language that has $n+1$ basic color terms has all
the basic color terms of any language with $n$ basic color terms, and any
languages with $n$ basic color terms have the same ones (with respect to
their extensions). It is of course apparent from these results that (basic)
color categories are characterized by graded membership functions, with
some colors clearly being non-members, some being prime examples, some
being borderline examples, and with other degrees of membership in between.
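The hierarchy in finding (2) amounts to a nesting property on inventories of basic color terms, which the following Python sketch checks. The inventories shown are made-up illustrations of the property, not data from Berlin and Kay's survey, and `is_nested_hierarchy` is a hypothetical helper name.

```python
from typing import Iterable, Set


def is_nested_hierarchy(inventories: Iterable[Set[str]]) -> bool:
    """Check the nesting property on term inventories.

    Returns True if, once the inventories are ordered by size, each one is
    contained in the next. For inventories of equal size, containment forces
    them to be identical, so this also checks that languages with the same
    number of terms have the same terms.
    """
    ordered = sorted(inventories, key=len)
    return all(smaller <= larger
               for smaller, larger in zip(ordered, ordered[1:]))


# Illustrative inventories only (not survey data).
lang_a = {"white", "black"}
lang_b = {"white", "black", "red"}
lang_c = {"white", "black", "red", "green"}
violating = {"white", "black", "blue"}

print(is_nested_hierarchy([lang_c, lang_a, lang_b]))
print(is_nested_hierarchy([lang_a, lang_b, violating]))
```

Checking only adjacent pairs is enough because set containment is transitive. The text says the hierarchy holds "roughly speaking", so a strict check like this one is an idealization.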
Kay and McDaniel [Kay \& McDaniel 1978] made the first attempt to explain these results based
on neurophysiological findings in color perception, i.e. to specify how
basic color categories are embodied, or to specify a theory of symbol
grounding in the domain of colors. They used a fuzzy set model in which
they interpreted (stylized versions of) neural response functions as
characteristic functions of fuzzy sets representing color categories.
Their model is interesting and explains some of Berlin and Kay's data,
and it certainly deserves respect for being the first to attempt to explain
the connection between natural language semantics and physiology, but it
does leave several questions unanswered, as I will discuss in
a later chapter. So far, to my knowledge, no adequate model of color term
semantics has been computationally defined or implemented.
In this dissertation I attempt to define a computational model of color perception and color naming, i.e. a semantic model of (basic) color terms grounded in color perception, that is partly based on neurophysiological data and that can explain Berlin and Kay's and other relevant linguistic and anthropological data. In particular, the model attempts to explain the graded nature of color categories and the universality of color foci, and it allows an artificial cognitive agent to name color samples, point out examples of named colors in its environment, and select objects from its environment that are specified by color name. Such an agent can participate in an experiment like Berlin and Kay's, and its performance will be consistent with human performance. An implementation of the computational model is presented, as are some experimental results.
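To make the naming and pointing-out tasks concrete, here is a deliberately simplified Python sketch. It is not the model developed in this dissertation: the single hue dimension, the focal hue angles, the bell-shaped membership function, and its width are all placeholder assumptions, chosen only to show how graded membership can support naming a sample and picking the best available example of a named color.

```python
import math
from typing import Dict, List

# Hypothetical focal hues in degrees on a hue circle (placeholder values,
# not measured foci).
FOCI: Dict[str, float] = {
    "red": 0.0,
    "yellow": 60.0,
    "green": 120.0,
    "blue": 240.0,
}
WIDTH = 40.0  # assumed spread of each category, in degrees


def hue_distance(a: float, b: float) -> float:
    """Shortest angular distance between two hues on the circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def membership(term: str, hue: float) -> float:
    """Graded membership of `hue` in the category `term`, in (0, 1].

    Equals 1 at the focus and falls off smoothly with distance from it.
    The bell shape is an assumption made for this illustration.
    """
    d = hue_distance(hue, FOCI[term])
    return math.exp(-((d / WIDTH) ** 2))


def name_color(hue: float) -> str:
    """Name a sample with the term whose membership is highest."""
    return max(FOCI, key=lambda term: membership(term, hue))


def best_example(term: str, hues: List[float]) -> float:
    """Point out the sample that is the best example of `term`."""
    return max(hues, key=lambda hue: membership(term, hue))


samples = [30.0, 110.0, 200.0, 350.0]
for hue in samples:
    print(hue, name_color(hue))
print(best_example("green", samples))
```

In this toy version, naming and pointing out both reduce to maximizing the same membership function, once over terms and once over samples.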