Saturday, March 9, 2013

The Manifold Ways of Perception

Source: http://www.seas.upenn.edu/~ddlee/Papers/manifold_ways.pdf



The Manifold Ways of Perception
H. Sebastian Seung and Daniel D. Lee

H. S. Seung is at the Howard Hughes Medical Institute and Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. D. D. Lee is at Bell Labs, Lucent Technologies, Murray Hill, NJ 07974, USA.
Two-and-a-half millennia ago, the Greek philosopher Heraclitus, observing that the world is in eternal flux, wrote that you can never step in the same river twice. If he were alive today and working as a psychologist, he might say that you can never see the same face twice. Indeed, faces can grow hair, acquire wrinkles, or be surgically enhanced. But facial images also vary from moment to moment, as you can demonstrate at home while watching television. Make a small aperture in a piece of paper, and place it over a face on the screen. The light coming through the aperture will vary with time, mostly as a result of changes in the location and orientation of the face.
The aperture might show a tooth at one instant, and a nostril at the next, crudely simulating the fluctuations in light incident on a single retinal photoreceptor cell. This illustrates that the signals carried from the eye to the brain by the million or so axons in the optic nerve are perpetually changing as we look at a face. Nevertheless, we are able to perceive that these changing signals are produced by the same object. This is the fundamental mystery of perception: How does the brain perceive constancy even though its raw sensory inputs are in flux? The mystery intrigues not only scientists but also engineers, who yearn to construct vision machines that equal the performance of humans at visual object recognition.
To precisely characterize the variability of images and other perceptual stimuli, it is essential to take a mathematical approach, which is just what Tenenbaum et al. (1) and Roweis and Saul (2) have done on pages 2319 and 2323 of this issue, respectively. An image can be regarded as a collection of numbers, each specifying light intensity at an image pixel. But a collection of numbers also specifies the Cartesian coordinates of a point with respect to a set of axes. Therefore, any image can be identified with a point in an abstract image space.
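As a concrete illustration (not part of the original article), here is a minimal Python/NumPy sketch of this identification; the tiny 2-by-2 image is an invented example:

    # A minimal sketch: the pixel intensities of an image, read off in a
    # fixed order, are the Cartesian coordinates of one point in image space.
    import numpy as np

    image = np.array([[0.1, 0.8],
                      [0.5, 0.2]])   # a tiny 2x2 grayscale image

    point = image.ravel()            # the same numbers, read as coordinates
    print(point)                     # [0.1 0.8 0.5 0.2] -- a point in R^4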
Now consider a simple example of image variability, the set M of all facial images generated by varying the orientation of a face (see the figure). This set is a continuous curve in the image space. It is continuous because the image varies smoothly as the face is rotated. It is a curve because it is generated by varying a single degree of freedom, the angle of rotation. In other words, M is intrinsically one-dimensional, although it is embedded in image space, which has a high dimensionality equal to the number of image pixels. If we were to allow other types of image transformations, such as scaling and translation, then the dimensionality of M would increase, but would still remain far less than that of the image space. In this generalized case, M is said to be a manifold embedded in the image space. A curve is an example of a one-dimensional manifold, whereas a sphere is an example of a two-dimensional manifold (3).
Figure: Manifolds in visual perception. The retinal image is a collection of signals from photoreceptor cells. If these numbers are taken to be coordinates in an abstract image space, then an image is represented by a point. Only three dimensions of the image space are depicted, but actually the dimensionality is equal to the number of photoreceptor cells. As the faces are rotated, they trace out nonlinear curves embedded in image space. If changes in scale, illumination, and other sources of continuous variability are also included, then the images would lie on low-dimensional manifolds, rather than the simple one-dimensional curves shown. To recognize faces, the brain must equate all images from the same manifold, but distinguish between images from different manifolds. How the brain represents image manifolds is as yet unknown. According to one hypothesis, they are stored in the brain as manifolds of stable neural-activity patterns.
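To make the rotation example concrete, here is a hedged sketch (the 16-by-16 random pattern merely stands in for a face image): sweeping a single rotation angle traces out a one-dimensional curve of points in a 256-dimensional image space.

    # A minimal sketch: one degree of freedom (rotation angle) generates
    # a one-dimensional curve embedded in high-dimensional image space.
    import numpy as np
    from scipy.ndimage import rotate

    base = np.random.default_rng(0).random((16, 16))  # stand-in for a face

    angles = np.linspace(0, 90, 50)                   # one degree of freedom
    points = np.array([
        rotate(base, angle, reshape=False).ravel()    # one 256-D point each
        for angle in angles
    ])

    print(points.shape)  # (50, 256): 50 points on a 1-D curve in R^256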
Although the preceding discussion is biased toward vision, manifolds are also relevant to other types of perception. Furthermore, scientists in many fields face the problem of simplifying high-dimensional data by finding low-dimensional structure in it. Therefore, the manifold learning algorithms described by Tenenbaum et al. (1) and Roweis and Saul (2) are of potentially broad interest. The goal of the algorithms is to map a given set of high-dimensional data points into a surrogate low-dimensional space. Both start with a preprocessing step that decides for each data point which of the other data points should be considered its neighbors. Then both compute measures of the local geometry of the manifold, after which the original data points are no longer needed.
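The papers themselves give the details; the following is only a rough sketch of this shared preprocessing step, using a k-nearest-neighbor rule (an epsilon-ball rule is the usual alternative):

    # A minimal sketch: for each data point, find the k nearest other
    # points; these neighbor relations define the local structure used
    # by both algorithms.
    import numpy as np

    def knn_neighbors(X, k):
        """Indices of the k nearest neighbors of each row of X."""
        # Pairwise Euclidean distances between all points.
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
        return np.argsort(d, axis=1)[:, :k]  # k closest indices per point

    X = np.random.default_rng(0).random((100, 3))  # toy data set
    print(knn_neighbors(X, k=5).shape)             # (100, 5)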
In the Isomap algorithm of Tenenbaum et al., the local quantities computed are the distances between neighboring data points. For each pair of nonneighboring data points, Isomap finds the shortest path through the data set connecting them, subject to the constraint that the path must hop from neighbor to neighbor. The length of this path is an approximation to the distance between its end points, as measured within the underlying manifold. Finally, the classical method of multidimensional scaling is used to find a set of low-dimensional points with similar pairwise distances.
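As a rough illustration of the two stages just described (geodesic distances over the neighbor graph, then classical multidimensional scaling), here is a hedged NumPy/SciPy sketch; it glosses over details of the published algorithm and assumes the neighbor graph is connected:

    # A minimal Isomap-style sketch: neighbor-to-neighbor shortest paths
    # approximate distances within the manifold; classical MDS then finds
    # low-dimensional points with similar pairwise distances.
    import numpy as np
    from scipy.sparse.csgraph import shortest_path

    def isomap_sketch(X, k=5, dim=2):
        n = len(X)
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        nbrs = np.argsort(d + np.diag(np.full(n, np.inf)), axis=1)[:, :k]
        graph = np.zeros((n, n))
        for i in range(n):
            graph[i, nbrs[i]] = d[i, nbrs[i]]        # keep only neighbor edges
        geo = shortest_path(graph, directed=False)   # approximate geodesics

        # Classical MDS: double-center the squared distances, then use the
        # top eigenvectors as low-dimensional coordinates.
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (geo ** 2) @ J
        vals, vecs = np.linalg.eigh(B)
        top = np.argsort(vals)[::-1][:dim]
        return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))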
The locally linear embedding algorithm of Roweis and Saul computes a different local quantity, the coefficients of the best approximation to a data point by a weighted linear combination of its neighbors. Then the algorithm finds a set of low-dimensional points, each of which can be linearly approximated by its neighbors with the same coefficients that were determined from the high-dimensional data points. Both algorithms yield impressive results on some benchmark artificial data sets, as well as on “real world” data sets. Importantly, they succeed in learning nonlinear manifolds, in contrast to algorithms such as principal component analysis, which can only learn linear manifolds.
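Here, similarly hedged, is a sketch of just the local step of the locally linear embedding idea: solving for the weights that best reconstruct one point from its neighbors, subject to the standard sum-to-one constraint. The global step (not shown) then finds low-dimensional points reconstructed by the same weights, via a sparse eigenproblem.

    # A minimal sketch of the local step: least-squares weights w, summing
    # to one, such that x is approximated by w @ neighbors.
    import numpy as np

    def reconstruction_weights(x, neighbors, reg=1e-3):
        Z = neighbors - x                        # shift neighbors to origin
        C = Z @ Z.T                              # local covariance
        C += reg * np.trace(C) * np.eye(len(C))  # regularize for stability
        w = np.linalg.solve(C, np.ones(len(C)))
        return w / w.sum()                       # enforce sum-to-one

    x = np.array([0.5, 0.5])
    nbrs = np.array([[0.0, 1.0], [1.0, 0.0]])
    print(reconstruction_weights(x, nbrs))       # ~[0.5, 0.5]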
Because manifolds are fundamental to perception, the brain must have some way of representing them. Clues to the nature of this representation may come from studies of how information is encoded in large populations of neurons. Population activity is typically described by a collection of neural firing rates, and so can be represented by a point in an abstract space with dimensionality equal to the number of neurons. Neurophysiologists have often found that the firing rate of each neuron in a population can be written as a smooth function of a small number of variables, such as the angular position of the eye (4) or direction of the head (5). This implies that the population activity is constrained to lie on a low-dimensional manifold.
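As an illustration with invented tuning curves (not the data of refs. 4 and 5), the sketch below shows why smooth tuning to a single variable confines population activity to a one-dimensional manifold:

    # A minimal sketch: if each neuron's rate is a smooth function of one
    # variable (say, head direction), the population vector sweeps out a
    # one-dimensional manifold in the space of firing rates.
    import numpy as np

    n_neurons = 50
    preferred = np.linspace(0, 2 * np.pi, n_neurons, endpoint=False)

    def population_activity(theta):
        """Invented smooth tuning: a bump of activity centered on theta."""
        return np.exp(np.cos(theta - preferred) - 1)

    # Sweeping theta traces a closed 1-D curve (a ring) in 50-D rate space.
    ring = np.array([population_activity(t)
                     for t in np.linspace(0, 2 * np.pi, 100)])
    print(ring.shape)  # (100, 50)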
What is the connection between such neural manifolds and the image manifolds we have just discussed? According to a well-known idea, memories are stored in brain dynamics as stable states, or dynamical attractors (6). Because the possible images of an object lie on a manifold, it has been hypothesized that a visual memory is stored as a manifold of stable states, or a continuous attractor (7). Recent studies of neural manifolds suggest that continuous attractors actually do exist in the brain (8, 9). Whether they are the basis of visual and other types of perception remains to be resolved. If the answer is affirmative, then manifolds will prove to be crucial for understanding how perception arises from the dynamics of neural networks in the brain.
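For concreteness, here is a toy sketch in the spirit of the continuous-attractor hypothesis; it is an invented example, not the model of refs. 7 to 9. A ring of rate neurons with cosine-shaped connectivity relaxes to a stable bump of activity, and because the bump can settle at any angle, the stable states form a ring-shaped manifold of attractors.

    # A toy "ring attractor": local excitation along a ring of neurons
    # lets a bump of activity persist at any angle.
    import numpy as np

    n = 100
    theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
    W = 3.0 * np.cos(theta[:, None] - theta[None, :]) / n  # cosine coupling

    r = np.random.default_rng(0).random(n)   # random initial firing rates
    for _ in range(500):                     # relax the rate dynamics
        r = np.maximum(0.0, r + 0.1 * (-r + W @ r + 0.1))

    # The peak angle of the settled bump is the stored continuous variable.
    print(theta[np.argmax(r)])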
References

1. J. Tenenbaum, V. de Silva, J. C. Langford, Science 290, 2319 (2000).
2. S. Roweis, L. Saul, Science 290, 2323 (2000).
3. K. Devlin, Mathematics: The Science of Patterns (Scientific American Library, New York, 1997).
4. J. L. McFarland, A. F. Fuchs, J. Neurophysiol. 68, 319 (1992).
5. J. S. Taube, Prog. Neurobiol. 55, 225 (1998).
6. J. J. Hopfield, Proc. Natl. Acad. Sci. U.S.A. 79, 2554 (1982).
7. H. S. Seung, Adv. Neural Info. Proc. Syst. 10, 654 (1998).
8. H. S. Seung, Proc. Natl. Acad. Sci. U.S.A. 93, 13339 (1996).
9. K. Zhang, J. Neurosci. 16, 2112 (1996).
