phymath999: mit01 The Manifold01 Ways of Perception

Saturday, March 9, 2013


This is the html version of the file http://www.seas.upenn.edu/~ddlee/Papers/manifold_ways.pdf.

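M is intrinsically one-dimensional, although it is embedded in image space, which has a high dimensionality equal to the number of image pixels. If we were to allow other types of image transformations, such as scaling and translation, then the dimensionality of M would increase, but would still remain far less than that of the image space.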


The Manifold Ways of Perception

H. Sebastian Seung and Daniel D. Lee

H. S. Seung is at the Howard Hughes Medical Institute and Brain and Cognitive Sciences Department, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. D. D. Lee is at Bell Labs, Lucent Technologies, Murray Hill, NJ 07974, USA.

Two-and-a-half millennia ago, the Greek philosopher Heraclitus, observing that the world is in eternal flux, wrote that you can never step in the same river twice. If he were alive today and working as a psychologist, he might say that you can never see the same face twice. Indeed, faces can grow hair, acquire wrinkles, or be surgically enhanced. But facial images also vary from moment to moment, as you can demonstrate at home while watching television. Make a small aperture in a piece of paper, and place it over a face on the screen. The light coming through the aperture will vary with time, mostly as a result of changes in the location and orientation of the face.

The aperture might show a tooth at one instant, and a nostril at the next, crudely simulating the fluctuations in light incident on a single retinal photoreceptor cell. This illustrates that the signals carried from the eye to the brain by the million or so axons in the optic nerve are perpetually changing as we look at a face. Nevertheless, we are able to perceive that these changing signals are produced by the same object. This is the fundamental mystery of perception: How does the brain perceive constancy even though its raw sensory inputs are in flux? The mystery intrigues not only scientists but also engineers, who yearn to construct vision machines that equal the performance of humans at visual object recognition.

To precisely characterize the variability of images and other perceptual stimuli, it is essential to take a mathematical approach, which is just what Tenenbaum et al. (1) and Roweis and Saul (2) have done on pages 2319 and 2323 of this issue, respectively. An image can be regarded as a collection of numbers, each specifying light intensity at an image pixel. But a collection of numbers also specifies the Cartesian coordinates of a point with respect to a set of axes. Therefore, any image can be identified with a point in an abstract image space.

Now consider a simple example of image variability, the set M of all facial images generated by varying the orientation of a face (see the figure). This set is a continuous curve in the image space. It is continuous because the image varies smoothly as the face is rotated. It is a curve because it is generated by varying a single degree of freedom, the angle of rotation. In other words, M is intrinsically one-dimensional, although it is embedded in image space, which has a high dimensionality equal to the number of image pixels. If we were to allow other types of image transformations, such as scaling and translation, then the dimensionality of M would increase, but would still remain far less than that of the image space. In this generalized case, M is said to be a manifold embedded in the image space. A curve is an example of a one-dimensional manifold, whereas a sphere is an example of a two-dimensional manifold (3).

Figure. Manifolds in visual perception. The retinal image is a collection of signals from photoreceptor cells. If these numbers are taken to be coordinates in an abstract image space, then an image is represented by a point. Only three dimensions of the image space are depicted, but actually the dimensionality is equal to the number of photoreceptor cells. As the faces are rotated, they trace out nonlinear curves embedded in image space. If changes in scale, illumination, and other sources of continuous variability are also included, then the images would lie on low-dimensional manifolds, rather than the simple one-dimensional curves shown. To recognize faces, the brain must equate all images from the same manifold, but distinguish between images from different manifolds. How the brain represents image manifolds is as yet unknown. According to one hypothesis, they are stored in the brain as manifolds of stable neural-activity patterns.
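The idea that a one-parameter family of images traces a curve in image space can be checked numerically. The following is a minimal sketch, not taken from the article: instead of a rotating face it uses a synthetic 16 x 16 image of a Gaussian blob orbiting the image center, and the function name `render_blob` and all parameter values are illustrative assumptions.

```python
import numpy as np

def render_blob(angle, size=16, radius=5.0, width=2.0):
    """Render a size x size image of a Gaussian blob orbiting the image center.

    The single parameter `angle` plays the role of the rotation angle in the text.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    center = (size - 1) / 2.0
    cx = center + radius * np.cos(angle)
    cy = center + radius * np.sin(angle)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * width ** 2))

# Sample the one degree of freedom and flatten each image into a vector:
# every image becomes one point in a 256-dimensional image space.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
points = np.stack([render_blob(a).ravel() for a in angles])

# Continuity: images at neighboring angles are close together in image space,
# compared with images half a turn apart.
step = np.linalg.norm(np.diff(points, axis=0), axis=1)
far = np.linalg.norm(points[0] - points[100])

# Nonlinearity: the curve does not fit inside a one- or two-dimensional linear
# subspace, so several principal components are needed to capture its variance.
centered = points - points.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)
variance_share = singular_values ** 2 / np.sum(singular_values ** 2)
n_components_95 = int(np.searchsorted(np.cumsum(variance_share), 0.95) + 1)

print(points.shape, step.max(), far, n_components_95)
```

The set of points is intrinsically one-dimensional because it is generated by `angle` alone, yet a linear description needs several components, which is the sense in which the curve is nonlinear.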

Although the preceding discussion is biased toward vision, manifolds are also relevant to other types of perception. Furthermore, scientists in many fields face the problem of simplifying high-dimensional data by finding low-dimensional structure in it. Therefore, the manifold learning algorithms described by Tenenbaum et al. (1) and Roweis and Saul (2) are of potentially broad interest. The goal of the algorithms is to map a given set of high-dimensional data points into a surrogate low-dimensional space. Both start with a preprocessing step that decides for each data point which of the other data points should be considered its neighbors. Then both compute measures of the local geometry of the manifold, after which the original data points are no longer needed.
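The neighbor-selection step can be sketched as follows. The article does not say which rule is used, so this example assumes one common choice, the k nearest neighbors under Euclidean distance; the function names are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

def k_nearest_neighbors(X, k):
    """Return an (n, k) array holding the indices of each point's k nearest neighbors."""
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)          # a point is not its own neighbor
    return np.argsort(D, axis=1)[:, :k]

# Toy data: five unevenly spaced points on a line.
X = np.array([[0.0], [1.0], [2.5], [4.5], [7.0]])
neighbors = k_nearest_neighbors(X, k=2)
print(neighbors)
```

Row i of `neighbors` lists the points treated as neighbors of point i, nearest first. Another possible rule, not shown here, is to take every point within a fixed radius.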

In the Isomap algorithm of Tenenbaum et al., the local quantities computed are the distances between neighboring data points. For each pair of non-neighboring data points, Isomap finds the shortest path through the data set connecting them, subject to the constraint that the path must hop from neighbor to neighbor. The length of this path is an approximation to the distance between its end points, as measured within the underlying manifold. Finally, the classical method of multidimensional scaling is used to find a set of low-dimensional points with similar pairwise distances.
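The three stages just described (neighbor distances, shortest paths, multidimensional scaling) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: neighbors are chosen by a k-nearest rule, shortest paths use the Floyd-Warshall recursion, and the neighborhood graph is assumed to be connected. The function name `isomap` and its parameters are hypothetical.

```python
import numpy as np

def isomap(X, k, n_components):
    """Sketch of Isomap: k-NN graph, graph shortest paths, classical MDS."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt(np.sum(diff ** 2, axis=2))      # Euclidean distances

    # Stage 1: keep only distances between neighboring points.
    masked = D.copy()
    np.fill_diagonal(masked, np.inf)
    nbrs = np.argsort(masked, axis=1)[:, :k]
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        G[i, nbrs[i]] = D[i, nbrs[i]]
        G[nbrs[i], i] = D[i, nbrs[i]]           # keep the graph symmetric

    # Stage 2: shortest neighbor-to-neighbor paths (Floyd-Warshall).
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])

    # Stage 3: classical multidimensional scaling on the path lengths.
    J = np.eye(n) - np.ones((n, n)) / n         # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    Y = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
    return Y, G

# Toy data: 40 points on three quarters of a unit circle.
t = np.linspace(0.0, 1.5 * np.pi, 40)
X = np.column_stack([np.cos(t), np.sin(t)])
Y, G = isomap(X, k=2, n_components=1)

straight = np.linalg.norm(X[0] - X[-1])   # Euclidean distance across the gap
along = G[0, -1]                          # path length measured along the arc
print(straight, along, 1.5 * np.pi)
```

For the two end points, the straight-line distance cuts across the gap in the arc, while the path length stays close to the true arc length. The one-dimensional output `Y` orders the points by their position along the arc, up to an overall sign.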

The locally linear embedding algorithm of Roweis and Saul computes a different local quantity, the coefficients of the best approximation to a data point by a weighted linear combination of its neighbors. Then the algorithm finds a set of low-dimensional points, each of which can be linearly approximated by its neighbors with the same coefficients that were determined from the high-dimensional data points. Both algorithms yield impressive results on some benchmark artificial data sets, as well as on "real world" data sets. Importantly, they succeed in learning nonlinear manifolds, in contrast to algorithms such as principal component analysis, which can only learn linear manifolds.
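A compact sketch of the two stages of locally linear embedding follows. It rests on several assumptions that go beyond the text above: neighbors are chosen by a k-nearest rule, the weights for each point are constrained to sum to one, a small regularization term (`reg`) stabilizes the local linear solve, and the low-dimensional points are read off from the bottom eigenvectors of a matrix built from the weights. The function name `lle` and the parameter values are hypothetical.

```python
import numpy as np

def lle(X, k, n_components, reg=1e-3):
    """Sketch of locally linear embedding for the rows of X."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt(np.sum(diff ** 2, axis=2))
    np.fill_diagonal(D, np.inf)
    nbrs = np.argsort(D, axis=1)[:, :k]

    # Stage 1: weights that best reconstruct each point from its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                    # neighbors relative to point i
        C = Z @ Z.T                              # local Gram matrix (k x k)
        C = C + reg * np.trace(C) * np.eye(k)    # regularize a near-singular C
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()              # enforce sum-to-one constraint

    # Stage 2: low-dimensional points reconstructed by the same weights.
    # They are the eigenvectors of M with the smallest eigenvalues,
    # skipping the first, which is the constant vector.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    vals, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1], W

# Toy data: 40 points on three quarters of a unit circle.
t = np.linspace(0.0, 1.5 * np.pi, 40)
X = np.column_stack([np.cos(t), np.sin(t)])
Y, W = lle(X, k=2, n_components=1)
print(np.corrcoef(Y[:, 0], t)[0, 1])
```

On this toy arc the single output coordinate should vary monotonically with position along the curve, up to sign and scale, which a linear projection of a curved arc cannot guarantee.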

Because manifolds are fundamental to perception, the brain must have some way of representing them. Clues to the nature of this representation may come from studies of how information is encoded in large populations of neurons. Population activity is typically described by a collection of neural firing rates, and so can be represented by a point in an abstract space with dimensionality equal to the number of neurons. Neurophysiologists have often found that the firing rate of each neuron in a population can be written as a smooth function of a small number of variables, such as the angular position of the eye (4) or direction of the head (5). This implies that the population activity is constrained to lie on a low-dimensional manifold.
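To make the population picture concrete, here is a hypothetical sketch in which the firing rate of every neuron is a smooth function of one variable, the head direction. The bump-shaped tuning curve, the number of neurons, and the parameters `r_max` and `kappa` are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def population_rates(theta, preferred, r_max=40.0, kappa=4.0):
    """Firing rate of each neuron: a smooth bump around its preferred direction."""
    return r_max * np.exp(kappa * (np.cos(theta - preferred) - 1.0))

n_neurons = 100
preferred = np.linspace(0.0, 2.0 * np.pi, n_neurons, endpoint=False)
thetas = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)

# Each row is one population activity pattern: a point in a 100-dimensional
# space. All rows are generated by the single variable theta, so together
# they trace a closed one-dimensional curve (a ring) in that space.
activity = np.stack([population_rates(th, preferred) for th in thetas])

# Because the pattern is determined by theta, theta can be read back from it.
# Here a population-vector readout sums the preferred directions, weighted
# by the firing rates.
z = activity @ np.exp(1j * preferred)
decoded = np.mod(np.angle(z), 2.0 * np.pi)
error = np.angle(np.exp(1j * (decoded - thetas)))   # wrapped to (-pi, pi]
print(activity.shape, np.max(np.abs(error)))
```

The readout recovers the head direction from the 100 firing rates, which illustrates that the population activity, although high-dimensional, carries only one degree of freedom in this model.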

What is the connection between such neural manifolds and the image manifolds we have just discussed? According to a well-known idea, memories are stored in brain dynamics as stable states, or dynamical attractors (6). Because the possible images of an object lie on a manifold, it has been hypothesized that a visual memory is stored as a manifold of stable states, or a continuous attractor (7). Recent studies of neural manifolds suggest that continuous attractors actually do exist in the brain (8, 9). Whether they are the basis of visual and other types of perception remains to be resolved. If the answer is affirmative, then manifolds will prove to be crucial for understanding how perception arises from the dynamics of neural networks in the brain.

22 DECEMBER 2000 VOL 290 SCIENCE www.sciencemag.org

References

1. J. Tenenbaum, V. de Silva, J. C. Langford, Science 290, 2319 (2000).

2. S. Roweis, L. Saul, Science 290, 2323 (2000).

3. K. Devlin, Mathematics: The Science of Patterns (Scientific American Library, New York, 1997).

4. J. L. McFarland, A. F. Fuchs, J. Neurophysiol. 68, 319 (1992).

5. J. S. Taube, Prog. Neurobiol. 55, 225 (1998).

6. J. J. Hopfield, Proc. Natl. Acad. Sci. U.S.A. 79, 2554 (1982).

7. H. S. Seung, Adv. Neural Info. Proc. Syst. 10, 654 (1998).

8. H. S. Seung, Proc. Natl. Acad. Sci. U.S.A. 93, 13339 (1996).

9. K. Zhang, J. Neurosci. 16, 2112 (1996).
