The 2001 edition of this course offers an advanced survey of selected topics of current interest in the neural and computational modeling of mammalian vision. This year's topics include motion perception, eye movements, attention and figure-ground phenomena. Several classes will be held at laboratories of nearby institutions. Students are expected to have a sufficient interdisciplinary grounding in the fundamentals of mammalian vision to read primary research sources extensively, and will be required to present short oral critiques of selected readings to the class. A term project that combines a problem statement, literature review, and either (1) simulation of a model or (2) a design for a psychophysical experiment is also required.
FREQUENTLY-ASKED QUESTIONS about CN 730
Information for GUEST SPEAKERS
Dates of DELIVERABLES for student research reports
Abstract: The brain constructs a representation of the visual world from the outputs of neurons with very small receptive fields. This invariably introduces errors into the measurements of the most basic visual quantities, and the resulting confusion is often called the aperture problem. I will talk about how area MT in the macaque resolves the aperture problem for moving stimuli.
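For readers new to the aperture problem, the following is a minimal textbook-style sketch, not the speaker's model of MT. It assumes an idealized straight edge: a detector with a small receptive field can measure only the velocity component perpendicular to the edge, and two such measurements from differently oriented edges can be combined by the "intersection of constraints" rule to recover the true velocity. The function names and example numbers are illustrative assumptions.

```python
import math

def aperture_measurement(true_velocity, edge_angle):
    """Return (unit_normal, normal_speed) for an edge seen through a small aperture.

    Only motion perpendicular to the edge is visible; motion along the
    edge leaves the local image unchanged.
    """
    normal = (-math.sin(edge_angle), math.cos(edge_angle))
    speed = true_velocity[0] * normal[0] + true_velocity[1] * normal[1]
    return normal, speed

def intersection_of_constraints(measurement_a, measurement_b):
    """Solve n_a . v = s_a and n_b . v = s_b for the 2D velocity v (Cramer's rule)."""
    (ax, ay), sa = measurement_a
    (bx, by), sb = measurement_b
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("edges are parallel; the velocity is still ambiguous")
    vx = (sa * by - ay * sb) / det
    vy = (ax * sb - sa * bx) / det
    return vx, vy

# Illustrative values only: one object, two edges at 30 and 120 degrees.
true_velocity = (1.0, 0.5)
m1 = aperture_measurement(true_velocity, math.radians(30))
m2 = aperture_measurement(true_velocity, math.radians(120))
print("local normal speeds:", m1[1], m2[1])
print("recovered velocity:", intersection_of_constraints(m1, m2))
```

Each local measurement on its own underestimates the speed and misreports the direction; only the combination recovers the object's motion.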
Background:
Albright TD, Stoner GR (1995) Visual motion perception. Proc Natl Acad Sci U S A 1995 Mar 28;92(7):2433-40.
Allman, J. M., F. Miezin, and E. McGuinness (1985) Direction- and velocity-specific responses from beyond the classical receptive field in the middle temporal visual area (MT). Perception 14:105-26.
Core:
Albright TD. Direction and orientation selectivity of neurons in visual area MT of the macaque. Journal of Neurophysiology, 1984; 52(6):1106-30.
Lorenceau J, Shiffrar M, Wells N, Castet E. Different motion sensitive units are involved in recovering the direction of moving lines. Vision Research, 1993; 33(9):1207-17.
Pack CC, Born RT. Temporal dynamics of a neural solution to the aperture problem in macaque visual area MT. Nature, in press.
Born, R. T., J. M. Groh, R. Zhao, and S. L. Lukasewycz (2000) Segregation of object and background motion in visual area MT: effects of microstimulation on eye movements. Neuron. 2000 Jun;26(3):725-34.
Supplementary:
Britten, K. H., W. T. Newsome, M. N. Shadlen, S. Celebrini, and J. A.
Movshon (1996) A relationship between behavioral choice and the visual
responses of neurons in macaque MT. Vis. Neurosci. 13:87-100.
Newsome, W T, R H Wurtz, M R Dursteler, and A Mikami (1985) Deficits in visual motion processing following ibotenic acid lesions of the middle temporal visual area of the macaque monkey. J. Neurosci. 5:825-40.
Masson GS, Rybarczyk Y, Castet E, Mestre DR (2000) Temporal dynamics of motion integration for the initiation of tracking eye movements at ultra-short latencies. Visual Neuroscience.
Watanabe, T. & Miyauchi, S. (1998). Role of attention and form in visual motion processing: Psychophysical and brain imaging studies. In High-level motion processing: Computational, neurobiological and psychophysical perspectives (Ed. Takeo Watanabe), MIT Press, pp. 95-114.
Watanabe, T. et al (1998). Attention-dependent differential activation within the motion pathway. Proceedings of the National Academy of Sciences of the USA, 95, 11489-11492.
Watanabe, T. et al (1998). Attention-regulated activity in human primary visual cortex. Journal of Neurophysiology, 79, 2218-2221.
Feb 7 -- David Somers
Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., Brady, T. J., Rosen, B. R., & Tootell, R. B. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268(5212), 889-893.
Somers, D.C., Dale, A.M., Seiffert, A.E., Tootell, R.B.H. Functional MRI Reveals Spatially Specific Attentional Modulation in Human Primary Visual Cortex. Proc. Nat'l Acad. Sci. (USA), 96, 1663-1668, 1999.
See also: PNAS Commentary by Posner & Gilbert.
Supplementary articles suggested by students:
Montero VM, (2000), Attentional activation of the visual thalamic reticular nucleus depends on 'top-down' inputs from the primary visual cortex via corticogeniculate pathways, Brain Res, May 2;864(1):95-104.
Tootell RBH, Hadjikhani NK, Mendola JD, et al. From retinotopy to recognition: fMRI in human visual cortex. TRENDS COGN SCI 2: (5) 174-183 MAY 1998.
LEARNING-BASED NEURAL REPRESENTATIONS OF BIOLOGICAL MOTION
The human visual system has an astonishing capability for the analysis of complex biological motion stimuli. The underlying neural mechanisms are still largely unknown. A neural model is presented that is consistent with the facts known from the neurophysiology of the ventral and dorsal pathways and that reproduces a variety of psychophysical and neurophysiological results on biological motion perception. The model is based on the assumption that complex movement patterns are encoded on the basis of learned prototypical example patterns. This assumption is analogous to the representation of the shape of 3D objects by learned prototypical 2D views in the ventral pathway, which is strongly supported by recent psychophysical and neurophysiological evidence. The model makes a number of predictions that can be tested psychophysically, neurophysiologically, and using fMRI methods.
The assumption of a representation of articulated movement patterns by learned prototypical examples can also be exploited in the domain of computer vision. The second part of the talk treats a new method that defines morphable models for spatio-temporal patterns by linear combination of prototypical example movements. It is demonstrated that this method has a broad spectrum of applications. One application in the field of computer graphics is the synthesis of new movement patterns by motion morphing. Other applications in the field of computer vision are the classification of movement patterns and, in particular, the estimation of parameters that characterize the style of complex movements.
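The idea of building a new movement as a linear combination of prototypical example movements can be illustrated with a minimal sketch. This is not the method presented in the talk: it assumes the prototype trajectories are already in spatio-temporal correspondence (same number of time steps, same sampling), and the function name, the toy trajectories, and the weights are all hypothetical.

```python
def morph_trajectories(prototypes, weights):
    """Blend aligned prototype trajectories point by point.

    prototypes: list of trajectories, each a list of (x, y) positions
                sampled at the same time steps (an assumption of this sketch).
    weights:    one blending weight per prototype.
    """
    if len(prototypes) != len(weights):
        raise ValueError("need exactly one weight per prototype")
    length = len(prototypes[0])
    if any(len(p) != length for p in prototypes):
        raise ValueError("prototypes must have the same number of time steps")
    blended = []
    for t in range(length):
        x = sum(w * p[t][0] for p, w in zip(prototypes, weights))
        y = sum(w * p[t][1] for p, w in zip(prototypes, weights))
        blended.append((x, y))
    return blended

# Toy example: one tracked point during two made-up movement styles.
slow_step = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
fast_step = [(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)]
halfway = morph_trajectories([slow_step, fast_step], [0.5, 0.5])
print(halfway)
```

Read in the other direction, the weights that best reproduce an observed movement give a low-dimensional description of its style, which is the sense in which such a model can also support classification and parameter estimation.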
Main readings:
Giese, M.A. Neural Model for the Recognition of Biological Motion. In: Dynamische Perzeption, G. Baratoff and H. Neumann (eds.), Infix Verlag, Berlin, 105-110, 2000.
Giese, M.A. and T. Poggio. Morphable Models for the Analysis and Synthesis of Complex Motion Patterns, International Journal of Computer Vision, 38, 1, 59-73, 2000.
Additional readings:
Giese, M.A. Dynamic Neural Field Theory of Motion Perception, Kluwer Academic Publishers, Dordrecht, Netherlands, 1999.
Johansson, G. (1973) Visual perception and a model for its analysis.
Perception and Psychophysics, 14, 201-211.
Perrett DI, Smith PA, Mistlin AJ, Chitty AJ, Head AS, Potter DD, Broennimann R, Milner AD, Jeeves MA (1985) Visual analysis of body movements by neurones in the temporal cortex of the macaque monkey: a preliminary report. Behav Brain Res. 1985 Aug;16(2-3):153-70.
Pinto, J. and Shiffrar, M. (1999) Subconfigurations in the human form in the perception of biological motion displays. Acta Psychologica, 102, 293-318.
Riesenhuber, M. and T. Poggio. Hierarchical Models of Object Recognition in Cortex, Nature Neuroscience, 2, 1019-1025, 1999.
Riesenhuber, M., and T. Poggio. Models of Object Recognition, Nature Neuroscience, 3 Supp., 1199-1204, 2000.
Hour 1:
-----------------------------------------------------------------
"How to Tell Shading from Paint"
Bill Freeman
Mitsubishi Electric Research Labs (MERL)
Abstract:
When people study a picture, they can judge whether it depicts a shaded, 3-dimensional surface, or simply a flat surface with markings or paint on it. This task--distinguishing shading from paint--is essential for interpreting images. We seek to get a computer to make the same judgements. We use as "ground truth" a database of pictures that human subjects had labelled according to their "shadedness" (from Freeman and Viola '98).
We use a machine learning approach. We generate a training set of synthetic examples of images that are either caused by shading or paint, from which we derive probabilistic interpretations for a given local patch of image. We use a Markov network to model the images and underlying scenes, and use Bayesian belief propagation to efficiently propagate the local probabilistic evidence across the image.
The machine learning approach focusses attention on representations. We contrast two different approaches. One uses a pixel-based image representation and solves for the shape and reflectance at each position. The second approach represents image data by a cascaded energy model, and represents the scene only by a label for the cause of the image information at each position, scale, and orientation of a steerable pyramid. We compare the methods, and show results from each approach.
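As a rough illustration of how belief propagation spreads local evidence through a Markov network, here is a minimal sum-product sketch on a one-dimensional chain of patches with two labels ("shading" and "paint"). It is a toy under stated assumptions, not the system described in the abstract: the talk's network covers a 2D image, and the evidence values, the compatibility matrix, and the function names below are invented for illustration.

```python
def _normalize(values):
    total = sum(values)
    return [v / total for v in values]

def chain_belief_propagation(evidence, compat):
    """Sum-product message passing on a chain (exact for chains).

    evidence[i][s]: local likelihood that patch i has label s.
    compat[r][s]:   compatibility of label r with label s on neighbouring patches.
    Returns the posterior marginal over labels at every patch.
    """
    n, k = len(evidence), len(evidence[0])
    # from_left[i] is the message arriving at patch i from its left neighbour.
    from_left = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        msg = [sum(from_left[i - 1][r] * evidence[i - 1][r] * compat[r][s]
                   for r in range(k)) for s in range(k)]
        from_left[i] = _normalize(msg)
    # from_right[i] is the message arriving at patch i from its right neighbour.
    from_right = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        msg = [sum(from_right[i + 1][r] * evidence[i + 1][r] * compat[s][r]
                   for r in range(k)) for s in range(k)]
        from_right[i] = _normalize(msg)
    return [_normalize([evidence[i][s] * from_left[i][s] * from_right[i][s]
                        for s in range(k)]) for i in range(n)]

# Label 0 = shading, label 1 = paint. The middle patch is locally ambiguous.
evidence = [[0.9, 0.1], [0.5, 0.5], [0.8, 0.2]]
compat = [[0.8, 0.2], [0.2, 0.8]]  # neighbours tend to share a label
for i, belief in enumerate(chain_belief_propagation(evidence, compat)):
    print(i, belief)
```

The middle patch has uninformative local evidence, yet its belief leans toward "shading" because both neighbours do; this is the propagation of local probabilistic evidence in miniature.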
Joint work with Egon Pasztor (MIT Media Lab) and Matt Bell (Stanford).
References:
Bell and Freeman, Learning local evidence for shading and reflectance,
http://www.merl.com/reports/TR2001-04/
Freeman, Pasztor, and Carmichael, Learning Low-Level Vision,
Intl. J. Computer Vision, October, 2000.
http://www.merl.com/reports/TR2000-05/
Freeman and Viola, Bayesian model of surface perception, NIPS 1998
http://www.merl.com/reports/TR98-05/
Hour 2:
-----------------------------------------------------------------
Baback Moghaddam <baback@merl.com>
Title: Gender Classification with Support Vector Machines (SVMs)
Pointers:
"Gender Classification with Support Vector Machines," Moghaddam B. and
Yang M-H., in Proceedings of the 4th IEEE Int'l Conf. on Face and
Gesture Recognition, FG2000, Grenoble, France, March 2000.
http://www.merl.com/reports/TR2000-01/index.html
C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern
Recognition. Knowledge Discovery and Data Mining, 2(2), 1998.
http://www.kernel-machines.org/papers/Burges98.ps.gz
SVM home page:
http://svm.first.gmd.de/
Chest computed tomography (CT) has become a well-established means of diagnosing pulmonary metastasis in oncology patients and evaluating response to treatment regimens. Since diagnosis and prognosis of cancer generally depend upon growth assessment, repeat CT studies are used to determine growth rates of pulmonary nodules.
The long-term objective of our project is to offer the radiologist a fully automated computer vision system that detects and compares pulmonary nodules in repeat CT studies. Such a system would provide a quantitative and efficient tool for the radiologist to analyze CT scans and may therefore indirectly impact patients' treatment regimen.
I will describe a prototype system for the analysis of pulmonary nodule location, shape, and volumetric growth and then present some recent results on automatic image registration.
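To make "growth rate from repeat CT studies" concrete, the sketch below computes a volume doubling time under the common assumption of exponential growth. It is not taken from the prototype system described above; the spherical-nodule approximation, the function names, and the example measurements are assumptions chosen for illustration.

```python
import math

def sphere_volume(diameter_mm):
    """Volume of a sphere in cubic millimetres (nodule approximated as a sphere)."""
    return math.pi * diameter_mm ** 3 / 6.0

def doubling_time(volume_first, volume_second, days_between):
    """Days needed for the volume to double, assuming exponential growth.

    Returns infinity for an unchanged volume and a negative value for a
    shrinking nodule (time to halve, with a minus sign).
    """
    if volume_first <= 0 or volume_second <= 0 or days_between <= 0:
        raise ValueError("volumes and interval must be positive")
    if volume_second == volume_first:
        return math.inf
    return days_between * math.log(2.0) / math.log(volume_second / volume_first)

# Made-up measurements: diameter grows from 8 mm to 10 mm in 90 days.
v1 = sphere_volume(8.0)
v2 = sphere_volume(10.0)
print("doubling time in days:", doubling_time(v1, v2, 90))
```

Because volume scales with the cube of the diameter, a modest change in diameter corresponds to a large change in volume, which is one reason volumetric comparison across studies is attractive.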
Reading:
J. P. Ko and M. Betke, "Chest CT: Automated Nodule Detection and Assessment of Change over Time-Preliminary Experience." Radiology, 218, 267-273, January 2001.
Readings:
1. Blasdel GG. 1997. Strategies of visual perception suggested by optically imaged patterns of functional architecture in monkey visual cortex. In: Imaging Brain Structure and Function (Lester DS, Felder CC, and Lewis EN, ed.). Ann. N.Y. Acad. Sci. 820: 170-195.
2. Obermayer K, and Blasdel GG. 1993. Geometry of orientation and ocular dominance columns in monkey striate cortex. J. Neuroscience. 13(10): 4114-4129.
3. Blasdel GG. 1992. Differential imaging of ocular dominance columns and orientation selectivity in monkey striate cortex. J. Neuroscience. 12(8): 3115-3138.
4. Blasdel GG, Campbell D. Functional retinotopy of monkey visual cortex. J. Neurosci. 2000.
5. Blasdel GG, Salama G. Voltage-sensitive dyes reveal a modular organization in monkey striate cortex. Nature. 321:579-585. 1986.
Readings:
Assad, J.A. and Maunsell, J. H. Neuronal correlates of inferred motion in primate posterior parietal cortex. Nature 373: 518-521 (1995).
Eskandar, E. N. and Assad, J.A. Dissociation of visual, motor and predictive signals in parietal cortex during visual guidance. Nature Neuroscience 2:88-93 (1999).
Nakayama, K., He, Z. J., and Shimojo, S. (1995). Visual surface representation: A critical link between lower-level and higher-level vision. In S. M. Kosslyn and D. N Osherson, Eds., Visual cognition. Cambridge, MA: MIT Press, 1995.
Nakayama, K. and Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257, 1357-1363.
Duncan, R. O., Albright, T. D., & Stoner, G. R. (2000). Occlusion and the interpretation of visual motion: perceptual and neuronal effects of context. J Neurosci, 20(15), 5885-5897.
Bakin, J. S., Nakayama, K., & Gilbert, C. D. (2000). Visual responses in monkey areas V1 and V2 to three-dimensional surface configurations. J Neurosci, 20(21), 8188-8198.
Zhou, H., Friedman, H. S. and von der Heydt, R. (2000). Coding of Border Ownership in Monkey Visual Cortex. J Neurosci, 20(17):6594-6611.
Fay, D.A.,Waxman, A.M., Aguilar, M., Ireland, D.B., Racamato, J. P., Ross, W.D., Streilein, W.W., and Braun, M. I. Fusion of Multi-Sensor Imagery for Night Vision: Color Visualization, Target Learning and Search. Proceedings of the Third International Conference on Information Fusion. Paris, France, July 10-13, 2000.
Ross, W.D., Waxman, A.M., Streilein, W.W., Aguilar, M., Verly, J., Liu, F., Braun, M.I., Harmon, P., and Rak, S. Multi-Sensor 3D Image Fusion and Interactive Search. Proceedings of the Third International Conference on Information Fusion. Paris, France, July 10-13, 2000.
Streilein, W.W., Waxman, A.M., Ross, W.D., Liu, F., Braun, M.I., Fay, D.A., Harmon, P., and Read, C.H. Fused Multi-Sensor Image Mining for Feature Foundation Data. Proceedings of the Third International Conference on Information Fusion. Paris, France, July 10-13, 2000.
Schiller, P. (1998). The neural control of visually guided eye movements. In John Richards (Ed.) Cognitive Neuroscience of Attention. Hillsdale, NJ: Erlbaum.
See also: Schiller lab web site, especially pages on eye movements.
Last Updated 2 April, 2001
This page is maintained by Ennio Mingolla
Please direct all queries and bug reports to: ennio@cns.bu.edu