The Eyes
The Cog Shop
MIT Artificial Intelligence Laboratory
545 Technology Square, #920
Cambridge, MA 02139
To approximate the complexities of the human visual system, we designed Cog's head and
visual system around five goals:
- Binocular
- Active
- Compact
- Wide Field of View
- High Resolution Area
A binocular camera arrangement is necessary because disparity and approximate depth
will be important factors for discriminating objects. We also require the system to be
active, that is, the eyes should have a human-like speed and range of motion. The system
should be compact enough to fit on the robot's head, and to allow the cameras to move with
reasonable speed. As with human vision, the system should have both a wide field of view
(for detection of motion and objects in the far fields) and very high resolution
capability. (Humans and other animals have both a wide, low-resolution field of view and a
very narrow, high resolution field of view called the fovea.)
The camera system consists of two active "eyes" with two degrees of freedom (DOF) each,
for four in total. To mimic human eye movements, each eye can rotate about a vertical axis
(pan DOF) and a horizontal axis (tilt DOF). Human eyes actually have more than two degrees
of freedom, but the pan and tilt DOFs are sufficient to scan the visual space. To
approximate the range of motion of human eyes, mechanical stops were included on each eye
to permit a 120 degree pan rotation and a 60 degree tilt rotation. Each eye consists of
two black and white CCD cameras. Together, the camera ensemble approximates the wide
peripheral view and high resolution fovea region as described below. Small remote head
cameras were chosen so that each eye is compact and lightweight. Each camera is
finger-sized, measuring approximately 17 mm in diameter and 53 mm in length (without
connector), and weighs only 25 grams.
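As a rough illustration of the pan and tilt ranges above, the sketch below clamps a commanded eye position to the mechanical range in software. The `EyeCommand` class and `clamp` helper are hypothetical names, not part of Cog's software, and the sketch assumes the stops are symmetric about the straight-ahead position (plus or minus 60 degrees of pan, plus or minus 30 degrees of tilt), which the text does not state.

```python
from dataclasses import dataclass

# Ranges quoted in the text: 120 degree pan, 60 degree tilt.
# Assumption: the stops are symmetric about the straight-ahead position.
PAN_LIMIT_DEG = 120.0 / 2
TILT_LIMIT_DEG = 60.0 / 2


def clamp(value: float, limit: float) -> float:
    """Clamp value to the closed interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


@dataclass(frozen=True)
class EyeCommand:
    """Hypothetical pan/tilt target for one eye, in degrees."""
    pan_deg: float
    tilt_deg: float

    def within_stops(self) -> "EyeCommand":
        """Return a copy with both angles clamped to the assumed range."""
        return EyeCommand(clamp(self.pan_deg, PAN_LIMIT_DEG),
                          clamp(self.tilt_deg, TILT_LIMIT_DEG))


if __name__ == "__main__":
    # A pan request beyond the stop is pulled back to the limit.
    print(EyeCommand(75.0, -10.0).within_stops())
```

Limiting commands in software like this would keep the motors from driving into the stops; the mechanical stops themselves remain the actual safeguard.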
The lower camera of each eye gives Cog a wide peripheral field of view of 88.6 degrees (V)
by 115.8 degrees (H). Although this is narrower than human peripheral vision, it is
difficult to buy a lens with a wider field of view. The lens can focus from 10 mm to
infinity. The upper camera of each eye gives Cog a high resolution fovea. Its lens has a
15 mm focal length with a field of view of 18.4 degrees (V) by 24.4 degrees (H). This
provides a fovea region significantly larger than that of the human eye, which spans
approximately 0.3 degrees. The lens can focus from 90 mm to
infinity. We could have simplified our design by using a single camera per eye. However,
by using two cameras per eye we have a much higher resolution fovea than a single-camera
eye would provide. We feel the increase in angular resolution significantly improves the
system's ability to discern fine features (faces and textures). This added functionality
outweighs the additional mechanical complexity. The two images shown below were captured
simultaneously from the wide-angle and foveal cameras in Cog's left eye:
By minimizing the inertia of each eye and using thin, flexible cables, we allow the eyes to
move quickly using small motors. Each fully assembled eye (cameras, connectors, and
mounts) occupies a volume of approximately 42 mm (V) by 18 mm (H) by 88 mm (D) and weighs
about 130 grams. Although significantly heavier and larger than a human eye,
each eye is smaller and lighter than other active vision systems. To maintain an
anthropomorphic appearance, the eyes were mounted in a head slightly larger than a
human's.
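To put numbers on the resolution gain from the separate foveal camera described earlier, the sketch below compares the two quoted fields of view. It assumes both cameras in an eye have the same pixel count (the text implies identical cameras but gives no pixel dimensions), so the ratio of fields of view approximates the gain in average angular resolution. It also estimates the sensor size implied by the 15 mm lens using an ideal pinhole model, which ignores lens distortion. The function name `sensor_extent_mm` is illustrative.

```python
import math


def sensor_extent_mm(focal_length_mm: float, fov_deg: float) -> float:
    """Sensor extent implied by a field of view under an ideal pinhole model."""
    return 2.0 * focal_length_mm * math.tan(math.radians(fov_deg) / 2.0)


# Values quoted in the text.
FOVEA_FOCAL_MM = 15.0
FOVEA_FOV_H, FOVEA_FOV_V = 24.4, 18.4   # degrees
WIDE_FOV_H, WIDE_FOV_V = 115.8, 88.6    # degrees

# Implied imaging area behind the foveal lens (pinhole assumption).
width_mm = sensor_extent_mm(FOVEA_FOCAL_MM, FOVEA_FOV_H)
height_mm = sensor_extent_mm(FOVEA_FOCAL_MM, FOVEA_FOV_V)

# Assumption: equal pixel counts, so average degrees per pixel scale with
# field of view. The ratio is the average angular resolution gain.
gain_h = WIDE_FOV_H / FOVEA_FOV_H
gain_v = WIDE_FOV_V / FOVEA_FOV_V

if __name__ == "__main__":
    print(f"implied sensor: {width_mm:.2f} mm x {height_mm:.2f} mm")
    print(f"angular resolution gain: {gain_h:.2f}x (H), {gain_v:.2f}x (V)")
```

Under these assumptions the foveal camera resolves a little under five times finer detail per pixel than the wide-angle camera in each direction, and the implied sensor is roughly 6.5 mm by 4.9 mm. The wide-angle figure is only an average, since a lens that wide will not spread its pixels uniformly over the field.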
On average, the human eye performs 3 to 4 full-range saccades per second. To match this
rate, Cog's eye motor system is designed to perform three 120 degree pan saccades per
second and three 60 degree tilt saccades per second (with 250 ms of stability in between
saccades). To meet this requirement, Maxon 3.2 Watt motors with a 19.2:1 reduction were
selected for the pan motors, and Maxon 2.5 Watt motors with 16.58:1 reduction were
selected for the tilt motors.
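The timing requirement above implies a tight motion budget, which the following sketch works out. It reads the specification as three saccade cycles per second, each consisting of one full-range move followed by 250 ms of stability; that reading, and the function names, are assumptions. The result is a mean speed over the move, not a motor specification.

```python
SACCADES_PER_SECOND = 3      # from the text
STABLE_MS = 250.0            # stability between saccades, from the text
PAN_RANGE_DEG = 120.0
TILT_RANGE_DEG = 60.0


def motion_budget_ms(rate_hz: float, stable_ms: float) -> float:
    """Time left for the move itself within one saccade cycle."""
    return 1000.0 / rate_hz - stable_ms


def mean_speed_deg_per_s(angle_deg: float, move_ms: float) -> float:
    """Average angular speed needed to cover angle_deg in move_ms."""
    return angle_deg / (move_ms / 1000.0)


budget_ms = motion_budget_ms(SACCADES_PER_SECOND, STABLE_MS)
pan_speed = mean_speed_deg_per_s(PAN_RANGE_DEG, budget_ms)
tilt_speed = mean_speed_deg_per_s(TILT_RANGE_DEG, budget_ms)

if __name__ == "__main__":
    print(f"motion budget per saccade: {budget_ms:.1f} ms")
    print(f"mean pan speed:  {pan_speed:.0f} deg/s")
    print(f"mean tilt speed: {tilt_speed:.0f} deg/s")
```

Under this reading each saccade has about 83 ms to complete, implying a mean speed of roughly 1440 degrees per second in pan and 720 degrees per second in tilt. Peak speeds would be higher once acceleration and deceleration are accounted for; with a triangular velocity profile, for example, the peak is twice the mean.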