The motivation behind creating Cog is the hypothesis that:  
    
      Humanoid intelligence requires humanoid interactions with the world. 
     
    (Read The Cog Project: Building a Humanoid
    Robot for more details.) 
    Avoiding flighty anthropomorphism, you can consider Cog to be a set of sensors and
    actuators which tries to approximate the sensory and motor dynamics of a human body.
    Except for legs and a flexible spine, the major degrees of motor freedom in the trunk,
    head, and arms are all there. Sight exists, in the form of video cameras. Hearing and
    touch are on the drawing board. Proprioception in the form of joint position and torque is
    already in place; a vestibular system is on the way. Hands are being built as you read
    this, and a system for vocalization is also in the works. Cog is a single hardware
    platform which seeks to bring together each of the many subfields of Artificial
    Intelligence into one unified, coherent, functional whole.  
     
    Why build a human-like robot? 
    In thinking about human-level intelligence, there are two sets of reasons one might
    build a robot with humanoid form. 
    If one takes seriously the arguments of Johnson and Lakoff, then the form of our bodies
    is critical to the representations that we develop and use for both our internal
    thought (whatever that might mean...) and our language. If we are to build a robot
    with human-like intelligence, then it must have a human-like body in order to be able to
    develop similar sorts of representations. However, there is a large cautionary note to
    accompany this particular line of reasoning. Since we can only build a very crude
    approximation to a human body there is a danger that the essential aspects of the human
    body will be totally missed. There is thus a danger of engaging in cargo-cult science,
    where only the broad outline form is mimicked, but none of the internal essentials are
    there at all. 
    A second reason for building a humanoid form robot stands on firmer ground. An
    important aspect of being human is interaction with other humans. For a human-level
    intelligent robot to gain experience in interacting with humans it needs a large number of
    interactions. If the robot has humanoid form then it will be both easy and natural
    for humans to interact with it in a human-like way. In fact, it has been our observation
    that with just a very few human-like cues from a humanoid robot, people naturally fall
    into the pattern of interacting with it as if it were a human. Thus we can get a large
    source of dynamic interaction examples for the robot to participate in. These examples can
    be used with various internal and external evaluation functions to provide experiences for
    learning in the robot. Note that this source would not be at all possible if we simply had
    a disembodied human intelligence. There would be no reason for people to interact with it
    in a human-like way. 
     
    Why not just simulate it?
    One might argue that a well-simulated human face on a monitor would be as engaging as a
    robot---perhaps so, but it might be necessary to make the face appear to be part of a
    robot viewed by a distant TV camera, and even then the illusion of reality and engagedness
    might well disappear if the interacting humans were to know it was a simulation. These
    arguments, in both directions, are of course speculative, and it would be interesting,
    though difficult, to carry out careful experiments to determine the truth. Rather than
    being a binary truth, it may well be the case that the level of natural interaction is a
    function of the physical reality of the simulation, leading to another set of difficult
    engineering problems. Our experience, a terribly introspective and dangerous
    thing in general, leads us to believe that a physical robot is more engaging than a screen
    image, no matter how sophisticated. 
    But in any case...
    It turns out to be easier to build real robots than to simulate complex interactions
    with the world, including perception and motor control. Leaving those things out would
    deprive us of key insights into the nature of human intelligence. 
    Already from our work with Cog we have discovered how being forced to learn an
    ocular-motor map so that the robot can saccade high-resolution cameras (something that wouldn't
    be necessary to do in simulation) gives us a basis for learning visually guided
    manipulation skills. 
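    The flavor of that learning problem can be sketched in code. The following is a
    hypothetical toy model, not the actual Cog implementation: the robot does not know in
    advance how motor commands shift the camera image, so it learns, from the residual
    retinal error left after each attempted saccade, a map W from retinal target position
    (in pixels) to the motor command that centers the target. The geometry matrix G, the
    learning rate, and the update rule are all illustrative assumptions.

    ```python
    # Hypothetical sketch of online ocular-motor map learning (not the Cog code).
    import numpy as np

    rng = np.random.default_rng(0)

    # Unknown-to-the-learner geometry: a motor command m shifts the retinal
    # image by G @ m pixels, so a perfect saccade would use m = inv(G) @ p.
    G = np.array([[2.0, 0.3],
                  [0.1, 3.0]])

    def learn_saccade_map(G, n_trials=500, lr=0.5):
        """Learn W ~ inv(G) using only post-saccade residual errors."""
        W = 0.1 * np.eye(2)                    # crude initial guess of the map
        for _ in range(n_trials):
            p = rng.uniform(-50, 50, size=2)   # target's retinal position (pixels)
            m = W @ p                          # attempted saccade command
            e = p - G @ m                      # retinal error left after the saccade
            # Credit assignment: use the current map to convert the leftover
            # retinal error into an estimated missing motor command (W @ e),
            # and nudge W so this target direction would have received it.
            W += lr * np.outer(W @ e, p) / (p @ p)
        return W

    W = learn_saccade_map(G)

    # Evaluate: mean leftover retinal error on fresh targets.
    test_targets = rng.uniform(-50, 50, size=(100, 2))
    residual = np.mean([np.linalg.norm(p - G @ (W @ p)) for p in test_targets])
    ```

    The point of the sketch is the one made in the text: the mapping has to be learned
    from interaction because it depends on the physical optics and motors, and the same
    learned correspondence between vision and motion is a natural substrate for visually
    guided reaching.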
    To do a worthwhile simulation you have to understand all the issues relevant to the
    simulation beforehand; but as far as human level intelligence is concerned, that is
    exactly what we are trying to find out--the relevant issues. 