Natural Tasking of Robots Based on Human Interaction Cues

MIT Computer Science and Artificial Intelligence Laboratory
The Stata Center
32 Vassar Street
Cambridge, MA 02139
USA

PI: Rodney A. Brooks

[Project Overview], [Approach], [Research Questions], [Achieved Deliverables], [Future Deliverables], [People], [Publications]

Cog turns a crank
M4 robot head drawing
Kismet plays with a frog
Coco the gorilla robot

Achieved Deliverables

1999-2000 | 2000-2001 | 2001-2002 | 2002-2003

2003-2004:

Accomplishments are on 7 robotic platforms: Cog, Cardea, Coco, a robot head named Mertz, a new humanoid named Domo, a human wearable/hybrid system named Duo and an unnamed 5 DOF hand. This work was done between July 2003 and July 2004.

The 'Yet Another Robot Platform' open software library is used on 6 platforms

The 'Yet Another Robot Platform' (YARP) library is written in C and C++. It provides routines for robot platform development covering inter-process communication, vision and control, and it supports operating system services on Windows NT, QNX4 and QNX6. It runs on multiple robotic platforms at MIT (Cog, Coco, Domo, Mertz, Cardea and Duo) and in Europe.

Door Shoving by A Self-Balancing Mobile Humanoid

Cardea, a prototype humanoid based on a Segway RMP, can navigate an office corridor, find a partially closed door, shove it open and pass through. The RMP base is extended with contact and IR sensing, a simple torso, a 3 DOF arm for manipulation, vision using one fixed camera and one single-DOF camera, and entirely on-board computation.

Watch it in action.

Emergency Kickstands within Safety System for Segway RMP base

Two emergency kickstands for Cardea deploy when a 'sniffer' detects software definable error conditions indicating the platform is falling over. They are part of a complete safety system that overrides robotic control when the RMP over-tilts. First, the system relies on RMP self-balancing. When self-balancing fails, the kickstands eject. Safety is also ensured via radio controlled Emergency stop (E-stop).

For more information.
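
The escalation described for Cardea's safety system (self-balancing first, kickstands on over-tilt, and a radio E-stop as an independent override) can be sketched as a small decision function. This is only an illustrative sketch: the names `SafetyAction`, `safety_action` and the threshold `TILT_LIMIT_DEG` are hypothetical and not taken from Cardea's software, and the real 'sniffer' conditions are software definable rather than a single tilt angle.

```python
from enum import Enum


class SafetyAction(Enum):
    NONE = "none"                     # RMP self-balancing is coping on its own
    DEPLOY_KICKSTANDS = "kickstands"  # self-balancing has failed, eject kickstands
    E_STOP = "e_stop"                 # operator pressed the radio emergency stop


# Hypothetical over-tilt threshold in degrees (an assumption for illustration).
TILT_LIMIT_DEG = 15.0


def safety_action(tilt_deg: float, estop_pressed: bool) -> SafetyAction:
    """Pick the safety response for one sensor reading."""
    if estop_pressed:
        # The radio E-stop overrides everything else.
        return SafetyAction.E_STOP
    if abs(tilt_deg) > TILT_LIMIT_DEG:
        # The platform is falling over in either direction.
        return SafetyAction.DEPLOY_KICKSTANDS
    return SafetyAction.NONE
```

Checking the E-stop before the tilt test reflects the idea that an operator override should never be masked by another condition.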

A Lightweight Computational Hardware Architecture Supporting Humanoid Mobility and Manipulation

A computational hardware architecture demonstrates humanoid navigation and manipulation. It consists of a network of distributed, onboard, lightweight 8-bit computational elements that supports behavior, sensorimotor and RMP controllers, power circuitry and debugging.

A Prototype Camera-Arm Platform Integrating a Visual System and a Motor System Running on an Embedded Architecture

The design of embedded brushless motor amplifiers, DSP motor controllers and sensor conditioning is integrated with the ALIVE hardware and software architecture. A 5 DOF force controllable prototype arm, with series elastic actuators, a differentially driven shoulder and a virtually centered elbow, runs on the embedded architecture using virtual spring control and a 'CreaL' (creature language) behavioral controller. It can track in conjunction with a 2 DOF active vision system running on a laptop. It can reach towards and poke an object using visual and color information and estimating the position of its hand via forward kinematics in visual coordinate space.

Watch it in action.
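
Two ideas from the arm description, virtual spring control and hand position estimation via forward kinematics, can be illustrated in a few lines. The sketch below is a simplification under stated assumptions: it uses a planar 2-link arm rather than the 5 DOF prototype, and the function names, gains and link lengths are hypothetical, not taken from the project's code.

```python
import math


def virtual_spring_torque(q, q_des, q_dot, stiffness, damping=0.0):
    """Torque of a virtual spring-damper pulling joint angle q towards q_des."""
    return stiffness * (q_des - q) - damping * q_dot


def planar_forward_kinematics(q1, q2, l1, l2):
    """Hand position (x, y) of a planar 2-link arm with joint angles in radians."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y
```

With a virtual spring, the commanded torque grows with the distance from the set point, so the arm yields when pushed instead of fighting the disturbance stiffly.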

A Creature-based Approach to Robotic Existence

Mertz, an active-vision humanoid head platform, fulfills an immediate goal of running continuously for days without supervision at a variety of locations. Mertz is designed with fault prevention strategies in mind. It can start up instantly and perform joint calibration. It has circuitry to protect against power cycles and abrupt shutdown. Its vision system is adaptable to different lighting conditions and backgrounds.

Domo: A Force Sensing and Compliant Humanoid Platform

Completed the design, fabrication and assembly of a new force sensing and compliant humanoid platform, named Domo, for exploring general dextrous manipulation, visual perception and learning. Domo incorporates force sensors and compliance in most of its joints to act safely in an unstructured environment. It consists of two 6 DOF arms, two 4 DOF hands, a 7 DOF head, a 2 DOF neck, 58 proprioceptive sensors and 24 tactile sensors. Twenty-four DOF use force controlled compliant actuators. Its realtime sensorimotor system is managed by an embedded network of DSP controllers. Its vision system (2 cameras, 3 DOF), which utilizes the YARP software library, and its cognitive system run on a small, networked cluster of PCs.

Overcoming Mechanical Modes of Failure

Domo achieves mechanical robustness in four ways: geartrain failures are mitigated by using ball screws and elastic spring elements; motor winding overheating is avoided by current limits in its brushless DC motor amplifiers and by prevention of stall currents; susceptibility to cable breakage and wire strain has been reduced; and maintenance is made easier by the design of modular subsystems.

Two Force Controlled Arms

Domo's arm design focuses on force control. An arm is passively or actively compliant and able to directly sense and command torques at each joint. This design forgoes the conventional emphasis on end effector stiffness and precision to, instead, mimic human capabilities. It relies on advanced linear Series Elastic Actuators.
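
The principle behind a series elastic actuator is that force is measured from the deflection of a spring placed in series with the motor, so force control reduces to controlling that deflection. The sketch below illustrates this with Hooke's law and a simple proportional force loop against a fixed load. All names and numeric values (`spring_k`, `gain`, `dt`) are assumptions for illustration, not Domo's actual parameters or controller.

```python
def sea_force(spring_k, motor_pos, load_pos):
    """Force transmitted through the series spring (Hooke's law), in newtons."""
    return spring_k * (motor_pos - load_pos)


def run_force_loop(f_des, spring_k=1000.0, gain=0.01, dt=0.001, steps=1000):
    """Drive the motor side of the spring until the sensed force nears f_des.

    The load is held fixed at position 0, and the motor velocity command is
    proportional to the force error. Returns the final sensed force.
    """
    motor_pos = 0.0
    load_pos = 0.0
    for _ in range(steps):
        force = sea_force(spring_k, motor_pos, load_pos)
        motor_vel = gain * (f_des - force)  # proportional force control
        motor_pos += motor_vel * dt         # simple Euler integration
    return sea_force(spring_k, motor_pos, load_pos)
```

Because the spring decouples the motor from the load, a softer spring gives better force resolution and shock tolerance at the cost of positional stiffness, which matches the design emphasis described above.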

A Robust Multi-Layered Sensorimotor and Cognitive System

Domo has been designed with four layers of sensorimotor and cognitive systems: a physical layer for sensors, motors and interface electronics; a DSP layer for real time control; a sensorimotor abstraction layer for interfacing between the DSP and cognitive layers; and a cognitive layer. The design emphasizes robustness to common modes of failure, real-time control of time critical resources and expandable computational capability. The system runs on a combination of special purpose embedded hardware that communicates through a CAN bus (or Firewire, in the case of cameras) with a cluster of Linux nodes.

Advanced Design of Elastic Force Sensing Actuators with Embedded Amplifiers

A new version of the Series Elastic Actuator (SEA) was designed using (a) linear ball screws for greater efficiency and shock tolerance and (b) a cable drive transmission. The cable drive allows actuator mass to be moved far from the end point, which reduces energy consumption and permits lower wattage motors. It also allows modular and standardized packaging, which eases maintenance and reuse. A novel force sensing compliant (FSC) actuator places the spring element between the motor housing and the chassis ground, which allows continuous rotation at the motor output. The FSC actuator is compact due to its use of torsion springs. Embedded custom brushless motor amplifiers and sensory signal amplifiers are incorporated. They reduce wiring run-length, which simplifies cable routing and improves robustness.

A 5 DOF Sensor Rich Hand with Series Elastic Actuation

Design, fabrication and assembly of a 5 DOF sensor rich hand with simple, scalable force actuators. The hand has three fingers with 8 force sensing axes and 5 position sensors. Each finger consists of 2 coupled and decoupling links driven by a compact, inexpensive rotary series elastic actuator, which makes the hand mechanically compliant and force controllable. The last two links of each finger are equipped with dense arrays of force sensing resistors.

Duo: A Human/Wearable Hybrid for Learning About Common Manipulable Objects

Duo consists of a glasses mounted digital camera connected to a backpack holding a laptop which communicates wirelessly to a computer cluster. It also has four orientation sensors that are head, wrist, upper arm and torso mounted. Duo passively and actively observes the manipulation of objects in natural, unconstrained environments. It measures the kinematic configuration of its wearer's head, torso and dominant arm while watching its wearer's workspace through a head mounted camera. It requests helpful actions from its wearer through speech via headphones. It can segment common manipulable objects with high quality.

Using Cast Shadows for Visually-Guided Touching

The shadow cast by a robot's own body is used to help direct its arm towards, across, and away from an unmodeled surface without damaging it. The shadow is detected by a camera and used to derive a time-to-contact estimate which, when combined with the 2D tracked location of the arm's endpoint in the camera image, is sufficient to allow 3D control relative to the surface.
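
One way to picture a shadow-based time-to-contact estimate: as the arm approaches a surface, the image gap between the arm's endpoint and its shadow shrinks and reaches zero at contact, so the gap divided by its closing rate approximates the time remaining. The sketch below assumes this gap-based formulation and a constant closing speed; the source does not state the exact estimator used, and the function name and pixel units are hypothetical.

```python
def time_to_contact(gap_prev_px, gap_now_px, dt):
    """Estimate seconds until the endpoint meets its shadow.

    gap_prev_px and gap_now_px are image distances (pixels) between the arm
    endpoint and its cast shadow in two frames taken dt seconds apart.
    Returns None when the gap is not shrinking, i.e. no contact is predicted.
    """
    closing_rate = (gap_prev_px - gap_now_px) / dt  # pixels per second
    if closing_rate <= 0:
        return None
    return gap_now_px / closing_rate
```

For example, a gap that shrinks from 40 to 36 pixels in 0.1 s closes at 40 pixels per second, leaving roughly 0.9 s before contact. The estimate needs no model of the surface, which fits the unmodeled-surface setting described above.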

Exploiting Amodal Cues for Robot Perception

Rhythmically moving objects, such as tools and toys, are detected, segmented and recognized by the sounds they generate as they move. This method does not require accurate sound localization but can complement that information. It is selective and robust in the face of distracting motion and sounds. This perceptual tool is required for a robot to learn to use tools and toys through demonstration.
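
A simple way to illustrate binding a sound to a moving object without sound localization is to compare the rhythm of the two signals: if a visual motion trace and an audio energy trace repeat with the same period, they plausibly come from the same object. The sketch below estimates each period by autocorrelation and compares them. It is an assumption-laden illustration, not the project's algorithm; the function names, the autocorrelation method and the tolerance are all hypothetical choices.

```python
import math


def dominant_period(signal, max_lag=None):
    """Return the lag (in samples) of the strongest repeat, or None if absent."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the constant offset
    max_lag = max_lag or n // 2

    def autocorr(lag):
        return sum(x[i] * x[i + lag] for i in range(n - lag)) / n

    # Skip the initial lobe around lag 0, where any signal matches itself.
    lag = 1
    while lag <= max_lag and autocorr(lag) > 0:
        lag += 1
    best_lag, best_val = None, 0.0
    for k in range(lag, max_lag + 1):
        r = autocorr(k)
        if r > best_val:
            best_lag, best_val = k, r
    return best_lag


def same_rhythm(visual, audio, tolerance=1):
    """True when both traces are periodic with periods within tolerance samples."""
    p_v, p_a = dominant_period(visual), dominant_period(audio)
    return p_v is not None and p_a is not None and abs(p_v - p_a) <= tolerance
```

Matching on period rather than on phase or position means a distracting motion or sound with a different rhythm is rejected, which is the kind of selectivity the paragraph above describes.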

Object Segmentation by Demonstration

A human teaches Cog how to segment objects from arbitrarily complex non-static images by waving the object to introduce it. An algorithm detects the skin color of the human's arm, and tracks its motion. Then the object's compact cover is extracted using the periodic trajectory information.

Figure/Ground Segmentation from Human Cues

To infer large scale depth and build 3-dimensional maps, Cog uses its human helper's arm as a reference measure for the relative size of objects in a monocular image. It can also perform figure/ground segregation on typical heavy objects in a scene, such as furniture, and perform 3D object and scene reconstruction. This argues for solving a visual problem not simply by controlling the perceptual system, but by actively changing the environment through experimental manipulation.
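
The idea of using a known-size reference, such as the helper's arm, to recover scale from a single camera can be illustrated with the pinhole camera model. The sketch below assumes a pinhole model, a known focal length in pixels and a known real arm length; these assumptions and the function names are for illustration only and are not taken from the project's implementation.

```python
def depth_from_reference(focal_px, ref_length_m, ref_length_px):
    """Depth (m) of a reference of known real length seen at a given image length."""
    return focal_px * ref_length_m / ref_length_px


def object_size_at_reference_depth(obj_length_px, ref_length_m, ref_length_px):
    """Real size (m) of an object lying at the same depth as the reference."""
    metres_per_pixel = ref_length_m / ref_length_px
    return obj_length_px * metres_per_pixel
```

For instance, with a 500 pixel focal length, a 0.6 m arm that spans 150 pixels lies about 2 m away, and an object next to it spanning 300 pixels is about 1.2 m across.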

A Learning Framework for a Humanoid Robot Inspired by Developmental Learning

For Cog to learn about its physical surroundings, a human helps Cog to correlate its own senses, to control and integrate situational cues from its surrounding world and to learn about out-of-reach objects and the different representations in which they might appear. The strategies for this learning are inspired by child development theory which defines a separation and individuation developmental phase.

On-line Parameter Tuning of Neural Oscillators

Cog employs neural oscillators in its arm that are capable of adapting to the dynamics of the arm's controlled system. After using a time-domain analysis to intuitively tune the parameters of neural oscillators, Cog plays a rhythmic musical instrument such as a drum or tambourine.
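
To make the notion of a neural oscillator concrete, the sketch below simulates a two-neuron Matsuoka oscillator, a common model with mutual inhibition and self-adaptation. The source does not name the oscillator model or give parameter values, so the model choice, the parameters (`tau1`, `tau2`, `beta`, `w`, `c`) and the function name are assumptions for illustration. The on-line tuning procedure itself is not shown.

```python
def matsuoka_output(steps=10000, dt=0.001, tau1=0.1, tau2=0.2,
                    beta=2.5, w=2.5, c=1.0):
    """Simulate a two-neuron Matsuoka oscillator with Euler integration.

    x1, x2 are the neuron states, v1, v2 their adaptation (fatigue) states,
    and c is a constant tonic input. Returns the output y1 - y2 per step.
    """
    x1, x2, v1, v2 = 0.1, 0.0, 0.0, 0.0  # asymmetric start breaks symmetry
    out = []
    for _ in range(steps):
        y1, y2 = max(0.0, x1), max(0.0, x2)  # rectified firing rates
        dx1 = (-x1 - beta * v1 - w * y2 + c) / tau1
        dx2 = (-x2 - beta * v2 - w * y1 + c) / tau1
        dv1 = (-v1 + y1) / tau2
        dv2 = (-v2 + y2) / tau2
        x1 += dx1 * dt
        x2 += dx2 * dt
        v1 += dv1 * dt
        v2 += dv2 * dt
        out.append(y1 - y2)
    return out
```

In this model the time constants mainly set the oscillation frequency and the tonic input sets the amplitude, which is why such parameters are natural targets for tuning. Feeding a joint sensor signal into the neuron inputs, which is omitted here, is what lets an oscillator of this kind entrain to the arm's dynamics.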

Learning Task Sequences from Scratch

Task sequencing requires recognizing an object, identifying it with some associated action, and then learning the sequence of events and objects that characterize the task. For example, a saw must be recognized and moved back and forth on the correct plane to complete the task of sawing. Cog can learn task sequences from human-robot interaction cues. A human teaches the robot new objects such as tools and toys and their functionality. The robot explores the world and extends its knowledge of the objects' properties. It acquires recognition of multi-modal percepts by manipulating the tools and toys.
