Dr. Takaaki Kuratate in conversation with his Mask-bot self.
Now, this is nothing new for its robotic predecessors, which are also equipped with amazing Artificial Intelligence (AI). What sets Mask-bot apart is that it can instantly construct an image of anyone's face from a photo and project it onto a 3D surface. It moves its virtual head a little and raises its eyebrows as you speak, creating the impression that it understands, although it actually doesn't yet.
It also projects the image from behind the mask, which makes it look more realistic than, for example, Disney animatronic characters, whose faces are projected from the front. It works in daylight, too. And it is more flexible than existing humanoid robots, which rely on a complex set of mechanical parts and must be custom-designed.
Avatars for video conferencing
According to Dr. Takaaki Kuratate, Mask-bot could soon be deployed in video conferences. "You can create a realistic replica of a person that actually sits and speaks with you at the conference table. You can use a generic mask for male and female, or you can provide a custom-made mask for each person."
But a more advanced version of Mask-bot doesn't even require a video image of the person speaking. A program can also convert a normal two-dimensional photograph into a correctly proportioned projection for a three-dimensional mask complete with facial expressions and voice. A talking-head animation engine filters an extensive series of face motion data from a variety of people collected by a motion capture system and selects the facial expressions that best match a specific phoneme being spoken. Examples can be found here.
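The selection step described above can be sketched in a few lines of Python. This is only an illustration, not the engine's actual method: the `CapturedFrame` record, the landmark vectors, and the "closest to the average" matching rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import dist
from statistics import fmean
from typing import List, Tuple


@dataclass(frozen=True)
class CapturedFrame:
    """One motion-capture sample (hypothetical record layout)."""
    speaker: str
    phoneme: str
    landmarks: Tuple[float, ...]  # flattened facial landmark values


def select_expression(frames: List[CapturedFrame], phoneme: str) -> CapturedFrame:
    """Pick the captured frame that best represents a phoneme.

    Assumed rule: keep only frames labelled with the phoneme, average
    their landmark vectors, and return the frame closest to that average.
    """
    candidates = [f for f in frames if f.phoneme == phoneme]
    if not candidates:
        raise KeyError(f"no captured frames for phoneme {phoneme!r}")
    # Per-dimension mean across all candidate landmark vectors.
    centroid = tuple(fmean(dim) for dim in zip(*(f.landmarks for f in candidates)))
    return min(candidates, key=lambda f: dist(f.landmarks, centroid))


# Made-up sample data: three speakers saying "a", one saying "m".
FRAMES = [
    CapturedFrame("speaker1", "a", (0.0, 0.0)),
    CapturedFrame("speaker2", "a", (1.0, 1.0)),
    CapturedFrame("speaker3", "a", (5.0, 5.0)),
    CapturedFrame("speaker1", "m", (0.2, 0.1)),
]

best_a = select_expression(FRAMES, "a")
```

Averaging across speakers is just one plausible way to define "best match"; a real engine might also weigh the neighbouring phonemes or the speaking rate.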
The computer extracts a set of facial coordinates from each of these expressions, which it can then assign to any new face, bringing it to life. Emotion synthesis software then delivers the visible emotional nuances that indicate, for instance, when someone is happy, sad, or angry.
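To make the idea of assigning captured coordinates to a new face more concrete, here is a minimal Python sketch. It assumes that an expression can be stored as the offset of each landmark from a neutral pose, and that the offsets are scaled by the ratio of the two face widths. The function names and the scaling rule are hypothetical and not taken from the Mask-bot software.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def face_width(landmarks: List[Point]) -> float:
    """Horizontal extent of a landmark set, used as a rough size measure."""
    xs = [x for x, _ in landmarks]
    return max(xs) - min(xs)


def retarget(source_neutral: List[Point],
             source_expression: List[Point],
             target_neutral: List[Point]) -> List[Point]:
    """Transfer an expression from a captured face to a new face.

    Each landmark's offset from the source's neutral pose is scaled by
    the face-width ratio and added to the target's neutral pose.
    """
    if not (len(source_neutral) == len(source_expression) == len(target_neutral)):
        raise ValueError("all landmark lists must have the same length")
    src_width = face_width(source_neutral)
    if src_width == 0:
        raise ValueError("source face has zero width")
    scale = face_width(target_neutral) / src_width
    return [
        (tx + scale * (ex - nx), ty + scale * (ey - ny))
        for (nx, ny), (ex, ey), (tx, ty)
        in zip(source_neutral, source_expression, target_neutral)
    ]


# Made-up landmarks: left mouth corner, right mouth corner, chin.
captured_neutral = [(0.0, 0.0), (2.0, 0.0), (1.0, -1.0)]
captured_open = [(0.0, 0.0), (2.0, 0.0), (1.0, -1.5)]   # jaw drops
new_face_neutral = [(0.0, 0.0), (4.0, 0.0), (2.0, -2.0)]  # a wider face

new_face_open = retarget(captured_neutral, captured_open, new_face_neutral)
```

An emotion layer could be added in the same way, as a further set of offsets blended on top of the speech-driven ones.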
Synthesized voice
An advanced version of Mask-bot is also said to be able to speak content typed on a keyboard. A text-to-speech system converts text in English, Japanese, and soon German into audio with a female or male voice, which can be quiet or loud, happy or sad. Mask-bot doesn't actually understand anything; it just listens and makes pretend responses as part of a fixed programming sequence.
Meanwhile, the Munich researchers are working on Mask-bot 2, a mobile version. The mask, projector, and computer control system will all be contained inside a robot costing around EUR 400 (Mask-bot 1 costs EUR 3,000).
"Mask-bot will influence the way in which we humans communicate with robots in the future," predicts Prof. Gordon Cheng, head of the ICS team. "These systems could soon be used as companions for older people who spend a lot of time on their own," says Kuratate.