A gesture interface uses a camera to recognize and interpret simple hand gestures to control vehicle functions like entertainment or climate controls. "The hard part is making sure there are distinct configurations of the hands and motions of the hands that the system recognizes," says Ed Schlesinger of Carnegie Mellon University. "Once you do that you can tie them to anything you want. You could allow customers to define what each gesture means just as they now can with the buttons on a video game."
Although hand gestures and driving are typically thought of only in the context of fists and extended middle fingers, researchers at Carnegie Mellon University (CMU) have a slightly different perspective, one that could result in improved use of the various functions in a vehicle without driver distraction (a cause, of course, of the aforementioned fists and fingers). “Turning knobs and pressing buttons and even stepping on the brake or accelerator are all gestures, so we use them to control everything in the car,” says Tsuhan Chen, professor of electrical and computer engineering at CMU. But they’re taking things a step further, because the work on gesture interfaces at CMU doesn’t include physical touch. “The idea is that rather than fumbling with dials you can make an appropriate gesture to make things happen,” says Ed Schlesinger, co-director of the General Motors Collaborative Laboratory at CMU.
Helping Hands. The technology behind gesture interface is that of image recognition. CMU researchers use a small camera to track the movements of a hand and write software algorithms that result in the recognition and interpretation of various gestures. “If you are trying to identify a hand, there are some key locations you must find, like the center of the palm, and that may only be a set of a half a dozen points,” explains Schlesinger. “Once you know how those points are moving spatially, then you have essentially recognized the whole gesture. So it is not like the system has to identify everything about your hand in great detail to figure out the gesture.” Chen adds, “We started out with simple things like using the index finger to point since it is one of the most natural gestures. We can track that very precisely now. Then we gradually expanded the vocabulary of the system to recognize an open palm and a fist. It’s like sign language. You have a fixed alphabet, but by combining different letters you create an unlimited amount of words. We are not to that extreme yet, but that is where we are going.”
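The approach Schlesinger describes can be sketched in a few lines: track a handful of key points and classify the pose from their geometry. This is a minimal illustration, not CMU's actual algorithm; the point names, threshold, and coordinates are all invented for the example.

```python
import math

def classify_pose(palm, fingertips, extended_threshold=60.0):
    """Classify a hand pose from roughly half a dozen tracked points.

    A fingertip counts as "extended" if it lies farther than
    extended_threshold pixels from the palm center. Counting extended
    fingers distinguishes a fist (0), a pointing index finger (1),
    and an open palm (4-5). Threshold and units are illustrative.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    extended = sum(1 for tip in fingertips if dist(palm, tip) > extended_threshold)
    if extended == 0:
        return "fist"
    if extended == 1:
        return "point"
    if extended >= 4:
        return "open_palm"
    return "unknown"

# Example: index fingertip far from the palm, other fingers curled in.
palm = (100.0, 100.0)
tips = [(100.0, 180.0), (110.0, 120.0), (105.0, 115.0), (95.0, 118.0), (90.0, 120.0)]
print(classify_pose(palm, tips))  # point
```

Tracking how those same points move over successive frames, rather than just their static configuration, is what turns a recognized pose into a recognized gesture.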
The hardware for the demonstration system that CMU has assembled was purposely kept simple and cheap, since the goal is to create a system that would meet automakers’ competitive cost demands. The camera used costs less than $5 and the software is run on standard-issue laptops. Fitted on a Pontiac Montana minivan, the system has a camera positioned in the center console area pointed up at the roof so that the space in which the driver makes command gestures is essentially the same as where a gearshift lever might be. The thinking is to keep the operation of the system as familiar and natural as possible so that the driver won’t be distracted from watching the road. (Another benefit is that since the gestures are made at a low level in the center of the vehicle other drivers are not likely to see them and interpret them as digital expletives.)
The demonstration unit currently controls a cell phone mounted in the vehicle and allows drivers to literally wave off incoming calls with a dismissive motion. An extended index finger combined with a clockwise rotation can raise the stereo volume, while a counterclockwise motion lowers it. The chance of making a random gesture that the system would interpret as a command is currently kept low by limiting the field of view of the camera to a relatively narrow space. “However,” says Schlesinger, “if we have an extremely robust system that can pick up gestures anywhere in the car and you tend to talk a lot with your hands, then that could be an issue.” One way around that is to designate a specific gesture as a cue that tells the system that the motions to follow are to be interpreted as commands.
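The cue-gesture idea Schlesinger mentions amounts to a small state machine: motions are ignored until a designated attention gesture arrives, and only the gesture that follows is interpreted as a command. The sketch below assumes gesture labels and actions that are purely illustrative; the source does not specify which gesture serves as the cue.

```python
# Hypothetical cue gesture and command vocabulary for illustration.
CUE = "open_palm"

COMMANDS = {
    "wave": "dismiss_call",
    "point_clockwise": "volume_up",
    "point_counterclockwise": "volume_down",
}

class GestureController:
    def __init__(self):
        self.armed = False  # True once the cue gesture has been seen

    def handle(self, gesture):
        """Return the action for a gesture, or None if it is ignored."""
        if not self.armed:
            if gesture == CUE:
                self.armed = True  # the next gesture is a command
            return None
        self.armed = False  # one command per cue, then wait for the next cue
        return COMMANDS.get(gesture)

ctrl = GestureController()
print(ctrl.handle("wave"))             # None (no cue yet, ignored)
print(ctrl.handle("open_palm"))        # None (cue seen, now armed)
print(ctrl.handle("point_clockwise"))  # volume_up
```

Requiring a cue makes the narrow camera field of view optional: even a system that saw the whole cabin would ignore conversational hand-talk until deliberately addressed.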
Gesture vs. Voice et al. Isn’t voice recognition in cars solving this interface problem? Chen answers that (1) cars are noisy, so that technology needs much more work and (2) many people simply don’t like the idea: “I would feel strange if I had to talk to my car,” he says. What about force feedback-based systems like BMW’s controversial iDrive that aim to keep the driver’s eyes on the road by allowing him to feel his way through control menus? Chen thinks they are far more limited than gesture interface, and that tactile feedback notwithstanding, drivers still tend to want to look at a screen while making selections.
However, no one at CMU thinks that gesture interface will be the sole control method in future vehicles. Chen says, “I fully believe in the end the best solution will be a combination of both voice control and hand gestures.” And Schlesinger explains, “It is not clear what will be the winning way that we will interact with the automobile. But having various types of interfaces allows each one to work better because it is in its own context. If the context is limited, for example if the car knows that the only thing you use voice for is navigation, then it tends to be more accurate, because the space it has to search in is smaller.”
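Schlesinger's point about context is a general one in recognition systems: a noisy input that is ambiguous against every command in the car can be unambiguous against a small, context-specific vocabulary. A toy sketch, with invented vocabularies and inputs:

```python
from difflib import get_close_matches

# Per-context command vocabularies (illustrative, not from the article).
VOCABULARY = {
    "navigation": ["home", "work", "airport", "cancel route"],
    "audio": ["volume up", "volume down", "next track", "mute"],
}

def recognize(heard, context):
    """Match a noisy input only against the active context's vocabulary.

    Returns the closest command, or None if nothing is similar enough.
    """
    matches = get_close_matches(heard, VOCABULARY[context], n=1, cutoff=0.5)
    return matches[0] if matches else None

# A garbled input like "hom" resolves cleanly once the search space
# is limited to navigation commands.
print(recognize("hom", "navigation"))  # home
```

The smaller the active vocabulary, the fewer candidates a degraded input can be confused with, which is exactly why context-bound interfaces "tend to be more accurate."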
Hurdles. When will gesture interface be production ready? “The fundamental technology is ready now,” says Chen. Reliability issues, such as discerning individual digits and replacing optical cameras with infrared ones to avoid miscues caused by changing light levels, are now being addressed. “The big hurdle is justifying the cost of the camera in the car,” says Schlesinger. He says that gesture interface alone may not be seen as sufficiently beneficial to the average customer to convince automakers to include it on vehicles, but that it could piggyback on another use. For example, since facial recognition requires a camera and similar software, automakers could sell a security feature that would authorize vehicle operation based on facial features. Once the camera is in the car for that purpose, gesture interface becomes an inexpensive addition.
Customer acceptance could be another problem. Just as there are people who don’t want to talk to their cars there will be some who don’t want to wave at them either. But Schlesinger dismisses that concern. “People like it,” he claims.
|GM's PITTSBURGH BRAIN TRUST|
About three years ago General Motors decided it could use a little help in trying to define the future of automotive electronics, so it turned to Carnegie Mellon University (CMU). Together, the institutions formed the General Motors Collaborative Laboratory at Carnegie Mellon. "The purpose of the lab is to bring all aspects of information technology into the automobile, from software reliability issues and x-by-wire systems to human-vehicle interaction and wireless multimedia. GM came to us saying that they realized that they had to bring IT into the automobile in a big way," says its co-director Ed Schlesinger. He continues: "They asked us to come up with ideas and give them a menu of possibilities from which they can pick and choose the things they think will really take off."
GM has been happy enough with its relationship that it recently extended the lab's contract for another five years. Schlesinger says that the lab has shipped both hardware and software to its counterparts at GM R&D for possible use in concept vehicles. "Our goal is to have CMU fingerprints all over GM vehicles," he says.