Computers have looked much the same for nearly twenty years. We are so used to the screen, keyboard and mouse that we forget that this appearance is merely a fashion and an accident of history. We should be able to communicate with a computer by speech and by gestures. Instead of using special tools like a keyboard, the computer can be programmed to determine our desires by watching and listening.
The Watching Window is an experiment in this idea of natural communication. The user is "watched" by two tiny TV cameras, and the computer must deduce the user's requirements from their gestures.
In this simple demonstration, the computer is simulating a window into a 3D world. The TV cameras track your hands and eyes and the display is changed accordingly. If you move your head, you see the simulated world from a different angle. You can interact with the display by pointing at simulated objects. The 3D effect is made even more convincing with the use of stereo glasses.
We are developing software to find and track human hands and eyes in a general way. In this demonstration, we separate the human figure (the foreground) from whatever was in the camera's view before (the background). The separated image is a silhouette of the human figure, and our software attempts to identify the likely positions of the head and hands from this silhouette.
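A minimal sketch of this kind of silhouette processing, assuming the simplest possible approach: foreground pixels are those that differ markedly from a stored background frame, the head is guessed as the topmost foreground pixel, and the hands as the leftmost and rightmost. (The function names, the threshold value, and the toy 5×5 frame are all illustrative inventions, not the exhibit's actual software.)

```python
import numpy as np

def extract_silhouette(frame, background, threshold=30):
    """Mark as foreground every pixel that differs from the
    stored background by more than the threshold."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return diff > threshold

def locate_head_and_hands(silhouette):
    """Crude guesses from silhouette extremes: head at the topmost
    foreground pixel, hands at the leftmost and rightmost."""
    ys, xs = np.nonzero(silhouette)
    head = (int(ys.min()), int(xs[ys.argmin()]))
    left_hand = (int(ys[xs.argmin()]), int(xs.min()))
    right_hand = (int(ys[xs.argmax()]), int(xs.max()))
    return head, left_hand, right_hand

# Toy example: a bright stick figure against a dark background.
background = np.zeros((5, 5), dtype=np.uint8)
frame = background.copy()
frame[1, 2] = 200    # head
frame[2, 1:4] = 200  # shoulders and arms
frame[3, 0] = 200    # left hand
frame[3, 4] = 200    # right hand

sil = extract_silhouette(frame, background)
head, left, right = locate_head_and_hands(sil)
print(head, left, right)  # → (1, 2) (3, 0) (3, 4)
```

Real footage would of course need smoothing, noise rejection, and tracking over time, but the principle is the same: the silhouette alone carries enough shape information to guess where the head and hands are.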
(Above) Examples of extracted silhouettes with head and hands identified.
As you move around a real object, you see it from different sides. You can look at objects in the Watching Window the same way: lower your head and you get a view from underneath. To achieve this, the computer must work out what the object would look like from your point of view, and that is what it displays on the screen.
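The geometry behind this can be sketched very simply, under the assumption that the screen is a fixed plane (here z = 0) and the simulated object sits behind it: each 3-D point is drawn where the straight line from your eye to that point crosses the screen plane. The function below is an illustrative reduction, not the demonstration's actual rendering code.

```python
def project_through_window(point, eye):
    """Project a 3-D point onto the window plane z = 0, along the
    line from the viewer's eye (z > 0) to the point (z < 0)."""
    px, py, pz = point
    ex, ey, ez = eye
    t = ez / (ez - pz)  # where the eye-to-point ray crosses z = 0
    return (ex + t * (px - ex), ey + t * (py - ey))

# A point two units behind the window, seen head-on...
print(project_through_window((0, 0, -2), (0, 0, 2)))  # → (0.0, 0.0)
# ...appears to shift when the viewer's head moves to the side.
print(project_through_window((0, 0, -2), (1, 0, 2)))  # → (0.5, 0.0)
```

Moving the eye changes every projected position, which is exactly why the display must be redrawn as your head moves: the screen shows what a real window onto that scene would show from where you stand.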
(Above) From different eye points, the box looks different. We draw on the screen what you should see from that direction.