The Virtual Reality Gorilla-Rhino Test
A couple of months ago, while I was in Atlanta for the Electronic Entertainment Expo, I had occasion to put my hand in a rhinoceros' mouth.
At first hearing, this sounds like a highly dangerous thing to do. A rhinoceros is a 2500-pound animal with molars that could crush a human hand to hamburger without the rhino even noticing.
However, rhinos are herbivores, and their teeth are set far back in their jaws. They have a pointed, prehensile upper lip, which they use to grip plants and pull them up for eating. The lower lip is large and flat, and they have a long, triangular tongue, but no front teeth at all. Their lips are smooth and sensitive, but with a great deal of strength behind them - imagine a human lip backed by the musculature of a strong man's biceps. But for all that, they are remarkably delicate and precise.
It's possible, with rhinos that have been appropriately trained, to feed them slices of apple by hand. You put the apple on your open hand, and put your hand partway into its mouth. The rhino uses its upper lip to scoop the apple slice off your hand, and then crunches it up. This process, not coincidentally, involves a certain amount of rhino saliva, but I've known Newfoundland dogs that were a lot worse.
The occasion of this extraordinary experience was a visit to Zoo Atlanta in the company of one of its resident animal psychologists, right at rhino feeding time. A backstage tour of a zoo is an experience not to be missed, if you ever get the opportunity. I took a day off from E3 to go put my hand in a rhino's mouth, and in retrospect it was much more interesting than anything I saw on the show floor.
(Incidentally, if any of you are animal-rights advocates, have no fear: an apple is an appropriate thing to feed a rhinoceros, even though it's not native to Africa; and this was not a tourist gimmick. Training rhinos to eat out of a human's hand may sound unnatural and exploitive, but a trained rhino can also be hand-fed medicine, if necessary, which would otherwise have to be administered with a dart gun. This method is much easier on all concerned, including the rhino.)
Another of the things I got to see at Zoo Atlanta was the "virtual gorilla," a virtual reality demonstration. Using a VR headset, you can wander around the zoo's gorilla enclosure. Inside are simulated gorillas that react to you as if you were really there. At first they ignore you, but as you get closer they become wary and suspicious. They can tell if you're looking directly at them, which they interpret as hostile if you do it long enough. If you frighten them badly, the system gives you a "time-out" - you're automatically transported out of the enclosure. The gorillas' emotional state is shown via a colored dot over their heads, but more experienced users can turn this off and get cues directly from their body language and vocalizations.
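The gorillas' behavior, as described, amounts to a small state machine driven by the visitor's distance and gaze. Here is a minimal sketch of how such logic might look; all names and thresholds are my own illustrative guesses, not anything from the actual Zoo Atlanta system:

```python
# Sketch of a reactive virtual gorilla: calm when ignored, wary when
# approached, frightened by closeness or a sustained stare.
# Thresholds are invented for illustration.

from dataclasses import dataclass

CALM, WARY, FRIGHTENED = range(3)

@dataclass
class Gorilla:
    state: int = CALM
    stare_time: float = 0.0  # seconds the visitor has stared at this gorilla

    def update(self, distance_m: float, being_stared_at: bool, dt: float) -> bool:
        """Advance one tick; return True if the visitor earns a 'time-out'."""
        self.stare_time = self.stare_time + dt if being_stared_at else 0.0
        if distance_m < 2.0 or self.stare_time > 5.0:
            self.state = FRIGHTENED
        elif distance_m < 6.0:
            self.state = WARY
        else:
            self.state = CALM
        return self.state == FRIGHTENED

g = Gorilla()
g.update(10.0, False, 0.1)   # far away: ignored
assert g.state == CALM
g.update(4.0, True, 0.1)     # closer, being watched: wary
assert g.state == WARY
```

The colored-dot display mentioned above would simply render `state`; turning it off leaves only the body-language and vocalization cues that `state` drives.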
A couple of years ago it seemed as if virtual reality was about to become a big deal in the game industry. There were four or five different hardware vendors at the Computer Game Developers' Conference, up from one or two. Point-of-view shooters like Quake seemed tailor-made for the new technology. The helmets were lighter and more sensitive, and the displays had higher resolutions than those of the earlier VR units introduced ten years before. And it was no longer necessary to stand near a Polhemus tracker, a magnetic device that detects which way the helmet is facing.
Two years later, however, there seems to be a good deal less fuss. The hot new devices are 3D accelerator cards and possibly force-feedback joysticks, and the hot new game feature is multiplayer Internet play. VR has slipped into the background a bit.
[Regarding this hardware - predictions have a way of looking very foolish after five years or even five months, but I'll risk it - I think force-feedback joysticks are a niche item rather like rudder pedals and Thunder Seats, those chairs with big speakers in them. Serious simulator enthusiasts will buy them, but they won't come standard with every system. 3D accelerators, on the other hand, are going to change the game-playing world in the same way that on-board math coprocessors did. The users may not know exactly why their games are so much faster and prettier, but they'll appreciate it all the same. Within a year you won't be able to buy a video card without 3D acceleration.]
Much as I enjoyed playing with the gorilla simulator, it pointed up several of the major weaknesses of current VR technology, weaknesses that to my mind are responsible for its failure to catch on.
One of the worst of these is "VR sickness," essentially identical to motion sickness. VR sickness is caused by a number of factors. One is called "visual stress," conditions which cause the human eye to work overtime. Some of these conditions are poor focus, poor resolution, and poor frame rate. With an image in poor focus, the lens of the eye constantly tries to bring the image into focus, even when it cannot be done. Poor resolution causes the viewer to strain to see details that simply aren't present. Poor frame rate can cause the image to jump or flicker.
Another source of VR sickness is the disconnection between what the viewer is seeing and what he is feeling with the rest of his senses. If his eyes perceive motion but the semicircular canals in his ears - which provide the sense of balance - do not, the result can be nausea. Similarly, if the computer driving the display is not fast enough, the image displayed as the viewer turns his head may lag behind where it should be. And the position sensors in the helmets may be subject to inaccuracies, causing the display to present the wrong image for the amount that the viewer has turned his head.
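The head-lag problem above yields to some back-of-the-envelope arithmetic: the angular error is just head speed times total system latency. The numbers here are my own illustrative figures, not measurements of any particular headset:

```python
# Rough arithmetic for head-tracking lag: how far the displayed image
# trails the direction you are actually facing.

def lag_degrees(head_speed_deg_per_s: float, latency_ms: float) -> float:
    """Angular error between where you're looking and what's drawn."""
    return head_speed_deg_per_s * latency_ms / 1000.0

# A casual head turn of 150 deg/s with 100 ms of total latency:
print(lag_degrees(150.0, 100.0))   # 15.0 degrees of error
```

Fifteen degrees is a very noticeable error, which is why even modest improvements in tracker accuracy and rendering latency matter so much to the nausea problem.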
Another limitation of the VR gear that I've seen so far is that it doesn't provide much in the way of peripheral vision. The helmets I've looked through have reproduced the experience of looking at a monitor, and that, of course, is one of the great limitations of any "immersive" computer game - the need to look at a monitor, with a very narrow field of view. But peripheral vision is invaluable in immersive environments. It's not enough to be able to turn your head and see something; you need to be able to see things out of the corner of your eye, while you're still looking forward.
The great advantage that VR is supposed to provide, apart from adjusting the image as you turn your head, is stereoscopic, three-dimensional vision. 3D is not used in the movies as much as it could be - it's more of a gimmick than anything else - but movies are non-interactive. To justify the additional cost, the directors of 3D movies tend to heighten the experience in a lot of silly ways, like throwing things at the camera. In an interactive environment, though, real stereoscopic vision could make things vastly more exciting. Once again, you're up against a frame-rate problem - in order to provide stereo vision you have to render twice as many frames in a given amount of time. Most game designers would rather spend those CPU cycles on higher resolutions, more detailed images, or artificial intelligence.
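The frame-rate cost of stereo rendering can be made concrete with a little budget arithmetic, using illustrative numbers of my own:

```python
# Frame-budget arithmetic for stereoscopic rendering: drawing two views
# per frame halves the time available for each view.

def per_view_budget_ms(target_fps: float, views: int) -> float:
    """Milliseconds of rendering time available per view."""
    return 1000.0 / target_fps / views

print(per_view_budget_ms(30.0, 1))  # mono: ~33.3 ms per frame
print(per_view_budget_ms(30.0, 2))  # stereo: ~16.7 ms per eye
```

Those lost milliseconds per eye are exactly the CPU cycles that designers would rather spend on resolution, detail, or artificial intelligence.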
Now most of the problems I've mentioned so far have been hardware-related, and I expect them to be solved reasonably soon. Issues like resolution, frame rate, and positional accuracy can all be addressed with enough research and computing horsepower. And I think it's quite possible that we'll see yet another kind of 3D accelerator card in a year or two: the stereoscopic 3D accelerator card, which renders two images at the same time, one for each eye, and has twin video outputs.
But there's another class of problems that are inherent limitations of the medium itself, and those, I think, will ultimately limit the value of VR. One of the things that I noticed about the gorilla simulator was that you could not express yourself to the gorillas in any way other than moving around and looking at them or away. Like humans, gorillas are sensitive to facial expressions and body language, but the system did not allow me to transmit these to them. Virtual reality requires more than an output device sending me information; it also requires an input device by which I can communicate with my simulated environment. Part of any reality is your own presence in it, and your appearance is important. We could add motion capture suits to capture the rudiments of body language (although fine details like the slumped shoulders of dejection or the out-thrust chest of belligerence would require a great many sensors, not to mention a very tight suit). We could also add a microphone for speech, which goes a long way with humans, but such a system would still lack the capacity to capture facial expressions. For ordinary conversation, or for multiplayer games where appearance matters, a camera pointed at the participants is probably a better way to go. Since VR goggles obscure the eyes, they worsen rather than improve the quality of communication among people using them. The blossoming of CU-SeeMe reflectors on the Internet shows that people do like to see the others they're talking to.
But VR has a tougher problem still, which I'll introduce with an anecdote. There's a wonderful science museum in Toronto called the Centennial Centre of Science and Technology. When I was there - this was about 1974 - one of the exhibits was a curious object called a "sensory homunculus." It was a statue of a nude man, but distorted out of shape to demonstrate skin sensitivity, the larger features being the ones with the most sensory nerves. Somewhat to my surprise, it did not have an enormous penis. (OK, OK, I was 14 at the time, and in any case it's conceivable that a certain delicacy had prevailed; this was nearly 25 years ago, after all. There was no female homunculus.) The statue had a small body, short legs and arms, a very large head, huge lips, and enormous hands.
We derive a great deal of information from tactile feedback, and it's a difficult and expensive thing to simulate with equipment, especially since there are four kinds of sense nerves: heat, cold, pressure, and pain. Force-feedback joysticks, steering wheels, and waldoes do a nice job of telling us when we're encountering resistance, but they can't present the fine resolution of real fingers. We can tell the difference, blindfolded, between real leather and Naugahyde, or gasoline and kerosene, just by our sense of touch. Think of the infinite variety of textures you experience every day, and then think about what kinds of equipment would be necessary to simulate them all. This, I think, is VR's biggest challenge - to be able to present the sensations of wet and dry, cold and warm, rough and smooth, hard and soft, and so on and so on.
Therefore, I propose that we establish a standard test for VR systems, which I'll call the Virtual Reality Gorilla/Rhinoceros Test. The rhinoceros half of the test determines the output quality of the VR equipment; the gorilla half, the input quality. It's very simple. When a VR device can give you the sensation of hand-feeding a rhinoceros (cool, damp, crisp apple and warm, wet, firmly rubbery rhino mouth, not to mention the sight, sound, and smell of the rhino and the apple); and when a VR device will allow you to present yourself to the virtual world as a gorilla, complete with body language, facial expressions, and vocalizations, then it will truly be entitled to call itself Virtual Reality. But for now, at least in the gaming world, VR still remains chiefly a gimmick, in the same class as force-feedback joysticks and Thunder Seats.
P.S. I'm willing to serve as a test-bed for anybody who's ready to try it on their equipment. After all, I do have experience in these things.