Global Algorithm 1.3: The Aesthetics of Virtual Worlds: Report From Los Angeles

Lev Manovich

Welcome to a Virtual World! Strap on your avatar! Don't have the programming skills or time to build your own? No problem. We provide a complete library of pre-assembled characters; one of them is bound to fit you perfectly. Join the community of like-minded users who agree that three-dimensional space is more sexy! Yes, there is nothing more liberating than flying through a 3D scene, executing risky maneuvers and going for the kill. Mountains and valleys can represent files on a network, financial investments, enemy troops, the body of a virtual sex partner - it does not really matter. Zoom! Roll! Pitch! Not enough visual realism? For just an extra $9.95 a month you can upgrade your rendering speed to a blistering 490,000 polygons a second, increasing the quality of the experience a staggering 27.4%! And for another $4.95 you will get a chance to try a new virtual world every month, including a mall, a brothel, the Sistine Chapel, Paris during the Revolution of 1789, and even the fully navigable human brain. A 3D networked virtual world is waiting for you; all we need is your credit card number.

This advertisement is likely to appear on your computer screen quite soon, if it has not already. Ten years after William Gibson's fictional description of cyberspace1 and five years after the first theoretical conferences on the subject,2 cyberspace is finally becoming a reality. More than that, it promises to become a new standard in how we interact with computers - a new way to work, communicate and play.

Virtual Worlds: History and Current Developments

Although a few networked multi-user graphical virtual environments had already been constructed in the 1980s, they were specialized projects involving custom hardware and designed for particular groups of users. In Lucasfilm's Habitat, described by its designers as a "many-player online virtual environment," a few dozen players used their home Commodore 64 computers to connect to a central computer running a simulation of a two-dimensional animated world. The players could interact with the objects in this world as well as with each other's graphical representations (avatars).3 Conceptually similar to Habitat but much more upscale in its graphics was SIMNET (Simulation Network) developed by DARPA (U.S. Defense Advanced Research Projects Agency). SIMNET was probably the first working cyberspace - the first collaborative three-dimensional virtual environment. It consisted of a number of individual simulators linked to a high-speed network. Each simulator contained a copy of the same world database and the virtual representations of all the other simulators. In one of SIMNET's implementations, over two hundred M-1 tank crews, located in Germany, Washington D.C., Fort Knox, and other places around the world, were able to participate in the same virtual battle.4

I remember attending a panel at a SIGGRAPH conference where a programmer who worked for Atari in the early 1980s argued that the military stole the idea of cyberspace from the games industry, modeling SIMNET after already existing civilian multi-participant games. With the end of the Cold War, the influences are running in the opposite direction. Many companies that yesterday supplied very expensive simulators to the military are busy converting them into location-based entertainment systems (LBE). In fact, one of the first such systems, which opened in Chicago in 1990 - BattleTech Center from Virtual World Entertainment, Inc. - was directly modeled on SIMNET.5 Like SIMNET, BattleTech Center comprised a networked collection of futuristic cockpit models with VR gear. For $7 each, a number of players could fight each other in a simulated 3D environment. By 1995, Virtual World was operating dozens of centers around the world that depended, as in SIMNET's case, on proprietary software and hardware.6

In contrast to such custom-built and expensive location-based entertainment systems, the Internet provides a structure for 3D cyberspace that can simultaneously accommodate millions of users, that is inherently modifiable by them, and that runs on practically every computer. A number of researchers and companies are already working to turn this possibility into reality.

Among the attempts to spatialize the Net, the most important is VRML (Virtual Reality Modeling Language), which was conceived in the spring of 1994. According to the document defining Version 1.0 (May 26, 1995), VRML is "a language for describing multi-participant interactive simulations - virtual worlds networked via the global Internet and hyperlinked with the World Wide Web."7 Using VRML, Internet users can construct 3D scenes hyperlinked to other scenes and to regular Web documents. In other words, 3D space becomes yet another medium accessible via the Web, along with text, sounds, and moving images. But eventually a VRML universe may subsume the rest of the Web. So while currently the Web is dominated by pages of text, with other media elements (including VRML 3D scenes) linked to them, future users may experience it as one gigantic 3D world which will contain all other media, including text. This is certainly the vision of VRML designers who aim to "create a unified conceptualization of space spanning the entire Internet, a spatial equivalent of WWW."8 They see VRML as a natural stage in the evolution of the Net from an abstract data network toward a "'perceptualized' Internet where the data has been sensualized,"9 i.e., represented in three dimensions.

VRML 1.0 makes possible the creation of networked 3D worlds but it does not allow for the interaction between their users. Another direction in building cyberspace has been to add graphics to already popular Internet systems for interaction, such as chat lines and MUDs. Worlds Inc., which advertises itself as "a publisher of shared virtual environments"10 has created WorldChat, a 3D chat environment which has been available on the Internet since April 1995. Users first choose their avatars and then enter the virtual world (a space station) where they can interact with other avatars. The company imagines "the creation of 3-D worlds, such as sports bars, where people can come together and talk about or watch sporting events online, or shopping malls."11 Another company, Ubique12, created technology called Virtual Places which also allows the users to see and communicate with other users' avatars and even take tours of the Web together.13

Currently, the most ambitious full-scale 3D virtual world on the Internet is AlphaWorld, sponsored by Worlds Inc. At the time of this writing, it featured 200,000 buildings, trees and other objects, created by 4,000 Internet users. The world includes a bar, a store which provides prefabricated housing, and news kiosks which take you to other Web pages.14

The movement toward spatialization of the Internet is not an accident. It is part of a larger trend in cyberculture - spatialization of all representations and experience. This trend manifests itself in a variety of ways.

The designers of human-computer interfaces are moving from 2D toward 3D - from flat desktops to rooms, cities, and other spatial constructs.15 Web designers also often use pictures of buildings, aerial views of cities, and maps as front ends in their sites. Apple promotes Quicktime VR, a software-only system which allows the user of any personal computer to navigate a spatial environment and interact with 3D objects.

Another example is the emergence of a new field of scientific visualization devoted to spatialization of data sets and their relationships with the help of computer graphics. Like the designers of human-computer interfaces, the scientists assume that spatialization of data makes working with it more efficient, regardless of what this data is.

Finally, in many computer games, from the original "Zork" to the best-selling CD-ROM "Myst," narrative and time itself are equated with movement through space (i.e., going to new rooms or levels). In contrast to modern literature, theater, and cinema, which are built around the psychological tensions between characters, these computer games return us to the ancient forms of narrative where the plot is driven by the spatial movement of the main hero, traveling through distant lands to save the princess, to find the treasure, to defeat the Dragon, and so on.

A similar spatialization of narrative has defined the field of computer animation throughout its history. Numerous computer animations are organized around a single, uninterrupted camera move through a complex and extensive set. A camera flies over mountain terrain, moves through a series of rooms, maneuvers past geometric shapes, zooms out into open space, and so on. In contrast to ancient myths and computer games, this journey has no goal, no purpose. It is an ultimate "road movie" where the navigation through the space is sufficient in itself.

Aesthetics of Virtual Worlds

The computerization of culture leads to the spatialization of all information, narrative, and, even, time. Unless this overall trend is to reverse suddenly, the spatialization of cyberspace is next. In the words of the scientists at Sony's Virtual Society Project, "It is our belief that future online systems will be characterized by a high degree of interaction, support for multi-media and most importantly the ability to support shared 3D spaces. In our vision, users will not simply access textual based chat forums, but will enter into 3D worlds where they will be able to interact with the world and with other users in that world." What will be the visual aesthetics of spatialized cyberspace? What would these 3D worlds look like? In answering this question I will try to abstract the aesthetic features common to different virtual worlds already in existence: computer games; CD-ROM titles; virtual sets in Hollywood films; VR simulations; and, of course, virtual worlds on the Internet such as VRML scenes, WorldChat, and Quicktime VR movies. I will also consider the basic technologies and techniques used to construct virtual spaces: 3D computer graphics; digitized video; compositing; point and click metaphor. What follows are a few tentative propositions on the visual aesthetics of virtual worlds.

1. Realism as Commodity

Digit in Latin means number. Digital media reduces everything to numbers.

This basic property of digital media has a profound effect on the nature of visual realism. In a digital representation, all dimensions that affect the reality effect - detail, tone, color, shape, movement - are quantified. As a consequence, the reality effect produced by the representation can itself be related to a set of numbers.

For a 2D image, the crucial numbers are its spatial and color resolution: the number of pixels and the number of colors per pixel. For instance, a 640 x 480 image of an object contains more detail and therefore produces a stronger reality effect than a 120 x 160 image of the same object. For a 3D model, the level of detail, and consequently the reality effect, is specified by 3D resolution: the number of points the model is composed of.
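The arithmetic behind this comparison can be made explicit. What follows is only a rough sketch: the function names are hypothetical, and the color depths of 8 and 24 bits per pixel are illustrative assumptions that do not appear in the text above.

```python
def pixel_count(width, height):
    """Spatial resolution: total number of pixels in the image."""
    return width * height


def raw_size_bytes(width, height, bits_per_pixel):
    """Uncompressed size of the image, combining spatial and color resolution."""
    return width * height * bits_per_pixel // 8


# The two image sizes named in the text.
large = pixel_count(640, 480)
small = pixel_count(120, 160)

# How many times more pixels the larger image devotes to the same object.
ratio = large / small

# Assumed color depths: 8 bits (256 colors) and 24 bits ("true color").
size_256_colors = raw_size_bytes(640, 480, 8)
size_true_color = raw_size_bytes(640, 480, 24)

print(large, small, ratio)
print(size_256_colors, size_true_color)
```

Under these assumptions the larger image carries sixteen times as many pixels as the smaller one, and tripling the color depth triples the amount of data to store and transmit.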

Spatial, color, and 3D resolutions describe the realism of static representations: scanned photographs; painted backgrounds; renderings of 3D objects; and so on. Once the user begins to interact with a virtual world, navigating through a 3D space or inspecting the objects in it, other dimensions become crucial. One of them is temporal resolution. The more frames a computer can generate in a second, the smoother the resulting motion. Another is the speed of the system's response: if the user clicks on an image of a door to open it or asks a virtual character a question, a delay in response breaks the illusion. Yet another can be called consistency: if moving objects do not cast shadows (because the computer can't render them in real time) while the static background has them, the inconsistency affects the reality effect.

All these dimensions are quantifiable. The number of colors in an image, the temporal resolution the system is capable of, and so on can be specified in exact numbers.

Not surprisingly, the advertisements for graphics software and hardware prominently display these numbers. Even more importantly, those in the business of visual realism - the producers of special effects, military trainers, digital photographers, television designers - now have definite measures for what they are buying and selling. For instance, the Federal Aviation Administration, which creates the standards for simulators to be used in pilot training, specifies the required realism in terms of 3D resolution. In 1991 it required that for daylight, a simulator must be able to produce a minimum of 1,000 surfaces or 4,000 points.16 Similarly, a description of the Compu-Scene IV simulator from GE Aerospace states that a pilot can fly over a geographically accurate 3D terrain that includes 6,000 features per square mile.17

The numbers which characterize digital realism simultaneously reflect something else: the cost involved. More bandwidth, higher resolution and faster processing result in a stronger reality effect - and cost more.

The bottom line: the reality effect of a digital representation can now be measured in dollars. Realism has become a commodity. It can be bought and sold like anything else.

This condition is likely to be explored by the designers of virtual worlds. If today users are charged for the connection time, in the future they can be charged for visual aesthetics and the quality of the overall experience: spatial resolution; number of colors; complexity of characters (both geometric and psychological); and so on. Since all these dimensions are specified in software, it becomes possible to automatically adjust the appearance of a virtual world on the fly, boosting it up if a customer is willing to pay more.

In this way, the logic of pornography will be extended to the culture at large. Peep shows and sex lines charge their customers by the minute, putting a precise cost on each bit of pleasure. In virtual worlds, all dimensions of reality will be quantified and priced separately.

Neal Stephenson's 1992 "Snow Crash" provides us with one possible scenario of such a future. Entering the Metaverse, the spatialized Net of the future, the hero sees "a liberal sprinkling of black-and-white people - persons who are accessing the Metaverse through cheap public terminals, and who are rendered in jerky, grainy black and white."18 He also encounters couples who can't afford custom avatars and have to buy off-the-shelf models, poorly rendered and capable of just a few standard facial expressions - virtual world equivalents of Barbie dolls.19

This scenario is gradually becoming a reality. A number of online stock-photo services already provide their users with low-resolution photographs for a small cost, charging more for higher resolution copies. A company called Viewpoint Datalabs International is selling thousands of ready-to-use 3D geometric models widely used by computer animators and designers. For most popular models you can choose between different versions, with more detailed versions costing more than less detailed ones.20

2. Romanticism, Adorno, Photoshop Filters: From Creation to Selection

Viewpoint Datalabs' models exemplify another characteristic of virtual worlds: they are not created from scratch but assembled from ready-made parts. Put differently, in digital culture creation has been replaced by selection.

E. H. Gombrich's concept of a representational schema and Roland Barthes's "death of the author" helped to sway us from the romantic ideal of the artist creating totally from scratch, pulling images directly from his imagination.21 As Barthes puts it, "[t]he Text is a tissue of quotations drawn from the innumerable centers of culture."22 Yet, even though a modern artist may be only reproducing or, at best, combining in new ways pre-existing texts and idioms, the actual material process of art making supports the romantic ideal. An artist operates like God creating the universe - he starts with an empty canvas or a blank page. Gradually filling in the details, he brings a new world into existence.

Such a process of art making, manual and painstakingly slow, was appropriate for the age of pre-industrial artisan culture. In the twentieth century, as the rest of the culture moved to mass production and automation, literally becoming a "culture industry," art continued to insist on its artisan model. Only in the 1910s when some artists began to assemble collages and montages from already existing cultural "parts," was art introduced to the industrial method of production.

In contrast, electronic art from its very beginning was based on a new principle: modification of an already existing signal. The first electronic instrument designed in 1920 by the legendary Russian scientist and musician Leon Theremin contained a generator producing a sine wave; the performer simply modified its frequency and amplitude.23 In the 1960s video artists began to build video synthesizers based on the same principle. The artist was no longer a romantic genius generating a new world purely out of his imagination; he became a technician turning a knob here, pressing a switch there - an accessory to the machine.

Replace a simple sine wave with a more complex signal (sounds, rhythms, melodies) and add a whole bank of signal generators and you have a modern music synthesizer, the first instrument which embodies the logic of all new media: not creation but selection.

The first music synthesizers appeared in the 1950s. They were followed by video synthesizers in the 1960s, followed by DVE (Digital Video Effects) in the late 1970s (the banks of effects used by video editors), and followed, in turn, by computer software such as 1984's MacDraw that already comes with a repertoire of basic shapes. The process of art making has finally caught up with modern times. It has become synchronized with the rest of modern society where everything is assembled from ready-made parts, from objects to people's identities. The modern subject proceeds through life by selecting from numerous menus and catalogs of items - be it assembling an outfit, decorating the apartment, choosing dishes from a restaurant menu, choosing which interest groups to join. With electronic and digital media, art-making similarly entails choosing from ready-made elements: textures and icons supplied by a paint program; 3D models which come with a 3D modeling program; melodies and rhythms built into a music program.

While previously the great text of culture from which the artist created his own unique "tissue of quotations" was bubbling and shimmering somewhere below consciousness, now it has become externalized (and greatly reduced in the process) - 2D objects, 3D models, textures, transitions, effects which are available as soon as the artist turns on the computer. The World Wide Web takes this process to the next level: it encourages the creation of texts that completely consist of pointers to other texts that are already on the Web. One does not have to add any new content; it is enough to select from what already exists.

This shift from creation to selection is particularly apparent in 3D computer graphics - the main technique for building virtual worlds. The amount of labor involved in constructing three-dimensional reality from scratch in a computer makes it hard to resist the temptation to utilize pre-assembled, standardized objects, characters, and behaviors readily provided by software manufacturers - fractal landscapes, checkerboard floors, complete characters and so on.24 Every program comes with libraries of ready-to-use models, effects or even complete animations. For instance, a user of the Dynamation program (a part of the popular Wavefront 3D software) can access complete pre-assembled animations of moving hair, rain, a comet's tail or smoke, with a single click.

If even professional designers rely on ready-made objects and animations, the end users of virtual worlds, who usually don't have graphics or programming skills, have no other choice. Not surprisingly, Web chat-line operators and virtual world providers encourage users to choose from the libraries of pictures, 3D objects, and avatars they provide. Ubique's site features "Ubique Furniture Gallery" where one can choose images from such categories as "office furniture," "computers and electronics," and "people icons."25 VR-SIG from the UK provides VRML Object Supermarket while Aereal delivers the Virtual World Factory. The latter aims to make the creation of a custom virtual world particularly simple: "Create your personal world, without having to program! All you need to do is fill-in-the-blanks and out pops your world."26 Quite soon we will see a whole market for detailed virtual sets, characters with programmable behaviors, and even complete worlds (a bar with customers, a city square, a famous historical episode, etc.) from which a user can put together his own "unique" virtual world.

While a hundred years ago the user of a Kodak camera was asked just to push a button, he still had the freedom to point the camera at anything. Now, "you push the button, we do the rest" has become "you push the button, we create your world."

3. Brecht as Hardware

Another aesthetic feature of virtual worlds lies in their peculiar temporal dynamic: constant, repetitive shifts between an illusion and its suspension. Virtual worlds keep reminding us of their artificiality, incompleteness, and constructedness. They present us with a perfect illusion only to reveal the underlying machinery.

Web surfing provides a perfect example. A typical user may be spending equal time looking at a page and waiting for the next page to download. During waiting periods, the act of communication itself - bits traveling through the network - becomes the message. The user keeps checking whether the connection is being made, glancing back and forth between the animated icon and the status bar. Using Roman Jakobson's model of communication functions, we can say that communication comes to be dominated by contact, or the phatic function - it is centered around the physical channel and the very act of connection between the addresser and the addressee.27

Jakobson writes about verbal communication between two people who, in order to check whether the channel works, address each other: "Do you hear me?" and "Do you understand me?" But in Web communication there is no human addresser, only a machine. So as the user keeps checking whether the information is coming, he actually addresses the machine itself. Or rather, the machine addresses the user. The machine reveals itself, it reminds the user of its existence - not only because the user is forced to wait but also because he is forced to witness how the message is being constructed over time. A page fills in part by part, top to bottom; text comes before images; images arrive in low resolution and are gradually refined. Finally, everything comes together in a smooth sleek image - the image which will be destroyed with the next click.

Will this temporal dynamic ever be eliminated? Will the spatialized Net become a perfect Utopian city rather than remain a gigantic construction site?

An examination of already existing 3D virtual worlds suggests a negative answer to this question. Consider the technique called "distancing" or "level of detail" which for years has been used in VR simulations and is now being adapted to 3D games and VRML scenes. The idea is to render the models more crudely when the user is moving through virtual space; when the user stops, detail gradually fills in. Another variation of the same technique involves creating a number of models of the same object, each with progressively less detail. When the virtual camera is close to an object, a highly detailed model is used; if the object is far away, a less detailed version is substituted to save unnecessary computation.
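Both variations of the technique can be sketched in a few lines. This is a minimal illustration rather than the method of any particular simulator or VRML browser: the distance thresholds, polygon counts, and function names below are all hypothetical.

```python
import math

# Hypothetical versions of one object, ordered from most to least detailed.
# Each entry: (largest camera distance at which it is used, polygon count).
LOD_TABLE = [
    (10.0, 5000),
    (50.0, 800),
    (math.inf, 60),
]


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_model(camera_pos, object_pos, table=LOD_TABLE):
    """Second variation: pick a cruder model the farther away the object is."""
    d = distance(camera_pos, object_pos)
    for max_distance, polygons in table:
        if d <= max_distance:
            return polygons
    return table[-1][1]


def detail_after_stop(frames_at_rest, crude=60, full=5000, frames_to_refine=10):
    """First variation: crude while moving, detail filling in once the user stops."""
    if frames_at_rest <= 0:
        return crude  # the user is still moving
    t = min(frames_at_rest / frames_to_refine, 1.0)
    return int(crude + t * (full - crude))


camera = (0.0, 0.0, 0.0)
near = select_model(camera, (0.0, 0.0, 5.0))
middle = select_model(camera, (0.0, 0.0, 30.0))
far = select_model(camera, (0.0, 0.0, 500.0))
print(near, middle, far)
print([detail_after_stop(n) for n in (0, 5, 10)])
```

In this sketch the same object is drawn with 5,000, 800, or 60 polygons depending on where the camera happens to be, and a user who stops moving watches the detail climb back toward the full model over ten assumed frames.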

A virtual world which incorporates these techniques has a fluid ontology that is affected by the actions of the user. As the user navigates through space the objects switch back and forth between pale blueprints and fully "fleshed-out" illusions. The immobility of a subject guarantees a complete illusion; the slightest movement destroys it.

Navigating a Quicktime VR movie is characterized by a similar dynamic. In contrast to the nineteenth-century panorama that it closely emulates, Quicktime VR continuously deconstructs its own illusion. The moment you begin to pan through the scene, the image becomes jagged. And, if you try to zoom into the image, all you get are oversized pixels. The representational machine keeps hiding and revealing itself.

Compare this dynamic to traditional cinema or realist theater which aims at all costs to maintain the continuity of the illusion for the duration of the performance. In contrast to such totalizing realism, digital aesthetics have a surprising affinity to twentieth-century leftist avant-garde aesthetics. Bertolt Brecht's strategy to reveal the conditions of an illusion's production, echoed by countless other leftist artists, became embedded in hardware and software themselves. Similarly, Walter Benjamin's concept of "perception in the state of distraction"28 found a perfect realization. The periodic reappearance of the machinery and the continuous presence of the communication channel in the message prevent the subject from falling into the dream world of illusion for very long, making him alternate between concentration and detachment.

While virtual machinery itself already acts as an avant-garde director, the designers of interactive media (games, CD-ROM titles, interactive cinema, and interactive television programs) often consciously attempt to structure the subject's temporal experience as a series of periodic shifts. The subject is forced to oscillate between the roles of viewer and user, shifting between perceiving and acting, between following the story and actively participating in it. During one segment the computer screen presents the viewer with an engaging cinematic narrative. Suddenly the image freezes, menus and icons appear and the viewer is forced to act: make choices; click; push buttons. (Moscow media theorist Anatoly Prokhorov describes this process as the shift from transparency to opacity - from a window into a fictional 3D universe to a solid surface, full of menus, controls, text and icons.29 Three-dimensional space becomes a surface; a photograph becomes a diagram; a character becomes an icon.)

Can Brecht and Hollywood be married? Is it possible to create a new temporal aesthetic based on such cyclical shifts? So far, I can think of only one successful example - a military simulator, the only mature form of interactive media. It perfectly blends perception and action, cinematic realism and computer menus. The screen presents the subject with an illusionistic virtual world while periodically demanding quick actions: shooting at the enemy; changing the direction of a vehicle; and so on. In this art form, the roles of viewer and actant are blended perfectly - but there is a price to pay. The narrative is organized around a single and clearly defined goal: staying alive.

4. Riegl, Panofsky, and Computer Graphics: Regression in Virtual Worlds

The last aesthetic principle of virtual worlds that I will address can be summarized as follows: virtual spaces are not true spaces but collections of separate objects. Or: there is no space in cyberspace.

To explore this thesis further we can borrow the categories developed by art historians early in this century. The founders of modern art history (Alois Riegl, Heinrich Wölfflin, and Erwin Panofsky) defined their field as the history of the representation of space. Working within the paradigms of cyclic cultural development and racial typology, they related the representation of space in art to the spirit of entire epochs, civilizations, and races. In his 1901 "Die Spätrömische Kunstindustrie," Riegl characterized humankind's cultural development as the oscillation between two extreme poles, two ways to understand space, which he called "haptic" and "optic." Haptic perception isolates the object in the field as a discrete entity, while optic perception unifies objects in a spatial continuum. Riegl's contemporary, Heinrich Wölfflin, similarly proposed that the temperament of a period or a nation expresses itself in a particular mode of seeing and representing space. Wölfflin's "Principles of Art History" (1913) plotted the differences between Renaissance and Baroque on five dimensions: linear/painterly; plane/recession; closed form/open form; multiplicity/unity; and clearness/unclearness. Finally, another founder of modern art history, Erwin Panofsky, contrasted the "aggregate" space of the Greeks with the "systematic" space of the Italian Renaissance in a famous essay "Perspective as a Symbolic Form" (1924-1925). Panofsky established a parallel between the history of spatial representation and the evolution of abstract thought. The former moves from the space of individual objects in antiquity to the representation of space as continuous and systematic in modernity; in Panofsky's neologisms, from "aggregate" space to "systematic" space. Correspondingly, the evolution of abstract thought progresses from ancient philosophy's view of the physical universe as discontinuous to the post-Renaissance understanding of space as infinite, ontologically primal in relation to bodies, homogeneous, and isotropic - in short, as "systematic."

We don't have to believe in grand evolutionary schemes but we can retain the categories themselves. What kind of space is a virtual space? At first glance, 3D computer graphics, the main technology of creating virtual spaces, exemplify Panofsky's concept of Renaissance "systematic" space which exists prior to the objects. Indeed, the Cartesian coordinate system is hardwired into computer graphics software and often into the hardware itself.30 When a designer launches a modeling program, he is typically presented with an empty space defined by a perspectival grid, the space that will be gradually filled by the objects he will create. If the built-in message of a music synthesizer is a sine wave, the built-in world of computer graphics is an empty Renaissance space, the coordinate system itself.

Yet computer generated worlds are actually much more "haptic" and "aggregate" than "optic" and "systematic." The most commonly used 3D computer graphics technique to create 3D worlds is polygonal modeling. The virtual world created using this technique is a vacuum filled with separate objects defined by rigid boundaries. A perspective projection creates the illusion that these objects belong together but in fact they have no connection to each other. What is missing is space in the sense of space-environment or space-medium: the environment between objects; an atmosphere which unites everything together; the effects of objects on one another.

Another basic technique used in creating virtual worlds - compositing (superimposing, keying) - also leads to an "aggregate" space. It involves superimposing animated characters, still images, Quicktime movies, and other graphical elements over a separate background. A typical scenario may involve an avatar animated in real time in response to the user's commands. The avatar is superimposed over a picture of a room. An avatar is controlled by the user; a picture of a room is provided by a virtual world operator. Because the elements come from different sources and are put together in real time, the result is a series of 2D planes rather than a real 3D environment.
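The scenario just described can be reduced to a toy sketch. It assumes a simple form of keying in which any sprite pixel marked as transparent lets the background show through; the grid of characters standing in for pixels, the KEY marker, and the function name are all hypothetical.

```python
KEY = None  # hypothetical marker for "transparent" pixels in the sprite


def composite(background, sprite, left, top):
    """Superimpose a 2D sprite over a 2D background and return a new image."""
    result = [row[:] for row in background]  # leave the operator's picture intact
    for y, sprite_row in enumerate(sprite):
        for x, pixel in enumerate(sprite_row):
            if pixel is KEY:
                continue  # keyed-out pixel: the room shows through
            ty, tx = top + y, left + x
            if 0 <= ty < len(result) and 0 <= tx < len(result[ty]):
                result[ty][tx] = pixel
    return result


# A picture of a room, as supplied by a virtual world operator.
room = [["." for _ in range(6)] for _ in range(4)]

# An avatar, positioned by the user's commands.
avatar = [
    [KEY, "A", KEY],
    ["A", "A", "A"],
]

frame = composite(room, avatar, left=2, top=1)
for row in frame:
    print("".join(row))
```

Nothing in the sketch relates the avatar to the room beyond a pair of screen coordinates: one flat plane is laid over another, which is the sense in which the resulting space is "aggregate."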

In summary, although computer-generated virtual worlds are usually rendered in linear perspective, they are really collections of separate objects, unrelated to one another. In view of this, commonly expressed arguments that 3D computer graphics send us back to Renaissance perspectivalism, and, therefore, from the viewpoint of twentieth-century abstraction, should be considered regressive, turn out to be groundless. If we are to apply the evolutionary paradigm of Panofsky to the history of virtual computer space, it has not even achieved its Renaissance yet. It is still on the level of Ancient Greece, which could not conceive of space as a totality.

And if the World Wide Web and VRML 1.0 are any indication, we are not moving any closer toward systematic space; instead, we are embracing "aggregate" space as a new norm, both metaphorically and literally. The "space" of the Web in principle can't be thought of as a coherent totality: it is a collection of numerous files, hyperlinked but without any overall "perspective" to unite them. The same holds for actual 3D spaces on the Internet. A VRML file which describes a 3D scene is a list of separate objects which may exist anywhere on the Internet, each created by a different person or a different program. The objects have no connection to each other. And since any user can add or delete objects, no one may even know the complete structure of the scene.

The Web has already been compared to the American Wild West. The spatialized Web as envisioned by VRML (itself a product of California) even more closely reflects the treatment of space in American culture: the lack of attention to space which is not functionally used. The territories that exist between privately owned houses and businesses are left to decay. The VRML universe simply does not contain space as such - only objects which belong to different individuals.

And what is an object in a virtual world? Something which can be acted upon: clicked; moved; opened - in short, used. It is tempting to interpret this as regression to the world view of an infant. A child does not think of the universe as existing separately from himself - it appears as a collection of unrelated objects with which he can come into contact: touch; suck on; grab. Similarly, the user of a virtual world tries to click on whatever is in front of him; if the objects do not respond, he is disappointed. In the virtual universe, Descartes's maxim can be rewritten as follows: "I can be clicked on, therefore I exist."

5. The Whole Picture

I have discussed different aesthetic features of 3D virtual worlds. But what would a future full-blown virtual world feel like? What would be its overall gestalt?

One example of a highly detailed virtual world, complete with landscapes and human beings, is provided by Disney's 1995 "Toy Story," the first completely computer-animated feature-length film. Frighteningly sterile, this is a world in which the toys and the humans look absolutely alike, the latter appearing as macabre automatons.

If you want to experience cyberspace of the future today, visit the place where "Toy Story" was financed - Los Angeles. The city offers a precise model for the virtual world. There is no center, no hint of any kind of centralized organization, no traces of the hierarchy essential to traditional cities. One drives to particular locations defined strictly by their street addresses rather than by spatial landmarks. A trendy restaurant or club can be found in the middle of nowhere, among the miles of completely unremarkable buildings. The whole city feels like a set of particular points suspended in a vacuum, similar to a bookmark file of Web pages. You are immediately charged on arrival to any worthwhile location, again as on the Web (mandatory valet parking). There you discover the trendy inhabitants (actors, singers, models, producers) who look like some new race, a result of successful mutation: unbelievably beautiful skin and faces; fixed smiles; and bodies whose perfect shapes surely can't be the result of human evolution. They probably come from the Viewpoint catalog of 3D models. These are not people but avatars: beautifully rendered with no polygons spared; shaped to the latest fashion; their faces switching between a limited number of expressions. Given the potential importance of any communicative contact, subtlety is not tolerated: avatars are designed to release stimuli the moment you notice them, before you have time to click to the next scene.

The best place to experience the whole gestalt is in one of the outdoor cafes on Sunset Plaza in West Hollywood. The avatars sip cappuccino amidst the illusion of 3D space. The space is clearly the result of a quick compositing job: billboards and airbrushed cafe interior in the foreground against a detailed matte painting of Los Angeles with the perspective exaggerated by haze. The avatars strike poses, waiting for their agents (yes, just like in cyberspace) to bring valuable information. Older customers look even more computer generated, their faces bearing traces of extensive face-lifts. You can enjoy the scene while feeding the parking meter every twenty minutes. A virtual world is waiting for you; all we need is your credit card number.


1. William Gibson, Neuromancer (New York: Ace Books, 1984).

2. Michael Benedikt, ed., Cyberspace: First Steps (Cambridge, Mass.: The MIT Press, 1991).

3. Chip Morningstar and F. Randall Farmer, "The Lessons of Lucasfilm's Habitat," in Cyberspace: First Steps, ed. Michael Benedikt (Cambridge, Mass.: The MIT Press, 1991), 273-302.

4. Howard Rheingold, Virtual Reality (New York: Simon & Schuster, 1991), 360-361.

5. See Tony Reveaux, "Virtual Reality Gets Real," New Media (January 1993), 39.

6. Virtual World Entertainment, Inc., Press Release, SIGGRAPH '95, Los Angeles, August 6-11, 1995.

7. Gavin Bell, Anthony Parisi and Mark Pesce, "The Virtual Reality Modeling Language. Version 1.0 Specification," May 26, 1995. WWW document.

8. Mark Pesce, Peter Kennard and Anthony Parisi, "Cyberspace." WWW document.

9. Bell, Parisi and Pesce.

10. See <>.

11. Richard Karpinski, "Chat Comes to the Web," Interactive Age (July 3, 1995), 6.

12. See <>.

13. In September of 1995, Ubique was purchased by America Online - a significant development since America Online is already the most graphically oriented among the commercial networks based in the U.S.

14. See <>.

15. For instance, Silicon Graphics developed a 3D file system which was showcased in the movie Jurassic Park. The interface of Sony's MagicLink personal communicator is a picture of a room while Apple's eWorld greets its users with a drawing of a city.

16. Barbara Robertson, "Those Amazing Flying Machines," Computer Graphics World (May 1992), 69.

17. Ibid.

18. Neal Stephenson, Snow Crash (New York: Bantam Books, 1992), 43.

19. Ibid., 37.

20. See <>.

21. E.H. Gombrich, Art and Illusion (Princeton: Princeton University Press, 1960); Roland Barthes, "The Death of the Author," in Image, Music, Text, ed. Stephen Heath (New York: Farrar, Straus and Giroux, 1977).

22. Barthes, 142.

23. Bulat Galeyev, Soviet Faust. Lev Theremin - Pioneer of Electronic Art (in Russian) (Kazan, 1995), 19.

24. For a more detailed analysis of realism in 3D computer graphics, see Lev Manovich, "Assembling Reality: Myths of Computer Graphics," Afterimage 20, no. 2 (September 1992), 12-14.

25. See <>.

26. See <>.

27. See Roman Jakobson, "Closing Statement: Linguistics and Poetics," in Style In Language, ed. Thomas Sebeok (Cambridge, Mass.: The MIT Press, 1960).

28. Walter Benjamin, "The Work of Art in the Age of Mechanical Reproduction," in Illuminations, ed. Hannah Arendt (New York: Schocken Books, 1969).

29. Private communication, September 1995, St. Petersburg.

30. See Lev Manovich, "Mapping Space: Perspective, Radar and Computer Graphics," in SIGGRAPH '93 Visual Proceedings, ed. Simon Penny (New York: ACM, 1993).

Lev Manovich is a theorist and critic of new media. He is currently working on two books: a collection of essays on digital realism and a history of the social and cultural origins of computer graphics technologies, entitled The Engineering of Vision from Constructivism to Virtual Reality (University of Texas Press, forthcoming). He is Assistant Professor in the Visual Arts Department at the University of California, San Diego.