Moses A. Boudourides


Computer Technology Institute (CTI) and

University of Patras, Department of Mathematics

265 00 Patras, Greece





Richness and Presence


            One of the most important themes of communication theory concerns cross-media comparisons, i.e., comparisons among various forms and modes of communication (Lea, 1991). Clearly, such comparisons are related to media choice and media substitution, and they carry various practical implications. According to many theorists, it is the attributes of the media that drive media comparisons and influence media choice. In particular, the two critical factors generally considered to characterize communication media are the extent to which a medium conveys the ‘social presence’ of the participants and the extent to which it possesses ‘rich information’ that reduces communicational uncertainty.

            Social presence theory was developed by Short, Williams and Christie (1976); social presence denotes the degree to which communicants psychologically perceive the presence of the other through expressions of warmth, intimacy, familiarity etc. (see Rice and Love, 1987, for a summary of studies measuring social presence in computer-mediated communication, or CMC). Clearly, social presence depends on the medium. For instance, it is minimal or almost absent in a typed document or an e-mail; telephones carry voice but no visual cues; and video-mediated communication supports transmission of both voice and some physical images. As for the effects of the medium, Short (1974) suggested that lower social presence resulted in greater persuasion and social influence (in contrast to the prevailing assumptions about CMC, as remarked by Spears, Lea and Postmes, 2000).

            Media richness theory was developed by Daft and Lengel (1984, 1986) and is based on the theory of organizational information processing, according to which the reduction of uncertainty and equivocality is the main goal of communication. Daft and Lengel proposed four factors determining media richness: speed of feedback, channel mode (visual, audio or mixed), personal focus and language use. Accordingly, rich media facilitate communication insofar as they support feedback, multiple cues, personal focus and language variety, while, by contrast, lean media rely on rules, forms and procedures. Daft and Lengel proposed the following ranking of media in decreasing order of richness: face-to-face (FTF), telephone, personal written documents (e.g., letters, memos etc.), impersonal unaddressed documents (e.g., reports, bulletins, etc.) and numeric documents. Sitkin, Sutcliffe and Barrios-Choplin (1992) expanded the media ranking to include e-mail and videoconferencing (also adding a fifth factor of media richness: communication target). Further extensions of this theory have emphasized the role of symbolic interactions, shifting the analysis from individuals and tasks towards interactions in order to study the social construction of communication processes (Trevino, Daft and Lengel, 1990).
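
            The ranking above is purely ordinal, and one way to make that explicit is to encode it as an ordered enumeration. The following is only a minimal sketch: the names Medium, rank_by_richness and is_richer are hypothetical, and the integer values are arbitrary ordinal labels that carry no quantitative meaning. E-mail and videoconferencing are left out because the text does not state where Sitkin et al. place them.

```python
from enum import IntEnum


class Medium(IntEnum):
    """Media in the ranking attributed above to Daft and Lengel.

    A larger value means a richer medium. The integers are ordinal
    labels only (an assumption of this sketch, not a measured quantity).
    """
    NUMERIC_DOCUMENT = 1
    IMPERSONAL_DOCUMENT = 2
    PERSONAL_DOCUMENT = 3
    TELEPHONE = 4
    FACE_TO_FACE = 5


def rank_by_richness(media):
    """Return the given media sorted from richest to leanest."""
    return sorted(media, reverse=True)


def is_richer(a, b):
    """Return True if medium `a` ranks above medium `b`."""
    return a > b


if __name__ == "__main__":
    options = [Medium.PERSONAL_DOCUMENT, Medium.FACE_TO_FACE, Medium.TELEPHONE]
    print([m.name for m in rank_by_richness(options)])
```

            Such an encoding can only answer the question of which of two media is ranked as richer; it says nothing about how much richer, which is consistent with the ordinal character of the original ranking.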

            However, a number of criticisms have been raised against these relatively narrow and deterministic theories of social presence and media richness, in opposition to their rationalist concern for organizational efficiency and productivity (Lea, 1991). Thus, at least as far as e-mail is concerned, Ngwenyama and Lee (1997), trying to keep their distance from both positivist and interpretivist approaches, have followed the perspective of Habermas’ critical social theory, on the basis of which they have articulated an alternative definition of media richness and carried out a corresponding empirical study of organizational communication.

            The previous approaches to media comparison might appear somewhat awkward in that they seem to put comparisons before proper definitions. The problem with media definitions is that one has to balance carefully their dual constitution: media are both human experiences and technological artefacts. Grounding media definitions merely on incommensurate technologies would be futile and unworkable, and it would hinder any media comparisons. What remains is to define media in terms of human experience and, according to Jonathan Steuer (1995), presence is the key concept for such a definition. In fact, presence can be thought of as the experience of a physical environment, the sense of being in an environment. So defined, presence does not refer to placing someone physically at some location and among some surroundings, but to the perception of those surroundings as mediated by both automatic and controlled mental processes (Gibson, 1979). In other words, presence is one’s sense of being in an environment, regardless of whether one’s body is physically located there. According to Heeter (1992), there are three distinct types of presence that contribute to the experience of ‘being there’: subjective personal presence, social presence and environmental presence. Lombard and Ditton (1997) claim that people conceptualize presence in the following six ways: (1) as social richness, (2) as realism, (3) as transportation, (4) as immersion, (5) as social actor within medium and (6) as medium being a social actor.

Now, since in media experiences perception is mediated by a communication technology, one who experiences a medium is forced to perceive two separate environments simultaneously: the physical environment of one’s actual presence and the environment presented through the medium. In other words, presence refers to the natural perception of an environment and telepresence refers to the mediated perception of an environment. So, telepresence signifies the extent to which one feels present in a mediated environment rather than in the immediate physical environment (Steuer, 1995). Loomis (1992) associated this phenomenon with that of distal attribution or externalization, which refers to the referencing of our perceptions to an external space beyond the limits of the sensory organs themselves. According to Loomis, telepresence differs from distal attribution in how aware the user is of the linkages between the local and the remote environments: telepresence supports only the interpretation of being somewhere else, while distal attribution occurs when the user is aware of the linkages between environments.

            All mediated experiences are first compared with in-person experiences and, in this sense, face-to-face communication represents a model for all interactive communication (Durlak, 1987). However, since telepresence is necessarily a mediated experience, it will also be affected by properties of the medium involved. Jonathan Steuer (1995) considers telepresence to be a function of both the representational powers of technology and the individual perceiver. He identifies two technological dimensions determining telepresence: vividness and interactivity. Vividness refers to the ability of a technology to produce a sensorially rich mediated environment. Interactivity refers to the degree to which users of a medium can participate in modifying, in real time, the form or content of the mediated environment. Two important determinants of vividness are sensory breadth and depth. Breadth is related to the number of sensory modalities in which information is simultaneously presented. Depth is related to the amount of information (the resolution or the bandwidth) provided within the available modalities. Three important determinants of interactivity are speed, range and mapping. Speed of interaction is equivalent to response time (with real-time interaction being an upper limit). Range of interaction refers to the number of possibilities for action at any given time. Mapping of interaction refers to the ability of the system to map its controls to changes in the mediated environment in a natural and predictable way. Traditional media (e.g., print, radio, telephone) and e-mail are relatively low in vividness (both breadth and depth), while new media (e.g., videoconferencing, virtual reality) are high. However, some low-vividness media can be more interactive than other, relatively high-vividness media (e.g., computer-mediated communication vs. print, radio, television and cinema).
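
            The two-level structure of this account (two dimensions, each with its own determinants) and the qualitative vividness levels just mentioned can be written down as a small lookup structure. This is an illustrative sketch only: the names TELEPRESENCE_DIMENSIONS, VIVIDNESS_LEVEL, dimension_of and is_more_vivid are hypothetical, and the two-valued scale "low"/"high" is a simplifying assumption that merely restates the wording of the text.

```python
# Determinants of telepresence as summarized in the text from Steuer (1995).
TELEPRESENCE_DIMENSIONS = {
    "vividness": ("breadth", "depth"),
    "interactivity": ("speed", "range", "mapping"),
}

# Qualitative vividness levels as stated in the text. The two-valued
# scale is an assumption of this sketch, not a measurement.
VIVIDNESS_LEVEL = {
    "print": "low",
    "radio": "low",
    "telephone": "low",
    "e-mail": "low",
    "videoconferencing": "high",
    "virtual reality": "high",
}

_ORDER = {"low": 0, "high": 1}


def dimension_of(determinant):
    """Return the dimension ("vividness" or "interactivity") of a determinant."""
    for dimension, determinants in TELEPRESENCE_DIMENSIONS.items():
        if determinant in determinants:
            return dimension
    raise KeyError(determinant)


def is_more_vivid(a, b):
    """Return True if medium `a` is classed as more vivid than medium `b`."""
    return _ORDER[VIVIDNESS_LEVEL[a]] > _ORDER[VIVIDNESS_LEVEL[b]]
```

            Interactivity is deliberately not given levels here, since the text only states that some low-vividness media can be more interactive than some high-vividness ones, without ranking them.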

            As the locus of telepresence is the perceiver, it varies across individuals and depends on immediate situational factors, ongoing personal concerns, the number of actors, the social and cultural environment etc. Of course, telepresence should be distinguished from unmediated ‘real’ experiences and also from purely psychic phenomena (such as dreams or hallucinations). But the experiential nature of human interaction should enrich the discussion about communication and engagement. A good example is given by the work of Brenda Laurel (1990; 1991). Laurel described media use in terms of mimesis, seeing the relationship between user and technology as action in a play, and encouraged users to develop a first-person, rather than third-person, relationship with their mediated environments. Similarly, Joan Mazur (2000) has recently urged media theorists and developers of distributed visual and virtual environments to capitalize on insights from film theories and cinematic techniques. In the same direction, Bennington and Gay (2000) have drawn upon phenomenological and surrealist film theory to explore the perceptual, expressive, intentional and interpretive dimensions of interactive media, complementing current semiotic explorations of hypertext and hypermedia.


Video-Mediated Communication (VMC)


            Video-mediated communication (VMC) or videoconferencing is a synchronous (real-time) communication system simultaneously transmitting both video and audio. With the advent of information and communication technologies, this transmission takes place between computers and through computer networks. Because of the computer involvement, VMC is sometimes referred to as ‘desktop videoconferencing’ or ‘computer-mediated visual communication.’ In fact, thanks to the emergence of the new digital technologies, video, sound, text, graphics, animation and other multimedia can be computationally manipulated and transmitted across high bandwidth computer networks. Even at currently viable network bandwidths, improved compression algorithms, more efficient network protocol standards and faster computers are soon expected to provide affordable VMC systems of acceptable quality. For instance, the introduction of multicast media transmission over the TCP/IP protocol (Eriksson, 1994) and over high-speed computer networks such as ATM is expected to bring more video and multimedia facilities to Internet users.

            The first VMC system, AT&T’s PicturePhone, appeared in the mid-1960s (Falk, 1973). However, PicturePhone failed to gain commercial success and, up to now, videoconferencing is far from being widely used in comparison to other, older communication technologies. According to Hubert Knoblauch, until 1999 only about ten thousand video-phones had been sold world-wide and there were no indications of an increase (Knoblauch, 1999). In general, the high costs of computing and network infrastructures, together with uncertainty over the benefits of collaborative multimedia, are regarded as significant barriers to an extensive adoption and use of VMC (Tang and Isaacs, 1993).

            As a matter of fact, the first research in the 1970s on the effects of various media on collaborative activity was not encouraging for the value of video (Williams, 1997). In their studies of problem-solving tasks in various communication modes including typewriting, video only, voice and video, and face-to-face communication, Ochsman and Chapanis (1974) found that video had no significant effects over audio on communication times or behavior. Conrath and co-workers (1977), in their evaluation of four different telemedicine systems (audio only, audio plus black and white still frame images, audio plus full-motion black and white video, audio plus full-color video), found no significant differences in diagnostic accuracy or in the effectiveness of patient management. However, when they surveyed the patients’ attitudes about the four systems, they found a preference for the more sensorially rich modes. In a more recent study, Gale (1990) compared three forms of computer-supported collaborative work: sharing data only (in a whiteboard), sharing data and audio, and sharing data, audio and video. He concluded that there were no significant differences in the quality of the output or the time to complete the tasks. But he did find that collaborators’ perceptions of productivity increased with bandwidth and, so, he suggested that higher bandwidth media would enable groups to perform more social activities.

            From an ethnomethodological perspective and using conversation analysis, Heath and Luff (1991; 1992a; 1992b) have studied the transformations of visual conduct caused by video technology. Compared with the ways in which talk is managed and regulated in face-to-face conversation, they found a relative impotence of gesture and an ineffectiveness of gaze in VMC systems. Similarly, but from a different methodological perspective, Sellen (1992) and O’Conaill, Whittaker and Wilbur (1993) have investigated the impact of video on verbal conduct. Leaving aside the human communicational aspects of mediated interaction, Gaver (1992) concentrated on the “affordances” of video technology. Focusing on the two-dimensionality of video images and the discontinuity of movement in video space, he drew a number of implications about which properties of the medium do or do not afford actions to individuals.

            In short, the majority of the VMC studies (cf. Finn, Sellen and Wilbur, 1997) concentrate on a comparison between face-to-face (FTF) and video-mediated communication, and almost all of them agree that even high-quality VMC cannot replicate FTF communication. VMC has been accused of restricting conversational coordination and interaction and of lacking spontaneity, although it satisfactorily supports the transmission of social cues and affective information (depending on the technological system used). Due to limited access to global visual conduct, including the environment of the communicants, their peripheral perceptions of co-participants and their conversational routines, VMC promotes communicative asymmetries more than FTF or telephone calls (Heath, Luff and Sellen, 1997). Beyond all these, and even if new technological achievements could overcome all these obstacles, the serious problem of the intrusion of video into the private space of individuals would remain unsolved, leading to a number of perplexing ethical and legal issues (Mackay, 1995).

            A plausible interpretation of the previous, rather negative evaluation of the usability of VMC was given by Tang and Isaacs (1993; Isaacs and Tang, 1994). According to these researchers, most of the previous studies used artificial groups working on short contrived tasks (unrelated to their actual work) and measured the product (e.g., decisions, solutions, completion times) of their fabricated interactions. Isaacs and Tang argue that the value of video is more likely to become visible in the actual work activity of real working groups, by studying the process of interactions (perceptions of productivity, task focus, degree of interactivity etc.) among the people in such groups. To support their claims, they carried out a number of experiments on real collaborative activity, in which they found that the video channel was used to help mediate interactions, substituting for shorter, two-person meetings, longer phone calls and e-mail usage (Tang and Isaacs, 1993). Moreover, they found that a video channel enhances the ability to show understanding, forecast responses, give non-verbal information, improve verbal descriptions and express attitudes (Isaacs and Tang, 1994). However, comparing VMC to FTF, they observed that in the former it is difficult to notice peripheral cues, ‘control the floor,’ have side conversations, point to things and manipulate real-world objects (Isaacs and Tang, 1994). Similarly, Fish, Kraut, Root and Rice (1992; 1993) evaluated a particular VMC system (called “Cruiser”) for its adequacy in supporting informal communication. They found that this video system increased the spontaneity and frequency of communication, supported social relationships, was capable of coping with the most complex and equivocal communication problems and was helpful in integrating new members into the working groups. In fact, they claimed that it was used more like a telephone or e-mail than like physically mediated FTF communication.
Such positive evaluations of VMC are also shared by a number of studies in the edited volume of Finn et al. (1997). There are claims that the long-term use of high-quality video in work organizations indeed appears to favor VMC over audio and other less rich modes of communication.

            The fact is that the communicational space of VMC is quite different from that of other modes of communication. Although the video-mediated interpersonal space maintains a sense of telepresence or copresence through the visibility of the gestures and facial expressions of communicants, it is far from reproducing the various everyday non-verbal cues. Because of the discontinuity between the interconnected remote spaces in VMC, the resulting extended communicational space appears distorted and asymmetrical in many respects. To understand these distortions and asymmetries, Harrison, Ishii and Chignell (1994) have chosen to study and to experiment upon a framework of interpersonal space consisting of three dimensions: interpersonal distance, angles of orientation and gaze.

·         Interpersonal distance or proximity is one of the simplest examples of non-verbal communication and it concerns the distance between interacting conversants. Its study, named proxemics after Edward Hall (1966), refers to the ways people perceive and use space as a communicative device. For instance, Hall considers four distances, with concomitant voice levels, that people usually employ: intimate, personal, social and public distances. Of course, interpersonal distance is conditioned by personality, personal relationships, culture and communication and produces various psychological and behavioral effects. One such effect is persuasion: sales people, for example, know the effect of approaching customers more closely than they would on other occasions. According to the studies and experiments of Grayson and Coventry (1998), proxemic information is preserved in VMC and produces effects similar to those of FTF interactions but less pronounced (since video conveys only visual proxemic information, compared to the multimodal information of FTF interactions).

·         Angle of orientation determines the relative positions of conversants and, in dyadic interactions, takes one of four configurations: face-to-face, at right angles, side-by-side and back-to-back. The choice among these positions is influenced by the type of task, the status and relationship of the individuals, culture and other social factors (not to mention spatial or environmental constraints partially occluding the view between conversants).

·         Gaze (as a non-mutual looking) and eye contact (as a simultaneous mutual interpersonal looking) are important attributes of visual communication. Gaze direction regulates the conversation flow, provides feedback about what is being discussed, communicates facial expressions and emotions, directs attention, indicates attentiveness and shapes the enacted interpersonal relationship (Argyle and Dean, 1965; Argyle, Ingham, Alkema and McCallin, 1973). Video gaze cues have been studied by Colston and Schiano (1995) in an experiment where observers rated the difficulty people had in solving problems, based either just upon how long the person looked at each problem, or also upon how long her or his gaze lingered on it before being asked to move on. Their first results showed a linear relationship between gaze duration and rated difficulty, with lingering as an added significant factor.

Morikawa and Maesako (1998) added further dimensions of interpersonal space, other than facial expressions, the most important being:

·         Gestures, which play a significant role in communication processes as cues of body language. According to Heath and Luff (1991; 1992a), gestures become harder to understand when transmitted by video. Morikawa and Maesako (1998) indicate that the ambiguity in the interpretation of gestures in VMC might result from the lack of control of global and peripheral environmental information between the remotely located conversants.

Coming back to the role of real-time video as an interpersonal communication technology, Steve Whittaker (1996) reviewed and assessed three distinct hypotheses about the role of video in communicational processes (further exploring the corresponding design implications in each of them):

·         First, the non-verbal communication hypothesis, according to which the role of video is to supplement audio. Examining claims that the visual channel supports the transmission of three different types of non-verbal cues (cognitive, turn-taking and social or affective), Whittaker concluded that for all of them the importance of video over audio is rather overestimated.

·         Second, the role of video for connection and opportunistic communication in the sense that, instead of enhancing a pre-established audio connection, video can be used to establish remote opportunistic communications, by providing information about other participants’ availability for communication. Whittaker distinguished three particular mechanisms of video for connection, glance, open links and awareness applications, and discussed their utility together with relevant design issues.

·         Third, “video-as-data” is the hypothesis that the video image is used to transmit information about the work objects themselves, rather than information about the participants, creating a dynamic shared workspace and simulating a shared physical environment. However, in the last two hypotheses there are also outstanding social issues about privacy and access that have yet to be addressed (Nardi, Schwarz, Kuchinsky, Leichner, Whittaker and Sclabassi, 1993; Nardi, Kuchinsky, Whittaker, Leichner and Schwarz, 1997).

In parallel directions, Dourish, Adler, Bellotti and Henderson (1996) have explored the long-term use of “media spaces,” i.e., collaborative, networked, multimedia computer environments (Bly, Harrison and Irwin, 1993), starting out from the following three positions, which differ from traditional perspectives:

·         Face-to-face communicative behavior in the real world is not always an appropriate baseline for the evaluation of mediated communication.

·         A set of complex and intricate communicative behaviors, pertinent to the nature of the medium, emerges in a coevolution of the involved people with the work practices.

·         Media spaces connect not only individuals but the wider social groups to which they belong.

In this way, Dourish et al. developed a framework of four perspectives - individual, interactional, communal and societal - along which they analyzed the dynamics of media spaces. Their emphasis on the societal perspective is grounded on Spears and Lea’s (1993) studies of the ‘social’ in computer-mediated communication. Incidentally, many of those who have taken the turn to the social have been attracted to ethnomethodology, drawing on its resources for insights about the organization of the work of design (Button and Dourish, 1996; Dourish and Button, 1998).


Collaborative Virtual Environments (CVEs)


            A collaborative virtual environment (CVE) is an artificial space where several people interact and work together through networked computers and virtual reality systems (Benford, Bowers, Fahlén, Mariani and Rodden, 1994a). In this sense, CVEs constitute shared virtual worlds, i.e., computer-generated spaces whose occupants are represented to one another in three-dimensional graphical form. The cooperative applications supported by CVEs range from training, visualization, simulation and design to telework, telemedicine, distance learning and entertainment.

Each occupant of a CVE can control her or his viewpoint and can interact with others and with representations of data and software within a common display space. Occupants are graphically represented through avatars, which have positions, orientations and viewpoints, all intended to be seen by everybody in the system. In other words, CVEs provide ‘user embodiment,’ i.e., the provision of users with appropriate body images so as to represent them to others and also to themselves (Benford, Bowers, Fahlén, Greenhalgh and Snowdon, 1995). For such a process of direct and sufficiently rich embodiment to be effective, Benford et al. (1995) have identified a list of embodiment design issues which should be considered by the designers of CVE systems. This list includes: presence, location, identity, activity, availability, history of activity, viewpoint, actionpoint, gesture, facial expression, voluntary versus involuntary expression, degree of presence, reflecting capabilities, physical properties, active bodies, time and change, manipulation of views of others, representation across multiple media, autonomous and distributed body parts, truthfulness and efficiency.

As a result, since users in a CVE are all embodied in it so that their location and orientation can be represented, a degree of mutual awareness of each other’s activity may arise or be easily supported. This is in contrast to multimedia systems, in which data and communicational information are typically displayed in separate windows. Furthermore, CVEs may provide a shared spatial environment where people can interact by employing more communicative resources than in other technical systems. For instance, participants can have a degree of control over what they view in a CVE, which is not in general possible within VMC or media spaces supported by a fixed camera and monitor system. Accordingly, turn-taking in social interaction in a CVE does not depend on the ‘floor-control’ policies usually employed in VMC. Bowers, Pycock and O’Brien (1996) have systematically studied problems with turn-taking and participation in CVEs through certain qualitative, interpretive methodologies of social interaction (empirical techniques derived from conversation analysis). In addition, they examined how the simple polygonal shapes by means of which users were represented are deployed in social interaction. Quite surprisingly, even when these embodiments were implemented with very minimal shapes, they observed some familiar coordination of body movement in the virtual space too.

            Because of user embodiment within CVEs, one would be tempted to assume that, in a sense, users leave the physical world behind when entering the virtual world in order to communicate and to collaborate with others. On the contrary, the experiments of Greenhalgh and Benford (1995) with the MASSIVE system showed that, in order to make sense of a user’s actions in a virtual world, other users require an understanding of actions and events within that user’s local physical environment. Through a conversation analysis of transcripts of meetings in a CVE, Bowers et al. (1996) argued that the perceived trustworthiness of an embodiment can be influenced by real world events, such as users leaving their embodiments unoccupied when attending to real world interactions or several users sharing a single embodiment. Similarly, using an observational analysis of interaction in and through the virtual world, Hindmarsh, Fraser, Heath, Benford and Greenhalgh (1998) observed problems due to fragmented views of embodiments in relation to shared objects, participants compensating with spoken accounts of their actions, and difficulties in understanding others’ perspectives.

            Benford, Greenhalgh, Reynard, Brown and Koleva (1998) have classified shared-space technologies (including VMC, media spaces, CVEs, telepresence systems and collaborative augmented environments) according to three dimensions: transportation, artificiality and spatiality.

·         Transportation characterizes the sense of difference between local and remote in a shared-space. In fact, it concerns the extent to which a group of users and objects leave behind their local space and enter into some virtual space in order to meet with others, versus the extent to which they remain in their local space and the remote users and objects are brought to them. This concept is similar to virtual reality’s immersion. However, transportation differs from immersion insofar as it allows for the possibility of bringing remote users and objects together into the virtual environment in which a user is immersed.

·         Artificiality concerns the extent to which a space is either virtual-synthetic or real-physical, i.e., based on the physical world. At one extreme lies a wholly virtual-synthetic environment and at the other a wholly real-physical environment. For instance, VMC and telepresence applications are typical of the physical extreme, while CVEs devoted to abstract data visualization or computer art are examples of the synthetic extreme.

·         Spatiality concerns the level of support for fundamental physical spatial properties such as containment, topology, distance, orientation and movement. At one extreme lies the notion of place, as a containing context for users, and at the other the notion of space, as a context providing a consistent, navigable and shared spatial frame of reference. Thus, in minimal videoconferencing (single camera per group of communicants, no shared data space), the only perceived space which is independent of communicants is the place where they and their cameras are located. Although one can be seen to move within the video image, other remote communicants cannot interpret this with respect to their own frame of reference. Now, by creating a shared drawing surface between two video views, the spatiality of videoconferencing is extended to allow some movements and gestures of the communicants to be visible within a shared space. So, at the other end, in fully spatial shared-spaces, communicants can explore a common spatial environment, independently moving their own viewpoints, while at the same time being aware of the viewpoints of others through their avatars.
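
The three dimensions above can be read as the fields of a simple record describing a shared-space technology. The following is a minimal sketch under stated assumptions: the names SharedSpace, EXAMPLES and at_extreme are hypothetical, each dimension is reduced to its two extremes although the text presents it as a continuum, and a field is left as None wherever the text does not state a value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SharedSpace:
    """A shared-space technology placed along the three dimensions.

    Each field holds one of the two extremes named in the text, or None
    when the text gives no placement (a simplification of this sketch).
    """
    name: str
    transportation: Optional[str] = None  # "local" or "remote"
    artificiality: Optional[str] = None   # "physical" or "synthetic"
    spatiality: Optional[str] = None      # "place" or "space"


# Only placements that the text states explicitly are filled in.
EXAMPLES = [
    SharedSpace("minimal videoconferencing",
                artificiality="physical", spatiality="place"),
    SharedSpace("telepresence application", artificiality="physical"),
    SharedSpace("CVE for abstract data visualization",
                artificiality="synthetic"),
]


def at_extreme(spaces, dimension, value):
    """Return the names of the spaces whose `dimension` equals `value`."""
    return [s.name for s in spaces if getattr(s, dimension) == value]
```

For example, at_extreme(EXAMPLES, "artificiality", "physical") selects the two video-based entries, which restates the observation that VMC and telepresence applications are typical of the physical extreme.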

Furthermore, Benford, Greenhalgh and Lloyd (1997a) have extended their spatial model by introducing a framework for supporting crowds of participants in CVEs. Through an explicit crowd mechanism, they have accommodated the formation and activation of different kinds of crowd with different effects on mutual awareness and communication. In a previous study, Benford, Bowers, Fahlén and Greenhalgh (1994b) had described realizations of a spatial model of interaction supporting people’s social skills in crowded CVEs. Whereas much previous user interface design work had concentrated on people’s spatial skills (e.g., their ability to spatially classify and navigate), these researchers focused on social spatial skills, i.e., how people use space to manage interaction with one another. Since Anthony Giddens (1984), it has been well understood that space constitutes a key resource for establishing and enabling actions. Similarly, Benford et al. (1994b) claim that space also enables different modes of participation in, and awareness of, actions. In particular, it provides peripheral awareness of the presence and actions of others, allowing people to ‘see at a glance’ what is happening. In addition, space enables people to negotiate access to common resources, as we see them using their body positions, orientations, gaze direction etc. in order to control turn-taking, queuing, jostling, even scrumming, and to join or leave a conversation group (for instance, the ‘social dance’ taking place at cocktail parties).

At this point we turn to the critical conceptual differences between ‘space’ and ‘place.’ According to Steve Harrison and Paul Dourish (1996), although spatial metaphors are the prevailing means of supporting interaction, it is actually a notion of place that frames interactive behavior. In fact, these authors argue that the critical property which designers are seeking (which they call ‘appropriate behavioral framing’) is not rooted in the properties of space at all. They claim that, in contrast to space, place is the desired notion, understood as a set of common and shared cultural understandings about behavior and action (in their motto: “space is the opportunity; place is the understood reality”). What they mean is that a place is a valued space, which is invested with understandings of behavioral appropriateness, cultural expectations etc. (“we are located in ‘space,’ but we act in ‘place’”) and, as an example of the cultural content of virtual places, they refer to concerns about privacy in media spaces. In other words, a place is generally a space with something added (social meaning, convention, cultural understandings about role, function, nature etc.). They recall the term ‘locales,’ adopted by Anthony Giddens (1984) to capture a similar sense of behavioral framing that constitutes the meaningful content of interaction. Furthermore, Harrison and Dourish investigate two complex forms of places: (i) space-less places (such as USENET newsgroups or Internet mailing lists, and social navigation through information collections on the basis of information derived from the activity of others, which is a placeful navigation without physical space) and (ii) hybrid physical/virtual spaces, which technology can create and in which new ‘cyborg’ places can emerge.

Returning now to the previous classification scheme, Benford, Greenhalgh, Reynard, Brown and Koleva (1998) used it to establish general relationships between physical and shared-spaces and even to consider hybrid approaches combining different kinds of spaces. These hybrid spaces represent forms of mixed reality, being shared spaces that combine the physical and the synthetic, the local and the remote. Moreover, driven by the concern to support new forms of awareness and communication between occupants of many distributed spaces, they have explored a particular style of mixed reality by creating transparent boundaries between the physical and the synthetic. Thus, instead of being superimposed in a single display (as in previous approaches), two spaces are placed adjacent to one another and then stitched together by creating a ‘window’ between them. Such a construction of a transparent physical-synthetic boundary is based on a combination of projecting graphics into the physical space and texturing video into the virtual space. In more detail, the moving synthetic space (with the avatars within it) is transmitted across the network, rendered and then projected into the physical space. At the same time, a live video image of the physical space is transmitted across the network and then displayed in the synthetic space by dynamically texture mapping the incoming frames, so that it appears as an integrated part of the virtual environment. Furthermore, Benford et al. (1998) identified some general properties of these mixed-reality boundaries, including their degree of transparency, the possibilities for interaction with and through them, and the location of multiple boundaries within a single space. As an application, Benford, Snowdon, Brown, Reynard and Ingram (1997b) have presented some new forms of interface to the World-Wide Web based on the construction of innovative mixed-reality boundaries on the Internet.

Coming back now to the issue of the social character of virtual environments, two interesting questions are whether, and in what sense, they can be perceived as social spaces and whether they can be regarded as constituting social systems. Phillip Jeffrey and Gloria Mark (1998) observed two virtual worlds to investigate how social norms involved with personal and group space, privacy, crowding and territoriality affect people during interaction and navigation. What they found is very similar to what is observed in physical environments: for instance, people were disturbed when their personal and group spaces were violated and when spaces were crowded. Privacy was also a major concern, and it was indicated through positioning and other signals. So, their interpretation is that virtual environments are, indeed, perceived by people as social spaces. As for the second question, Barbara Becker and Gloria Mark (1998; 1999) claim that one criterion of social systems is the presence of social conventions, which serve as a basis for common communication. In fact, they assume that social conventions and rules are a fundamental precondition for the stability, efficiency and inner coherence of a social system (cf. Giddens, 1990). In particular, in social philosophy, social conventions have been described as normative rules of conduct, based on implicit ethical imperatives (cf. Habermas, 1987). Thus, social rules are the underlying preconditions of communication, because the way people communicate is embedded in social practice and specific life styles, determined by implicit social conventions. Therefore, regarding virtual environments as specific forms of social systems, Becker and Mark undertake an exploration of the implicit and explicit social conventions in order to understand the particular social practice within these environments.

Through an ethnomethodological analysis of three different online environments, Becker and Mark (1998; 1999) have tried to identify a number of social conventions rooted in the virtual environments studied. Furthermore, they examined the role of technology in shaping such behavioral conventions. They put forward two alternative hypotheses for this role: (1) that technology creates a sense of social presence that influences behavior and (2) that people use the available functionality that requires the least cognitive effort in order to achieve their goals.

            Turning to an investigation of the social conditions of work in CVEs, Fitzpatrick, Kaplan, Mansfield and Tolone (1995; 1996), through a case study of a group of systems administrators, have explored the differences between collaborative work carried out in the virtual domain and in the physical domain. These researchers have grounded their framework for understanding collaborative work in Anselm Strauss’ symbolic interactionist school, in particular his theory of social worlds (Strauss, 1978; Clarke, 1991) and his theory of action (Strauss, 1993). In this sociological context, a social world is an interactive unit defined by a group of people sharing a commitment to collective action and, so, requiring the coordination of separate perspectives and the sharing of resources. Although not necessarily conforming to geographical or organizational boundaries, social worlds are constrained instead by the limits of effective communication. Moreover, people can belong to multiple social worlds simultaneously. A locale or place signifies the site and means onto which a particular social world has mapped itself in order to carry out its collective tasks. In the case study of Fitzpatrick et al., the virtual medium provides a site and means to support other components of social world interactions. The various social worlds interacting in and through the virtual environment map their interest or focus to parts of the system, which become locales for their world. Thus, Fitzpatrick et al. (1995) suggest a new interpretation of spatial metaphors for the design and construction of collaborative systems, based on locales and on centres rather than boundaries. In this way, they try to give a different meaning to space, no longer as a simulation of the physical or a structuring of the interface but as a deep construction of the nature of work based on membership of and participation in social worlds.




Argyle, M., & J. Dean (1965). Eye contact, distance, and affiliation. Sociometry, 28, 289-304.

Argyle, M., R. Ingham, F. Alkema & M. McCallin (1973). The different functions of gaze. Semiotica, 7, 10-32.

Becker, B., & G. Mark (1998). Social conventions in collaborative virtual environments. In E. Churchill & D. Snowdon (eds.), Proceedings of CVE’98. Manchester.

Becker, B., & G. Mark (1999). Constructing social systems through computer-mediated communication. In E. Churchill & D. Snowdon (eds.), Virtual Reality Society Journal. London: Springer Verlag.

Benford, S., J. Bowers, L.E. Fahlén, J. Mariani & T. Rodden (1994a). Supporting co-operative work in virtual environments. The Computer Journal, 37(8), 653-668.

Benford, S., J. Bowers, L.E. Fahlén & C. Greenhalgh (1994b). Managing mutual awareness in collaborative virtual environments. In Proceedings of ACM SIGCHI VRST’94 Conference on Virtual Reality Software and Technology, August 1994, Singapore. New York: ACM Press.

Benford, S., J. Bowers, L.E. Fahlén, C. Greenhalgh & D. Snowdon (1995). User embodiment in collaborative virtual environments. In Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems, v. 1, pp. 242-249. New York: ACM Press.

Benford, S., C. Greenhalgh & D. Loyd (1997a). Crowded collaborative virtual environments. In Proceedings of ACM CHI’97 Conference on Human Factors in Computing Systems, pp. 59-66. New York: ACM Press.

Benford, S., D. Snowdon, C. Brown, G. Reynard & R. Ingram (1997b). Visualizing and populating the Web: Collaborative virtual environments for browsing, searching and inhabiting webspace. In Proceedings of JENC8, 8th Joint European Networking Conference, May 1997, Edinburgh.

Benford, S., C. Greenhalgh, G. Reynard, C. Brown & B. Koleva (1998). Understanding and constructing shared spaces with mixed-reality boundaries. ACM Transactions on Computer-Human Interaction, 5(3), 185-223.

Bennington, T.L., & G. Gay (2000). Mediated perceptions: Contributions of phenomenological film theory to understanding the interactive video experience. Journal of Computer-Mediated Communication [online], 5(4):

Bly, S.A., S.R. Harrison & S. Irwin (1993). Media Spaces: Bringing people together in a video, audio, and computing environment. Communications of the ACM, 36(1), 28-47.

Bowers, J., J. Pycock & J. O’Brien (1996). Talk and embodiment in collaborative virtual environments. In Proceedings of ACM CHI’96 Conference on Human Factors in Computing Systems, v. 1, pp. 58-65. New York: ACM Press.

Button, G., & P. Dourish (1996). Technomethodology: Paradoxes and possibilities. In Proceedings of ACM CHI’96 Conference on Human Factors in Computing Systems, v. 1, pp. 19-26. New York: ACM Press.

Clarke, A.E. (1991). Social worlds/arenas theory as organizational theory. In D.R. Maines (ed.), Social Organization and Social Processes: Essays in Honor of Anselm Strauss, pp. 119-158. New York: Aldine de Gruyter.

Colston, H.L., & D.J. Schiano (1995). Looking and lingering as conversational cues in video-mediated communication. In Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems, v. 2, pp. 278-279. New York: ACM Press.

Conrath, D.W., E.V. Dunn, W.G. Bloor & B. Tranquada (1977). A clinical evaluation of four alternative telemedicine systems. Behavioral Science, 22, 12-21.

Daft, R.L., & R.H. Lengel (1984). Information richness: A new approach to managerial behavior and organizational design. In B. Staw & L.L. Cummings (eds.), Research in Organizational Behavior, vol. 6, pp. 191-233. Greenwich, Conn.: JAI Press.

Daft, R.L., & R.H. Lengel (1986). Organizational information requirements, media richness and structural design. Management Science, 32(5), 554-571.

Dourish, P., A. Adler, V. Bellotti & A. Henderson (1996). Your place or mine? Learning from long-term use of audio-video communication. Computer-Supported Cooperative Work, An International Journal, 5(1), 36-62.

Dourish, P., & G. Button (1998). “Technomethodology”: Foundational relationships between ethnomethodology and system design. Human-Computer Interaction, 13(4), 395-432.

Durlak, J.T. (1987). A typology for interactive media. In M.L. McLaughlin (ed.), Communication Yearbook 10, pp. 743-757. Newbury Park, CA: Sage.

Eriksson, H. (1994). MBONE: The multicast backbone. Communications of the ACM, 37(8), 54-60.

Falk, H. (1973). PicturePhone and beyond. IEEE Spectrum, 45-49.

Finn, K.E., A.J. Sellen & S.B. Wilbur (eds.) (1997). Video-Mediated Communication. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.

Fish, R.S., R.E. Kraut, R.W. Root & R.E. Rice (1992). Evaluating video as a technology for informal communication. In Proceedings of CHI’92 Human Factors in Computing Systems, pp. 37-48. New York: ACM Press.

Fish, R.S., R.E. Kraut, R.W. Root & R.E. Rice (1993). Video as a technology for informal communication. Communications of the ACM, 36(1), 48-61.

Fitzpatrick, G., W.J. Tolone & S.M. Kaplan (1995). Work, locales and distributed social worlds. In Proceedings of the Fourth European Conference on Computer-Supported Cooperative Work, September 1995, Stockholm, pp. 1-16. Dordrecht: Kluwer Academic Publishers.

Fitzpatrick, G., S. Kaplan & T. Mansfield (1996). Physical spaces, virtual places and social worlds: A study of work in the virtual. In Proceedings of ACM Conference on Computer-Supported Cooperative Work CSCW’96, pp. 334-343. New York: ACM Press.

Gale, S. (1990). Human aspects of interactive multimedia communication. Interacting with Computers, 2(2), 175-189.

Gaver, W. (1992). The affordances of media spaces for collaboration. In Proceedings of ACM Conference on Computer-Supported Cooperative Work CSCW’92, pp. 17-24. New York: ACM Press.

Gibson, J. (1979). The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.

Giddens, A. (1984). The Constitution of Society. Cambridge: Polity Press.

Giddens, A. (1990). The Consequences of Modernity. Stanford, CA: Stanford University Press.

Grayson, D., & L. Coventry (1998). The effects of visual proxemic information in video mediated communication. SIGCHI Bulletin, 30(3), 30-39.

Greenhalgh, C., & S. Benford (1995). MASSIVE: A collaborative virtual environment for teleconferencing. ACM Transactions on Computer-Human Interaction, 2(3), 239-261.

Habermas, J. (1987). Theory of Communicative Action, 2 vols. Cambridge: Polity Press.

Hall, E.T. (1966). The Hidden Dimension. Garden City, NY: Doubleday.

Harrison, B., H. Ishii & M.H. Chignell (1994). An empirical study of orientation of shared workspaces and interpersonal spaces in video-mediated collaboration. Telepresence Technical Report OTP-94-2. Ontario Telepresence Project.

Harrison, S., & P. Dourish (1996). Re-place-ing space: The roles of place and space in collaborative systems. In Proceedings of ACM CSCW’96 Conference on Computer-Supported Cooperative Work. New York: ACM Press.

Heath, C., & P. Luff (1991). Disembodied conduct: Communication through video in a multi-media environment. In Proceedings of ACM Conference on Human Factors in Computing Systems CHI’91, pp. 99-103. New York: ACM Press.

Heath, C., & P. Luff (1992a). Media space and communicative asymmetries: Preliminary observations of video-mediated interaction. Human-Computer Interaction, 7(3), 315-346.

Heath, C., & P. Luff (1992b). Explicating face to face interaction. In N. Gilbert (ed.), Researching Social Life, pp. 306-327. London: Sage.

Heath, C., P. Luff & A. Sellen (1997). Reconfiguring media space: Supporting collaborative work. In K.E. Finn, A.J. Sellen & S.B. Wilbur (eds.), Video-Mediated Communication, pp. 323-347. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.

Heeter, C. (1992). Being there: The subjective experience. Presence: Teleoperators and Virtual Environments, 1(2), 262-271.

Hindmarsh, J., M. Fraser, C. Heath, S. Benford & C. Greenhalgh (1998). Fragmented interaction: Establishing mutual orientation in virtual environments. Proceedings of ACM CSCW’98 Conference on Computer-Supported Cooperative Work, pp. 217-226. New York: ACM Press.

Isaacs, E.A., & J.C. Tang (1994). What video can and cannot do for collaboration: A case study. Multimedia Systems, 2, 63-73.

Jeffrey, P., & G. Mark (1998). Constructing social spaces in virtual environments: A study of navigation and interaction. In K. Höök, A. Munro & D. Benyon (eds.), Workshop on Personalised and Social Navigation in Information Space, pp. 24-38. Stockholm: Swedish Institute of Computer Science.

Knoblauch, H. (1999). Book review of ‘Video-Mediated Communication.’ Computer Supported Cooperative Work, 8, 299-301.

Laurel, B. (ed.) (1990). The Art of Human-Computer Interface Design. Reading, MA: Addison-Wesley.

Laurel, B. (1991). Computers as Theatre. Reading, MA: Addison-Wesley.

Lea, M. (1991). Rationalist assumptions in cross-media comparisons of computer-mediated communication. Behaviour & Information Technology, 10(2), 153-172.

Lombard, M., & T. Ditton (1997). At the heart of it all: The concept of presence. Journal of Computer-Mediated Communication [online], 3(2):

Loomis, J.M. (1992). Distal attribution and presence. Presence: Teleoperators and Virtual Environments, 1(1), 113-119.

Mackay, W.E. (1995). Ethics, lies and videotape... In Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems, v. 1, pp. 138-145. New York: ACM Press.

Mazur, J.M. (2000). Applying insights from film theory and cinematic technique to create a sense of community and participation in a distributed video environment. Journal of Computer-Mediated Communication [online], 5(4):

Morikawa, O., & T. Maesako (1998). HyperMirror: Toward pleasant-to-use video mediated communication system. In Proceedings of ACM CSCW’98 Conference on Computer-Supported Cooperative Work, pp. 149-158. New York: ACM Press.

Nardi, B., H. Schwarz, A. Kuchinsky, R. Leichner, S. Whittaker & R. Sclabassi (1993). Turning away from talking heads: An analysis of “video-as-data.” In Proceedings of CHI’93 Human Factors in Computing Systems, pp. 327-334. New York: ACM Press.

Nardi, B.A., A. Kuchinsky, S. Whittaker, R. Leichner & H. Schwarz (1997). Video-as-data: Technical and social aspects of a collaborative multimedia application. In K.E. Finn, A.J. Sellen & S.B. Wilbur (eds.), Video-Mediated Communication, pp. 487-518. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.

Ngwenyama, O.K., & A.S. Lee (1997). Communication richness in electronic mail: Critical social theory and the contextuality of meaning. MIS Quarterly, 21(2), 145-167.

Ochsman, R.B., & A. Chapanis (1974). The effects of 10 communication modes on the behavior of teams during co-operative problem-solving. International Journal of Man-Machine Studies, 6, 579-619.

O’Conaill, B., S. Whittaker & S. Wilbur (1993). Conversations over video conferences: An evaluation of the spoken aspects of video-mediated communication. Human-Computer Interaction, 8(4), 389-428.

Rice, R.E., & G. Love (1987). Socio-emotional content in a computer-mediated communication network. Communication Research, 14(1), 85-105.

Sellen, A. (1992). Speech patterns in video-mediated communications. In Proceedings of ACM Conference on Human Factors in Computing Systems CHI’92, pp. 49-59. New York: ACM Press.

Short, J.A. (1974). The effect of medium of communication on experimental negotiation. Human Relations, 27, 225-234.

Short, J., E. Williams & B. Christie (1976). The Social Psychology of Telecommunications. New York: John Wiley.

Sitkin, S., K. Sutcliffe & J. Barrios-Choplin (1992). Determinants of communication media choice in organizations: A dual function perspective. Human Communication Research, 18, 463-498.

Spears, R., & M. Lea (1993). Social influence and the influence of the ‘social’ in computer-mediated communication. In M. Lea (ed.), Contexts of Computer-Mediated Communication, pp. 30-65. New York: Harvester Wheatsheaf.

Spears, R., M. Lea & T. Postmes (2000). Social psychological theories of computer-mediated communication: Social pain or social gain? In W.P. Robinson & H. Giles (eds.), The Handbook of Language and Social Psychology (2nd edition). Chichester: Wiley.

Steuer, J. (1995). Defining virtual reality: Dimensions determining telepresence. In F. Biocca & M.R. Levy (eds.), Communication in the Age of Virtual Reality, pp. 33-56. Hillsdale, NJ: Lawrence Erlbaum Associates.

Strauss, A. (1978). A social world perspective. Studies in Symbolic Interaction, 1, 119-128.

Strauss, A. (1993). Continual Permutations of Action. New York: Aldine de Gruyter.

Tang, J.C., & E. Isaacs (1993). Why do users like video? Studies of multimedia-supported collaboration. Computer-Supported Cooperative Work: An International Journal, 1(3), 163-196.

Trevino, L.K., R.L. Daft & R.H. Lengel (1990). Understanding managers’ media choices: A symbolic interactionist perspective. In J. Fulk & C. Steinfield (eds.), Organizations and Communication Technology, pp. 71-94. Newbury Park, CA: Sage Publications.

Whittaker, S. (1996). Rethinking video as a technology for interpersonal communication: Theory and design implications. International Journal of Human-Computer Studies, 42, 501-529.

Williams, E. (1977). Experimental comparisons of face-to-face and mediated communication: A review. Psychological Bulletin, 84(5), 963-976.
