Interactive virtual environments for blind children: usability and cognition


Jaime Sánchez and Loreto Jorquera

Department of Computer Science, University of Chile

Blanco Encalada 2120, Santiago, Chile



It is known that blind children have cognitive difficulty representing spatial environments. This difficulty can be reduced by exposing them to interactive experiences with acoustic stimuli delivered through spatialized sound software. The aim of this research was to implement a field study to detect and analyze the cognitive and usability issues involved in the use of an aural environment, and the issues of representing navigable structures with spatial sound alone. The research was implemented by exposing the children to a spatialized acoustic environment in challenge-action game software, complemented with a kit of cognitive representation tasks that included corporal exercises and interaction with concrete representation materials such as sand, clay, styrofoam, and Lego® bricks. The cognitive kit also included learning activities to represent the perceived environment, organize the space, and solve problems related to the interactions with the software. Usability testing of the environment was an explicit part of the research, using both qualitative and quantitative methods including interviews, surveys, logging of actual use, still pictures, and analysis of videotaped sessions. The results of the study revealed that blind children can construct mental structures rendered with spatialized sound alone, and that spatial imagery is not purely visual by nature but can be constructed and transferred through spatialized sound delivered by computer software. Our hypothesis was fully confirmed: each blind child passes through four clear cognitive stages when interacting with the sound environment and performing cognitive tasks: entry, exploration, adaptation, and appropriation.

Keywords: virtual acoustic environment, spatial representation, blind children, audio-based navigation, usability, spatialized sound


1. Introduction

Integrating Virtual Reality (VR) into assistive aids for users with special needs is an attractive combination that has received considerable media attention recently. The appeal stems from the fact that VR, or more accurately Virtual Environments (VEs), provide novel methods for visualizing and interacting with complex data sets. Recently, VR and VEs have been used to enhance or ameliorate the cognitive and navigation problems of blind users. Applications of this technology to assist blind users can be classified according to their type.

The use of VR as an interactive technology to explore how users represent and interpret symbolic objects and simulated environments in their minds has recently received some attention in the literature. It is known that blind children have cognitive difficulty representing spatial environments. From the point of view of our research, we propose that it is possible to render spatial structures without visual cues. This idea offers a chance to train children to acquire these skills by exposing them to interactive experiences with acoustic stimuli delivered through spatialized sound software. A few studies have approached this issue by using interactive applications that integrate virtual reality and cognitive tasks to enhance or test spatial orientation skills [Chance 1998, Loomis 1998]. In this study, using acoustic interactive virtual environments fully described elsewhere [Lumbreras & Sánchez, 1998, 1999a], we attempt to assess the cognitive and usability issues involved in rendering spatial structures in the minds of blind children through acoustic navigable virtual environments.

The focus of this research was to implement a study to detect and analyze the cognitive and usability issues involved in the use of an aural environment and the characteristics of representing navigable structures through spatial sound. This study extends a challenging pilot research project [Lumbreras & Sánchez, 1998, 1999a] into a full-fledged, contextualized field study carried out with eleven blind children over six months in a Chilean school for the blind. In this work we also describe the cognitive effects on children of interacting with this software environment, in terms of the development of cognitive spatial representations resulting from interaction with acoustic environments [Lumbreras & Sánchez 1999b; Sánchez & Lumbreras 1999; Sánchez 2000a, 2000b, 2000c]. The idea is to motivate and engage blind children to explore and interact with virtual entities in challenge-action software, constructing cognitive spatial mental representations implicitly as they play.

2. Methodology

The research was implemented in a Chilean school for the blind. We extracted a purposive sample of eleven first- and second-grade children from socially deprived suburbs. The size of the sample was established by considering restrictions such as school time slots and available space, structural and topological conditions of the school, PC availability, and the number of special education teachers and assistants we worked with. Some of these children live at home and others board at the school, going home on weekends. The ages of the children ranged from seven to twelve years. Some children had total blindness and some had residual vision. In addition, some had some degree of cognitive and learning disability. Diagnostic tests were applied to develop a profile of each child, including age, type of blindness, affective background (maturity, tolerance to frustration, irritability, emotional capacity, self-esteem), cognitive level (Wechsler verbal scale), and psychomotor level.

The software used to render the spatial environment was built on a large set of monaural sound clips preprocessed with FIR (Finite Impulse Response) filters based on HRTFs (Head-Related Transfer Functions) to render spatialized sound. The software runs on a PC under Windows 95 with a standard stereo sound board. In addition to the virtual environment designed for the pilot experience [Lumbreras & Sánchez, 1998, 1999a], we built a kit of cognitive representation tasks (see Fig 1). These tasks included different levels of exposure to the interaction with the acoustic environment, experiences with corporal movements of the children in the schoolyard, and model building with sand tables and other concrete materials such as clay, styrofoam, Lego bricks, plastic images, wood cubes, metal pins, small balls, and sandpaper. Six tasks were carefully planned, each with cognitive objectives, timing, number of pedagogical sessions, and methodological procedures. The kit also included learning activities to represent the perceived environment, organize the space, and solve problems related to the interactions with the software. Usability testing of the environment was an explicit part of the research, using both qualitative and quantitative methods including interviews, surveys, logging of actual use, still pictures, and analysis of videotaped sessions.
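In outline, HRTF-based rendering amounts to FIR-filtering each monaural clip with a left/right pair of head-related impulse responses (HRIRs, the time-domain form of the HRTFs). The sketch below is a minimal illustration of this idea, not the actual implementation; the toy HRIR coefficients are invented so that the source is perceived on the listener's right.

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Render a monaural clip as spatialized stereo by convolving it
    with a left/right pair of head-related impulse responses."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    n = max(len(left), len(right))            # equalize channel lengths
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right], axis=1)    # shape: (samples, 2)

# Toy example: a single click filtered by invented HRIRs in which the
# right ear hears the click earlier and louder than the left ear.
click = np.zeros(64)
click[0] = 1.0
hrir_right_ear = np.array([0.9, 0.3])           # direct, strong
hrir_left_ear = np.array([0.0, 0.0, 0.5, 0.2])  # delayed, attenuated
stereo = spatialize(click, hrir_left_ear, hrir_right_ear)
```

The interaural time and level differences introduced by the two filters are what the listener's auditory system interprets as direction.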


Fig 1 Sequence of activities from the interaction with the virtual environment to the building of the perceived space

Fig 2 Left, spatial map embedded in the virtual environment to be represented by the blind children. Right, final map built by a blind girl with LEGO bricks.


3. Results


3.1. Cognitive achievement

As a result of the empirical research, we identified four clear cognitive stages in the child's interaction with the spatialized sound environment and in the mental mapping of the spatial structure. Achievement of the stages was manifested by different levels of fidelity of the model designs compared with the ideal structure embedded in the software (see Fig 2). The stages are:

The entry stage consisted of the initial interaction with the 3D virtual environment. The child learns different spatial concepts that are prerequisites for an adequate organization of the environment, and develops self-esteem and autonomy when interacting with the environment. The child starts representing the space mentally through the story of the software. The concrete representations are very incomplete and unclear. The main characteristic of this stage is that the child interacts with the software and is highly motivated in doing so. The child's level of control of the software is low.

The exploration stage was marked by a general representation of the structure embedded in the sound environment, made through corporal movement of the child in the schoolyard (see Fig 3, right). The child's attention centers on exploring the software. The child reaches a representation of the main corridor, but no details or secondary corridors are represented with concrete materials (see Fig 3, left). The child develops spatial notions by moving through a virtual space oriented by acoustic information, and can represent mental images of the story's virtual space with concrete materials based exclusively on acoustic information, showing coherence between the concrete representations and the virtual space navigated through sound. The interaction with the story of the software gives the blind child confidence and autonomy. The child's level of control of the software is medium.

Fig 3 Left, this concrete representation reaches only the idea of a main corridor. Right, corporal movement made by a blind girl to make a general representation of the map embedded in the sound environment.

The adaptation stage was attained with the representation of all corridors. The child understands that the structure embedded in the acoustic environment is composed of one main corridor and two secondary corridors perpendicular to it. The child solves the conflict created by the entrance to the secondary corridors by understanding that it involves a change in the orientation of his/her movements. Even though all corridors are mapped, the representation is incomplete (see Fig 4, left). The child's level of control of the software is high, and the child makes more complex representations of the navigated virtual environment.

Fig 4 Representations with clay when the child is in the adaptation stage (left) and the appropriation stage (right).

A complete understanding and mapping of the structure embedded in the sound environment was reached in the appropriation stage. Its main characteristic is the child's mastery of the environment. The child understands the story of the software and uses it effortlessly as an engaging tool. The child maps a main corridor with the possibility of moving through divergent corridors, passing through doors that change the orientation of movement, evidencing a full comprehension of the story's virtual space (see Fig 4, right). The child's level of control of the software is complete. As a result, the child constructs a complete mental image of the spatialized environment and represents it concretely with high fidelity. The model is built with high quality and complexity in terms of the spatial structure of the navigated story, the elements used, and the entities involved.

3.2. Levels of cognitive achievement

Not all children attained the level of appropriation. This is explained by the fact that other uncontrolled (but known) variables played a significant role in the expression of our dependent variable, the level of mental spatial representation. The sample was in fact diverse. Some children had total blindness and some had residual vision. Some could use their residual vision to interact with the visual interface and visually localize the keys of the keyboard. Others had only color or light/shadow perception, with almost no effect on the interaction. Two children had mental disorders and spastic hemiplegia.

We identified three levels of achievement in our research: high, medium, and incomplete. High achievement was reached by students both with residual vision and with total blindness. They represent the virtual space with high fidelity, completely comprehend the spatial structure of the software, and move easily through it, making flexible changes in the orientation of their movements. They build correct mock-ups with corridors and the distribution of doors in the space, organizing elements and entities and correctly showing the entry and end points of the story. There is high coherence between the concrete representations and the virtual space navigated through sound. The narratives given by the children when explaining their work include important details, such as the different ways the story can end depending on the corridor they enter.


Required skill                                    Cognitive requirements
To move through the environment                   Software mastery; understanding of the interaction; semantic and positional recognition of sounds
To move from point A to point B                   Organization and recognition of landmark sequences; understanding of rotations; path integration
To represent the embedded structure               Mental mapping of the topology; distance perception
To choose the best path from point A to point B   Full mental representation of the spatial structure; travel planning

Table 1 Cognitive skills required for space-based tasks. Moving through the environment is the lowest-level objective and choosing the best path is the highest.

Students with residual vision attained the level of medium achievement. They comprehend the existence of the main and lateral corridors but cannot orient themselves when passing through lateral doors. They are unable to make a complete second change of orientation when moving through the virtual space, managing it with only one lateral corridor. Their cognitive mapping of the story's complexity is incomplete.

Students with cognitive and learning disabilities reached the level of incomplete achievement. Because of their limited cognitive development, their performance stalled when we added complexity: more corridors, more elements, more entities, and more orientation movements. These children could not relate the corridors to one another; they perceived them as entirely separate from the main corridor.

Fig 5. A model of the typical child's progress through time

3.3. Comparative results between sighted and blind children

A group of sighted children from a regular primary school, of the same ages and characteristics as our sample of blind children, was exposed to interaction with the sound environment. Our goal was to observe how they behave in and represent an audio-based virtual environment, and to compare their representations with those made by blind children.

As a result, on the one hand, the sighted child (left side of Fig 6) represents only the main corridor and some entities in the surrounding space (sand), represents the navigated environment partially, including some elements that do not exist in the software (clay), and produces an unclear representation that mixes different elements of the environment (Lego).

On the other hand, the blind child completely represents the main and additional corridors (sand), maps the main corridor with high fidelity although the additional corridors are incompletely represented (clay), and represents an exact map of the environment with the main and additional corridors laid out as in the original model embedded in the software (Lego).

Fig 6 Comparative representations with sand, clay, and Lego between sighted (left) and blind (right) children interacting with the spatialized sound virtual environment.

3.4 Logging actual use analysis

We recorded each child's interaction in a log file to study progress through subsequent sessions. The log captures every event, whether generated by the user or by the system. With an ad-hoc analysis tool we studied variables such as the decrease in localization time across sessions, the error rate, and the pattern of interaction. As described above, the virtual environment includes a main corridor with two divergent secondary corridors. To simplify the graphs we show only some interactions that happened in the main corridor. The child can perform and receive events from three surrounding voxels labeled left, middle, and right. In the following figures, each icon represents an event. The small box in the upper left corner of each icon is shaded differently to indicate the class of the event.
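Measures like those mentioned above (localization errors, session completion time) could be computed from such a log roughly as follows. The record layout and event names here are hypothetical, since the actual file format of the software is not described; the sample trace mirrors the failed pickup attempts marked B and C in Fig 7.

```python
from dataclasses import dataclass

# Hypothetical log-record layout, for illustration only.
@dataclass
class Event:
    t: float       # seconds since session start
    kind: str      # e.g. "pickup_try", "pickup_ok"
    voxel: str     # "left", "middle", or "right"

def error_rate(events, try_kind, ok_kind):
    """Fraction of attempts that failed (made in the wrong voxel),
    as in patterns B and C of Fig 7."""
    tries = sum(1 for e in events if e.kind == try_kind)
    oks = sum(1 for e in events if e.kind == ok_kind)
    return (tries - oks) / tries if tries else 0.0

def completion_time(events):
    """Elapsed time from the first to the last event of the trace."""
    return events[-1].t - events[0].t

# First-session trace: two wrong-voxel pickup attempts before success.
session1 = [
    Event(0.0, "start", "middle"),
    Event(20.1, "pickup_try", "left"),    # wrong voxel (B)
    Event(23.4, "pickup_try", "right"),   # wrong voxel again (C)
    Event(25.0, "pickup_try", "middle"),
    Event(25.0, "pickup_ok", "middle"),   # finally picked up
    Event(61.5, "end", "middle"),
]
```

Comparing these figures across sessions yields the kind of progress curves discussed below (e.g. completion time dropping from over 60 seconds to 25 seconds).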

Fig 7. Above, the child's surrounding voxels drawn on the picture. Below, trace of the first interaction with the audio virtual environment.

A. There are almost twenty seconds without user activity; perhaps the child is confused.

B. A box of bullets is encountered. The child tries to pick up the bullet box in the wrong voxel.

C. The same pattern as in B appears. The child cannot accurately localize the representative sound.

D. With a few steps the child arrives at the first door.

E. The child tries to open the door after 5 seconds. The first try was made in the wrong voxel.

Fig 8. After some training the child is able to pick up objects almost immediately and without errors. Notice that the child took only 25 seconds to reach the end of the corridor, compared with more than 60 seconds in the first try (Fig 7).

Fig 9. Trace of the pattern of interaction to reach the end of the main corridor. At the end, the child encounters the final goal (the center of the flying source).

A. The child localizes the box of bullets and picks it up without errors

B. The child fails to detect the position of the door. After a few tries he opens the door.

C. A monster appears and the child misses the shot, pointing at the wrong voxel. He then corrects his error and destroys the monster after three shots.

D. The child opens the last door of the main corridor

E. The mutant appears and moves from voxel to voxel every few seconds. The child then destroys it.

F. The child arrives at the end of the main corridor and destroys the center of the flying source.


Fig. 10 Icons used in the logging analysis


4. Discussion

This research revealed that it is possible to achieve the construction of mental structures rendered with spatialized sound in conjunction with a set of cognitive tasks. The sound environment by itself does not make any difference in the development of spatial structures in blind children; cognitive tasks with pedagogical implementations are critical for good results.

Spatial imagery can be transferred through spatialized sound delivered by virtual sound environments together with an appropriate methodology. However, not all the children reached the same cognitive stage in their spatial representation. The highest level we defined, appropriation, can be attained by most children, but each needs a different rhythm, pace, and emphasis. When a blind child has another deficit in addition to blindness, the development of spatial structures is more complicated and requires dedicated coaching, extensive time, and a slow pace. We believe that all blind children can reach the appropriation stage if there is a careful design that maps the needs, background, and requirements of each child, case by case.

As a result of exposing blind children to the aural environment, we believe that each child possesses unique skills and a unique pace of mental and spatial development. This has a major impact on the quality of the topological features of the model obtained in comparison with the ideal reference spatial structure embedded in the software.

An interesting result emerged when we compared sighted and blind children. The picture shows that sighted children (and probably adults) do not rely on sound to construct their spatial structures as blind children can (see Fig 6). Learning through sound is very poor in sighted children. Sound environments such as the one used in this study could be used to enrich the cognitive experience of sighted children, which is heavily based on images. Perhaps there is an entirely new story in using sound not just for emotion, as we mostly see today, but to help construct richer mental experiences.

Finally, we have confirmed the results obtained in a past pilot test. This research design was deeper, longer, more systematic, more cognitively focused, and full of diverse experiences with concrete materials. We also arrived at a clear picture of the role of virtual sound environments in the construction of mental spatial structures. With a set of clear concepts, we are moving toward more powerful and flexible sound maps designed by teachers and parents using virtual soundscape editors. We are currently building these tools on Microsoft DirectX 3D Sound. We are also studying other effects of acoustic environments, such as the impact on the construction of temporal cognitive structures and the mapping of real spaces such as the subway and the school through sound. Finally, we need to know how exact the representations made by the children are; we are therefore evaluating the exactness of the children's models in relation to those embedded in the software by defining metrics.
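As an illustration of what such an exactness metric might look like (the actual metric is still being defined, so this is purely a hypothetical sketch), one option is to encode both the child's model and the software-embedded structure as corridor graphs and score their edge overlap:

```python
def fidelity(model_edges, reference_edges):
    """Jaccard overlap between the corridor connections a child built
    and those embedded in the software: 1.0 is an exact topological
    match, 0.0 means no shared connections."""
    model = {frozenset(e) for e in model_edges}   # undirected edges
    ref = {frozenset(e) for e in reference_edges}
    return len(model & ref) / len(model | ref)

# Reference: a main corridor (entry-A-B-end) with two secondary
# corridors branching at A and B (node names are invented).
reference = [("entry", "A"), ("A", "B"), ("B", "end"),
             ("A", "sec1"), ("B", "sec2")]
# A child in the adaptation stage maps the main corridor but only one
# of the two secondary corridors.
child = [("entry", "A"), ("A", "B"), ("B", "end"), ("A", "sec1")]
print(fidelity(child, reference))  # → 0.8
```

A score like this captures only topology; distances and the placement of elements and entities would need separate measures.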


5. References