Spatial Audio and Immersion

Maverick Kuhn ’20

Virtual reality is an ever-expanding market, taking the game industry by storm and promising to revolutionize many other industries, including education. VR transports you into an entirely new reality, where you can interact with objects and move through 3D space. When trying to improve VR experiences, many people focus on the minute details of creating more realistic and immersive visuals. However, it is crucial that we don’t overlook audio. In fact, it has been shown that in both video and VR production, audio is the single biggest component of believable immersion. A user is much more likely to enjoy a video with high-quality sound and passable video than one with low-quality sound and high-quality video. Furthermore, a virtual reality environment’s immersion relies heavily on how audio propagates within the scene.

When we normally put on headphones, we listen to audio that is dimensionally flat; sound comes at us through the left or right ear but doesn’t give much directional information beyond that. Spatial audio, or 360 audio, is sound that provides spatial information about the environment around you. It is communicated through how the audio reacts when you shift your perspective and position in space. Imagine a speaker in front of you: the audio is coming from straight ahead. However, if you turn your head left, you should hear the speaker predominantly in your right ear. If you move your head closer to the speaker, the sound should get louder. To achieve this responsive effect, the user’s head needs to be tracked in the 3D virtual space. This is feasible using the Vive headset.

Spatial audio allows us to specify exactly where a sound is coming from in 3D space (e.g. high up in the distance behind you), and the sound will react to how the user moves around the environment. The software we are using for spatial audio is already part of Unity, the game engine we are using for the Gaspee experience. Unity allows us to place audio sources in 3D space, and its algorithms automatically adjust the volume according to spatial parameters. For example, let’s say we want to hear the crackle of a realistic campfire. To do this, we first place a 3D model of a fire inside our 3D environment. Next, we assign an audio file to that 3D model. From there, we can set a falloff graph, which determines how dramatically the volume changes with the user’s proximity to the object; the curve can be linear, logarithmic, or custom. Once this has been set, the user can walk all around the fire and the audio will adjust accordingly, making the fire seem actually ‘there.’
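To give a feel for what the three falloff options mean, here is a rough Python sketch of each curve as a function from distance to volume. This is not Unity's code, and Unity's exact curves may differ. The function names, the `min_distance` and `max_distance` defaults, and the inverse-distance formula used for the "logarithmic" curve are all assumptions made for illustration.

```python
def linear_falloff(distance, min_distance=1.0, max_distance=20.0):
    """Volume drops in a straight line from 1.0 at min_distance to 0.0 at max_distance."""
    if distance <= min_distance:
        return 1.0
    if distance >= max_distance:
        return 0.0
    return 1.0 - (distance - min_distance) / (max_distance - min_distance)


def logarithmic_falloff(distance, min_distance=1.0, max_distance=20.0):
    """Volume halves each time the distance doubles (assumed inverse-distance curve).

    Past max_distance the volume stops decreasing instead of reaching zero.
    """
    if distance <= min_distance:
        return 1.0
    return min_distance / min(distance, max_distance)


def custom_falloff(distance, points):
    """Interpolate linearly between hand-placed (distance, volume) points.

    `points` must be sorted by strictly increasing distance.
    """
    if distance <= points[0][0]:
        return points[0][1]
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if distance <= d1:
            t = (distance - d0) / (d1 - d0)
            return v0 + t * (v1 - v0)
    return points[-1][1]


# Compare how loud the campfire would be at a few distances under each curve.
campfire_curve = [(0.0, 1.0), (3.0, 0.8), (10.0, 0.1), (15.0, 0.0)]
for d in (1.0, 2.0, 5.0, 10.0, 20.0):
    print(d, linear_falloff(d), logarithmic_falloff(d), custom_falloff(d, campfire_curve))
```

The logarithmic curve falls off quickly near the source and then flattens out, which tends to suit small, localized sounds like a campfire. The linear curve fades more evenly over the whole range.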

Spatial audio greatly benefits immersion in VR. To create increasingly realistic environments, we must continue improving the design of responsive spatial audio.

I would like to leave you with an article about a new innovation in 3D audio. Valve has recently acquired a technology that allows sound to react to the physical objects in the environment (e.g. how a sound changes when it bounces off a wall or moves through bushes). This will allow for unprecedented realism in VR audio and brings us one step closer to complete immersion.