By Claudio Santos
Virtual reality brings a lot of new terminology to the post process, and we’re all having a hard time agreeing on the meaning of everything. It’s tricky because clients and technicians sometimes have different understandings of the same term, which is a guaranteed recipe for headaches in post.
Two terms that I’ve seen confused a few times in the spatial audio realm are Gaze Activation and Focus. They are similar enough to be put in the same category, but different enough that most of the time you have to choose completely different tools and distribution platforms depending on which technology you want to use.
Focus is what the Facebook Spatial Workstation calls this technology, but it is a tricky one to name. As you may know, ambisonics represents a full sphere of audio around the listener. Players like YouTube and Facebook (which uses ambisonics inside its own proprietary .tbe format) can dynamically rotate this sphere so the relative positions of the audio elements stay accurate to the direction the audience is looking in. But the sounds don’t change noticeably in level depending on where you are looking.
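To make the rotation idea concrete, here is a minimal sketch of what a player might do with a first-order ambisonic signal when the viewer turns their head. It is an illustration only, not how YouTube or Facebook actually implement it. The function names `encode` and `rotate_yaw` are made up for this example, and I am assuming a simplified encoding with X pointing forward, Y pointing left and Z pointing up, with normalization details ignored.

```python
import math


def encode(azimuth_rad, gain=1.0):
    """Encode a source on the horizontal plane into W, X, Y, Z.

    Assumption: X is front, Y is left, Z is up, azimuth is measured
    counterclockwise from the front. Normalization is ignored.
    """
    return (gain,
            gain * math.cos(azimuth_rad),
            gain * math.sin(azimuth_rad),
            0.0)


def rotate_yaw(w, x, y, z, head_yaw_rad):
    """Counter-rotate the sound field when the head turns left by head_yaw_rad.

    Only X and Y are mixed together. W (the omnidirectional part) and
    Z (height) are untouched, so the overall level does not change.
    """
    c = math.cos(head_yaw_rad)
    s = math.sin(head_yaw_rad)
    return (w,
            x * c + y * s,
            -x * s + y * c,
            z)


# A source 90 degrees to the left; the viewer then turns 90 degrees left.
w, x, y, z = rotate_yaw(*encode(math.pi / 2), head_yaw_rad=math.pi / 2)
```

After the rotation the source sits straight ahead (X close to 1, Y close to 0), while W is exactly what it was before. That is the point of the paragraph above: the sphere turns, but nothing gets louder or quieter.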
If we take a step back and think about “surround sound” in the real world, it actually makes perfect sense. A hair clipper isn’t noticeably louder when it’s in front of our eyes as opposed to when it’s trimming the back of our head. Nor can we ignore the annoying person who is loudly talking on their phone on the bus by simply looking away.
But for narrative construction, it can be very effective to emphasize what your audience is looking at. That opens up possibilities, such as presenting the viewer with simultaneous yet completely unrelated situations and letting them choose which one to pay attention to simply by looking in the direction of the chosen event. Keep in mind that in this case, all events are happening simultaneously and will carry on even if the viewer never looks at them.
This technology is not currently supported by YouTube, but it is possible in the Facebook Spatial Workstation with the use of high Focus Values.
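As a rough illustration of the idea, focus can be thought of as a gain that depends on the angle between the gaze and each sound. The sketch below is my own simplification and not the algorithm the Facebook Spatial Workstation uses. The names `focus_gain`, `focus_width_deg` and `off_focus_db` are hypothetical, and it only considers horizontal angles.

```python
def angle_between(az_a_deg, az_b_deg):
    """Smallest absolute angle between two azimuths, in degrees (0 to 180)."""
    diff = (az_a_deg - az_b_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def focus_gain(gaze_az_deg, source_az_deg,
               focus_width_deg=60.0, off_focus_db=-12.0):
    """Linear gain for a source, given where the viewer is looking.

    Hypothetical model: full level inside the focus cone, then a fade
    (linear in dB) down to off_focus_db for a source directly behind.
    focus_width_deg is assumed to be smaller than 360.
    """
    offset = angle_between(gaze_az_deg, source_az_deg)
    half_width = focus_width_deg / 2.0
    if offset <= half_width:
        return 1.0
    # 0.0 at the edge of the cone, 1.0 directly behind the viewer.
    t = (offset - half_width) / (180.0 - half_width)
    return 10.0 ** (off_focus_db * t / 20.0)


# Every source keeps playing; only its level follows the gaze.
in_front = focus_gain(gaze_az_deg=0.0, source_az_deg=0.0)
behind = focus_gain(gaze_az_deg=0.0, source_az_deg=180.0)
```

With these defaults the sound the viewer is facing plays at full level and the one directly behind is turned down by 12 dB (a gain of about 0.25). Note that nothing is ever switched off, which is what separates focus from the trigger-based approach.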
When we talk about focus, the key thing to keep in mind is that all the events happen whether or not the viewer looks at them. If instead you want a certain sound to happen only when the viewer looks at a certain prop, regardless of the time, then you are looking for Gaze Activation.
This concept is much more akin to game audio than to film sound because of the interactivity element it presents. Essentially, you are using the direction of the gaze and potentially the length of the gaze (if you want your viewer to look in a direction for x amount of seconds before something happens) as a trigger for a sound/video playback.
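The trigger logic itself is simple enough to sketch. The example below is a generic illustration, not the API of any particular game engine or of InstaVR. The class name `GazeTrigger` and its parameters are hypothetical, and I am assuming the engine calls `update` once per frame with the current gaze azimuth and the frame duration.

```python
class GazeTrigger:
    """Fires once after the viewer has looked at a target long enough."""

    def __init__(self, target_az_deg, tolerance_deg=15.0, dwell_s=1.0):
        self.target_az_deg = target_az_deg
        self.tolerance_deg = tolerance_deg  # how close counts as "looking at it"
        self.dwell_s = dwell_s              # the "x amount of seconds"
        self.elapsed_s = 0.0
        self.fired = False

    def update(self, gaze_az_deg, dt_s):
        """Call once per frame. Returns True only on the frame it fires."""
        diff = (gaze_az_deg - self.target_az_deg + 180.0) % 360.0 - 180.0
        if abs(diff) > self.tolerance_deg:
            # Looking away resets the dwell timer.
            self.elapsed_s = 0.0
            return False
        self.elapsed_s += dt_s
        if not self.fired and self.elapsed_s >= self.dwell_s:
            self.fired = True
            return True
        return False


# A prop at 90 degrees that needs half a second of attention.
trigger = GazeTrigger(target_az_deg=90.0, dwell_s=0.5)
results = [
    trigger.update(0.0, 0.25),   # looking elsewhere
    trigger.update(90.0, 0.25),  # started looking
    trigger.update(90.0, 0.25),  # dwell time reached, fires here
    trigger.update(90.0, 0.25),  # already fired, stays quiet
]
```

Whatever `update` returning `True` is wired to (a sound, a video clip, a jump scare) only starts once the viewer has actually looked, which is the opposite of focus, where everything runs on the timeline no matter what.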
This is very useful if you want to make it impossible for your audience to miss something because they were looking in the “wrong” direction. Think of a jump scare in a horror experience. It’s not very scary if you’re looking in the opposite direction, is it?
This is currently only supported if you build your experience in a game engine or as an independent app with tools such as InstaVR.
Both concepts are very closely related and I expect many implementations will make use of both. We should all keep an eye on the VR content distribution platforms to see how these tools will be supported and make the best use of them in order to make 360 videos even more immersive.
Claudio Santos is a sound editor and spatial audio mixer at Silver Sound. Slightly too interested in technology and workflow hacks, he spends most of his waking hours tweaking, fiddling and tinkering away on his computer.