What is Spatial Audio: The Complete 101 Guide for Artists And Musicians
Everything you need to know about Spatial Audio, immersive audio, Dolby Atmos, and what's needed to release your music in this format.
Do you want to give your listeners the ultimate aural experience? You've come to the right place!
What does Spatial Audio mean?
The term Spatial Audio may refer to different concepts in different contexts, but nowadays it is primarily associated with Apple's approach to creating an immersive audio experience.
So what, then, is immersive audio? This type of audio allows the listener to perceive sounds coming from all around them, including from above.
This distinction, the inclusion of sounds from above, sets immersive audio apart from traditional surround audio.
SURROUND VS. IMMERSIVE
As you probably know, conventional surround sound setups employ an arrangement of speakers around the listener, at ear level.
The most common configurations are 5 speakers in a 5.1 setup or 7 speakers in a 7.1 setup, with the ".1" referring to the subwoofer.
With setups like those, the user can perceive sounds coming from around them. Compared to stereo, surround allows for a richer experience, but it still lacks an important component that we experience every day in real life: height.
By including speakers above the head of the listener, immersive audio adds that crucial height component, which makes for a much more realistic, three-dimensional, and well, immersive, listening experience.
So, coming back to our original question “What is Spatial Audio”, Spatial Audio is the term used by Apple to describe its approach to creating an immersive and multidimensional listening experience.
As you can imagine, the technology needed to create such an audio experience is quite complex, and extensive research and development needs to be done. Instead of creating a whole new technology for immersive audio from scratch, Apple decided to license an existing one: Dolby Atmos.
You've probably heard the term Dolby Atmos before. If you've gone to the movies recently, then you've probably experienced it. Dolby Atmos was first introduced in movie theaters in 2012 and has become really popular since then.
By the time Apple introduced Spatial Audio in June 2021, Dolby Atmos was already well-established and widely used. So, it made sense for Apple to use this tried-and-true technology for their own Spatial Audio. Plus, Dolby Atmos has some other important features, which we'll talk about later.
How Does Spatial Audio / Dolby Atmos Work?
Although Spatial Audio is not exactly the same as Dolby Atmos (that's why Apple uses the wording "Spatial Audio, featuring Dolby Atmos" or "with support for Dolby Atmos"), for our purposes here we can treat the two terms as more or less interchangeable.
Let's see how Dolby Atmos works in different scenarios to better understand the technology and how it applies to music.
Dolby Atmos in Movie Theaters
As we just saw, Dolby Atmos uses special speakers placed above the listener (usually referred to as "ceiling", "height", or "Atmos" speakers) to add the illusion of sound coming from above, adding a third dimension to the sound. Combined with the other "side" speakers, and with Dolby Atmos supporting playback systems with up to 64 individual speakers, the precision with which sounds can be located is stunning.
To efficiently describe the number of speakers an Atmos system uses, you just add a third number to the number of surround speakers, to indicate the number of ceiling speakers:
For example, if you hear someone talking about a 5.1.2 setup, that means the system has five surround speakers, one subwoofer, and two speakers in the ceiling.
Other common setups for Dolby Atmos are 5.1.4, 7.1.2, and especially 7.1.4, which is the one Dolby recommends for professional studios mixing music in Dolby Atmos.
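If it helps to see the naming convention spelled out, here is a minimal Python sketch that splits a layout label into its three numbers. The names `SpeakerLayout` and `parse_layout` are made up for this illustration and are not part of any Dolby tool.

```python
from typing import NamedTuple


class SpeakerLayout(NamedTuple):
    """Speaker counts read from a label such as "7.1.4" (illustrative only)."""
    ear_level: int   # surround speakers at ear level
    subwoofers: int  # the ".1" in the label
    height: int      # ceiling / height speakers (0 for plain surround)

    @property
    def total(self) -> int:
        return self.ear_level + self.subwoofers + self.height


def parse_layout(label: str) -> SpeakerLayout:
    """Parse "5.1" or "5.1.2" style labels into speaker counts."""
    parts = label.strip().split(".")
    if len(parts) == 2:
        # Classic surround label with no height speakers, e.g. "5.1"
        parts.append("0")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"Unrecognized layout label: {label!r}")
    ear_level, subwoofers, height = (int(p) for p in parts)
    return SpeakerLayout(ear_level, subwoofers, height)


for label in ("5.1", "5.1.2", "7.1.4"):
    layout = parse_layout(label)
    print(label, "->", layout, "total speakers:", layout.total)
```

So the 7.1.4 setup recommended for music studios adds up to twelve speakers: seven at ear level, one subwoofer, and four overhead.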
Spatial Audio at home
Of course, hanging speakers from the ceiling is something most people won't be able to do at home. To overcome this, manufacturers have come up with an ingenious solution: up-firing speakers.
In a setup that includes up-firing speakers, some of the speakers are pointed at the ceiling, so the sound bounces off it and arrives at the listener from above.
This way, you can have the equivalent of ceiling speakers without needing to place actual speakers on the ceiling.
If you are thinking, "but those sounds will arrive delayed to the listener and will be out of sync with the other speakers", you are absolutely right.
The systems that include up-firing speakers usually also include algorithms to measure that delay and compensate for it on the other speakers.
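To get a feel for the size of that delay, here is a rough Python sketch that estimates it from room geometry. Real systems typically measure the delay rather than calculate it, so treat this purely as an illustration. The function name, the room dimensions, and the assumption of a single clean bounce off a flat ceiling are all made up for the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def bounce_delay_ms(distance_m, ceiling_m, speaker_height_m, ear_height_m):
    """Extra travel time (ms) of ceiling-bounced sound versus a straight line.

    Assumes one reflection off a flat ceiling; all heights are measured
    from the floor and distance_m is the horizontal distance to the listener.
    """
    direct = math.hypot(distance_m, speaker_height_m - ear_height_m)
    # Mirror-image trick: a single reflection travels as far as a straight
    # line to the listener's mirror image on the other side of the ceiling.
    up = ceiling_m - speaker_height_m
    down = ceiling_m - ear_height_m
    bounced = math.hypot(distance_m, up + down)
    return (bounced - direct) / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical living room: speaker 3 m away, 2.4 m ceiling, speaker and ears 1 m high
extra_ms = bounce_delay_ms(distance_m=3.0, ceiling_m=2.4,
                           speaker_height_m=1.0, ear_height_m=1.0)
print(f"The bounced sound arrives about {extra_ms:.1f} ms later")
```

In a room like this the difference is only a few milliseconds, which is small but enough to matter, and easy to compensate for by delaying the other speakers by the same amount.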
Spatial Audio on Headphones: Binaural Audio
And now we get to a crucial aspect of Spatial Audio: is it possible to recreate an immersive audio experience on headphones?
We can certainly perceive a three-dimensional aural experience with only two ears, so it follows that perhaps there's a way to recreate it using headphones?
Let's see how that could be possible, by performing a quick experiment:
Close your eyes, extend your arm, and snap your fingers in different positions in front of you, while moving your arm around.
Discounting the fact that you obviously knew where your arm was, on every snap you were probably able to discern by ear where the sound was coming from, even if you moved your arm up or down. How is that possible?
One of the clues our brain uses to determine the location of a sound source is time difference: when a sound is directly in front of you, it arrives at both ears at exactly the same time (green lines below).
If the sound moves to one side, it will arrive slightly earlier at the ear it is closest to (orange lines below), and the brain will interpret this difference in time and deduce the sound's location. Discrepancies of just a fraction of a millisecond make a difference.
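For the curious, this time difference can be approximated with a textbook formula that treats the head as a rigid sphere (often called Woodworth's formula). The Python sketch below is only an approximation under that assumption. The head radius is a commonly used average, not a measurement of your head.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air


def interaural_time_difference_s(azimuth_deg):
    """Approximate arrival-time difference between the two ears, in seconds.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    Spherical-head approximation, intended for angles from 0 to 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


for angle in (0, 30, 60, 90):
    itd_ms = interaural_time_difference_s(angle) * 1000.0
    print(f"{angle:>2} degrees -> {itd_ms:.2f} ms")
```

Even for a sound directly to one side, the difference stays well under a millisecond, which gives an idea of how fine-grained the brain's timing analysis is.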
But how can you tell the difference when an object is in front of you but perhaps above or below you?
The time difference to the ears would be the same whether the sound is at ear level or one meter above, but you still can hear a difference. How is it possible?
Here the brain uses other clues:
The unique shape of your torso, head, and outer ears causes sound to bounce and reach your eardrums in specific ways, resulting in slight variations for each direction and height from which the sound originates.
Over your lifetime, your brain has developed an understanding of these differences, allowing it to deduce the position and height of a sound by comparing the direct sound with the reflections from your body and outer ears.
This is why you may sometimes think you heard something in a certain location, only to find it in a different place. Your brain was misled by one or more reflections from other objects that created a familiar pattern.
If you place tiny microphones inside your ears and take a lot of measurements with sounds in hundreds of locations around you, you'll be able to create a "model" of how your particular auditory system hears sounds.
If you then apply that model to the sound coming from your headphones, you can simulate a three-dimensional space around you.
That model is called a Head-Related Transfer Function (HRTF). Scientists have created not just one of those models but thousands of them, and from those they have calculated a median or average HRTF.
And that's what Binaural Audio does: it applies an average Head-Related Transfer Function to the sound coming out of your headphones, to recreate how different sounds would sound when coming from different positions around you.
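In signal-processing terms, "applying" an HRTF usually means filtering the sound once per ear. In the time domain that is a convolution with a pair of impulse responses (one for the left ear, one for the right). The Python sketch below shows the idea with deliberately tiny, made-up impulse responses. Real ones come from measurements and are hundreds of samples long, and the function names here are only for illustration.

```python
def convolve(signal, impulse_response):
    """Plain time-domain convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def binauralize(mono, ir_left, ir_right):
    """Filter one mono source into a left/right headphone pair."""
    return convolve(mono, ir_left), convolve(mono, ir_right)


# Toy impulse responses for a source on the listener's left (invented values):
# the left ear hears it immediately at full level, the right ear hears it
# two samples later and quieter.
toy_ir_left = [1.0, 0.0, 0.0]
toy_ir_right = [0.0, 0.0, 0.6]

mono_source = [1.0, 0.5]
left, right = binauralize(mono_source, toy_ir_left, toy_ir_right)
print("left ear: ", left)
print("right ear:", right)
```

A binaural renderer does essentially this for every sound in the mix, picking the pair of filters that matches each sound's position.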
As you can imagine, this presents one problem: the effectiveness of the process, or how realistic that recreated 3D space sounds to you, depends very much on how close your personal auditory system is to that average HRTF.
This is why some people get their minds blown with Spatial Audio / Dolby Atmos on headphones using Binaural, and some others have a somewhat less impressive experience.
The good news: apps that allow you to create a personalized HRTF are already being introduced by Dolby, Apple, and other companies, so the experience will only become better and better in the coming years.
Creating Music For Spatial Audio
This all sounds pretty exciting, so what do you need to start creating music for Spatial Audio? The best part is you don't need any special tool or piece of gear to get your music into Spatial Audio.
If you've ever recorded music in the past, you can use those very same tools to create music that will be mixed in Spatial Audio. You can even create a Spatial Audio mix out of any existing multitrack recording you did some time ago.
The main difference when releasing music in Spatial Audio lies in the mixing process, so the great news is you can write and record music as usual, and then just hire a mixing engineer specialized in Spatial Audio.
(Of course, the format offers a lot of creative opportunities you'll want to take advantage of. We cover that in the next section below.)
The program your engineer will use to mix in Dolby Atmos is called Dolby Atmos Renderer, which is used in tandem with a "regular" DAW.
At the time of writing, the DAWs that support mixing in Dolby Atmos are Pro Tools, Nuendo, Logic Pro (v10.7 or later), Pyramix, DaVinci Resolve, and Ableton Live, but that list will inevitably expand in the near future.
While no new tools are required, the creative opportunities offered by the immersive medium are boundless, and you'll want to fully exploit them.
You'll probably find that the way you approach writing, arranging, recording, and producing music changes in significant ways. I certainly did!
One of the obvious benefits is the freedom to place instruments beyond the traditional left, right, or in-between positions. With a hemisphere surrounding the listener, the possibilities are far greater.
Take, for instance, the familiar technique of double-tracking elements like guitars or background vocals, and panning them hard left and right to create a spacious soundstage.
How about 13-times tracking a vocal part to have the listener seated in the middle of a towering stack of vocals during an a cappella segment? Maybe 13-times tracking each harmony in that vocal arrangement? 20 times each? More?
Or you could have the band positioned around the listener as if they were seated in the middle of your rehearsal space. Combined with a dry and intimate sound, the listening experience could be amazing. How about being surrounded by an orchestra?
The possibilities with immersive audio are phenomenal, and that's only considering static placement. The moment you start adding automation, the creative opportunities are stunning. With Spatial Audio, the sky is the limit, quite literally.
No matter what genre you work in, the palette of emotions and aural experiences you can convey with your music is significantly wider in immersive audio.
To bring back the main idea here, note how, in all those instances, the instruments can be recorded in regular mono or stereo tracks, and then they are placed in the correct 3D spot during the mixing process.
Of course, nothing stops you from doing an ambisonics recording to be used in Spatial Audio, but the point here is that you can do almost anything without needing new equipment. You can use the same tools you already know, just let your creativity run wild!
The current situation is very similar to what happened at the end of the 1950s when the transition from mono to stereo recordings occurred. At that time, there was a lot of experimentation, and you could easily find drums panned hard left, bass panned hard right, and tambourines flying around all over the place.
With time, some of those practices were discarded, and others became common practices, like placing the main elements of the mix in the center or double-tracking them if you want to pan them to the sides.
Numerous artists have embraced Atmos, exploring the limitless opportunities it offers. Some of their experiments are truly innovative, and only time will reveal which practices become the norm in this exciting new format.
And this brings us to the last section of this piece:
The Future Of Spatial Audio
It looks like Spatial Audio might be here to stay. It has the potential of becoming the standard audio format in the future, with all music being released in Atmos, and stereo versions only being added as a compatibility fallback.
You might be thinking there have already been many failed attempts at creating a standard immersive or surround format in the past. From Quadraphonic in the 70s to surround in 5.1, and then 7.1 and beyond, and many others in between… so why is Spatial Audio any different?
Let's explore two key elements that could determine whether Spatial Audio based on Dolby Atmos is widely accepted by the general public.
1. Ease of Adoption and Distribution
One important point is the user can experience Spatial Audio using many different devices, from headphones on a mobile device or computer to smart speakers, setups based on smart TVs with immersive-capable sound bars, home theater installations, and more.
This means the barrier to get started with Spatial Audio is much lower than in the past, where the user had to buy an ad-hoc surround speaker system and install it in a room that probably was not designed to house a surround setup. It was complicated, cumbersome, and usually expensive.
This is thanks to the fact that the Dolby Atmos format is playback-system agnostic.
There's only one file containing the Atmos mix you deliver, and depending on the system it is played back on, the mix is adapted on the fly to suit that setup, no matter whether that's headphones, a 5.1.2 soundbar, a bigger 9.2.6 home cinema, your brand new car, or anything beyond or in between.
In fact, for many users, acquiring the benefits of Spatial Audio was as simple as already owning compatible Apple devices, as it became available to them at no additional cost in November 2021. The significant involvement of Apple also greatly impacts the adoption of Dolby Atmos, as we'll see below.
Another important way in which Dolby Atmos reduces friction in its adoption is in the creation of the actual content.
The artist, producer, and mixing engineer need to create only a single Atmos master, which is then delivered to the different online distributors. There's no need to create different mixes for different setups, like doing a 5.1 mix, then a 7.1 mix, and so on.
If you've ever been involved in a surround project, you know how big of a time saver this is.
2. Apple is in, with both feet
Although Apple was not the first company to offer support for Dolby Atmos in its streaming service (Tidal has been supporting Dolby Atmos since May 2020), it is certainly one of the biggest advocates of the technology.
Spatial Audio has become one of Apple's most prominent selling points for many of its devices, including headphones, computers, and phones.
They also added support for Dolby Atmos to their professional DAW, Logic Pro X, at no extra cost. In an interview with Billboard in February 2022, Apple's vice president of Apple Music and Beats, Oliver Schusser, said that half of their user base was listening in Spatial Audio. That's just 8 months after Spatial Audio was introduced in Apple Music, and by the end of 2022, more than 80% of Apple Music users listened to Spatial Audio. Not too bad!
Soon after Apple announced support for Dolby Atmos and Lossless audio at no extra cost on its Apple Music service, Amazon Music followed suit in October 2021, and as you might expect, it is only a matter of time before Spotify and the rest of the players add support for it too.
Both Tidal and Amazon Music also support Sony 360 Reality Audio (perhaps the biggest rival for Dolby Atmos), but the fact that Apple chose Dolby Atmos for their Spatial Audio approach, and the heavy commitment they are showing to it, might spell disaster for Sony's technology.
One more clue as to the commitment of Apple to Spatial Audio is the development of their head tracking technology.
With some of their headphones, the movement of your head is followed in real-time. If you move your head, the virtual space around you will remain in its position as it would in real life.
In other words, say you are listening to music and you hear drums, vocals, and bass in front of you, at your 12 o'clock. If you turn your head to the left to look at your 9 o'clock, those instruments will now sound as if they are coming from your right (your previous 12, your new 3), staying in their original position instead of moving to the left along with you.
This increases the immersive experience, as that's exactly what would happen in real life.
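The core of the idea fits in one line of arithmetic: subtract the angle your head has turned from the angle of the sound in the room. The Python sketch below shows that calculation using the clock positions from the example. It only illustrates the geometry and is not how Apple's implementation works internally. The function names and the angle convention (0 degrees in front, positive angles to the right) are assumptions made for this example.

```python
def wrap_degrees(angle):
    """Wrap any angle into the range (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def perceived_azimuth(source_azimuth_deg, head_yaw_deg):
    """Where a fixed source should appear relative to the listener's nose.

    Both angles use the same convention: 0 is the original front,
    positive is to the right, negative is to the left.
    """
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)


def to_clock(azimuth_deg):
    """Convert an azimuth to the nearest clock position (12 = straight ahead)."""
    hour = round(azimuth_deg / 30.0) % 12
    return 12 if hour == 0 else hour


band_position = 0.0  # the band sits straight ahead, at 12 o'clock
head_yaw = -90.0     # the listener turns left to face the old 9 o'clock

relative = perceived_azimuth(band_position, head_yaw)
print(f"The band is now heard at {relative:.0f} degrees, "
      f"which is {to_clock(relative)} o'clock")
```

With the head turned 90 degrees to the left, the band that was at 12 o'clock is rendered at 3 o'clock, matching the example above.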
This combination of Spatial Audio and head tracking is fundamental to the success of what Apple considers the future of computing and entertainment: Apple Vision Pro.
Spatial Audio has the potential to become the standard audio format for music in the future.
The creative opportunities the format offers for music are huge, and you don't need any special equipment to start writing or recording music for Spatial Audio.
The format solves many of the important friction points that prevented other technologies from being widely adopted in the past, like the equipment needed for playback and the ease of distribution.
The fact that Apple is so invested in the format is a big factor. You may love or hate Apple, but they have more than 1.8 billion active devices. None of the surround or immersive formats in the past had such a big distribution chain. Not even a fraction.
Exciting times for sure!