Is first-order Ambisonics adequate for 3D?

It is commonly assumed that Ambisonics is the method of choice for 3D audio and VR. The professional recording engineer would do well to examine the situation more closely.

Ambisonics, which dates back to the 1970s, is a technology for representing and reproducing the sound field at a given point. But just as with wavefield synthesis, it functions only at a certain spatial resolution or "order". For this reason, we generally distinguish today between "first-order" Ambisonics and "higher-order" Ambisonics ("HOA").

First-order Ambisonics cannot achieve error-free audio reproduction, since the mathematics on which it is based is valid only for a listening space the size of a tennis ball. Thus the laws of stereophony apply here: a microphone for first-order Ambisonics is nothing other than a coincident microphone, with the well-known advantages (simplicity; small number of recording channels; flexibility) and disadvantages (very wide, imprecise phantom sound sources; deficient spatial quality) of that approach in general.

Creating an Ambisonics studio microphone with high spatial resolution remains an unsolved problem. Existing Ambisonics studio microphones are all first-order, so their resolution is just adequate for 5.1 surround but too low for 3D audio. This becomes evident in their low inter-channel signal separation as well as in the insufficient quality of their reproduced spatiality.

The original first-order Ambisonics microphone was the Soundfield microphone. The Tetramic and the new Sennheiser VR microphone, for example, are built the same way. The Schoeps "Double M/S System" works in similar fashion, but without the height channel.

Ambisonics is very well suited as a storage format for all kinds of spatial signals, but again, only if the order is high enough. A storage format with only four channels (first-order Ambisonics calls them W, X, Y, Z) makes a soup out of any 3D recording, since the mixdown to four channels destroys the signal separation of the 3D setup.
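To make the four-channel format concrete, here is a minimal sketch of how a single mono sample could be encoded into W, X, Y and Z. It assumes the traditional B-format weighting, in which W is attenuated by 1/√2; other conventions scale W differently. The function name `encode_first_order` is hypothetical and used only for illustration.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumption: traditional B-format weighting (W scaled by 1/sqrt(2)),
    azimuth measured counter-clockwise from the front, elevation upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return (w, x, y, z)

# A source straight ahead and a source hard left, both at ear height.
front = encode_first_order(1.0, 0.0, 0.0)
left = encode_first_order(1.0, 90.0, 0.0)
print(front)
print(left)
```

However many sources or microphone channels a 3D recording contains, all of them end up summed into these same four signals, which is where the loss of separation comes from.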

Ambisonics offers a simple, flexible storage and recording format for interactive 360° videos, e.g. on YouTube. To rotate the perspective, the Ambisonics channels only need to be recombined with rotation-dependent weights; no new recording or mix is required. Together with the previously mentioned small first-order Ambisonics microphones, this makes it very easy to produce 360° videos using small, portable cameras.
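The rotation of the perspective can be sketched as follows for a turn about the vertical axis (yaw). This is an illustrative sketch, not the code of any particular player; the function name `rotate_yaw` and the sign convention (positive angles turn the sound field counter-clockwise as seen from above) are assumptions. To compensate for a viewer turning by some angle, a player would apply the opposite angle.

```python
import math

def rotate_yaw(bformat, yaw_deg):
    """Rotate a first-order sound field about the vertical axis.

    bformat is a (W, X, Y, Z) tuple. W (omni) and Z (height) are
    unaffected by a yaw rotation; only X and Y are mixed.
    """
    w, x, y, z = bformat
    a = math.radians(yaw_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return (w, x_rot, y_rot, z)

# A source encoded straight ahead (X = 1, Y = 0) ...
front = (1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)
# ... appears hard left after the field is turned by 90 degrees.
rotated = rotate_yaw(front, 90.0)
print(rotated)
```

The whole operation is a 2×2 rotation applied per sample to X and Y, which is why it is cheap enough for interactive playback.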

For virtual reality the situation is different, however. The acoustical background signal of a scene is generally produced by "binauralizing" the output of a virtual loudspeaker setup, e.g. a cube-shaped arrangement of eight virtual loudspeakers. The signals for this setup are static; turning one's head should not cause the room to spin. Instead, head tracking causes the corresponding HRTFs to be dynamically exchanged, just as with any other audio object in the VR scene.

As a result, most of the advantages of first-order Ambisonics do not come into play in VR. On the contrary, its disadvantages (poor spatial quality, crosstalk among virtual loudspeaker signals) only become more prominent.
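The crosstalk among virtual loudspeaker signals can be estimated with a small calculation. The sketch below assumes a cube of eight virtual loudspeakers and a simple decoder that points one virtual cardioid at each loudspeaker; real renderers may use other decoder designs, so the numbers are indicative only. All function names are hypothetical.

```python
import math
from itertools import product

def unit(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def encode_direction(direction):
    """First-order B-format (W, X, Y, Z) of a unit-amplitude source.

    Assumption: traditional weighting with W scaled by 1/sqrt(2).
    """
    dx, dy, dz = unit(direction)
    return (1.0 / math.sqrt(2.0), dx, dy, dz)

def decode_cardioid(bformat, speaker_dir):
    """Signal of a virtual cardioid aimed at one loudspeaker direction."""
    w, x, y, z = bformat
    ux, uy, uz = speaker_dir
    return 0.5 * (math.sqrt(2.0) * w + x * ux + y * uy + z * uz)

# Eight virtual loudspeakers at the corners of a cube.
CUBE = [unit(corner) for corner in product((1.0, -1.0), repeat=3)]

# Source located exactly in the direction of the first loudspeaker.
source = encode_direction((1.0, 1.0, 1.0))
gains = [decode_cardioid(source, spk) for spk in CUBE]

# Level of the loudest neighbouring loudspeaker relative to the target one.
ranked = sorted(gains, reverse=True)
separation_db = 20.0 * math.log10(ranked[1] / ranked[0])
print([round(g, 3) for g in gains])
print(round(separation_db, 2), "dB")
```

Under these assumptions the three adjacent loudspeakers each receive two thirds of the target loudspeaker's amplitude, i.e. they are only about 3.5 dB quieter, even though the source sits exactly in one loudspeaker's direction.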

If practical conditions allow for a slightly larger microphone arrangement, an ORTF-3D setup would be the optimal choice of ambience microphone for VR instead.

  • 3D Audio

    The new approaches included in "3D Audio" reproduce sound from all spatial directions. This includes the Dolby Atmos and Auro3D stereophonic systems; binaural / virtual reality ("VR") systems; and soundfield synthesis approaches such as Ambisonics and wavefield synthesis systems. 3D Audio can give distinctly better spatial perceptions than 5.1. Not only is the elevation of sound sources reproduced, but noticeable improvements can also be achieved with regard to envelopment, naturalness, and accuracy of tone color. The listening area can also be greater; listeners can move more freely within the playback room without hearing the image collapse into the nearest loudspeaker.
  • 32nd TEC Award: ORTF-3D
