Setting up an acoustic scene
Contemporary auditory models generally work in a feed-forward manner: pre-recorded or pre-generated signals are simply fed into the model, which computes and returns its results.
The Two!Ears Auditory Model behaves quite differently in this respect: our framework allows you to actively explore the provided acoustic scene. The model's feedback mechanisms could, for instance, initiate a rotation of the audio capture unit (dummy head) in order to disambiguate complex input and arrive at a more precise scenario analysis. Such active exploration can be achieved, on the one hand, in the real world by using a dummy head to record binaural ear signals on the fly. Such a dummy head should have a movable head and be mounted on a robotic platform to allow for translatory movements; for an introduction, see Use a robotic platform. On the other hand, the task can be addressed virtually by setting up a simulated acoustic scene in the Binaural simulator. Below, we focus on the latter method.
There are two ways in which the Binaural simulator can create the ear signals. One is to specify the entire acoustic scene using meta-data, pass appropriate HRTFs to the simulator, and render the scenario with the simulator's Binaural renderer component. The other is to use a mixture of meta-data and pre-recorded scene details, represented by BRIR or BRS files [Horbach1999]. The Binaural room scanning renderer can use this information (BRIRs) to represent sources placed in a room. Going a step further, BRS files enable the simulation of complex scenarios with loudspeaker arrays driven by spatial audio reproduction techniques. With this method, the complete synthesised sound field of those reproduction methods can be investigated via binaural synthesis, without having to set up all the required loudspeakers.
These files can also be used in perceptual experiments with dynamic binaural synthesis (see for example [Wierstorf2014]); here, our model could help to predict the experimental outcome.
Binaural renderer

In the first example, a KEMAR HRTF data set is used together with a sound stimulus (cello) as a static source to the left of the listener. Such an anechoic scene can be defined via the following XML file. The file is called binaural_renderer.xml and can be found in the
<?xml version="1.0" encoding="utf-8"?>
<scene Renderer="ssr_binaural"
       BlockSize="4096"
       SampleRate="44100"
       HRIRs="impulse_responses/qu_kemar_anechoic/QU_KEMAR_anechoic_3m.sofa">
  <source Name="Cello"
          Type="point"
          Position="1 2 1.75">
    <buffer ChannelMapping="1"
            Type="fifo"
            File="stimuli/anechoic/instruments/anechoic_cello.wav"/>
  </source>
  <sink Name="Head"
        Position="0 0 1.75"
        UnitX="1 0 0"
        UnitZ="0 0 1"/>
</scene>
The file specifies the renderer type to be used, the simulator's processing block size (the Two!Ears Auditory Model works in a completely block-based manner), the HRTF data set to be loaded from the Database, the audio material provided, and the positions of the sources (here: cello) and sinks (here: the KEMAR dummy head).
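As a quick sanity check of the scene geometry, the source azimuth relative to the head can be computed directly from the positions given in the scene file; this is plain Matlab and independent of the framework:

```matlab
% Positions as given in binaural_renderer.xml
src  = [1 2 1.75];   % cello
head = [0 0 1.75];   % KEMAR dummy head, looking along UnitX = [1 0 0]

d  = src - head;            % direction from head to source
az = atan2d(d(2), d(1))     % azimuth relative to the look direction (+x)
% az is approx. 63.4 degrees, i.e. the cello sits to the front-left
```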
With that, the scene is defined. To test the correctness of the scene description, without having to invoke the full Two!Ears model framework, the binaural simulation can be run in a standalone version using the following statements:
>> sim = simulator.SimulatorConvexRoom('binaural_renderer.xml');
>> sim.set('Init', true);
>> signal = sim.getSignal();
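Since the framework is block-based, the signal can also be pulled from the simulator chunk by chunk rather than in one go. The following is only a sketch: the isFinished method and the block-length argument to getSignal are assumptions here, so check the Binaural simulator documentation for the exact signatures:

```matlab
>> % Sketch of block-wise processing (method names are assumptions)
>> while ~sim.isFinished()
>>     block = sim.getSignal(4096);  % fetch the next block of samples
>>     % ... pass the block on to further processing stages ...
>> end
```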
This yields a variable named signal which stores the simulated binaural audio data. Matlab can replay this signal via:

>> sound(signal, sim.SampleRate);

The data can also be stored in a file, for example via Matlab's audiowrite function:

>> audiowrite('binaural_renderer.wav', signal, sim.SampleRate);

To finish the simulation and clean up temporary files, type:

>> sim.set('ShutDown', true);
Binaural room scanning renderer
In the second example, the BRS renderer uses BRIR recordings to simulate a source placed within a (reverberant) room. Again, the scene is defined via an XML file, this time named brs_renderer.xml:
<?xml version="1.0" encoding="utf-8"?>
<scene Renderer="ssr_brs"
       BlockSize="4096"
       SampleRate="44100"
       LengthOfSimulation="5.0">
  <source Type="point"
          Name="SoundSource"
          IRs="impulse_responses/qu_kemar_rooms/auditorium3/QU_KEMAR_Auditorium3_src2_xs+4.30_ys+3.42.sofa">
    <buffer ChannelMapping="1"
            Type="noise"/>
  </source>
  <sink Name="Head"
        Position="0 0 0"
        UnitX="0 1 0"
        UnitZ="0 0 1"/>
</scene>
The renderer type is set to ssr_brs and the BRIR is specified within the <source> section. A source position is no longer required, as it is inherently given by the corresponding BRIR measurement.
Note that the KEMAR dummy head was looking towards the y-axis during the BRIR measurement. In the scene description, the UnitX vector defines the looking direction of the virtual head; thus it is set to [0 1 0].
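With the head at the origin looking along the positive y-axis, the source coordinates encoded in the BRIR file name (xs = +4.30, ys = +3.42) can be used to double-check the expected direction; again, this is plain Matlab:

```matlab
% Head at [0 0 0] with UnitX = [0 1 0], i.e. forward = +y, right = +x
src = [4.30 3.42];            % xs/ys from the BRIR file name
az  = atan2d(src(1), src(2))  % angle to the right of the look direction
% az is approx. 51.5 degrees, so the source lies to the front-right
```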
The audio material used is synthesised by the simulator's built-in white noise generator. Since this generates an infinitely long signal, the LengthOfSimulation property has to be specified.
The following commands let you listen to the final simulation; the perceptual impression should be that of a noise source placed in a larger room to the front-right of the listener:
>> sim = simulator.SimulatorConvexRoom('brs_renderer.xml',1);
>> signal = sim.getSignal();
>> sound(signal, sim.SampleRate);
Note that the sim.set('Init', true) line has been omitted here; the 1 passed as the second argument during simulator initialisation serves the same purpose.
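To make the equivalence explicit, the shorthand above corresponds to constructing the simulator first and initialising it in a separate step, as in the first example:

```matlab
>> sim = simulator.SimulatorConvexRoom('brs_renderer.xml');
>> sim.set('Init', true);
>> signal = sim.getSignal();
>> sound(signal, sim.SampleRate);
```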
[Horbach1999] Horbach, U., Karamustafaoglu, A., Pellegrini, R., Mackensen, P., Theile, G. (1999), "Design and Applications of a Data-based Auralization System for Surround Sound," 106th AES Convention, Paper 4976

[Wierstorf2014] Wierstorf, H. (2014), "Perceptual Assessment of Sound Field Synthesis," PhD thesis, TU Berlin