Localisation with and without head rotationsΒΆ
The Two!Ears Auditory Model comes with several knowledge sources that work together to
estimate the perceived azimuth of a sound source, see
Localisation knowledge sources for a summary. The main work is done by the
GmtkLocationKS
knowledge source that uses ITD and ILD cues provided by the
Auditory front-end and compares them with learned cues to azimuth maps. As an output
it provides a probability distribution of possible directions for the source.
This will be passed on to the ConfusionKS
knowledge source which looks at
the probabilities and decides if a clear direction can be extracted from this.
If not, ConfusionSolvingKS
is called which then triggers RotationKS
to
rotate the head of the listener (could be in the simulation or of a robot) and
start the localisation process again.
In this example we will see how to set up the model to perform a localisation
task and how to switch on or off the possibility of the model to rotate its
head. This example can be found in the
examples/localisation_w_and_wo_head_movements
folder which consists of the
following files:
BlackboardNoHeadRotation.xml
Blackboard.xml
localise.m
SceneDescription.xml
The first file we look at is SceneDescription.xml
, it defines the actual
acoustic scene in which our virtual head and the sound source will be placed in
order to simulate binaural signals. It looks like this:
<?xml version="1.0" encoding="utf-8"?>
<scene
BlockSize="4096"
SampleRate="44100"
MaximumDelay="0.0"
NumberOfThreads="1"
LengthOfSimulation = "5"
HRIRs="impulse_responses/scut_kemar_anechoic/SCUT_KEMAR_anechoic_1m.sofa">
<source Radius="1.0"
Mute="false"
Type="point"
Name="SoundSource">
<buffer ChannelMapping="1"
Type="noise"/>
</source>
<sink Name="Head"
Position="0 0 0"
UnitX="1 0 0"
UnitZ="0 0 1"/>
</scene>
Here, we define basic things like the sampling rate, the length of the stimulus, the used HRTF, the source material, the listener position, and the distance between listener and source. For more documentation on specifying an acoustic scene, see Configuration using XML Scene Description.
Note
We don’t specify the exact source azimuth here, as we will choose different azimuth values later on and set them on the fly from within Matlab.
The next thing we have to do is to specify of what components or model should
consists and what it should actually do. This is done by selecting appropriate
modules for the Blackboard system stage of the model. This can be configured
again in a xml file. First we look at the configuration for localisation
including head movements (Blackboard.xml
):
<?xml version="1.0" encoding="utf-8"?>
<blackboardsystem>
<dataConnection Type="AuditoryFrontEndKS"/>
<KS Name="loc" Type="GmtkLocationKS">
<Param Type="char">default</Param>
</KS>
<KS Name="conf" Type="ConfusionKS"/>
<KS Name="confSolv" Type="ConfusionSolvingKS"/>
<KS Name="rot" Type="RotationKS">
<Param Type="ref">robotConnect</Param>
</KS>
<Connection Mode="replaceOld" Event="AgendaEmpty">
<source>scheduler</source>
<sink>dataConnect</sink>
</Connection>
<Connection Mode="replaceOld">
<source>dataConnect</source>
<sink>loc</sink>
</Connection>
<Connection Mode="add">
<source>loc</source>
<sink>conf</sink>
</Connection>
<Connection Mode="replaceOld" Event="ConfusedLocations">
<source>conf</source>
<sink>rot</sink>
</Connection>
<Connection Mode="add" Event="ConfusedLocations">
<source>conf</source>
<sink>confSolv</sink>
</Connection>
</blackboardsystem>
Here, we use different knowledge sources that work together in order to solve
the localisation task. We have AuditoryFrontEndKS
for extract auditory cues
from the ear signals, GmtkLocationKS
, ConfusionKS
, ConfusionSolvingKS
,
and RotationKS
for the actual localisation task. The Param
tags are
parameters we can pass to the knowledge sources. After setting up which
knowledge sources we will use, we connect them with the Connection
tags.
For more information on configuring the blackboard see
Setting up the blackboard.
In a second configuration file we setting up the same blackboard, but now
disabling its ability to turn the head (BlackboardNoHeadRotation.xml
):
<?xml version="1.0" encoding="utf-8"?>
<blackboardsystem>
<dataConnection Type="AuditoryFrontEndKS"/>
<KS Name="loc" Type="GmtkLocationKS">
<Param Type="char">default</Param>
</KS>
<KS Name="conf" Type="ConfusionKS">
<!-- Disable confusion solving (== no head rotation) -->
<Param Type="int">0</Param>
</KS>
<Connection Mode="replaceOld" Event="AgendaEmpty">
<source>scheduler</source>
<sink>dataConnect</sink>
</Connection>
<Connection Mode="replaceOld">
<source>dataConnect</source>
<sink>loc</sink>
</Connection>
<Connection Mode="add">
<source>loc</source>
<sink>conf</sink>
</Connection>
</blackboardsystem>
Now, everything is prepared and we can start Matlab in order to perform the localisation. You can just start it and run the following command to see it in action, afterwards we will have a look at what happened:
>> localise
------------------------------------------------------------------
Source direction Model w head rot. Model wo head rot.
------------------------------------------------------------------
0 1 176
33 34 33
76 75 72
-121 -119 -111
------------------------------------------------------------------
As you can see the model with head rotation returned better results as the model without head rotation enabled. The reason why we have problems without head rotations is that we have trained the model with another HRTF data set (QU KEMAR) as we used for the creation of the acoustic scene (SCUT KEMAR).
Now, we have a look into the details of the localise()
function. We will
only talk about the parts that are responsible for the task, not for printing
out the results onto the screen. First we define the source angles we are going
to synthesise and start the Binaural simulator:
% Different angles the sound source is placed at
sourceAngles = [0 33 76 239];
% === Initialise binaural simulator
sim = simulator.SimulatorConvexRoom('SceneDescription.xml');
sim.Verbose = false;
sim.Init = true;
After that we have a loop over the different source angles in which we are setting the source position in the Binaural simulator and run two different blackboards after each other, one with, the other one without head rotations:
for direction = sourceAngles
sim.Sources{1}.set('Azimuth', direction);
sim.rotateHead(0, 'absolute');
sim.ReInit = true;
% GmtkLocationKS with head rotation for confusion solving
bbs = BlackboardSystem(0);
bbs.setRobotConnect(sim);
bbs.buildFromXml('Blackboard.xml');
bbs.run();
% Reset binaural simulation
sim.rotateHead(0, 'absolute');
sim.ReInit = true;
% GmtkLocationKS without head rotation and confusion solving
bbs = BlackboardSystem(0);
bbs.setRobotConnect(sim);
bbs.buildFromXml('BlackboardNoHeadRotation.xml');
bbs.run();
end