Localisation knowledge sources¶
Four knowledge sources work together to generate hypotheses of sound source
azimuths: Location knowledge source, Confusion Detection knowledge
source, Confusion Solving knowledge source, and Head Rotation knowledge
source.
Location knowledge source: DnnLocationKS¶
Class DnnLocationKS implements knowledge about the statistical relationship
between spatial cues and azimuth locations using DNNs. Currently the DNNs are trained on binaural cues from the Auditory front-end including CCF and ILD cues, as
described in more details in [MaEtAl2015dnn].
This knowledge source requires signals from the Auditory front-end and thus inherits from the
AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and
needs to be bound to the AuditoryFrontEndKS’s KsFiredEvent. The
canExecute precondition checks the energy level of the current signal block
and localisation takes place only if there is an actual auditory event. After
execution, a SourcesAzimuthsDistributionHypothesis containing a probability
distribution of azimuth locations is placed on the blackboard (category
sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is
notified.
| binds to | AuditoryFrontEndKS.KsFiredEvent |
| writes data category | sourcesAzimuthsDistributionHypotheses |
| triggers event | KsFiredEvent |
Location knowledge source: GmmLocationKS¶
Class GmmLocationKS implements knowledge about the statistical relationship
between spatial cues and azimuth locations. Currently we model the relationship
using GMMs, which are trained on binaural cues from the Auditory front-end including
ITD and ILD cues.
This knowledge source requires signals from the Auditory front-end and thus inherits from the
AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and
needs to be bound to the AuditoryFrontEndKS’s KsFiredEvent. The
canExecute precondition checks the energy level of the current signal block
and localisation takes place only if there is an actual auditory event. After
execution, a SourcesAzimuthsDistributionHypothesis containing a probability
distribution of azimuth locations is placed on the blackboard (category
sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is
notified.
| binds to | AuditoryFrontEndKS.KsFiredEvent |
| writes data category | sourcesAzimuthsDistributionHypotheses |
| triggers event | KsFiredEvent |
Confusion detection knowledge source: ConfusionKS¶
The ConfusionKS checks new location hypotheses and decides whether
there is a confusion. A confusion emerges when there are more valid locations in
the hypotheses than assumed auditory sources in the scene. In case of a
confusion, a ConfusedLocations event is notified and the responsible
location hypothesis is placed on the blackboard in the confusionHypotheses
category. Otherwise, a PerceivedAzimuth object is added to the blackboard
perceivedAzimuths data category, and the standard event is triggered.
| binds to | {Gmm|Dnn}LocationKS.KsFiredEvent |
| reads data category | sourcesAzimuthsDistributionHypotheses |
| writes data category | confusionHyptheses or perceivedAzimuths |
| triggers event | ConfusedLocations or KsFiredEvent |
Confusion solving knowledge source: ConfusionSolvingKS¶
The ConfusionSolvingKS solves localisation confusions by predicting the
location probability distribution after head rotation, and comparing it with new
location hypotheses received after head rotation is completed. The
canExecute method will wait for new location hypotheses; when there is one,
it will check whether the head has been turned, otherwise it will not execute.
The confusion is then solved by using the old and the new location hypothesis,
and a PerceivedAzimuth object is placed on the blackboard.
| binds to | ConfusionDetectionKS.ConfusedLocations |
| reads data category | confusionHypotheses, headOrientation and sourcesAzimuthsDistributionHypotheses |
| writes data category | perceivedAzimuths |
| triggers event | KsFiredEvent |
Head rotation knowledge source: RotationKS¶
The RotationKS has knowledge on how to move the robotic head in order to
solve confusions in source localisation. If there is no other head rotation
already scheduled, the knowledge source uses the robot interface to turn the
head.
| binds to | ConfusionKS.ConfusedLocations |
| reads data category | confusionHypotheses, headOrientation |
| writes data category | headOrientation |
| [MaEtAl2015dnn] | Ma, N., Brown, G. J. and May, T. (2015) Robust localisation of of multiple speakers exploiting deep neural networks and head movements. Proceedings of Interspeech‘15, pp.3302-3306, Dresden, Germany |