Localisation knowledge sources

Four kinds of knowledge source work together to generate hypotheses about sound source azimuths: the Location knowledge source (available in DNN-based and GMM-based implementations), the Confusion Detection knowledge source, the Confusion Solving knowledge source, and the Head Rotation knowledge source.

Location knowledge source: DnnLocationKS

Class DnnLocationKS implements knowledge about the statistical relationship between spatial cues and azimuth locations using deep neural networks (DNNs). Currently the DNNs are trained on binaural cues from the Auditory front-end, including cross-correlation function (CCF) and interaural level difference (ILD) cues, as described in more detail in [MaEtAl2015dnn].

This knowledge source requires signals from the Auditory front-end and thus inherits from AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS); it needs to be bound to the AuditoryFrontEndKS’s KsFiredEvent. The canExecute precondition checks the energy level of the current signal block, so that localisation takes place only if there is an actual auditory event. After execution, a SourcesAzimuthsDistributionHypothesis containing a probability distribution over azimuth locations is placed on the blackboard (category sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is notified; a sketch of how such a distribution can be formed follows the interface summary below.

binds to AuditoryFrontEndKS.KsFiredEvent
writes data category sourcesAzimuthsDistributionHypotheses
triggers event KsFiredEvent
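
The following MATLAB sketch illustrates one way the per-frame, per-channel DNN posteriors could be integrated into a single azimuth distribution, in the spirit of [MaEtAl2015dnn]. The function name and the assumed array layout are illustrative, not the shipped implementation:

    function p = integrateAzimuthPosteriors(posteriors)
    % posteriors: [nFrames x nChannels x nAzimuths] DNN outputs (assumed layout)
    p = squeeze(mean(mean(posteriors, 1), 2));  % average over time and frequency
    p = p / sum(p);                             % renormalise to a distribution
    end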

Location knowledge source: GmmLocationKS

Class GmmLocationKS implements knowledge about the statistical relationship between spatial cues and azimuth locations. Currently the relationship is modelled with Gaussian mixture models (GMMs), which are trained on binaural cues from the Auditory front-end, including interaural time difference (ITD) and interaural level difference (ILD) cues.

Like DnnLocationKS, this knowledge source requires signals from the Auditory front-end and thus inherits from AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS); it needs to be bound to the AuditoryFrontEndKS’s KsFiredEvent. The canExecute precondition checks the energy level of the current signal block, so that localisation takes place only if there is an actual auditory event. After execution, a SourcesAzimuthsDistributionHypothesis containing a probability distribution over azimuth locations is placed on the blackboard (category sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is notified; a sketch of how GMMs can score candidate azimuths follows the interface summary below.

binds to AuditoryFrontEndKS.KsFiredEvent
writes data category sourcesAzimuthsDistributionHypotheses
triggers event KsFiredEvent
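
As an illustration only, the following sketch scores a block of [ITD, ILD] observations against one diagonal-covariance GMM per candidate azimuth and normalises the result into an azimuth distribution. The function name and the gmms structure are assumptions, not the actual GmmLocationKS code:

    function p = gmmAzimuthDistribution(features, gmms)
    % features: [nFrames x 2] ITD/ILD observations (assumed layout)
    % gmms: struct array, one entry per azimuth, with fields w, mu, sigma2
    logL = zeros(numel(gmms), 1);
    for a = 1:numel(gmms)
        g = gmms(a);
        like = zeros(size(features, 1), 1);
        for k = 1:numel(g.w)   % sum over mixture components
            d = (features - g.mu(k, :)).^2 ./ g.sigma2(k, :);
            like = like + g.w(k) * exp(-0.5 * sum(d, 2)) ...
                   / sqrt(prod(2 * pi * g.sigma2(k, :)));
        end
        logL(a) = sum(log(like + eps));   % frames treated as independent
    end
    p = exp(logL - max(logL));            % numerically stable normalisation
    p = p / sum(p);
    end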

Confusion detection knowledge source: ConfusionKS

The ConfusionKS checks new location hypotheses and decides whether there is a confusion. A confusion emerges when there are more valid locations in a hypothesis than assumed auditory sources in the scene. In case of a confusion, a ConfusedLocations event is notified and the responsible location hypothesis is placed on the blackboard in the confusionHypotheses category. Otherwise, a PerceivedAzimuth object is added to the blackboard perceivedAzimuths data category, and the standard KsFiredEvent is triggered. A sketch of the underlying decision follows the interface summary below.

binds to {Gmm|Dnn}LocationKS.KsFiredEvent
reads data category sourcesAzimuthsDistributionHypotheses
writes data category confusionHypotheses or perceivedAzimuths
triggers event ConfusedLocations or KsFiredEvent
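
A minimal sketch of the confusion test, under the assumption that valid locations are local peaks of the azimuth distribution above a probability threshold; the function and its parameters are illustrative:

    function confused = isConfused(p, nAssumedSources, threshold)
    % p: column vector with the probability of each candidate azimuth
    isPeak = p > threshold ...
             & p >= [p(1); p(1:end-1)] ...   % not smaller than left neighbour
             & p >= [p(2:end); p(end)];      % not smaller than right neighbour
    confused = nnz(isPeak) > nAssumedSources;
    end

If confused is true, the hypothesis would go to the confusionHypotheses category and ConfusedLocations would be notified; otherwise the strongest peaks can be turned into PerceivedAzimuth objects.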

Confusion solving knowledge source: ConfusionSolvingKS

The ConfusionSolvingKS solves localisation confusions by predicting the location probability distribution after head rotation and comparing it with new location hypotheses received once the rotation is completed. The canExecute precondition waits for a new location hypothesis; when one arrives, it additionally checks whether the head has actually been turned, and the knowledge source does not execute otherwise. The confusion is then solved by comparing the old and the new location hypotheses, and a PerceivedAzimuth object is placed on the blackboard; a sketch of the comparison follows the interface summary below.

binds to ConfusionKS.ConfusedLocations
reads data category confusionHypotheses, headOrientation and sourcesAzimuthsDistributionHypotheses
writes data category perceivedAzimuths
triggers event KsFiredEvent
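
The comparison can be pictured as follows: after the head has turned by a known angle, a true source should reappear shifted by that angle, while a phantom location (e.g. a front-back mirror image) should not. The sketch below, with assumed names, a simple sign convention, and a nearest-bin match, keeps the candidate that is best supported by the new hypothesis:

    function azimuth = solveConfusion(candidates, rotation, newAzimuths, newP)
    % candidates:  confused source azimuths before rotation (deg, head frame)
    % rotation:    executed head rotation (deg)
    % newAzimuths: candidate azimuths of the new hypothesis (deg)
    % newP:        matching probabilities of the new hypothesis
    predicted = mod(candidates - rotation + 180, 360) - 180;  % expected positions
    score = zeros(size(predicted));
    for i = 1:numel(predicted)
        d = mod(newAzimuths - predicted(i) + 180, 360) - 180; % wrapped difference
        [~, idx] = min(abs(d));                               % nearest azimuth bin
        score(i) = newP(idx);        % support for this candidate after rotation
    end
    [~, best] = max(score);
    azimuth = candidates(best);
    end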

Head rotation knowledge source: RotationKS

The RotationKS has knowledge about how to move the robotic head in order to solve confusions in source localisation. If no other head rotation is already scheduled, the knowledge source uses the robot interface to turn the head; a sketch of this decision follows the interface summary below.

binds to ConfusionKS.ConfusedLocations
reads data category confusionHypotheses, headOrientation
writes data category headOrientation
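
A minimal sketch of this behaviour is given below; robot.rotateHead and the 'relative' reference flag mirror the kind of call a robot interface offers, but the exact signature and the default angle are assumptions:

    function scheduleRotation(robot, rotationPending, angle)
    % Turn the head only if no other rotation is already scheduled.
    if nargin < 3
        angle = 20;   % assumed default exploratory rotation (deg)
    end
    if ~rotationPending
        robot.rotateHead(angle, 'relative');   % hypothetical interface call
    end
    end
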
[MaEtAl2015dnn] Ma, N., Brown, G. J., and May, T. (2015) Robust localisation of multiple speakers exploiting deep neural networks and head movements. Proceedings of Interspeech'15, pp. 3302-3306, Dresden, Germany.