Knowledge sources¶
In this section we describe the knowledge sources that are currently available in the Blackboard system. These are:
- Abstract knowledge source
- Auditory front-end knowledge source: AuditoryFrontEndKS
- Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS
- Localisation knowledge sources
- Identification knowledge sources
- Sound quality related knowledge sources
- Segmentation knowledge sources
- Obsolete knowledge sources
- Upcoming knowledge sources
Abstract knowledge source¶
The abstract knowledge source (class AbstractKS) is the base class for all knowledge sources in the blackboard system. The corresponding file AbstractKS.m is located in the src/blackboard_core directory, whereas all implementations of actual knowledge sources are located in the src/knowledge_sources directory. The listing below shows the parts most relevant to the development of new knowledge sources:
classdef AbstractKS < handle
    properties
        blackboard;
        blackboardSystem;
        invocationMaxFrequency_Hz;
        trigger;
    end

    events
        KsFiredEvent
    end

    methods (Abstract)
        canExecute(obj)
        execute(obj)
    end

    methods
        focus(obj)
        unfocus(obj)
    end
end
There are different aspects of functionality in this interface.
- Data access: Knowledge sources have a handle to the blackboard. Through this handle, data can be placed on and retrieved from the blackboard.
- System setup: Through the handle blackboardSystem, knowledge sources get access to the methods for adding and removing other knowledge sources, and also access to the BlackboardMonitor.
- Execution properties: The property invocationMaxFrequency_Hz specifies how often this knowledge source is allowed to be executed. The methods focus and unfocus give access to the attentional priority of the knowledge source, which influences its relative importance when competing for computing resources with other knowledge sources. See Section Dynamic blackboard scheduler for a description of scheduling.
- Execution conditional: The abstract method canExecute must be implemented by the inheriting knowledge source. It is called by the scheduler when the knowledge source is next in the schedule, before actually executing. If this method returns false, execution will not be performed. The second output argument of this method indicates whether the knowledge source should remain in the agenda or be removed.
- Execution: The main functionality of any knowledge source is implemented in the method execute. A knowledge source gets executed by the scheduler if its maximum invocation frequency would not be exceeded and its canExecute method returns true. In this method, a knowledge source gets access to its trigger, a structure that contains information about the triggering event, the triggering source, and an argument the trigger source placed for usage by sinks.
- Events: Knowledge sources can define their own individual events. However, each class already inherits a standard event from AbstractKS, KsFiredEvent. Events can be triggered by knowledge sources via obj.notify(eventname, attachedData). A minimal example subclass illustrating these aspects is sketched after this list.
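To make this interface concrete, the following is a minimal sketch of a custom knowledge source inheriting from AbstractKS. The class name, the data category, and the attached hypothesis value are invented for illustration, and the blackboard handle's addData method is assumed here as the way to place data:

classdef MyExampleKS < AbstractKS
    % Hypothetical knowledge source, for illustration only.
    methods
        function obj = MyExampleKS()
            % Allow execution at most once per second.
            obj.invocationMaxFrequency_Hz = 1;
        end

        function [bExecute, bWait] = canExecute(obj)
            bExecute = true;   % always ready to run
            bWait = false;     % do not remain in the agenda otherwise
        end

        function execute(obj)
            % Place an (invented) hypothesis on the blackboard and
            % signal bound sinks via the inherited standard event.
            obj.blackboard.addData('myExampleHypotheses', 42);
            notify(obj, 'KsFiredEvent');
        end
    end
end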
Auditory front-end knowledge source: AuditoryFrontEndKS¶
This knowledge source integrates the Auditory front-end into the Blackboard system. The Auditory front-end itself is a self-contained module and this section focuses on its integration within the framework.
The Auditory front-end knowledge source is connected to the blackboard and the robot interface by registering itself in the system via BlackboardSystem.setDataConnect. Upon construction, the Auditory front-end dataObject and managerObject are instantiated and connected to the ear signal stream of the robot interface. The maximum invocation frequency of the AuditoryFrontEndKS is set to infinity. Execution mainly consists of getting the latest chunk of ear signal data, processing it through the Auditory front-end, and notifying a KsFiredEvent.
Other knowledge sources can register requests with the Auditory front-end indirectly, by inheriting from the AuditoryFrontEndDepKS class (see Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and binding to its KsFiredEvent.
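In practice, this registration is part of setting up a BlackboardSystem. The following sketch assumes a robot (or binaural simulator) interface object named robot; the verbosity flag and the variable names are illustrative:

% Sketch: registering the AuditoryFrontEndKS in a blackboard system.
bbs = BlackboardSystem(0);                 % 0: non-verbose operation (assumed flag)
bbs.setRobotConnect(robot);                % connect the robot/ear-signal interface
bbs.setDataConnect('AuditoryFrontEndKS');  % instantiate and register the AFE KS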
Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS¶
Whenever a knowledge source needs signals, cues or features from the auditory front-end, it should subclass the AuditoryFrontEndDepKS class. For any knowledge source added to the blackboard through the BlackboardSystem methods addKS or createKS, these signal requests are registered automatically with the Auditory front-end.
- Setting up the requests: Inheriting knowledge sources need to pass their requests in the call to the super-constructor: obj@AuditoryFrontEndDepKS(requests), with requests being a cell array of structures, each with the fields name, stating the requested signal name, and params, specifying the signal parameters (have a look at Available processors). An example looks like this:

  requests{1}.name = 'modulation';
  requests{1}.params = genParStruct( ...
      'nChannels', obj.amFreqChannels, ...
      'am_type', 'filter', ...
      'am_nFilters', obj.amChannels ...
      );
  requests{2}.name = 'ratemap_magnitude';
  requests{2}.params = genParStruct( ...
      'nChannels', obj.freqChannels ...
      );
  The params field always needs to be populated by a call to the genParStruct function.
- Accessing signals: The requested signals can then be accessed by the knowledge source via the inherited getAFEdata method, which returns a map (with the indexes from the request structure as keys) of handles to the actual signals. An example, following the requests example above, looks like this:

  afeData = obj.getAFEdata();
  modSobj = afeData(1);
  rmSobj = afeData(2);
  rmBlock = rmSobj.getSignalBlock(0.5, 0);
A more elaborate description of the request parameter structure and the signal objects can be found in the help for the Two!Ears Auditory Front-End. Have a look at the implementation of the GmmLocationKS to see a real-world example of how to subclass AuditoryFrontEndDepKS.
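Putting both parts together, a minimal subclass might look like the following sketch. The class name, the channel count, and the block length are invented for illustration:

classdef MyRatemapKS < AuditoryFrontEndDepKS
    % Hypothetical AFE-dependent knowledge source, for illustration only.
    methods
        function obj = MyRatemapKS()
            % Request a ratemap from the Auditory front-end.
            requests{1}.name = 'ratemap_magnitude';
            requests{1}.params = genParStruct('nChannels', 16);
            obj = obj@AuditoryFrontEndDepKS(requests);
        end

        function [bExecute, bWait] = canExecute(obj)
            bExecute = true;
            bWait = false;
        end

        function execute(obj)
            % Fetch the last 0.5 s of the requested ratemap signal.
            afeData = obj.getAFEdata();
            rmSobj = afeData(1);
            rmBlock = rmSobj.getSignalBlock(0.5, 0);
            % ... process rmBlock here, then notify bound sinks.
            notify(obj, 'KsFiredEvent');
        end
    end
end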
Localisation knowledge sources¶
Four knowledge sources work together to generate hypotheses of sound source azimuths: the Location knowledge source, the Confusion Detection knowledge source, the Confusion Solving knowledge source, and the Head Rotation knowledge source.
Location knowledge source: DnnLocationKS¶
Class DnnLocationKS implements knowledge about the statistical relationship between spatial cues and azimuth locations using DNNs. Currently the DNNs are trained on binaural cues from the Auditory front-end, including CCF and ILD cues, as described in more detail in [MaEtAl2015dnn].
This knowledge source requires signals from the Auditory front-end and thus inherits from AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and needs to be bound to the AuditoryFrontEndKS's KsFiredEvent. The canExecute precondition checks the energy level of the current signal block, and localisation takes place only if there is an actual auditory event. After execution, a SourcesAzimuthsDistributionHypothesis containing a probability distribution of azimuth locations is placed on the blackboard (category sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is notified.
binds to | AuditoryFrontEndKS.KsFiredEvent |
writes data category | sourcesAzimuthsDistributionHypotheses |
triggers event | KsFiredEvent |
Location knowledge source: GmmLocationKS¶
Class GmmLocationKS implements knowledge about the statistical relationship between spatial cues and azimuth locations. Currently we model the relationship using GMMs, which are trained on binaural cues from the Auditory front-end, including ITD and ILD cues.
This knowledge source requires signals from the Auditory front-end and thus inherits from AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and needs to be bound to the AuditoryFrontEndKS's KsFiredEvent. The canExecute precondition checks the energy level of the current signal block, and localisation takes place only if there is an actual auditory event. After execution, a SourcesAzimuthsDistributionHypothesis containing a probability distribution of azimuth locations is placed on the blackboard (category sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is notified.
binds to | AuditoryFrontEndKS.KsFiredEvent |
writes data category | sourcesAzimuthsDistributionHypotheses |
triggers event | KsFiredEvent |
Confusion detection knowledge source: ConfusionKS¶
The ConfusionKS checks new location hypotheses and decides whether there is a confusion. A confusion emerges when the hypotheses contain more valid locations than there are assumed auditory sources in the scene. In case of a confusion, a ConfusedLocations event is notified and the responsible location hypothesis is placed on the blackboard in the confusionHypotheses category. Otherwise, a PerceivedAzimuth object is added to the blackboard perceivedAzimuths data category, and the standard event is triggered.
binds to | {Gmm|Dnn}LocationKS.KsFiredEvent |
reads data category | sourcesAzimuthsDistributionHypotheses |
writes data category | confusionHypotheses or perceivedAzimuths |
triggers event | ConfusedLocations or KsFiredEvent |
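After a processing run, the hypotheses accumulated in these data categories can be inspected; a sketch, assuming a finished BlackboardSystem run stored in bbs and the blackboard's getData method:

% Sketch: inspect perceived azimuths accumulated on the blackboard.
perceivedAzimuths = bbs.blackboard.getData('perceivedAzimuths');
fprintf('%d perceived azimuth entries on the blackboard.\n', ...
    numel(perceivedAzimuths));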
Confusion solving knowledge source: ConfusionSolvingKS¶
The ConfusionSolvingKS solves localisation confusions by predicting the location probability distribution after head rotation, and comparing it with new location hypotheses received after the head rotation is completed. The canExecute method waits for new location hypotheses; when there is one, it checks whether the head has been turned, and does not execute otherwise. The confusion is then solved by using the old and the new location hypothesis, and a PerceivedAzimuth object is placed on the blackboard.
binds to | ConfusionKS.ConfusedLocations |
reads data category | confusionHypotheses , headOrientation and sourcesAzimuthsDistributionHypotheses |
writes data category | perceivedAzimuths |
triggers event | KsFiredEvent |
Head rotation knowledge source: RotationKS¶
The RotationKS has knowledge on how to move the robotic head in order to solve confusions in source localisation. If no other head rotation is already scheduled, the knowledge source uses the robot interface to turn the head.
binds to | ConfusionKS.ConfusedLocations |
reads data category | confusionHypotheses , headOrientation |
writes data category | headOrientation |
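A typical wiring of this localisation pipeline through the BlackboardMonitor could look like the following sketch. Constructor arguments are omitted for brevity, and the bind call with its add-mode and event-name arguments reflects the blackboard system's event-binding convention; check the respective class constructors for the exact arguments they require:

% Sketch: binding the localisation knowledge sources into a chain
% (assumes bbs with a registered AuditoryFrontEndKS as bbs.dataConnect).
locKs  = bbs.createKS('GmmLocationKS');
confKs = bbs.createKS('ConfusionKS');
solvKs = bbs.createKS('ConfusionSolvingKS');
rotKs  = bbs.createKS('RotationKS');

bbs.blackboardMonitor.bind({bbs.dataConnect}, {locKs}, 'replaceOld');
bbs.blackboardMonitor.bind({locKs}, {confKs}, 'replaceOld');
bbs.blackboardMonitor.bind({confKs}, {rotKs}, 'replaceOld', 'ConfusedLocations');
bbs.blackboardMonitor.bind({confKs}, {solvKs}, 'replaceOld', 'ConfusedLocations');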
Identification knowledge sources¶
This section focuses on the implementation of sound identification knowledge sources within the blackboard framework.
Identity knowledge source: IdentityKS¶
Objects of class IdentityKS implement source type models by incorporating an instance of a model (which has to implement the models.Base interface) with knowledge about the relationship between auditory cues and certain sound source types. Many identity knowledge sources can be used concurrently; usually, for each sound class to be identified, you would instantiate an object of class IdentityKS with the respective model. The models get loaded from directories you specify upon construction, and should be created with the identification training pipeline. The model object of IdentityKS can employ any kind of model, such as a linear support vector machine or a Gaussian mixture model. The IdentityKS needs access to Auditory front-end signals, thus it is a subclass of AuditoryFrontEndDepKS (see Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS). The model object holds the signal request structure.
The knowledge source predicts, based on the incorporated source model, whether the currently received auditory stream includes an auditory object of the sound type it represents.
binds to | AuditoryFrontEndKS.KsFiredEvent |
writes data category | identityHypotheses |
triggers event | KsFiredEvent |
Have a look at the example Identification of sound types to see IdentityKS in action.
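Instantiating one identity knowledge source per sound type could look like this sketch; the sound type names and the model directory are placeholders, and the exact constructor arguments should be checked against the IdentityKS implementation:

% Sketch: one IdentityKS per sound type to be identified.
sourceTypes = {'speech', 'alarm', 'dog'};
for ii = 1:numel(sourceTypes)
    idKs = bbs.createKS('IdentityKS', {sourceTypes{ii}, 'path/to/models'});
    bbs.blackboardMonitor.bind({bbs.dataConnect}, {idKs}, 'replaceOld');
end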
Identity decision knowledge source: IdDecisionKS¶
The identity decision knowledge source checks new identity hypotheses. It then decides which of them are valid, by comparing them and incorporating knowledge about the number of assumed auditory objects in the scene.
binds to | IdentityKS.KsFiredEvent |
reads data category | identityHypotheses |
writes data category | identityDecision |
triggers event | KsFiredEvent |
Identity live debugging knowledge source: IdTruthPlotKS¶
This is not really a knowledge source in the strict sense of the word, but rather a way to enable live inspection of the identity information in the blackboard system. Upon construction, it takes ground truth information about event labels and onset and offset times; when triggered, it displays this information in comparison with the actual hypotheses created by the identity knowledge sources.
binds to | IdentityKS.KsFiredEvent |
reads data category | identityHypotheses |
Segment identity knowledge source: SegmentIdentityKS¶
The segment identity knowledge source checks new segmentation hypotheses and assigns an identity to each. This is done by applying the mask estimated by the SegmentationKS before performing inference on the source's identity (see the sketch after the table below). An identityHypotheses entry is triggered for each source in the segmentationHypotheses category.
binds to | SegmentationKS.KsFiredEvent |
reads data category | segmentationHypotheses |
writes data category | identityHypotheses |
triggers event | KsFiredEvent |
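The mask application itself is conceptually a simple element-wise weighting; a sketch, with invented variable names and dimensions:

% Sketch: applying a segmentation soft-mask to a time-frequency feature.
% ratemap  - [nFrames x nChannels] feature block from the AFE
% softMask - [nFrames x nChannels] soft-mask (values in [0, 1]) taken
%            from a segmentationHypotheses entry
maskedRatemap = ratemap .* softMask;  % element-wise weighting
% Identity inference then operates on maskedRatemap instead of ratemap.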
Segmentation knowledge sources¶
This section focuses on the implementation of knowledge sources for the segmentation of auditory features within the blackboard framework.
Segmentation knowledge source: SegmentationKS¶
The segmentation knowledge source generates hypotheses about the assignment of individual time-frequency units to the sound sources present in a scene. This assignment is done probabilistically; hence, each time-frequency unit is associated with a unique discrete probability distribution. These distributions can be interpreted as soft-masks which can be used to generate segmented auditory features. Specifically, each auditory feature that can be represented in the time-frequency domain can be modified accordingly by a corresponding soft-mask. The soft-masks are generated by a probabilistic clustering approach based on a mixture of von Mises distributions over estimated angular positions of the sound sources. These positions can either be estimated by the SegmentationKS itself or provided by a SourcesAzimuthsDistributionHypothesis on the blackboard. If not all source positions can be reliably estimated by the DnnLocationKS, the remaining positions are estimated during the segmentation process. All estimated positions are stored, together with corresponding circular uncertainties, in a sourceAzimuthHypotheses object for each sound source. Additionally, the estimated soft-masks are stored in a sound-source-specific segmentationHypotheses object. Each sourceAzimuthHypotheses and segmentationHypotheses object contains a unique source identifier tag, enabling other knowledge sources to associate each soft-mask with the corresponding source position. The current implementation of the SegmentationKS relies on a predefined number of sound sources being present in the scene.
binds to | AuditoryFrontEndKS.KsFiredEvent |
reads data category | sourcesAzimuthsDistributionHypotheses |
writes data category | sourceAzimuthHypotheses and segmentationHypotheses |
triggers event | KsFiredEvent |
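For reference, the von Mises component density underlying this clustering can be written down directly; a sketch, with mean direction mu and concentration kappa as free parameters:

% Sketch: von Mises probability density over azimuth angles (radians).
% besseli(0, kappa) is MATLAB's modified Bessel function of the first kind.
vonMisesPdf = @(theta, mu, kappa) ...
    exp(kappa .* cos(theta - mu)) ./ (2 * pi * besseli(0, kappa));

theta = linspace(-pi, pi, 360);
p = vonMisesPdf(theta, 0, 4);  % density concentrated around 0 rad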
Obsolete knowledge sources¶
Acoustic cues knowledge source: AcousticCuesKS¶
This knowledge source is obsolete and will be removed in a later release.
Upcoming knowledge sources¶
Skeleton files already exist for the following knowledge sources, but their functionality is not implemented yet.
Number of sources knowledge source: SourceNumberKS¶
This knowledge source will generate a hypothesis about the number of sound sources present in the auditory scene.
[MaEtAl2015dnn] | Ma, N., Brown, G. J., and May, T. (2015), "Robust localisation of multiple speakers exploiting deep neural networks and head movements," Proceedings of Interspeech 2015, Dresden, Germany, pp. 3302-3306. |
[MooreTan2004] | Moore, B. C. J., and Tan, C. (2004), "Development and Validation of a Method for Predicting the Perceived Naturalness of Sounds Subjected to Spectral Distortion," Journal of the Audio Engineering Society, 52(9), pp. 900-914. |
[Wierstorf2014] | Wierstorf, H. (2014), "Perceptual Assessment of Sound Field Synthesis," PhD thesis, TU Berlin. |