Knowledge sources¶
In this section we describe the knowledge sources that are currently available in the Blackboard system. These are:
- Abstract knowledge source
- Auditory front-end knowledge source: AuditoryFrontEndKS
- Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS
- Localisation knowledge sources
- Identification knowledge sources
- Sound quality related knowledge sources
- Segmentation knowledge sources
- Obsolete knowledge sources
- Upcoming knowledge sources
Abstract knowledge source¶
The abstract knowledge source (class AbstractKS) is the base class for all knowledge sources in the blackboard system. The corresponding file AbstractKS.m is located in the src/blackboard_core directory, whereas all implementations of actual knowledge sources are located in the src/knowledge_sources directory. The listing below shows the parts most relevant to the development of new knowledge sources:
classdef AbstractKS < handle
    properties
        blackboard;
        blackboardSystem;
        invocationMaxFrequency_Hz;
        trigger;
    end

    events
        KsFiredEvent
    end

    methods (Abstract)
        canExecute(obj)
        execute(obj)
    end

    methods
        focus(obj)
        unfocus(obj)
    end
end
There are different aspects of functionality in this interface.
- Data access: Knowledge sources have a handle to the blackboard. Through this handle, data can be placed on and retrieved from the blackboard.
- System setup: Through the handle blackboardSystem, knowledge sources get access to the methods for adding and removing other knowledge sources, and also access to the BlackboardMonitor.
- Execution properties: The property invocationMaxFrequency_Hz specifies how often this knowledge source is allowed to be executed. The methods focus and unfocus give access to the attentional priority of the knowledge source, which influences its relative importance when competing for computing resources with other knowledge sources. See Section Dynamic blackboard scheduler for a description of scheduling.
- Execution conditional: The abstract method canExecute must be implemented by the inheriting knowledge source. It is called by the scheduler when the knowledge source is next in the schedule, before actually executing. If this method returns false, execution will not be performed. The second output argument of this method indicates whether the knowledge source should remain in the agenda or be removed.
- Execution: The main functionality of any knowledge source is implemented in the method execute. A knowledge source gets executed by the scheduler if its maximum invocation frequency would not be exceeded and its canExecute method returns true. In this method, a knowledge source gets access to its trigger, a structure that contains information about the triggering event, the triggering source, and an argument the trigger source placed for usage by sinks.
- Events: Knowledge sources can define their own individual events. However, each class already inherits a standard event from AbstractKS, KsFiredEvent. Events can be triggered by knowledge sources via obj.notify(eventname, attachedData). A minimal example subclass illustrating these aspects is sketched after this list.
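To make this interface concrete, the following is a minimal sketch of a custom knowledge source inheriting from AbstractKS. The class name, the data category, and the attached hypothesis value are invented for illustration, and the blackboard handle's addData method is assumed here as the way to place data:

classdef MyExampleKS < AbstractKS
    % Hypothetical knowledge source, for illustration only.
    methods
        function obj = MyExampleKS()
            % Allow execution at most once per second.
            obj.invocationMaxFrequency_Hz = 1;
        end

        function [bExecute, bWait] = canExecute(obj)
            bExecute = true;   % always ready to run
            bWait = false;     % do not remain in the agenda otherwise
        end

        function execute(obj)
            % Place an (invented) hypothesis on the blackboard and
            % signal bound sinks via the inherited standard event.
            obj.blackboard.addData('myExampleHypotheses', 42);
            notify(obj, 'KsFiredEvent');
        end
    end
end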
Auditory front-end knowledge source: AuditoryFrontEndKS¶
This knowledge source integrates the Auditory front-end into the Blackboard system. The Auditory front-end itself is a self-contained module and this section focuses on its integration within the framework.
The Auditory front-end knowledge source is connected to the blackboard and the robot interface by registering itself in the system via BlackboardSystem.setDataConnect. Upon construction, the Auditory front-end dataObject and managerObject are instantiated and connected to the ear signal stream of the robot interface. The maximum invocation frequency of the AuditoryFrontEndKS is set to infinity. Execution mainly consists of getting the latest chunk of ear signal data, processing it through the Auditory front-end, and notifying a KsFiredEvent.
Other knowledge sources can register requests with the Auditory front-end indirectly, by inheriting from the AuditoryFrontEndDepKS class (see Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and binding to its KsFiredEvent.
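In practice, this registration is part of setting up a BlackboardSystem. The following sketch assumes a robot (or binaural simulator) interface object named robot; the verbosity flag and the variable names are illustrative:

% Sketch: registering the AuditoryFrontEndKS in a blackboard system.
bbs = BlackboardSystem(0);                 % 0: non-verbose operation (assumed flag)
bbs.setRobotConnect(robot);                % connect the robot/ear-signal interface
bbs.setDataConnect('AuditoryFrontEndKS');  % instantiate and register the AFE KS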
Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS¶
Whenever a knowledge source needs signals, cues or features from the auditory front-end, it should subclass the AuditoryFrontEndDepKS class. For any knowledge source added to the blackboard through the BlackboardSystem methods addKS or createKS, these signal requests are registered automatically with the Auditory front-end.
- Setting up the requests: Inheriting knowledge sources need to pass their requests in the call to the super-constructor: obj@AuditoryFrontEndDepKS(requests), with requests being a cell array of structures, each with the fields name, stating the requested signal name, and params, specifying the signal parameters (have a look at Available processors). An example looks like this:

  requests{1}.name = 'modulation';
  requests{1}.params = genParStruct( ...
      'nChannels', obj.amFreqChannels, ...
      'am_type', 'filter', ...
      'am_nFilters', obj.amChannels ...
      );
  requests{2}.name = 'ratemap_magnitude';
  requests{2}.params = genParStruct( ...
      'nChannels', obj.freqChannels ...
      );
  The params field always needs to be populated by a call to the genParStruct function.
- Accessing signals: The requested signals can then be accessed by the knowledge source via the inherited getAFEdata method, which returns a map (with the indexes from the request structure as keys) of handles to the actual signals. An example, following the requests example above, looks like this:

  afeData = obj.getAFEdata();
  modSobj = afeData(1);
  rmSobj = afeData(2);
  rmBlock = rmSobj.getSignalBlock(0.5, 0);
A more elaborate description of the request parameter structure and the signal objects can be found in the help for the Two!Ears Auditory Front-End. Have a look at the implementation of the GmmLocationKS to see a real-world example of how to subclass AuditoryFrontEndDepKS.
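Putting both parts together, a minimal subclass might look like the following sketch. The class name, the channel count, and the block length are invented for illustration:

classdef MyRatemapKS < AuditoryFrontEndDepKS
    % Hypothetical AFE-dependent knowledge source, for illustration only.
    methods
        function obj = MyRatemapKS()
            % Request a ratemap from the Auditory front-end.
            requests{1}.name = 'ratemap_magnitude';
            requests{1}.params = genParStruct('nChannels', 16);
            obj = obj@AuditoryFrontEndDepKS(requests);
        end

        function [bExecute, bWait] = canExecute(obj)
            bExecute = true;
            bWait = false;
        end

        function execute(obj)
            % Fetch the last 0.5 s of the requested ratemap signal.
            afeData = obj.getAFEdata();
            rmSobj = afeData(1);
            rmBlock = rmSobj.getSignalBlock(0.5, 0);
            % ... process rmBlock here, then notify bound sinks.
            notify(obj, 'KsFiredEvent');
        end
    end
end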
Localisation knowledge sources¶
Four knowledge sources work together to generate hypotheses of sound source azimuths: the Location knowledge source, the Confusion Detection knowledge source, the Confusion Solving knowledge source, and the Head Rotation knowledge source.
Location knowledge source: DnnLocationKS¶
Class DnnLocationKS implements knowledge about the statistical relationship between spatial cues and azimuth locations using DNNs. Currently the DNNs are trained on binaural cues from the Auditory front-end, including CCF and ILD cues, as described in more detail in [MaEtAl2015dnn].
This knowledge source requires signals from the Auditory front-end and thus inherits from AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and needs to be bound to the AuditoryFrontEndKS's KsFiredEvent. The canExecute precondition checks the energy level of the current signal block, and localisation takes place only if there is an actual auditory event. After execution, a SourcesAzimuthsDistributionHypothesis containing a probability distribution of azimuth locations is placed on the blackboard (category sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is notified.
binds to | AuditoryFrontEndKS.KsFiredEvent |
writes data category | sourcesAzimuthsDistributionHypotheses |
triggers event | KsFiredEvent |
Location knowledge source: GmmLocationKS¶
Class GmmLocationKS implements knowledge about the statistical relationship between spatial cues and azimuth locations. Currently we model the relationship using GMMs, which are trained on binaural cues from the Auditory front-end, including ITD and ILD cues.
This knowledge source requires signals from the Auditory front-end and thus inherits from AuditoryFrontEndDepKS (Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS) and needs to be bound to the AuditoryFrontEndKS's KsFiredEvent. The canExecute precondition checks the energy level of the current signal block, and localisation takes place only if there is an actual auditory event. After execution, a SourcesAzimuthsDistributionHypothesis containing a probability distribution of azimuth locations is placed on the blackboard (category sourcesAzimuthsDistributionHypotheses) and the event KsFiredEvent is notified.
binds to | AuditoryFrontEndKS.KsFiredEvent |
writes data category | sourcesAzimuthsDistributionHypotheses |
triggers event | KsFiredEvent |
Confusion detection knowledge source: ConfusionKS¶
The ConfusionKS checks new location hypotheses and decides whether there is a confusion. A confusion emerges when the hypotheses contain more valid locations than there are assumed auditory sources in the scene. In case of a confusion, a ConfusedLocations event is notified and the responsible location hypothesis is placed on the blackboard in the confusionHypotheses category. Otherwise, a PerceivedAzimuth object is added to the blackboard perceivedAzimuths data category, and the standard event is triggered.
binds to | {Gmm|Dnn}LocationKS.KsFiredEvent |
reads data category | sourcesAzimuthsDistributionHypotheses |
writes data category | confusionHypotheses or perceivedAzimuths |
triggers event | ConfusedLocations or KsFiredEvent |
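After a processing run, the hypotheses accumulated in these data categories can be inspected; a sketch, assuming a finished BlackboardSystem run stored in bbs and the blackboard's getData method:

% Sketch: inspect perceived azimuths accumulated on the blackboard.
perceivedAzimuths = bbs.blackboard.getData('perceivedAzimuths');
fprintf('%d perceived azimuth entries on the blackboard.\n', ...
    numel(perceivedAzimuths));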
Confusion solving knowledge source: ConfusionSolvingKS¶
The ConfusionSolvingKS solves localisation confusions by predicting the location probability distribution after head rotation, and comparing it with new location hypotheses received after the head rotation is completed. The canExecute method waits for new location hypotheses; when there is one, it checks whether the head has been turned, and does not execute otherwise. The confusion is then solved by using the old and the new location hypothesis, and a PerceivedAzimuth object is placed on the blackboard.
binds to | ConfusionKS.ConfusedLocations |
reads data category | confusionHypotheses , headOrientation and sourcesAzimuthsDistributionHypotheses |
writes data category | perceivedAzimuths |
triggers event | KsFiredEvent |
Head rotation knowledge source: RotationKS¶
The RotationKS has knowledge on how to move the robotic head in order to solve confusions in source localisation. If no other head rotation is already scheduled, the knowledge source uses the robot interface to turn the head.
binds to | ConfusionKS.ConfusedLocations |
reads data category | confusionHypotheses , headOrientation |
writes data category | headOrientation |
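A typical wiring of this localisation pipeline through the BlackboardMonitor could look like the following sketch. Constructor arguments are omitted for brevity, and the bind call with its add-mode and event-name arguments reflects the blackboard system's event-binding convention; check the respective class constructors for the exact arguments they require:

% Sketch: binding the localisation knowledge sources into a chain
% (assumes bbs with a registered AuditoryFrontEndKS as bbs.dataConnect).
locKs  = bbs.createKS('GmmLocationKS');
confKs = bbs.createKS('ConfusionKS');
solvKs = bbs.createKS('ConfusionSolvingKS');
rotKs  = bbs.createKS('RotationKS');

bbs.blackboardMonitor.bind({bbs.dataConnect}, {locKs}, 'replaceOld');
bbs.blackboardMonitor.bind({locKs}, {confKs}, 'replaceOld');
bbs.blackboardMonitor.bind({confKs}, {rotKs}, 'replaceOld', 'ConfusedLocations');
bbs.blackboardMonitor.bind({confKs}, {solvKs}, 'replaceOld', 'ConfusedLocations');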
Identification knowledge sources¶
This section focuses on the implementation of sound identification knowledge sources within the blackboard framework.
Identity knowledge source: IdentityKS¶
Objects of class IdentityKS implement source type models by incorporating an instance of a model (which has to implement the models.Base interface) with knowledge about the relationship between auditory cues and certain sound source types. Many identity knowledge sources can be used concurrently; usually, for each sound class to be identified, you would instantiate an object of class IdentityKS with the respective model. The models get loaded from directories you specify upon construction, and should be created with the identification training pipeline. The model object of IdentityKS can employ any kind of model, such as a linear support vector machine or a Gaussian mixture model. The IdentityKS needs access to Auditory front-end signals, thus it is a subclass of AuditoryFrontEndDepKS (see Section Auditory signal dependent knowledge source superclass: AuditoryFrontEndDepKS). The model object holds the signal request structure.
The knowledge source predicts, based on the incorporated source model, whether the currently received auditory stream includes an auditory object of the sound type it represents.
binds to | AuditoryFrontEndKS.KsFiredEvent |
writes data category | identityHypotheses |
triggers event | KsFiredEvent |
Have a look at the example Identification of sound types to see IdentityKS in action.
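Instantiating one identity knowledge source per sound type could look like this sketch; the sound type names and the model directory are placeholders, and the exact constructor arguments should be checked against the IdentityKS implementation:

% Sketch: one IdentityKS per sound type to be identified.
sourceTypes = {'speech', 'alarm', 'dog'};
for ii = 1:numel(sourceTypes)
    idKs = bbs.createKS('IdentityKS', {sourceTypes{ii}, 'path/to/models'});
    bbs.blackboardMonitor.bind({bbs.dataConnect}, {idKs}, 'replaceOld');
end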
Identity decision knowledge source: IdDecisionKS¶
The identity decision knowledge source checks new identity hypotheses. It then decides which of them are valid, by comparing them and incorporating knowledge about the number of assumed auditory objects in the scene.
binds to | IdentityKS.KsFiredEvent |
reads data category | identityHypotheses |
writes data category | identityDecision |
triggers event | KsFiredEvent |
Identity live debugging knowledge source: IdTruthPlotKS¶
This is not really a knowledge source in the strict sense of the word, but rather a way to enable live inspection of the identity information in the blackboard system. Upon construction, it takes ground truth information about event labels and onset and offset times; when triggered, it displays this information in comparison with the actual hypotheses created by the identity knowledge sources.
binds to | IdentityKS.KsFiredEvent |
reads data category | identityHypotheses |
Segment identity knowledge source: SegmentIdentityKS¶
The segment identity knowledge source checks new segmentation hypotheses and assigns an identity to each. This is done by applying the mask estimated by the SegmentationKS before performing inference on the source's identity (see the sketch after the table below). An identityHypotheses entry is triggered for each source in the segmentationHypotheses category.
binds to | SegmentationKS.KsFiredEvent |
reads data category | segmentationHypotheses |
writes data category | identityHypotheses |
triggers event | KsFiredEvent |
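The mask application itself is conceptually a simple element-wise weighting; a sketch, with invented variable names and dimensions:

% Sketch: applying a segmentation soft-mask to a time-frequency feature.
% ratemap  - [nFrames x nChannels] feature block from the AFE
% softMask - [nFrames x nChannels] soft-mask (values in [0, 1]) taken
%            from a segmentationHypotheses entry
maskedRatemap = ratemap .* softMask;  % element-wise weighting
% Identity inference then operates on maskedRatemap instead of ratemap.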
Segmentation knowledge sources¶
This section focuses on the implementation of knowledge sources for the segmentation of auditory features within the blackboard framework.
Segmentation knowledge source: SegmentationKS¶
The segmentation knowledge source generates hypotheses about the assignment of individual time-frequency units to the sound sources present in a scene. This assignment is done probabilistically; hence, each time-frequency unit is associated with a unique discrete probability distribution. These distributions can be interpreted as soft-masks which can be used to generate segmented auditory features. Specifically, each auditory feature that can be represented in the time-frequency domain can be modified accordingly by a corresponding soft-mask. The soft-masks are generated by a probabilistic clustering approach based on a mixture of von Mises distributions over estimated angular positions of the sound sources. These positions can either be estimated by the SegmentationKS itself or provided by a SourcesAzimuthsDistributionHypothesis on the blackboard. If not all source positions can be reliably estimated by the DnnLocationKS, the remaining positions are estimated during the segmentation process. All estimated positions are stored, together with corresponding circular uncertainties, in a sourceAzimuthHypotheses object for each sound source. Additionally, the estimated soft-masks are stored in a sound-source-specific segmentationHypotheses object. Each sourceAzimuthHypotheses and segmentationHypotheses object contains a unique source identifier tag, enabling other knowledge sources to associate each soft-mask with the corresponding source position. The current implementation of the SegmentationKS relies on a predefined number of sound sources being present in the scene.
binds to | AuditoryFrontEndKS.KsFiredEvent |
reads data category | sourcesAzimuthsDistributionHypotheses |
writes data category | sourceAzimuthHypotheses and segmentationHypotheses |
triggers event | KsFiredEvent |
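For reference, the von Mises component density underlying this clustering can be written down directly; a sketch, with mean direction mu and concentration kappa as free parameters:

% Sketch: von Mises probability density over azimuth angles (radians).
% besseli(0, kappa) is MATLAB's modified Bessel function of the first kind.
vonMisesPdf = @(theta, mu, kappa) ...
    exp(kappa .* cos(theta - mu)) ./ (2 * pi * besseli(0, kappa));

theta = linspace(-pi, pi, 360);
p = vonMisesPdf(theta, 0, 4);  % density concentrated around 0 rad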
Obsolete knowledge sources¶
Acoustic cues knowledge source: AcousticCuesKS¶
This knowledge source is obsolete and will be removed in a later release.
Upcoming knowledge sources¶
Skeleton files already exist for the following knowledge sources, but their functionality is not implemented yet.
Number of sources knowledge source: SourceNumberKS¶
This knowledge source will generate a hypothesis about the number of sound sources present in the auditory scene.
[MaEtAl2015dnn] | Ma, N., Brown, G. J., and May, T. (2015), "Robust localisation of multiple speakers exploiting deep neural networks and head movements," Proceedings of Interspeech 2015, Dresden, Germany, pp. 3302-3306. |
[MooreTan2004] | Moore, B. C. J., and Tan, C. (2004), "Development and Validation of a Method for Predicting the Perceived Naturalness of Sounds Subjected to Spectral Distortion," Journal of the Audio Engineering Society, 52(9), pp. 900-914. |
[Wierstorf2014] | Wierstorf, H. (2014), "Perceptual Assessment of Sound Field Synthesis," PhD thesis, TU Berlin. |