Stream segregation knowledge sources

This section describes the implementation of knowledge sources that segregate auditory streams and mask auditory features in order to isolate each stream within the blackboard framework.

Stream segregation knowledge source: StreamSegregationKS

The stream segregation knowledge source generates hypotheses about the assignment of individual time-frequency units to the sound sources present in a scene. This assignment is probabilistic: each time-frequency unit is associated with a discrete probability distribution over the sources. These distributions can be interpreted as soft masks and used to generate segregated auditory features; specifically, any auditory feature that can be represented in the time-frequency domain can be weighted by a corresponding soft mask.

The soft masks are generated by a probabilistic clustering approach based on a mixture of von Mises distributions over the estimated angular positions of the sound sources. These position estimates are provided by a locationHypothesis or, if that is unavailable, by sourcesAzimuthsDistributionHypotheses on the blackboard. Positions can be reliably estimated through the combination of DnnLocationKS and LocalisationDecisionKS.

The estimated soft masks are stored in source-specific segmentationHypotheses objects. Each segmentationHypothesis contains a unique source identifier tag, which enables other knowledge sources to associate each soft mask with the corresponding source position. The current implementation of StreamSegregationKS relies on a pre-defined number of sound sources being present in the scene; this number is provided through the NumberOfSourcesHypotheses.
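The clustering step described above can be illustrated with a small sketch. The following Python code (illustrative only; the function name, the fixed concentration parameter kappa, and the use of per-unit azimuth estimates are assumptions, not part of the framework) computes, for each time-frequency unit, the posterior probability of belonging to each source under a mixture of von Mises distributions centred on the estimated source azimuths. The resulting array is exactly a set of soft masks that sum to one across sources.

```python
import numpy as np

def soft_masks(tf_azimuths, source_azimuths, kappa=5.0, priors=None):
    """Posterior assignment of each time-frequency unit to each source.

    tf_azimuths     : (T, F) array of estimated azimuths per T-F unit (radians)
    source_azimuths : (K,) array of estimated source positions (radians)
    kappa           : concentration of the von Mises components (assumed fixed)
    priors          : optional (K,) mixture weights; uniform if omitted

    Returns a (K, T, F) array of soft masks that sum to 1 over the K sources.
    """
    K = len(source_azimuths)
    if priors is None:
        priors = np.full(K, 1.0 / K)
    # von Mises likelihood of each unit's azimuth under each source component;
    # np.i0 is the modified Bessel function in the normalisation constant
    lik = np.stack([
        priors[k] * np.exp(kappa * np.cos(tf_azimuths - mu))
        / (2.0 * np.pi * np.i0(kappa))
        for k, mu in enumerate(source_azimuths)
    ])
    # Normalise over sources -> per-unit discrete probability distribution
    return lik / lik.sum(axis=0, keepdims=True)

# Applying a soft mask then reduces to element-wise weighting of any
# time-frequency feature, e.g. segregated = masks[k] * ratemap.
```

A unit whose azimuth estimate lies close to one source's position receives a mask value near one for that source and near zero for the others; units between two sources receive intermediate weights.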

binds to AuditoryFrontEndKS.KsFiredEvent
reads data category locationHypothesis (otherwise sourcesAzimuthsDistributionHypotheses) and NumberOfSourcesHypotheses
writes data category segmentationHypotheses
triggers event KsFiredEvent
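The read/write wiring above can be sketched as a minimal execution cycle. This is a hypothetical Python stand-in, not the framework's actual API: the Blackboard class, its read/write methods, and the "source-k" identifier tags are all illustrative assumptions. It only shows the data flow: prefer locationHypothesis, fall back to sourcesAzimuthsDistributionHypotheses, read the source count, and write tagged segmentation hypotheses.

```python
class Blackboard:
    """Minimal stand-in for the blackboard: a dict of data categories."""
    def __init__(self):
        self._data = {}

    def read(self, category):
        return self._data.get(category)

    def write(self, category, value):
        self._data[category] = value

class StreamSegregationKS:
    """Illustrative sketch of the knowledge source's execution cycle."""
    def __init__(self, blackboard):
        self.bb = blackboard

    def execute(self):
        # Prefer locationHypothesis; otherwise fall back to the
        # azimuth distribution hypotheses, as described above.
        positions = self.bb.read("locationHypothesis")
        if positions is None:
            positions = self.bb.read("sourcesAzimuthsDistributionHypotheses")
        num_sources = self.bb.read("NumberOfSourcesHypotheses")

        # One soft mask per source, tagged with a unique source identifier
        # so other knowledge sources can match masks to positions.
        # (Mask computation itself is elided here.)
        hypotheses = {
            f"source-{k}": {"position": positions[k], "mask": None}
            for k in range(num_sources)
        }
        self.bb.write("segmentationHypotheses", hypotheses)

bb = Blackboard()
bb.write("locationHypothesis", [-30.0, 30.0])
bb.write("NumberOfSourcesHypotheses", 2)
StreamSegregationKS(bb).execute()
```

After execute() runs, the blackboard holds one segmentation hypothesis per source, each carrying its identifier tag and associated position.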