Onset strength (onsetProc.m
)¶
According to [Bregman1990], common onsets and offsets across frequency are
important grouping cues that are utilised by the human auditory system to
organise and integrate sounds originating from the same source. The onset
processor is based on the rate-map representation, and therefore, the choice of
the rate-map parameters, as listed in Table 24, will influence the
output of the onset processor. The temporal resolution is controlled by the
window size rm_wSizeSec
and the step size rm_hSizeSec
, respectively. The
amount of temporal smoothing can be adjusted by the leaky integrator time
constant rm_decaySec
, which reduces the amount of temporal fluctuations in
the rate-map. Onset are detected by measuring the frame-based increase in energy
of the rate-map representation. This detection is performed based on the
logarithmically-scaled energy, as suggested by [Klapuri1999]. It is possible to
limit the strength of individual onsets to an upper limit, which is by default
set to ons_maxOnsetdB = 30
. A list of all parameters is presented in
Table 26.
Parameter | Default | Description |
---|---|---|
ons_maxOnsetdB |
30 |
Upper limit for onset strength in dB |
The resulting onset strength expressed in decibel, which is a function of time
frame and frequency channel, is shown in Fig. 28. The
two figures can be replicated by running the script DEMO_OnsetStrength.m
.
When considering speech as an input signal, it can be seen that onsets appear
simultaneously across a broad frequency range and typically mark the beginning
of an auditory event.
[Klapuri1999] | Klapuri, A. (1999), “Sound onset detection by applying psychoacoustic knowledge,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3089–3092. |