Spectral features¶
This section contains the documentation for the dct, fft, filterbank, mfcc, pvoc, specdesc, and tss classes.
- class aubio.dct(size=1024)
Compute Discrete Cosine Transforms of Type-II.
- Parameters
size (int) – size of the DCT to compute
Example
>>> d = aubio.dct(16)
>>> d.size
16
>>> x = aubio.fvec(np.ones(d.size))
>>> d(x)
array([4., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
      dtype=float32)
>>> d.rdo(d(x))
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
      dtype=float32)
References
DCT-II in Discrete Cosine Transform on Wikipedia.
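The output in the example above (a single value of 4 for sixteen ones) is what an orthonormal DCT-II produces. The following is a minimal sketch in plain Python, assuming that scaling; it is inferred from the example output and is not stated in this documentation. The names dct2_ortho and idct2_ortho are hypothetical and are not part of aubio.

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II of a sequence of floats (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        # The k = 0 term uses a smaller factor so that the transform is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * acc)
    return out


def idct2_ortho(c):
    """Inverse of dct2_ortho (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        acc = math.sqrt(1.0 / n) * c[0]
        acc += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(acc)
    return out


coeffs = dct2_ortho([1.0] * 16)    # all energy ends up in the first coefficient
restored = idct2_ortho(coeffs)     # recovers the sixteen ones
```

A constant input has no oscillating component, so only the first coefficient is non-zero, and the inverse plays the role that rdo() plays in the example.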
- class aubio.fft(size=1024)
Compute Fast Fourier Transforms.
- Parameters
size (int) – size of the FFT to compute
Example
>>> x = aubio.fvec(512)
>>> f = aubio.fft(512)
>>> c = f(x); c
aubio cvec of 257 elements
>>> x2 = f.rdo(c); x2.shape
(512,)
- rdo()
synthesis of spectral grain
- win_s
size of the window
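In the fft example above, 512 input samples produce a cvec of 257 elements. For real input the upper half of the spectrum mirrors the lower half, so only size // 2 + 1 bins need to be kept. The sketch below illustrates this with a naive DFT in plain Python. It is not how aubio computes the transform, and real_dft is a hypothetical helper name.

```python
import cmath
import math


def real_dft(x):
    """Naive DFT of a real sequence, keeping only the len(x) // 2 + 1 non-redundant bins."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


size = 16
# A sine wave completing exactly 3 cycles over the frame.
signal = [math.sin(2 * math.pi * 3 * t / size) for t in range(size)]

bins = real_dft(signal)
norms = [abs(b) for b in bins]             # magnitude of each bin
phases = [cmath.phase(b) for b in bins]    # phase of each bin, in radians
peak_bin = max(range(len(norms)), key=norms.__getitem__)
```

Here the 16-sample frame yields 9 bins, and the magnitude peaks at bin 3, matching the 3 cycles in the input.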
- class aubio.filterbank(n_filters=40, win_s=1024)
Create a bank of spectral filters. Each instance is a callable that holds a matrix of coefficients.
See also
set_mel_coeffs(), set_mel_coeffs_htk(), set_mel_coeffs_slaney(), set_triangle_bands(), and set_coeffs().
- Parameters
n_filters (int) – Number of filters to create.
win_s (int) – Size of the input spectrum to process.
Examples
>>> f = aubio.filterbank(128, 1024)
>>> f.set_mel_coeffs(44100, 0, 10000)
>>> c = aubio.cvec(1024)
>>> f(c).shape
(128,)
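Since each instance holds a matrix of coefficients, calling a filterbank can be pictured as a matrix-vector product between that matrix and the magnitudes of the input spectrum. The sketch below shows the idea in plain Python on a toy bank. The function apply_filterbank is hypothetical, and the way power is applied here (element-wise, before the product) is an assumption based on the description of set_power().

```python
def apply_filterbank(coeffs, spectrum_norm, power=1.0):
    """Weight a magnitude spectrum by each filter row and sum: one output value per filter."""
    powered = [v ** power for v in spectrum_norm]
    return [sum(w * v for w, v in zip(row, powered)) for row in coeffs]


# Toy bank: 2 filters over 5 bins (as if win_s = 8, so win_s // 2 + 1 = 5).
coeffs = [
    [0.0, 1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 1.0, 0.0],
]
spectrum_norm = [1.0, 2.0, 4.0, 2.0, 1.0]

energies = apply_filterbank(coeffs, spectrum_norm)                  # one value per filter
squared = apply_filterbank(coeffs, spectrum_norm, power=2.0)        # same bank on the power spectrum
```

The output length equals the number of filter rows, which is why the example above returns an array of shape (128,) for a bank of 128 filters.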
- get_coeffs()
Get coefficients matrix of filterbank.
- Returns
Array of shape (n_filters, win_s/2+1) containing the coefficients.
- Return type
array_like
- get_norm()
Get norm parameter of filterbank.
- Returns
Norm parameter.
- Return type
float
- get_power()
Get power applied to filterbank.
- Returns
Power parameter.
- Return type
float
- set_coeffs(coeffs)
Set coefficients of filterbank.
- Parameters
coeffs (fmat) – Array of shape (n_filters, win_s/2+1) containing the coefficients.
- set_mel_coeffs(samplerate, fmin, fmax)
Set coefficients of filterbank to linearly spaced mel scale.
- Parameters
samplerate (float) – Sampling-rate of the expected input.
fmin (float) – Lower frequency boundary of the first filter.
fmax (float) – Upper frequency boundary of the last filter.
- set_mel_coeffs_htk(samplerate, fmin, fmax)
Set coefficients of the filters to be linearly spaced in the HTK mel scale.
- Parameters
samplerate (float) – Sampling-rate of the expected input.
fmin (float) – Lower frequency boundary of the first filter.
fmax (float) – Upper frequency boundary of the last filter.
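To make "linearly spaced in the HTK mel scale" concrete, the sketch below converts fmin and fmax to mels with the widely used HTK formula, spaces the band edges evenly on the mel axis, and converts them back to Hz. The helper names are hypothetical, and the use of n_filters + 2 edges is an assumption borrowed from the description of set_triangle_bands(). This is not aubio's implementation.

```python
import math


def hz_to_mel_htk(freq_hz):
    """HTK mel scale: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_freqs(fmin, fmax, n_filters):
    """Return n_filters + 2 band edges in Hz, evenly spaced on the mel axis."""
    lo, hi = hz_to_mel_htk(fmin), hz_to_mel_htk(fmax)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz_htk(lo + i * step) for i in range(n_filters + 2)]


edges = mel_spaced_freqs(0.0, 10000.0, 40)
```

The resulting edges are close together at low frequencies and spread out at high frequencies, which is the point of the mel scale.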
- set_mel_coeffs_slaney(samplerate)
Set coefficients of filterbank to match Slaney’s Auditory Toolbox.
The filter coefficients will be set as in Malcolm Slaney’s implementation. The filterbank should have been created with n_filters = 40.
This is approximately equivalent to using set_mel_coeffs() with fmin = 400./3., fmax = 6853.84.
- Parameters
samplerate (float) – Sampling-rate of the expected input.
References
Malcolm Slaney, Auditory Toolbox Version 2, Technical Report #1998-010
- set_norm(norm)
Set norm parameter. If set to 0, the filters will not be normalized. If set to 1, the filters will be normalized to one. Defaults to 1.
This function should be called before set_triangle_bands(), set_mel_coeffs(), set_mel_coeffs_htk(), or set_mel_coeffs_slaney().
- Parameters
norm (int) – 0 to disable, 1 to enable
- set_power(power)
Set power applied to input spectrum of filterbank.
- Parameters
power (float) – Power to raise input spectrum to before computing the filters.
- set_triangle_bands(freqs, samplerate)
Set triangular bands. The coefficients will be set to triangular overlapping windows using the boundaries specified by freqs.
freqs should contain n_filters + 2 frequencies in Hz, ordered by value, from smallest to largest. The first element should be greater than or equal to zero; the last element should be smaller than or equal to samplerate / 2.
- Parameters
freqs (fvec) – List of frequencies, in Hz.
samplerate (float) – Sampling-rate of the expected input.
Example
>>> fb = aubio.filterbank(n_filters=100, win_s=2048)
>>> samplerate = 44100; freqs = np.linspace(0, 20200, 102)
>>> fb.set_triangle_bands(aubio.fvec(freqs), samplerate)
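The relation between freqs and the resulting coefficients can be sketched as follows: each filter rises from one boundary to the next and falls back to zero at the one after, so n_filters + 2 boundaries give n_filters overlapping triangles. This is an illustrative, unnormalized construction in plain Python. The function triangle_bank is hypothetical, and aubio's exact bin mapping and normalization may differ.

```python
def triangle_bank(freqs, samplerate, win_s):
    """Build len(freqs) - 2 triangular filters over win_s // 2 + 1 FFT bins.

    freqs must be strictly increasing, otherwise a slope would divide by zero.
    """
    n_bins = win_s // 2 + 1
    bin_hz = [k * samplerate / win_s for k in range(n_bins)]  # frequency of each bin
    bank = []
    for lower, center, upper in zip(freqs, freqs[1:], freqs[2:]):
        row = []
        for f in bin_hz:
            if lower < f <= center:
                row.append((f - lower) / (center - lower))    # rising slope
            elif center < f < upper:
                row.append((upper - f) / (upper - center))    # falling slope
            else:
                row.append(0.0)                               # outside the band
        bank.append(row)
    return bank


# 5 boundaries -> 3 filters; 16-point window at 8 kHz -> 9 bins spaced 500 Hz apart.
freqs = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
bank = triangle_bank(freqs, samplerate=8000, win_s=16)
```

The result has shape (n_filters, win_s/2+1), the same layout that get_coeffs() and set_coeffs() use.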
- n_filters
number of filters
- win_s
size of the window
- class aubio.mfcc(buf_size=1024, n_filters=40, n_coeffs=13, samplerate=44100)
Compute Mel Frequency Cepstrum Coefficients (MFCC).
mfcc creates a callable which takes a cvec as input.
If n_filters = 40, the filterbank will be initialized with filterbank.set_mel_coeffs_slaney(). Otherwise, if n_filters is greater than 0, it will be initialized with filterbank.set_mel_coeffs() using fmin = 0, fmax = samplerate/.
Example
>>> buf_size = 2048; n_filters = 128; n_coeffs = 13; samplerate = 44100
>>> mf = aubio.mfcc(buf_size, n_filters, n_coeffs, samplerate)
>>> fftgrain = aubio.cvec(buf_size)
>>> mf(fftgrain).shape
(13,)
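The usual MFCC recipe takes the outputs of a mel filterbank, compresses them with a logarithm, and keeps the first few coefficients of a DCT, which explains why the example above maps 128 filters to an output of shape (13,). The sketch below follows that textbook recipe in plain Python. The function name, the base-10 logarithm, the floor value, and the orthonormal DCT scaling are all assumptions made for illustration, not a description of aubio's internals.

```python
import math


def mfcc_from_band_energies(band_energies, n_coeffs, floor=1e-10):
    """Log-compress filterbank outputs, then keep the first n_coeffs DCT-II terms."""
    n = len(band_energies)
    # The floor avoids taking the logarithm of zero on silent bands.
    log_e = [math.log10(max(e, floor)) for e in band_energies]
    coeffs = []
    for k in range(n_coeffs):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * sum(
            log_e[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        ))
    return coeffs


# 128 bands with identical energy: a perfectly flat log spectrum.
flat = mfcc_from_band_energies([10.0] * 128, n_coeffs=13)
```

For a flat spectrum only the first coefficient is non-zero; the higher coefficients capture how the log spectrum varies across bands.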
- class aubio.pvoc(win_s=512, hop_s=256)
Phase vocoder.
pvoc creates a callable object that implements a phase vocoder [1], using the tricks detailed in [2].
The call function takes one input of type fvec and of size hop_s, and returns a cvec of length win_s//2+1.
- Parameters
win_s (int) – number of channels in the phase-vocoder.
hop_s (int) – number of samples expected between each call
Examples
>>> x = aubio.fvec(256)
>>> pv = aubio.pvoc(512, 256)
>>> pv(x)
aubio cvec of 257 elements
Default values for hop_s and win_s are provided:
>>> pv = aubio.pvoc()
>>> pv.win_s, pv.hop_s
(512, 256)
A cvec can be resynthesised using rdo():
>>> pv = aubio.pvoc(512, 256)
>>> y = aubio.cvec(512)
>>> x_reconstructed = pv.rdo(y)
>>> x_reconstructed.shape
(256,)
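The analysis side of the call described above (hop_s new samples in, win_s//2+1 bins out) can be pictured as a sliding buffer that is windowed and transformed on every call. The sketch below is a simplified illustration in plain Python with small sizes. ToyAnalysis is a hypothetical class, the Hann window is an arbitrary choice, and the refinements from the cited papers are left out.

```python
import cmath
import math


class ToyAnalysis:
    """Sliding-window analysis: each call takes hop_s samples, returns win_s // 2 + 1 bins."""

    def __init__(self, win_s=16, hop_s=8):
        self.win_s, self.hop_s = win_s, hop_s
        self.buffer = [0.0] * win_s    # the last win_s samples, oldest first
        # Hann window (illustrative choice).
        self.window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_s) for n in range(win_s)]

    def __call__(self, new_samples):
        assert len(new_samples) == self.hop_s
        # Slide the buffer: drop the oldest hop_s samples, append the new ones.
        self.buffer = self.buffer[self.hop_s:] + list(new_samples)
        frame = [s * w for s, w in zip(self.buffer, self.window)]
        # Naive DFT restricted to the non-redundant bins of a real signal.
        return [
            sum(frame[t] * cmath.exp(-2j * math.pi * k * t / self.win_s)
                for t in range(self.win_s))
            for k in range(self.win_s // 2 + 1)
        ]


pv = ToyAnalysis(win_s=16, hop_s=8)
first = pv([1.0] * 8)     # buffer is half zeros, half new samples
second = pv([1.0] * 8)    # buffer is now completely filled
```

Because consecutive frames share win_s - hop_s samples, the object has to keep state between calls, which is why the phase vocoder is created once and then called repeatedly.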
References
[1] James A. Moorer. The use of the phase vocoder in computer music applications. Journal of the Audio Engineering Society, 26(1/2):42–45, 1978.
[2] Amalia de Götzen, Nicola Bernardini, and Daniel Arfib. Traditional (?) implementations of a phase vocoder: the tricks of the trade. In Proceedings of the International Conference on Digital Audio Effects (DAFx-00), pages 37–44, University of Verona, Italy, 2000.
- rdo(fftgrain)
Read a new spectral grain and resynthesise the next hop_s output samples.
- Parameters
fftgrain (cvec) – new input cvec to synthesize from, should be of size win_s/2+1
- Returns
re-synthesised output of shape (hop_s,)
- Return type
fvec
Example
>>> pv = aubio.pvoc(2048, 512)
>>> out = pv.rdo(aubio.cvec(2048))
>>> out.shape
(512,)
- set_window(window_type)
Set window function.
- Parameters
window_type (str) – the window type to use for this phase vocoder
- Raises
ValueError – If an unknown window type was given.
See also
window
create a window.
- hop_s
Interval between two consecutive analyses, in samples.
- Type
int
- win_s
Size of phase vocoder analysis windows, in samples.
- Type
int
- class aubio.specdesc(method='default', buf_size=1024)
Spectral description functions. Creates a callable that takes a cvec as input, typically created by pvoc for overlap and windowing, and returns a single float.
method can be any of the values listed below. If default is used the hfc function will be selected.
Onset novelty functions:
energy: local energy,
hfc: high frequency content,
complex: complex domain,
phase: phase-based method,
wphase: weighted phase deviation,
specdiff: spectral difference,
kl: Kullback-Leibler,
mkl: modified Kullback-Leibler,
specflux: spectral flux.
Spectral shape functions:
centroid: spectral centroid (barycenter of the norm vector),
spread: variance around centroid,
skewness: third order moment,
kurtosis: a measure of the flatness of the spectrum,
slope: decreasing rate of the amplitude,
decrease: perceptual based measurement of the decreasing rate,
rolloff: 95th energy percentile.
- Parameters
method (str) – Onset novelty or spectral shape function.
buf_size (int) – Length of the input frame.
Example
>>> win_s = 1024; hop_s = win_s // 2
>>> pv = aubio.pvoc(win_s, hop_s)
>>> sd = aubio.specdesc("mkl", win_s)
>>> sd(pv(aubio.fvec(hop_s))).shape
(1,)
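Two of the descriptors listed above are simple enough to write out by hand, which may help in interpreting the single float that is returned. The sketch below uses common textbook definitions of the spectral centroid and of high frequency content, applied to a list of bin magnitudes. The function names are hypothetical, and the exact weighting and units used by aubio may differ.

```python
def spectral_centroid(norms):
    """Magnitude-weighted mean bin index (barycenter of the norm vector)."""
    total = sum(norms)
    if total == 0.0:
        return 0.0    # silent frame: no meaningful centroid
    return sum(k * v for k, v in enumerate(norms)) / total


def high_frequency_content(norms):
    """Sum of the magnitudes, each weighted linearly by its bin index."""
    return sum(k * v for k, v in enumerate(norms))


low = [0.0, 4.0, 1.0, 0.0, 0.0]     # energy concentrated in the low bins
high = [0.0, 0.0, 0.0, 1.0, 4.0]    # same energy, moved to the high bins

centroid_low, centroid_high = spectral_centroid(low), spectral_centroid(high)
hfc_low, hfc_high = high_frequency_content(low), high_frequency_content(high)
```

Both values increase when the same energy moves toward higher bins. For onset detection, hfc is useful because attacks tend to add broadband, high-frequency energy from one frame to the next.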
- class aubio.tss(buf_size=1024, hop_size=512)
Transient/Steady-state separation.