SBSMS - Subband Sinusoidal Modeling Synthesis


For time stretching and pitch scaling of audio.


The audio is fed into a FIFO, which takes the STFT of the input. Each frame is high-pass filtered in the fourier domain, and then written to a frame FIFO which does quadratic interpolating peak detection and track continuation. The tracks are resynthesized with a quadratic phase preserving oscillator bank at an arbitrary time scale.

The subbands are fed from the low-pass filtered frames, which are decimated by 2 and reconstructed in a half rate time domain. The subbands perform the same process as the parent band, only the data is at half the audio frequency, and at half the rate. There are typically 6 bands. The point of subbands is to allow high time resolution for high frequencies and at the same time high frequency resolution for low frequencies.

Pitch scaling is performed in a post-processing resampling step.


percussion and rapidly changing bass original reconstructed at original rate stretched x2
vocals+bass+drums original stretched x2
string quartet original stretched x2 pitched +4 half steps
brass ensemble (warning: loud) original pitched -4 half steps sliding stretch