psolabatch

offline psola analysis tool


download _2011


psolabatch is a small batch tool for the analysis stage of psola synthesis (pitch-synchronous overlap-add). It relies on psola_analyse and runs in the Terminal under MacOS 10.5. It might also run on newer MacOS systems. psolabatch contains Max patches for batching psola_analyse.

psola_analyse is an Ircam command-line analysis program that calculates parameters from a sound signal, so that the sound can subsequently be modified. psola_analyse finds:
– period synchronous temporal markers
– fundamental frequency
– voiced and unvoiced zones
– transitory zones

The sound modifications can then include, for instance, extreme time stretching and clean transposition with the spectral envelope handled separately. The usual easy demos are:
– change of gender by changing the spectral envelope (respect)
– voice pitch correction
– prosody alterations
– addition of artificial vibrato or removal of vibrato
– choir or chorus effect by slightly transposing and stretching a single voice
– voice alignment

A sound that analyzes “properly” should be of the “tension/release” type, some kind of saw-ramp signal: voice, strings, brass, etc. A continuous, thin sound like a sine wave, or noise, would not work. Reverberated sounds or recordings with crosstalk between microphones would not work either, since the fundamental frequency would not be found accurately. Although they “filter” more of the voice quality, cardioid microphones could work better for that kind of capture. That said, I have had good results in production with omnidirectional DPAs for voice.

Rabiner and Schafer in 1978 put forth an alternate solution that works in the time domain: attempt to find the period (or equivalently the fundamental frequency) of a given section of the wave using some pitch detection algorithm (commonly the peak of the signal’s autocorrelation, or sometimes cepstral processing), and crossfade one period into another.
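As an illustration of the autocorrelation approach mentioned above, here is a minimal Python sketch that picks the lag with the highest normalized autocorrelation. It is not the method used by psola_analyse or pm2; the function name `estimate_period`, the lag bounds and the synthetic test signal are all assumptions made for the example.

```python
import math

def estimate_period(samples, min_lag, max_lag):
    """Return the lag (in samples) with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        if n <= 0:
            break
        # Correlate the signal with a copy of itself delayed by `lag` samples.
        num = sum(samples[i] * samples[i + lag] for i in range(n))
        energy_a = sum(samples[i] ** 2 for i in range(n))
        energy_b = sum(samples[i + lag] ** 2 for i in range(n))
        denom = math.sqrt(energy_a * energy_b)
        score = num / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical test signal: a saw-like tone at 200 Hz built from 5 harmonics.
sample_rate = 8000
f0 = 200.0
signal = [
    sum(math.sin(2 * math.pi * f0 * k * n / sample_rate) / k for k in range(1, 6))
    for n in range(800)
]

# Search lags between 20 and 60 samples (400 Hz down to about 133 Hz).
period = estimate_period(signal, 20, 60)
estimated_f0 = sample_rate / period
print(period, estimated_f0)
```

The lag bounds matter: if the search range included twice the true period, that lag would correlate just as well, which is one way an autocorrelation tracker ends up an octave off.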

This is called time domain harmonic scaling or the synchronized overlap-add method (SOLA) and performs somewhat faster than the phase vocoder on slower machines but fails when the autocorrelation mis-estimates the period of a signal with complicated harmonics (such as orchestral pieces). But when the fundamental is clean, the re-synthesis quality is quite astonishing.
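The overlap-add step can be sketched in a deliberately simplified form. The Python example below assumes a constant, already known period and evenly spaced markers, whereas a real psola analysis supplies period-synchronous markers that follow the pitch. The names `hann` and `psola_stretch` are hypothetical and do not come from any Ircam tool.

```python
import math

def hann(length):
    """Periodic Hann window; copies shifted by length/2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / length) for i in range(length)]

def psola_stretch(samples, period, stretch):
    """Time-stretch by repeating two-period grains (constant period assumed)."""
    # Analysis markers: one per period, leaving room for a full grain.
    marks = list(range(period, len(samples) - period + 1, period))
    window = hann(2 * period)
    n_out = int(round(len(marks) * stretch))
    out = [0.0] * ((n_out + 1) * period)
    for k in range(n_out):
        # Map each synthesis marker back to an analysis marker.
        src = marks[min(len(marks) - 1, int(k / stretch))]
        start = k * period
        # Window a two-period grain and overlap-add it at the output position.
        for i in range(2 * period):
            out[start + i] += samples[src - period + i] * window[i]
    return out

# Hypothetical input: 10 periods of a 40-sample waveform with two harmonics.
signal = [
    math.sin(2 * math.pi * n / 40) + 0.5 * math.sin(4 * math.pi * n / 40)
    for n in range(400)
]
stretched = psola_stretch(signal, 40, 2.0)
print(len(signal), len(stretched))
```

Because the grains are spaced one period apart in the output, the pitch is preserved while the duration changes. Spacing them closer or further apart would transpose the sound instead.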

It provides the most coherent results for single-pitched sounds like voice or musically monophonic instrument recordings.
High-end commercial audio processing packages either combine ola techniques with spectral approaches (for example, by separating the signal into sinusoid and transient waveforms) or use other techniques based on the wavelet transform or artificial neural network processing, which can also produce the highest-quality time stretching.

For more information about SOLA and more specifically Sinola, you can read the PhD thesis of Geoffroy Peeters. Wsola is another overlap-add technique based on waveform similarity. There is also picola, tdhs (Time Domain Harmonic Scaling) or Solafs.

The first famous commercial use was probably in 1986 with the Lexicon 2400. This unit detects transients and leaves them out of the overlap-add process. The first time I personally heard an application of psola was in Philippe Manoury’s opera K in 2001, with Serge Lemouton. It is now widely used in software like Melodyne and maybe Auto-Tune. For some interesting pieces, you can listen to Jay-Z’s Death of Auto-Tune or How I Roll, by Britney Spears’s producer Christian Karlsson.

Several MaxMSP externals make use of this synthesis, for instance psych~ and psychoitrist~, distributed by the Ircam forum, or shifter~ from Tristan Jehan. Those externals do the analysis in realtime. Others, such as pagsolo~ and pagsensemble~ from Norbert Schnell, use an offline analysis instead. psola_analyse performs the required analysis and saves it to an SDIF file.

JT Rinker and I used it quite intensively, and we found the tools rather difficult to install and use properly. psolabatch is a small package containing the pm2 and psola_analyse executables with some explanation of how to install and run them. The package also contains a Max patch that generates a batch file from a folder of sounds ready for analysis. The bundled pm2 and psola_analyse are the versions normally distributed by Ircam, and they run from the Terminal on MacOS 10.5. They might also run on a newer MacOS system.
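For readers without Max, the idea of generating a batch file from a folder of sounds can be sketched in Python. This is not the patch shipped in the package. The function `build_batch_lines`, the list of extensions and above all the command template are assumptions: `ANALYSIS_COMMAND {input} {output}` is a placeholder, and the actual psola_analyse arguments have to be taken from its own documentation.

```python
import shlex
import tempfile
from pathlib import Path

# Assumed set of sound file extensions; adjust to your material.
SOUND_EXTENSIONS = {".aif", ".aiff", ".wav"}

def build_batch_lines(folder, command_template):
    """Build one shell command line per sound file found in `folder`."""
    lines = ["#!/bin/sh"]
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in SOUND_EXTENSIONS:
            # Name the analysis file after the sound, with an .sdif extension.
            output = path.with_suffix(".sdif")
            lines.append(
                command_template.format(
                    input=shlex.quote(str(path)),
                    output=shlex.quote(str(output)),
                )
            )
    return lines

# Demo on a temporary folder holding two sounds and one unrelated file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.aif", "b.wav", "notes.txt"):
        (Path(folder) / name).touch()
    # Placeholder template, NOT the real psola_analyse syntax.
    batch_lines = build_batch_lines(folder, "ANALYSIS_COMMAND {input} {output}")

print("\n".join(batch_lines))
```

Passing the command as a template keeps the file listing separate from the invocation syntax, so the same loop can serve for pm2 or any other command-line analysis once the correct arguments are known.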

psola_analyse tracks the fundamental frequency using pm2, a program dedicated to the analysis and synthesis of sinusoidal components. Finding the right fundamental frequency is essential. Another way to track the f0 is sometimes necessary; in that case one can use yin, or supervp or its control interface audiosculpt, and save the fundamental as an SDIF file. That file can then be imported by psola_analyse. The fundamental curve can be corrected by hand with audiosculpt. supervp analyses the F0, the VUF (voiced/unvoiced frequency), the FFT and the spectral envelope.

audiosculpt, supervp, psych~, psychoitrist~, pagsolo~ and pagsensemble~ are not included.