andrewslotnick.com


Audio Delay with Python

Notebook source code

Delay is a fundamental audio effect. The idea of playing a sound and repeating it after some time is simple, but it is used extensively in music production as a standalone effect and as the basis of reverb, chorus, and flanging. In this post I will implement a version of delay to test out audio support in the standard library and to better understand how the effect works.

To implement a delay function we'll need to read and write audio data, add two audio signals together, change a signal's volume, and create empty space at the end of a signal. The final version of the function will be able to generate the following sound files (and many more) based on the input.

In [1]:
from IPython.display import Audio, display, HTML
s3 = 'https://s3.amazonaws.com/audio-experiments/examples/'

#display audio object and text link for browsers not compatible with <audio>
def display_link_audio(file):
    return display(HTML('<a href='+s3+file+'>'+file+'</a>'),
           Audio(url=s3+file, embed=False))

print("Input:")
display_link_audio('Trumpet.wav')  
print("\nOutputs with delay:")
display_link_audio('Trumpet_delay_700ms.wav')
display_link_audio('Trumpet_delay_1ms_0.75f_1n.wav')
display_link_audio('Trumpet_delay_250ms_0.7f_10n.wav')
Input:
Outputs with delay:

(If you just want to hear some delayed audio, skip straight to the Examples section)

Get some audio

The wave module handles input and output of WAV files. This uncompressed format consists of many samples, each of which is an amplitude measurement of the sound wave. Sample data is returned as a bytes object, which makes this standard library module less convenient than other modules that return a numpy array (e.g. audiolab).

To work with the data in the bytes object we'll need to know the number of bytes per sample and the sample frequency. For simplicity these examples only support mono audio, so the nchannels parameter will need to equal 1.

In [2]:
import wave

def input_wave(filename,frames=10000000): #10000000 is an arbitrary large number of frames
    with wave.open(filename,'rb') as wave_file:
        params=wave_file.getparams()
        audio=wave_file.readframes(frames)  
        if params.nchannels!=1:
            raise Exception("The input audio should be mono for these examples")
    return params, audio

#output to file so we can use ipython notebook's Audio widget
def output_wave(audio, params, stem, suffix):
    #dynamically format the filename by passing in data
    filename=stem.replace('.wav','_{}.wav'.format(suffix))
    with wave.open(filename,'wb') as wave_file:
        wave_file.setparams(params)
        wave_file.writeframes(audio)
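
If you don't have a WAV file handy, a test clip can be generated with the same module. The following is a minimal sketch, not part of the original notebook: `write_test_tone`, the 440 Hz tone, and the temporary file path are all assumptions made for illustration. It writes a mono 16-bit sine tone and reads it back the same way `input_wave` does.

```python
import math
import os
import struct
import tempfile
import wave

def write_test_tone(path, freq_hz=440.0, seconds=0.5, framerate=44100):
    """Hypothetical helper: write a mono 16-bit sine tone to 'path'."""
    nframes = int(seconds * framerate)
    # half of full scale, so later experiments have some headroom
    samples = (int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / framerate))
               for n in range(nframes))
    # '<h' packs each sample as a little-endian signed 16-bit integer
    audio = b''.join(struct.pack('<h', s) for s in samples)
    with wave.open(path, 'wb') as wave_file:
        wave_file.setnchannels(1)
        wave_file.setsampwidth(2)
        wave_file.setframerate(framerate)
        wave_file.writeframes(audio)
    return audio

tone_path = os.path.join(tempfile.mkdtemp(), 'Tone.wav')
written = write_test_tone(tone_path)

# read it back: the bytes on disk are exactly the bytes we packed
with wave.open(tone_path, 'rb') as wave_file:
    tone_params = wave_file.getparams()
    tone_bytes = wave_file.readframes(wave_file.getnframes())

print(tone_params.nchannels, tone_params.sampwidth, tone_params.framerate)
print(tone_bytes == written)
```

Reading with `getnframes()` avoids having to guess a frame count that is large enough.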

The following short mono audio clips will be used to demonstrate the delay effect. One is a trumpet solo and the other is the sound of a punch.

In [3]:
trumpet_params, trumpet_bytes = input_wave('wavs/Trumpet.wav')
punch_params,punch_bytes = input_wave('wavs/Punch.wav')

print("Bytes per sample: {}".format(trumpet_params.sampwidth), 
      "Samples per second: {}".format(trumpet_params.framerate),
      "First 10 bytes:", trumpet_bytes[:10], sep='\n')
display_link_audio('Trumpet.wav')

print("Bytes per sample: {}".format(punch_params.sampwidth), 
      "Samples per second: {}".format(punch_params.framerate),
      "First 10 bytes:", punch_bytes[:10], sep='\n')
display_link_audio('Punch.wav')
Bytes per sample: 3
Samples per second: 44100
First 10 bytes:
b'\x00\x00\x00\x13\x00\x00\x1f\x00\x005'
Bytes per sample: 2
Samples per second: 44100
First 10 bytes:
b'\x00\x00\x00\x00\x00\x00\x00\x00\xfe\xff'
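
To see what those bytes mean, they can be grouped by sample width and decoded as little-endian signed integers. This is a small sketch with a hypothetical helper, `decode_samples`; it assumes a sample width of 2 bytes or more, since 8-bit WAV data is stored unsigned.

```python
def decode_samples(audio_bytes, sampwidth):
    """Hypothetical helper: decode little-endian signed PCM bytes into ints.

    Assumes sampwidth >= 2. Trailing bytes that do not make up a whole
    sample are ignored.
    """
    usable = len(audio_bytes) - len(audio_bytes) % sampwidth
    return [int.from_bytes(audio_bytes[i:i + sampwidth], 'little', signed=True)
            for i in range(0, usable, sampwidth)]

# the first 10 bytes printed above
trumpet_head = b'\x00\x00\x00\x13\x00\x00\x1f\x00\x005'
punch_head = b'\x00\x00\x00\x00\x00\x00\x00\x00\xfe\xff'

print(decode_samples(trumpet_head, 3))  # [0, 19, 31] (the 10th byte is half a sample)
print(decode_samples(punch_head, 2))    # [0, 0, 0, 0, -2]
```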

Implement a delay function

The simplest delay function will create a copy of the input, add some silence (0's in the bytes object) to the beginning of the copy, and combine it with the original input. The add function from audioop will add the two bytes objects together. audioop.add requires both pieces of audio to have the same length, so we also need to cut off the end of the copy.

In [4]:
from audioop import add

def delay(audio_bytes,params,offset_ms):
    """version 1: delay after 'offset_ms' milliseconds"""
    #calculate the number of bytes which corresponds to the offset in milliseconds
    #(multiply before truncating so 44100 Hz counts as 44.1 samples per ms, not 44)
    offset= params.sampwidth*int(offset_ms*params.framerate/1000)
    #create some silence
    beginning= b'\0'*offset
    #remove space from the end
    end= audio_bytes[:-offset]
    return add(audio_bytes, beginning+end, params.sampwidth)
In [5]:
#1-second delay
output_wave(delay(trumpet_bytes,trumpet_params,1000), 
            trumpet_params, 'wavs/Trumpet.wav','delay_1000ms')
#250 ms delay
output_wave(delay(punch_bytes,punch_params,250), 
            punch_params, 'wavs/Punch.wav','delay_250ms')

display_link_audio('Trumpet_delay_1000ms.wav')
display_link_audio('Punch_delay_250ms.wav')
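
The conversion from milliseconds to a byte offset is worth checking by hand, because the order of the arithmetic matters. Here is a rough sketch with a hypothetical helper, `ms_to_byte_offset`. It compares multiplying first and truncating last against truncating `framerate/1000` to a whole number of samples per millisecond before multiplying.

```python
def ms_to_byte_offset(offset_ms, framerate, sampwidth):
    """Hypothetical helper: byte offset for a delay of 'offset_ms' milliseconds."""
    # multiply first, truncate to a whole number of samples last
    samples = int(offset_ms * framerate / 1000)
    # whole samples times bytes per sample keeps the offset sample-aligned
    return samples * sampwidth

# 1000 ms of 24-bit (3-byte) audio at 44100 Hz
exact = ms_to_byte_offset(1000, 44100, 3)
# truncating 44.1 samples/ms down to 44 before multiplying
truncated_early = 3 * 1000 * int(44100 / 1000)

print(exact)                           # 132300
print(truncated_early)                 # 132000
print((exact - truncated_early) // 3)  # 100 samples short, about 2.3 ms
```

The difference is inaudible for an echo, but it is about 0.2% of the delay time, which starts to matter once the delay is short enough to be heard as pitch.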

Change the delayed audio's volume

To make this sound more like a realistic echo, we can change the volume of the delayed audio by multiplying it using audioop.mul. Note that multiplying each sample by one half is not the same as reducing the perceived loudness by one half.

In [6]:
from audioop import mul
#new delay function with factor
def delay(audio_bytes,params,offset_ms,factor=1):
    """version 2: delay after 'offset_ms' milliseconds amplified by 'factor'"""
    #calculate the number of bytes which corresponds to the offset in milliseconds
    #(multiply before truncating so 44100 Hz counts as 44.1 samples per ms, not 44)
    offset= params.sampwidth*int(offset_ms*params.framerate/1000)
    #create some silence
    beginning= b'\0'*offset
    #remove space from the end
    end= audio_bytes[:-offset]
    #multiply by the factor
    multiplied_end= mul(end,params.sampwidth,factor)
    return add(audio_bytes, beginning+ multiplied_end, params.sampwidth)
In [7]:
#1-second delay with factor .5
output_wave(delay(trumpet_bytes,trumpet_params,offset_ms=1000, factor=0.5),
            trumpet_params, 'wavs/Trumpet.wav','delay_1000ms_0.5f')
#250 ms delay with factor .25
output_wave(delay(punch_bytes,punch_params,offset_ms=250, factor=0.25),
            punch_params, 'wavs/Punch.wav','delay_250ms_0.25f')

display_link_audio('Trumpet_delay_1000ms_0.5f.wav')
display_link_audio('Punch_delay_250ms_0.25f.wav')
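
One way to get a feel for the factor is to convert it to decibels. The sketch below uses the standard amplitude formula, 20·log10(factor); the helper names are my own. The idea that a drop of about 10 dB sounds roughly "half as loud" is a commonly quoted rule of thumb that I'm assuming here, not something measured in this post.

```python
import math

def factor_to_db(factor):
    """Amplitude factor -> change in level, in decibels."""
    return 20 * math.log10(factor)

def db_to_factor(db):
    """Change in level, in decibels -> amplitude factor."""
    return 10 ** (db / 20)

print(round(factor_to_db(0.5), 2))   # -6.02
print(round(factor_to_db(0.25), 2))  # -12.04
# factor for a 10 dB drop (the assumed "half as loud" rule of thumb)
print(round(db_to_factor(-10), 3))   # 0.316
```

So a factor of 0.5 lowers the level by about 6 dB, while under that rule of thumb a repeat that sounds half as loud would need a factor closer to 0.3.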

Multiple delays

Another enhancement to the delay function is to allow for multiple repeats. Each time a repeat occurs, the volume will get progressively louder or softer based on the factor.

In [8]:
from warnings import warn

def delay(audio_bytes,params,offset_ms,factor=1,num=1):
    """version 3: 'num' delays after 'offset_ms' milliseconds amplified by 'factor'"""
    if factor>=1:
        warn("These settings may produce a very loud audio file. "
             "Please use caution when listening")
    #calculate the number of bytes which corresponds to the offset in milliseconds
    #(multiply before truncating so 44100 Hz counts as 44.1 samples per ms, not 44)
    offset=params.sampwidth*int(offset_ms*params.framerate/1000)
    #start from a copy of the original and mix each delay into it
    delayed_bytes=audio_bytes
    for i in range(num):
        #create some silence
        beginning = b'\0'*offset*(i+1)
        #remove space from the end
        end = audio_bytes[:-offset*(i+1)]
        #multiply by the factor
        multiplied_end= mul(end,params.sampwidth,factor**(i+1))
        delayed_bytes= add(delayed_bytes, beginning+multiplied_end, params.sampwidth)
    return delayed_bytes
In [9]:
#1-second delay with factor .5, 3 repeats
output_wave(delay(trumpet_bytes,trumpet_params,offset_ms=1000, factor=0.5, num=3),
            trumpet_params, 'wavs/Trumpet.wav','delay_1000ms_0.5f_3n')
#250 ms delay with factor .7, 4 repeats
output_wave(delay(punch_bytes, punch_params,offset_ms=250, factor=0.7, num=4),
            punch_params, 'wavs/Punch.wav','delay_250ms_0.7f_4n')

display_link_audio('Trumpet_delay_1000ms_0.5f_3n.wav')
display_link_audio('Punch_delay_250ms_0.7f_4n.wav')
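
The same logic can be written against a plain list of integer samples, which makes the gain of each repeat easier to see and also runs on Python 3.13 and later, where audioop is no longer part of the standard library. This is a sketch under my own assumptions, not a drop-in replacement: `delay_samples` is a hypothetical name, the offset is given in samples rather than milliseconds, sums are clipped to the sample range, and rounding may differ slightly from audioop.mul.

```python
def delay_samples(samples, offset, factor=1.0, num=1, sampwidth=2):
    """Hypothetical list-based delay: output has the same length as the input."""
    if offset <= 0:
        raise ValueError("offset must be a positive number of samples")
    # range of a signed sample that is 'sampwidth' bytes wide
    hi = 2 ** (8 * sampwidth - 1) - 1
    lo = -hi - 1
    out = list(samples)
    for i in range(1, num + 1):
        gain = factor ** i   # repeat i is scaled by factor**i
        shift = offset * i   # and starts i offsets after the original
        for n in range(shift, len(samples)):
            mixed = out[n] + int(samples[n - shift] * gain)
            out[n] = max(lo, min(hi, mixed))  # clip instead of wrapping around
    return out

# a single click followed by silence shows each repeat on its own
impulse = [1000] + [0] * 7
print(delay_samples(impulse, offset=2, factor=0.5, num=3))
# [1000, 0, 500, 0, 250, 0, 125, 0]
```

Unlike the bytes version, repeats that would start after the end of the input are skipped here without raising an error.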

Leave space at the end

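The multi-delays above output a sound of the same length as the original, so asking for more repeats than fit inside the clip will raise an exception: the silence at the beginning becomes longer than the original, and audioop.add refuses to add two inputs of different lengths. To give more time to hear the effects, let's increase the length to allow every repeat to finish.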

In [10]:
def delay(audio_bytes,params,offset_ms,factor=1,num=1):
    """version 4: 'num' delays after 'offset_ms' milliseconds amplified by 'factor' 
    with additional space"""
    if factor>=1:
        warn("These settings may produce a very loud audio file. "
             "Please use caution when listening")
    #calculate the number of bytes which corresponds to the offset in milliseconds
    #(multiply before truncating so 44100 Hz counts as 44.1 samples per ms, not 44)
    offset=params.sampwidth*int(offset_ms*params.framerate/1000)
    #add extra space at the end for the delays
    audio_bytes=audio_bytes+b'\0'*offset*(num)
    #create a copy of the original to apply the delays
    delayed_bytes=audio_bytes
    for i in range(num):
        #create some silence
        beginning = b'\0'*offset*(i+1)
        #remove space from the end
        end = audio_bytes[:-offset*(i+1)]
        #multiply by the factor
        multiplied_end= mul(end,params.sampwidth,factor**(i+1))
        delayed_bytes= add(delayed_bytes, beginning+multiplied_end, params.sampwidth)
    return delayed_bytes
In [11]:
#3-second delay with factor .5, 3 repeats
output_wave(delay(trumpet_bytes,trumpet_params,offset_ms=3000, factor=0.5, num=3),
            trumpet_params, 'wavs/Trumpet.wav','delay_3000ms_0.5f_3n')
#500 ms delay with factor .7, 6 repeats
output_wave(delay(punch_bytes, punch_params,offset_ms=500, factor=0.7, num=6),
            punch_params, 'wavs/Punch.wav','delay_500ms_0.7f_6n')

display_link_audio('Trumpet_delay_3000ms_0.5f_3n.wav')
display_link_audio('Punch_delay_500ms_0.7f_6n.wav')
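
The length arithmetic is simple enough to check with a few numbers. The clip below is hypothetical (2 seconds of 16-bit mono audio at 44100 Hz), since the real lengths of the trumpet and punch files aren't printed anywhere in this post. The sketch works out how many repeats fit without padding and how long the padded output is.

```python
framerate, sampwidth = 44100, 2

# hypothetical 2-second clip, length in bytes
input_len = 2 * framerate * sampwidth
# 250 ms expressed as a byte offset
offset = int(250 * framerate / 1000) * sampwidth

# without padding, a repeat only fits while offset*(i+1) <= input_len
max_repeats_unpadded = input_len // offset

# with padding, one extra offset of silence is appended per repeat
num = 4
padded_len = input_len + offset * num
padded_seconds = padded_len / (framerate * sampwidth)

print(input_len, offset)     # 176400 22050
print(max_repeats_unpadded)  # 8
print(padded_seconds)        # 3.0
```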

Examples

The best way to understand how different parameters affect the final sound is to try out a lot of examples. delay_to_file is a helper function to speed up this process.

In [12]:
#helper function to try out lots of delays
def delay_to_file(audio_bytes, params, offset_ms, file_stem, factor=1, num=1):
    echoed_bytes=delay(audio_bytes, params, offset_ms, factor,num)
    output_wave(echoed_bytes, params, file_stem,
                'delay_{}ms_{}f_{}n'.format(offset_ms,factor,num))

A delay of 1-2 seconds is too long to sound like a natural echo, so delays in this range are better suited to a deliberate musical effect.

In [13]:
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=1000,file_stem='wavs/Trumpet.wav',
              factor=.5)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=2000,file_stem='wavs/Trumpet.wav',
              factor=.9, num=10)

display_link_audio('Trumpet_delay_1000ms_0.5f_1n.wav')
display_link_audio('Trumpet_delay_2000ms_0.9f_10n.wav')

Adding 250-400 ms of delay with a low enough decay factor starts to give the impression of a natural echo that might occur in a larger physical space.

In [14]:
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=400, file_stem='wavs/Trumpet.wav',
              factor=.5, num=5)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=400, file_stem='wavs/Trumpet.wav',
              factor=.25)
delay_to_file(punch_bytes,punch_params, offset_ms=400, file_stem='wavs/Punch.wav',
              factor=.5, num=10)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=250, file_stem='wavs/Trumpet.wav',
              factor=.5, num=4)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=250, file_stem='wavs/Trumpet.wav',
              factor=.25, num=3)
delay_to_file(punch_bytes,punch_params, offset_ms=250, file_stem='wavs/Punch.wav',
              factor=.5, num=10)

display_link_audio('Trumpet_delay_400ms_0.5f_5n.wav')
display_link_audio('Trumpet_delay_400ms_0.25f_1n.wav')
display_link_audio('Punch_delay_400ms_0.5f_10n.wav')
display_link_audio('Trumpet_delay_250ms_0.5f_4n.wav')
display_link_audio('Trumpet_delay_250ms_0.25f_3n.wav')
display_link_audio('Punch_delay_250ms_0.5f_10n.wav')

As delays get shorter, around 125 ms or less, the effect starts to be perceived as a single sound rather than a distinct echo. This produces the effect of a comb filter as certain frequencies get louder or softer due to interference. On the trumpet sample, 1-5 ms delay sounds a lot like a trumpet played through a mute, especially with a high number of delays.

In [15]:
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=125, file_stem='wavs/Trumpet.wav',
              factor=.5)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=75, file_stem='wavs/Trumpet.wav', 
              factor=.5)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=20, file_stem='wavs/Trumpet.wav', 
              factor=.75)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=5, file_stem='wavs/Trumpet.wav', 
              factor=.75)
delay_to_file(trumpet_bytes,trumpet_params, offset_ms=1, file_stem='wavs/Trumpet.wav', 
              factor=.75, num=20)

display_link_audio('Trumpet_delay_125ms_0.5f_1n.wav')
display_link_audio('Trumpet_delay_75ms_0.5f_1n.wav')
display_link_audio('Trumpet_delay_20ms_0.75f_1n.wav')
display_link_audio('Trumpet_delay_5ms_0.75f_1n.wav')
display_link_audio('Trumpet_delay_1ms_0.75f_20n.wav')     
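
The comb filter can be made concrete for the simplest case, a single repeat. Adding a copy delayed by τ seconds and scaled by the factor g multiplies each frequency f by |1 + g·e^(-j2πfτ)|. The sketch below evaluates that expression; `comb_gain` is my own helper name, and it assumes num=1, so it only approximates the multi-repeat examples.

```python
import cmath
import math

def comb_gain(freq_hz, delay_ms, factor):
    """Gain at 'freq_hz' after adding one repeat delayed by 'delay_ms'."""
    tau = delay_ms / 1000  # delay in seconds
    return abs(1 + factor * cmath.exp(-2j * math.pi * freq_hz * tau))

# 1 ms delay, factor 0.75: peaks at multiples of 1000 Hz, notches halfway between
for f in (0, 500, 1000, 1500, 2000):
    print(f, round(comb_gain(f, delay_ms=1, factor=0.75), 3))
# 0 1.75
# 500 0.25
# 1000 1.75
# 1500 0.25
# 2000 1.75
```

The peaks and notches are spaced 1/τ apart, so shortening the delay spreads them out across the spectrum and changes the timbre more than it adds an audible echo.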

Really interesting sounds can be produced below 125 ms of delay by using many repeats. The trumpet starts to sound like an entirely different instrument.

Since the punch sound is so short, we can actually use delay to produce a tone. For example, applying a delay of 2 ms many times with almost no decay will result in a fading 500 Hz note, since we're effectively playing the same punch sound 500 times per second.

In [16]:
delay_to_file(trumpet_bytes, trumpet_params, offset_ms=50, file_stem='wavs/Trumpet.wav',
              factor=.76,num=10)
delay_to_file(trumpet_bytes, trumpet_params, offset_ms=125, file_stem='wavs/Trumpet.wav',
              factor=.7,num=10)
delay_to_file(trumpet_bytes, trumpet_params, offset_ms=20, file_stem='wavs/Trumpet.wav',
              factor=.97,num=300)
delay_to_file(punch_bytes, punch_params, offset_ms=2, file_stem='wavs/Punch.wav',
              factor=.996, num=3000)

display_link_audio('Trumpet_delay_50ms_0.76f_10n.wav')
display_link_audio('Trumpet_delay_125ms_0.7f_10n.wav')
display_link_audio('Trumpet_delay_20ms_0.97f_300n.wav')
display_link_audio('Punch_delay_2ms_0.996f_3000n.wav')
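
The numbers behind the punch example are easy to check. This sketch (the helper names are mine) computes the pitch implied by a repeat interval and how many repeats it takes for the factor to bring the level down to half its starting amplitude.

```python
import math

def repeat_pitch_hz(offset_ms):
    """Repeating a sound every 'offset_ms' milliseconds gives this frequency."""
    return 1000 / offset_ms

def repeats_to_half(factor):
    """Number of repeats n at which factor**n reaches 0.5 (for 0 < factor < 1)."""
    return math.log(0.5) / math.log(factor)

print(repeat_pitch_hz(2))             # 500.0
half = repeats_to_half(0.996)
print(round(half))                    # 173 repeats
print(round(half * 2 / 1000, 2))      # 0.35 seconds at 2 ms per repeat
# after all 3000 repeats (6 seconds) the last copy is essentially silent
print(0.996 ** 3000 < 1e-5)           # True
```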

Next Steps

Now that I've implemented a simple delay function, I'd like to make a few enhancements.

This post proved that it's possible to use the standard library to do audio I/O and effects, but next time I will use audiolab and numpy arrays instead of bytes objects. This will make it much easier and more efficient to add and multiply.

I would also like to build on this delay function to build higher-level effects. Varying the amount of delay applied to an audio clip instead of using a fixed offset will produce a flanging effect. And combining multiple delays with different parameters will produce reverb.
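
As a rough idea of where the flanging experiment might start, here is a sketch of a single repeat whose delay sweeps slowly over time. Everything in it is an assumption on my part: the name `flange`, the parameter defaults, the sine-shaped sweep, and the use of a list of floats instead of bytes. Practical flangers usually interpolate between samples and often add feedback, and this sketch does neither.

```python
import math

def flange(samples, framerate, max_delay_ms=5.0, rate_hz=0.5, factor=0.7):
    """Hypothetical flanger sketch: one repeat with a slowly varying delay."""
    max_delay = max_delay_ms * framerate / 1000  # longest delay, in samples
    out = []
    for n, dry in enumerate(samples):
        # the delay follows a slow sine wave between 0 and max_delay samples
        sweep = 0.5 * (1 + math.sin(2 * math.pi * rate_hz * n / framerate))
        d = int(round(sweep * max_delay))
        # before the delayed copy has started, treat it as silence
        wet = samples[n - d] if n - d >= 0 else 0.0
        out.append(dry + factor * wet)
    return out

# try it on a short 440 Hz test tone at a quarter of full scale
framerate = 44100
tone = [0.25 * math.sin(2 * math.pi * 440 * n / framerate) for n in range(4410)]
flanged = flange(tone, framerate)
print(len(flanged) == len(tone))
```

With a fixed delay this reduces to the single-repeat delay from earlier in the post; the moving delay is what makes the comb filter's notches sweep up and down.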

I'm not satisfied with the way the "factor" parameter is used in the delay function, so it would be great to model decay in a way more similar to a natural sound.

Another interesting problem will be applying delay to a live signal, rather than as a post-processing effect.

And of course it will be nice to try this function with some new audio clips, as I'm getting really sick of that trumpet solo.

Thanks for reading!