Audio I/O: Buffering, Latency, and Throughput

Audio Toolbox™ is optimized for real-time stream processing. Its input and output System objects are efficient and low-latency, and they give you control over the parameters you need to trade off throughput against latency.

This tutorial describes how MATLAB® software implements real-time stream processing. The tutorial presents key terminology and basic techniques for optimizing your stream-processing algorithm. For more detailed technical descriptions and concepts, see the documentation for the audio I/O System objects used in this tutorial.

The concepts presented in this tutorial are described in terms of System objects in the MATLAB environment. The same concepts can be applied to corresponding blocks in the Simulink® environment.

Input Audio Stream

To acquire an audio stream from a file, use the dsp.AudioFileReader System object™. To acquire an audio stream from a device, use the audioDeviceReader System object.

This diagram and the description that follows indicate the data flow when acquiring a monochannel signal with the audioDeviceReader System object.

Configuration

Properties of your audioDeviceReader specify the driver, device (sound card), sample rate, bit depth, buffer size, and channel mapping between your device's input channels and columns output from your audioDeviceReader object. Your object communicates these specifications to the driver once at setup.

Real-Time Processing Loop

The microphone picks up the sound and sends a continuous electrical signal to your sound card.
The sound card performs analog-to-digital conversion at a sample rate, buffer size, and bit depth specified during configuration.
The analog-to-digital converter writes audio samples into the sound card buffer. If the buffer is full, the new samples are dropped. This is referred to as overrun.
The audioDeviceReader uses the driver to pull the oldest frame from the sound card buffer iteratively.
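
The acquisition loop described above can be sketched as follows. This is a minimal example, not a reference implementation: the sample rate, frame length, and recording duration are assumed values, and the code relies on the default driver and input device being available on your machine.

```matlab
% Assumed settings for illustration; adjust them to your hardware.
fs          = 44100;   % sample rate in Hz
frameLength = 1024;    % samples returned by each call to the reader
duration    = 3;       % seconds of audio to acquire

deviceReader = audioDeviceReader('SampleRate', fs, ...
                                 'SamplesPerFrame', frameLength);
setup(deviceReader);   % communicate the configuration to the driver once

numFrames    = ceil(duration * fs / frameLength);
totalOverrun = 0;
for k = 1:numFrames
    % Pull the oldest frame from the device buffer. The second output
    % reports the samples dropped since the previous call.
    [frame, numOverrun] = deviceReader();
    totalOverrun = totalOverrun + double(numOverrun);
end

release(deviceReader); % free the device
fprintf('Samples overrun: %d\n', totalOverrun);
```

A nonzero total indicates that the loop did not pull frames as fast as the sound card produced them.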

Output Audio Stream

To send an audio stream to a file, use the dsp.AudioFileWriter System object. To send an audio stream to a device, use the audioDeviceWriter System object.

This diagram and the description that follows indicate the data flow when playing a monochannel signal with the audioDeviceWriter System object.

Configuration

Properties of your audioDeviceWriter specify the driver, device (sound card), sample rate, bit depth, buffer size, and channel mapping between your device's output channels and columns input to your audioDeviceWriter object. Your object communicates these specifications to the driver once at setup.

Real-Time Processing Loop

The processing stage passes a frame of variable length to the audioDeviceWriter System object.
audioDeviceWriter sends the frame to the sound card’s buffer.
The sound card pulls the oldest frame from the buffer and performs digital-to-analog conversion. The sound card sends the analog chunk to the speaker. If the buffer is empty when the sound card tries to pull from it, the sound card outputs a region of silence. This is referred to as underrun.
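
The playback loop described above can be sketched as follows. This is a minimal example under stated assumptions: the test tone, its level, the sample rate, and the frame length are illustrative choices, and the default driver and output device are assumed to be available.

```matlab
% Assumed settings for illustration; adjust them to your hardware.
fs          = 44100;   % sample rate in Hz
frameLength = 512;     % samples passed to the writer on each call
toneHz      = 440;     % frequency of the test tone
duration    = 2;       % seconds of audio to play

deviceWriter = audioDeviceWriter('SampleRate', fs);
setup(deviceWriter, zeros(frameLength, 1));  % configure the driver once

numFrames     = ceil(duration * fs / frameLength);
totalUnderrun = 0;
n = (0:frameLength-1).';
for k = 1:numFrames
    % Processing stage: generate the next frame of a quiet sine tone.
    t     = ((k-1)*frameLength + n) / fs;
    frame = 0.2 * sin(2*pi*toneHz*t);

    % Send the frame to the device buffer. The output reports the
    % samples of silence inserted since the previous call.
    numUnderrun   = deviceWriter(frame);
    totalUnderrun = totalUnderrun + double(numUnderrun);
end

release(deviceWriter); % free the device
fprintf('Samples underrun: %d\n', totalUnderrun);
```

A nonzero total indicates that the loop did not supply frames as fast as the sound card consumed them.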

Synchronize Audio to and from Device

To simultaneously read from and write to a single audio device, use the audioPlayerRecorder System object.

This diagram and the description that follows indicate the data flow when playing and recording monochannel signals with the audioPlayerRecorder System object.

Configuration

Properties of your audioPlayerRecorder specify the device (sound card), sample rate, bit depth, buffer size, and channel mapping between your device and object. Your object communicates these specifications to the driver once at setup.

Real-Time Processing Loop

The microphone picks up the sound and sends a continuous electrical signal to your sound card. Simultaneously, the speaker plays an analog chunk received from the sound card.
The sound card performs analog-to-digital conversion of the acquired audio signal and writes the digital chunk to the input buffer. If the input buffer is full, the new samples are dropped. Simultaneously, the sound card pulls the oldest frame from the output buffer and performs digital-to-analog conversion of the next audio chunk to be played. If the output buffer is empty when the sound card tries to retrieve the data, the sound card outputs a region of silence.
The audioPlayerRecorder object returns the acquired audio signal to the MATLAB environment for processing. Simultaneously, the audio to be played is specified as an argument of the audioPlayerRecorder for playback in the next I/O cycle.
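
The synchronized loop described above can be sketched as follows. This is a minimal example that assumes a full-duplex device with a supported driver is available. The test tone, sample rate, and frame length are illustrative choices.

```matlab
% Assumed settings for illustration; adjust them to your hardware.
fs          = 48000;   % sample rate in Hz
frameLength = 512;     % samples exchanged on each call
toneHz      = 440;     % frequency of the test tone
duration    = 2;       % seconds to play and record

playRec = audioPlayerRecorder('SampleRate', fs);

numFrames     = ceil(duration * fs / frameLength);
recorded      = zeros(numFrames * frameLength, 1);
totalUnderrun = 0;
totalOverrun  = 0;
for k = 1:numFrames
    idx    = (k-1)*frameLength + (1:frameLength).';
    toPlay = 0.2 * sin(2*pi*toneHz*(idx-1)/fs);

    % One call hands the next frame to the output side and returns the
    % frame acquired on the input side, with both drop counts.
    [fromDevice, numUnderrun, numOverrun] = playRec(toPlay);

    recorded(idx) = fromDevice(:, 1);
    totalUnderrun = totalUnderrun + double(numUnderrun);
    totalOverrun  = totalOverrun  + double(numOverrun);
end

release(playRec);      % free the device
fprintf('Underrun: %d samples, overrun: %d samples\n', ...
        totalUnderrun, totalOverrun);
```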

Terminology and Techniques to Optimize Performance

Signal Drops

Underrun refers to output signal silence. Output signal silence occurs if the device buffer is empty when it is time for digital-to-analog conversion. This results when the processing loop in MATLAB does not supply samples at the rate the sound card demands. The number of samples underrun is returned when you call your audioPlayerRecorder or audioDeviceWriter object.
Overrun refers to input signal drops. Input signal drops occur when the processing stage does not keep pace with the acquisition of samples. The number of samples overrun is returned when you call your audioPlayerRecorder or audioDeviceReader object.
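
Both conditions come down to the same timing budget: the processing stage must finish each frame in less time than the frame lasts, which is the frame length divided by the sample rate. The following sketch estimates how much of that budget a processing stage uses. The function processFrame is a hypothetical stand-in for your algorithm, and the frame length, sample rate, and trial count are assumed values. The measurement runs without audio hardware, so it ignores driver and device overhead.

```matlab
fs            = 44100;              % sample rate in Hz (assumed)
frameLength   = 512;                % samples per frame (assumed)
frameDuration = frameLength / fs;   % time available per frame, in seconds

% Hypothetical processing stage: a one-pole lowpass filter.
processFrame = @(x) filter(1, [1 -0.9], x);

numTrials = 200;
elapsed   = zeros(numTrials, 1);
x = randn(frameLength, 1);
for k = 1:numTrials
    tStart     = tic;
    y          = processFrame(x);   %#ok<NASGU> result unused, timing only
    elapsed(k) = toc(tStart);
end

worstCase  = max(elapsed);
budgetUsed = worstCase / frameDuration;
fprintf('Frame duration: %.3f ms, worst-case processing: %.3f ms (%.1f%% of budget)\n', ...
        1e3*frameDuration, 1e3*worstCase, 100*budgetUsed);
```

If the worst-case figure approaches or exceeds 100% of the budget, overrun or underrun is likely once the loop is connected to a device.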

If you encounter overrun or underrun, try improving your I/O system in one or more of the following ways:

Identify when the overrun or underrun occurs. If it occurs in the first few iterations, consider calling setup on your System objects before the loop where real-time processing is required. You can also run the I/O system with dummy data for a few frames before starting the real processing. For more information, see Measure Performance of Streaming Real-Time Audio Algorithms.
If you are using a DirectSound driver on a Windows® platform, consider switching to a WASAPI or ASIO™ driver. ASIO drivers have the least overhead. If you are using an ASIO driver, make sure to match the frame size in MATLAB to the ASIO buffer size. You can use asiosettings to open the ASIO preferences UI from MATLAB.
If you can afford to add more latency to your application, consider increasing the buffer size of your object. By default, the buffer size is the frame size of the data processed by the audio object.
If you can afford to decrease signal resolution, consider decreasing the sample rate.
Close all nonessential processes on your machine, such as mail checkers and file sync utilities. These processes can asynchronously ask for CPU time through interrupts and disturb the audio-processing loop.
To maximize performance, remove all plotting and visualization from your real-time loop. If you require a visualization to update in your processing loop, use a DSP System Toolbox™ scope such as timescope, spectrumAnalyzer, or dsp.ArrayPlot. Follow the first recommendation in this list to set up and pre-run your scopes. If you require custom graphics or are processing callbacks in the loop, use the drawnow command and specify a limited update rate to optimize your event queue.
If the processing loop is computationally heavy, try profiling your loop to locate the bottlenecks, and then apply appropriate measures:
- Replace handwritten code with MATLAB features that have been optimized for speed.
- Follow best practices for performance: Techniques to Improve Performance.
- Generate MATLAB executables (MEX files) using MATLAB Coder™, which may result in faster execution. See Remove Interfering Tone From Audio Stream for an example.
  You can also generate standalone executables (EXE files). See Generate Standalone Executable for Parametric Audio Equalizer for an example.
- If you are considering turning your algorithm into a VST plugin, then try running it as a VST plugin within MATLAB. VST plugin generation uses C code generation technology under the hood, and running the generated VST plugin within MATLAB may result in faster execution than with your original MATLAB code. See Audio Plugins in MATLAB and Host External Audio Plugins to learn how to design, generate, and then host a VST plugin.
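
The first recommendation in this list, setting up the objects and pre-running the I/O system before the real-time loop, can be sketched as follows. This is a minimal pass-through example under stated assumptions: the sample rate, frame length, number of warm-up frames, and loop length are illustrative values, and the default input and output devices are assumed to be available. Use headphones to avoid acoustic feedback between the speaker and the microphone.

```matlab
fs           = 44100;  % sample rate in Hz (assumed)
frameLength  = 512;    % samples per frame (assumed)
warmupFrames = 10;     % dummy iterations before real processing (assumed)
numFrames    = 200;    % length of the real-time loop (assumed)

deviceReader = audioDeviceReader('SampleRate', fs, ...
                                 'SamplesPerFrame', frameLength);
deviceWriter = audioDeviceWriter('SampleRate', fs);

% One-time setup, so initialization cost is paid outside the loop.
setup(deviceReader);
setup(deviceWriter, zeros(frameLength, 1));

% Pre-run the I/O system with dummy data and discard the results.
for k = 1:warmupFrames
    discard = deviceReader();            %#ok<NASGU> warm-up only
    deviceWriter(zeros(frameLength, 1));
end

% Real-time loop: count drops only after the warm-up.
totalOverrun  = 0;
totalUnderrun = 0;
for k = 1:numFrames
    [frame, numOverrun] = deviceReader();
    numUnderrun         = deviceWriter(frame);
    totalOverrun  = totalOverrun  + double(numOverrun);
    totalUnderrun = totalUnderrun + double(numUnderrun);
end

release(deviceReader);
release(deviceWriter);
fprintf('Overrun: %d samples, underrun: %d samples\n', ...
        totalOverrun, totalUnderrun);
```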

Latency

Output latency is measured as the time delay between the time of generation of an audio frame in MATLAB and the time that audio is heard through the speaker.
Input latency is measured as the time delay between the time that audio enters the sound card and the time that the frame is output by the processing stage.

If properties and frame size remain consistent, the ratio of input latency to output latency is consistent between calls to an audioPlayerRecorder object.

To minimize latency, you can:

Optimize the processing stage. If you cannot optimize the algorithm itself any further, compiling your MATLAB code into C code using MATLAB Coder may result in faster execution.
Increase the sample rate.
Decrease the frame size.
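
The effect of the last two items can be illustrated with a short calculation. Each frame spans its length in samples divided by the sample rate, so a shorter frame or a higher sample rate reduces the time that audio waits in a buffer. For example, a 1024-sample frame at 48 kHz spans about 21.3 ms. The frame lengths and sample rates below are assumed values. The result covers only the duration of one frame, not the total latency of a system, which also includes driver and hardware delays.

```matlab
frameLengths = [64 256 1024];        % samples per frame (assumed)
sampleRates  = [44100 48000 96000];  % sample rates in Hz (assumed)

% Duration of one frame in milliseconds for every combination:
% rows follow frameLengths, columns follow sampleRates.
frameMs = zeros(numel(frameLengths), numel(sampleRates));
for i = 1:numel(frameLengths)
    for j = 1:numel(sampleRates)
        frameMs(i, j) = 1000 * frameLengths(i) / sampleRates(j);
    end
end

for i = 1:numel(frameLengths)
    for j = 1:numel(sampleRates)
        fprintf('%5d samples at %6d Hz: %7.3f ms per frame\n', ...
                frameLengths(i), sampleRates(j), frameMs(i, j));
    end
end
```

A higher sample rate also increases the number of samples the processing stage must handle each second, so check that the loop still keeps pace after changing it.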

For a tutorial on measuring the round-trip latency of your system, see Measure Audio Latency.