Low-level IO

Module for low-level audio input-output operations.

AudioSource(sampling_rate, sample_width, …) Base class for audio source objects.
Rewindable(sampling_rate, sample_width, channels) Base class for rewindable audio streams.
BufferAudioSource(data[, sampling_rate, …]) An AudioSource that encapsulates and reads data from a memory buffer.
WaveAudioSource(filename) A class for an AudioSource that reads data from a wave file.
PyAudioSource([sampling_rate, sample_width, …]) A class for an AudioSource that reads data from built-in microphone using PyAudio (https://people.csail.mit.edu/hubert/pyaudio/).
StdinAudioSource([sampling_rate, …]) A class for an AudioSource that reads data from standard input.
PyAudioPlayer([sampling_rate, sample_width, …]) A class for audio playback using PyAudio (https://people.csail.mit.edu/hubert/pyaudio/).
from_file(filename[, audio_format, large_file]) Read audio data from filename and return an AudioSource object.
to_file(data, file[, audio_format]) Writes audio data to file.
player_for(source) Return an AudioPlayer compatible with source (i.e., has the same sampling rate, sample width and number of channels).
class auditok.io.AudioSource(sampling_rate, sample_width, channels)[source]

Base class for audio source objects.

Subclasses should implement methods to open and close an audio stream and to read the desired number of audio samples.

Parameters:
  • sampling_rate (int) – number of samples per second of audio data.
  • sample_width (int) – size in bytes of one audio sample. Possible values: 1, 2 or 4.
  • channels (int) – number of channels of audio data.
ch

Number of channels in audio stream (alias for channels).

channels

Number of channels in audio stream.

close()[source]

Close audio source.

is_open()[source]

Return True if audio source is open, False otherwise.

open()[source]

Open audio source.

read(size)[source]

Read and return size audio samples at most.

Parameters:size (int) – Number of samples to read.
Returns:data – Audio data as a bytes object of length N * sample_width * channels where N equals:
  • size if size <= remaining samples
  • remaining samples if size > remaining samples
Return type:bytes
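
The length rule above comes down to a little arithmetic. The helper below is a hypothetical illustration (the name expected_read_length is not part of auditok); it only computes how many bytes read(size) should return for a given number of remaining samples.

```python
def expected_read_length(size, remaining, sample_width, channels):
    """Bytes returned by read(size) when `remaining` samples are left.

    Hypothetical helper, not part of auditok.
    """
    n = min(size, remaining)  # N in the description above
    return n * sample_width * channels


# 16-bit stereo: each sample takes 2 bytes on each of the 2 channels.
print(expected_read_length(size=100, remaining=250, sample_width=2, channels=2))  # 400
print(expected_read_length(size=100, remaining=40, sample_width=2, channels=2))  # 160
```
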
sample_width

Number of bytes used to represent one audio sample.

sampling_rate

Number of samples per second of audio stream.

sr

Number of samples per second of audio stream (alias for sampling_rate).

sw

Number of bytes used to represent one audio sample (alias for sample_width).

class auditok.io.Rewindable(sampling_rate, sample_width, channels)[source]

Base class for rewindable audio streams.

Subclasses should implement a method to return to the start of the stream (rewind), as well as a property getter/setter named position that reads/sets the stream position expressed in number of samples.

position

Return stream position in number of samples.

position_ms

Return stream position in milliseconds.

position_s

Return stream position in seconds.

rewind()[source]

Go back to the beginning of audio stream.
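
The three position properties describe the same point in the stream in different units. A minimal sketch of the conversion, assuming only that the position in samples is divided by the sampling rate; the function names are hypothetical, and whether the library rounds the result is not stated here.

```python
def position_to_seconds(position, sampling_rate):
    """Convert a position in samples to seconds (hypothetical helper)."""
    return position / sampling_rate


def position_to_ms(position, sampling_rate):
    """Convert a position in samples to milliseconds (hypothetical helper)."""
    return 1000 * position / sampling_rate


# 8000 samples into a 16 kHz stream is half a second.
print(position_to_seconds(8000, 16000))  # 0.5
print(position_to_ms(8000, 16000))  # 500.0
```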

class auditok.io.BufferAudioSource(data, sampling_rate=16000, sample_width=2, channels=1)[source]

An AudioSource that encapsulates and reads data from a memory buffer.

This class implements the Rewindable interface.

Parameters:
  • data (bytes) – audio data.
  • sampling_rate (int, default: 16000) – number of samples per second of audio data.
  • sample_width (int, default: 2) – size in bytes of one audio sample. Possible values: 1, 2 or 4.
  • channels (int, default: 1) – number of channels of audio data.

close()[source]

Close audio source.

data

Get raw audio data as a bytes object.

is_open()[source]

Return True if audio source is open, False otherwise.

open()[source]

Open audio source.

position

Get stream position in number of samples.

position_ms

Get stream position in milliseconds.

read(size)[source]

Read and return size audio samples at most.

Parameters:size (int) – Number of samples to read.
Returns:data – Audio data as a bytes object of length N * sample_width * channels where N equals:
  • size if size <= remaining samples
  • remaining samples if size > remaining samples
Return type:bytes
rewind()[source]

Go back to the beginning of audio stream.
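
To make the read/rewind behavior concrete, here is a minimal stand-in for a rewindable in-memory source. It is a sketch written for illustration, not auditok's implementation: the class name TinyBufferSource is hypothetical, and returning an empty bytes object once the buffer is exhausted is an assumption, since the description above does not say what read() returns at that point.

```python
class TinyBufferSource:
    """Hypothetical stand-in for a rewindable in-memory source (not auditok code)."""

    def __init__(self, data, sampling_rate=16000, sample_width=2, channels=1):
        self.data = data
        self.sampling_rate = sampling_rate
        self.sample_width = sample_width
        self.channels = channels
        self._bytes_per_sample = sample_width * channels
        self._offset = 0  # current position in bytes

    @property
    def position(self):
        """Stream position in number of samples."""
        return self._offset // self._bytes_per_sample

    def read(self, size):
        """Return at most `size` samples as bytes (empty once exhausted)."""
        end = self._offset + size * self._bytes_per_sample
        chunk = self.data[self._offset:end]
        self._offset += len(chunk)
        return chunk

    def rewind(self):
        """Go back to the beginning of the buffer."""
        self._offset = 0


source = TinyBufferSource(b"\x00\x01" * 10)  # 10 mono 16-bit samples
first = source.read(4)    # 4 samples, 8 bytes
rest = source.read(100)   # asks for more than what remains: 6 samples, 12 bytes
source.rewind()           # position is back to 0
```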

class auditok.io.RawAudioSource(file, sampling_rate, sample_width, channels)[source]

A class for an AudioSource that reads data from a raw (headerless) audio file.

This class should be used for large raw audio files to avoid loading all the data into memory.

Parameters:
  • file (str) – path to a raw audio file.
  • sampling_rate (int) – Number of samples per second of audio data.
  • sample_width (int) – Size in bytes of one audio sample. Possible values: 1, 2 or 4.
  • channels (int) – Number of channels of audio data.
open()[source]

Open audio source.
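
The idea of reading a large raw file window by window, rather than loading it whole, can be sketched with the standard library alone. The helper iter_raw_windows is hypothetical and is not how auditok is implemented; it also shows why a headerless file needs sample_width and channels to be supplied by the caller.

```python
import os
import tempfile


def iter_raw_windows(path, window_size, sample_width, channels):
    """Yield successive windows of at most `window_size` samples from a raw file.

    Hypothetical helper, not part of auditok.
    """
    bytes_per_window = window_size * sample_width * channels
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(bytes_per_window)  # only one window in memory
            if not chunk:
                break
            yield chunk


# Write 10 mono 16-bit samples to a temporary headerless file, then read it back.
with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as tmp:
    tmp.write(b"\x00\x01" * 10)
    raw_path = tmp.name

windows = list(iter_raw_windows(raw_path, window_size=4, sample_width=2, channels=1))
os.remove(raw_path)
print([len(w) for w in windows])  # [8, 8, 4]
```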

class auditok.io.WaveAudioSource(filename)[source]

A class for an AudioSource that reads data from a wave file.

This class should be used for large wave files to avoid loading all the data into memory.

Parameters:filename (str) – path to a valid wave file.
open()[source]

Open audio source.
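
Unlike a raw file, a wave file carries its audio parameters in its header, which is presumably why only a filename is needed here. The sketch below uses Python's standard wave module (not auditok) to write a small file and read it back one window at a time.

```python
import os
import tempfile
import wave

window_lengths = []
with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = os.path.join(tmp_dir, "example.wav")

    # Create a small wave file: 1000 mono 16-bit samples of silence at 16 kHz.
    with wave.open(wav_path, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(16000)
        writer.writeframes(b"\x00\x00" * 1000)

    # Read it back 400 samples at a time; parameters come from the file header.
    with wave.open(wav_path, "rb") as reader:
        params = (reader.getframerate(), reader.getsampwidth(), reader.getnchannels())
        while True:
            window = reader.readframes(400)
            if not window:
                break
            window_lengths.append(len(window))

print(params)          # (16000, 2, 1)
print(window_lengths)  # [800, 800, 400]
```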

class auditok.io.PyAudioSource(sampling_rate=16000, sample_width=2, channels=1, frames_per_buffer=1024, input_device_index=None)[source]

A class for an AudioSource that reads data from built-in microphone using PyAudio (https://people.csail.mit.edu/hubert/pyaudio/).

Parameters:
  • sampling_rate (int, default: 16000) – number of samples per second of audio data.
  • sample_width (int, default: 2) – size in bytes of one audio sample. Possible values: 1, 2 or 4.
  • channels (int, default: 1) – number of channels of audio data.
  • frames_per_buffer (int, default: 1024) – PyAudio number of frames per buffer.
  • input_device_index (None or int, default: None) – PyAudio index of audio device to read audio data from. If None default device is used.
close()[source]

Close audio source.

is_open()[source]

Return True if audio source is open, False otherwise.

open()[source]

Open audio source.

read(size)[source]

Read and return size audio samples at most.

Parameters:size (int) – Number of samples to read.
Returns:data – Audio data as a bytes object of length N * sample_width * channels where N equals:
  • size if size <= remaining samples
  • remaining samples if size > remaining samples
Return type:bytes
class auditok.io.StdinAudioSource(sampling_rate=16000, sample_width=2, channels=1)[source]

A class for an AudioSource that reads data from standard input.

Parameters:
  • sampling_rate (int, default: 16000) – number of samples per second of audio data.
  • sample_width (int, default: 2) – size in bytes of one audio sample. Possible values: 1, 2 or 4.
  • channels (int, default: 1) – number of channels of audio data.
close()[source]

Close audio source.

is_open()[source]

Return True if audio source is open, False otherwise.

open()[source]

Open audio source.

class auditok.io.PyAudioPlayer(sampling_rate=16000, sample_width=2, channels=1)[source]

A class for audio playback using PyAudio (https://people.csail.mit.edu/hubert/pyaudio/).

Parameters:
  • sampling_rate (int, default: 16000) – number of samples per second of audio data.
  • sample_width (int, default: 2) – size in bytes of one audio sample. Possible values: 1, 2 or 4.
  • channels (int, default: 1) – number of channels of audio data.
auditok.io.from_file(filename, audio_format=None, large_file=False, **kwargs)[source]

Read audio data from filename and return an AudioSource object. If audio_format is None, the appropriate AudioSource class is guessed from the file’s extension. filename can also be a compressed audio or video file, in which case pydub (https://github.com/jiaaro/pydub) must be installed.

By default, all audio data is loaded into memory and a BufferAudioSource object is created from it. This is convenient in most cases unless the audio file is very large. In that case, set large_file to True to load audio data lazily (i.e. data is read from disk each time AudioSource.read() is called).

Note that the current implementation supports only wave and raw formats for lazy audio loading.

If the audio format is raw, the following keyword arguments are required:

  • sampling_rate, sr: int, sampling rate of audio data.
  • sample_width, sw: int, size in bytes of one audio sample.
  • channels, ch: int, number of channels of audio data.
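
Each of the three parameters above can be given under a long or a short name. The sketch below shows one way such aliases could be resolved and missing values reported. It is an assumption-laden illustration: resolve_audio_parameters is a hypothetical name, and the ValueError it raises stands in for whatever validation auditok actually performs.

```python
def resolve_audio_parameters(**kwargs):
    """Return (sampling_rate, sample_width, channels) from long or short names.

    Hypothetical helper; auditok's own validation may differ.
    """
    names = [("sampling_rate", "sr"), ("sample_width", "sw"), ("channels", "ch")]
    values = []
    missing = []
    for long_name, short_name in names:
        # Prefer the long name, fall back to the short alias.
        value = kwargs.get(long_name, kwargs.get(short_name))
        if value is None:
            missing.append(long_name)
        values.append(value)
    if missing:
        raise ValueError("missing audio parameter(s): " + ", ".join(missing))
    return tuple(values)


print(resolve_audio_parameters(sr=16000, sample_width=2, ch=1))  # (16000, 2, 1)
```

With auditok installed, a raw file would presumably be opened with a call such as from_file("audio.raw", sr=16000, sw=2, ch=1), using the keyword names listed above.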

See also

to_file()

Parameters:
  • filename (str) – path to input audio or video file.
  • audio_format (str) – audio format of the input file (e.g. raw, webm, wav, ogg).
  • large_file (bool, default: False) – if True, audio data is not fully loaded into memory; it is read from disk each time a window is requested.
Other Parameters:
 
  • sampling_rate, sr (int) – sampling rate of audio data.
  • sample_width, sw (int) – sample width (i.e. number of bytes used to represent one audio sample).
  • channels, ch (int) – number of channels of audio data.
Returns:

audio_source – an AudioSource object that reads data from input file.

Return type:

AudioSource

Raises:

AudioIOError – raised if audio data cannot be read in the given format or if format is raw and one or more audio parameters are missing.

auditok.io.to_file(data, file, audio_format=None, **kwargs)[source]

Writes audio data to file. If audio_format is None, output audio format will be guessed from extension. If audio_format is None and file comes without an extension then audio data will be written as a raw audio file.

Parameters:
  • data (bytes-like) – audio data to be written. Can be a bytes, bytearray, memoryview, array or numpy.ndarray object.
  • file (str) – path to output audio file.
  • audio_format (str) – audio format used to save data (e.g. raw, webm, wav, ogg)
  • kwargs (dict) –

    If an audio format other than raw is used, the following keyword arguments are required:

    • sampling_rate, sr: int, sampling rate of audio data.
    • sample_width, sw: int, size in bytes of one audio sample.
    • channels, ch: int, number of channels of audio data.
Raises:
  • AudioParameterError – if output format is different than raw and one or more audio parameters are missing.
  • AudioIOError – if audio data cannot be written in the desired format.
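
The format-selection rule described for to_file (explicit format first, then file extension, then raw) can be sketched as follows. The function guess_output_format is hypothetical and not part of auditok; lowercasing the result is an assumption made for this illustration.

```python
import os


def guess_output_format(file, audio_format=None):
    """Pick an output format following the rule described for to_file.

    Hypothetical helper, not part of auditok.
    """
    if audio_format is not None:
        return audio_format.lower()
    extension = os.path.splitext(file)[1]
    if extension:
        return extension[1:].lower()  # drop the leading dot
    return "raw"  # no explicit format and no extension


print(guess_output_format("speech.wav"))                      # wav
print(guess_output_format("speech"))                          # raw
print(guess_output_format("speech.bin", audio_format="raw"))  # raw
```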
auditok.io.player_for(source)[source]

Return an AudioPlayer compatible with source (i.e., has the same sampling rate, sample width and number of channels).

Parameters:source (AudioSource) – An object that has sampling_rate, sample_width and channels attributes.
Returns:player – An audio player that has the same sampling rate, sample width and number of channels as source.
Return type:PyAudioPlayer