Low-level IO

Module for low-level audio input-output operations.

`AudioSource`(sampling_rate, sample_width, ...)	Base class for audio source objects.
`Rewindable`(sampling_rate, sample_width, channels)	Base class for rewindable audio sources.
`BufferAudioSource`(data[, sampling_rate, ...])	An AudioSource that reads audio data from a memory buffer.
`WaveAudioSource`(filename)	An AudioSource class for reading data from a wave file.
`PyAudioSource`([sampling_rate, sample_width, ...])	An AudioSource class for reading data from a built-in microphone using PyAudio.
`StdinAudioSource`([sampling_rate, ...])	An AudioSource class for reading audio data from standard input.
`PyAudioPlayer`([sampling_rate, sample_width, ...])	A class for audio playback using PyAudio.
`from_file`(filename[, audio_format, large_file])	Read audio data from filename and return an AudioSource object.
`to_file`(data, filename[, audio_format])	Write audio data to a file.
`player_for`(source)	Return an AudioPlayer compatible with the specified source.

class auditok.io.AudioSource(sampling_rate, sample_width, channels)[source]

Base class for audio source objects.

This class provides a foundation for audio source objects. Subclasses are expected to implement methods to open and close an audio stream, as well as to read the desired number of audio samples.

Parameters:

sampling_rate (int) – The number of samples per second of audio data.
sample_width (int) – The size, in bytes, of each audio sample. Accepted values are 1, 2, or 4.
channels (int) – The number of audio channels.

property ch: Number of channels in audio stream (alias for channels).

property channels: Number of channels in audio stream.

abstract close()[source]: Close audio source.

abstract is_open()[source]: Return True if audio source is open, False otherwise.

abstract open()[source]: Open audio source.

abstract read(size)[source]

Read and return up to size audio samples.

This abstract method reads audio data and returns it as a bytes object, containing at most size samples.

Parameters:

size (int) – The number of samples to read.

Returns:

A bytes object containing the audio data, with a length of N * sample_width * channels, where N is:

size, if size is less than or equal to the number of remaining samples

the number of remaining samples, if size exceeds the number of remaining samples

Return type:

bytes
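
The length rule above is plain arithmetic and can be checked in isolation. The sketch below is a standalone illustration, not part of auditok; the helper name `expected_read_length` is hypothetical.

```python
def expected_read_length(size, remaining, sample_width, channels):
    """Return the byte length that a read(size) call should yield.

    Hypothetical helper, not part of auditok: it only restates the
    documented rule N * sample_width * channels, N = min(size, remaining).
    """
    n = min(size, remaining)  # cannot read more samples than remain
    return n * sample_width * channels


# 16-bit stereo stream with 100 samples left.
short_read = expected_read_length(40, 100, sample_width=2, channels=2)
long_read = expected_read_length(400, 100, sample_width=2, channels=2)
print(short_read, long_read)
```

The first call asks for fewer samples than remain, so N equals size; the second asks for more, so N is capped at the number of remaining samples.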

property sample_width: Number of bytes used to represent one audio sample.

property sampling_rate: Number of samples per second of audio stream.

property sr: Number of samples per second of audio stream (alias for sampling_rate).

property sw: Number of bytes used to represent one audio sample (alias for sample_width).

class auditok.io.BufferAudioSource(data, sampling_rate=16000, sample_width=2, channels=1)[source]

An AudioSource that reads audio data from a memory buffer.

This class implements the Rewindable interface, allowing audio data stored in a buffer to be read with support for rewinding and position control.

Parameters:

data (bytes) – The audio data stored in a memory buffer.
sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of one audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.

close()[source]: Close audio source.

property data: Get raw audio data as a bytes object.

is_open()[source]: Return True if audio source is open, False otherwise.

open()[source]: Open audio source.

property position: Get stream position in number of samples

property position_ms: Get stream position in milliseconds.

read(size)[source]

Read and return up to size audio samples.

This method reads audio data and returns it as a bytes object, containing at most size samples.

Parameters:

size (int) – The number of samples to read.

Returns:

A bytes object containing the audio data, with a length of N * sample_width * channels, where N is:

size, if size is less than or equal to the number of remaining samples

the number of remaining samples, if size exceeds the number of remaining samples

Return type:

bytes

rewind()[source]: Go back to the beginning of audio stream.
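
To make the interplay of read, position and rewind concrete, here is a minimal stand-in for an in-memory rewindable source. It is a sketch written for this page under the assumption that position counts samples rather than bytes, as documented above; `ToyBufferSource` is a hypothetical class and is not auditok's implementation.

```python
class ToyBufferSource:
    """Stand-in for an in-memory rewindable source (not auditok's class)."""

    def __init__(self, data, sampling_rate=16000, sample_width=2, channels=1):
        self._data = data
        self.sampling_rate = sampling_rate
        self.sample_width = sample_width
        self.channels = channels
        self._offset = 0  # current position, in bytes

    @property
    def position(self):
        # Position is reported in samples, not bytes.
        return self._offset // (self.sample_width * self.channels)

    def read(self, size):
        # Convert the requested sample count to a byte count.
        n_bytes = size * self.sample_width * self.channels
        chunk = self._data[self._offset:self._offset + n_bytes]
        self._offset += len(chunk)
        return chunk

    def rewind(self):
        self._offset = 0


source = ToyBufferSource(b"\x00\x01" * 10)  # 10 mono 16-bit samples
first = source.read(4)            # 4 samples requested, 4 available
position_after_first = source.position
rest = source.read(100)           # more requested than the 6 that remain
source.rewind()                   # back to the first sample
```

After rewind, a new read starts again from the first sample of the buffer.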

class auditok.io.PyAudioPlayer(sampling_rate=16000, sample_width=2, channels=1)[source]

A class for audio playback using PyAudio.

This class facilitates audio playback through the PyAudio library (https://people.csail.mit.edu/hubert/pyaudio/).

Parameters:

sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.

class auditok.io.PyAudioSource(sampling_rate=16000, sample_width=2, channels=1, frames_per_buffer=1024, input_device_index=None)[source]

An AudioSource class for reading data from a built-in microphone using PyAudio.

This class leverages PyAudio (https://people.csail.mit.edu/hubert/pyaudio/) to capture audio data directly from a microphone.

Parameters:

sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.
frames_per_buffer (int, optional, default=1024) – The number of frames per buffer, as specified by PyAudio.
input_device_index (int or None, optional, default=None) – The PyAudio index of the audio device to read from. If None, the default audio device is used.
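
How much audio one buffer holds follows from these parameters. The sketch below computes it for the default values, assuming the PyAudio convention that one frame holds one sample per channel; `buffer_stats` is a hypothetical helper, not part of auditok or PyAudio.

```python
def buffer_stats(sampling_rate=16000, sample_width=2, channels=1,
                 frames_per_buffer=1024):
    """Return (duration in ms, size in bytes) of one capture buffer.

    Hypothetical helper: one frame is taken to be one sample per channel.
    """
    duration_ms = frames_per_buffer * 1000 / sampling_rate
    size_bytes = frames_per_buffer * sample_width * channels
    return duration_ms, size_bytes


duration_ms, size_bytes = buffer_stats()
print(duration_ms, size_bytes)
```

With the defaults, a buffer covers 64 ms of audio and occupies 2048 bytes.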

close()[source]: Close audio source.

is_open()[source]: Return True if audio source is open, False otherwise.

open()[source]: Open audio source.

read(size)[source]

Read and return up to size audio samples.

This method reads audio data and returns it as a bytes object, containing at most size samples.

Parameters:

size (int) – The number of samples to read.

Returns:

A bytes object containing the audio data, with a length of N * sample_width * channels, where N is:

size, if size is less than or equal to the number of remaining samples

the number of remaining samples, if size exceeds the number of remaining samples

Return type:

bytes

class auditok.io.RawAudioSource(filename, sampling_rate, sample_width, channels)[source]

An AudioSource class for reading data from a raw (headerless) audio file.

This class is suitable for large raw audio files, allowing for efficient data handling without loading the entire file into memory.

Parameters:

filename (str or Path) – The path to the raw audio file.
sampling_rate (int) – The number of samples per second of audio data.
sample_width (int) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int) – The number of audio channels.
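
A raw file carries no header, so the sample width and channel count are needed to turn a sample count into a byte count. The sketch below shows the general idea of reading such a file block by block without loading it whole. It uses only the standard library and is not auditok's implementation; `iter_raw_chunks` is a hypothetical name.

```python
import os
import tempfile


def iter_raw_chunks(path, size, sample_width, channels):
    """Yield successive chunks of at most `size` samples from a raw file."""
    bytes_per_read = size * sample_width * channels
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(bytes_per_read)
            if not chunk:  # end of file
                break
            yield chunk


# Write 10 mono 16-bit samples of silence to a temporary raw file.
with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as tmp:
    tmp.write(b"\x00\x00" * 10)
    raw_path = tmp.name

chunk_sizes = [len(chunk) for chunk in
               iter_raw_chunks(raw_path, 4, sample_width=2, channels=1)]
os.remove(raw_path)
print(chunk_sizes)  # [8, 8, 4]
```

Only one block is held in memory at a time, which is the property that makes this approach suitable for large files.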

open()[source]: Open audio source.

class auditok.io.Rewindable(sampling_rate, sample_width, channels)[source]

Base class for rewindable audio sources.

This class serves as a base for audio sources that support rewinding. Subclasses should implement a method to return to the beginning of the stream (rewind), and provide a property position that allows getting and setting the current stream position, expressed in number of samples.

abstract property position: Return stream position in number of samples.

property position_ms: Return stream position in milliseconds.

property position_s: Return stream position in seconds.

abstract rewind()[source]: Go back to the beginning of audio stream.
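
The three position properties describe the same offset in different units. The sketch below assumes a plain proportional conversion from samples to seconds and milliseconds; the library may round the results differently. Both function names are hypothetical.

```python
def samples_to_seconds(position, sampling_rate):
    """Convert a position in samples to seconds (assumed conversion)."""
    return position / sampling_rate


def samples_to_ms(position, sampling_rate):
    """Convert a position in samples to milliseconds (assumed conversion)."""
    return position * 1000 / sampling_rate


# 8000 samples into a 16 kHz stream is half a second.
seconds = samples_to_seconds(8000, 16000)
milliseconds = samples_to_ms(8000, 16000)
print(seconds, milliseconds)
```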

class auditok.io.StdinAudioSource(sampling_rate=16000, sample_width=2, channels=1)[source]

An AudioSource class for reading audio data from standard input.

This class is designed to capture audio data directly from standard input, making it suitable for streaming audio sources.

Parameters:

sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.
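
Data arriving on standard input is a headerless byte stream, so the three parameters above are what give it meaning. The sketch below reads a fixed number of samples from a binary stream using only the standard library. It is not auditok's implementation: `read_samples` is a hypothetical helper, and an `io.BytesIO` object stands in for `sys.stdin.buffer` so that the example runs on its own.

```python
import io


def read_samples(stream, size, sample_width=2, channels=1):
    """Read up to `size` samples from a binary stream and return the bytes."""
    return stream.read(size * sample_width * channels)


# In a real pipeline `stream` would be sys.stdin.buffer; BytesIO is a stand-in.
fake_stdin = io.BytesIO(b"\x00\x00" * 5)  # 5 mono 16-bit samples

first = read_samples(fake_stdin, 3)   # 3 samples available
second = read_samples(fake_stdin, 3)  # only 2 samples remain
third = read_samples(fake_stdin, 3)   # stream exhausted
```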

close()[source]: Close audio source.

is_open()[source]: Return True if audio source is open, False otherwise.

open()[source]: Open audio source.

class auditok.io.WaveAudioSource(filename)[source]

An AudioSource class for reading data from a wave file.

This class is suitable for large wave files, allowing for efficient data handling without loading the entire file into memory.

Parameters:

filename (str or Path) – The path to a valid wave file.
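
A wave file stores its sampling rate, sample width and channel count in its header, which is presumably why only a filename is needed here. The sketch below illustrates block-wise reading with Python's standard `wave` module. It does not use auditok and makes no claim about how WaveAudioSource is implemented.

```python
import os
import tempfile
import wave

# Create a small wave file: mono, 16-bit, 16 kHz, 100 samples of silence.
tmp_dir = tempfile.mkdtemp()
wav_path = os.path.join(tmp_dir, "example.wav")
with wave.open(wav_path, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(16000)
    writer.writeframes(b"\x00\x00" * 100)

# Read it back 40 frames at a time; the parameters come from the header.
with wave.open(wav_path, "rb") as reader:
    params = (reader.getframerate(), reader.getsampwidth(),
              reader.getnchannels())
    block_sizes = []
    block = reader.readframes(40)
    while block:
        block_sizes.append(len(block))
        block = reader.readframes(40)

os.remove(wav_path)
os.rmdir(tmp_dir)
print(params, block_sizes)
```

The last block is shorter than the others because fewer than 40 frames remain, mirroring the behaviour documented for read.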

open()[source]: Open audio source.

auditok.io.from_file(filename, audio_format=None, large_file=False, **kwargs)[source]

Read audio data from filename and return an AudioSource object.

If audio_format is None, the appropriate AudioSource class is inferred from the file extension. The filename can refer to a compressed audio or video file; if a video file is provided, its audio track(s) are extracted. This functionality requires pydub (https://github.com/jiaaro/pydub).

By default, all audio data is loaded into memory to create a BufferAudioSource object, suitable for most cases. For very large files, set large_file=True to enable lazy loading, which reads audio data from disk each time AudioSource.read is called. Currently, lazy loading supports only wave and raw formats.

If audio_format is 'raw', the following keyword arguments are required. Each can be passed under its long or its short name:

sampling_rate, sr (int) – The sampling rate of the audio data.

sample_width, sw (int) – The size in bytes of one audio sample.

channels, ch (int) – The number of channels of the audio data.
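
The sketch below shows one way the long and short keyword names could be reconciled before a raw file is opened. It is an illustration only: `resolve_raw_params` is a hypothetical helper, and giving the long name precedence when both are supplied is an assumption, not documented auditok behaviour.

```python
def resolve_raw_params(**kwargs):
    """Return (sampling_rate, sample_width, channels) from long or short names.

    Hypothetical helper: the long name wins if both forms are given.
    """
    names = (("sampling_rate", "sr"), ("sample_width", "sw"), ("channels", "ch"))
    resolved = []
    for long_name, short_name in names:
        value = kwargs.get(long_name, kwargs.get(short_name))
        if value is None:
            raise ValueError(
                f"raw audio requires '{long_name}' (or '{short_name}')"
            )
        resolved.append(value)
    return tuple(resolved)


# Long and short names can be mixed freely.
raw_params = resolve_raw_params(sr=16000, sample_width=2, ch=1)
print(raw_params)
```

Based on the signature above, a matching call would look like from_file("audio.raw", audio_format="raw", sr=16000, sw=2, ch=1), where the file name is a placeholder.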