Low-level IO
Module for low-level audio input-output operations.
AudioSource | Base class for audio source objects.
Rewindable | Base class for rewindable audio sources.
BufferAudioSource | An AudioSource that reads audio data from a memory buffer.
WaveAudioSource | An AudioSource class for reading data from a wave file.
PyAudioSource | An AudioSource class for reading data from a built-in microphone using PyAudio.
StdinAudioSource | An AudioSource class for reading audio data from standard input.
PyAudioPlayer | A class for audio playback using PyAudio.
from_file | Read audio data from filename and return an AudioSource object.
to_file | Write audio data to a file.
player_for | Return an AudioPlayer compatible with the specified source.
- class auditok.io.AudioSource(sampling_rate, sample_width, channels)
Base class for audio source objects.
This class provides a foundation for audio source objects. Subclasses are expected to implement methods to open and close an audio stream, as well as to read the desired number of audio samples.
- Parameters:
sampling_rate (int) – The number of samples per second of audio data.
sample_width (int) – The size, in bytes, of each audio sample. Accepted values are 1, 2, or 4.
channels (int) – The number of audio channels.
- property ch
Number of channels in audio stream (alias for channels).
- property channels
Number of channels in audio stream.
- abstract read(size)
Read and return up to size audio samples.
This abstract method reads audio data and returns it as a bytes object, containing at most size samples.
- Parameters:
size (int) – The number of samples to read.
- Returns:
A bytes object containing the audio data, with a length of N * sample_width * channels, where N is:
- size, if size is less than or equal to the number of remaining samples;
- the number of remaining samples, if size exceeds the number of remaining samples.
- Return type:
bytes
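The length rule above can be checked with a little arithmetic. The following is a minimal sketch, not auditok code; the helper name expected_read_length is hypothetical and only restates the documented formula.

```python
def expected_read_length(size, remaining, sample_width, channels):
    """Return the byte length of a read, following the documented rule.

    N is `size` when enough samples remain, otherwise the number of
    remaining samples; the result is N * sample_width * channels.
    """
    n = min(size, remaining)
    return n * sample_width * channels


# 1024 mono 16-bit samples requested, plenty of data left.
full_read = expected_read_length(1024, 16000, sample_width=2, channels=1)

# 1024 stereo 16-bit samples requested, only 100 samples left.
short_read = expected_read_length(1024, 100, sample_width=2, channels=2)

print(full_read, short_read)  # 2048 400
```

A caller can therefore detect the end of a stream by comparing the length of the returned bytes object with size * sample_width * channels.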
- property sample_width
Number of bytes used to represent one audio sample.
- property sampling_rate
Number of samples per second of audio stream.
- property sr
Number of samples per second of audio stream (alias for sampling_rate).
- property sw
Number of bytes used to represent one audio sample (alias for sample_width).
- class auditok.io.BufferAudioSource(data, sampling_rate=16000, sample_width=2, channels=1)
An AudioSource that reads audio data from a memory buffer.
This class implements the Rewindable interface, allowing audio data stored in a buffer to be read with support for rewinding and position control.
- Parameters:
data (bytes) – The audio data stored in a memory buffer.
sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of one audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.
- property data
Get raw audio data as a bytes object.
- property position
Get stream position in number of samples.
- property position_ms
Get stream position in milliseconds.
- read(size)
Read and return up to size audio samples.
This method reads audio data from the buffer and returns it as a bytes object, containing at most size samples.
- Parameters:
size (int) – The number of samples to read.
- Returns:
A bytes object containing the audio data, with a length of N * sample_width * channels, where N is:
- size, if size is less than or equal to the number of remaining samples;
- the number of remaining samples, if size exceeds the number of remaining samples.
- Return type:
bytes
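To make the read, position, and rewind behaviour concrete, here is a minimal sketch of an in-memory source. It is an illustration only: MemorySource is a hypothetical stand-in, not auditok's BufferAudioSource, and returning empty bytes once the buffer is exhausted is a choice made for this sketch.

```python
class MemorySource:
    """Hypothetical stand-in for a rewindable in-memory audio source."""

    def __init__(self, data, sampling_rate=16000, sample_width=2, channels=1):
        self.data = data
        self.sampling_rate = sampling_rate
        self.sample_width = sample_width
        self.channels = channels
        self.position = 0  # current stream position, in samples

    @property
    def _bytes_per_sample(self):
        # One "sample" spans every channel, as in the length formula above.
        return self.sample_width * self.channels

    def read(self, size):
        """Return at most `size` samples and advance the position."""
        start = self.position * self._bytes_per_sample
        chunk = self.data[start:start + size * self._bytes_per_sample]
        self.position += len(chunk) // self._bytes_per_sample
        return chunk

    def rewind(self):
        """Go back to the beginning of the buffer."""
        self.position = 0


source = MemorySource(b"\x00\x01" * 10)  # 10 mono 16-bit samples
first = source.read(4)     # 4 samples, 8 bytes
rest = source.read(100)    # only 6 samples remain, 12 bytes
print(len(first), len(rest), source.position)  # 8 12 10
```

After source.rewind(), the next read(4) returns the same bytes as the first call, which is what makes a rewindable source convenient for repeated processing of the same data.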
- class auditok.io.PyAudioPlayer(sampling_rate=16000, sample_width=2, channels=1)
A class for audio playback using PyAudio.
This class facilitates audio playback through the PyAudio library (https://people.csail.mit.edu/hubert/pyaudio/).
- Parameters:
sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.
- class auditok.io.PyAudioSource(sampling_rate=16000, sample_width=2, channels=1, frames_per_buffer=1024, input_device_index=None)
An AudioSource class for reading data from a built-in microphone using PyAudio.
This class leverages PyAudio (https://people.csail.mit.edu/hubert/pyaudio/) to capture audio data directly from a microphone.
- Parameters:
sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.
frames_per_buffer (int, optional, default=1024) – The number of frames per buffer, as specified by PyAudio.
input_device_index (int or None, optional, default=None) – The PyAudio index of the audio device to read from. If None, the default audio device is used.
- read(size)
Read and return up to size audio samples.
This method reads audio data from the input device and returns it as a bytes object, containing at most size samples.
- Parameters:
size (int) – The number of samples to read.
- Returns:
A bytes object containing the audio data, with a length of N * sample_width * channels, where N is:
- size, if size is less than or equal to the number of remaining samples;
- the number of remaining samples, if size exceeds the number of remaining samples.
- Return type:
bytes
- class auditok.io.RawAudioSource(filename, sampling_rate, sample_width, channels)
An AudioSource class for reading data from a raw (headerless) audio file.
This class is suitable for large raw audio files, allowing for efficient data handling without loading the entire file into memory.
- Parameters:
filename (str or Path) – The path to the raw audio file.
sampling_rate (int) – The number of samples per second of audio data.
sample_width (int) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int) – The number of audio channels.
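Because a raw file has no header, the three audio parameters are what turn a sample count into a byte count. The sketch below shows one way to read such a file block by block without loading it whole; it uses only the standard library, and iter_raw_blocks is a hypothetical helper, not part of auditok.

```python
import os
import tempfile


def iter_raw_blocks(filename, size, sample_width, channels):
    """Yield blocks of at most `size` samples from a headerless file."""
    block_bytes = size * sample_width * channels
    with open(filename, "rb") as stream:
        while True:
            block = stream.read(block_bytes)
            if not block:  # end of file reached
                break
            yield block


# Write 10 mono 16-bit samples to a temporary raw file, then read them back.
with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as tmp:
    tmp.write(b"\x00\x01" * 10)
    raw_path = tmp.name

block_lengths = [len(block) for block in iter_raw_blocks(raw_path, 4, 2, 1)]
os.remove(raw_path)
print(block_lengths)  # [8, 8, 4]
```

Only one block is held in memory at a time, which is the property that makes this style of reading suitable for large files.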
- class auditok.io.Rewindable(sampling_rate, sample_width, channels)
Base class for rewindable audio sources.
This class serves as a base for audio sources that support rewinding. Subclasses should implement a method to return to the beginning of the stream (rewind), and provide a property position that allows getting and setting the current stream position, expressed in number of samples.
- abstract property position
Return stream position in number of samples.
- property position_ms
Return stream position in milliseconds.
- property position_s
Return stream position in seconds.
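The three position properties express the same quantity in different units. A minimal sketch of the conversions follows; the helper names are hypothetical, and rounding milliseconds down to an integer is an assumption of this sketch rather than documented auditok behaviour.

```python
def position_to_seconds(position, sampling_rate):
    """Convert a position in samples to seconds."""
    return position / sampling_rate


def position_to_ms(position, sampling_rate):
    """Convert a position in samples to whole milliseconds (rounded down)."""
    return position * 1000 // sampling_rate


def ms_to_position(position_ms, sampling_rate):
    """Convert whole milliseconds back to a position in samples."""
    return position_ms * sampling_rate // 1000


print(position_to_seconds(8000, 16000))  # 0.5
print(position_to_ms(8000, 16000))       # 500
print(ms_to_position(500, 16000))        # 8000
```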
- class auditok.io.StdinAudioSource(sampling_rate=16000, sample_width=2, channels=1)
An AudioSource class for reading audio data from standard input.
This class is designed to capture audio data directly from standard input, making it suitable for streaming audio sources.
- Parameters:
sampling_rate (int, optional, default=16000) – The number of samples per second of audio data.
sample_width (int, optional, default=2) – The size in bytes of each audio sample. Accepted values are 1, 2, or 4.
channels (int, optional, default=1) – The number of audio channels.
- class auditok.io.WaveAudioSource(filename)
An AudioSource class for reading data from a wave file.
This class is suitable for large wave files, allowing for efficient data handling without loading the entire file into memory.
- Parameters:
filename (str or Path) – The path to a valid wave file.
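Unlike a raw source, a wave source needs only a filename, presumably because the sampling rate, sample width, and channel count are stored in the file header. The sketch below illustrates the idea with Python's standard wave module; it is not a description of how WaveAudioSource is implemented.

```python
import os
import tempfile
import wave

# Create a small wave file: 10 mono 16-bit samples at 16 kHz.
fd, wav_path = tempfile.mkstemp(suffix=".wav")
os.close(fd)
with wave.open(wav_path, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(16000)
    writer.writeframes(b"\x00\x01" * 10)

# Read it back lazily: parameters come from the header, data in blocks.
with wave.open(wav_path, "rb") as reader:
    params = (reader.getframerate(), reader.getsampwidth(), reader.getnchannels())
    first_block = reader.readframes(4)   # 4 frames, 8 bytes
    remaining = reader.readframes(100)   # only 6 frames left, 12 bytes

os.remove(wav_path)
print(params, len(first_block), len(remaining))  # (16000, 2, 1) 8 12
```

Note that readframes returns fewer frames than requested once the end of the file is near, which mirrors the length rule documented for read.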
- auditok.io.from_file(filename, audio_format=None, large_file=False, **kwargs)
Read audio data from filename and return an AudioSource object.
If audio_format is None, the appropriate AudioSource class is inferred from the file extension. The filename can refer to a compressed audio or video file; if a video file is provided, its audio track(s) are extracted. This functionality requires pydub (https://github.com/jiaaro/pydub).
By default, all audio data is loaded into memory to create a BufferAudioSource object, suitable for most cases. For very large files, set large_file=True to enable lazy loading, which reads audio data from disk each time AudioSource.read is called. Currently, lazy loading supports only wave and raw formats.
If audio_format is raw, the following keyword arguments are required:
sampling_rate, sr: int, sampling rate of audio data.
sample_width, sw: int, size in bytes of one audio sample.
channels, ch: int, number of channels of audio data.
See also
to_file: A related function for saving audio data to a file.
- Parameters:
filename (str or Path) – The path to the input audio or video file.
audio_format (str, optional) – The audio format (e.g., raw, webm, wav, ogg).
large_file (bool, optional, default=False) – If True, the audio data is read lazily from disk rather than being fully loaded into memory.
sampling_rate, sr (int) – The sampling rate of the audio data. Required if audio_format is raw.
sample_width, sw (int) – The sample width in bytes (i.e., number of bytes per audio sample). Required if audio_format is raw.
channels, ch (int) – The number of audio channels. Required if audio_format is raw.
- Returns:
An AudioSource object that reads data from the specified file.
- Return type:
AudioSource
- Raises:
AudioIOError – If audio data cannot be read in the given format or if audio_format is raw and one or more required audio parameters are missing.
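The argument handling described for from_file (format inferred from the extension, long and short parameter names, an error when raw parameters are missing) can be sketched as follows. Both helpers are hypothetical and are not auditok functions. ValueError stands in for the library's own exception types, and giving the long name precedence over its alias is an assumption of this sketch.

```python
from pathlib import Path


def guess_format(filename, audio_format=None):
    """Return the explicit format, else the lowercased file extension."""
    if audio_format is not None:
        return audio_format.lower()
    suffix = Path(filename).suffix
    return suffix[1:].lower() if suffix else None


def raw_parameters(**kwargs):
    """Collect (sampling_rate, sample_width, channels), accepting aliases."""
    aliases = (("sampling_rate", "sr"), ("sample_width", "sw"), ("channels", "ch"))
    values = []
    for long_name, short_name in aliases:
        value = kwargs.get(long_name, kwargs.get(short_name))
        if value is None:
            raise ValueError(f"missing parameter: {long_name} (or {short_name})")
        values.append(value)
    return tuple(values)


print(guess_format("speech.WAV"))              # wav
print(guess_format("capture", "raw"))          # raw
print(raw_parameters(sr=16000, sw=2, ch=1))    # (16000, 2, 1)
```

With the real library, the equivalent call for a headerless file would pass the same keyword arguments, for example from_file(filename, audio_format="raw", sr=16000, sw=2, ch=1), using the signature documented above.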
- auditok.io.player_for(source)
Return an AudioPlayer compatible with the specified source.
This function creates an AudioPlayer instance (currently only PyAudioPlayer is implemented) that matches the audio properties of the provided source, ensuring compatibility in terms of sampling rate, sample width, and number of channels.
- Parameters:
source (AudioSource) – An object with sampling_rate, sample_width, and channels attributes.
- Returns:
An audio player with the same sampling rate, sample width, and number of channels as source.
- Return type:
PyAudioPlayer
- auditok.io.to_file(data, filename, audio_format=None, **kwargs)
Write audio data to a file.
This function writes audio data to a file in the specified format. If audio_format is None, the output format will be inferred from the file extension. If audio_format is None and filename has no extension, the data will be saved as a raw audio file.
- Parameters:
data (bytes-like) – The audio data to be written. Accepts bytes, bytearray, memoryview, array, or numpy.ndarray objects.
filename (str or Path) – The path to the output audio file.
audio_format (str, optional) – The audio format to use for saving the data (e.g., raw, webm, wav, ogg).
kwargs (dict, optional) –
Additional parameters required for non-raw audio formats:
sampling_rate, sr : int, the sampling rate of the audio data.
sample_width, sw : int, the size in bytes of one audio sample.
channels, ch : int, the number of audio channels.
- Raises:
AudioParameterError – Raised if the output format is not raw and one or more required audio parameters are missing.
AudioIOError – Raised if the audio data cannot be written in the specified format.
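The format-selection rules for to_file can be illustrated with a small standard-library sketch. save_audio is a hypothetical function that handles only raw and wave output, and it raises ValueError where the library documents AudioParameterError and AudioIOError. It is meant to show the decision logic, not to reproduce auditok's implementation.

```python
import os
import tempfile
import wave
from pathlib import Path


def save_audio(data, filename, audio_format=None,
               sampling_rate=None, sample_width=None, channels=None):
    """Write `data` as raw or wave audio and return the format used."""
    if audio_format is None:
        # Infer from the extension; no extension means raw.
        audio_format = Path(filename).suffix[1:].lower() or "raw"
    if audio_format == "raw":
        with open(filename, "wb") as stream:
            stream.write(data)
    elif audio_format in ("wav", "wave"):
        if None in (sampling_rate, sample_width, channels):
            raise ValueError("wave output needs all three audio parameters")
        with wave.open(str(filename), "wb") as writer:
            writer.setnchannels(channels)
            writer.setsampwidth(sample_width)
            writer.setframerate(sampling_rate)
            writer.writeframes(data)
    else:
        raise ValueError(f"format not handled by this sketch: {audio_format}")
    return audio_format


with tempfile.TemporaryDirectory() as folder:
    samples = b"\x00\x01" * 10  # 10 mono 16-bit samples
    raw_format = save_audio(samples, os.path.join(folder, "capture"))
    wav_file = os.path.join(folder, "capture.wav")
    wav_format = save_audio(samples, wav_file,
                            sampling_rate=16000, sample_width=2, channels=1)
    with wave.open(wav_file, "rb") as reader:
        frames_written = reader.getnframes()

print(raw_format, wav_format, frames_written)  # raw wav 10
```

Raw output needs no audio parameters because the bytes are written unchanged, whereas a wave header cannot be built without them. This matches the documented rule that parameters are required only for non-raw formats.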