auditok.util.AudioReader

class auditok.util.AudioReader(input, block_dur=0.01, hop_dur=None, record=False, max_read=None, **kwargs)

Class to read fixed-size chunks of audio data from a source. A source can be a file on disk, standard input (with input = “-”) or the microphone. This is normally used by tokenization algorithms that expect source objects with a read function that returns a window of data of the same size at each call, except when the remaining data does not make up a full window.

Objects of this class can be set up to return audio windows with a given overlap and to record the whole stream for later access (useful when reading data from the microphone). They can also have a limit for the maximum amount of data to read.
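
The windowing behavior described above can be sketched without the library. The following is a minimal illustration, not auditok's implementation: iter_windows is a hypothetical helper that works on a plain bytes buffer, with sizes given in bytes rather than seconds. How the real class handles the tail of an overlapping stream is not stated here, so the stopping rule below is an assumption.

```python
def iter_windows(data, block_size, hop_size=None):
    """Yield windows of `block_size` bytes, advancing `hop_size` bytes per step.

    Hypothetical helper for illustration only; not part of auditok.
    """
    if hop_size is None:
        hop_size = block_size  # consecutive windows do not overlap
    if not 0 < hop_size <= block_size:
        raise ValueError("hop_size must be in the range (0, block_size]")
    start = 0
    while start < len(data):
        # The last window may be shorter than block_size.
        yield data[start:start + block_size]
        if start + block_size >= len(data):
            break  # assumption: stop once a window reaches the end of the data
        start += hop_size


data = bytes(range(10))
no_overlap = list(iter_windows(data, 4))
with_overlap = list(iter_windows(data, 4, hop_size=2))
```

With no hop size, the 10-byte buffer yields two full windows and one short window; with a hop of 2 bytes, each window shares its first 2 bytes with the end of the previous one.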

Parameters:
  • input (str, bytes, AudioSource, AudioReader, AudioRegion or None) – input audio data. If the argument is a str, it should be a path to an existing audio file; “-” is interpreted as standard input. If it is bytes, input is treated as a buffer of raw audio data. If None, audio is read from the microphone. Every object that is not an AudioReader will be transformed, when possible, into an AudioSource before processing. If input is a str that refers to a raw audio file, bytes or None, audio parameters should be provided using kwargs (i.e., sampling_rate, sample_width and channels or their aliases).
  • block_dur (float, default: 0.01) – length in seconds of audio windows to return at each read call.
  • hop_dur (float, default: None) – length in seconds of the data to skip from the start of the previous window. If defined, it is used to compute the temporal overlap between the previous and the current window (namely overlap = block_dur - hop_dur). The default, None, means that consecutive windows do not overlap.
  • record (bool, default: False) – whether to record read audio data for later access. If True, audio data can be retrieved by first calling rewind(), then using the data property. Note that once rewind() is called, no new data will be read from the source (subsequent read() calls will read data from the cache) and that there is no need to call rewind() again to access the data property.
  • max_read (float, default: None) – maximum amount of audio data to read in seconds. Default is None, meaning that data will be read until the end of the stream is reached or, when reading from the microphone, until Ctrl-C is sent.
When input is None, of type bytes or a raw audio file, some of the following kwargs are mandatory.
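
The relation between the duration parameters and sample counts can be worked through in a short sketch. The helpers window_sizes and raw_duration are hypothetical names, not auditok functions, and rounding durations to whole samples is an assumption (the library may truncate instead). Only the formula overlap = block_dur - hop_dur comes from the description above.

```python
def window_sizes(sampling_rate, block_dur=0.01, hop_dur=None):
    """Return (block_size, hop_size, overlap_dur); sizes are in samples.

    Hypothetical helper; rounding to the nearest sample is an assumption.
    """
    block_size = round(block_dur * sampling_rate)
    if hop_dur is None:
        # No hop given: windows are back to back, so there is no overlap.
        return block_size, block_size, 0.0
    hop_size = round(hop_dur * sampling_rate)
    return block_size, hop_size, block_dur - hop_dur


def raw_duration(n_bytes, sampling_rate, sample_width, channels):
    """Duration in seconds of a raw buffer, given its audio parameters."""
    return n_bytes / (sampling_rate * sample_width * channels)


block_size, hop_size, overlap_dur = window_sizes(16000, block_dur=0.05, hop_dur=0.01)
default_sizes = window_sizes(16000)
one_second = raw_duration(32000, sampling_rate=16000, sample_width=2, channels=1)
```

At 16 kHz, a 50 ms block with a 10 ms hop corresponds to 800 samples per window, 160 samples between window starts and 40 ms of overlap. The second helper shows why sampling_rate, sample_width and channels are needed for raw data: without them, a byte count cannot be converted to a duration.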
Other Parameters:
 
  • audio_format, fmt (str) – type of audio data (e.g., wav, ogg, flac, raw, etc.). This will only be used if input is a string path to an audio file. If not given, audio type will be guessed from file name extension or from file header.
  • sampling_rate, sr (int) – sampling rate of audio data. Required if input is a raw audio file, is a bytes object or None (i.e., read from microphone).
  • sample_width, sw (int) – number of bytes used to encode one audio sample, typically 1, 2 or 4. Required for raw data, see sampling_rate.
  • channels, ch (int) – number of channels of audio data. Required for raw data, see sampling_rate.
  • use_channel, uc ({None, “any”, “mix”, “avg”, “average”} or int) – which channel to use for split if input has multiple audio channels. Regardless of which channel is used for splitting, returned audio events contain data from all the channels of input. The following values are accepted:
    • None (alias “any”): accept audio activity from any channel, even if other channels are silent. This is the default behavior.
    • “mix” (alias “avg” or “average”): mix down all channels (i.e., compute average channel) and split the resulting channel.
    • int (>= 0 , < channels): use one channel, specified by its integer id, for split.
  • large_file (bool, default: False) – If True, and if input is a path to a wav or a raw audio file (and only these two formats), then audio data is lazily loaded into memory (i.e., one analysis window at a time). Otherwise the whole file is loaded into memory before split. Set to True if the file is larger than available memory.
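
To make the use_channel options concrete, here is a rough sketch of what "mix" (average all channels) and an integer channel id mean for interleaved 16-bit samples. The functions mix_down_16bit and pick_channel are hypothetical and are not auditok's code; the sketch assumes native-endian signed 16-bit samples and integer (floor) averaging. As stated above, the selected channel only drives the split, while returned audio events still contain all channels.

```python
from array import array


def _samples_16bit(raw, channels):
    """Decode interleaved signed 16-bit samples (native byte order assumed)."""
    samples = array("h")
    samples.frombytes(raw)
    if len(samples) % channels:
        raise ValueError("buffer does not contain whole multi-channel frames")
    return samples


def mix_down_16bit(raw, channels):
    """Average all channels of each frame into one channel (like "mix")."""
    samples = _samples_16bit(raw, channels)
    mixed = array(
        "h",
        (sum(samples[i:i + channels]) // channels
         for i in range(0, len(samples), channels)),
    )
    return mixed.tobytes()


def pick_channel(raw, channels, index):
    """Keep a single channel selected by its integer id (0 <= index < channels)."""
    if not 0 <= index < channels:
        raise ValueError("index must satisfy 0 <= index < channels")
    return _samples_16bit(raw, channels)[index::channels].tobytes()


# Two stereo frames: (100, 300) and (-200, 200).
stereo = array("h", [100, 300, -200, 200]).tobytes()
mixed = array("h")
mixed.frombytes(mix_down_16bit(stereo, channels=2))
right = array("h")
right.frombytes(pick_channel(stereo, channels=2, index=1))
```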
__init__(input, block_dur=0.01, hop_dur=None, record=False, max_read=None, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(input[, block_dur, hop_dur, …]) Initialize self.
read() Read a block (i.e., window) of data from this source.

Attributes

block_dur
hop_dur
hop_size
max_read
rewindable