auditok – Audio Activity Detection

auditok is a lightweight audio activity detection library for Python. It splits audio streams into events by thresholding signal energy (no models or training data required).

Use it for voice activity detection, silence removal, audio segmentation, or any task where you need to find “where the sound is” in an audio stream.
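To make "thresholding signal energy" concrete, here is a minimal pure-Python sketch of the general idea. It does not use auditok's API and is not the library's implementation. The function names, the 50 ms frame size, the 50 dB threshold, and the synthetic test signal are all assumptions chosen for illustration.

```python
import math


def frame_energy_db(frame):
    """Log energy of one frame: 20 * log10(RMS), floored to avoid log(0)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20 * math.log10(max(rms, 1e-10))


def detect_events(samples, sample_rate, frame_dur=0.05, threshold_db=50.0):
    """Return (start, end) times in seconds for runs of frames whose
    energy is at or above threshold_db. Hypothetical helper, not auditok."""
    frame_len = max(1, round(sample_rate * frame_dur))
    n_frames = len(samples) // frame_len  # a trailing partial frame is ignored
    events = []
    start = None  # index of the first frame of the current event, if any
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        active = frame_energy_db(frame) >= threshold_db
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((start * frame_len / sample_rate,
                           i * frame_len / sample_rate))
            start = None
    if start is not None:  # the signal ended while an event was still open
        events.append((start * frame_len / sample_rate,
                       n_frames * frame_len / sample_rate))
    return events


# Synthetic signal sampled at 1000 Hz: 0.5 s silence, 1 s tone, 0.5 s silence.
sample_rate = 1000
silence = [0] * 500
tone = [int(10000 * math.sin(2 * math.pi * 100 * t / sample_rate))
        for t in range(1000)]
events = detect_events(silence + tone + silence, sample_rate)
print(events)
```

The sketch only labels frames and merges consecutive active ones. A practical detector typically also enforces minimum and maximum event durations and tolerates short pauses inside an event.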

Getting started

  • Installation
  • Load audio data
    • From a file
    • On-the-fly format conversion
    • From a bytes object
    • From the microphone
    • Skip part of audio data
    • Limit the amount of read audio
  • Basic split example
  • Improving detection boundaries
  • Trim silence
  • Normalize pauses
  • Split and plot
    • Interactive widget in Jupyter
  • Read audio data from the microphone
  • Working with AudioRegions
    • Basic region information
    • Concatenate regions
    • Repeat a region
    • Split one region into N regions of equal size
    • Slice a region by samples, seconds, or milliseconds
    • Export as a numpy array
    • Playback
    • Save audio

Command-line guide

  • Command-line guide
    • Split audio into events
      • Save detected events to individual files
      • Save the full audio stream
      • Customize output format
      • Play back detections
      • Plot detections
    • Trim silence
    • Normalize pauses (fix-pauses)
    • Improving detection boundaries
    • Real-time microphone input
    • Read audio from an external program
    • Common options reference

API Reference

  • Audio (main API)
  • Tokenization

License

MIT.

© Copyright 2015-2026, Amine Sehili.
