I/O

callcut.io provides utilities for loading audio files and their annotations, converting annotations to frame labels, and building PyTorch datasets for training. Audio loading relies on torchaudio, which delegates to torchcodec and ffmpeg under the hood.

Important

ffmpeg must be installed on your system to use callcut.io.
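
A quick way to confirm the ffmpeg requirement is met is to look for the executable on the PATH. The sketch below uses only the Python standard library; the helper name ffmpeg_available is hypothetical and not part of callcut.io, and finding the executable does not guarantee that torchcodec can use that particular build.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on the PATH."""
    # Hypothetical helper for illustration; not part of callcut.io.
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it before using callcut.io")
```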

Audio and Annotation Loading

load_audio(fname, *[, sample_rate, mono, device])

Load an audio file.

load_annotations(fname)

Load call annotations from a CSV file.
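
To illustrate what reading call annotations from a CSV file can involve, here is a minimal standard-library sketch. It is not the implementation of load_annotations: the function read_intervals and the column names start and stop (in seconds) are assumptions made for this example, and the actual CSV layout expected by callcut.io may differ.

```python
import csv
import io


def read_intervals(handle, start_col="start", stop_col="stop"):
    """Read (start, stop) pairs in seconds from an open CSV file handle.

    The column names are assumptions for this sketch, not the callcut.io format.
    """
    intervals = []
    for row in csv.DictReader(handle):
        start, stop = float(row[start_col]), float(row[stop_col])
        if stop < start:
            raise ValueError(f"stop {stop} precedes start {start}")
        intervals.append((start, stop))
    # Return the intervals ordered by start time
    return sorted(intervals)


# In-memory CSV standing in for an annotation file
example = io.StringIO("start,stop\n1.5,2.0\n0.25,0.75\n")
intervals = read_intervals(example)
```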

Recording Metadata

Scan and validate recordings before building datasets. This pre-validation step ensures all recordings have valid annotations and computes metadata (duration, annotation count) without loading the full audio.

scan_recordings(recordings)

Scan recordings and return metadata for valid ones.

RecordingInfo(audio_path, annotation_path, ...)

Metadata about a recording for dataset construction.
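
The idea of computing a duration without loading the full audio can be shown with the standard-library wave module, which reads the frame count and sample rate from a WAV header. This is only a sketch of the general technique for WAV files; scan_recordings may obtain its metadata differently, and wav_duration_seconds is a hypothetical helper.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path):
    """Return the duration in seconds using the WAV header fields only."""
    # Hypothetical helper for illustration; not part of callcut.io.
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


# Write one second of 16-bit mono silence at 8 kHz to a temporary file
tmp_dir = tempfile.mkdtemp()
tmp_path = os.path.join(tmp_dir, "example.wav")
with wave.open(tmp_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(8000)
    wf.writeframes(b"\x00\x00" * 8000)

duration = wav_duration_seconds(tmp_path)
```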

Label Generation

Convert annotation intervals to per-frame binary labels for training.

intervals_to_frame_labels(intervals, times)

Convert annotation intervals to per-frame binary labels.
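
The conversion can be sketched in plain Python as follows. This example assumes that a frame is labeled 1 when its timestamp falls inside any annotated interval, with both endpoints inclusive; the boundary handling, input types, and return type of intervals_to_frame_labels may differ, and label_frames is a hypothetical name used only here.

```python
def label_frames(intervals, times):
    """Return one 0/1 label per frame time.

    A frame is positive if start <= t <= stop for any interval
    (inclusive endpoints are an assumption of this sketch).
    """
    labels = []
    for t in times:
        inside = any(start <= t <= stop for start, stop in intervals)
        labels.append(1 if inside else 0)
    return labels


# Frame center times in seconds and two annotated calls
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
intervals = [(0.1, 0.2), (0.45, 0.6)]
labels = label_frames(intervals, times)
print(labels)  # [0, 1, 1, 0, 0, 1]
```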

Dataset

PyTorch Dataset for training call detection models.

CallDataset(recordings, extractor[, ...])

PyTorch Dataset for frame-level call detection.