I/O

callcut.io provides utilities for loading audio files and their annotations, converting annotations to frame labels, and building PyTorch datasets for training. Audio loading relies on torchaudio, which delegates to torchcodec and ffmpeg under the hood.

Important

ffmpeg must be installed on your system to use callcut.io.
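
One way to confirm this prerequisite up front is to look for the ffmpeg executable on the PATH. The following is a minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and not part of callcut.io, and whether callcut.io runs a similar check itself is not stated here.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the PATH."""
    # shutil.which returns the executable's full path, or None when missing.
    return shutil.which("ffmpeg") is not None


status = "found" if ffmpeg_available() else "missing"
print(f"ffmpeg: {status}")
```

A PATH lookup only shows that an executable with that name exists; it does not verify the ffmpeg version or the codecs it was built with.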

Audio and Annotation Loading

`load_audio`(fname, *[, sample_rate, mono, device])	Load an audio file.
`load_annotations`(fname)	Load call annotations from a CSV file.
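
To illustrate what loading call annotations from a CSV file can involve, here is a standalone sketch built on the standard `csv` module. It does not call `load_annotations`, and the CSV layout is an assumption: the column names `start` and `stop` (in seconds) and the helper `read_intervals` are hypothetical, since the file format expected by callcut.io is not described in this section.

```python
import csv
import io

# Hypothetical annotation file: one call per row, onset/offset in seconds.
# The column names "start" and "stop" are assumptions made for this sketch.
EXAMPLE_CSV = """start,stop
1.40,1.75
0.50,0.82
"""


def read_intervals(handle) -> list[tuple[float, float]]:
    """Parse (start, stop) pairs in seconds and reject malformed rows."""
    intervals = []
    for row in csv.DictReader(handle):
        start, stop = float(row["start"]), float(row["stop"])
        if stop <= start:
            raise ValueError(f"empty or reversed interval: {start} -> {stop}")
        intervals.append((start, stop))
    # Sort by onset so downstream code can assume chronological order.
    return sorted(intervals)


intervals = read_intervals(io.StringIO(EXAMPLE_CSV))
print(intervals)  # [(0.5, 0.82), (1.4, 1.75)]
```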

Recording Metadata

Scan and validate recordings before building datasets. This pre-validation step ensures all recordings have valid annotations and computes metadata (duration, annotation count) without loading the full audio.

`scan_recordings`(recordings)	Scan recordings and return metadata for valid ones.
`RecordingInfo`(audio_path, annotation_path, ...)	Metadata about a recording for dataset construction.
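
The idea of computing metadata without loading the full audio can be sketched with the standard `wave` module, which reads the frame count and sample rate from the file header. This is an illustration only: it handles WAV files, whereas callcut.io reads audio through torchaudio, and the names `RecordingSummary` and `summarize_wav` are hypothetical stand-ins rather than the library's `RecordingInfo` or `scan_recordings`.

```python
import os
import tempfile
import wave
from dataclasses import dataclass


@dataclass
class RecordingSummary:
    """Hypothetical per-recording metadata (not callcut's RecordingInfo)."""

    audio_path: str
    duration: float  # seconds
    n_annotations: int


def summarize_wav(audio_path: str, intervals: list) -> RecordingSummary:
    """Compute the duration from the WAV header; no samples are read."""
    with wave.open(audio_path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return RecordingSummary(audio_path, duration, len(intervals))


# Build a one-second silent mono WAV so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.wav")
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)  # 16-bit samples
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 8000)

summary = summarize_wav(path, [(0.1, 0.3), (0.5, 0.9)])
print(summary.duration, summary.n_annotations)  # 1.0 2
```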

Label Generation

Convert annotation intervals to per-frame binary labels for training.

`intervals_to_frame_labels`(intervals, times)	Convert annotation intervals to per-frame binary labels.
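
The conversion can be sketched in a few lines of plain Python. This is not the implementation of `intervals_to_frame_labels`: the function name `frame_labels` is hypothetical, and the sketch assumes that `times` holds one timestamp per frame in seconds and that intervals are half-open, so a frame at time t is positive when start <= t < stop. The boundary convention used by callcut.io may differ.

```python
def frame_labels(intervals, times):
    """Return one 0/1 label per frame time.

    A frame is labelled 1 when its time t satisfies start <= t < stop
    for at least one interval (half-open convention, assumed here).
    """
    return [
        int(any(start <= t < stop for start, stop in intervals))
        for t in times
    ]


intervals = [(0.1, 0.3)]            # one call from 0.1 s to 0.3 s
times = [0.0, 0.1, 0.2, 0.3, 0.4]   # frame times in seconds
labels = frame_labels(intervals, times)
print(labels)  # [0, 1, 1, 0, 0]
```

The nested loop costs O(frames x intervals), which is fine for a sketch; a vectorized or sorted-interval approach would scale better for long recordings.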

Dataset

PyTorch Dataset for training call detection models.

`CallDataset`(recordings, extractor[, ...])	PyTorch Dataset for frame-level call detection.
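
The shape of a frame-level dataset can be sketched without PyTorch by implementing the map-style protocol (`__len__` and `__getitem__`) that PyTorch data loaders consume. Everything below is hypothetical and is not `CallDataset`: the class `FrameDataset`, the `toy_extractor` callable, and the choice of one (features, labels) pair per recording are assumptions made for illustration. Real code would typically subclass `torch.utils.data.Dataset` and return tensors.

```python
class FrameDataset:
    """Map-style dataset: one (features, labels) pair per recording.

    Hypothetical sketch, not CallDataset. `extractor` is any callable
    that maps raw samples to (frame_times, features).
    """

    def __init__(self, recordings, extractor):
        self.recordings = list(recordings)
        self.extractor = extractor

    def __len__(self):
        return len(self.recordings)

    def __getitem__(self, index):
        samples, intervals = self.recordings[index]
        times, features = self.extractor(samples)
        # Label a frame 1 when its time falls inside any [start, stop).
        labels = [
            int(any(start <= t < stop for start, stop in intervals))
            for t in times
        ]
        return features, labels


def toy_extractor(samples, hop=2, rate=10):
    """Toy extractor: mean absolute amplitude per non-overlapping frame."""
    frames = [samples[i:i + hop] for i in range(0, len(samples), hop)]
    times = [i * hop / rate for i in range(len(frames))]
    features = [sum(abs(x) for x in frame) / len(frame) for frame in frames]
    return times, features


# One recording: six samples at 10 Hz with a call annotated at 0.2-0.4 s.
recordings = [([0.0, 0.0, 1.0, -1.0, 0.0, 0.0], [(0.2, 0.4)])]
dataset = FrameDataset(recordings, toy_extractor)
features, labels = dataset[0]
print(features, labels)  # [0.0, 1.0, 0.0] [0, 1, 0]
```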