Neural speaker diarization with pyannote-audio

Neural speaker diarization with pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines:

pipeline-2

pyannote.audio also comes with pretrained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding:

segmentation

Installation

pyannote.audio only supports Python 3.7 (or later) on Linux

To finish reading, please visit source site