A2DP agent for promiscuous/permissive audio sink

A2DP agent for a promiscuous/permissive audio sink on Linux. Once installed, a Bluetooth client, such as a smartphone, should be able to discover, pair, and subsequently play audio without any manual interaction. This is perfect for those with headless media boxes who want to expand their connectivity options, and it saves explaining things to the kids 8) This project assumes the use of PulseAudio and should be tested with PortAudio if required. This project is heavily based on the Gist and comments […]

MelGAN test on audio decoding

Original work: https://github.com/descriptinc/melgan-neurips
Previous works have found that generating coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by introducing a set of architectural changes and simple training techniques. Subjective evaluation metric (Mean Opinion Score, or MOS) shows the effectiveness of the proposed approach for high quality mel-spectrogram inversion. To establish the generality of the proposed techniques, we show qualitative […]

Audio event detection model based on YOLOX

Introduction: YOLOX_AUDIO is an audio event detection model based on YOLOX, an anchor-free version of YOLO. This repo is implemented in PyTorch. The main goal of YOLOX_AUDIO is to detect and classify pre-defined audio events in the multi-spectrogram domain using image object detection frameworks.
Updates: 【2021/11/15】 We released YOLOX_AUDIO to the public.
Quick Start, Installation, Step 1. Install YOLOX_AUDIO:
git clone https://github.com/intflow/YOLOX_AUDIO.git
cd YOLOX_AUDIO
pip3 install -U pip && pip3 […]

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper WaveFake. Deep generative modeling has the potential to cause significant harm to society. Recognizing this threat, a magnitude of research into detecting so-called “Deepfakes” has emerged. This research most often focuses on the image domain, while studies exploring generated audio signals have, so far, been neglected. In this paper, we aim to narrow this gap. We present a novel data set, for which we collected ten sample […]

Easy-to-use Audio Tagging in PyTorch

audio-tagging: Audio Classification, Tagging & Sound Event Detection in PyTorch.
Progress:
[x] Fine-tune on audio classification
[ ] Fine-tune on audio tagging
[ ] Fine-tune on sound event detection
[x] Add tagging metrics
[ ] Add Tutorial
[x] Add Augmentation Notebook
[ ] Add more schedulers
[ ] Add FSDKaggle2019 dataset
[ ] Add MTT dataset
[ ] Add DESED
Model Zoo, AudioSet Pretrained Models (table columns: Model, Task, mAP (%), Sample Rate (kHz), Window Length, Num Mels, Fmax, Weights): CNN14, Tagging […]

Neural speaker diarization with pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, and speaker embedding. pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pretrained models covering a wide range of domains for voice activity detection, speaker change detection, […]

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

ReTiSAR: implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python.
Requirements:
- macOS (tested on 10.14 Mojave and 10.15 Catalina) or Linux (tested on 5.9.1-1-rt19-MANJARO); Windows is not supported due to an incompatibility with the current multiprocessing implementation
- JACK library (prebuilt installers / binaries are available)
- Conda installation (miniconda is sufficient; provides an easy way to get Intel MKL or alternatively OpenBLAS optimized numpy versions, which is highly recommended)
- Python installation (tested with 3.7 to 3.9; recommended way […]

Simplified Python Audio-Features Extraction

spafe aims to simplify feature extraction from mono audio files. The library can extract the following features: BFCC, LFCC, LPC, LPCC, MFCC, IMFCC, MSRCC, NGCC, PNCC, PSRCC, PLP, RPLP, frequency stats, etc. It also provides various filterbank modules (Mel, Bark and Gammatone filterbanks) and other spectral statistics. Dependencies: spafe requires Python (>= 3.5), NumPy (>= 1.17.2) and SciPy (>= 1.3.1). User installation: if you already have a working installation of numpy and scipy, you can simply install spafe using pip: […]
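
As a rough illustration of what a Mel filterbank involves, the sketch below computes centre frequencies spaced evenly on the mel scale using the common 2595 * log10(1 + f/700) conversion. It is plain Python and does not use or reproduce the spafe API; the function names and the example parameters (10 filters, 0 to 8000 Hz) are assumptions made for illustration only.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Centre frequencies (Hz) of n_filters bands spaced evenly in mels.

    Hypothetical helper: the band edges f_min_hz and f_max_hz are excluded,
    so only the interior points are returned.
    """
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Example parameters are assumptions, not spafe defaults.
centers = mel_center_frequencies(n_filters=10, f_min_hz=0.0, f_max_hz=8000.0)
```

The centres are evenly spaced in mels but grow further apart in Hz toward the high end, which is why mel-based features devote more resolution to low frequencies.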

A Python module focused on estimating the Sound Absorption Coefficient

PyAbsorp: a package developed to find the Sound Absorption Coefficient through several implemented models, such as Biot-Allard, Johnson-Champoux and others. This project is in the alpha stage. Dependencies: PyAbsorp runs under Linux, Windows and macOS. A Python 3.8.10 installation is needed, with the latest NumPy (1.20.3 or higher) and SciPy (1.6.3 or higher). Matplotlib is recommended, but not necessary. Implemented models: Delany-Bazley (with Miki and Allard-Champoux variations), Biot-Allard, Johnson-Champoux (with Allard and Lafarge variations), Rayleigh. How to […]
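
To give a sense of what one of the listed models computes, here is a minimal sketch of the classic Delany-Bazley empirical model for a porous layer on a rigid backing at normal incidence. It is written from the textbook formulas in plain Python and is not PyAbsorp's API; the function name, the air constants, and the example material (flow resistivity 10000 Pa·s/m², 5 cm thick) are assumptions for illustration.

```python
import cmath
import math

RHO0 = 1.21  # air density in kg/m^3 (assumed value)
C0 = 343.0   # speed of sound in air in m/s (assumed value)


def delany_bazley_absorption(freq_hz, flow_resistivity, thickness_m):
    """Normal-incidence absorption coefficient of a rigid-backed porous layer.

    Uses the Delany-Bazley regressions with the e^{+j*omega*t} convention.
    The model is usually considered valid for roughly 0.01 < X < 1.
    """
    x = RHO0 * freq_hz / flow_resistivity  # dimensionless parameter X
    # Characteristic impedance and complex wavenumber of the material
    zc = RHO0 * C0 * (1 + 0.0571 * x ** -0.754 - 1j * 0.087 * x ** -0.732)
    k = (2 * math.pi * freq_hz / C0) * (
        1 + 0.0978 * x ** -0.700 - 1j * 0.189 * x ** -0.595
    )
    # Surface impedance of the layer backed by a rigid wall: -j * Zc * cot(k * d)
    zs = -1j * zc / cmath.tan(k * thickness_m)
    # Pressure reflection coefficient against air, then absorption
    reflection = (zs - RHO0 * C0) / (zs + RHO0 * C0)
    return 1 - abs(reflection) ** 2


# Hypothetical material: 5 cm layer with flow resistivity 10000 Pa.s/m^2
alpha_low = delany_bazley_absorption(125.0, 10000.0, 0.05)
alpha_high = delany_bazley_absorption(2000.0, 10000.0, 0.05)
```

For a thin layer like this, the absorption at 2000 Hz comes out higher than at 125 Hz, which matches the usual behaviour of porous absorbers.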

A GUI-based audio player based on the Discord bot

Miza-Player: a GUI-based audio player that supports a large variety of formats, can play from web-hosted media platforms such as YouTube, and includes a queue system, audio filters, and audio recording/saving. It can search for and display song lyrics, and visualise audio using piano key frequency bars. Based on the audio features of the Discord bot https://github.com/thomas-xin/Miza, with audio visualisers based on https://github.com/thomas-xin/SpectralPulse
GitHub: https://github.com/thomas-xin/Miza-Player
