April 1, 2022 Speech

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

The authors’ PyTorch implementation and pretrained models of LightHuBERT (GitHub, Hugging Face, SUPERB Leaderboard). Covers the pre-trained models and how to load them for inference.

January 30, 2022 Speech

SpeechHacks – QHacks 2022 Project

QHacks 2022 project by Liam Seagram, Jimmy Lu, Nolan Hepworth, and Taylor Fiorelli. Aim: using AssemblyAPI to perform text-to-speech and summarize the result. Dependencies/Installations: to run the Python code located in the scripts folder, here are the required installs: Next, you will need to install the nltk library to make the summarizer work. Here is the method that will work for local machines: Once that is done, run these two commands consecutively in Python 3:
    import nltk
    nltk.download()
This should open a […]

January 9, 2022 Speech

Speech Recognition Database Management with python

The main aim of this project is to recognize the user's voice as input and convert it into text. We used Python's Speech Recognition module to accomplish this. It works with modules such as PyAudio, which lets us play and record audio. We also used the MySQL connector module to connect our Python program to our MySQL database. We have created a library named MySQLvoice which helps our Artificial Intelligence […]

October 10, 2021 Speech

Every Google, Azure & IBM text to speech voice for free

A quick thing I made about a year ago to download any text as speech with any TTS voice; there are currently over 630 voices to choose from. It splits the input into multiple files every 1500 words or so, to avoid hitting cutoff limits from TTS providers. Usage: edit input.txt to change the text to synthesize. You can run tts.py without any parameters to open the voice selector with default settings.
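The splitting step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name split_into_chunks and the strict word-count cut are assumptions. The original splits "every 1500 words or so", which suggests it may cut at more natural boundaries.

```python
def split_into_chunks(text, max_words=1500):
    """Split text into chunks of at most max_words words each.

    Hypothetical helper: words are separated on whitespace, so runs of
    spaces and newlines are collapsed to single spaces in the output.
    """
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sample = "word " * 3200
    for index, chunk in enumerate(split_into_chunks(sample), start=1):
        # Each chunk would be sent to the TTS provider as a separate request.
        print(f"chunk {index}: {len(chunk.split())} words")
```

Each chunk can then be synthesized and saved as its own audio file, which keeps every request under the provider's length limit.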

October 10, 2021 Speech

Persian Kaldi profile for Rhasspy built from open speech data

A Rhasspy profile for Persian (fa). Installation: get started by first installing Vosk:
    # Create virtual environment
    python3 -m venv .venv
    source .venv/bin/activate
    pip3 install --upgrade pip
    pip3 install --upgrade wheel setuptools
    # Install Vosk
    pip3 install vosk
Next, download the model and extract it:
    wget 'https://github.com/rhasspy/fa_kaldi-rhasspy/releases/download/v1.0/vosk-model-small-fa-rhasspy-0.15.zip'
    unzip vosk-model-small-fa-rhasspy-0.15.zip
Finally, run the transcribe.py Python program with the model and an audio file:

September 23, 2021 Speech

WaveGlow: A Flow-based Generative Network for Speech Synthesis

GitHub: npuichigo/waveglow. A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis.

August 31, 2021 Speech

Make your AirPlay devices as TTS speakers

A Home Assistant integration component that turns your AirPlay devices into TTS speakers. 2021.6.X or earlier: the Apple Airplayer component requires pyatv 0.8.1, which is bundled with the latest version of Home Assistant (2021.7.3). You can run pip list | grep pyatv in your Home Assistant container host to check the version of pyatv. If it is lower than 0.8.1, run the commands below to upgrade pyatv:
    apk update
    apk add build-base
    pip3 install --upgrade pyatv
    pip3 install --upgrade attrs
2021.7.X or later: There […]
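The version check described above can also be done from Python instead of pip list. The following is a rough sketch, not part of the component: the helper names are hypothetical, and the parser assumes plain dotted version numbers such as 0.8.1 (it would fail on suffixes like 0.8.1rc1).

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM_PYATV = "0.8.1"  # minimum version named in the post


def parse_version(text):
    """Turn '0.8.1' into (0, 8, 1); assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed, minimum=MINIMUM_PYATV):
    """Return True when the installed version is lower than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_pyatv_version():
    """Return the installed pyatv version string, or None if not installed."""
    try:
        return version("pyatv")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyatv_version()
    if found is None:
        print("pyatv is not installed")
    elif needs_upgrade(found):
        print(f"pyatv {found} is older than {MINIMUM_PYATV}; upgrade it")
    else:
        print(f"pyatv {found} is recent enough")
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would rank 0.10.0 below 0.8.1.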

August 26, 2021 Speech

Open-sourced speech technology by Huawei Noah’s Ark Lab

Speech-Backbones: the main repository of open-sourced speech technology by Huawei Noah’s Ark Lab. Grad-TTS: official implementation of the Grad-TTS model based on Diffusion Probabilistic Modelling. For all details, check out our paper accepted to ICML 2021. Authors: Vadim Popov*, Ivan Vovk*, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov. *Equal contribution. GitHub: https://github.com/huawei-noah/Speech-Backbones