Articles About Natural Language Processing

slimIPL: Language-Model-Free Iterative Pseudo-Labeling

Abstract Recent results in end-to-end automatic speech recognition have demonstrated the efficacy of pseudo-labeling for semi-supervised models trained both with Connectionist Temporal Classification (CTC) and Sequence-to-Sequence (seq2seq) losses. Iterative Pseudo-Labeling (IPL), which continuously trains a single model using pseudo-labels iteratively re-generated as the model learns, has been shown to further improve performance in ASR. We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard labels (the most probable tokens), that is, without […]
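The hard-label step the abstract mentions can be pictured with a small sketch: greedy, LM-free decoding of CTC posteriors, keeping the most probable token per frame, collapsing repeats, and dropping blanks. The tensor names and blank index below are illustrative and not taken from the paper.

```python
import torch

BLANK_ID = 0  # illustrative: index of the CTC blank token

def ctc_hard_pseudo_labels(log_probs: torch.Tensor) -> list[list[int]]:
    """Greedy (LM-free) CTC decoding: argmax per frame, collapse repeats, drop blanks.

    log_probs: (batch, time, vocab) log-posteriors from the acoustic model.
    Returns one token-id sequence per utterance, usable as hard pseudo-labels.
    """
    best = log_probs.argmax(dim=-1)  # (batch, time): most probable token per frame
    labels = []
    for seq in best.tolist():
        collapsed, prev = [], None
        for tok in seq:
            if tok != prev and tok != BLANK_ID:  # collapse repeats, skip blanks
                collapsed.append(tok)
            prev = tok
        labels.append(collapsed)
    return labels

# Toy usage: pseudo-labels regenerated from the model's own outputs on unlabeled audio.
dummy = torch.randn(2, 50, 32).log_softmax(dim=-1)
print(ctc_hard_pseudo_labels(dummy))
```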

Read more

A Two-stage Approach to Speech Bandwidth Extension

August 30, 2021 By: Ju Lin, Yun Wang, Kaustubh Kalgaonkar, Gil Keren, Didi Zhang, Christian Fuegen Abstract Algorithms for speech bandwidth extension (BWE) may work in either the time domain or the frequency domain. Time-domain methods often do not sufficiently recover the high-frequency content of speech signals; frequency-domain methods are better at recovering the spectral envelope, but have difficulty reconstructing the details of the waveform. In this paper, we propose a two-stage approach for BWE, which enjoys the advantages of […]

Read more

Getting to Production with Few-shot Natural Language Generation Models

July 29, 2021 By: Peyman Heidari, Arash Einolghozati, Shashank Jain, Soumya Batra, Lee Callender, Ankit Arun, Shawn Mei, Sonal Gupta, Pinar Donmez, Vikas Bhardwaj, Anuj Kumar, Michael White Abstract In this paper, we study the utilization of pretrained language models to enable few-shot Natural Language Generation (NLG) in task-oriented dialog systems. We introduce a system consisting of iterative self-training and an extensible mini-template framework that textualizes the structured input data into semi-natural text to fully take advantage of pre-trained language […]
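As a toy illustration of the textualization idea (not the paper's actual framework), a mini-template might turn structured dialog input into semi-natural text as below; the dialog acts and slot names are invented for the example.

```python
# Hypothetical mini-templates keyed by dialog act; acts and slot names are invented.
TEMPLATES = {
    "inform_weather": "the weather in {city} is {condition} with a high of {high} degrees",
    "confirm_reservation": "a table for {party_size} at {restaurant} is booked for {time}",
}

def textualize(act: str, slots: dict) -> str:
    """Turn structured input (dialog act + slots) into semi-natural text.

    The resulting string would then be rewritten into a fluent response by a
    pretrained language model; only the textualization step is shown here.
    """
    return TEMPLATES[act].format(**slots)

print(textualize("inform_weather",
                 {"city": "Menlo Park", "condition": "sunny", "high": 24}))
# -> "the weather in Menlo Park is sunny with a high of 24 degrees"
```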

Read more

Text-Free Image-to-Speech Synthesis Using Learned Segmental Units

August 2, 2021 By: Wei-Ning Hsu, David Harwath, Tyler Miller, Christopher Song, James Glass Abstract In this paper we present the first model for directly synthesizing fluent, natural-sounding spoken audio captions for images that does not require natural language text as an intermediate representation or source of supervision. Instead, we connect the image captioning module and the speech synthesis module with a set of discrete, sub-word speech units that are discovered with a self-supervised visual grounding task. We conduct experiments […]

Read more

SUPERB: Speech processing Universal PERformance Benchmark

August 30, 2021 By: Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Daniel Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Godic Lee, Darong Liu, Zili Huang, Annie Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee Abstract Using self-supervised learning methods to pre-train a network on large volumes of unlabeled data followed by fine-tuning for multiple downstream tasks has proven vital for advancing research in natural language representation learning. However, […]
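One pattern such benchmarks commonly evaluate is keeping the pretrained speech encoder frozen and training only a lightweight task head on its representations. The sketch below shows that pattern with a placeholder encoder; it is not SUPERB's actual evaluation code.

```python
import torch
import torch.nn as nn

class FrozenEncoderProbe(nn.Module):
    """Stand-in for probing a pretrained speech encoder with a lightweight head.

    `encoder` is any module mapping waveforms to (batch, time, dim) features,
    standing in for a self-supervised upstream model. Only the head is trained.
    """
    def __init__(self, encoder: nn.Module, feature_dim: int, num_classes: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                      # upstream model stays frozen
        self.head = nn.Linear(feature_dim, num_classes)  # lightweight downstream head

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.encoder(wav)        # (batch, time, feature_dim)
        return self.head(feats.mean(dim=1))  # pool over time, then classify
```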

Read more

Beginner’s Guide To Text Classification Using PyCaret

Introduction Have you ever solved a machine learning problem in just one go? Solving a problem using machine learning isn’t straightforward; it involves several steps to arrive at an accurate solution. The sequence of steps followed to solve an ML problem is known as the ML Pipeline (or ML Cycle). [Figure: ML Pipeline / ML Cycle (credits: https://medium.com/analytics-vidhya/machine-learning-development-life-cycle-dfe88c44222e)] As shown in the figure, the machine learning pipeline consists of steps such as: Understand the Problem Statement, Hypothesis Generation, Exploratory Data Analysis, Data Preprocessing, Feature Engineering, […]
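To make those pipeline stages concrete, here is a minimal scikit-learn sketch of a text-classification pipeline covering feature engineering and model training; the article itself walks through the process with PyCaret, so treat this only as an illustration of the steps, with toy data invented for the example.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data standing in for a real labeled text dataset.
texts = ["great movie", "terrible plot", "loved the acting", "waste of time"]
labels = [1, 0, 1, 0]

clf = Pipeline([
    ("features", TfidfVectorizer(lowercase=True, stop_words="english")),  # feature engineering
    ("model", LogisticRegression(max_iter=1000)),                         # model training
])
clf.fit(texts, labels)
print(clf.predict(["the acting was great"]))
```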

Read more

A friendly guide to NLP: Bag-of-Words with Python example

1. A Quick Example Let’s look at an easy example to understand the concepts explained previously. Suppose we are interested in analyzing reviews of Game of Thrones: Review 1: Game of Thrones is an amazing tv series! Review 2: Game of Thrones is the best tv series! Review 3: Game of Thrones is so great In the table, I show all the calculations behind the Bag-of-Words representation: each row corresponds to a different review, while the columns are […]
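The same table can be reproduced programmatically; a minimal sketch using scikit-learn's CountVectorizer (not part of the original article) is shown below.

```python
from sklearn.feature_extraction.text import CountVectorizer

reviews = [
    "Game of Thrones is an amazing tv series!",
    "Game of Thrones is the best tv series!",
    "Game of Thrones is so great",
]

vectorizer = CountVectorizer()             # builds the vocabulary and counts word occurrences
bow = vectorizer.fit_transform(reviews)    # one row per review, one column per word

print(vectorizer.get_feature_names_out())  # the vocabulary (table columns)
print(bow.toarray())                       # the Bag-of-Words counts (table rows)
```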

Read more

Don’t Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers

July 23, 2021 By: Danielle Rothermel, Margaret Li, Tim Rocktäschel, Jakob Foerster Abstract Self-supervised pre-training of large-scale transformer models on text corpora followed by fine-tuning has achieved state-of-the-art results on a number of natural language processing tasks. Recently, Lu et al. (2021) claimed that frozen pretrained transformers (FPTs) match or outperform training from scratch as well as unfrozen (fine-tuned) pretrained transformers in a set of transfer tasks to other modalities. In our work, we find that this result is, in fact, […]
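A generic sketch of the kind of learning-rate sweep the title alludes to, comparing frozen and fine-tuned configurations only at their respective best learning rates; `train_and_evaluate` is a hypothetical helper standing in for any training loop, and none of this is the authors' code.

```python
def sweep(train_and_evaluate, learning_rates=(1e-5, 3e-5, 1e-4, 3e-4, 1e-3)):
    """Sweep learning rates for frozen vs. fully fine-tuned transformers.

    `train_and_evaluate(freeze_transformer, learning_rate)` is assumed to train a
    model with those settings and return a validation metric (higher is better).
    """
    results = {}
    for freeze in (True, False):                    # frozen pretrained vs. fine-tuned
        best_lr, best_score = None, float("-inf")
        for lr in learning_rates:
            score = train_and_evaluate(freeze_transformer=freeze, learning_rate=lr)
            if score > best_score:
                best_lr, best_score = lr, score
        results[freeze] = (best_lr, best_score)     # compare settings only at their best LR
    return results
```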

Read more

Many-Speakers Single Channel Speech Separation with Optimal Permutation Training

Abstract Single channel speech separation has experienced great progress in the last few years. However, training neural speech separation for a large number of speakers (e.g., more than 10 speakers) is out of reach for the current methods, which rely on Permutation Invariant Training (PIT). In this work, we present a permutation invariant training that employs the Hungarian algorithm in order to train with an O(C³) time complexity, where C is the number of speakers, in comparison […]
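The O(C³) term comes from the Hungarian algorithm for the assignment problem. A minimal sketch (with a simple MSE standing in for the actual separation loss) shows how the optimal speaker permutation can be found from a pairwise loss matrix without enumerating all C! orderings.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm, O(C^3)

def best_permutation(estimates: np.ndarray, references: np.ndarray):
    """Find the speaker permutation minimizing the total pairwise loss.

    estimates, references: (C, T) arrays of C separated / reference signals.
    A plain MSE stands in for whatever separation loss is actually used.
    """
    # Pairwise loss matrix: cost[i, j] = loss of assigning estimate i to reference j.
    cost = ((estimates[:, None, :] - references[None, :, :]) ** 2).mean(axis=-1)
    rows, cols = linear_sum_assignment(cost)  # optimal assignment without trying all C! orders
    return cols, cost[rows, cols].sum()

# Toy usage with 10 "speakers" of random signals.
est = np.random.randn(10, 16000)
ref = np.random.randn(10, 16000)
perm, total_loss = best_permutation(est, ref)
print(perm)
```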

Read more

Predict the next word of your text using Long Short Term Memory (LSTM)

This article was published as a part of the Data Science Blogathon. Introduction: Natural language processing has been an active area of research and is widely used in many applications. We love texting each other, and whenever we start typing, a suggestion pops up trying to predict the next word we want to write. This kind of prediction is one of the applications NLP deals with. We have made huge progress here and we can use […]
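A minimal PyTorch sketch of the kind of model such an article typically builds: an embedding layer, an LSTM, and a linear layer scoring the vocabulary for the next word. The sizes and names are placeholders rather than the article's exact configuration.

```python
import torch
import torch.nn as nn

class NextWordLSTM(nn.Module):
    """Toy next-word prediction model: embed tokens, run an LSTM, score the vocabulary."""
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) word indices for the text typed so far
        emb = self.embed(token_ids)
        output, _ = self.lstm(emb)
        return self.out(output[:, -1, :])  # logits over the vocabulary for the next word

model = NextWordLSTM(vocab_size=5000)
logits = model(torch.randint(0, 5000, (1, 6)))  # a 6-word context
print(logits.argmax(dim=-1))                    # index of the predicted next word
```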

Read more