Articles About Natural Language Processing

Course Recommendations for Introductory Machine Learning

Before you jump into deep learning, I would strongly advise you to take a few introductory machine learning courses to get up to speed with fundamental concepts like clustering, regression, and evaluation metrics. Here is a thread with a few recent courses you can explore. (This is a crosspost of a Twitter thread I published earlier this week.) Elements of AI by University of Helsinki. Note: I have taken many machine learning courses online. I do some courses for fun […]

Read more

My Recommendations for Getting Started with NLP

I have been studying natural language processing (NLP) since 2013, back when manual feature engineering was very popular in machine learning. We have come a long way since then. For my Ph.D., I specialized in information retrieval and machine learning techniques, particularly how they apply to social computing and computational linguistics, while also developing approaches for efficient information extraction from large-scale text data. I am fortunate to have experience with classical machine learning […]

Read more

Performance and Efficiency Evaluation of ASR Inference on the Edge

Abstract Automatic speech recognition (ASR), the process of converting speech signals to text, has improved a great deal in the past decade thanks to deep learning-based systems. With the latest transformer-based models, recognition accuracy, measured as word error rate (WER), is even below the human annotator error rate (4%). However, most of these advanced models run on large servers with large amounts of memory and CPU/GPU resources, and have a huge carbon footprint. This server-based architecture of ASR is not viable […]
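
Since the abstract's headline number is a WER, here is a minimal self-contained sketch of how WER is computed: word-level edit distance normalized by reference length. The function name `wer` is ours, not the paper's.

```python
# Minimal sketch of word error rate (WER):
# WER = (substitutions + deletions + insertions) / reference length,
# computed via Levenshtein edit distance over words.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the"): 2/6 ≈ 0.33
print(wer("the cat sat on the mat", "the cat sit on mat"))
```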

Read more

Findings of the WMT 2021 Shared Task on Large-Scale Multilingual Machine Translation

November 8, 2021 By: Guillaume Wenzek, Vishrav Chaudhary, Angela Fan, Sahir Gomez, Naman Goyal, Somya Jain, Douwe Kiela, Tristan Thrush, Francisco Guzmán Abstract We present the results of the first task on Large-Scale Multilingual Machine Translation. The task consists of the many-to-many evaluation of a single model across a variety of source and target languages. This year, the task consisted of three different settings: (i) SMALL-TASK1 (Central/South-Eastern European languages), (ii) SMALL-TASK2 (South East Asian languages), and (iii) FULL-TASK (all […]
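
As an illustration of the many-to-many setting (not the task's official evaluation code), a single publicly available multilingual model such as facebook/m2m100_418M can translate between arbitrary language pairs via Hugging Face Transformers:

```python
# Illustrative sketch only: one model, many translation directions.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

def translate(text: str, src_lang: str, tgt_lang: str) -> str:
    tokenizer.src_lang = src_lang                  # set the source language
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        # Force decoding to start with the target-language token.
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

# Many-to-many: e.g. Croatian -> Hungarian and Thai -> English,
# without any English pivot step.
print(translate("Život je lijep.", src_lang="hr", tgt_lang="hu"))
print(translate("ชีวิตสวยงาม", src_lang="th", tgt_lang="en"))
```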

Read more

Findings of the WMT 2021 Shared Task on Quality Estimation

November 8, 2021 By: Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins Abstract We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions: (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. […]
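
The sentence-level track asks systems to score (source, translation) pairs. Below is a minimal sketch of one common approach, assuming a multilingual encoder (xlm-roberta-base here) with a regression head; the head is untrained in this snippet, and none of this is the task's official baseline.

```python
# Hedged sketch of sentence-level quality estimation: encode the
# (source, translation) pair jointly, then regress a quality score.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
regressor = torch.nn.Linear(encoder.config.hidden_size, 1)  # quality head

def quality_score(source: str, translation: str) -> float:
    # Encode the pair as one sequence; the first token summarizes it.
    batch = tokenizer(source, translation, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state[:, 0]  # (1, hidden_size)
    # Untrained here; in practice the head is fit to human quality labels.
    return regressor(hidden).item()

print(quality_score("Das Haus ist klein.", "The house is small."))
```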

Read more

The NLP Cypher | 11.21.21

Hey … so have you ever deployed a state-of-the-art, production-level inference server? Don’t know how to do it? Well… last week, Michael Benesty dropped a bomb when he published one of the first detailed blog posts on how to not only deploy a production-level inference API but also benchmark some of the most widely used frameworks, such as FastAPI and Triton servers, and runtime engines, such as ONNX Runtime (ORT) and TensorRT (TRT). Eventually, Michael recreated Hugging Face’s ability […]
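
For a flavor of the benchmarking involved, here is a minimal latency-measurement sketch against ONNX Runtime. The model path and input names ("model.onnx", "input_ids", "attention_mask") are assumptions for a transformer export, not taken from Michael's post.

```python
# Hedged sketch (not the blog's actual code): timing inference with
# ONNX Runtime, one of the engines benchmarked in the post.
import time
import numpy as np
import onnxruntime as ort

# Assumes a model has already been exported to "model.onnx".
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Dummy batch; real benchmarks sweep batch size and sequence length.
inputs = {
    "input_ids": np.ones((1, 128), dtype=np.int64),
    "attention_mask": np.ones((1, 128), dtype=np.int64),
}

for _ in range(10):            # warm-up runs, excluded from timing
    session.run(None, inputs)

n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    session.run(None, inputs)
latency_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"mean latency: {latency_ms:.2f} ms")
```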

Read more

Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN

Abstract Despite their failure to solve the compositional SCAN dataset, seq2seq architectures still achieve astonishing success on more practical tasks. This observation pushes us to question the usefulness of SCAN-style compositional generalization in realistic NLP tasks. In this work, we study the benefit that such compositionality brings to several machine translation tasks. We present several focused modifications of Transformer that greatly improve generalization capabilities on SCAN and select one that remains on par with a vanilla Transformer on a […]
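
For readers unfamiliar with SCAN: commands compose a small set of primitives with modifiers, and the splits are designed so that some compositions appear only at test time. A toy interpreter for a tiny fragment of the grammar:

```python
# Toy illustration of the SCAN setup; covers only a small fragment
# of the real grammar (no "and", "after", "around", directions, etc.).
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def execute(command: str) -> str:
    tokens = command.split()
    action = PRIMITIVES[tokens[0]]
    if len(tokens) == 1:
        return action
    if tokens[1] == "twice":
        return " ".join([action] * 2)
    if tokens[1] == "thrice":
        return " ".join([action] * 3)
    raise ValueError(f"unsupported modifier: {tokens[1]}")

print(execute("jump"))        # JUMP
print(execute("jump twice"))  # JUMP JUMP
# A compositional split might train on "walk twice" and "jump" but hold
# out "jump twice", testing whether the model composes what it knows.
```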

Read more

Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules

Abstract One of the challenges faced by conversational agents is their inability to identify unstated presumptions of their users’ commands, a task trivial for humans due to their common sense. In this paper, we propose a zero-shot commonsense reasoning system for conversational agents in an attempt to achieve this. Our reasoner uncovers unstated presumptions from user commands satisfying a general template of if-(state), then-(action), because-(goal). Our reasoner uses a state-of-the-art transformer-based generative commonsense knowledge base (KB) as […]
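
A minimal sketch of that template as a data structure; the class and field names here are illustrative, not the paper's:

```python
# Sketch of the if-(state), then-(action), because-(goal) template
# described in the abstract, as a simple dataclass.
from dataclasses import dataclass

@dataclass
class Presumption:
    state: str    # unstated condition that should hold
    action: str   # the user's command
    goal: str     # inferred intent behind the command

    def as_rule(self) -> str:
        return f"if ({self.state}), then ({self.action}), because ({self.goal})"

# E.g. the command "turn on the sprinklers" presumes the grass is dry.
p = Presumption(state="the grass is dry",
                action="turn on the sprinklers",
                goal="water the grass")
print(p.as_rule())
```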

Read more

DOBF: A Deobfuscation Pre-Training Objective for Programming Languages

Abstract Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks. However, research in language model pre-training has mostly focused on natural languages, and it is unclear whether models like BERT and its variants provide the best pre-training when applied to other modalities, such as source code. In this paper, we introduce a new pre-training objective, DOBF, that leverages the structural aspect of programming languages and pre-trains a model to recover […]
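
A rough sketch of how such a deobfuscation objective's input can be constructed: identifiers are replaced by placeholder tokens, and the model is trained to recover the original names. This is a naive string-level simplification; the placeholder scheme and the use of parsed code here are assumptions, not the paper's exact pipeline.

```python
# Hedged sketch: obfuscate identifiers so a model can be trained to
# recover the mapping. Keyword filtering here is a toy stand-in for
# proper parsing of the source code.
import re

def obfuscate(code: str) -> tuple[str, dict]:
    mapping = {}
    def repl(match):
        name = match.group(0)
        if name in ("def", "return"):       # keep keywords (toy filter)
            return name
        if name not in mapping:
            mapping[name] = f"V_{len(mapping)}"
        return mapping[name]
    return re.sub(r"[A-Za-z_]\w*", repl, code), mapping

code = "def total_price(items):\n    return sum(items)"
obfuscated, mapping = obfuscate(code)
print(obfuscated)  # def V_0(V_1):\n    return V_2(V_1)
print(mapping)     # {'total_price': 'V_0', 'items': 'V_1', 'sum': 'V_2'}
```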

Read more

Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR

January 9, 2022 By: Xiaohui Zhang, Frank Zhang, Chunxi Liu, Kjell Schubert, Julian Chan, Pradyot Prakash, Jun Liu, Ching-Feng Yeh, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig Abstract In this work, to measure the accuracy and efficiency for a latency-controlled streaming automatic speech recognition (ASR) application, we perform comprehensive evaluations on three popular training criteria: LF-MMI, CTC and RNN-T. In transcribing social media videos in 7 languages, with 3K–14K hours of training data, we conduct large-scale controlled experimentation across each […]
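
Of the three criteria, CTC is the most readily available off the shelf. A minimal sketch using PyTorch's built-in torch.nn.CTCLoss, with purely illustrative shapes and random inputs (not the paper's setup):

```python
# Hedged sketch: computing the CTC criterion with PyTorch's built-in loss.
import torch

T, N, C = 50, 2, 30  # input frames, batch size, classes (incl. blank=0)
# Acoustic-model log-probabilities, shape (T, N, C) as CTCLoss expects.
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=2)
targets = torch.randint(1, C, (N, 12), dtype=torch.long)  # label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

ctc = torch.nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # differentiable, so usable directly as a training criterion
print(loss.item())
```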

Read more