Articles About Natural Language Processing

Course Recommendations for Introductory Machine Learning

Before you jump into deep learning, I would strongly advise you to take a few introductory machine learning courses to get up to speed with fundamental concepts like clustering, regression, and evaluation metrics. Here is a thread with a few recent courses you can explore. (This is a crosspost of a Twitter thread I published earlier this week.) Elements of AI by University of Helsinki. Note: I have taken many machine learning courses online. I do some courses for fun […]

Read more

My Recommendations for Getting Started with NLP

I have been studying natural language processing (NLP) since 2013, back when manual feature engineering was very popular in machine learning. We have come a long way since then. For my Ph.D., I specialized in information retrieval and machine learning techniques, particularly how they apply to social computing and computational linguistics, while also developing approaches for efficient information extraction from large-scale text data. I am fortunate to have experience with classical machine learning […]

Read more

Performance and Efficiency Evaluation of ASR Inference on the Edge

Abstract Automatic speech recognition (ASR), the process of converting speech signals to text, has improved a great deal in the past decade thanks to deep learning-based systems. With the latest transformer-based models, recognition accuracy, measured as word error rate (WER), is even below the human annotator error rate (4%). However, most of these advanced models run on large servers with large amounts of memory and CPU/GPU resources, and have a huge carbon footprint. This server-based architecture of ASR is not viable […]
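
Since the abstract's headline number is a WER, here is a minimal self-contained sketch of how WER is computed: word-level edit distance normalized by reference length. The function name `wer` is ours, not the paper's.

```python
# Minimal sketch of word error rate (WER):
# WER = (substitutions + deletions + insertions) / reference length,
# computed via Levenshtein edit distance over words.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the"): 2/6 ≈ 0.33
print(wer("the cat sat on the mat", "the cat sit on mat"))
```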

Read more

Findings of the WMT 2021 Shared Task on Large-Scale Multilingual Machine Translation

November 8, 2021 By: Guillaume Wenzek, Vishrav Chaudhary, Angela Fan, Sahir Gomez, Naman Goyal, Somya Jain, Douwe Kiela, Tristan Thrush, Francisco Guzmán Abstract We present the results of the first task on Large-Scale Multilingual Machine Translation. The task consists of the many-to-many evaluation of a single model across a variety of source and target languages. This year, the task consisted of three different settings: (i) SMALL-TASK1 (Central/South-Eastern European languages), (ii) SMALL-TASK2 (South East Asian languages), and (iii) FULL-TASK (all […]
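
As an illustration of the many-to-many setting (not the task's official evaluation code), a single publicly available multilingual model such as facebook/m2m100_418M can translate between arbitrary language pairs via Hugging Face Transformers:

```python
# Illustrative sketch only: one model, many translation directions.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

def translate(text: str, src_lang: str, tgt_lang: str) -> str:
    tokenizer.src_lang = src_lang                  # set the source language
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        # Force decoding to start with the target-language token.
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

# Many-to-many: e.g. Croatian -> Hungarian and Thai -> English,
# without any English pivot step.
print(translate("Život je lijep.", src_lang="hr", tgt_lang="hu"))
print(translate("ชีวิตสวยงาม", src_lang="th", tgt_lang="en"))
```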

Read more

Findings of the WMT 2021 Shared Task on Quality Estimation

November 8, 2021 By: Lucia Specia, Frédéric Blain, Marina Fomicheva, Chrysoula Zerva, Zhenhao Li, Vishrav Chaudhary, André F. T. Martins Abstract We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions: (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. […]
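
The sentence-level track asks systems to score (source, translation) pairs. Below is a minimal sketch of one common approach, assuming a multilingual encoder (xlm-roberta-base here) with a regression head; the head is untrained in this snippet, and none of this is the task's official baseline.

```python
# Hedged sketch of sentence-level quality estimation: encode the
# (source, translation) pair jointly, then regress a quality score.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
regressor = torch.nn.Linear(encoder.config.hidden_size, 1)  # quality head

def quality_score(source: str, translation: str) -> float:
    # Encode the pair as one sequence; the first token summarizes it.
    batch = tokenizer(source, translation, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state[:, 0]  # (1, hidden_size)
    # Untrained here; in practice the head is fit to human quality labels.
    return regressor(hidden).item()

print(quality_score("Das Haus ist klein.", "The house is small."))
```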

Read more

The NLP Cypher | 11.21.21

Hey … so have you ever deployed a state-of-the-art, production-level inference server? Don’t know how to do it? Well… last week, Michael Benesty dropped a bomb when he published one of the first detailed blog posts on how to not only deploy a production-level inference API but also benchmark some of the most widely used frameworks, such as FastAPI and Triton servers, and runtime engines, such as ONNX Runtime (ORT) and TensorRT (TRT). Eventually, Michael recreated Hugging Face’s ability […]
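
For a flavor of the benchmarking involved, here is a minimal latency-measurement sketch against ONNX Runtime. The model path and input names ("model.onnx", "input_ids", "attention_mask") are assumptions for a transformer export, not taken from Michael's post.

```python
# Hedged sketch (not the blog's actual code): timing inference with
# ONNX Runtime, one of the engines benchmarked in the post.
import time
import numpy as np
import onnxruntime as ort

# Assumes a model has already been exported to "model.onnx".
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Dummy batch; real benchmarks sweep batch size and sequence length.
inputs = {
    "input_ids": np.ones((1, 128), dtype=np.int64),
    "attention_mask": np.ones((1, 128), dtype=np.int64),
}

for _ in range(10):            # warm-up runs, excluded from timing
    session.run(None, inputs)

n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    session.run(None, inputs)
latency_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"mean latency: {latency_ms:.2f} ms")
```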

Read more

Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN

Abstract Despite their failure to solve the compositional SCAN dataset, seq2seq architectures still achieve astonishing success on more practical tasks. This observation pushes us to question the usefulness of SCAN-style compositional generalization in realistic NLP tasks. In this work, we study the benefit that such compositionality brings to several machine translation tasks. We present several focused modifications of Transformer that greatly improve generalization capabilities on SCAN and select one that remains on par with a vanilla Transformer on a […]
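
For readers unfamiliar with SCAN: commands compose a small set of primitives with modifiers, and the splits are designed so that some compositions appear only at test time. A toy interpreter for a tiny fragment of the grammar:

```python
# Toy illustration of the SCAN setup; covers only a small fragment
# of the real grammar (no "and", "after", "around", directions, etc.).
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def execute(command: str) -> str:
    tokens = command.split()
    action = PRIMITIVES[tokens[0]]
    if len(tokens) == 1:
        return action
    if tokens[1] == "twice":
        return " ".join([action] * 2)
    if tokens[1] == "thrice":
        return " ".join([action] * 3)
    raise ValueError(f"unsupported modifier: {tokens[1]}")

print(execute("jump"))        # JUMP
print(execute("jump twice"))  # JUMP JUMP
# A compositional split might train on "walk twice" and "jump" but hold
# out "jump twice", testing whether the model composes what it knows.
```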

Read more

Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules

Abstract One of the challenges faced by conversational agents is their inability to identify unstated presumptions of their users’ commands, a task trivial for humans due to their common sense. In this paper, we propose a zero-shot commonsense reasoning system for conversational agents in an attempt to achieve this. Our reasoner uncovers unstated presumptions from user commands satisfying a general template of if-(state), then-(action), because-(goal). Our reasoner uses a state-of-the-art transformer-based generative commonsense knowledge base (KB) as […]
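
A minimal sketch of that template as a data structure; the class and field names here are illustrative, not the paper's:

```python
# Sketch of the if-(state), then-(action), because-(goal) template
# described in the abstract, as a simple dataclass.
from dataclasses import dataclass

@dataclass
class Presumption:
    state: str    # unstated condition that should hold
    action: str   # the user's command
    goal: str     # inferred intent behind the command

    def as_rule(self) -> str:
        return f"if ({self.state}), then ({self.action}), because ({self.goal})"

# E.g. the command "turn on the sprinklers" presumes the grass is dry.
p = Presumption(state="the grass is dry",
                action="turn on the sprinklers",
                goal="water the grass")
print(p.as_rule())
```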

Read more

DOBF: A Deobfuscation Pre-Training Objective for Programming Languages

Abstract Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks. However, research in language model pre-training has mostly focused on natural languages, and it is unclear whether models like BERT and its variants provide the best pre-training when applied to other modalities, such as source code. In this paper, we introduce a new pre-training objective, DOBF, that leverages the structural aspect of programming languages and pre-trains a model to recover […]
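
A rough sketch of how such a deobfuscation objective's input can be constructed: identifiers are replaced by placeholder tokens, and the model is trained to recover the original names. This is a naive string-level simplification; the placeholder scheme and the use of parsed code here are assumptions, not the paper's exact pipeline.

```python
# Hedged sketch: obfuscate identifiers so a model can be trained to
# recover the mapping. Keyword filtering here is a toy stand-in for
# proper parsing of the source code.
import re

def obfuscate(code: str) -> tuple[str, dict]:
    mapping = {}
    def repl(match):
        name = match.group(0)
        if name in ("def", "return"):       # keep keywords (toy filter)
            return name
        if name not in mapping:
            mapping[name] = f"V_{len(mapping)}"
        return mapping[name]
    return re.sub(r"[A-Za-z_]\w*", repl, code), mapping

code = "def total_price(items):\n    return sum(items)"
obfuscated, mapping = obfuscate(code)
print(obfuscated)  # def V_0(V_1):\n    return V_2(V_1)
print(mapping)     # {'total_price': 'V_0', 'items': 'V_1', 'sum': 'V_2'}
```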

Read more

Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR

January 9, 2022 By: Xiaohui Zhang, Frank Zhang, Chunxi Liu, Kjell Schubert, Julian Chan, Pradyot Prakash, Jun Liu, Ching-Feng Yeh, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig Abstract In this work, to measure the accuracy and efficiency for a latency-controlled streaming automatic speech recognition (ASR) application, we perform comprehensive evaluations on three popular training criteria: LF-MMI, CTC and RNN-T. In transcribing social media videos in 7 languages, with 3K–14K hours of training data, we conduct large-scale controlled experimentation across each […]
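
Of the three criteria, CTC is the most readily available off the shelf. A minimal sketch using PyTorch's built-in torch.nn.CTCLoss, with purely illustrative shapes and random inputs (not the paper's setup):

```python
# Hedged sketch: computing the CTC criterion with PyTorch's built-in loss.
import torch

T, N, C = 50, 2, 30  # input frames, batch size, classes (incl. blank=0)
# Acoustic-model log-probabilities, shape (T, N, C) as CTCLoss expects.
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=2)
targets = torch.randint(1, C, (N, 12), dtype=torch.long)  # label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

ctc = torch.nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # differentiable, so usable directly as a training criterion
print(loss.item())
```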

Read more