How to Develop a Neural Net for Predicting Disturbances in the Ionosphere

It can be challenging to develop a neural network predictive model for a new dataset. One approach is to first inspect the dataset and develop ideas for what models might work, then explore the learning dynamics of simple models on the dataset, and finally develop and tune a model with a robust test harness. This process can be used to develop effective neural network models for classification and regression predictive modeling problems. In this tutorial, you will […]
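
As a taste of the kind of model such a tutorial builds, here is a minimal sketch (not the tutorial's exact harness): a small Keras MLP on the UCI ionosphere dataset, assuming the dataset URL below is reachable and TensorFlow is installed.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Assumed data location: a headerless CSV with 34 features + a 'g'/'b' label.
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv"
df = pd.read_csv(url, header=None)
X = df.values[:, :-1].astype("float32")
y = LabelEncoder().fit_transform(df.values[:, -1])  # 'g'/'b' -> 1/0

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# A deliberately simple model to study learning dynamics first.
model = Sequential([
    Dense(10, activation="relu", input_shape=(X.shape[1],)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)
print("test accuracy:", model.evaluate(X_test, y_test, verbose=0)[1])
```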

Read more

Issue #117 – Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation

11 Feb 2021. Author: Dr. Jingyi Han, Machine Translation Scientist @ Iconic. Introduction: Nowadays, zero-shot machine translation is receiving more and more attention due to the high cost of building new engines for each language direction. The underlying principle of this strategy is to build a single model that learns to translate between different language pairs without direct training on those combinations. Following the […]
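
The single-model setup typically relies on prepending a target-language tag to each source sentence, so one model covers many directions and unseen pairs can be requested at inference time. A minimal sketch of that preprocessing convention (the tag format and example pairs are illustrative, not the paper's):

```python
# Target-language tagging for a single multilingual model: train only on
# pivot (bridge-language) directions, then request an unseen pair zero-shot.
def tag_source(src_sentence: str, tgt_lang: str) -> str:
    return f"<2{tgt_lang}> {src_sentence}"

# Training data covers en<->de and en<->fr only (toy examples).
train_pairs = [
    (tag_source("How are you?", "de"), "Wie geht es dir?"),
    (tag_source("Wie geht es dir?", "en"), "How are you?"),
]

# At inference, a de -> fr request is zero-shot: never seen in training.
zero_shot_input = tag_source("Wie geht es dir?", "fr")
```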

Read more

Hugging Face – Issue 7 – Feb 9th 2021

News: New Year, New Website! Our vision for the future of machine learning is one step closer to reality thanks to the 1,000+ researchers & open-source contributors, thousands of companies & the fantastic Hugging Face team! Last month, we announced the launch of the latest version of huggingface.co and we couldn’t be more proud. 🔥 Play live with >10 billion-parameter models for tasks including translation, NER, zero-shot classification, and […]
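
One of those hosted tasks, zero-shot classification, can also be tried locally with the transformers pipeline API; a minimal sketch (the default model is downloaded on first use):

```python
from transformers import pipeline

# Builds a zero-shot classifier; no task-specific training data required.
classifier = pipeline("zero-shot-classification")
result = classifier(
    "Hugging Face relaunched its website last month.",
    candidate_labels=["technology", "sports", "politics"],
)
# labels are returned sorted by score, highest first
print(result["labels"][0], result["scores"][0])
```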

Read more

Introduction to Hugging Face’s Transformers v4.3.0 and its First Automatic Speech Recognition Model – Wav2Vec2

Overview: Hugging Face has released Transformers v4.3.0, which introduces the library's first Automatic Speech Recognition model: Wav2Vec2. Using one hour of labeled data, Wav2Vec2 outperforms the previous state of the art on the 100-hour LibriSpeech subset while using 100 times less labeled data. With just ten minutes of labeled data and pre-training on 53k hours of unlabeled data, Wav2Vec2 achieves 4.8/8.2 WER. The article walks through the transformers implementation of Wav2Vec2 for audio-to-text generation. Introduction: Transformers has been […]
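
For reference, transcription with the model looks roughly like this in current transformers versions (the exact classes have shifted slightly since v4.3.0), assuming a 16 kHz mono waveform already loaded as a float array:

```python
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Checkpoint fine-tuned on 960 hours of LibriSpeech.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# `waveform` is assumed: a 1-D float array of 16 kHz mono audio,
# loaded elsewhere (e.g. with soundfile or librosa).
inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: take the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```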

Read more

Function Optimization With SciPy

Optimization involves finding the inputs to an objective function that result in the minimum or maximum output of the function. SciPy, the open-source Python library for scientific computing, provides a suite of optimization algorithms. Many of these algorithms are used as building blocks within other algorithms, most notably the machine learning algorithms in the scikit-learn library. They can also be used directly, in a standalone manner, to optimize a function. Most notably, algorithms for local search and algorithms […]
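
For example, minimizing a simple convex function with scipy.optimize.minimize, one of the library's standard local-search entry points:

```python
from scipy.optimize import minimize

# Objective: a bowl-shaped function with its minimum at (0, 0).
def objective(x):
    return x[0] ** 2 + x[1] ** 2

# L-BFGS-B is a quasi-Newton local-search algorithm; x0 is the start point.
result = minimize(objective, x0=[1.0, 1.5], method="L-BFGS-B")
print(result.x, result.fun)  # approximately [0, 0] and 0.0
```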

Read more

Speller100: Zero-shot spelling correction at scale for 100-plus languages

At Microsoft Bing, our mission is to delight users everywhere with the best search experience. We serve a diverse set of customers all over the planet who issue queries in over 100 languages. In search we’ve found about 15% of queries submitted by customers have misspellings. When queries are misspelled, we match the wrong set of documents and trigger incorrect answers, which can produce a suboptimal results page for our customers. Therefore, spelling correction is the very first component in […]

Read more

Summarising Historical Text in Modern Languages

de №11 Story (translated from German): Work in the local arsenal has been slackening for a long time now, and since the Persians have been so badly beaten by the Russians, nothing more is heard at all of war preparations in the Turkish provinces. The Porte had not believed that Russia would send so strong a force to the shores of the Caspian Sea, nor that the war with the Persians would take such a decisive turn so soon. All the war news we now receive from the Turkish provinces […]

Read more

Spark NLP: Natural Language Understanding at Scale

Abstract: Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with 1,100+ pretrained pipelines and models in 192+ languages, and it supports nearly all common NLP tasks with modules that can be used seamlessly in a cluster. Downloaded more than 2.7 million times and experiencing 9x growth since January […]
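
Using one of those pretrained pipelines takes only a few lines; a minimal sketch, assuming Spark NLP and PySpark are installed and the English `explain_document_dl` pipeline is available for download:

```python
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

# Starts a local Spark session configured for Spark NLP.
spark = sparknlp.start()

# Downloads and caches a pretrained English pipeline on first use.
pipeline = PretrainedPipeline("explain_document_dl", lang="en")
result = pipeline.annotate("Spark NLP ships hundreds of pretrained pipelines.")

# The result is a dict of annotation lists keyed by annotator output name.
print(result["entities"], result["pos"])
```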

Read more

Attention Can Reflect Syntactic Structure (If You Let It)

Abstract: Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of this work has focused almost exclusively on English, a language with rigid word order and little inflectional morphology. In this study, we present decoding experiments for multilingual BERT across 18 languages to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We […]
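
Extracting the per-head attention matrices that such studies probe is straightforward with the transformers library; a minimal sketch (this only exposes the weights, not the paper's dependency-decoding step):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained(
    "bert-base-multilingual-cased", output_attentions=True
)

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one (batch, heads, seq_len, seq_len) tensor per layer.
last_layer = outputs.attentions[-1][0]        # heads x seq_len x seq_len
head0_targets = last_layer[0].argmax(dim=-1)  # most-attended position per token
print(head0_targets)
```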

Read more