The multitask and transfer learning toolkit for natural language processing research

The multitask and transfer learning toolkit for natural language processing research. Why should I use jiant? A few additional things you might want to know about jiant: jiant is configuration file driven; jiant is built with PyTorch; jiant integrates with datasets to manage task data; jiant integrates with transformers to manage models and tokenizers. Getting Started. Installation. To import jiant from source (recommended for researchers): git clone https://github.com/nyu-mll/jiant.git; cd jiant; pip install -r requirements.txt # Add the following to your […]

A library for Multilingual Unsupervised or Supervised word Embeddings

MUSE: Multilingual Unsupervised and Supervised Embeddings. A library for Multilingual Unsupervised or Supervised word Embeddings. MUSE is a Python library for multilingual word embeddings, whose goal is to provide the community with state-of-the-art multilingual word embeddings (fastText embeddings aligned in a common space) and large-scale, high-quality bilingual dictionaries for training and evaluation. We include two methods: a supervised one that uses a bilingual dictionary or identical character strings, and an unsupervised one that does not use any parallel data (see Word Translation without […]

Multilingual Unsupervised Sentence Simplification by Mining Paraphrases

Multilingual Unsupervised Sentence Simplification. Code and pretrained models to reproduce experiments in “MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases”. Prerequisites: Linux with Python 3.6 or above (not compatible with Python 3.9 yet). Installing: git clone git@github.com:facebookresearch/muss.git; cd muss/; pip install -e . (installs the package); python -m spacy download en_core_web_md fr_core_news_md es_core_news_md (installs the required spaCy models). How to use: Some scripts might still contain a few bugs. If you notice anything wrong, feel free to open an issue […]

Binary LSTM model for text classification

Text Classification. The purpose of this repository is to build a deep learning NLP model for binary classification of texts related to the Ministry of Emergency Situations. Components of the model: This section describes the structure of the project and gives a brief summary of each file; a more detailed description is located inside each module. model_predict.py: This module predicts the topic of a text, that is, whether the text belongs to the […]

Bi-encoder based entity linker for Japanese with Python

jel: Japanese Entity Linker. jel (Japanese Entity Linker) is a bi-encoder based entity linker for Japanese. Currently, the link and question methods are supported. el.link: This returns each named entity and its candidate entities from Wikipedia titles. from jel import EntityLinker; el = EntityLinker(); el.link('今日は東京都のマックにアップルを買いに行き、スティーブジョブスとドナルドに会い、堀田区に引っ越した。') >> [ { "text": "東京都", "label": "GPE", "span": [ 3, 6 ], "predicted_normalized_entities": [ [ "東京都庁", 0.1084 ], [ "東京", 0.0633 ], [ "国家地方警察東京都本部", 0.0604 ], [ "東京都", 0.0598 ], … ] }, { "text": "アップル", […]

Simple translators for text files, microphone recordings or terminal input

lingopy: Simple translators for text files, microphone recordings or terminal input, converting from and to most known languages. Welcome to lingopy, the quick and easy way to translate text or audio files into a foreign language. If you would like to convert a recording of a speech you gave into a different language, or have always wanted to translate a whole text file, then this project may help you out! Translator types: translator.py -> convert texts or messages by just […]

Learning Dense Representations of Phrases at Scale

DensePhrases. DensePhrases is an extractive phrase search tool based on your natural language inputs. From 5 million Wikipedia articles, it can search for phrase-level answers to your questions or find entities related to (subject, relation) pairs in real time. Due to the extractive nature of DensePhrases, it always provides an evidence passage for each phrase. Please see our paper Learning Dense Representations of Phrases at Scale (Lee et al., 2021) for more details. Installation: # Install torch with conda (please check your […]

Blue Brain text mining toolbox for semantic search and structured information extraction

Blue Brain Search. Blue Brain Search is a text mining toolbox for semantic literature search and structured information extraction from text sources. This repository originated from the Blue Brain Project's efforts to explore and mine the CORD-19 dataset. Graphical Interface: The graphical interface is composed of widgets to be used in Jupyter notebooks. For the graphical interface to work, the Getting Started steps must have been completed successfully. Find documents based on sentence semantic similarity: To find […]

A method to pre-train general purpose natural language models

TunBERT. People in Tunisia use the Tunisian dialect in their daily communications, in most of their media (TV, radio, songs, etc.), and on the internet (social media, forums). Yet this dialect is not standardized, which means there is no single way of writing or speaking it. In addition, it has its own lexicon, phonetics, and morphological structures. A robust language model for the Tunisian dialect has become crucial for developing NLP-based applications (translation, information […]

An Efficient Pipeline For Bloom’s Taxonomy Using Natural Language Processing

Pipeline-For-NLP-With-Blooms-Taxonomy. Pipeline for NLP with Bloom’s Taxonomy Using Improved Question Classification and Question Generation Using Deep Learning. This repository contains all the source code needed for the project: An Efficient Pipeline For Bloom’s Taxonomy with Question Generation Using Natural Language Processing and Deep Learning. Outline: Examination assessment by educational institutions is an essential process, since it is one of the fundamental steps in determining a student’s progress and achievements in a given subject or […]
