Machine Translation Weekly 90: The Surprising Multilinguality of Large Language Models

This week, I am going to share my amazement and doubts about what could be called the surprising multilinguality of large language models. By large language models, I mean the really large ones that I can hardly run myself, trained on huge, hardly curated data and thus harbouring the worst societal demons, but also having many fascinating properties. Here, I would like to feature three papers that make me think about the properties of the models. 1. Finetuning to other […]

A Roadmap to XML Parsers in Python

If you’ve ever tried to parse an XML document in Python before, then you know how surprisingly difficult such a task can be. On the one hand, the Zen of Python promises only one obvious way to achieve your goal. At the same time, the standard library follows the batteries included motto by letting you choose from not one but several XML parsers. Luckily, the Python community solved this surplus problem by creating even more XML parsing libraries. Jokes aside, […]
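As a minimal sketch of what "several XML parsers" in the standard library can look like in practice, the snippet below reads the same small document with two of them, `xml.etree.ElementTree` and `xml.dom.minidom`. The sample document and the helper function names are made up for this illustration and do not come from the article.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<library><book id='1'>Dune</book><book id='2'>Emma</book></library>"


def titles_with_elementtree(text):
    """Collect book titles using the ElementTree API."""
    root = ET.fromstring(text)
    # findall("book") matches direct children of the root named "book".
    return [book.text for book in root.findall("book")]


def titles_with_minidom(text):
    """Collect book titles using the DOM-style minidom API."""
    document = xml.dom.minidom.parseString(text)
    # Each <book> element holds a single text node as its first child.
    return [
        node.firstChild.data
        for node in document.getElementsByTagName("book")
    ]


if __name__ == "__main__":
    print(titles_with_elementtree(SAMPLE))
    print(titles_with_minidom(SAMPLE))
```

Both helpers return the same list of titles; the difference lies in the programming model, with ElementTree exposing a lightweight tree of elements and minidom following the more verbose W3C DOM interface.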

AI in Manufacturing: 4 Real-World Examples

Human error causes 23% of unplanned downtime in manufacturing. As you may know, unplanned downtime in manufacturing is a major cause of lost revenue. Can AI help reduce human errors in manufacturing? The quick answer is yes! AI can help mimic human decision-making on specific tasks. For example, when analyzing an image of a traffic stop, AI systems can be trained to detect the presence of objects such as a person, a stop sign, or a road bump. Given an […]

Django blog – complete customization and ready to use with one click installer

django-blog-it: Simple blog package developed with Django.
Features:
    Dynamic blog articles
    Blog pages
    Contact us page (configurable)
    Google Analytics
    SEO compliant
Installation: Install django-blog-it using the following command:
    pip install django-blog-it
(or)
    git clone git://github.com/micropyramid/django-blog-it.git
    cd django-blog-it
    python setup.py install
Add app name in settings.py:
    INSTALLED_APPS = [
        '………………',
        'simple_pagination',
        'django_blog_it.django_blog_it',
        '………………'
    ]
Include the django_blog_it urls in your urls.py:
    from django.conf.urls import

PyTorch Implementation of rpautrat/SuperPoint

    python export_detections_repeatability.py
    python compute_repeatability.py
NOTE: You have to edit the *.yaml files to run the corresponding tasks, especially the following items:
    model:
        name: superpoint # magicpoint
        …
    data:
        name: coco #synthetic
        image_train_path: ['./data/mp_coco_v2/images/train2017',] #several datasets can be listed here
        label_train_path: ['./data/mp_coco_v2/labels/train2017/',]
        image_test_path: './data/mp_coco_v2/images/test2017/'
        label_test_path: './data/mp_coco_v2/labels/test2017/'

Python package to easily retrain OpenAI’s GPT-2 text-generating model on new texts

A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI’s GPT-2 text generation model (specifically the “small” 124M and “medium” 355M hyperparameter versions). Additionally, this package allows easier generation of text, generating to a file for easy curation, allowing for prefixes to force the text to start with a given phrase. This package incorporates and makes minimal low-level changes to: Model management from OpenAI’s official GPT-2 repo (MIT License) Model finetuning from Neil Shepperd’s fork of […]

Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

This package provides spaCy components and architectures to use transformer models via Hugging Face’s transformers in spaCy. The result is convenient access to state-of-the-art transformer architectures, such as BERT, GPT-2, XLNet, etc. This release requires spaCy v3. For the previous version of this library, see the v0.6.x branch. Features Use pretrained transformer models like BERT, RoBERTa and XLNet to power your spaCy pipeline. Easy multi-task learning: backprop to one transformer model from several pipeline components. Train using spaCy v3’s powerful […]

Fast Coreference Resolution in spaCy with Neural Networks

NeuralCoref is a pipeline extension for spaCy 2.1+ which annotates and resolves coreference clusters using a neural network. NeuralCoref is production-ready, integrated in spaCy’s NLP pipeline and extensible to new training datasets. For a brief introduction to coreference resolution and NeuralCoref, please refer to our blog post. NeuralCoref is written in Python/Cython and comes with a pre-trained statistical model for English only. NeuralCoref is accompanied by a visualization client NeuralCoref-Viz, a web interface powered by a REST server that can […]

sense2vec: Contextually-keyed word vectors

sense2vec (Trask et al., 2015) is a nice twist on word2vec that lets you learn more interesting and detailed word vectors. This library is a simple Python implementation for loading, querying and training sense2vec models. For more details, check out our blog post. To explore the semantic similarities across all Reddit comments of 2015 and 2019, see the interactive demo. Version 2.0 (for spaCy v3) out now! Read the release notes here. ✨ Features
