Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers

image-to-recipe-transformers: code for the CVPR 2021 paper "Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning". This is the PyTorch companion code for the paper: Amaia Salvador, Erhan Gundogdu, Loris Bazzani, and Michael Donoser. Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning. CVPR 2021. If you find this code useful in your research, please consider citing it with the following BibTeX entry: @inproceedings{salvador2021revamping, title={Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning}, author={Salvador, Amaia and Gundogdu, Erhan and […]
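
The retrieval step this line of work builds on is ranking recipes against an image in a shared embedding space. Below is a minimal, hypothetical sketch of that scoring step in PyTorch; the embeddings are placeholders, not the repo's actual API.

```python
import torch
import torch.nn.functional as F

def rank_recipes(image_emb: torch.Tensor, recipe_embs: torch.Tensor) -> torch.Tensor:
    """Rank recipes by cosine similarity to a single image embedding.
    Both inputs are assumed to live in the model's shared embedding space."""
    image_emb = F.normalize(image_emb, dim=-1)      # (d,)
    recipe_embs = F.normalize(recipe_embs, dim=-1)  # (n, d)
    scores = recipe_embs @ image_emb                # (n,) cosine similarities
    return scores.argsort(descending=True)          # best match first

# Toy usage with placeholder 512-d embeddings for 5 candidate recipes.
ranking = rank_recipes(torch.randn(512), torch.randn(5, 512))
print(ranking)
```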

Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval

LightningDOT This repository contains source code and pre-trained/fine-tuned checkpoints for the NAACL 2021 paper "LightningDOT". It currently supports fine-tuning on MSCOCO and Flickr30k. Pre-training code and a demo for FULL MSCOCO retrieval are also available. Some code in this repo is copied/modified from UNITER and DPR. If you find the code useful for your research, please consider citing: @inproceedings{sun2021lightningdot, title={LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval}, author={Sun, Siqi and Chen, Yen-Chun and Li, Linjie and Wang, Shuohang and Fang, Yuwei […]
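
What makes this approach "real-time" is that image embeddings can be indexed offline, so a query reduces to dot products instead of running a cross-attention model per image-text pair. A minimal NumPy sketch of that pattern, with stand-in arrays rather than the repo's encoders:

```python
import numpy as np

# Offline: encode the image collection once into an index matrix.
image_index = np.random.randn(10_000, 768).astype(np.float32)  # placeholder

# Online: encode the text query; retrieval is a single matrix-vector product.
query_emb = np.random.randn(768).astype(np.float32)            # placeholder
scores = image_index @ query_emb        # one dot product per candidate image
top5 = np.argsort(-scores)[:5]          # indices of the best-scoring images
print(top5)
```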

Extract knowledge from raw text in python

This repository is nearly a copy-paste of "From Text to Knowledge: The Information Extraction Pipeline", with some cosmetic updates. I made an installable version to make it easy to evaluate. The original code is available @ trinity-ie. To add some value, I added the LUKE model to predict relations between entities. LUKE is a transformer (same family as BERT); its particularity is that during pre-training it learns parameters dedicated to entities within the attention mechanism. LUKE is in fact a very […]
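
For the relation-prediction step, LUKE exposes an entity-pair classification head through the Hugging Face transformers library. A short sketch, assuming the publicly available TACRED-fine-tuned checkpoint; the sentence and character spans are illustrative:

```python
from transformers import LukeTokenizer, LukeForEntityPairClassification

# LUKE checkpoint fine-tuned for relation classification on TACRED.
name = "studio-ousia/luke-large-finetuned-tacred"
tokenizer = LukeTokenizer.from_pretrained(name)
model = LukeForEntityPairClassification.from_pretrained(name)

text = "Ada Lovelace worked with Charles Babbage on the Analytical Engine."
entity_spans = [(0, 12), (25, 40)]  # character spans of the two entity mentions

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
logits = model(**inputs).logits
print(model.config.id2label[int(logits.argmax(-1))])  # predicted relation label
```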

DeepViT: Towards Deeper Vision Transformer

DeepViT This repo is the official implementation of "DeepViT: Towards Deeper Vision Transformer" and is based on the timm library (https://github.com/rwightman/pytorch-image-models) by Ross Wightman. Deep Vision Transformer was initially described in an arXiv paper, which observes the attention collapse phenomenon when training deep vision transformers: In this paper, we show that, unlike convolutional neural networks (CNNs) that can be improved by stacking more convolutional layers, the performance of ViTs saturates quickly when scaled to be deeper. More specifically, we empirically observe that […]
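
DeepViT's proposed remedy for attention collapse is Re-attention: the per-head attention maps are mixed by a learnable head-to-head matrix before being applied to the values, so deeper layers stop producing near-identical maps. A minimal sketch of that idea, simplified relative to the official implementation (the normalization choice here is an assumption):

```python
import torch
import torch.nn as nn

class ReAttention(nn.Module):
    """Sketch of DeepViT-style Re-attention: softmax attention maps are
    remixed across heads by a learnable H x H matrix (theta)."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.heads, self.scale = heads, (dim // heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.theta = nn.Parameter(torch.eye(heads))  # head-mixing matrix
        self.norm = nn.BatchNorm2d(heads)            # normalize mixed maps
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                            # x: (B, N, dim)
        B, N, _ = x.shape
        qkv = self.qkv(x).view(B, N, 3, self.heads, -1).permute(2, 0, 3, 1, 4)
        q, k, v = qkv                                 # each (B, H, N, d_head)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        attn = torch.einsum("hg,bgnm->bhnm", self.theta, attn)  # mix heads
        attn = self.norm(attn)
        out = (attn @ v).transpose(1, 2).reshape(B, N, -1)
        return self.proj(out)

x = torch.randn(2, 16, 64)
print(ReAttention(64, 8)(x).shape)  # torch.Size([2, 16, 64])
```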

A fast and easy implementation of Transformer with PyTorch

FasySeq FasySeq is shorthand for "Fast and easy sequential modeling toolkit". It aims to provide researchers and developers with a seq2seq model that can be trained efficiently and modified easily. The toolkit is based on the Transformer (Vaswani et al.), and more seq2seq models will be added in the future. Dependency PyTorch >= 1.4, NLTK Result … Structure … To Be Updated top-k and top-p sampling, multi-GPU inference, length penalty in beam search … Preprocess Build Vocabulary createVocab.py NamedArguments Description -f/--file […]
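
Among the listed to-be-updated features, top-k and top-p (nucleus) sampling are standard decoding tricks: logits outside the top k tokens, or outside the smallest set whose probability mass exceeds p, are masked before sampling. A generic PyTorch sketch of that filter (not FasySeq's code):

```python
import torch

def top_k_top_p_filter(logits: torch.Tensor, k: int = 0, p: float = 1.0) -> torch.Tensor:
    """Mask a (vocab,) logit vector to the top-k tokens and/or the smallest
    nucleus whose cumulative probability exceeds p; masked entries get -inf."""
    logits = logits.clone()
    if k > 0:
        kth = torch.topk(logits, k).values[-1]
        logits[logits < kth] = float("-inf")
    if p < 1.0:
        sorted_logits, idx = torch.sort(logits, descending=True)
        probs = torch.softmax(sorted_logits, dim=-1)
        cum = probs.cumsum(dim=-1)
        # Drop tokens whose preceding cumulative mass already exceeds p;
        # this always keeps at least the single most likely token.
        remove = (cum - probs) > p
        logits[idx[remove]] = float("-inf")
    return logits

# Toy usage: sample one token id from the filtered distribution.
filtered = top_k_top_p_filter(torch.randn(100), k=10, p=0.9)
token = torch.multinomial(torch.softmax(filtered, dim=-1), 1)
print(int(token))
```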

The “tl;dr” on a few notable transformer papers

tldr-transformers The tl;dr on a few notable transformer/language model papers, plus other papers (alignment, memorization, etc.). Models: GPT-*, *BERT*, Adapter-*, T5, etc. Each set of notes includes links to the paper, the original code implementation (if available), and the Hugging Face implementation. Here is an example: t5. The transformer papers are presented somewhat chronologically below. Go to the "Notes" column to find the notes for each paper. This repo also includes […]

A Transformer Model for Embodied, Language-guided Visual Task Completion

EmBERT We present Embodied BERT (EmBERT), a transformer-based model that attends to high-dimensional, multi-modal inputs across long temporal horizons for language-conditioned task completion. Additionally, we bridge the gap between successful object-centric navigation models used for non-interactive agents and the language-guided visual task completion benchmark, ALFRED, by introducing object navigation targets for EmBERT training. We achieve competitive performance on the ALFRED benchmark, and EmBERT is the first transformer-based model to successfully handle the long-horizon, dense, multi-modal histories of ALFRED, and […]

Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Daft-Exprt – PyTorch Implementation PyTorch implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis. Validation logs of the synthesized mel-spectrograms and alignments up to 70K steps are shown below (VCTK_val_p237-088). DATASET refers to the name of a dataset, such as VCTK, throughout the following documentation. Dependencies You can install the Python dependencies with pip3 install -r requirements.txt. A Dockerfile is also provided for Docker users. Inference You have to download the pretrained models and put them in output/ckpt/DATASET/. For a […]

Sign Language Transformers (CVPR’20)

Sign Language Transformers (CVPR’20) This repo contains the training and evaluation code for the paper Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation. This code is based on Joey NMT, but modified to realize joint continuous sign language recognition and translation. For text-to-text translation experiments, you can use the original Joey NMT framework. Requirements Download the feature files using the data/download.sh script. [Optional] Create a conda or Python virtual environment. Install the required packages using the […]
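
The "joint" part of the model is an objective that combines a CTC recognition loss over gloss sequences with a cross-entropy translation loss over spoken-language tokens. A schematic PyTorch sketch with illustrative weights and shapes (the names here are not the repo's config keys):

```python
import torch
import torch.nn as nn

# Recognition: CTC over gloss labels; translation: cross-entropy over text.
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
ce_loss = nn.CrossEntropyLoss(ignore_index=0)  # assume 0 is the pad id

def joint_loss(gloss_logprobs, gloss_targets, input_lens, gloss_lens,
               trans_logits, trans_targets, w_rec=1.0, w_trans=1.0):
    # gloss_logprobs: (T, B, gloss_vocab) log-probs from the recognition head
    # trans_logits:   (B, L, text_vocab) outputs of the translation decoder
    rec = ctc_loss(gloss_logprobs, gloss_targets, input_lens, gloss_lens)
    trans = ce_loss(trans_logits.flatten(0, 1), trans_targets.flatten())
    return w_rec * rec + w_trans * trans

# Toy shapes: 50 frames, batch 2, 100 glosses, 12 target tokens, 8k words.
T, B, Vg, L, Vt = 50, 2, 100, 12, 8000
loss = joint_loss(
    torch.randn(T, B, Vg).log_softmax(-1),
    torch.randint(1, Vg, (B, 10)),
    torch.full((B,), T, dtype=torch.long),
    torch.full((B,), 10, dtype=torch.long),
    torch.randn(B, L, Vt),
    torch.randint(1, Vt, (B, L)),
)
print(float(loss))
```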

Transform-Invariant Non-Negative Matrix Factorization

Transform-Invariant Non-Negative Matrix Factorization A comprehensive Python package for Non-Negative Matrix Factorization (NMF) with a focus on learning transform-invariant representations. The package supports multiple optimization backends and can be easily extended to handle application-specific types of transforms. A general introduction to Non-Negative Matrix Factorization and the purpose of this package can be found on the corresponding GitHub Pages. To use this package, you will need Python version 3.7 or higher. The package is available via PyPI, and installation is easiest using pip: […]
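
The excerpt doesn't show the package's own API, but the factorization it generalizes is plain NMF: V ≈ W H with non-negative factors, classically fit with Lee-Seung multiplicative updates. A self-contained NumPy sketch of that baseline (not the package's code):

```python
import numpy as np

def nmf(V, r, iters=200, eps=1e-9):
    """Fit V ~= W @ H with W, H >= 0 via Lee-Seung multiplicative updates.
    This is the plain baseline that transform-invariant NMF extends."""
    m, n = V.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((m, r)), rng.random((r, n))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update dictionary
    return W, H

V = np.random.default_rng(1).random((20, 30))  # non-negative data matrix
W, H = nmf(V, r=5)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative error
```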
