ProtoAttend: Attention-Based Prototypical Learning

Authors: Sercan O. Arik and Tomas Pfister. Paper: "ProtoAttend: Attention-Based Prototypical Learning". Link: https://arxiv.org/abs/1902.06292. We propose a novel, inherently interpretable machine learning method that bases its decisions on a few relevant examples that we call prototypes. Our method, ProtoAttend, can be integrated into a wide range of neural network architectures, including pre-trained models. It utilizes an attention mechanism that relates the encoded representations to samples in order to determine prototypes. The resulting model outperforms state of the […]
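The prototype-selection idea described above can be sketched with plain NumPy. This is an illustrative sketch, not the authors' exact architecture: it assumes encoded candidate ("database") samples are scored against the input's encoding, and the resulting softmax weights both mix the candidates and identify the few highest-weight prototypes that explain the decision.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                               # encoding dimension (illustrative)
query = rng.normal(size=d)          # encoding of the input sample
keys = rng.normal(size=(5, d))      # key encodings of 5 candidate samples
values = rng.normal(size=(5, d))    # value encodings of the same candidates

# Score each candidate against the input, then normalize with softmax.
scores = keys @ query / np.sqrt(d)
weights = np.exp(scores - scores.max())
weights /= weights.sum()            # attention weights sum to 1

# The decision representation is a convex combination of candidates,
# so the top-weighted candidates serve as interpretable prototypes.
mixed = weights @ values
prototypes = np.argsort(weights)[::-1][:2]  # indices of top-2 prototypes
```

Because the weights form a convex combination, inspecting `prototypes` directly shows which training samples dominated the prediction.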

Read more

Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation

Introduction This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation". Installation This repo is built using mmdetection. To install the dependencies, first clone the repository locally: git clone https://github.com/anirudh-chakravarthy/objprop.git Then, install PyTorch 1.1.0, torchvision 0.3.0, and mmcv 0.2.12: conda install pytorch==1.1.0 torchvision==0.3.0 -c pytorch pip install mmcv==0.2.12 Then, install the CocoAPI for YouTube-VIS: conda install cython scipy pip install "git+https://github.com/youtubevos/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"

Read more

AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

This repository provides the overall framework for training and evaluating the audio anti-spoofing systems proposed in "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks". Getting started requirements.txt must be installed for execution. We describe our experiment environment for those who want to reproduce it as closely as possible: pip install -r requirements.txt Our environment (for GPU training) is based on the Docker image pytorch:1.6.0-cuda10.1-cudnn7-runtime, with one NVIDIA Tesla V100 GPU. About 16 GB of GPU memory is required to train AASIST with a batch size of 24 […]

Read more

Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation

This repo hosts the code accompanying the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" (EMNLP 2021). Setup We provide our scripts and modifications to Fairseq. In this section, we describe how to run the code and, for instance, reproduce Table 2 in the paper. Data To view the data as we prepared and used it, switch to the main branch. However, we recommend cloning code from this branch to […]

Read more

A Pytorch Implementation of the Transformer: Attention Is All You Need

Our implementation is largely based on the TensorFlow implementation. Requirements Why This Project? I'm new to PyTorch, so I tried to implement some projects in it. Recently, I read the paper Attention Is All You Need and was impressed by the idea. So that's it. I obtained results similar to the original TensorFlow implementation. Differences from the original paper I don't intend to replicate the paper exactly. Rather, I aim to implement the main ideas in the paper and verify […]
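The core operation this repo implements is the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of just that equation (the repo's actual PyTorch modules add learned projections, masking, and multi-head logic on top):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # similarity of queries to keys
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V, weights
```

In the full Transformer, Q, K, and V are linear projections of the token representations, and multi-head attention runs this routine several times in parallel before concatenating the results.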

Read more

Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is a fork of Fairseq(-py) with implementations of the following models: Pervasive Attention – 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction: an NMT model with two-dimensional convolutions that jointly encode the source and the target sequences. Pervasive Attention also provides an extensive decoding grid that we leverage to efficiently train wait-k models. See the README. Efficient Wait-k Models for Simultaneous Machine Translation: Transformer wait-k models (Ma et al., 2019) with unidirectional encoders and joint training of multiple wait-k paths. […]
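The wait-k policy mentioned above (Ma et al., 2019) is simple to state: the model reads k source tokens before emitting the first target token, then alternates one read per write. A small sketch of that schedule (function name and signature are illustrative, not from the repo):

```python
def waitk_schedule(k, src_len, tgt_len):
    """Wait-k policy: number of source tokens visible when predicting
    target token t. Reads k tokens up front, then one more per step,
    capped at the full source length."""
    return [min(k + t, src_len) for t in range(tgt_len)]
```

For example, with k=3 on a 5-token source, the decoder sees 3, 4, 5, 5, ... source tokens at successive target steps; training jointly over several k values is what the "multiple wait-k paths" above refers to.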

Read more

Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting

ICCV 2021. The baseline of DKPNet is available; currently, only the code of the DKPNet baseline is released. Note that the metric called MSE in our paper is equivalent to RMSE in the academic literature. Please use the term RMSE instead of MSE when referring to the corresponding numerical values in our paper. We are sorry for the mistake, which could not be corrected after the camera-ready deadline. Download the datasets ShanghaiTech A, ShanghaiTech […]
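To make the metric correction above concrete, here is the standard RMSE definition, i.e. the quantity the paper's tables label "MSE" (the function name is illustrative):

```python
import numpy as np

def rmse(pred, gt):
    """Root mean squared error between predicted and ground-truth counts."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.sqrt(np.mean((pred - gt) ** 2)))
```

For crowd counting, `pred` and `gt` would be the predicted and annotated per-image counts; RMSE takes the square root of the mean squared error, so it is reported in the same units as the counts themselves.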

Read more