Res2Net: A New Multi-scale Backbone Architecture

Res2Net The official PyTorch implementation of the paper “Res2Net: A New Multi-scale Backbone Architecture”. Our paper is accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). We propose a novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within one single residual block. The Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. The proposed Res2Net block can be plugged into state-of-the-art backbone CNN models, e.g., […]
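Below is a minimal PyTorch sketch of the hierarchical residual-like connections described in the excerpt; the module name, the scale (number of channel splits), and the channel sizes are illustrative choices, not the official implementation.

```python
import torch
import torch.nn as nn

class Res2NetSplit(nn.Module):
    """Splits channels into `scale` groups and chains 3x3 convs across them,
    so each group sees a progressively larger receptive field."""
    def __init__(self, channels: int, scale: int = 4):
        super().__init__()
        assert channels % scale == 0
        self.width = channels // scale
        # One 3x3 conv per group except the first, which is passed through as-is.
        self.convs = nn.ModuleList([
            nn.Conv2d(self.width, self.width, kernel_size=3, padding=1)
            for _ in range(scale - 1)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        splits = torch.split(x, self.width, dim=1)
        out, prev = [splits[0]], None          # first split: identity
        for i, conv in enumerate(self.convs):
            inp = splits[i + 1] if prev is None else splits[i + 1] + prev
            prev = conv(inp)                   # hierarchical residual-like connection
            out.append(prev)
        return torch.cat(out, dim=1)

# Usage: drops in where a plain 3x3 conv would sit inside a bottleneck block.
y = Res2NetSplit(channels=64, scale=4)(torch.randn(2, 64, 32, 32))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```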

Read more

Unsupervised Multi-hop Question Answering by Question Generation

This repository contains code and models for the paper: Unsupervised Multi-hop Question Answering by Question Generation (NAACL 2021). We propose MQA-QG, an unsupervised question answering framework that can generate human-like multi-hop training pairs from both homogeneous and heterogeneous data sources. We find that we can train a competent multi-hop QA model with only generated data. The F1 gap between the unsupervised and fully-supervised models is less than 20 on both the HotpotQA and HybridQA datasets. Pretraining a multi-hop QA […]

Read more

Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

groomed_nms GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection CVPR 2021 Abhinav Kumar, Garrick Brazil, Xiaoming Liu project, supp, 5min_talk, slides, demo, poster, arxiv This code is based on Kinematic-3D, so the setup/organization is very similar. A few of the implementations, such as classical NMS, are based on Caffe. References Please cite the following paper if you find this repository useful: @inproceedings{kumar2021groomed, title={{GrooMeD-NMS}: Grouped Mathematically Differentiable NMS for Monocular {$3$D} Object Detection}, author={Kumar, Abhinav and Brazil, Garrick […]
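For context, the sketch below shows the classical greedy NMS that a grouped, differentiable formulation replaces; the NumPy implementation, box layout [x1, y1, x2, y2], and IoU threshold are illustrative assumptions, not taken from this repository.

```python
import numpy as np

def classical_nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping boxes, repeat."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU between the kept box and the remaining candidates.
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Hard, non-differentiable pruning step, i.e. the part a differentiable NMS relaxes.
        order = order[1:][iou <= iou_thr]
    return keep
```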

Read more

Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

PAConv PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds by Mutian Xu*, Runyu Ding*, Hengshuang Zhao, and Xiaojuan Qi. Introduction This repository is built for the official implementation of: PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds (CVPR2021) [arXiv] If you find our work useful in your research, please consider citing: @inproceedings{xu2021paconv, title={PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds}, author={Xu, Mutian and Ding, Runyu and Zhao, Hengshuang and Qi, Xiaojuan}, […]

Read more

Regularizing Generative Adversarial Networks under Limited Data

lecam-gan Regularizing Generative Adversarial Networks under Limited Data Implementation for our GAN regularization method. The proposed regularization 1) improves the performance of GANs under limited training data, and 2) complements the existing data augmentation approaches. Please note that this is not an officially supported Google product. Paper Please cite our paper if you find the code or dataset useful for your research. Regularizing Generative Adversarial Networks under Limited Data Hung-Yu Tseng, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang Computer […]

Read more

A Unified Framework for Self-Supervised Outlier Detection

SSD: A Unified Framework for Self-Supervised Outlier Detection [ICLR 2021] Pdf: https://openreview.net/forum?id=v5gjXpmR8J Code for our ICLR 2021 paper on outlier detection, titled SSD, which does not require class labels of in-distribution training data. We leverage recent advances in self-supervised representation learning followed by cluster-based outlier detection to achieve competitive performance. This repository supports both self-supervised training of networks and outlier detection evaluation of pre-trained networks. It also includes code for the two proposed extensions in the paper, i.e., 1) Few-shot outlier […]
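As a rough illustration of that pipeline, the sketch below scores test samples by their distance to k-means clusters fit on in-distribution self-supervised features; the choice of k-means with Euclidean distance is a simplification for exposition, not necessarily the exact scoring rule used in SSD.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_clusters(train_feats: np.ndarray, k: int = 10) -> np.ndarray:
    """Cluster in-distribution features; no class labels are required."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(train_feats).cluster_centers_

def outlier_score(test_feats: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Score each sample by its distance to the nearest in-distribution cluster center."""
    d = np.linalg.norm(test_feats[:, None, :] - centers[None, :, :], axis=-1)
    return d.min(axis=1)  # larger score = more likely an outlier

# Usage with random stand-in features (in practice, encoder embeddings):
rng = np.random.default_rng(0)
centers = fit_clusters(rng.normal(size=(1000, 128)))
print(outlier_score(rng.normal(size=(5, 128)), centers))
```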

Read more

A friendly guide to NLP: Bag-of-Words with Python example

1. A Quick Example Let’s look at an easy example to understand the concepts previously explained. We could be interested in analyzing the reviews about Game of Thrones: Review 1: Game of Thrones is an amazing tv series! Review 2: Game of Thrones is the best tv series! Review 3: Game of Thrones is so great In the table, I show all the calculations to obtain the Bag-of-Words approach: Each row corresponds to a different review, while the columns are […]
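The counts in that table can be reproduced with a few lines of Python; CountVectorizer is one common way to build the Bag-of-Words matrix, though the article may compute it by hand.

```python
from sklearn.feature_extraction.text import CountVectorizer

reviews = [
    "Game of Thrones is an amazing tv series!",
    "Game of Thrones is the best tv series!",
    "Game of Thrones is so great",
]

vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(reviews)    # rows = reviews, columns = vocabulary words
print(vectorizer.get_feature_names_out())  # the vocabulary
print(bow.toarray())                       # word counts per review
```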

Read more

Don’t Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers

July 23, 2021 By: Danielle Rothermel, Margaret Li, Tim Rocktäschel, Jakob Foerster Abstract Self-supervised pre-training of large-scale transformer models on text corpora followed by fine-tuning has achieved state-of-the-art results on a number of natural language processing tasks. Recently, Lu et al. (2021) claimed that frozen pretrained transformers (FPTs) match or outperform training from scratch as well as unfrozen (fine-tuned) pretrained transformers in a set of transfer tasks to other modalities. In our work, we find that this result is, in fact, […]

Read more

Many-Speakers Single Channel Speech Separation with Optimal Permutation Training

Abstract Single channel speech separation has experienced great progress in the last few years. However, training neural speech separation for a large number of speakers (e.g., more than 10 speakers) is out of reach for the current methods, which rely on the Permutation Invariant Training (PIT). In this work, we present a permutation invariant training that employs the Hungarian algorithm in order to train with an O(C³) time complexity, where C is the number of speakers, in comparison […]
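The sketch below illustrates the idea in PyTorch: instead of enumerating all C! speaker permutations, build a C x C pairwise-loss matrix and let the Hungarian algorithm pick the optimal assignment in O(C³) time. The MSE loss and tensor shapes are illustrative assumptions, not the paper's exact objective.

```python
import torch
from scipy.optimize import linear_sum_assignment

def hungarian_pit_loss(est: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
    """est, ref: (C, T) estimated and reference sources for one mixture."""
    # Pairwise loss between every estimated and every reference source: (C, C).
    pairwise = ((est[:, None, :] - ref[None, :, :]) ** 2).mean(dim=-1)
    # The Hungarian algorithm finds the optimal speaker assignment in O(C^3).
    rows, cols = linear_sum_assignment(pairwise.detach().cpu().numpy())
    return pairwise[torch.as_tensor(rows), torch.as_tensor(cols)].mean()

# Usage with random stand-in signals for C = 10 speakers:
loss = hungarian_pit_loss(torch.randn(10, 16000), torch.randn(10, 16000))
print(loss)
```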

Read more

DeepViT: Towards Deeper Vision Transformer

DeepViT This repo is the official implementation of “DeepViT: Towards Deeper Vision Transformer”. The repo is based on the timm library (https://github.com/rwightman/pytorch-image-models) by Ross Wightman. Deep Vision Transformer was initially described in an arXiv paper, which observes the attention collapse phenomenon when training deep vision transformers: In this paper, we show that, unlike convolutional neural networks (CNNs) that can be improved by stacking more convolutional layers, the performance of ViTs saturates quickly when scaled to be deeper. More specifically, we empirically observe that […]

Read more