Implementation of the specific Transformer architecture from PaLM – Scaling Language Modeling with Pathways

PaLM – a PyTorch implementation of the specific Transformer architecture from PaLM – Scaling Language Modeling with Pathways, in less than 200 lines of code. This model is pretty much SOTA on everything language-related. It obviously will not scale, but it is just for educational purposes: to show the public how simple it all really is. Install: $ pip install PaLM-pytorch Usage: import torch […]
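A minimal usage sketch, assuming the lucidrains-style PaLM-pytorch interface; the hyperparameters below are illustrative toy values, not the paper's configuration:

```python
import torch
from palm_pytorch import PaLM

# Instantiate a small PaLM-style decoder; dimensions are toy-sized for illustration.
palm = PaLM(
    num_tokens = 20000,  # vocabulary size
    dim = 512,           # model width
    depth = 12,          # number of transformer blocks
    heads = 8,
    dim_head = 64,
)

# Feed a batch of token ids and get next-token logits back.
tokens = torch.randint(0, 20000, (1, 2048))
logits = palm(tokens)  # shape: (1, 2048, 20000)
```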

Read more

Official Pytorch code for OW-DETR: Open-world Detection Transformer

[Paper] Akshita Gupta*, Sanath Narayan*, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah (* denotes equal contribution) Introduction: Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on […]

Read more

Ensembling Hugging Face transformers made easy

Ensembling Hugging Face Transformers made easy! Why Ensemble Transformers? Ensembling is a simple yet powerful way of combining predictions from different models to increase performance. Since multiple models are used to derive a prediction, ensembling offers a way of decreasing variance and increasing robustness. Ensemble Transformers provides an intuitive interface for ensembling pretrained models available in Hugging Face transformers. Installation: Ensemble Transformers is available on PyPI and can easily be installed with the pip package manager. pip install -U pip […]
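The library's own interface is not shown in the excerpt, so here is a minimal logit-averaging sketch using plain Hugging Face transformers; the checkpoints are illustrative and assumed to share the same label ordering, and this is not the Ensemble Transformers API itself:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative checkpoints; any classifiers with compatible label sets work.
model_names = [
    "distilbert-base-uncased-finetuned-sst-2-english",
    "textattack/bert-base-uncased-SST-2",
]

text = "Ensembling is a simple yet powerful way of combining predictions."
all_logits = []
for name in model_names:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name).eval()
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        all_logits.append(model(**inputs).logits)

# Average the members' logits, then take the argmax as the ensemble prediction.
ensembled = torch.stack(all_logits).mean(dim=0)
print(ensembled.argmax(dim=-1))
```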

Read more

BoxeR: Box-Attention for 2D and 3D Transformers

By Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees Snoek. This repository is an official implementation of the paper BoxeR: Box-Attention for 2D and 3D Transformers. Introduction: TL;DR. BoxeR is a Transformer-based network for end-to-end 2D object detection and instance segmentation, along with 3D object detection. The core of the network is Box-Attention, which predicts regions of interest to attend to by learning the transformation (translation, scaling, and rotation) from reference windows, yielding competitive performance on several vision […]
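A heavily simplified sketch of the box-attention idea under stated assumptions: each query refines a normalized reference window (translation and scaling only, no rotation here), samples a small grid of features inside it, and attends over the sampled points. Module and argument names are hypothetical and not taken from the official code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleBoxAttention(nn.Module):
    """Toy box-attention: queries predict where to look, then attend over a
    small grid of feature samples taken inside their predicted window."""
    def __init__(self, dim, grid=3):
        super().__init__()
        self.grid = grid
        self.to_offset = nn.Linear(dim, 4)          # (dx, dy, dw, dh) per query
        self.to_attn = nn.Linear(dim, grid * grid)  # attention weights over grid cells
        self.proj = nn.Linear(dim, dim)

    def forward(self, queries, ref_boxes, feat):
        # queries: (B, N, C); ref_boxes: (B, N, 4) as normalized (cx, cy, w, h) in [0, 1]
        # feat: (B, C, H, W) value feature map
        B, N, C = queries.shape
        boxes = ref_boxes + 0.1 * torch.tanh(self.to_offset(queries))  # refined windows
        cx, cy, w, h = boxes.unbind(-1)

        # Build a grid x grid sampling lattice inside each refined window.
        lin = torch.linspace(-0.5, 0.5, self.grid, device=feat.device)
        gy, gx = torch.meshgrid(lin, lin, indexing="ij")
        sx = cx[..., None, None] + gx * w[..., None, None]             # (B, N, g, g)
        sy = cy[..., None, None] + gy * h[..., None, None]
        grid = torch.stack([sx, sy], dim=-1) * 2 - 1                   # to [-1, 1] for grid_sample
        grid = grid.reshape(B, N, self.grid * self.grid, 2)

        # Sample values at the grid points, then attend over them per query.
        vals = F.grid_sample(feat, grid, align_corners=False)          # (B, C, N, g*g)
        vals = vals.permute(0, 2, 3, 1)                                # (B, N, g*g, C)
        attn = self.to_attn(queries).softmax(dim=-1)                   # (B, N, g*g)
        out = (attn.unsqueeze(-1) * vals).sum(dim=2)                   # (B, N, C)
        return self.proj(out)
```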

Read more

MetaMorph: Learning Universal Controllers with Transformers

This is the code for the paper MetaMorph: Learning Universal Controllers with Transformers by Agrim Gupta, Linxi Fan, Surya Ganguli, Fei-Fei Li. Multiple domains like vision, natural language, and audio are witnessing tremendous progress by leveraging Transformers for large-scale pre-training followed by task-specific fine-tuning. In contrast, in robotics we primarily train a single robot for a single task. However, modular robot systems now allow for the flexible combination of general-purpose building blocks into task-optimized morphologies. However, given […]

Read more

Local-Global Context Aware Transformer for Language-Guided Video Segmentation

This repository is an official PyTorch implementation of the paper: Local-Global Context Aware Transformer for Language-Guided Video Segmentation. Chen Liang, Wenguan Wang, Tianfei Zhou, Jiaxu Miao, Yawei Luo, Yi Yang. arXiv 2022. News & Update Logs: [2022-03-17] Repo created. Paper, code, and data will come in a few days. Stay tuned. [2022-03-18] Inference code, pretrained weights, and data for A2D-S+ released. [2022-03-21] arXiv (full paper available). Planned: instructions on usage, training code and detailed instructions, code for dataset creation. Abstract: We explore […]

Read more

A huggingface transformers implementation of Transformer Memory as a Differentiable Search Index

A huggingface transformers implementation of Transformer Memory as a Differentiable Search Index, by Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler. Requirements: python=3.8, transformers=4.17.0, datasets=1.18.3, wandb. Note: This is not the official repository. Goal of this repository: Reproduce the results of DSI Large, Naive String Docid, NQ10K. According to Table 3 in the original paper, we should have Hits@1 = 0.347 and Hits@10 = 0.605. Step 1: Create […]
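For orientation, a minimal sketch of the naive-string-docid idea from the paper: a seq2seq model (T5 here) is trained to emit a document's identifier as a plain string, and retrieval is beam search over docid strings. The checkpoint, docid, and hyperparameters below are illustrative, not this repository's actual training setup:

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Indexing step: the model learns to map document text to its docid string.
doc_text = "The quick brown fox jumps over the lazy dog."
docid = "4217"  # naive string docid, generated token by token

inputs = tokenizer(doc_text, return_tensors="pt", truncation=True)
labels = tokenizer(docid, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy

# Retrieval step: beam search over docid strings for a query; Hits@k then checks
# whether the gold docid appears among the top-k returned sequences.
query = tokenizer("who jumps over the lazy dog", return_tensors="pt")
beams = model.generate(**query, max_length=8, num_beams=10, num_return_sequences=10)
print(tokenizer.batch_decode(beams, skip_special_tokens=True))
```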

Read more

Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space

This is the official repository for the CVPR 2022 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space”. This repo is based on the training code in DeiT and the tools in timm. Getting Started: You will need Python 3.8 and the packages specified in requirements.txt. We recommend setting up a virtual environment with pip and installing the packages there. Install packages with: $ pip install -r requirements.txt Data preparation: The layout of the ImageNet data: /path/to/imagenet/ train/ class1/ img1.jpeg […]

Read more

It’s like Shape Editor in Maya but works with skeletons (transforms)

What is Skeleposer? Briefly, it's like the Shape Editor in Maya, but it works with transforms and joints. It can be used to make complex facial rigs based on joints. It's especially good for game engines and realtime graphics. Basic workflow: You create a skeleposer node, make joints, add them to the skeleposer, and then work with poses. Then you connect controls to the poses, and that's it! In practice, you work with the skinCluster and poses at the same time. Features: Skeleposer supports a […]

Read more

Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation

This repo contains the supported code and configuration files to reproduce the semantic segmentation results of TransDA. Paper: Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation. Abstract: After the great success of Vision Transformer variants (ViTs) in computer vision, they have also demonstrated great potential in domain adaptive semantic segmentation. Unfortunately, straightforwardly applying local ViTs in domain adaptive semantic segmentation does not bring the expected improvement. We find that the pitfall of local ViTs is due to the severe high-frequency […]

Read more