A Transformer that Ponders, using the scheme from the PonderNet paper
Ponder(ing) Transformer

Implementation of a Transformer that learns to adapt the number of computational steps it takes depending on the difficulty of the input sequence, using the scheme from the PonderNet paper. Will also try to abstract out a pondering module that can be used with any block that returns an output with the halting probability.

This repository would not have been possible without repeated viewings of Yannic’s educational video.

Install

$ pip install ponder-transformer

Usage

import torch
from ponder_transformer […]
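To make the pondering idea in the description more concrete, here is a minimal sketch in plain Python of how per-step halting probabilities can be turned into a distribution over the number of steps, and how that distribution weights the per-step losses. It is not this package's API: the names `halting_distribution` and `expected_loss` are hypothetical, and folding the leftover probability mass into the final step is an assumption made so the distribution sums to 1 when the number of steps is capped.

```python
def halting_distribution(lambdas):
    """Turn conditional halting probabilities into a distribution over steps.

    lambdas[n] is the probability of halting at step n, given that the
    model has not halted at any earlier step.
    """
    if not lambdas:
        raise ValueError("need at least one pondering step")
    probs = []
    not_halted = 1.0  # probability of reaching the current step without halting
    for lam in lambdas:
        probs.append(not_halted * lam)  # halt exactly at this step
        not_halted *= 1.0 - lam         # continue to the next step
    # Assumption: with a capped number of steps, the leftover mass is
    # assigned to the final step so the probabilities sum to 1.
    probs[-1] += not_halted
    return probs


def expected_loss(step_losses, lambdas):
    """Weight each step's loss by the probability of halting at that step."""
    probs = halting_distribution(lambdas)
    return sum(p * loss for p, loss in zip(probs, step_losses))


if __name__ == "__main__":
    lambdas = [0.5, 0.5, 0.5]      # hypothetical per-step halting probabilities
    step_losses = [1.0, 2.0, 3.0]  # hypothetical per-step losses
    print(halting_distribution(lambdas))
    print(expected_loss(step_losses, lambdas))
```

The sketch covers only the loss weighted by the halting distribution. The PonderNet paper also regularizes the halting distribution toward a geometric prior, which is left out here.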