Hivemind: decentralized deep learning in PyTorch

Hivemind is a PyTorch library for decentralized deep learning across the Internet. It is intended for training one large model on hundreds of computers contributed by different universities, companies, and volunteers.

Key Features

  • Distributed training without a master node: a Distributed Hash Table (DHT) connects computers in a decentralized
    network.
  • Fault-tolerant backpropagation: forward and backward passes succeed even if some nodes are unresponsive or take too
    long to respond.
  • Decentralized parameter averaging: iteratively aggregate updates from multiple workers without the need to
    synchronize across the entire network (paper).
  • Train neural networks of arbitrary size: parts of their layers are distributed across the participants using the
    Decentralized Mixture-of-Experts (paper).
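
To make the parameter-averaging idea above more concrete, here is a minimal pure-Python sketch. It is not Hivemind's API or its actual implementation. The helper `average_in_groups`, the 4×4 grid of peers, and the single scalar "parameter" per peer are all assumptions chosen for illustration. The point it tries to show is that peers can reach the global average by averaging only within small groups, with no step that involves the whole network at once.

```python
def average_in_groups(values, groups):
    """Replace each peer's value with the mean of its own group.

    Hypothetical helper: each group only exchanges values internally,
    so no peer ever communicates with the entire network.
    """
    new_values = list(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            new_values[i] = mean
    return new_values


side = 4  # assumed layout: 16 peers arranged in a 4x4 grid
params = [float(i) for i in range(side * side)]  # one scalar per peer

# Round 1 groups peers that share a row; round 2 groups peers that share a column.
rows = [[r * side + c for c in range(side)] for r in range(side)]
cols = [[r * side + c for r in range(side)] for c in range(side)]

after_rows = average_in_groups(params, rows)
after_cols = average_in_groups(after_rows, cols)

print(after_rows[:4])   # [1.5, 1.5, 1.5, 1.5]
print(set(after_cols))  # {7.5}
```

In this toy setting two rounds of group averaging give every peer the exact global mean (7.5), because each column contains one peer from every row. Real deployments have peers that join, leave, or fail mid-round, so group formation there is necessarily more involved than this fixed grid.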

To learn more about the ideas behind this library, see https://learning-at-home.github.io or read
the NeurIPS 2020 paper.

Installation

Before installing, make sure that your environment