A PyTorch library for decentralized deep learning across the Internet
Hivemind: decentralized deep learning in PyTorch

Hivemind is a PyTorch library for decentralized deep learning across the Internet. Its intended usage is training one large model on hundreds of computers from different universities, companies, and volunteers.

Key Features

Distributed training without a master node: Distributed Hash Table allows connecting computers in a decentralized network.
Fault-tolerant backpropagation: forward and backward passes succeed even if some nodes are unresponsive or take too long to respond.
Decentralized parameter averaging: iteratively aggregate updates from multiple workers […]
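To make the idea of averaging without a master node more concrete, here is a minimal sketch in plain Python. It does not use Hivemind's API and is not claimed to be the library's algorithm. It assumes each peer holds a single number standing in for a parameter, and that peers repeatedly form random small groups and replace their values with the group mean. The names `average_in_groups` and `simulate`, along with the peer count, group size, and number of rounds, are hypothetical and chosen only for illustration.

```python
import random


def average_in_groups(values, group_size, rng):
    """Run one averaging round.

    Peers are shuffled into random groups of at most `group_size`,
    and every peer replaces its value with the mean of its group.
    No peer ever sees all values at once.
    """
    order = list(range(len(values)))
    rng.shuffle(order)
    new_values = list(values)
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            new_values[i] = mean
    return new_values


def simulate(num_peers=16, group_size=4, rounds=20, seed=0):
    """Return (initial values, final values) after several rounds."""
    rng = random.Random(seed)
    initial = [rng.uniform(-1.0, 1.0) for _ in range(num_peers)]
    values = list(initial)
    for _ in range(rounds):
        values = average_in_groups(values, group_size, rng)
    return initial, values


if __name__ == "__main__":
    initial, final = simulate()
    print("global mean before:", sum(initial) / len(initial))
    print("global mean after: ", sum(final) / len(final))
    print("spread before:", max(initial) - min(initial))
    print("spread after: ", max(final) - min(final))
```

Because the groups here are of equal size, each round leaves the global mean unchanged (up to floating-point rounding), while the gap between the largest and smallest peer value can only shrink. A real deployment would also have to handle peers that join, leave, or fail mid-round, which this sketch ignores.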