Author: Deep Learner
A simple command-line utility for querying and monitoring GPU status
Just less than nvidia-smi? NOTE: This works with NVIDIA Graphics Devices only, no AMD support as of now. Contributions are welcome! Self-Promotion: A web interface of gpustat is available (in alpha)! Check out gpustat-web.

Usage

$ gpustat

Options:

--color : Force colored output (even when stdout is not a tty)
--no-color : Suppress colored output
-u, --show-user : Display username of the process owner
-c, --show-cmd : Display the process name
-f, --show-full-cmd : Display full command and cpu stats […]
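Beyond the CLI, gpustat also exposes a small Python API. The sketch below queries GPU status programmatically; it assumes an NVIDIA driver is present, and the attribute names follow gpustat's JSON output, so they may differ slightly between versions.

```python
# A minimal sketch of querying GPU status from Python with gpustat.
# Assumes an NVIDIA GPU and driver are available on this machine.
import gpustat

stats = gpustat.new_query()  # snapshot of all visible GPUs
for gpu in stats.gpus:
    # Field names mirror gpustat's JSON output (see `gpustat --json`).
    print(f"GPU {gpu.index} ({gpu.name}): "
          f"{gpu.memory_used}/{gpu.memory_total} MB, "
          f"{gpu.utilization}% util")
```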
Read more

A Python library intended to liberate data scientists and machine learning engineers
lazycluster is a Python library intended to liberate data scientists and machine learning engineers by abstracting away cluster management and configuration so that they can focus on their actual tasks. In particular, it emphasizes easy and convenient cluster setup with Python for various distributed machine learning frameworks.

Highlights

High-level API for starting clusters:
- DASK
- Hyperopt
- More lazyclusters (e.g. Ray, PyTorch, Tensorflow, Horovod, Spark) to come …

Lower-level API for managing Runtimes or RuntimeGroups to:
- Asynchronously or synchronously execute RuntimeTasks by […]
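As a hedged sketch of that lower-level API (the RuntimeTask/Runtime names are taken from the project README; exact signatures may have changed, and 'my-host' is a placeholder):

```python
# Sketch based on the lazycluster README; 'my-host' stands in for a
# reachable machine with passwordless SSH configured.
from lazycluster import Runtime, RuntimeTask

task = RuntimeTask('simple-task')
task.run_command('echo Hello from a Runtime!')

# Execute the task synchronously on the remote host and print its log.
runtime = Runtime('my-host')
runtime.execute_task(task, execute_async=False)
print(task.execution_log)
```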
Read more

TensorFrames lets you manipulate Apache Spark’s DataFrames with TensorFlow programs
Note: TensorFrames is deprecated. You can use pandas UDF instead. Experimental TensorFlow binding for Scala and Apache Spark. TensorFrames (TensorFlow on Spark DataFrames) lets you manipulate Apache Spark’s DataFrames with TensorFlow programs. This package is experimental and is provided as a technical preview only. While the interfaces are all implemented and working, there are still some areas of low performance. Supported platforms: This package only officially supports Linux 64-bit platforms as a target. Contributions are welcome for other platforms. See the file project/Dependencies.scala for […]
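Since TensorFrames is deprecated, the recommended route is a pandas UDF in PySpark. A minimal sketch of that replacement (plain PySpark API, not TensorFrames) looks like this:

```python
# Minimal pandas UDF sketch: the modern replacement the note above points to.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()
df = spark.range(10)  # single LongType column named "id"

@pandas_udf("double")
def times_two(x: pd.Series) -> pd.Series:
    # Runs vectorized over Arrow batches; a TF model could be applied here.
    return x * 2.0

df.select(times_two(df["id"]).alias("doubled")).show()
```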
Read more

A novel evolutionary computation framework for rapid prototyping and testing of ideas
DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP.

DEAP includes the following features:

- Genetic algorithm using any imaginable representation: List, Array, Set, Dictionary, Tree, Numpy Array, etc.
- Genetic programming using prefix trees: loosely typed, strongly typed, automatically defined functions
- Evolution strategies (including CMA-ES)
- Multi-objective optimisation (NSGA-II, NSGA-III, SPEA2, MO-CMA-ES)
- Co-evolution (cooperative […]
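A classic OneMax run shows the toolbox-registration style DEAP uses (standard DEAP API; the hyperparameters here are arbitrary):

```python
# OneMax with DEAP: evolve 20-bit strings toward all ones.
import random
from deap import algorithms, base, creator, tools

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_bool", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_bool, 20)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", lambda ind: (sum(ind),))  # fitness = bit count
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=50)
algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=10, verbose=False)
print(max(sum(ind) for ind in pop))  # best bit count found
```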
Read more

Massively parallel self-organizing maps: accelerate training on multicore CPUs, GPUs, and clusters
Somoclu is a massively parallel implementation of self-organizing maps. It exploits multicore CPUs, it can rely on MPI to distribute the workload across a cluster, and it can be accelerated by CUDA. A sparse kernel is also included, which is useful for training maps on vector spaces generated in text-mining processes.

Key features:

- Fast execution by parallelization: OpenMP, MPI, and CUDA are supported.
- Multi-platform: Linux, macOS, and Windows are supported.
- Planar and toroid maps.
- Rectangular and hexagonal grids.
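The Python interface mirrors this feature list. A hedged sketch training a small toroid, hexagonal map (API as documented in the somoclu Python package; the data here is random):

```python
# Train a small self-organizing map with somoclu's Python interface.
import numpy as np
import somoclu

data = np.random.rand(100, 3).astype(np.float32)  # 100 samples, 3 features

# Grid of 20 columns x 10 rows; toroid topology with a hexagonal grid,
# matching the map/grid options listed above.
som = somoclu.Somoclu(20, 10, maptype="toroid", gridtype="hexagonal")
som.train(data)
print(som.codebook.shape)  # (rows, columns, features)
```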
Read more

A PyTorch library for decentralized deep learning across the Internet
Hivemind: decentralized deep learning in PyTorch

Hivemind is a PyTorch library for decentralized deep learning across the Internet. Its intended usage is training one large model on hundreds of computers from different universities, companies, and volunteers.

Key Features

- Distributed training without a master node: a Distributed Hash Table allows connecting computers in a decentralized network.
- Fault-tolerant backpropagation: forward and backward passes succeed even if some nodes are unresponsive or take too long to respond.
- Decentralized parameter averaging: iteratively aggregate updates from multiple workers […]
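As a hedged illustration of the master-free design, the snippet below starts a single DHT node (hivemind's public API in recent versions; a real training run would pass initial_peers to join an existing swarm):

```python
# Start one node of hivemind's Distributed Hash Table.
# A real run would pass initial_peers=[...] to join an existing swarm
# instead of bootstrapping a fresh one.
import hivemind

dht = hivemind.DHT(start=True)
print("This node is reachable at:")
for addr in dht.get_visible_maddrs():
    print(addr)
```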
Read more

A Python distributed computing library for modern computer clusters
Distributed Computing for AI Made Simple

This project is experimental and the APIs are not considered stable.

Fiber is a Python distributed computing library for modern computer clusters.

- It is easy to use. Fiber allows you to write programs that run at the computer-cluster level without needing to dive into the details of the cluster itself.
- It is easy to learn. Fiber provides the same API as Python’s standard multiprocessing library that you are familiar with. If you know […]
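Because Fiber mirrors the multiprocessing API, a Pool example looks just like its standard-library counterpart (a sketch based on Fiber's documented drop-in usage):

```python
# Fiber's Pool is a drop-in replacement for multiprocessing.Pool,
# except that workers may run on other machines in the cluster.
from fiber import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    pool = Pool(processes=4)
    print(pool.map(square, range(10)))  # [0, 1, 4, ..., 81]
```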
Read more

A high performance and generic framework for distributed DNN training
BytePS is a high-performance and general distributed training framework. It supports TensorFlow, Keras, PyTorch, and MXNet, and can run over either TCP or RDMA networks. BytePS outperforms existing open-source distributed training frameworks by a large margin. For example, on BERT-large training, BytePS can achieve ~90% scaling efficiency with 256 GPUs (see below), which is much higher than Horovod+NCCL. In certain scenarios, BytePS can double the training speed compared with Horovod+NCCL.

Performance

We show our experiment on BERT-large training, which […]
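BytePS's PyTorch module follows the familiar Horovod-style pattern. A hedged sketch of wiring it into a training script (names from byteps.torch; the model and optimizer are placeholders):

```python
# Sketch of BytePS integration for PyTorch, Horovod-style.
# The Linear model and SGD optimizer are placeholders for your own.
import torch
import byteps.torch as bps

bps.init()                               # one process per GPU
torch.cuda.set_device(bps.local_rank())  # pin this process to its GPU

model = torch.nn.Linear(10, 1).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.01 * bps.size())

# Wrap the optimizer so gradients flow through BytePS push/pull,
# and make sure every worker starts from the same weights.
opt = bps.DistributedOptimizer(opt, named_parameters=model.named_parameters())
bps.broadcast_parameters(model.state_dict(), root_rank=0)
```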
Read more

A lightweight tool for submitting Python functions for computation within a Slurm cluster
What is submitit?

Submitit is a lightweight tool for submitting Python functions for computation within a Slurm cluster. It basically wraps submission and provides access to results, logs, and more. Slurm is an open-source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Submitit allows you to switch seamlessly between executing on Slurm or locally.

An example is worth a thousand words: performing an addition. From inside an environment with submitit installed:

import submitit def add(a, […]
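The excerpt cuts the example off; for context, here is a hedged completion following the pattern in the submitit README (the log folder and partition names are placeholders):

```python
# Completed version of the addition example, following the README pattern.
import submitit

def add(a, b):
    return a + b

# AutoExecutor runs on Slurm when available, locally otherwise.
executor = submitit.AutoExecutor(folder="log_test")  # where logs land
executor.update_parameters(timeout_min=1, slurm_partition="dev")
job = executor.submit(add, 5, 7)
print(job.result())  # waits for completion and prints 12
```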
Read more