DeepSpeed

A deep learning optimization library that makes distributed training easy, efficient, and effective (By Microsoft)

What is DeepSpeed?

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. It can train DL models with over a hundred billion parameters on the current generation of GPU clusters while achieving more than 5x the system performance of the state of the art. Early adopters of DeepSpeed have already produced Turing-NLG, a language model (LM) with over 17B parameters, establishing a new SOTA in the LM category.
DeepSpeed is a tool in the Machine Learning Tools category of a tech stack.
DeepSpeed is an open source tool with 32.1K GitHub stars and 3.8K GitHub forks. Its source repository is hosted on GitHub.

Who uses DeepSpeed?

Developers
8 developers on StackShare have stated that they use DeepSpeed.

DeepSpeed's Features

  • Distributed Training with Mixed Precision
  • Model Parallelism
  • Memory and Bandwidth Optimizations
  • Simplified training API
  • Gradient Clipping
  • Automatic loss scaling with mixed precision
  • Simplified Data Loader
  • Performance Analysis and Debugging
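
Several of the features listed above (mixed precision, automatic loss scaling, gradient clipping, memory optimizations) are typically switched on through DeepSpeed's JSON configuration. The following is a minimal sketch of such a configuration built in Python. The helper name `build_deepspeed_config` and all numeric values are illustrative assumptions, not recommended settings; the key names follow DeepSpeed's documented config layout as far as we are aware, so check them against the version you install.

```python
import json


def build_deepspeed_config(train_batch_size=32, max_grad_norm=1.0):
    """Hypothetical helper: returns a dict in DeepSpeed's JSON config layout.

    All values here are placeholders chosen for illustration only.
    """
    return {
        # Global batch size across all GPUs and accumulation steps.
        "train_batch_size": train_batch_size,
        # Gradient clipping: clip the gradient norm to this value.
        "gradient_clipping": max_grad_norm,
        # Mixed precision with automatic (dynamic) loss scaling.
        "fp16": {
            "enabled": True,
            "loss_scale": 0,            # 0 requests dynamic loss scaling
            "initial_scale_power": 16,  # initial scale is 2**16
        },
        # Memory optimization via ZeRO (stage chosen arbitrarily here).
        "zero_optimization": {"stage": 1},
    }


config = build_deepspeed_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

A dict like this (or the JSON written to a file) would usually be handed to `deepspeed.initialize` through its `config` argument together with a PyTorch model; that call is omitted here so the sketch runs without DeepSpeed or a GPU installed.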

DeepSpeed Alternatives & Comparisons

What are some alternatives to DeepSpeed?
TensorFlow
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.
PyTorch
PyTorch is not a Python binding into a monolithic C++ framework. It is built to be deeply integrated into Python. You can use it naturally, as you would use numpy, scipy, scikit-learn, etc.
scikit-learn
scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.
Keras
A deep learning library for Python that supports convnets, recurrent neural networks, and more. Runs on TensorFlow or Theano. https://keras.io/
CUDA
A parallel computing platform and application programming interface model. It enables developers to speed up compute-intensive applications by harnessing the power of GPUs for the parallelizable part of the computation.

DeepSpeed's Followers
16 developers follow DeepSpeed to keep up with related blogs and decisions.