Today, the researchers and engineers behind the MLPerf™ benchmark suite released their first round of results. The results measure the speed of major machine learning (ML) hardware platforms, including Google TPUs, Intel CPUs, and NVIDIA GPUs. The results also offer insight into the speed of ML software frameworks such as TensorFlow, PyTorch, and MXNet. The MLPerf results are intended to help decision makers assess existing offerings and focus future development. To see the results, go to mlcommons.org/en/training-normal-05/.

Historically, technological competition with a clear metric has resulted in rapid progress. Examples include the space race that led to people walking on the moon within two decades, the SPEC benchmark that helped drive CPU performance improvements of 1.6X per year for the 15 years after its introduction, and the DARPA Grand Challenge that helped make self-driving cars a reality. MLPerf aims to bring this same rapid progress to ML system performance. Given that large-scale ML experiments still take days or weeks, improving ML system performance is critical to unlocking the potential of ML.

MLPerf was launched in May by a small group of researchers and engineers, and it has since grown rapidly. MLPerf is now supported by over thirty major companies and startups, including hardware vendors such as Intel (NASDAQ: INTC) and NVIDIA (NASDAQ: NVDA) and internet leaders such as Baidu (NASDAQ: BIDU) and Google (NASDAQ: GOOGL). MLPerf is also supported by researchers from seven universities. Today, Facebook (NASDAQ: FB) and Microsoft (NASDAQ: MSFT) are announcing their support for MLPerf.

Benchmarks like MLPerf are important to the entire industry.

Today’s published results are for the MLPerf training benchmark suite. The suite consists of seven benchmarks covering image classification, object detection, translation, recommendation, and reinforcement learning. The metric is the time required to train a model to a target level of quality. MLPerf timing results are then normalized to unoptimized reference implementations running on a single NVIDIA Pascal P100 GPU. Future MLPerf benchmarks will include inference as well.
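As a rough, unofficial illustration of what that normalization means, the Python sketch below computes a speedup from a hypothetical submission time and a hypothetical reference-implementation time; the function name and all numbers are invented for this example and are not actual MLPerf results.

```python
# Unofficial illustration only: the times below are hypothetical, not real MLPerf results.
# MLPerf's training metric is the wall-clock time to train a model to a target quality;
# normalizing against the reference implementation expresses a result as a speedup.

def normalized_speedup(reference_minutes: float, submission_minutes: float) -> float:
    """How many times faster the submission trained to target quality than the
    unoptimized reference implementation on a single Pascal P100 GPU."""
    return reference_minutes / submission_minutes

# Hypothetical example: the reference run takes 8,000 minutes and a submitted
# system reaches the same target quality in 100 minutes -> an 80x speedup.
print(normalized_speedup(reference_minutes=8_000.0, submission_minutes=100.0))  # 80.0
```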

MLPerf categorizes results by division and by a given product or platform's availability. There are two divisions: Closed and Open. Submissions to the Closed division, intended for apples-to-apples comparisons of ML hardware and ML frameworks, must use the same model (e.g., ResNet-50 for image classification) and optimizer. In the Open division, participants can submit any model. Within each division, submissions are further classified by availability: Cloud, On-premise, Preview, or Research. Preview systems will be available by the next submission round. Research systems either include experimental hardware or software, or are at a scale not yet publicly available.
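Purely as an illustration (not an MLPerf artifact), the sketch below labels a single result entry with the divisions and availability categories described above; the class and field names are invented for this example.

```python
# Minimal sketch (not part of MLPerf) of how a result entry might be labeled
# using the divisions and availability categories described above.
from dataclasses import dataclass
from enum import Enum

class Division(Enum):
    CLOSED = "Closed"   # fixed model (e.g., ResNet-50) and optimizer
    OPEN = "Open"       # any model may be submitted

class Availability(Enum):
    CLOUD = "Cloud"
    ON_PREMISE = "On-premise"
    PREVIEW = "Preview"     # available by the next submission round
    RESEARCH = "Research"   # experimental, or at a scale not yet publicly available

@dataclass
class ResultEntry:
    benchmark: str
    division: Division
    availability: Availability

# Hypothetical entry for illustration only; not an actual submission.
entry = ResultEntry("image classification", Division.CLOSED, Availability.CLOUD)
print(entry)
```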

MLPerf is an agile and open benchmark. This is an “alpha” release of the benchmark, and the MLPerf community intends to iterate rapidly. MLPerf welcomes feedback and invites everyone to get involved in the community. To learn more about MLPerf Training, go to mlcommons.org/en/training-normal-05/ or email info@mlperf.org.