Today, the researchers and engineers behind the MLPerf™ benchmark suite released their first round of results. The results measure the speed of major machine learning (ML) hardware platforms, including Google TPUs, Intel CPUs, and NVIDIA GPUs, and also offer insight into the speed of ML software frameworks such as TensorFlow, PyTorch, and MXNet. The MLPerf results are intended to help decision makers assess existing offerings and focus future development. To see the results, go to mlcommons.org/en/training-normal-05/.
Historically, technological competition with a clear metric has resulted in rapid progress. Examples include the space race, which put people on the moon within two decades; the SPEC benchmark, which helped drive CPU performance improvements of 1.6X per year for the 15 years that followed its introduction; and the DARPA Grand Challenge, which helped make self-driving cars a reality. MLPerf aims to bring the same rapid progress to ML system performance. Given that large-scale ML experiments can still take days or weeks, improving ML system performance is critical to unlocking the potential of ML.
MLPerf was launched in May by a small group of researchers and engineers and has since grown rapidly. It is now supported by over thirty major companies and startups, including hardware vendors such as Intel (NASDAQ: INTC) and NVIDIA (NASDAQ: NVDA) and internet leaders such as Baidu (NASDAQ: BIDU) and Google (NASDAQ: GOOGL), as well as researchers from seven universities. Today, Facebook (NASDAQ: FB) and Microsoft (NASDAQ: MSFT) are announcing their support for MLPerf.
Benchmarks like MLPerf are important to the entire industry:
- “We are glad to see MLPerf grow from just a concept to a major consortium supported by a wide variety of companies and academic institutions. The results released today will set a new precedent for the industry to improve upon to drive advances in AI,” said Haifeng Wang, Senior Vice President of Baidu, who oversees the AI Group.
- “Open standards such as MLPerf and Open Neural Network Exchange (ONNX) are key to driving innovation and collaboration in machine learning across the industry,” said Bill Jia, VP, AI Infrastructure at Facebook. “We look forward to participating in MLPerf with its charter to standardize benchmarks.”
- “MLPerf can help people choose the right ML infrastructure for their applications. As machine learning continues to become more and more central to their business, enterprises are turning to the cloud for the high performance and low cost of training ML models,” said Urs Hölzle, Senior Vice President of Technical Infrastructure, Google.
- “We believe that an open ecosystem enables AI developers to deliver innovation faster. In addition to existing efforts through ONNX, Microsoft is excited to participate in MLPerf to support an open and standard set of performance benchmarks to drive transparency and innovation in the industry,” said Eric Boyd, CVP of AI Platform, Microsoft.
- “MLPerf demonstrates the importance of innovating in scale-up computing as well as at all levels of the computing stack — from hardware architecture to software and optimizations across multiple frameworks,” said Ian Buck, Vice President and General Manager of Accelerated Computing at NVIDIA.
Today’s published results are for the MLPerf Training benchmark suite, which consists of seven benchmarks spanning image classification, object detection, translation, recommendation, and reinforcement learning. The metric is the time required to train a model to a target quality level; each timing result is then normalized to the time of an unoptimized reference implementation running on a single NVIDIA Pascal P100 GPU. Future MLPerf benchmarks will also include inference.
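For readers who want a concrete sense of the scoring, the normalization works out to a simple ratio of training times. The sketch below is illustrative only; the reference times, names, and function in it are placeholders, not official MLPerf data or tooling:

```python
# Minimal sketch of an MLPerf-style "speedup over reference" calculation.
# NOTE: the reference times below are made-up placeholders, not official
# MLPerf measurements; the real results normalize against the measured
# wall-clock time of the reference implementations on a single NVIDIA P100.

REFERENCE_MINUTES = {
    "image_classification": 1000.0,  # placeholder: reference time to reach target quality
    "translation": 500.0,            # placeholder
}

def normalized_speedup(benchmark: str, submission_minutes: float) -> float:
    """How many times faster the submission trained than the reference system."""
    return REFERENCE_MINUTES[benchmark] / submission_minutes

# Example: a submission that reaches the target quality in 50 minutes.
print(f"{normalized_speedup('image_classification', 50.0):.1f}x the reference speed")
```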
MLPerf categorizes results by both division and the availability of the product or platform. There are two divisions: Closed and Open. Submissions to the Closed division, intended for apples-to-apples comparisons of ML hardware and ML frameworks, must use the same model (e.g., ResNet-50 for image classification) and optimizer. In the Open division, participants can submit any model. Within each division, submissions are classified by availability: in the Cloud, On-premise, Preview, or Research. Preview systems must be available by the next submission round. Research systems either include experimental hardware or software, or run at a scale not yet publicly available.
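To make the taxonomy concrete, one way to picture a result entry is as a record carrying both labels. The snippet below is a hypothetical illustration, not the official MLPerf submission format:

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative sketch of the result categories described above. This is not
# the official MLPerf submission schema; names and fields are hypothetical.

class Division(Enum):
    CLOSED = "closed"        # must use the prescribed model and optimizer
    OPEN = "open"            # any model may be submitted

class Availability(Enum):
    CLOUD = "cloud"
    ON_PREMISE = "on-premise"
    PREVIEW = "preview"      # must be available by the next submission round
    RESEARCH = "research"    # experimental hardware/software, or non-public scale

@dataclass
class Submission:
    benchmark: str           # e.g. "image_classification"
    division: Division
    availability: Availability
    model: str               # e.g. "ResNet-50" (fixed in the Closed division)

entry = Submission("image_classification", Division.CLOSED, Availability.CLOUD, "ResNet-50")
```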
MLPerf is an agile, open benchmark. This is an “alpha” release, and the MLPerf community intends to iterate rapidly. MLPerf welcomes feedback and invites everyone to get involved in the community. To learn more about MLPerf Training, go to mlcommons.org/en/training-normal-05/ or email info@mlperf.org.