Machine learning innovation to benefit everyone.
What’s New
- Latest MLPerf Results Display Gains for All
  MLCommons’ benchmark suites demonstrate performance gains up to 5X for systems from microwatts to megawatts, advancing the frontiers of AI
- MLPerf Results Show Advances in Machine Learning Inference
  MLCommons establishes a new record with over 5,300 performance results and 2,400 power measurement results, 1.37X and 1.09X more than the previous round.
- MLCommons Adopts the Dynabench Platform
  Building Data-centric AI for the Community
- Harnessing Human-AI Collaboration
  Dynamic Adversarial Data Collection augments large scale datasets by adding diverse and high-quality data
MLCommons aims to accelerate machine learning innovation to benefit everyone. Machine learning has tremendous potential to save lives in areas like healthcare and automotive safety and to improve information access and understanding through technologies like voice interfaces, automatic translation, and natural language processing. However, machine learning is completely unlike conventional software -- developers train an application rather than program it -- and requires a whole new set of techniques analogous to the breakthroughs in precision measurement, raw materials, and manufacturing that drove the industrial revolution.
MLCommons aims to answer the needs of the nascent machine learning industry through open, collaborative engineering in three areas:
Benchmarking
Benchmarks provide consistent measurements of accuracy, speed, and efficiency. Consistent measurements enable engineers to design reliable products and services, and enable researchers to compare innovations and choose the best ideas to drive the solutions of tomorrow.
Datasets
Datasets are the raw materials for all of machine learning. Models are only as good as the data they are trained on. Academics and entrepreneurs in particular depend on public datasets to create new technologies and new companies.
Best Practices
Best Practices empower researchers and engineers to more easily exchange models, reproduce experiments, and build applications that leverage machine learning. Improving best practices accelerates progress in, and grows the market for, machine learning.
People’s Speech
The People’s Speech Dataset is among the world’s largest English speech recognition corpora licensed for academic and commercial use under CC-BY-SA and CC-BY 4.0. It includes 30,000+ hours of transcribed English speech from a diverse set of speakers. This open dataset is large enough to train speech-to-text systems and, crucially, is available under a permissive license. Just as ImageNet catalyzed machine learning for vision, the People’s Speech will unleash innovation in speech research and products that are available to users across the globe.
MLCube
MLCube is a set of best practices for creating ML software that can "plug-and-play" on many different systems. MLCube makes it easier for researchers to share innovative ML models, for developers to experiment with many different models, and for software companies to create infrastructure for models. It creates opportunities by putting ML in the hands of more people. MLCube isn’t a new framework or service; it is a consistent interface to machine learning models in containers like Docker. Models published with the MLCube interface can be run on local machines, on a variety of major clouds, or in Kubernetes clusters -- all using the same code.