Inference Working Group

Mission

Create a set of fair and representative inference benchmarks.

Purpose

Demand for machine-learning (ML) hardware and software systems is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance, ranging from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make it challenging to assess ML-system performance in an architecture-neutral, representative, and reproducible manner. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf™ Inference answers that call.

Deliverables

Inference benchmark rules and definitions
Inference benchmark reference software
Inference benchmark submission rules
Inference benchmark roadmap
Inference benchmark results, published approximately every 6 months

Meeting Schedule

Weekly on Tuesdays from 8:30 to 10:30 AM Pacific Time.

How to Join

Use this link to request to join the group/mailing list, and receive the meeting invite:
Inference Google Group.
Requests are manually reviewed, so please be patient.

Working Group Resources

Shared documents and meeting minutes:
1. Associate a Google account with your e-mail address.
2. Ask to join our Public Google Group.
3. Ask to join our Members Google Group.
4. Once approved, go to the Inference folder in the Members Google Drive.
GitHub (public):
1. If you want to contribute code, please sign our CLA first.
2. GitHub link.

Working Group Chair Emails

Ramesh Chukka (ramesh.n.chukka@intel.com)

Tom Jablin (tjablin@google.com)

Working Group Chair Bios

Ramesh Chukka is a Deep Learning Manager at Intel with a focus on performance analysis and benchmarking. He has 14+ years of experience leading benchmark development and working with industry benchmark consortiums. Ramesh received an M.Tech from IIT Madras and a B.E. from Andhra University, India.

Tom Jablin is a Staff Software Engineer working on the ML Inference Performance team at Google. He is interested in developing metrics that accurately reflect the experiences of real inference customers and in supporting diverse and innovative computer architectures.