People’s Speech

The People’s Speech Dataset is among the world’s largest English speech recognition corpora licensed for academic and commercial use, under CC-BY-SA and CC-BY 4.0. It includes more than 30,000 hours of transcribed English speech from a diverse set of speakers. This open dataset is large enough to train speech-to-text systems and, crucially, is available under a permissive license. Just as ImageNet catalyzed machine learning for vision, the People’s Speech will unleash innovation in speech research and in products available to users across the globe.

  • Date 2022-11-17
  • Hours 30,000+
  • Examples 23.7 million
  • Audio Format FLAC
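
The figures above imply a rough average clip length, which can help when planning batching or storage. The sketch below is only back-of-the-envelope arithmetic on the two numbers listed here; the function name is hypothetical, and because the hour count is a lower bound ("30,000+"), the result is a lower bound as well.

```python
# Figures copied from the list above. TOTAL_HOURS is a lower bound ("30,000+"),
# so the computed mean is a lower bound too, not an official statistic.
TOTAL_HOURS = 30_000
NUM_EXAMPLES = 23_700_000


def mean_clip_seconds(total_hours: float, num_examples: int) -> float:
    """Return the mean clip duration in seconds (hypothetical helper)."""
    return total_hours * 3600 / num_examples


avg = mean_clip_seconds(TOTAL_HOURS, NUM_EXAMPLES)
print(f"Mean clip length: at least about {avg:.2f} s")
```

Under these assumptions each example averages roughly 4.6 seconds of audio, so most clips are short utterances rather than long recordings.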