Treble Technologies and Hugging Face Address Voice AI’s Unspoken Dilemma With Groundbreaking Benchmark of ASR Models

Industry-first initiative for open benchmarking of speech AI under realistic acoustic conditions to be highlighted in June 11 webinar

Key Takeaways
  • Treble Technologies and Hugging Face launch the Far Field ASR (FFASR) Leaderboard, the first open, community-driven benchmark to evaluate ASR models under realistic far-field acoustic conditions on Hugging Face.
  • The leaderboard enables developers and researchers to upload models and assess accuracy across reverberation, background noise, competing speech, and varying room acoustics using Treble’s virtual simulation to mirror real-world deployments.
  • Treble and Hugging Face will host a joint webinar on Thursday, June 11, 2026, to explain the benchmark and how to participate. The effort has already drawn interest from NVIDIA, IBM, and Cohere.

Treble Technologies, the pioneer in cloud-based acoustic simulation and synthetic audio data generation, and Hugging Face, the leading open platform for machine learning, today announced the launch of the Far Field ASR (FFASR) Leaderboard, the industry’s first open, community-driven benchmark designed to evaluate automatic speech recognition (ASR) models under realistic far-field acoustic conditions. The benchmark is intended to improve the end-user experience with speech recognition engines in real-world deployments.

The Leaderboard can be found here: https://huggingface.co/spaces/treble-technologies/ffasr.

To coincide with the launch, Treble Technologies and Hugging Face will host a webinar on June 11 covering the methodology behind the benchmark, how to participate, key industry challenges around real-world speech AI performance, and early insights from the leaderboard. The leaderboard has already attracted interest from developers of voice-enabled models at companies such as NVIDIA, IBM, Cohere and others.

Available on Hugging Face, the FFASR leaderboard enables developers, researchers, and enterprise users of voice-recognition systems to upload ASR models and evaluate performance across a wide range of real-world scenarios involving reverberation, background noise, competing speech, and varying room acoustics. 

The real-world speech AI benchmarking problem 

Voice interfaces have evolved from a basic way to interact with products into a core feature of everyday devices (phones, watches, earbuds), and they are increasingly integrated into productivity applications and communication features (vibe coding, humanoids, voice agents, the automotive office on wheels). As a result, understanding performance under realistic acoustic conditions is becoming critical for improving reliability, usability, and user trust.

Current ASR benchmarks used by both developers and users of voice AI often fail to reflect real-world performance, which limits how well speech recognition technologies can be evaluated for these complex and varied use cases. Because building labs and collecting the data needed to train and evaluate models for every possible use case is complex, large in scale, and costly, most companies develop and evaluate their ASR systems using clean speech recorded close to a microphone, with minimal coverage of far-field scenarios.

While many speech recognition models score well under these clean, ideal conditions, real deployments must deal with reverberation, background noise, overlapping speech, microphone distance, and varying room acoustics, all of which can significantly impact accuracy.  
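To make these degradations concrete, the following is a minimal sketch of how clean ("dry") audio can be turned into a rough far-field approximation: the signal is convolved with a room impulse response to add reverberation, then noise is mixed in at a chosen signal-to-noise ratio. It is a toy illustration only and is not Treble's simulation method. The exponentially decaying random impulse response, the function names, and all parameter values are assumptions made for this example.

```python
import math
import random


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def toy_impulse_response(length, decay, seed=0):
    """Unit direct path followed by an exponentially decaying random tail.

    This is a crude stand-in for a real room impulse response (assumption).
    """
    rng = random.Random(seed)
    tail = [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(1, length)]
    return [1.0] + tail


def mean_power(samples):
    """Average squared amplitude of a sample list."""
    return sum(v * v for v in samples) / len(samples)


def noise_at_snr(signal, snr_db, seed=1):
    """Gaussian noise scaled so signal power / noise power matches snr_db."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_power = mean_power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_power / mean_power(noise))
    return [scale * n for n in noise]


# Hypothetical inputs: a short 440 Hz tone stands in for dry speech.
sample_rate = 16000
dry = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(400)]

impulse_response = toy_impulse_response(length=200, decay=0.02)
reverberant = convolve(dry, impulse_response)          # adds reverberation
noise = noise_at_snr(reverberant, snr_db=10.0)         # background noise
far_field = [s + n for s, n in zip(reverberant, noise)]

measured_snr_db = 10 * math.log10(mean_power(reverberant) / mean_power(noise))
print(f"samples: dry={len(dry)}, far-field={len(far_field)}")
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

In this sketch, a longer impulse response tail or a lower SNR value produces a harder listening condition, which is the kind of variation a far-field benchmark needs to cover.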

The Leaderboard underscores the importance of understanding how speech recognition systems perform in the environments where they are deployed — including meeting rooms, vehicles, homes, public spaces, and other acoustically challenging settings. This includes contending with a wide range of ambient noises, distractions, operating conditions, and environmental effects that challenge devices’ ability to hear users perfectly, all the time.  

Treble makes it possible to create a wide range of far-field conditions entirely virtually. The result is an evaluation benchmark as varied and complex as the real world, without the challenges of building labs or collecting real-world recordings. With this underlying simulation technology, Treble enables, for the first time, a more practical way to measure real-world ASR performance.

Dr. George Saon, Manager Speech Technologies at IBM Research, noted, “Even though most people take it for granted, at IBM we believe that automatic speech recognition is not a solved problem. That is why the FFASR leaderboard is a helpful tool for measuring ASR progress in challenging acoustic environments.”

“Open leaderboards have played a key role in advancing the field, and I hope FFASR helps bring greater attention to far-field ASR as a critical real-world setting. For example, it is crucial in Cohere’s North platform, where enterprise customers need high-quality meeting transcription even in challenging environments, such as large conference rooms with a single microphone,” said Julian Mack, Member of Technical Staff, Foundations at Cohere. “As a model developer, it is very useful to see the dry speech word error rates reported alongside the far-field ones as this helps separate core recognition quality from the additional challenges introduced by these real-world acoustics.”  

“The speech recognition industry has lacked a non-proprietary, community-driven way to measure how models perform outside ideal laboratory conditions,” said Dr. Finnur Pind, CEO and co-founder of Treble Technologies. “The Far Field ASR Leaderboard demonstrates how the Treble approach can help developers now evaluate models against the kinds of acoustic challenges users encounter every day. By partnering with Hugging Face, we’re making realistic, transparent evaluation accessible to the broader speech AI ecosystem.” 

“As voice interfaces expand into smart glasses, robotics, and other hands-free applications, evaluating ASR performance in noisy and far-field environments becomes increasingly important,” said Eric Bezzam, Audio ML Engineer at Hugging Face. “The FFASR Leaderboard is a significant step toward real-world evaluation. By combining Hugging Face's ML tooling with Treble's advanced acoustic simulation capabilities, it provides key insight into how models perform in far-field conditions, helping developers build more reliable voice-enabled products.” 
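The comparison of dry and far-field word error rates mentioned above can be illustrated with a short sketch. Word error rate (WER) is the word-level edit distance between a reference transcript and a model's output, divided by the number of reference words. The function name and the transcripts below are hypothetical and invented for illustration. The leaderboard's own scoring and text normalization are not described here and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, not taken from the benchmark.
reference = "turn on the lights in the meeting room"
dry_hypothesis = "turn on the lights in the meeting room"
far_field_hypothesis = "turn on the light in meeting room"

dry_wer = word_error_rate(reference, dry_hypothesis)
far_field_wer = word_error_rate(reference, far_field_hypothesis)

print(f"dry WER: {dry_wer:.3f}")
print(f"far-field WER: {far_field_wer:.3f}")
print(f"degradation: {far_field_wer - dry_wer:.3f}")
```

Reporting the two numbers side by side separates errors the model makes even on clean audio from errors introduced by the acoustic conditions.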

Get Started 

The Far Field ASR Leaderboard is available on June 11 on Hugging Face for developers and researchers to evaluate their own ASR models and explore the latest benchmark results across real-world acoustic scenarios. 

The Explainer Webinar: What to Expect 

To introduce this groundbreaking benchmark to the global machine learning and audio engineering communities, Treble and Hugging Face will host a joint live webinar on Thursday, June 11, 2026.

The session will feature key insights from industry experts, including Eric Bezzam (Audio ML Engineer at Hugging Face), Dr. Daniel Gert Nielsen (Senior Product Manager for the Treble SDK), Professor Shinji Watanabe (Carnegie Mellon University), Nithin Rao Koluguri (Senior Research Scientist at NVIDIA), Julian Mack (Member of Technical Staff, Foundations at Cohere), and Dr. George Saon (Manager Speech Technologies at IBM Research) covering: 

  1. The FFASR Structure: How to navigate the interactive leaderboard, submit models, and interpret performance metrics across different acoustic environments.

  2. The Physics of Far-Field Data: An inside look at the high-fidelity simulation data powering the benchmark and why acoustic complexity is required to achieve real-world model robustness.

  3. Interactive Q&A: A live forum answering technical implementation and benchmarking questions for participants.

Webinar Event Details 

  • Date: Thursday, June 11, 2026 

  • Session Options:

    • Europe: 9:00 AM UTC / 10:00 AM CET

    • USA: 9:00 AM PDT / 12:00 PM EDT / 4:00 PM UTC


About Treble Technologies 

Treble is a leading-edge technology company revolutionizing how the world models sound. Utilizing its proprietary, cloud-based simulation engine and advanced SDK, Treble bridges the gap between physical acoustic measurements and scalable virtual prototyping. Treble’s solutions enable spatial audio research, precision building design, and high-throughput synthetic data generation for the world’s most advanced audio AI systems. Using the Treble platform, developers and device manufacturers can generate custom synthetic datasets and create application-specific acoustic evaluation scenarios tailored to their own deployment environments. For organizations seeking faster evaluation and training capabilities, Treble also provides access to pre-built far-field datasets designed for ASR development, testing, and model optimization. www.treble.tech  

About Hugging Face 

Hugging Face is the collaboration platform for the machine learning community. The Hugging Face Hub works as a central place where anyone can share, explore, discover, and experiment with open-source ML. HF empowers the next generation of machine learning engineers, scientists, and end users to learn, collaborate, and share their work to build an open and ethical AI future together. With the fast-growing community, some of the most used open-source ML libraries and tools, and a talented science team exploring the edge of tech, Hugging Face is at the heart of the AI revolution.

Treble Technologies
Vineet Ganju
Vg@treble.tech
1 949 294 5018
Wired Island PR
Mike Sottak
mike@wiredislandpr.com
+1 650 248 9597