
In this episode of AI Unveiled, I sit down with Jean-Louis Quéguiner, Founder and CEO of Gladia, a cutting-edge audio infrastructure platform powering voice-first applications. Gladia’s mission? To make voice the ultimate interface for human and machine interactions, starting with state-of-the-art transcription.
Jean-Louis shares his journey from leading AI and big data infrastructure at OVHcloud to launching Gladia, and how the company pivoted from offering 500 APIs to focusing exclusively on voice solutions. He dives into the challenges of building and maintaining real-time voice infrastructure and tackling hallucinations in speech-to-text. Gladia doesn’t do pre-training and instead fine-tunes foundational models. We discuss the risk of commoditization as multimodal models mature and how Gladia differentiates by ingesting and modeling its customers’ specific business context and data.
The conversation explores AI’s impact on industries like healthcare, education, and law, as well as the future of real-time transcription and multimodal AI. Jean-Louis also provides insights into the importance of customer intimacy, how Gladia competes with hyperscalers, and why a long-term focus on real-world business problems is more important than chasing hype.
Tune in to learn how Gladia is redefining voice as the interface of the future and what it takes to build a high-impact AI infrastructure company.
Enjoy!
Timestamps:
* 4:58 – What should come first: product or customer development?
* 9:41 – Industries that will be most impacted by voice AI
* 14:25 – Tackling hallucination
* 39:56 – The impact of multimodal AI
* 44:48 – How AI founders can stay ahead of trends
Highlighted Excerpts:
JEAN-LOUIS: I think there’s something that you see with second-time founders a lot. I’m meeting a lot of them because I’m having problems that they used to have. And it’s funny to see how they are not running heads down into an idea. When you have first-time founders, you have people that are burning crazy amounts of energy to build something because they are just crazy.
They love the idea, and they want to go through it. When you see second-time founders, they are more like calm people. They study the market better. They don’t write a single line of code. They talk to as many people as possible. They don’t write anything until they find a market. So you don’t have the same energy, but you have more efficiency.
JEAN-LOUIS: When you’re a tech founder, you think the more features you have, the better it is. It’s not the case. You need to be extremely focused, and by doing everything, I realized we were doing nothing.
JEAN-LOUIS: Voice is the most natural way for humans to interact, not just with each other but also with machines. The interaction between humans and machines will only become more natural over time.