Description
If AI is to power the entire economy, inference must become affordable, scalable and widely available.
In this third part of Opening Voices, Quentin Adam continues the conversation with Steeve Morin, founder and CEO of ZML, to explore what it really takes to industrialise inference. They discuss:
why AI must move from “chatbots as products” to AI as an infrastructure primitive
why inference will power every sector: banks, startups, and industry
how efficiency gains (sometimes 5x, 10x, even 100x+) are still possible
why GPUs are not the only path forward
how new chips (TPUs, NPUs and emerging players) are reopening the semiconductor market
why power, density and optimisation now matter more than raw experimentation
This episode explains why the next wave is not about building better models, but about making inference economically viable at scale.
—
Episode Chapters: Making Inference Available
00:00 – Introduction and Context
01:38 – AI as a Primitive vs. AI as a Product
04:19 – The Economic Unit of the Token
05:15 – Scaling Compute for Inference
07:31 – A Revolution Comparable to Mobile
08:43 – Beyond GPUs
10:56 – Compiler Errors and Efficiency Waste
12:38 – Understanding Chips
15:31 – The New "Blue Ocean" of Semiconductors
20:40 – Nvidia's Strategy and Competition
21:43 – Conclusion and Next Episode