Who Wins if AI Models Commoditize? — With Mistral CEO Arthur Mensch

Channel: Alex Kantrowitz

Published at: 2026-01-16

YouTube video id: xxUTdyEDpbU

Source: https://www.youtube.com/watch?v=xxUTdyEDpbU

What does the AI business look like if all the leading models perform the same, which they kind of do? We'll find out with the CEO of Mistral right after this. Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. We have a great show for you today. We're going to talk all about what's happening to the AI business and technology race as some of the leading foundational models start to look the same, and how that changes the balance of power in the industry. We're joined by the perfect guest to do it: Arthur Mensch is here with us. He is the CEO and co-founder of Mistral. Arthur, welcome.
>> I'm happy to be here, and thank you for hosting us.
>> It's great to have you. So, Mistral is a name that those who are deep in the AI world know very well, but it might be new to some of our listeners and viewers. For folks who are new to Mistral, let me give you a couple of stats. Mistral is an AI model builder that does some other things, which we're going to get to. It's based in France. The company is valued at $14 billion after starting in April 2023, so a little under three years, or two and a half years, to build a $14 billion business. Not bad. There are 500 people at the company. And Arthur, you are leading it after spending some time in academia and two and a half years at DeepMind.
>> Exactly. We're headquartered in Paris, but around a fourth of our workforce is actually in the US, and a lot of our activity is here. That's why I'm spending a lot of time here as well, and that's why we are in New York.
>> All right. Well, great to have you in studio. Let's go right to what I think is the most pressing issue for AI today. There's been so much talk about how Google, at the end of 2025, started to equal OpenAI's models, and how OpenAI's models were somewhat on par with others. To me it seems like we're hitting commoditization of the foundational model much faster than I thought we would. I thought there was going to be a race where some companies would leap out further ahead and it would take others some time to catch up. But right now you have lots of model builders whose frontier models exhibit performance so similar that it's difficult to tell which is best. So what do you make of that?
>> I would say that inherently this is a technology that is going to get commoditized. The reason is that it's actually not hard to build. You have around 10 labs in the world that know how to build this technology, that get access to similar data, and that follow the same recipes and algorithms. The knowledge you need to train a model is actually fairly short, and because it's short, it circulates. So there's no IP differentiation gap that you can create, and it's very hard to leapfrog and be way ahead of the competition, because the diffusion of knowledge is making everybody do the same things. So the question is: where is the value accruing, and what kind of business model should you pursue to make sure that in the end you turn profitable? The challenge we see with some of our competitors is that they're investing billions or hundreds of billions into creating assets that are depreciating fairly fast, because those are commodities. For us at Mistral, it has always been one of the biggest questions of the industry: you need to invest enough to actually bring value to enterprises, but you also need to invest reasonably, so that you can build unit economics that make sense in a world where the creation of models, which is capital intensive, just brings you assets that are in a commodity competition.
>> So let's talk
little bit then about this race to build
you know the best possible model. I mean
like you mentioned it's very expensive.
Uh OpenAI is going to put $1.4 trillion
into building infrastructure for its
models or at least it says so. um if the
models are effectively at par, are
companies going to say, "Hey, wait a
second. Maybe it doesn't make sense for
us to invest all this money uh into
building the next evolution of a better
model because people can catch up. I
mean, strategically, I think it's it's
definitely uh there's some cursor to be
set. How much do you invest in creating
assets that are valuable enough for for
for one company to bring to for one
technology company to bring value to an
enterprise uh or to bring value to a
consumer. Uh and at the end of the day
all of these investments will need to be
funded by uh the free cash flow and
value creation that is being made
downstream. And so the focus that we
have as a company but that I think is
the reasonable focus is to be more on
the downstream applications and to
figure out what is the friction that
enterprises are look are running into
and try to lift these frictions because
at the end of the day I think one of the
major challenge that the industry is
facing today is that AI brought a lot of
promises like three four years ago. Uh
but if you ask an enterprise did you
actually make money out of it they will
in general say no and the reason for
that is that they are not customizing
things enough uh and they are not uh
thinking backward from the problem they
want to solve. So they think about the
solution uh but they don't think about
the problem and so trying to help them
uh to actually go for the right use
cases and actually do the right amount
of customization so that when when it
was a team of 20 people actually
operating uh some supply chain workflow.
Suddenly you can actually operate that
with two people. Uh and there's a lot of
examples like this. But the the the
challenge that the industry will face is
that we need to get enterprises to value
fast enough to justify all of the
investments that is collectively being
made.
>> Yeah, it is very interesting, because for a long time you would hear these companies focus on the model. The next model, say GPT-5 in OpenAI's case, was the biggest news. Now they're starting to talk more about how you take the intelligence you have and build the applications that work. Just one bit of reporting I can share: a couple weeks ago I had this story from inside a lunch with Sam Altman and a bunch of news leaders in New York City, and Altman told them one of the company's biggest priorities was building applications for enterprise. Basically, it's going to be a major priority in 2026. And it's a little bit of a shift in rhetoric, from "we want to build AGI" to "we want to build applications for business." So talk about why that is happening. Is it an offshoot of this commoditization issue?
>> Well, first of all, AGI is a very simple concept, probably too simple for enterprises. There's no such thing as one system that is going to solve all of the problems of the world.
>> Yet? Or you just don't believe in that concept at all?
>> It's never going to exist. There's a wealth of problems, just like you don't have any human who is able to solve every task in the world. You of course need some amount of specialization to actually solve problems. So we're back from magical thinking to system thinking. We need to figure out what data is going to be used to make the model better at a specific task, and what flywheel we need to set up so that we accrue more signal from humans interacting with the system, so that eventually the application becomes better and better. In real life, enterprises are just complex systems, and you can't solve that with a single abstraction, which is AGI. To a large extent, AGI is what we were not able to achieve; it was basically the north star of "I'm just going to make the system better over time." But because it's hard to explain to investors that the technology you're building is never going to be matched by your competitors, there's of course a shift in the narrative: companies are not building a single north-star system that solves all problems; they will need to go into the weeds of enterprises and solve their actual problems. I think at Mistral we've been ahead of our time in thinking about this. That's kind of our story. Our story has been to assume that eventually AI will be more decentralized, and that more customization would be needed, because we were running into the limits of the amount of data we could accrue and the limits of scaling laws. Because of that, we created the company on the premise that we'd bring more customization ability to enterprises.
>> Yeah. And we'll get to the Mistral story in a little bit, but one more question about this. It seems to me, and I wonder if you agree, that there's been a shift. You were ahead of this, for sure. But it seems like there's been a shift in the AI industry, where the idea was effectively: make the models smarter and they'll be able to figure out these problems on their own. To make it concrete: make the model smarter and it will be able to do a lower-level associate's job, or maybe do data entry across multiple systems and file reports. Now it seems like there's been a shift from that to actually building out the infrastructure, where the models are just one component, the infrastructure is super important, and things like orchestration and the applications built on top of the models are going to be where the value is found. It's interesting.
>> Yes. If you look at it from a system perspective, you have two components, and we'll always have these two components. The first is the static definitions of what the workflow should be and how a system should behave. Those static definitions are set by humans; this corresponds to the manual information you're using to define the system. Then there's a dynamic component, where you're connecting a model to tools, you're giving instructions to the model, and the model can go and call the tools itself. It can decide on the graph of execution it's going to follow. So that part is dynamic, and there's a static part where you're setting up guardrails, or sometimes deciding you have a decision tree. I think it's a bit utopian and unrealistic to think that you can solve everything with a dynamic system without guidance from humans. What has happened in the industry over the last three years is that the dynamic part has grown, because models can think for longer, because they can call multiple tools, because they can code. But the static part remains extremely important, and even as the dynamic part grows, the static part allows you to create systems that are even better and more interesting, and to solve problems you were not able to solve before. So the combination of these static systems, which you can call orchestration if you want, and the dynamic systems, which you can call agents, is going to stay super important, because the two things are moving up together so that we can tackle problems that are more and more complex.
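The static/dynamic split described above can be sketched in a few lines of code. Everything in this sketch is illustrative: the tool names, the keyword-based stand-in for the model's tool choice, and the guardrail are invented for the example and are not Mistral's implementation.

```python
# Toy illustration of orchestration (static, human-authored) wrapping one
# step where a "model" dynamically chooses which tool to call.

def search_inventory(query: str) -> str:
    return f"inventory results for '{query}'"

def draft_email(topic: str) -> str:
    return f"draft email about '{topic}'"

TOOLS = {"search_inventory": search_inventory, "draft_email": draft_email}

def fake_model_choose_tool(instruction: str) -> str:
    # Stand-in for an LLM's tool choice; a real system would ask the model.
    return "search_inventory" if "stock" in instruction else "draft_email"

def guardrail(tool_name: str, allowed: set) -> None:
    # Static component: humans decide which tools are permitted.
    if tool_name not in allowed:
        raise PermissionError(f"tool {tool_name!r} not allowed")

def run_workflow(instruction: str) -> str:
    # Static component: the overall sequence of steps is fixed by humans.
    allowed = {"search_inventory", "draft_email"}   # static guardrail
    tool = fake_model_choose_tool(instruction)      # dynamic choice
    guardrail(tool, allowed)
    result = TOOLS[tool](instruction)               # dynamic execution
    return f"logged: {result}"                      # static post-step

print(run_workflow("check stock levels for part 42"))
```

The design point is that the guardrail and the step sequence stay fixed even as the model's tool-choosing step gets smarter.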
>> Okay. And so with that established, I'm thinking through what the businesses will be, let's say the model has commoditized. So what are the businesses going to be in AI? I imagine some form of consumer products like chatbots, where you could put OpenAI in that bucket. There will be a business of making your existing products better, like, for instance, chatting with Microsoft Excel. That could be one way that current companies make their products better. But then there's this other big bucket, which we've already talked about a little, which is the enterprise side of things. So how would you rank the business opportunity in those three buckets?
>> Well, on the consumer side, because AI is becoming the way you access information, you basically have an ads business to be built, and it's pretty clearly going to be built. It's not the focus of our company. Then, if you look at the enterprise side, we're basically replatforming all enterprise software. In enterprises you have people, you have data, and then you have processes. Historically there was a fragmentation of the tools used to run multiple processes, multiple data systems, multiple systems of record, and there was a fragmentation across teams that were not able to access all information at the same time. Essentially, what AI allows you to do in an enterprise is to start with unified data, or even with fragmented data sources, because the AI is able to navigate them.
Then you put an AI on top that builds the right amount of intelligence, understanding what's going on in the enterprise, and the AI system is able to somewhat generate the interfaces that are useful for every human to actually work. That replatforming of the entire enterprise software stack is the one place where a lot of value can be created in the enterprise: owning the context engine, the system that is constantly running, looking at what's happening and creating documentation for it; and owning the front ends, which are more and more getting generated on demand. Say I'm a lawyer, I have one problem to fix and a very specific review to make. I just bring my document, and the system evolves to show me the right widgets and the right information I need. So: generative interfaces on top of a context engine that is constantly updating its representation of what's happening in the enterprise, on top of systems of record that are essentially going to be just pure databases. You don't need everything that was sitting on top before. This is where this is going, and that replatforming is going to take a decade, I think, because it takes a while to get enterprises to adopt these things. But there's just immense value to be created, because suddenly you can reorganize your company around the fact that many processes where you had a lot of people can now run very much faster. That's the efficiency side, and it's one of the business modalities in the enterprise. The second one in the enterprise is about working with enterprises to help them take their really proprietary data, the assets being produced by their machines if it's in the manufacturing industry, for instance, and turn that into intelligence that nobody else can reproduce. So, making models specifically good at a certain kind of physics when we're working with a company building planes, for instance, or, when we're working with ASML, making models that are specifically good at operating their machines. That's huge value, because suddenly you're not just building efficiency within the company; you're effectively unlocking technological progress that was locked by the absence of AI. That unlock the new systems provide is an immense amount of growth. It's actually harder to measure, because the first one is shorter term.
You can look at what the company will look like in five years, because you've reduced certain parts of the company and reoriented other people toward creating growth; you can build models of that. On the technological side, I think it's a little harder, because we know there are things like nuclear fusion, or sharper engraving of semiconductors, for instance, where we are starting to run into physical constraints, and artificial intelligence can actually help lift those physical constraints. So the acceleration of technological progress is, I think, where most of the value creation will be. It will take a little bit of time, and it will be less measurable and less predictable than the efficiency gains that AI is going to produce. But the two things are equally important.
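The "context engine over plain systems of record" idea from this answer can be sketched as a toy: dumb storage below, a continuously rebuilt representation in the middle, and a per-role generated view on top. All names and data structures below are invented for illustration, not an actual Mistral product.

```python
# Minimal sketch: systems of record stay pure storage; a context engine
# keeps an updated representation; the interface is generated per role.

records = []  # stand-in for a plain system-of-record database

def ingest(event: dict) -> None:
    # Systems of record just accumulate raw rows.
    records.append(event)

def build_context() -> dict:
    # The "context engine": a constantly refreshable summary of activity.
    ctx = {}
    for e in records:
        ctx.setdefault(e["team"], []).append(e["action"])
    return ctx

def generate_interface(role: str) -> str:
    # Generative front end: each user sees only the slice they need.
    actions = build_context().get(role, [])
    return f"{role} dashboard: {len(actions)} recent actions: {actions}"

ingest({"team": "legal", "action": "contract reviewed"})
ingest({"team": "supply", "action": "order delayed"})
ingest({"team": "legal", "action": "NDA signed"})
print(generate_interface("legal"))
```

In a real deployment the summarization and interface generation would be done by a model rather than these hand-written functions; the layering is the point.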
>> Okay, so let me see if I can game this out a little bit. If that is going to be the key driver of value in the AI world, there are two ways to do it. One is to build a model that's better than everybody else and sell it for a premium. But we've already talked about the fact that that doesn't seem like it's going to be a moat forever. And the other way is: the model is actually not the value, it's the know-how and the implementation side of things. So you can make the model open source, but then provide a service to businesses so they can figure out how to take that model, put it into action, and actually get results. Are those the two choices?
>> Yeah, that's kind of the fork we see in the industry, and our view has been to be on the second one.
>> The open source implementation.
>> Which brings customization, but it also brings decentralization, in that if you assume the entire economy is going to run on AI systems, enterprises will just want to make sure that nobody can turn off their systems. The same way, if you have a factory, you connect it to the grid, and you want to make sure that nobody's going to turn off the grid because they don't like you. If AI effectively becomes a commodity, which is what's happening, and if you treat intelligence as electricity, then you just want to make sure that your access to intelligence cannot be throttled. That's also one of the things open source technology can bring.
>> If you're using open source, you don't have to worry about going astray of, say, Anthropic's user terms and having them pause your ability to do what you do. If you use open source, you can basically run it on your own terms.
>> Yeah, you run it on your own terms. You create the redundancy you need, you can serve with higher quality of service, and you can make sure that whatever the geopolitical situation may be, you can still run the systems if you want. So that's the IT side: if I'm a CIO, I look at open source as a way to create leverage and independence. But on the scientific side, it's also the only way to create systems that effectively use the folklore knowledge of your employees, the knowledge you've recruited for decades. The only way to turn that into an asset nobody else gets access to is to create your own models based on those open source models. But it's hard to actually build those, right? That's where you need the right tools and the right expertise, and that's the complementary business model to building open source models.
>> But even the closed source model providers, companies like Anthropic, will say they can customize their models with your data. You don't believe that?
>> They will say that, but then they will put some guardrails on top of it. So you're basically trusting that their engineers are going to give you enough access to the depths of the system. Can you trust that for eternity? I'm not sure. So the issue there is as much a question of control as a question of customization, right? A vendor is going to try to lock you in. If you build on top of open source models, like our open source models or anyone's, you're basically less locked into the vendor, and this is a technology so important that you don't want to be locked into a single vendor. So that's also the opportunity we bring.
>> You know what's stunning to me? We're three years past ChatGPT, which basically brought this into a lot of people's consciousness, although I think Big Technology listeners would have known about it a little beforehand, especially since we were interviewing the people who thought this stuff was sentient before ChatGPT came out. But that's a conversation for another time. What we're basically saying today, to sum up two of the main points you've made: one is that today's AI models can't do it all themselves; they need orchestration. And the second big point is that to do that sort of orchestration or implementation with the current intelligence, you need something like a managed service. So it is interesting to me that we've gone from this perspective of maybe working towards a god model that could do it all, to the fact that this may be the most powerful technology we've seen come through in our lifetimes, yet when you actually want to use it, it kind of becomes a managed service in a way.
>> Yes, this is true. I don't think it's the first time we've observed this in history. It's a new technology, a new platform, and so the knowledge of how to use it is still pretty scarce. There aren't that many people who can build systems that perform at scale, that run at scale reliably, and that actually solve a real issue. So when working with enterprises, you always need some services on top, because of the complexity of implementation, even with fairly well understood technology like databases. But for artificial intelligence, it's even more necessary, in that it requires transforming businesses. You also need to help in thinking about how the team should perform around the system itself. And it does require customization, so you need data scientists who know how to leverage data and turn it into intelligence, and today this is still a pretty scarce resource. I do expect the share of software in those deployments to increase. The way customization occurs today, with fine-tuning, reinforcement learning, these kinds of things, is going to be abstracted away from the enterprise buyer, because it's too complex. They should just worry about having adaptive systems that learn from experience and from deployment with people, instead of thinking about whether to use fine-tuning or reinforcement learning to put that knowledge into their models. The work we are doing is to abstract away from lower-level routines that data scientists understand to higher-level systems that business owners can actually use. So that's going to happen, and we're working on it. But the service part is still going to be quite important, and today the combination of the two things is the fastest way to value if you're an enterprise. So we've been combining the
two.
>> You know, I started our conversation by calling you a model builder, and I kind of paused on it and said you do some other things that we'd get into later. And here we are. Basically, what I'm hearing from you is that Mistral is obviously a proud model builder, but without the services, without being able to sit with a business and show them how to use it, it would be an incomplete puzzle. So do you consider the most important thing you do to be building the models, or is it the service? Are you primarily a model builder or primarily a service provider?
>> I mean, we are there to help our customers get to value.
>> So, service.
>> We're here to... but to get to value they need great models, and to get to value they need the right tools to train the models, and the best way to create those tools is effectively to train the best models. So the two things are extremely linked. We create models that are very easy to customize. We create models with tools that we then export to our customers so that they can use them, and we help our customers train their own models. You can't go and sell to an enterprise that you're going to help them create very custom systems if you can't show the world that you're effectively the leader in open source technology. So the two parts are equally important; the first enables the other, and there's effectively a flywheel there, because we make our choices in model design in a way that enables the various customers we have. One example is that we've put a lot of emphasis on having models that are great at physics, because we work with manufacturing companies that run into physical problems. That's the flywheel we've set up by having the science team and the business team actually sit together.
>> Okay, we're here with Arthur
Mensch. He is the CEO and co-founder of Mistral. When we come back after the break, we are going to talk about the open source movement versus closed source. Remember, with DeepSeek, open source was supposed to surpass closed source. Well, has it? We'll also talk about geopolitics and regulation and whether that's going to give this company a leg up, and then maybe get into some more practical examples, because we should talk about how the technology is being used on the ground. We'll be back right after this.
And we're back here on Big Technology Podcast with Arthur Mensch. He's the CEO of Mistral. Arthur, I want to ask you about the progression of open source over the past year. I remember
reading about DeepSeek, doing reporting on DeepSeek, in January, and the overriding theme was that it was such a leap forward for open source that soon the closed models, models like OpenAI's GPT, Anthropic's Claude, and maybe Google's Gemini, would be surpassed by open source, because the open source community was working together and building on each other's innovations, where the closed source companies were going at it on their own. We just had this moment, which we talked about at the beginning of the show, where maybe Gemini commoditized OpenAI's GPT models, but that conversation was not being had about open source living up to that expectation from the beginning of the year. So am I missing something, or am I reading it wrong? If something has held back open source, what has it been?
>> Well, if you look at the trends in 2024, I'd say there might have been something like a six-month gap. If you look at the trend in 2025, I think the gap is more around three months. So I guess it's up to anyone to guess what the gap is going to be next year. But effectively, this gap has been shrinking quite significantly. The reason is that you basically have a saturation effect when you pre-train models, around 10^26 FLOPs, because there's only so much data you can find to compress when you pre-train models. So labs that maybe started a little behind have created enough compute capacity to train models at this kind of scale, and efficiency has also increased. What that means is that today everybody has access to 10^26-FLOP facilities over the course of a few months.
>> And that's a measure of compute.
>> That's a measure of compute, or rather compute rate times time. 10^26 FLOPs is something that any lab today can achieve in a couple of months.
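As a rough back-of-the-envelope on that 10^26 figure: with assumed numbers for cluster size, per-GPU throughput, and utilization (all illustrative, none stated in the conversation), the training time indeed lands in the few-months range.

```python
# Back-of-the-envelope for a 10^26 FLOPs training run. Every number below
# is an illustrative assumption, not a figure from the conversation.

TARGET_FLOPS = 1e26
gpus = 30_000            # assumed cluster size
peak_per_gpu = 1e15      # assumed ~1 PFLOP/s peak per accelerator
utilization = 0.4        # assumed model FLOPs utilization (MFU)

effective = gpus * peak_per_gpu * utilization   # sustained FLOP/s
seconds = TARGET_FLOPS / effective
days = seconds / 86_400
print(f"~{days:.0f} days")  # prints "~96 days", i.e. a few months
```

Scaling the cluster up or down moves this linearly, which is why the gap between labs compresses once everyone can assemble a cluster of this order.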
Because of that saturation effect, open source models have caught up: closed source models that started ahead have run into that wall of pre-training. What that means is that this gap is only going to continue shrinking. If you look at our latest open release, Devstral 2, which is a coding model, it's performing, I think, around the performance of Anthropic's models from two or three months ago. So yes, I think the gap is shrinking. And again, I think the question is probably not posed in the right way, because these are also two distinct value propositions. On one side, it's well managed, but you depend on the provider itself. On the other side, it takes a little more effort, because you need to own it more: you need to learn how to customize it, use the right tools for doing so, and maintain its deployment if you choose to deploy it on your own facilities. But in the end, this creates the leverage you need against closed source providers. So the two categories are effectively different, but if you look at pure performance, they are definitely converging.
>> You mentioned that there's a saturation effect. So, without getting too technical, are the models sort of done getting better? Let me put it this way: are AI models going to continue to get better, given that they all seem to be hitting saturation?
>> They will get better
in more and more specific domains, in that I think we've collectively made them very clever, able to reason about long context, able to call multiple tools, et cetera. But if you want to effectively put them into production in a bank or in a manufacturing company, the models need to learn all of the knowledge contained in the companies themselves. So what that effectively means is that for very precise directions, let's say I want to make my model extremely good at discovering materials or extremely good at designing planes, I will need to go and sweat it a little bit: get the right reward signal, get the right experts, and ask them to make my model specifically good in that very precise direction. And we are definitely not done doing that.
Because what we are all racing for is the right environment and the right signal provider for specific capabilities. The broad horizontal reasoning capabilities, we're still going to improve them, but nobody is going to improve them in a way that creates a strong gap versus its competitors. The strong gap is actually in working with vertical experts who know exactly how they design a plane and can explain to the model how to do it. And you have a wealth of directions you can take, because you can do it in physics, in chemistry, in pharmaceuticals, in biology. So to me, the most exciting part of what's going to happen in the next two years is that explosion of very precise directions in which the models are going to get better. For us, the opportunity is to have the right platform for enabling that kind of verticalization, whether with enterprises or with AI startups working on very verticalized capabilities, and we're happy to help them as well. So that's my view of where the field is going. We have been about horizontal intelligence growing and things getting more and more clever, and the next two years are going to be about taking models and making them extremely good at a certain skill set. That's actually more exciting, because we're getting to a point where you can pick a domain and just make the model superhuman in it, but we're not going to make it superhuman in every domain at the same time.
>> Okay. But on that note, earlier in our conversation you said that you're not going to have a model that can do everything. But if that training gets done in certain verticals, why not?
>> Well, we are also getting to a point where the verticals that you choose do not really transfer to the others. So there's no point in making a model that is good at very precise biology and very precise physics, because the transfer between those things is actually pretty unclear. The problem is that if you want your model to be able to solve every problem at the same time, you're making it very big, very expensive and very costly to serve. So you're really going to specialize one model for bio, one for chemistry, one for this particular physics problem.
>> Well, it actually makes more sense, because if you want to run it at scale, if you want it to run in the background, if you want it to run day and night thinking about specific problems, you want it to be as small as possible, because the cost of a model is actually proportional to its size. And if you inflate the size by making the model great at multiple domains, you're not very efficient if you want to deploy it and use it as much as possible. So if you look at the economics of it, it does make sense to make specialized models in certain directions.
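The economics Mensch describes, serving cost scaling linearly with parameter count, can be put in rough numbers. This is a minimal sketch, assuming the standard back-of-envelope rule of roughly 2 FLOPs per parameter per generated token for a dense transformer; the model sizes and traffic figures are hypothetical:

```python
def inference_flops(params: float, tokens: float) -> float:
    # Standard rough rule for dense transformers:
    # ~2 FLOPs per parameter per generated token.
    return 2.0 * params * tokens

# Hypothetical models: a 70B-parameter generalist vs. an
# 8B-parameter specialist, each serving 1B tokens per day.
daily_tokens = 1e9
generalist = inference_flops(70e9, daily_tokens)
specialist = inference_flops(8e9, daily_tokens)

# Cost scales linearly with size, so the specialist is
# 70/8 ≈ 8.75x cheaper to serve for the same traffic.
print(generalist / specialist)
```

Under this linear model, making a small model "superhuman" in one vertical keeps the per-token serving bill an order of magnitude below a generalist's, which is the efficiency argument being made here.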
>> Let me ask you a little bit about the Mistral competitive area. We're here in the US, so I'll just tell you what people in the US say and let you address it, because it's worth talking about. I think there is a feeling among some, not all, but some, that Mistral has been set up in Europe to effectively take advantage of regulatory capture, because US companies have a hard time competing in Europe, and therefore Mistral will be there to pick up all the AI business. What do you think about that argument?
>> Well, you know, we've built
our technology so that we could serve companies and states that wanted to have enough control. Artificial intelligence is not a technology that you want to fully delegate to a vendor, especially a vendor from a foreign entity. That was true before, it was true for data, and it's going to be all the more true for artificial intelligence, for multiple reasons. One of them is the fact that if you're depending on an external vendor, your trade imbalance is effectively increasing: you're importing services, and that becomes a problem long term if you're importing too many digital services, for instance. So that's one thing. And then sovereignty and this kind of topic is also very important for defense: if you're an independent country you want to have independent defense systems, and if you want independent defense systems, you will need your own independent artificial intelligence, because it is making its way into the defense systems.
>> So it's really working for you, this pitch, being like: we are not an American company, we're based in Europe, and we'll be able to help you build, whether it's something with important data protection or in national security, like defense.
>> Well, it's a technological differentiation we've built. Because we can build on the edge, because we can deploy wherever our customers want us to deploy, we could effectively die and the system would still be up, which actually matters for many, many industries, and the more critical it gets, the more it matters. What that also means is that we can serve US customers, US customers that want to depend less on certain providers. We can serve banks that want more customization, more control, that are more regulated. It also means we can, of course, serve the European industry, where historically we were based; you sell next door when you start your company, and that's what we did. But we also serve Asian countries, and Asian countries have similar problems. They want a technology that they can rely on even if we were to die. They want a technology that they can customize to their own cultural needs. So that aspect has been driving our business for sure: the technological differentiation that we've built around control, around open source, a technology built on open-source models, around customization.
>> And do you have European governments coming to you and saying: we just don't trust Google or Anthropic, and we'd prefer not to build on them?
>> Well, we have European governments actually coming to us because they want to build the technology and they want to serve their citizens. They want to increase the efficiency of their public sector, and we happen to have a good proposition for them, which is deployable on their premises, where we can send forward-deployment people to help them get to value. And it turns out we're European as well. So it's actually pretty good for European countries to invest in European technology, because the revenue they're creating for us is revenue that we reinvest in Europe, and we're effectively creating an ecosystem around us. That flow of revenue from European countries to a European technology provider is something that is very beneficial, and to be honest, in the US that has been working for the last 80 years, and I think in Europe we haven't been doing it enough, for sure.
>> Speaking of open-source companies or efforts that have some links to geography, what do you think about China's open-source effort? Because obviously they've made a lot of noise, and it seems like things are going quite well there.
>> Yeah, I mean, China is very strong on artificial intelligence. We were actually the first to release open-source models, and they realized it was a good strategy, and they've proved to be very strong. I'm not sure we're competing, because the good thing about open source is that it's not really competition. You build on top of one another.
>> Right. You see everything they have out there and you learn what works well.
>> Yeah, and the reverse is true. Like, we released the first sparse mixture of experts back at the beginning of 2024,
>> and they built on top of that and released DeepSeek-V3.
>> DeepSeek was built on top of that.
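The sparse mixture-of-experts idea being discussed can be sketched in a few lines: a router scores a set of expert networks, only the top-k experts actually run, and their outputs are mixed by the normalized router weights. This is a toy illustration of the routing mechanism only; the experts, router, and sizes are stand-ins, not any real model's architecture or weights:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Eight toy "experts": each is just a different affine map here,
# standing in for independent feed-forward networks.
EXPERTS = [lambda x, i=i: (i + 1) * x + i for i in range(8)]

def router_scores(x):
    # Stand-in for a learned linear router; scores depend on input.
    return [math.sin(x * (i + 1)) for i in range(8)]

def moe_layer(x, k=2):
    scores = router_scores(x)
    # Sparsity: only the top-k experts execute; the others cost nothing.
    top = sorted(range(8), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])
    return sum(w * EXPERTS[i](x) for w, i in zip(weights, top))

print(moe_layer(1.0))  # only 2 of the 8 experts contributed
```

The point of the architecture is in the `[:k]` slice: the layer holds the capacity of all eight experts but pays the compute cost of only two per token, which is why the design spread quickly once the recipe was public.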
>> Well, it's the same architecture, and we released everything that was needed to rebuild this kind of architecture. And the same is true more broadly: everything that companies investing in open source release is something that other open-source companies reuse, and actually, that's kind of the purpose. R&D is just much more efficient if you share your findings across different labs, and so it's been very effective in China; they share knowledge across the different labs. It's been pretty inefficient here in the US, because US-incorporated companies are not investing in open source, and we've taken the lead on being the West's open-source provider. And I think it's going to be very much needed to have a Western open-source provider.
>> What do you think
China's strategy is? In the US there's often this very large conversation about the need to stay ahead of China. Do you think there's a risk if China runs away with this?
>> Well, I think China is very strong. It's vertically integrated. They have strong engineers, they have compute, they have energy; they have everything they need to compete. Europe also has everything it needs to compete. I don't think we'll be in a setting where anyone is going to have an artificial intelligence ahead of the others. And if you look at the world in its entirety, every large enough sovereign entity with a big economy is going to want some form of autonomy in its usage and deployment of AI. So that does justify the emergence of multiple centers of excellence: one of them in Europe, which is led by us; another which is more in Hangzhou, in China; and then you have a bunch of companies here on the West Coast.
>> Why do you think it's in China's strategic interest to develop these open-source models?
>> I mean...
>> Because they don't have a similar business, as you do, right? They're not going out globally and becoming implementers.
>> They have a big business in China.
>> Okay.
>> For sure. The companies that are building open-source models in China are actually cloud providers, in general. You have a bunch of startups, but you also have Alibaba, which is a cloud provider, right? And so they have this vertical integration that allows them to create value internally, in China, but also in the markets where they are operating and growing. In Asia, for instance, which for us is a place where we tend to compete with them, not in China itself but in the rest of Asia. So it does make sense for them to compete internally, and then their best way of accessing the US market is by just giving the things away for free. So it does make sense. It's a very natural thing to do: build a business in China, which is protected, then export the thing for zero. I would do the same if I were in their shoes.
>> Right. All right. Before we leave, I want to talk a little bit about the practical applications of this technology that you're building. You know, it's interesting: you were talking about AI being used for physics, AI being used in other research applications, AI being used for defense. None of this sounds like a chatbot. So talk a little bit about the applications that you are working on, and whether we're going to see AI move beyond the chatbot.
>> I mean, the chatbot is oftentimes the interface, because generative artificial intelligence allows you to interact with machines in a human way. So the chatbot is a human-machine interface, but it's only that. Now, if you look at the actual applications that are strongly exciting for us, you have two things. You have the things that are really about end-to-end workflow automation, which effectively changes the way a business is fully run. An example is cargo dispatching, where we work with CMA CGM, which is a shipping company, and we help them dispatch all of their containers. When the ship comes into the port and they need to dispatch everything, they need to contact hundreds of people, they need to contact the harbor, they need to contact the regulators, they need to operate 20 different pieces of software. That takes, I think, a few hundred people to do. And by working together on how to automate those things, suddenly you can save 80%.
>> So the LLM is making those communications and also deciding? Not just making the call, but deciding who gets what?
>> It decides, it wires the things, and you measure whether it's doing the right thing, and if it isn't, then you improve the system.
>> How's it doing?
>> It's working. It's live, actually, in certain agencies. To me it's very exciting because it has a physical footprint. It takes decisions in a safe way, and it's effectively bringing a very large efficiency gain to a company. Now, another example, which is more on the growth side, is what we do with ASML. We are working with them on vision systems.
>> And talk a little bit about what ASML is, for those who don't know.
>> ASML is a company that does computational lithography and scanning, and their role is to build those big machines that effectively engrave the wafers that are then used as the chips in Nvidia products, for instance.
>> Right, so they're a key industrial component of semiconductor manufacturing.
>> They provide the machines for semiconductor fabs.
>> And for something so specialized, you would think: how is generative AI going to help them?
>> Well, generative AI models are, generally, predictive AI models. And one good thing they have is that they can see, and reason about what they see. One of the things ASML needs to reason about are the images coming out of their scanners, which verify whether there are errors in the engraving of the chips. It's actually fairly complex, because there's some logical thinking to be done. And the combination of images and logical thinking is what enables us to automate those things much faster, which means that the throughput down the line, in fabs, is going to increase. In that setting, customization is key, because the kind of input that is coming in is nowhere to be found elsewhere. ASML is the only one with access to these images. So we find a physical problem that is effectively a bottleneck in a manufacturing process, and we go and train models that effectively solve it. And this is going to occur in many, many different places. Generative AI is needed there because you need a model that can reason about images, so the reasoning capabilities are critical, but customizing those reasoning models for a specific problem, with a specific kind of input, is the one thing that is the unlock there.
>> Yeah, the industrial
applications of generative AI have been super surprising and interesting to me. There has been technology, for instance computer vision technology, that can take a look at a piece of machinery or an output and say: ah, that's not good, or: actually, that's what we need. Right? But there hasn't been this nerve center that that information can be channeled to, where a decision gets made about it and then communicated to somebody in the field. That's what this stuff is enabling: that full line of technical work is starting to be able to be done by this technology.
>> Yeah. Basically, what you need are models that can perceive multiple kinds of information, and oftentimes in manufacturing, information is visual. So having very strong visual models is super useful. And then, based on those vision models, on those inputs, you can make choices, and you can rely on the LLMs themselves to orchestrate: calling an agent, going into the next step of the workflow, calling a tool, or writing something to the database. Having dynamic agents that are able to see what's happening in a factory, that are able to see what's happening in a process, and that can take the next step, whether it's an automatic step or a call-an-agent step so that a person validates the decision, is where a lot of the value can be created. And that's going to reorganize manufacturing. You know, manufacturing had to reorganize itself multiple times: when we invented the steam engine, we had to rebuild entire factories around a central steam machine, because that was the energy provider. And so what's going to happen, I think, in the next 10 years, is that all of the manufacturing processes will be rebuilt around LLM orchestrators. And it's super interesting, because you have physical problems to solve. The system has a physical footprint, so there are safety issues you need to solve. And the complexity of the system itself is huge. So that's a fascinating problem for engineers like us.
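The orchestration pattern described here, a vision model producing a report and an LLM choosing which tool or agent to invoke next, can be sketched as a minimal dispatch loop. Everything below is a hypothetical illustration: the tool names are invented, and the hard-coded routing rule stands in for the decision a real model would make:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tools the orchestrator can wire together; in a real
# deployment these would hit cameras, databases, and human operators.
def flag_defect(part_id: str) -> str:
    return f"defect ticket opened for {part_id}"

def log_ok(part_id: str) -> str:
    return f"{part_id} logged as OK"

def escalate_to_human(part_id: str) -> str:
    return f"{part_id} routed to an operator for validation"

TOOLS: dict[str, Callable[[str], str]] = {
    "flag_defect": flag_defect,
    "log_ok": log_ok,
    "escalate": escalate_to_human,
}

@dataclass
class VisionReport:
    part_id: str
    defect_score: float  # from a vision model: 0 = clean, 1 = defective

def orchestrate(report: VisionReport) -> str:
    """Stand-in for the LLM's decision: pick a tool from the report.

    A real orchestrator would prompt a model with the report and let it
    choose a tool call; the thresholds here are illustrative.
    """
    if report.defect_score > 0.8:
        action = "flag_defect"
    elif report.defect_score < 0.2:
        action = "log_ok"
    else:
        action = "escalate"  # uncertain cases get human validation
    return TOOLS[action](report.part_id)

print(orchestrate(VisionReport("wafer-17", 0.93)))  # clear defect
print(orchestrate(VisionReport("wafer-18", 0.05)))  # clean part
print(orchestrate(VisionReport("wafer-19", 0.50)))  # ambiguous case
```

The "call-an-agent step" from the conversation corresponds to the `escalate` branch: the automatic steps run unattended, and only ambiguous cases are routed to a human for validation.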
>> Let me see if I'm getting this right. I think what we're starting to see is the seeds of this stuff starting to really have an impact in business. We just did an episode with a reporter who was covering how some lawyers are really able to use this to sift through documents better. Is it perfect? No. We heard it in the comments: not perfect. But it's showing potential. Same thing in industry, and maybe also in the other areas that you touch on. But it still feels nascent. So what's going to get it from where it is today to something that's effective in a way that we really see the impact in the economy? Is it just time and patience on customization, or is it improvement of the models, or...
>> I think models are getting better, which helps. Whenever you have a stronger model, you can trust that it's going to reason for a longer period of time and that it's going to fail less.
>> Mhm.
>> But then the thing that needs to be embraced is iteration. We're never going to be able to build systems that work out of the box in a single shot. The one thing that we try to convey to our customers is that they need to build a prototype that's going to work 80% of the time. But then how do they get from 80% to 99%, where they can move the thing to production? The way to get there is to actually get feedback from users. If the system is not working, if the AI software you've built is not working, it means that you need more data and signal. And that's quite different from the way we used to build software, because when the software wasn't working before, you basically went back to coding and fixed the problem. But because we're building organic systems, systems that imitate humans, the way to make them better is to give them feedback and then retrain the system. So that's what will take the seeds that you mentioned and make them actual valuable things. That's going to work. And you mentioned
lawyers. I think that's one of the areas where the work is very knowledge-intensive and has very little physical footprint.
>> And so it's the easiest one.
>> It's the easiest thing to do, though it's not easy at all, and it's not done yet. There are still a lot of subtleties to fix to make models great at lawyering. But if you go into the physical world, it gets even more complex. So we'll see applications in the knowledge world go into production faster than the ones in the physical world. But arguably, the ones in the physical world will be more transformative.
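The 80%-to-99% iteration loop Mensch describes can be put in toy numbers. This is a simple simulation under an invented assumption, that each feedback round converts enough failure cases into training signal to close half of the remaining error; the rate and thresholds are illustrative, not Mistral's actual process:

```python
def retrain(accuracy: float, improvement_rate: float = 0.5) -> float:
    # Toy assumption: each feedback round closes a fixed fraction
    # of the remaining error by turning failures into training data.
    return accuracy + (1.0 - accuracy) * improvement_rate

accuracy = 0.80  # the initial prototype: works 80% of the time
rounds = 0
while accuracy < 0.99:  # the production threshold mentioned above
    accuracy = retrain(accuracy)
    rounds += 1

print(rounds, accuracy)
```

Even in this optimistic toy model, crossing from a working prototype to production-grade reliability takes several full collect-feedback-and-retrain cycles, which is the "iterations need to be embraced" point: the last 19 points of accuracy cost more rounds than the prototype did.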
>> That brings us to robotics, so let's end here. People have been talking about how we could see an explosion in robotics because of LLMs, or because of the advancements in world models, but it still seems far off. I mean, there was that demo, what was it, the Neo humanoid robot, where there's a person controlling it, teleoperating it, and they might be in your house. Kind of weird. So we haven't seen progress in robotics start to move as fast as we've seen it on the software side, the large language model side. So when does that come, if it ever does?
>> I think in
robotics you have a combination of two things that need to work. Hardware platforms: you need the right actuators, with the right haptic signals, built at scale with good economics. And this is starting to be true; we're not the ones working on it, but the industry has made a lot of progress in that domain. Then the other thing is that you need control systems that are sufficiently intelligent to be deployed on those robots. And that's where we come in, in that, again, you need custom models. The model needs to be customized to the platform, whether it's a humanoid robot, something on wheels, or a flying drone, and it needs to be customized to the mission, because the mission is going to bring different kinds of images. The kinds of actions that can be taken are going to vary across missions. Maybe the guardrails are different. And so that adaptation to the world, and to the wealth of data that the deployed hardware platform is bringing, requires the right training platform. So our bet in robotics, and what we've been doing with multiple companies, in defense in particular, is to build that platform that allows you to train fit-for-purpose models that can then be deployed on the edge, potentially. Strategically, in robotics, I believe
we'll see deployment of such systems first in areas where you don't want to send humans. Firefighting, I think, is a very good example: when the risk of deploying the system is way under the benefit of deploying it. It's going to be the case in manufacturing as well, because there are places where you just want the factory to be dark. And I think that's where a lot of the value will be created, I would say, midterm. And then maybe long term, you have things that are sitting in your house. But you know, it's a bit dangerous to have some pretty strong thing out there.
>> And so, the same way we've been waiting for self-driving cars for the last 15 years, we'll probably be waiting for humanoid robots in the house
>> for a meaningful time. And before that, what we'll see is at-scale deployment in manufacturing, and that will take the right software platform, and that's the software platform that we're building.
>> Okay. All right. Really the last one. We've talked a lot about AI in business. Some businesses have gotten a lot out of it, some have not. Clearly there's potential, but also just a ton of investment. What do you think about the bubble question? Are we in a bubble right now?
>> Well, we're in a setting where we need a lot of infrastructure. So we need to invest, and that's what we do in Europe, for instance. But then the viscosity of adoption in enterprise is high, in that it takes time to understand how to build the software. It takes some building. You can't buy off-the-shelf solutions and then trust that you're going to make immense progress in your productivity. That has been the disappointment a lot of enterprises went through in the last two years. So there's some building to be done: you maybe buy the primitives, buy a certain number of factorized functions, but then you need to bring your own knowledge onto it. So it takes some time. You need to learn how to build, and then you need to learn how to reorganize, and that takes even longer, because the teams are going to change. You need less management, because you need less infrastructure to circulate information, because AI allows information to circulate faster. Certain functions are going to disappear, certain functions are going to grow. So there's just a lot of work to be done on reorganizing things, and it will take years. And so the question is: the infrastructure investments being made today, are they going to create long-term value in two years, in five years, or in 10 years? That defines whether some people are losing money or making money. That's the problem, and we don't really know. So maybe people are overinvesting, maybe people are underinvesting. Some people will certainly lose money, and some people will certainly miss opportunities as well. But today, I would say my view is that we're maybe overinvesting a little bit, and overcommitting a little bit, not Mistral, but some others, because we see how complex it is to actually create value in enterprises. But eventually we'll get there. Eventually the entire economy is going to run on AI systems, that's for sure, but it might take 20 years, because it's actually fairly complex.
>> All right, the website is mistral.ai. Our guest has been Arthur Mensch, the CEO of Mistral. Arthur, thank you so much for coming in.
>> Really appreciate being here. Thank you for hosting me.
>> You bet. All right, everybody. Thank you
for listening and watching, and we will
see you next time on Big Technology
Podcast.