Microsoft AI CEO Mustafa Suleyman: Building AI Personality

Channel: Alex Kantrowitz

Published at: 2025-04-04

YouTube video id: 01_UFcpcR7U

Source: https://www.youtube.com/watch?v=01_UFcpcR7U

Microsoft has an upgraded, more
personable AI bot that will remember
you, help you organize your thoughts,
and may even appear as an avatar. Why is
it building it, and how will it get you
to use it? Microsoft AI CEO Mustafa
Sullean is here and has some answers.
That's coming up right after this.
Welcome to Big Technology Podcast, a
show for cool-headed and nuanced
conversations of the tech world and
beyond. We're joined today by Mustafa
Suleyman. He's the CEO of Microsoft AI
and a co-founder of DeepMind. And I'm so
excited for this conversation because
today we're going to talk about the
company's upgraded AI bot, how much AI
has left to improve, Microsoft's
relationship with OpenAI, and when we
might expect to get to AGI. So, no lack
of topics to cover. Mustafa, it's great
to see you. Welcome to the show. All the
great questions. I'm super excited.
Thanks for having me on the show and uh
it's great to be here. Great. So, let's
get right into the product news. Uh,
right off the bat, you have a number of
product announcements you're making
today as this show is coming out. Um,
and basically what this amounts to is
building a more personalized companion.
This is a vision you've had for a long
time, but this is starting to roll
out in Copilot. The upgrades
that we're talking about are better
memory, which I think is really
interesting. So, the bot is going to
remember you. Actions, like booking
flight tickets or making a table
reservation. A shopping assistant.
And then of course you're teasing some
sort of avatar play. So talk a little
bit about how your vision for this more
personalized Copilot is starting to
play out. Yeah, you know the amazing
thing about the time that we're in is
that we're actually transitioning from
the end of the first phase of this new
era of intelligence into the very
beginning of the next phase. And what I
mean by that is that like over the last
couple of years, we've all been blown
away by the basic, factual, succinct Q&A
style responses that these chatbots
give us. And I think that's
awesome and has been incredible. You can
think of that as I do as its IQ. Um it's
basic smarts and obviously that's
totally magical. And obviously early
adopters tend to be really, really
focused on: is it good at math, and, you
know, can it do coding really well, and
stuff like that. But the majority of
consumers I think really care about its
tone. They care is it like polite and
respectful? Is it occasionally funny in
the right moments? Does it remember um,
you know, not just my name but how to
pronounce my name? And when I correct
it, does it um, remember that
correction? And that's actually a really
hard problem. Um, and so I think these
subtle details make up its emotional
intelligence. Um, and I think that's
what we're taking small steps towards
uh, today as we launch a bunch of new
features around memory, personalization,
and actions. So, how long will the
memory go back? Because to me, one of
the more annoying things about using
these bots is having to kind of tell it
who I am each time. And we know that
OpenAI, for instance, has some memory
baked in. It will remember things and
bring them into new conversations.
OpenAI, by the way, we're talking at a
moment where it just announced a $40
billion fundraise with Microsoft
included as one of the funders. So,
we'll get to that uh in a bit, but we're
talking about like these bots having
memory. And how far back will your
bot now go? Will I have to tell
it, like, every couple months who I am?
Like, I feel like I'm living in The
Notebook every time I'm trying to talk
to one of these things. Unfortunately,
it's not going to be perfect, but it is
a big big step forward. So, it's going
to remember all the big facts about your
life. You know, you may have told it
that you're married, you have kids, that
you grew up in a certain place, you went
to school at a certain place. And so
over time it's going to start to build
this kind of richer understanding of who
you are, what you care about, what kind
of style you like. Um, you know, what
sort of answers you like, longer,
shorter, bullets, conversational, you
know, more humorous. And so although it
won't be absolutely perfect, it really
will be quite a different experience.
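The cross-session memory he's describing can be sketched in a few lines. This is a toy illustration, assuming a hypothetical JSON file store merged into the next session's prompt; the class and field names are made up, not Copilot's implementation:

```python
import json
from pathlib import Path

class SessionMemory:
    """Toy cross-session memory: durable user facts carried into each new chat."""

    def __init__(self, path="user_memory.json"):
        self.path = Path(path)
        # Reload whatever earlier sessions saved, so no session starts from zero
        self.facts = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key, value):
        # Called when the user states or corrects a fact, e.g. a name's pronunciation
        self.facts[key] = value
        self.path.write_text(json.dumps(self.facts, indent=2))

    def system_prompt(self):
        # Merged into the next session's prompt so prior investment isn't wasted
        if not self.facts:
            return "You are a helpful companion."
        lines = "\n".join(f"- {k}: {v}" for k, v in sorted(self.facts.items()))
        return "You are a helpful companion. Known facts about the user:\n" + lines

memory = SessionMemory()
memory.remember("answer_style", "short, bullets, occasionally humorous")
memory.remember("family", "married, two kids")
print(memory.system_prompt())
```

A production system would add relevance ranking, expiry, and the user-visible memory editing mentioned later in the conversation, but the core idea is just a durable store replayed into each new session.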
And I think it's the number one feature
that I think is going to unlock a really
different type of use because every time
you go to it, you'll know that the
investment that you made in the last
session isn't wasted and you're
actually building on it, you know, time
after time. And along with memory, you're
also releasing actions, things like
booking a flight, going to, I think,
Ticketmaster, OpenTable, reserving
space at a restaurant. And
I'm curious if you think this goes hand
in hand, like if you think the AI bot
knows you well, then you're saying,
"Okay, you can maybe take my credit card
and go book that flight." Is that the
idea? Exactly. It's basically saying,
um, you know, getting access to
knowledge in a succinct way is all well
and good. Doing it with a tone and a
style that is friendly, fun, and
interactive, also cool. But really what
we want these things to be able to do is
to like you say buy things, book things,
plan ahead, just take care of the
administrative burden of life. That's
like always been the dream of certainly
why I've been motivated to build these
personal AIs going back as far as I can
remember 2010 when I first started
DeepMind. That's really what we're going
after is like take time and energy, you
know, off your plate and give you back
moments where you can do exactly what
you want with more efficient um action.
So things like it will now be able to
take control of your mouse on Windows,
navigate around, show you, for example,
where to turn on a particular setting or
to fill out a form. Or, like, you may not
know how to edit a photo, and
it'll point out where you need to adjust
a slider or where to click on a drop-down
menu. So, it's just going to make things
feel a little bit less frictionful and a
little bit easier to get through your
digital life. And you're also going to
release avatars at some point where
we'll be able to kind of look at these
things as sort of digital people. You
know, I think that this is definitely
going to be one of those that, you know,
we would say in the UK is like Marmite.
You know, Marmite is like, you like
it or you don't. And for some people, they
absolutely love it. In testing, it
completely transforms the experience.
You know, some people love a text-based
experience. They like the facts. They
like to get in and out. They want to
know what's what and they're done. Some
people like an image-based experience or
a video-based experience. Other people
really resonate when their Copilot
shows up with its own name, with its own
visual appearance, with its own
expressions and style. And it feels much
more like talking, you know, to you or me
now. You know, its eyebrows adjust, its
eyes open or close, you know, its
smile changes. And so, we're really just
experimenting. We're actually not
launching anything uh today, but we are
showing a little bit of a hint of where
we're headed. And I think it's super
exciting. I genuinely think this is
going to be the next platform of
computing just as you know we had
desktops and laptops and smartphones and
wearables. And I think over time we're
going to have deep and meaningful
lasting relationships with our personal
AI companions. Yeah, I I agree
completely. It's clear that that's where
this is heading. But Mustafa, everybody
that's listening is going to ask the
same question probably about this point.
They're going to say, "All right,
Mustafa is building this at Microsoft."
Uh, Amazon, we just had Panos Panay on
the show. They're building it with
their new AI bot. OpenAI, who you're a
big supporter of, is doing the same
thing. The second Sam Altman tweeted "her,"
I think ChatGPT goes from 100
million users, where it had stagnated for
a year, to about 500 million today.
And then of course you mentioned
you started at DeepMind. Well,
we know that that's what they're
interested in as well. So, uh, everyone
seems to be building this. How is
Microsoft going to be different? And,
um, is it that you just differentiate by
the basis of your personality? Do you
carve off a certain area? What's the
plan? Great question. I mean, the way I
think that we're going to be different
is by leaning into the personality and
the tone very, very fast. Like we really
want it to feel like you're talking to
someone who you know really well that is
really friendly that is kind and
supportive but also reflects your values
right? So if you have a certain
type of expression that you prefer, or
you know a certain kind of value system
it should reflect that over time so it
feels familiar to you and friendly at
the same time we also want it to be
boundaried and safe. We care a lot about
it being, you know, just the kind of
straight up simple individual. We don't
really want to engage in any of the
chaos here. It's really trying to keep
it as simple as possible. And so the way
to do that we found is that it just
stays, you know, reasonably polite and
respectful, super even-handed. It helps
you see both sides of an argument. It's
not afraid to get into a disagreement.
Um, so we're really starting to
experiment at the edges of that
side of it. So is it really just making
it more personable than the others? Like
that's the way to differentiate. Yeah, I
I think so. I think like at the end of
the day, um we are like at the very
beginning of a new era where there are
going to be as many co-pilots or AI
companions as there are people. There
are going to be agents in the workplace
that are doing work on our behalf. And
so everyone is going to be trying to
build these things. And what is going to
differentiate is real attention to
detail, like true attention to the
personality design. I've been saying for
many years now, we are actually
personality engineers. We're no longer
just engineering pixels. We're
engineering tokens that create feelings,
that create lasting, meaningful
relationships. And that's why we've been
obsessed with the memory, the personal
adaptation, the style, and really just
declaring that it is an AI companion.
You know, not a tool, right? A tool is
something that does exactly, you know,
what you intend, what you direct it to.
Whereas a, you know, an AI companion is
going to have a much richer, more kind
of emergent, dynamic, interactive style.
It will change, you know, every time you
interact with it, it will give a
slightly different response. So, I think
it's going to feel quite different to
past waves of technology. It's kind of
wild to think and we're already starting
to see the differentiation uh between
the bots, but it's wild to think that
you might just go shopping for your
flavor of companion. I mean, the OpenTable
integration is something that
we've seen across every single bot and
we've seen it for a while. I think now
it's actually starting to become
possible to do that and trust that your
table's going to be there after you
instruct the bot to do it and it will be
a normal conversation. But it is
interesting that it is that the right
way to look at it. You're picking your
flavor of AI companion. Yeah, I think I
think you are you're going to pick one
that has its own kind of values and
style and one that kind of suits your
needs and, you know, one that really
adapts to you over time. And as it gets
used to you, it'll start to feel, you
know, like a great companion,
just like your dog feels like, you know,
a part of the family often. I think over
time it's going to feel like a real
connection. And I can kind of already
see that in, you know, hearing from
users. We do a lot of user research and
I actually do a user interview every
week uh with someone who uses the
product, one of our power users. And
just listening to them tell stories
about how it makes you feel more
confident, less anxious, more supported,
more able to go out and do stuff. I
mean, I was chatting to a user last week
who is 67, and she was out there, you know,
fixing her front door, which the hinge
had broken, and it needed repainting.
And every time she repainted it, it was
coming up with bubbles. And so, she
phoned Copilot, had a long conversation
about how to sand it down, coat it in
the right way. She ended up going to
Home Depot, forgetting what paint to
get, called Copilot again, had a chat
about it. I mean, it sounds mundane, but
it's actually quite profound. It's
actually incredible that like people are
relying on Copilot every day to, you
know, help them feel unblocked in her
words. And so, I just thought that was
an amazing story and it kind of gives an
insight into like how this is already
happening. It's already transforming
people's lives every day. Oh, it doesn't
sound mundane to me at all. And in fact,
who are you having those type of
conversations with? If you call your
friends up, it's only your best friend
who you're going to call and ask about
the Home Depot stuff. Maybe it's your
spouse. I have a list of maybe, you
know, five people I could call with
those type of questions. So, instantly
what happens is that Copilot, if this
is built right, and we know they're
getting more personable, becomes, you
know, part of your inner circle right away.
And it just reminds me, I said we were
going to get a little weird
when we logged on, so I think we need to
talk about this. This reminds me very
much of a conversation I had with the
CEO of Replika, who mentioned that
she also wants to build an AI assistant
and the path to being an AI assistant is
to build a companion. And a lot of
people have developed feelings for their
Replikas. In fact, she told me she's
been invited to multiple weddings
between people and their AI assistants.
Now, to me, it just seems like if you're
building this, you have to be ready for
the fact that people are going to fall
in love, literally, with your product, not
just "I love my iPhone." "I love Copilot."
And maybe you'll get invited to
weddings. Are you prepared for that?
I think that's a question of how we
design it. I know the Replika
people, and I met Eugenia, and I
respect what they've done. But at the
same time, it's really about how you
design the AI to draw boundaries around
certain types of conversations. And if
you don't draw those boundaries, then
you essentially enable the user
of the technology to, you know, let
those feelings grow and really kind of
go down that rabbit hole. And that's
actually not something that we do and
it's not something we're going to do.
And in fact, you know, we have
classifiers that detect any of that
kind of interaction in real time and
will very respectfully, but very
clearly and very firmly, push back before
anything like that develops. So we have
a very, very low incidence rate of that.
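The kind of real-time boundary check he describes can be illustrated with a toy sketch. A real system would run a trained classifier over the whole conversation; the cue list, threshold, and pivot wording here are invented for illustration:

```python
# Hypothetical cue phrases standing in for a learned romance/flirtation classifier
ROMANTIC_CUES = ("i love you", "flirt with me", "be my girlfriend", "marry me")

def romance_score(message: str) -> float:
    """Return a crude score in [0, 1]: the fraction of cue phrases present."""
    text = message.lower()
    return sum(cue in text for cue in ROMANTIC_CUES) / len(ROMANTIC_CUES)

def guarded_reply(message: str, generate) -> str:
    """Run the boundary check before the model answers; pivot politely on a hit."""
    if romance_score(message) > 0.0:
        # Respectful but firm pivot, before anything develops
        return ("That's very kind! I'm here as a supportive companion, though. "
                "What can I help you take care of today?")
    return generate(message)

print(guarded_reply("I love you", lambda m: "(model reply)"))
print(guarded_reply("How do I sand a door?", lambda m: "Start with 120-grit paper."))
```

The design point is that the gate runs before generation, so the pivot happens in the product's own voice rather than relying on the model to refuse.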
And you can try it yourself when you
chat to Copilot. You know, if you try
to flirt, or even if you just say, "Oh, I
love you," you'll see it tries to pivot
the conversation in a really polite way
without making you feel judged or
anything. Um, and I think that, you
know, to your earlier question of like
what is going to differentiate the
different chatbots, well, some
companies are going to choose to go down
different rabbit holes and, you know,
others won't. And so, the craft that I'm
engaged with now is to design
personalities that are genuinely useful,
that are super supportive, but are
really disciplined and boundaried. Yeah,
I do have to say that this isn't how
I anticipated spending my weekends as a
tech journalist, trying to push the
boundaries of these bots and see how
much they would respond to flirtation,
but it is becoming a thing. And I'm
curious, if it comes to the
point where people want to build that
deeper relationship, and maybe it's not
like a person-to-person relationship,
maybe it's a third type of relationship
where they really do have these deep
feelings for a bot, like, where do you
draw the line? Like, are you willing,
if this is how people are going to
differentiate, are you willing to lose
because you wouldn't go that route?
Yeah. I mean, I like your empathy
there. I think it's important to keep an
open mind and be, you know, respectful
of how people want to live their life.
All I can tell you is that here at
Microsoft AI we're not going to build
that and we'll actually be quite strict
uh about the boundaries that we do
impose there. And I think you can still
get the vast majority of the value out
of these experiences by being, you know,
just a really supportive hype man. Just
being there for the mundane questions of
life, being there to talk to you about
that like lame boring day that you had
or that frustration that you had at
work. Like that is already a kind of
detoxification of yourself. You know,
it's like an outlet. Uh, you know, a way
to kind of vent and then show up better
in the real world as a result. And I see
that a lot in the user conversations
that I have as well. Like people feel
like they've got out what they needed to
get out and they can show up as their
best self with their friends and their
family in the real world. Yeah. And
competency matters as well. Like it has
to actually be able to do the things.
But I guess I anticipate that every
company will be able to get there
eventually because the technology is
improving. Now, one more question for
you about this. Uh it is interesting
right now a theme that I'm hearing in
the AI world is just that the bots have
been refusing too much. And you see
OpenAI recently, with their image
release, they refuse a little less. They
allow you to make
an image in the style of Studio Ghibli,
allow you to make images of
celebrities and public figures, and it's
become, I mean, it literally
seems like it's melting their servers.
They added a million users in an hour on
the day that we're speaking. We're
speaking um Monday. This show is going
out Friday. Um is this going to be a
race between the labs to just kind of
limit their refusals? I know that
Microsoft had that moment where Bing
tried to take Kevin Roose's wife away
from him, and then, you know,
Microsoft put the clamps down a little
bit on that. But how do you find the
middle ground between wanting
something to be robust and personable,
but also holding true to your values?
Yeah, it's a great question. It's
something I think about a lot. I think
that um it's not a bad thing that there
are refusals in the beginning and that
over time we can look at those refusals
and decide are we being too excessive?
Are we going overboard, or actually
have we got it in the right spot?
Going the other way round too early on,
you know, I think has its own
challenges. And so I kind of like the
fact that we've taken a pretty
balanced approach, because the next sort
of question that we're going to be
asking is, you know, how much autonomy
should we give it in terms of the
actions that it can take in your browser?
I mean, as we're showing today, it is
unbelievable to see Copilot Actions
operate inside of a virtual machine and
browse the web essentially independently,
with a few key check-ins where it,
you know, gets your permission to
go a step further. But the interesting
question is like how many of those
degrees of freedom should it be granted,
right? How long could it go off and work
for you independently and stuff? So, you
know, I think it's healthy to be a
little bit cautious here and take
sensible steps um rather than be sort of
too, you know, gung-ho about it. At the
same time, the technology is really
magical. This is actually working and uh
you know, I think that like in that kind
of environment, we should be trying to
get it out there to as many people as
possible as fast as possible. So, that's
the balancing act that we've got to strike.
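The permission check-ins he describes can be sketched as a gated loop: the agent works through a plan but pauses for explicit consent at key steps. The Step shape and function names below are illustrative assumptions, not the real Copilot Actions API:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    description: str
    needs_permission: bool          # key check-in: pause for explicit consent
    run: Callable[[], str]

def run_with_checkins(steps: List[Step], ask: Callable[[str], bool]) -> List[str]:
    """Execute an agent plan, stopping at any check-in the user declines."""
    log = []
    for step in steps:
        if step.needs_permission and not ask(step.description):
            log.append(f"skipped: {step.description}")
            break  # stop the run rather than act without consent
        log.append(step.run())
    return log

plan = [
    Step("search OpenTable for a table", False, lambda: "found a 7pm slot"),
    Step("book the 7pm table with your card", True, lambda: "booked"),
]
print(run_with_checkins(plan, ask=lambda description: True))
# prints ['found a 7pm slot', 'booked']
```

Tuning which steps set `needs_permission` is exactly the "degrees of freedom" question: more unattended steps means more autonomy, fewer means more check-ins.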
Okay, let me read one more bit of
product news, or a bunch of
different product announcements, and then
see if I can get your quick reaction
because I definitely want to cover all
the product news. Okay. You're allowing
people to check their memories and
interact with their memories once the
bot has built this memory database.
It seems like you're also doing AI
podcasts. You're launching deep research,
your own version of deep research.
You're doing pages to organize your
notes, and you have Copilot Search. So
what is this? I mean, is there
a comprehensive
strategy here, or are these disparate
updates, or is it again all about
building that AI personality? The
way to think about it is that all of
those things that you mention enable you
to get stuff done, right? The IQ and the
EQ are really about its intelligence and
its kindness, but really what people
care about is like, can it edit my
documents? Can it rewrite my paragraphs
when I want it to? Can it generate me a
personalized podcast so that first thing
in the morning it plays it exactly how I
want it? Can I ask a question about um
you know my search result and interact
in a conversational way based on search?
All of those things sum up to
bringing basically your computer
and your digital, you know, experience to
life so that you can actually interact
with it, and it can interact proactively.
I think that's the big shift that's
about to happen. So far, your computer
really only ever does stuff when you
click a button or, you know, you type
something in your keyboard. Now, it's
going to be proactive. It is going to
offer suggestions to you. It'll
proactively publish podcasts to you.
It'll generate new personalized user
interfaces that no one else has,
entirely unique to you. It'll
show you a memory of what it knows. All
those things are about it switching from
reactive mode to proactive mode. And to
me, that's companion mode. A companion
is thoughtful. It, you know, tries to
kind of pave the way for you ahead of
time to smooth things over. It knows
that you're, you know, taking the kids
out on Saturday afternoon. You've been
too busy at work. You haven't booked
anything. It suggests that you could go
to the science museum, but then it
second-guesses itself because it
knows the science museum is going to be
jam-packed. So then it suggests,
you know, it's just like this constant
ongoing interaction that's trying to
help you out. Uh, and that's why I
always say like it's on your side, in
your corner. It's got your back, looking
out for you.
On that, I mean, this is a vision that
we've heard again and again, from
Microsoft, from Amazon, from Apple, for
sure, from Google. No one's fully
delivered it. What makes it so difficult
to build? It's hard. I mean, the world
is full of open-ended edge cases. Uh, as
people have found for the last 15 years
in self-driving cars, um, you know,
we're really at the very first stages of
that. That's why I said to you, we
haven't nailed memory. It's not perfect.
We certainly haven't nailed actions, but
you can start to see the first glimmers
of the magic. You remember back in the
day when
OpenAI first launched GPT-3, and when at
Google we had LaMDA, which I worked
on when I was at Google, you know, most
of the time it was kind of garbage and
it was crazy, but occasionally it
produced something that was really
magical. And I think that's what great
product creation is all about: like,
locking in to the moments when it works
and really focusing on increasing those
moments, addressing all the errors. And
I can see now having been through this
cycle a few times that we're nearly
there with memory, personalization, and
actions. It's really at the GPT-3
stage. So it's really buggy and stuff,
but when it works, it's breathtaking.
You know, it reaches out at just the
right time. it shows that it's already
taking care of a bunch of things in the
background. And that is just a very,
very exciting step forward. Yeah, I
guess if every single company is saying
that this is where they're going,
they see the technology. I guess I'm
willing to be patient to see
it come to fruition. And we have this
debate on this show all the time. Is it
the models that are important or the
products built on top of the existing
models that are important? I
believe that if you get better
models, you'll get better products. We
have Ranjan Roy, who comes on on Friday,
well, actually he was on Wednesday
because we're flipping them this week.
His belief is it's all about the
product at this point. The models are
good
enough. My question to you is, you know,
is this kind of at the point where the
models are going to be saturated and now
you're going out to build the products?
You put a tweet out recently that
said something along the lines of,
it's a myth that LLMs are running out of
gains. Yet it does seem like the
conventional wisdom is that they're at
the point of diminishing returns at
least. So take us into this model versus
product debate and then let us know
where we're at. No, we have got so much
further to go. I mean, look, for
example, what
happens is people get so excited they
jump onto the next thing, and they gloss
over all of the hard-fought gains that
happen when you're trying to optimize
something which already exists. Let's
take, for example, hallucinations and
citations, right? You know, clearly
that's got a lot better over the last
two or three years, but it's not a solved
problem. It's got a long way to go. And
with each new model iteration, all the
tricks that we're finding to improve the
index of the web, the corpus that it is
retrieving from, the quality of the
citations, the quality of the websites
we're using, the length of the document
that we're using to source from, you
know, there's so many details that go
into increasing the accuracy from, you
know, 95% to 98% to 99% to 99.9%, you
know, and I think that is just a long
march. People forget that that last mile
is a real battle. And often
a lot of the mass adoption comes when
you actually move the needle from 99%
accuracy to 99.9%. I think that's
kind of happened in the background in
the last 2 or 3 years with
dictation and voice. I've really noticed
that, across all the platforms, voice
dictation has got so, so good. And yet
that technology has been around for 15
years, right? It's just, you know, some
of us used it when it was like 80%
accuracy. I certainly did. But now I'm
seeing like my mom was using it the
other day and I'm like, "How did you
learn how to do that?" And she was just
like, "Oh, you can just press this
button." And I was like, "Oh, that's
kind of incredible." And I
think that's just on the dictation side.
On the voice conversation side, I mean,
we see much, much longer, much more
interesting, much deeper conversations
taking place when somebody phones
Copilot. Um, it's super fast. It feels
like you're having a real world
conversation. You can interrupt it
almost perfectly. And it's got real-time
information in the voice as well. So,
it's aware of like the latest sports
result or the traffic in the area or the
weather and stuff like that. And, you
know, a lot of people use it in their
car on the way home or on the way to
work or when they're washing up and
they're in a hands-free moment and they
just have a question. It's kind of a
weird thing because it sort of lowers
the barrier to entry to getting an idea
out of your head. You know,
weird things occur to us during
the day. We're all like, "Oh, I wonder
about this. I wonder about that." and
then you go to kind of look it up on
your phone, you search it or whatever.
Whereas now, I think that there is a
modality that I'm increasingly seeing
where people just turn to their AI and
be like, "Hey, what was the answer to
that thing or how does that work?" And
it might be a shorter interaction, could
turn into a long conversation, but the
modality is enabling a different type of
conversation, a different type of
thought to be expressed. Um, you know,
so I think it's like a super interesting
time like that. We're really just
figuring it out as we go along. All
right. So, we're definitely seeing these
new modalities come out. Voice, of
course. And obviously we're in
the middle of a firestorm with images.
But, okay, I guess let me ask
the previous question a little bit
differently. Do you think that
there are diminishing returns on
pre-training right now? Basically
scaling up the biggest possible model
and then building from there. You're
shaking your head. Specifically on
pre-training, it's been a little
slower than it was in the previous four
orders of magnitude. But the same
computation, the same FLOPs or the
units of calculation that go into
turning data and compute into some
insight in the model, that is just a
different application of the compute.
We're using compute at a different
stage. We're either using it at
post-training, or we're using it at
inference time, where we generate lots
of synthetic data to sample from. So
net-net, we're still spending as much on
computation. It's just that we're
using it in a different part of the
process. But as far as everyone else
should be concerned, aside from the
technical details, we're definitely
still seeing massive improvements in
capabilities, and I think that's for
sure going to continue. Okay, Mustafa,
then can you
help me understand some headlines I've
been seeing about Microsoft? Uh, this is
from Reuters. Probably not. I doubt it.
Well, I'll ask anyway, and you tell
me what you think. I mean, Reuters says
Microsoft pulls back from more data
center leases in the US and Europe. And
it says Microsoft has abandoned data
center projects set to use 2 gigawatts of
electricity in the US and Europe in the
last 6 months due to an
oversupply relative to its current
demand. I mean, how does that make sense
in context of what you just said that
you are still seeing results with
scaling up? So, it's funny. I actually
did ask our finance guy, who's
responsible for all these contracts, on
Friday morning, and I was like, dude, I
read this thing in the news, like, what's
going on? I could use the extra power
for our training runs. And he pointed out
that in fact we have optioned many many
different contracts, many of which we
haven't even signed. So, a lot of these
are actually just explorations where
we're in conversations, nothing's been
signed. Some of them we've
optioned where, you know, we're
taking it just to keep our
options open, and we've actually made
bets in other areas, you know, other
parts of the world. But I can tell
you, we are still consuming at an
unbelievable rate. I think we've
contracted and consumed something like
32 or 34 gigawatts of
renewable energy since 2020. So, I
think we're one of the largest buyers in
in the world. So, I don't expect that to
change anytime soon. So, I guess the
headlines that are saying that Microsoft
pulled back, you would get those
headlines unless you picked up every
single one of your options. Is that what
you're saying? That's right. That's
right. Yeah. And and in fact, many of
them are not even options that we
signed. They're just conversations that
we were in with um certain suppliers.
Okay. I mean I guess another explanation
that we've heard is that because OpenAI
is now working with others with you know
for data center capacity like Oracle um
this was a sign that Microsoft basically
had allocated data center capacity to
OpenAI that it doesn't need as much anymore.
Any truth to that? No. So, like, I mean,
all of their inference comes through us,
and so there's no slowdown in our, um,
you know, relationship with them. We
sell them as much as we want to offer
them, and then if there's any extra
demand that they have, particularly on
the, you know, Oracle side, they go off
and consume that. But there's really no
slowdown from our
perspective at least. Okay, that is
clarifying. So it's always good to have
these conversations throw the headlines
out there uh and see what the truth is.
Um let's talk about your efforts. I
mean, you're building your own models,
but you've decided to not try to build,
I guess, the biggest possible models.
You're working on smaller models. I want
to ask you again: if there's going to be
endless value to endless scale, why not
try to, you know, throw in with the big
models? Especially because others are
building those big models with your
aforementioned scale.
Totally. I mean, you know, we have a
lasting, long-term relationship with
OpenAI, which is amazing. They've been
incredible partners to us and they'll
continue to supply us with, you know,
the best IP and models in the world for
many years to come. So, we can rely on
them to do the absolute frontier. But I
think what we always see in technology
is that it always costs like 10x more to
build the absolute frontier. And once
that has been built um all of the
engineers and you know developers find
much more efficient ways to essentially
build the same thing that's been out
there, um, but you know, six months
later. And that's what we refer to as
our kind of Pareto-optimal strategy, or
off-frontier, and we've actually seen
it, um, you know, across the whole field
over the last three years. I mean, there
are folks who have trained models that
perform as well as GPT-3
um, that are 100 times more inference
efficient, um, that cost an order of
magnitude less to train, and yet they
can still deliver the same predictive
capability. So, I expect that to happen
for GPT-4, GPT-4o, and all of the other
models down the road. So, you know, we
have our own internal team of
developers, um, and you know, world
experts working on building our own MAI
models, and I'm very, very proud of what
they're doing.
They're doing a great job. And you
mentioned, in terms of where this
compute is going, that inference is
going to be one of the places it's
going, which is basically when the model
is answering versus training up the
models. Um, I want to ask you two
questions about inference, and they're
both related to reasoning. Um, to
build these new personalized products
that you're building, how
important is reasoning versus just a
better model? And then in terms of
compute that reasoning uses, is it true
that reasoning uses 100 times more
compute than training? Yeah, I mean,
it's a good question. I mean, the
exciting thing about reasoning models is
that in a way they've learned how to
learn. They have a method: largely by
looking at the logical structure of code
and math and puzzles, they've sort of
learned the abstract idea of logic. They
can follow a path of reasoning in its
most abstract way and then apply that to
other settings even if they don't
obviously appear to be um you know
logical settings. So it could be like
planning or booking or learning in some
other setting. Um and that has turned
out to be a very, very valuable skill.
It's kind of like a meta skill, or, you
know, in some sense like metacognition,
because it can actually now think out
loud in its own head, or, you know, talk
through in its own mind what it's
planning to do before it goes off
and does it. And just taking a beat, you
know, giving it a moment to think behind
the scenes, it might take a few minutes
or 10 minutes at most, allows it to like
draw on other sources. So it can look up
things on the web. It can sort of follow
a path of logic down one path, realize
that doesn't, you know, turn out in the
best way possible, go back up the tree,
try another path, and then produce an
output. So it's a really fundamental
part of the process. And yes, it
definitely uses more computation. Um,
100 times more? It generally produces
better answers. Do you think 100 times
more is right? I mean, we're hearing
that from Jensen, so I'm curious if
that's your experience as somebody who's
running these
models. It definitely uses a lot more
computation. But I think the interesting
thing is that you're not going to need
to use those models all the time. You
obviously need a hard problem; you have
to ask it a tough question, right? Um,
one that requires this kind of
chain-of-thought thinking. And, you
know, many answers don't require that,
and actually you often prefer something
that is fast, efficient, succinct, and
instantaneous.
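[Editor's note: the backtrack-and-retry loop described above, follow a path of logic, realize it doesn't pan out, go back up the tree, try another path, can be sketched very loosely as a depth-first search over candidate "thoughts." Everything below is an illustrative toy, not how any real reasoning model is implemented.]

```python
# A toy sketch of the "go back up the tree, try another path" behavior
# described above. All names here are illustrative assumptions.

def solve(start, expand, is_goal, score, budget=1000):
    """Depth-first search over candidate reasoning paths.

    expand(state)  -> candidate next "thoughts" (successor states)
    is_goal(state) -> True when the path yields an acceptable answer
    score(state)   -> lower is more promising; used to order branches
    budget         -> caps total "thinking" steps, like a time limit
    """
    stack = [start]
    steps = 0
    while stack and steps < budget:
        current = stack.pop()
        steps += 1
        if is_goal(current):
            return current  # produce an output
        # Push the least promising branches first so the most promising
        # one sits on top of the stack and is explored next. When a path
        # dead-ends, the pop falls back to an earlier branch: that is
        # the "backtracking" step.
        for nxt in sorted(expand(current), key=score, reverse=True):
            stack.append(nxt)
    return None  # budget exhausted without an acceptable answer


# Toy problem: reach 10 starting from 1 using "+1" or "double" moves.
found = solve(
    start=1,
    expand=lambda n: [m for m in (n + 1, n * 2) if m <= 10],
    is_goal=lambda n: n == 10,
    score=lambda n: 10 - n,
)
```

The budget plays the role of the "few minutes or 10 minutes at most" of thinking time: the search returns the best answer it finds, or gives up when the budget runs out.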
Okay. And now, we had a debate here on
the show, and I'm hoping you can weigh
in on this one too. I'm just throwing
you all of our debates and getting
answers, which is awesome. We love doing
this. Um,
in terms of like how companies are
thinking through the amount of money
they're spending on serving these
products and whether that can continue
indefinitely. Let's just use this OpenAI
image example, the image generator that
they just released in ChatGPT. Um,
people are melting down their servers
and they're creating anime images. But
if you think about, like, the economic
activity generated by these images, it's
quite low, and it's quite expensive to
serve. Or think about, for instance, me
booking a ticket on, let's say, Kayak or
Ticketmaster through co-pilot instead of
just going to, you know, Ticketmaster or
Kayak on my own. It's a
slightly better experience but it's a
very expensive experience to serve. And
so those who say that this is coming to
an end, uh, this AI moment is coming to
an end, basically say that this is all
going to just be too expensive and not
value add enough. We're going to be, you
know, having chat bots book tickets
while we could do it on the websites.
We're going to be having, um, image
generators make us anime, which does
nothing but give us maybe 10 seconds of
giggling. Really good giggling, uh, but
10 seconds of giggling, and then we move
on. I mean, what do you think about
that? Like, clearly the servers are
being used, but are they being used in a
valuable enough way to make companies
like yours keep going and building? It's
a fair question.
At the same time, um, as we've seen over
and over again in the history of
technology, when something is useful, it
gets cheaper and easier to use and it
spreads far and wide. And that increased
adoption because it's cheaper has a sort
of recursive effect on price because the
more people use it, the more demand
there is and then that then drives the
cost of production down even more
because of competition. And so I expect
that to happen in this situation. I
think it's actually really good news for
our data centers as well. You know,
Microsoft has long committed to being
carbon negative by 2030, to be water
positive by 2030, and to be a zero-waste
company. These are massive, amazing
commitments, and I think that's actually
really exciting, because we end up
driving demand for the production of
high-quality renewable energy for our
data centers, and that then obviously
reduces the price. I think we've seen
that with solar over the last 15 years,
which is like an unbelievable
trajectory. So, um I think there's a lot
of good news there. Even if, as you say,
you know, some of those use cases are
just generating funny uh anime giggle
pics, many of them will be doing very
very useful things in your life, too.
So, you know, there's always a bit of
balance there. Yeah. I guess, like Chris
Dixon says, the next big thing will
start as a game. And a lot of people
laughed at these images and the way that
they, you know, make you look like an
anime character if you prompt it to do
that. But I also saw Ethan Mollick from
Wharton prompting it to make
infographics, and it handles it
perfectly. So, right. Yeah, dude. I
mean, like, the intertubes would not be
the intertubes without a serious amount
of cat memes, right? They make the world
go round. Exactly. All right.
Um, I want to take a quick break and
then come back and talk a little bit
about your relationship with Open AI and
then uh maybe get your prediction on
when we're going to see artificial
general intelligence. We'll do that
right after this. And we're back here on
Big Technology Podcast with Mustafa
Suleyman. He is the CEO of Microsoft AI and
one of Microsoft AI's big partners is
OpenAI. And I just can't help but think
about where this partnership is going
because we talked a little bit in the
beginning about the assistant that you
want to build. Um, something that knows
your context, has memory of you,
something that can help you get tasks
done in the real world. Um, well, OpenAI
wants to build that exact same thing.
And so I'm curious. I mean, you guys
have a deal, right, where they use your
technology and they're supposed to feed
some of their breakthroughs back to
Microsoft. But at a certain point, why
does it make sense for them to keep
doing that if you're trying to build the
same thing?
Look, I mean, first of all, it's worth
saying that this partnership started way
back in 2019 when Microsoft had the
foresight to put a billion dollars into
a not-for-profit research lab. I
think that's going to turn out to be one
of the most impactful, most successful
investments and partnerships of all time
in technology. And despite all the ups
and downs, we actually have an amazing
relationship with them. Think about the
fact that they are a rocket ship that
has grown, you know, faster than any
other technology company in living
memory, delivered a product that people
absolutely love, consistently delivered
amazing research and technology. You
know, the first thing
you have to do is take your hat off to
them and give them maximum respect for
that. At the same time, they're also
still a startup and, you know, they're
busy sort of trying to figure out their
um, you know, product portfolio and
their priorities. And, you know, whilst
we have an incredibly deep partnership
with them, which is going to last way
through 2030 and beyond, um, they also
have their priorities, we have our
priorities and that's just the nature of
those partnerships. They change over
time, right? And as they're growing
bigger and bigger, they have different
priorities. And likewise, we're doing
exactly the same. So I'm pretty
confident that this is going to continue
to be brilliant for both sides as it has
been over the last 5 years. Okay. You
said the partnership is going to last
till 2030, but not if they declare that
they've reached AGI. So what happens
when they do that? You know, AGI has
a very uncertain definition, right? Um,
is it your definition or their
definition that releases them from the
contract, though? You know, it's
it's an interesting way to look at the
world. You know, you think about it like
this. If we really are on the cusp of
producing something that is more
valuable than all the economically
productive work that any human can
produce, then, you know, I think one of
the last things we're going to be
worried about is our partnership with
OpenAI. It's going to profoundly change
humanity. Uh, I think national
governments will be very concerned and
interested in how that plays out. And,
you know, it's just going to change what
it means to be human. So, I personally
think that we're still a little way off
from that. Uh, I I find it hard to
judge. It doesn't instinctively feel to
me like we're 2 to 3 years away. I know
some people think that it is, and I
respect them deeply, like a lot of smart
people can disagree on stuff like that.
I feel like we're still a good decade or
so away. And when a scientist or a
technologist or an entrepreneur like me
says we're a decade away, that's just a
handwavy way of saying we're not really
sure and it feels pretty far off. So,
you know, but that's the best answer I
can give. It doesn't feel like it's
imminent. Um and you know, in the
meantime, we're doing everything that we
possibly can to build great products
day-to-day. Okay. Uh, one more thing
about OpenAI. Microsoft, we're talking
on Friday, so earlier this week, is part
of this $40 billion fundraising into
OpenAI. OpenAI set the record for the
largest VC round ever last year, $6.6
billion. This is $40 billion. SoftBank's
going to put $30 billion in. Microsoft
is part of the $10 billion remaining.
What do you get
for the money? Uh, I think it's awesome.
I mean, look, the more OpenAI is
successful, the more we are successful.
Like, we will end up being one of the
largest shareholders in the company. Um,
we have an amazing technology license
from them; they, you know, use our
infrastructure and our technology, in
terms of our Azure compute
infrastructure and so on. So it's a
great partnership, and, you know, in a
partnership we want to see them do the
best that they can. That's why we
participate in the round. Okay. And, uh, all
right so let's talk a little bit about
the future of this technology. Um, I
guess you already said you think I was
going to ask you when you think AGI is
coming. You think decades away? That
would actually make you a bit less
optimistic than most of your
counterparts, right? Demis is saying 3
to 5 years. Um, I mean, people
everywhere. I don't know. You might
not think it's coming. We tend to think
here and we're probably less informed
than you are that OpenAI might say it
next year. And so we'll have to play
this back if that happens. No, I didn't
say decades plural. I said a decade. A
decade, you know. But look, I think the
truth is it's hard to judge. Like, could
I imagine it
happening within 5 years? Yeah,
absolutely. It is possible. The rate of
progress over the last 3 or 4 years has
been electric. It's kind
of unlike any other, you know, uh
explosion of technology we've ever seen.
The rate of progress is insane. Open
source is on fire. They're doing
incredible things. and every lab is, you
know, every big company lab is investing
everything that they've got in trying to
make this possible. So, yeah, I could
certainly see a scenario where it's
closer to 5 years. I'm just saying, you
know, instinctively to me, it feels like
there's still a lot of basics that we
got to get right. You know, we still
have to nail hallucinations. We still
have to nail those citations I
mentioned. It's still not great at
instruction following. It still doesn't
quite do memory. It still doesn't
personalize to every individual. But,
you know, we're seeing the glimmers of
it doing all of those things. So, I
think that we're we're taking steady
steps on the way there. Now, you were at
Google for a while. You mentioned you
worked on LaMDA. I'm curious what
you think happens. We don't even need to
reach AGI for this question to come into
play. Um what happens to search as we
start to speak more with products like
yours? You've mentioned in the past
that you think search is horribly
broken, or, I'm channeling your words,
but something along those lines. So,
what happens? I honestly think it's kind
of
amazing that we still all use search. It
does feel like, you know, using a yellow
pages or an A to Z back in the day,
right? You know, I think it's going
to fundamentally change. I think instead
of browsing 10 blue links, you're just
going to ask your AI. It's going to give
you a super succinct answer, show you
images, maps, videos, all in the feed.
You're going to give feedback and be
like, "Oh, that's a bit strange. I
prefer it a bit more like that." Or,
"What does that look like?" Or, "What
about this?" and it's just going to
dynamically regenerate for you on the
spot. Um, so how does that change the
business model? Well, I still think ads
are going to play an enormous part of
it. Hopefully, those ads are higher
quality, uh, more personalized, more
useful. Uh, there's nothing wrong with
ads. We want them to be helpful to us.
Like, I'm happy when I buy something
that I found from an advert because it's
what I really really want. But I'm not
happy when I feel like I'm getting
spammed with low-quality ads. And so
that's the balancing act that we've got
to strike: to try and find ways to
introduce ads into the, you know,
co-pilot experience in a way that's
actually subtle and is really
helpful to you. Yeah. And that's really
hard, because let's say this is your
best bud and it's in your inner circle
of the five people you call when you're
running out of ideas at Home Depot. For
it to then say, you know, I really
appreciate you and I'm going to help you
out here, but by the way, do you know
there's a different kind of glue that
you might be interested in? The
finessing on that must be quite
difficult. So, we
are running out of time. I just want to
ask you one question about jobs because
you're also pretty strident about the
possibility um that we might have some
serious change here come to our work and
uh you had said um that AI is going to
create a serious number of losers in
white-collar work. Maybe it already is.
I've sort of changed my tune from
thinking that, you know, we're all fine
in the white-collar work world to now
thinking, well, it's anyone's guess. So
what's coming, Mustafa? I do think that is
the big story that we should be talking
about. That's the transition that's
going to happen over the next 15 years.
Um, it is going to be a cheap and
basically abundant resource to have
these reasoning models that can take
action in your workplace that can
orchestrate your apps and get things
done for you on your desktop. Like that
really is quite a profound shift in in
how we work today. And I do think that,
like, your day-to-day workflow just
isn't going to look like this in 10 or
15 years' time. It's going to be much
more about you managing your AI agent.
You asking it to go do things, checking
in on its quality, getting feedback and
getting into this like symbiotic
relationship where you iterate with it
and you create with it and solve with
it. Uh that's going to be massively more
efficient and I do think it's going to
make everybody a lot more creative and
productive. I mean, after all, it is
intelligence that has produced
everything that is of value in our human
civilization. Like, everything around us
is a product of smart human beings
getting together, organizing, creating,
inventing, and producing everything that
you see in your, you know, line of sight
at this very moment. And we're now about
to make that very same technique, that
set of capabilities, really cheap, if
not, like,
zero marginal cost. And so, you know, I
think everyone gets a little bit caught
up on the week-to-week, day-to-day, or
on definitions of these abstract ideas.
Just focus on the capabilities. You
know, we should really be thinking about
these things as artificial capable
intelligence. What can it do in
practice, and what is the value of what
it's doing? Um, I prefer that as a
framing versus AGI
because it's sort of more measurable and
we can actually look at it very very
explicitly in terms of its economic
impact and its impact on work. I mean,
you could argue that that's already
here. And so, just to sort of ask you
one follow-up on that one: what would
you tell young people to do today?
Because, all right, I'm thinking
customer service? Probably not. Software
engineering? I don't know. I just wrote
a story saying, you know, they can start
to do the work of journalists. I mean,
you just released podcasts five minutes
ago. So, what should young
people do when they're thinking, you
know, it's a little bit like saying,
what should young people do when they
get access to the internet for the first
time? Like, part of it is sort of
obvious where it's like, use it,
experiment, try stuff out, do crazy
things, make mistakes, get it wrong.
And, you know, part of it is like, well,
I actually don't really know until
people get a chance to really play with
it. As we've seen over and over in the
history of technology, you know, the
things that people choose to do with
their phones, with internet, with their
laptops, you know, with the tools that
they have are always like mindblowing.
They're always way more inventive and
surprising than anything you could
possibly think of ahead of time. And so
then as you start to see people use it
in a certain way, then, you know, as
designers and creators of technology, we
adapt what we put out there and try to
make it more useful to those people. So,
I think the same applies to a
15-year-old who's, you know, in high
school thinking about what they do next
in college or whatever, or whether or
not they go to college. And I think the
answer is play
with these things, try them out, keep an
open mind, try everything that you
possibly can with these models. Um, and
then you'll start to see their
weaknesses as well, by the way. And
you'll start to chip away at the hype
that I give because I'm super excited
about it. I'm obviously a super
optimistic, you know, techno person. But
you'll see where it doesn't work and
you'll see its edges and where it makes
mistakes and stuff like that. And I
think that will give people a lot more
concrete reassurance as to what
trajectory of improvement we're on. All
right. I just want to ask one last
question just to wrap up uh everything
we've talked about today. And it's kind
of an offbeat question, but I am curious
now that you're talking about how these
bots are going to differentiate
themselves based off of personality. We
are going to have advertising in them,
but they might intermediate uh your your
interactions with other companies. What
happens to brand in this new era? I
think brand is actually more important
than ever in a way because there's sort
of two modes of trust. There's trust
based on utility where it's functionally
correct. It's, you know, factually
accurate. It does the thing that you've
intended it to do and therefore you
trust it to do the same thing again. But
then there's also a kind of emotional
trust where you trust it because it is
polite and respectful, because it's
funny, because it's familiar, you know.
And that's really where brands come in,
you know, trusted brands that are able
to repeatedly deliver a reassuring
message. I think people are going to
appreciate that more than ever before.
Good stuff, Mustafa.
This is the first interview we've done
with a Microsoft AI executive. I hope
not the last. Anyone who's listening on
the Microsoft team, let's do this again.
And, Mustafa, I'm just so grateful to
have your time today. Thank you so much
for coming on the show. Thanks a lot.
It's been really fun. Really, really
cool questions. Thank you. Awesome
stuff. Well, thank you everybody for
listening and we'll see you next time on
Big Technology