AI's Research Frontier: Memory, World Models, & Planning — With Joelle Pineau

Channel: Alex Kantrowitz

Published at: 2026-02-06

YouTube video id: nlSK8NA8ClU

Source: https://www.youtube.com/watch?v=nlSK8NA8ClU

Where is the cutting edge of AI research leading today, and how are some companies already putting it into action? Let's talk about it with Cohere Chief AI Officer Joelle Pineau right after this.
Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. Today, we're going to look deep
into the state of AI research, where the
cutting edge is leading, whether there
are limitations with the current
methodologies, and how some companies
are already putting this technology into
action in a practical way. We're joined
by the perfect guest: Joelle Pineau is here. She's the Chief AI Officer at Cohere. Joelle, welcome to the show.
>> Thank you. Glad to be here.
>> So, for those that don't know Joelle, she is a researcher who's been at this for a long time. You and I met maybe a month after ChatGPT was released, when everybody was asking whether AI was sentient. At that time you were the head of the Fundamental AI Research division at Meta. You're also a professor at McGill, and currently you're the Chief AI Officer at Cohere. We've had Aidan Gomez on the show. He founded the company in 2019, and he's also one of the authors of the "Attention Is All You Need" paper, which basically kicked off the generative AI moment. So Cohere is six or seven years old at this point, for the kids out there. It's raised $1.6 billion, it's worth $7 billion, and it sells AI to enterprise. So that sets the stage.
>> Yes.
>> Let's talk a little bit about AI research. There's so much discussion. People have been talking about whether AI research is going to hit a wall, and about these new methodologies: things like putting reinforcement learning on top of large language models, going through reasoning, teaching the models to use different tools. There are so many different opinions of where to focus right now. So what, in your opinion, is the cutting edge of AI research, and where do you think it's going to lead?
>> Well, I'm certainly not
worried about research hitting a wall. There are so many questions that we need to work on right now, and I'd separate it into two interesting angles. One is: what are the right problems to be solving right now? What are the things that the current generation of models can't do? And then there's a question of how we go about it: what's the hypothesis that may give us the clue to how to solve some of these problems? So in terms
of what problems to solve, I think an important one is: what do we do about memory? Machines have the ability to remember tremendous amounts of information; you're just stocking it in there. The hard part is knowing when to pull on what piece of information to make a prediction, to generate information, to reason. And so having this ability to be a lot more selective about all the information you've seen in context is super important. And already transformers were
an important piece of that. You know, "attention is all you need"? Well, it turns out it's not all you need. You need a little bit more than that. You need the ability to reason about information at different time scales, at different granularities, and so on. So there's definitely a good piece of work to be done there, and that's where the "how" comes in: the choice of architecture, the choice of learning mechanisms, the type of datasets, the type of use cases that we need to look into.
Another big research theme is building world models. We hear a lot about world models, which are essentially the ability to take in all this information and predict the effect of actions. So when we talk about causality: how are actions transforming the world? This is what a world model should be able to do. World models are absolutely essential when you want to build agents, because these agents are going to take actions which are going to change the world, and you want to be able to predict those effects. So whether you're building robots, where we talk about physical world models, or agents getting deployed on the web, where you need to build digital world models, these agents, whether they're making financial decisions, communicating on your behalf, or organizing meetings, need the ability to predict the consequences of their actions. So that's a big theme, and there's a lot of different hypotheses about how to go about building these world models. And the third theme
I'll highlight, and there are many more, but let me at least pick out my top three, is how we build in reasoning efficiently. Right now a lot of the reasoning methods are still quite exhaustive, based on forward search methods and learning the right reward function. But I do think there's still a "transformer moment" to come for reasoning, for choosing actions, and for being able to plan at different levels of granularity. We're still far away from doing that. And so there are all sorts of ways that it's being baked in, you know, LLM-as-a-judge and things like that, where AI systems give feedback to AI systems in order to train them. That's still very early days.
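The "LLM as a judge" setup Pineau mentions can be sketched as a best-of-n loop in which one model proposes answers and a second model scores them. This is an illustrative sketch only; `generate` and `judge` are hypothetical stand-ins for real model calls, not any specific API:

```python
# Minimal sketch of an LLM-as-judge loop: one model generates candidates,
# a second model's scores act as the reward signal for picking the best.

def generate(prompt: str, n: int) -> list[str]:
    # Stand-in generator: real code would sample n completions from a model.
    return [f"{prompt} -> candidate {i}" for i in range(n)]

def judge(prompt: str, candidate: str) -> float:
    # Stand-in judge: real code would ask a second model to rate the answer.
    # Here we just score by the candidate id, to keep the demo deterministic.
    return float(candidate.rsplit(" ", 1)[-1])

def best_of_n(prompt: str, n: int = 4) -> str:
    candidates = generate(prompt, n)
    # The judge's feedback selects among candidates: AI grading AI output.
    return max(candidates, key=lambda c: judge(prompt, c))

print(best_of_n("Plan a trip"))  # the judge prefers candidate 3 of 0..3
```

In practice the judge's scores can also be logged and used as a reward for fine-tuning the generator, which is the "AI systems give feedback to AI systems" training setup she describes.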
>> Okay, I want to dig into a lot of what you just said. Let's start at the beginning. Let's start with memory. Are memory and continual learning two sides of the same coin? I mean, there's this idea that the models can search the web and find something in a session, but as soon as you close that session, they forget it. And I guess the reason why I'm going there is because a way that some people have suggested solving both of these is just making the context window massive and then becoming efficient in the way that you navigate it.
>> Yeah.
>> What do you think about that hypothesis?
>> Um, the two concepts are related, but they're not exactly the same. Memory is really about how you address what information to pull in, in the context of the task you're trying to solve. Continual learning makes the assumption that the context keeps on changing; therefore, what you've learned keeps on changing. So there's a notion of non-stationarity that is really key to continual learning. I confess I have a little bit of trouble with continual learning as a concept, because I feel the community has never been able to nail how to articulate the problem in a way we all agree on. Everyone who works on continual learning takes a different flavor of it, which makes it, at least in my eyes, and I haven't worked a lot in this area, a little bit hard to know whether we're making progress or not. Memory is a little bit more standardized. The tension there is really a question of efficiency and relevance: you don't want to be just remembering everything. And so how we articulate the tasks, and how we measure whether you're doing better, is a little bit more standardized.
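The non-stationarity she points to can be made concrete with a toy example, a sketch of my own rather than anything from the conversation: a model fit once on old data goes stale when the distribution shifts, while an online estimate that keeps updating (continual learning in miniature) tracks the change.

```python
# Toy illustration of non-stationarity: the data's mean shifts partway through.
# A frozen model (trained once) drifts out of date; an online estimate adapts.

old_data = [1.0] * 50           # distribution before the shift
new_data = [5.0] * 50           # distribution after the shift

frozen_estimate = sum(old_data) / len(old_data)   # trained once, then static

online_estimate = frozen_estimate
alpha = 0.1                      # step size for the running update
for x in new_data:
    # Continual learning in miniature: update after every new observation.
    online_estimate += alpha * (x - online_estimate)

frozen_error = abs(frozen_estimate - 5.0)   # stays at 4.0: the model is stale
online_error = abs(online_estimate - 5.0)   # shrinks toward 0: it adapted
print(frozen_error, round(online_error, 3))
```

The open problem she describes is that there is no agreed benchmark for this: each paper picks its own kind of shift, so it is hard to compare "online estimates" across the field.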
>> Okay, let's go. We're going to touch on both of those now and then we'll keep going down the list. With continual learning, maybe I'm so far removed from it that I'm not struggling with it. So I'll give you my caveman thought about what this is, and you can help us break it down a little bit. The problem has been articulated that the models don't change as they go. I mean, think about how powerful it would be if, let's say, the GPT model, which is speaking with 800 million people a week, or maybe more by the time this comes out, could internalize those conversations and learn from the discussions that it's having. It might be scary, actually. I agree with you that we're not at the wall, but the question is: is there going to be enough data to keep making these machines smarter? As they have these conversations, that opens up the ability to continue to grow and learn, but the model stays static despite all these conversations it's having with people. Isn't that the problem?
>> And I mean, don't get me wrong. I absolutely believe we need to address the fact that these models need to keep on evolving; I have no doubt about that. I just mean that right now, the progress in the research community that's working on continual learning isn't necessarily connecting to the work that's going on on scaling. Now, the models that are released do keep on evolving. The generative models we have today, whether it's ChatGPT, whether it's Gemini, whether it's the Command models that the Cohere team is building, keep on improving. It's just that we don't necessarily let them improve online; we ship, at definite times, a release of a model which has particular characteristics. The advantage of doing that, frankly, is you can really test the model before you put it out there. You can put it through its paces in terms of performance, in terms of safety, and so on. And I would be a little bit reluctant to just let the model keep learning on its own, because the learning can go very, very fast, and you can switch out of a mode that seems completely reasonable very quickly, which we have seen a few times in the past.
>> Yeah, I think we might be thinking about one of the same instances, when Microsoft had this bot called Tay.
>> Yes.
>> I'll tell you a story. I actually broke the news of Tay, that Microsoft had this great bot that spoke with people; I wrote the first story about it when I was at BuzzFeed. I pinned it to my Twitter profile, went to sleep on the West Coast, and woke up to all these messages saying, "Hey, that chatbot you wrote about, the fun teen chatbot, is actually espousing Nazi ideology. You might want to unpin that tweet." And it was because it kept learning. So okay, maybe continual learning, if it's done, has to also be done with some sort of fine-tuning where you want to make sure of the behavior. Maybe it's preemptive fine-tuning, even.
>> Well, let's not release continual learning till we've achieved continual testing.
>> That sounds like a very reasonable plan. All right, memory. What makes it so difficult? I'll tell you one story.
>> My Friday co-host, Ranjan Roy, and I both went into Gemini on Google, in Gmail, and we asked: can you find the first email that I ever sent to my wife?
>> Okay.
>> Couldn't do it.
>> Yeah.
>> Is that because there are so many emails in there that applying AI to try to figure out what conversations have been had is difficult? Or is it kind of a product problem from Google? Why is memory so difficult, and how is the research community going to tackle this?
>> I mean, it's a little bit difficult to diagnose just from your description. I feel like a surgeon who's on the phone hearing a description of the patient. So,
>> Have you asked ChatGPT about your symptoms?
>> So, I won't necessarily venture a precise diagnosis for your case. But nonetheless, I don't think it's that difficult to figure out. I mean, I'd have to know what information the bot is pulling from, right? Just in terms of visibility and privacy: did you give it access to all of the information it needed to answer that? That's the first one. And to go back to what we're building at Cohere, we actually do a lot of deployments on site, so sometimes it's just a question of not having activated access to the right information. You need to figure out whether that access is there. And there are all sorts of reasons you may not want to give the bots access to all of your information all the time. So that's one practical consideration. The other one is
retrieving the right information. Did the query match how the information was encoded? Because in most of these systems, you may not want to just leave the information in raw form; it gets very expensive. You're one person, but at the scale some of these companies are operating at, you have to compress it, which we often do with embeddings. So you create embeddings of this representation, and it may not have embedded the information properly. And then there's retrieving that information: maybe it retrieved 10,000 different items and didn't rank this one close to the top, and so it didn't generate the right response. It could be that it knows of it; it just didn't show up at the top. So there are a few different reasons which make it hard: access to the information, encoding that information, and then retrieving the information at the right moment.
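The three failure points she lists (access, encoding, ranking) all show up even in a toy retrieval pipeline. The bag-of-words "embedding" below is a deliberately crude stand-in for a real embedding model:

```python
# Toy retrieval pipeline: encode documents, embed the query, rank by similarity.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Crude stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "quarterly budget spreadsheet",
    "first email I sent to my wife",
    "team meeting notes for friday",
]
index = [(doc, embed(doc)) for doc in docs]   # encoding step

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)                          # query must match the encoding
    ranked = sorted(index, key=lambda d: cosine(q, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]     # ranking step: only top-k survive

print(retrieve("email sent to my wife"))
```

Each stage can fail in the ways she describes: if a document was never indexed (access), if the encoding loses the relevant words (embedding), or if `k` is small and the right item ranks 10,000th (ranking), the system "knows of it" but never surfaces it.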
>> But when this stuff works, it's pretty magical. I was just in Claude, actually, and I noticed that Claude's memory capabilities have really improved. I love to upload the transcripts of my interviews and just get a grading out: give me a rating on a variety of metrics. You decide, I tell the bot, you decide.
>> Do you agree with the ratings that the bot is giving you?
>> Definitely.
>> Okay.
>> Usually.
>> Well, you've trained it well.
>> I did. So some are good, some are bad.
I actually had Gemini do a bunch of ratings, and it was like five out of five on all categories, and I was like, that is wrong. Then I went to ChatGPT and Claude, and they were actually much more reasonable about it. But one of the interesting things Claude did when I asked it this week: it started comparing the interview to the other interviews I had done.
>> Okay.
>> And it said, you know, you actually hit better points on this one, and this is why this one didn't resonate, in my opinion.
>> Did you benchmark it with a sample of your audience?
>> That's probably next, because I'll take data out of the podcast analytics and drop it into these bots, and it's going to be able to cross-reference. So when it works, it's magical. And you've identified this as one of the areas where AI research really needs to concentrate; this is the cutting edge.
>> How good can this get? Do you think it's at a moment of real progress, or is it sort of party tricks to be able to get Claude to do the things I talked about?
>> Is the question about rating specifically, like analyzing the information and distilling some feedback?
>> More about the memory, the fact that it can call back memory in particular.
>> No, I mean, we're making good progress on that. Extending the context length is kind of the easiest way to go about it, but there's quite a bit of progress being made on this.
>> Okay, let's talk about reasoning. You mentioned reasoning as a cutting-edge moment. The problem is efficiency. Is that really the issue here? I mean, reasoning is where the model basically goes step by step: it tries to answer, checks the answer, tries a different answer, then eventually decides, okay, this is probably what they want, and then it spits something out.
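That propose-check-retry loop can be sketched directly. `propose` and `check` here are hypothetical stand-ins for a model's sampling and self-verification steps:

```python
# Sketch of a reasoning loop: propose an answer, check it, retry on failure.

def propose(question: str, attempt: int) -> int:
    # Stand-in for model sampling: each retry explores a different answer.
    return attempt * 2

def check(question: str, answer: int) -> bool:
    # Stand-in for self-verification (e.g. re-deriving or testing the answer).
    return answer == 6

def reason(question: str, max_attempts: int = 10):
    for attempt in range(max_attempts):
        answer = propose(question, attempt)
        if check(question, answer):           # keep going until a check passes
            return answer
    return None                               # give up: no verified answer

print(reason("what is 2 * 3?"))  # finds 6 on the fourth attempt
```

The efficiency concern is visible even here: every failed attempt costs a full propose-and-check pass, which is why exhaustive forward search gets expensive fast.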
>> Yes, I mean, it roughly happens this way. I think the challenge is really being able to plan at different levels of temporal granularity. So, in terms of how you execute actions: let's say you're planning a trip. You're not going to start by thinking about which shoes to put on for the trip. You're going to start by thinking, roughly what season, roughly what part of the world do I want to visit? You start from the top level, and then you take it down a notch: okay, you've identified a rough time and a rough place, so let's get more precise on the time and the place, and maybe the activity, and who you want to go with. And then you take it down another notch, and that's when you start booking your reservations and so on. But sometimes you'll hit a blocker on the reservation: you can't get the flights or the hotel you want, and then you'll pop back up and say, do I change my dates? Do I change my place? Do I change who I go with? I'm not going to bring the kids, because then we have more options. So we can pop back up in terms of level of resolution. That's the part that the reasoning models don't do. They do really well at one level of granularity. So you've got a robot, you give it all these motions for the hands and the body, and it can plan to control the motors at that level of granularity. But going back and forth between different levels of resolution of action is really hard. In technical terms, we call it hierarchical planning. It's really hard to do that decomposition and keep the information relevant as you go back and forth.
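The trip example maps onto a toy coarse-to-fine planner with backtracking: commit at the coarse level, refine, and pop back up when a fine-grained step is blocked. The destinations and the blocked flight are invented purely for illustration:

```python
# Toy hierarchical planner: choose a coarse option first, refine it, and
# backtrack to the coarser level when a fine-grained step (booking) fails.

destinations = ["tokyo", "paris"]                     # coarse-level options
flights_available = {"tokyo": False, "paris": True}   # invented blocker

def book_flight(destination: str) -> bool:
    # Fine-grained step: may fail, forcing a pop back up a level.
    return flights_available[destination]

def plan_trip() -> list[str]:
    for destination in destinations:          # coarse level: where to go
        steps = [f"choose {destination}"]
        if book_flight(destination):          # fine level: reservations
            steps.append(f"book flight to {destination}")
            return steps
        # Blocked: discard this branch and pop back up to try another option.
    return []

print(plan_trip())
```

The hard part she describes is exactly what this toy hides: in a real planner the levels are not hand-coded, and deciding what information to carry up and down between them is the open problem.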
>> Is that just a limitation of the large language model? Because the fact that an LLM can even do this in the first place, when, again, it started with predicting the next word...
>> Do it at the word level. Right.
>> And out of the word level, you do get the higher level.
>> It is really impressive. I think that's the part that probably shocked a lot of people. Back in 2023 or so, they expected that as you're generating tokens, you're not going to be able to generate big ideas or a bigger plan. And yet, it's pretty remarkable that it does. Which is why you get different opinions: some people think, hey, it's already impressive, let's just keep on pushing this way of doing things and we will unblock this, and other people are a lot more skeptical that you'll achieve it.
>> Explain that a little more. So as it's typing, I mean, I think Andrej Karpathy basically explained that the transformer is a computer, and every time you
>> generate a new token, you're going through a piece of computing. So the more you type, the bigger the computer that you use.
>> Yes. I mean, the more information goes in, the bigger your representation is.
>> Yeah.
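The "bigger computer" point can be made concrete with a back-of-the-envelope count: attention compares every token with every other token, so the score matrix alone grows quadratically with context length. This assumes nothing about any particular model:

```python
# Attention's score matrix is (n tokens) x (n tokens): every token attends
# to every other. Multiply-adds for the scores alone scale as n^2 * d.

def attention_score_ops(n_tokens: int, d_model: int) -> int:
    return n_tokens * n_tokens * d_model

short = attention_score_ops(1_000, 64)
long = attention_score_ops(2_000, 64)
print(long // short)  # doubling the context quadruples the score computation
```

This is also why "just make the context window massive", discussed earlier as a memory fix, runs into cost limits without more selective mechanisms.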
>> Okay. But are you saying that as this happens, the computer is effectively already thinking ahead? I'll give one example, to go back to some Anthropic research: they published this amazing research where they asked Claude to write a poem.
>> Yeah.
>> And as it's writing the first line, it's already activating features in the model that are thinking about what rhymes with it.
>> Yeah.
>> Which is amazing, because again, it's technology that predicts the next word. But as it's predicting the next token, it's already thinking about the next sentence, which to me is just mind-boggling.
>> Yeah. And this is why, to some degree, the emphasis on code, and the ability to build representations of code and generate code, is so interesting. When you look at code, and for people who've programmed before, the code has that hierarchical structure encoded in it. Anyone who looks at a bunch of code, even in a language they don't understand, understands the notion of functions and variables and libraries and so on. Those different levels of granularity of the project are encoded in there. And so there's some hope that by training enough on code, the machine essentially infers these kinds of structural cues.
>> Fascinating. So, I mean, you talked about the fact that this technology is able to do things that, given the architecture, you wouldn't think it's supposed to do. That's the part that makes my head explode a little bit. Same with video models and image models. And by the way, one of your former colleagues, Yann LeCun, would always talk about how, and I know he has some criticisms of video models, but to be able to generate AI video, you really have to be able to predict and plan what's going to happen in the physical world.
>> Absolutely.
>> And there's some embedded intelligence that even leading researchers, I don't think, fully get. When you, for instance, ask a model, to use Yann's favorite example, to drop a pencil, there are so many permutations of where that can go. And now the models, without having had physics lessons, understand that it drops, maybe hits the table, and might bounce.
>> Yep, because it's seen enough data from dropped objects that have these kinds of behaviors. But try to predict the behavior of a similar object dropped on a different planet, and probably the prediction is wrong, because all of the data was taken with our gravity constant.
>> Yes. I mean, I will say, as I'm talking about this, I did just see a generated video where a man's fingers came out of a styrofoam cup as he was holding it. So there's room to
>> A lot of room for improvement.
>> Now, there is some talk; Demis Hassabis was on recently talking about how Google's video models in some ways have these world-model capabilities, that they do understand the physics. And you brought up world models as another area where this technology really has the potential to grow. It's the cutting edge,
>> still kind of undefined.
>> Yeah. I will say, going back to the caveman here, I'm a little bit confused. For instance, one of the examples you brought up earlier was that if you want a model to be able to go out and complete financial transactions, and understand the implications of financial transactions, it has to know how the world works.
>> Yeah.
>> But can't you just teach that in text? Can't you teach it, in text or even number logic: if you use my credit card and buy anything online, I will go bankrupt, and therefore don't do it? Why does a model need to understand gravity to learn these basic rules of the way the world works?
>> Well, this is why earlier I distinguished between physical world models and digital world models. It's possible that you can build really effective web-based agents that don't understand the concept of gravity. And it's possible you can build physical world models for robots that don't need to understand the functioning of the banking system. So you can define the word "world" as a contained environment. But if you want to deploy the agent in that environment, then it does need to understand the rules of that environment quite well.
The challenge is getting enough coverage of data for all the possible futures, all the different ways that the world could evolve subject to various events happening. So a lot of the cases today where it's actually most beneficial are where there's a place for the human at the table. I'll give you an example. People talk a lot about using chatbots for customer service: you should just plug them in and they will answer all your questions, they'll be available 24/7, and so on. In reality, and there will of course be many chatbots deployed for these kinds of cases, one of the use cases we've seen that works really well is to have the bot pull together all the relevant information. You do customer service, and rather than just following a script as a chatbot, you pull together all the relevant information from many different sources: the documentation that accompanies the system, the case file on the client, the problem description that you have. You pull all of that together, then you pose a diagnostic, then you pose a few suggested actions, and then you keep a human in the loop to validate the plan and carry out the action. These are more complicated cases than just your cell phone plan or something like that. But in those cases, what would have taken a long time, maybe half an hour to pull together all that information and distill it for a human, you can now reduce to a 20-second analyze, verify, and carry out the action. So if you have that ability to combine the human and the AI agent, you often get much more powerful results. And it means that if your world model isn't complete, the humans in the loop figure out the pieces that are missing. They give that extra information, and then you bring that information back to train your agent. Then you get continual learning.
>> There you go.
>> We're getting there. We're getting
there.
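The customer-service pattern she describes (gather, diagnose, suggest, then let a human validate) can be sketched as a simple pipeline. The sources, the diagnosis logic, and the approval rule below are all invented placeholders, not any deployed system:

```python
# Sketch of a human-in-the-loop agent: the bot assembles context and proposes
# actions; a human validates before anything is carried out. Rejections are
# kept as feedback, which is where the continual-learning loop would close.

def gather_context(case_id: str) -> dict:
    # Stand-in for pulling from many sources: docs, case history, description.
    return {"case": case_id, "docs": "router manual", "history": "2 outages"}

def diagnose(context: dict) -> str:
    return "firmware out of date"              # invented diagnostic

def suggest_actions(diagnosis: str) -> list[str]:
    return ["push firmware update", "schedule follow-up call"]

feedback_log: list[tuple[str, str]] = []       # rejected plans become feedback

def handle_case(case_id: str, human_approves) -> list[str]:
    context = gather_context(case_id)
    actions = suggest_actions(diagnose(context))
    approved = [a for a in actions if human_approves(a)]
    for a in actions:
        if a not in approved:
            feedback_log.append((case_id, a))  # signal for retraining later
    return approved

done = handle_case("case-42", lambda a: "firmware" in a)
print(done, feedback_log)
```

The design point is the one she makes: the human validation step both caps the damage an incomplete world model can do and generates the corrective data that could eventually train the agent.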
>> Do you buy that the models need to understand gravity for AGI to be reached? I mean, there are basically a couple of schools of thought: that you could train AGI on bits, you know, letters and stuff like that, or images; and then there are others that believe you really need these models to understand not just the rules of poker, but what happens when a person puts their hand on a poker table. What do you think?
>> Yeah. I mean, I tend to place my bet not on the idea that we're going to reach a single superintelligent agent, but on the idea that we are much more likely to live in a future where there are going to be many agents for many things. Some agents will absolutely need to understand gravity: if we're going to have physical robots moving around in the world, hitting objects, picking up objects, and so on, they will need to understand that. Other agents, dealing for example with our digital life, may not need to understand that. And we also need a protocol for these agents to interact with each other and talk to each other. So I actually think that's a much more likely scenario than having the über-agent that needs to understand everything and have a fully encapsulated world model.
>> There's a popular thing that AI lab leaders have been saying recently. They've been talking about how there's a capability overhang: how the AI technology can do a lot more than it's being used for. Do you believe that?
>> Um, absolutely. Yeah.
>> Say more about it. What do you think is not being done that could be done?
>> I see it every day. I mean,
I'll open up a little window: one of the reasons I was super excited about joining Cohere is that it's one of the few places where we have a team that does research, so I get to see day-to-day what's happening in research. We have a team that does modeling, so I get to see the models we're building and look at the full spread of evaluations. And we have a product, an agentic platform, that is going to real clients. So you get to see the whole thing. And I see things our models can do, and things we've built into the products, and then we go out and there are a lot of customers that are not using the full functionality, for all sorts of reasons. So I think there's a big gap between what we have in terms of capacity and what's being deployed right now. Sometimes the reasons are capacity questions. We talk a lot about superintelligence and big models; in reality, paying customers want a good trade-off between performance and efficiency. So we'll train bigger models, but we'll deploy smaller models, because that gives the trade-off: good enough intelligence to get the job done. And I'm like, well, we could give you so much more, and they're like, no, it's good enough. That's a perfectly rational position for them to take. So some of that gap is for efficiency reasons. Some of it is also because you're going into organizations which have systems and processes in place, and sometimes there's a mismatch between what those processes are set up to do today versus what would be, I think, a more welcoming environment for an AI agent. And then the other one is that often there's a lot of intelligence that is not encoded. The agents plug into a bunch of internal systems and leverage all the business intelligence, with privacy and security considerations. But sometimes there are big pockets of information that we're not leveraging right now, and if we connected into that, we would be able to do a lot more. So that impedance mismatch, in terms of information sharing from the organization or the individual to the AI, is another case that leaves a lot of machine intelligence on the table.
>> So we're going to talk about enterprise in a moment, but let me ask you one question about how this applies to consumers. Obviously, we've talked about a lot of technology, and the vision within the big tech companies is to have a universal assistant, something like Apple Intelligence or Alexa+. Both of them have rolled out in their own way, and Meta has their own product, Google has their own product, but none of these are lighting the world on fire. Do you think this is another example of a capability overhang, or is it that the technology is just not there yet?
>> Uh, I think both are true. People have basically been promised superintelligence, so they are expecting magic out of these AI systems. It is not magic. Um,
and so I would say there's a big gap between expectations and what they can do today. And then there's also a mismatch between what people try to do and what might be the strengths of these agents. I compare it a little bit to working in a team: you get a new teammate, and on day one you may not know exactly what this person is capable of and not capable of. It takes some time working together. Sometimes that person gets a lot better when you give them a lot more information, and sometimes you discover they have a skill you didn't know about. But at the end of the day, that person often isn't able to do everything everywhere all at once. So I think both these things are true at the same time.
>> Yeah. There's also, I mean, a lot of corporate politics. I just wrote this
>> Of course.
>> I wrote this story recently in Big Technology, and actually, you're in a great position to talk about this, or give us the real story here. From my vantage point, there are basically two trajectories that a lot of companies are on. I'm not talking about your customers, but if you think about companies overall, many of them have struggled to put this technology into place. But individuals are starting to see the benefit. So you actually have these companies with pilots that are not getting into production, but then you might have somebody lower down using Claude Code who's actually getting things done. So what do you think about that, and what do you think it means if we end up seeing that divergence continue?
>> I think that is absolutely true. We see this all the time, even within our own companies. People's ability to leverage the technology varies a lot. The reality is we are moving towards a world where there's going to be more and more of this technology, and so the people who have the ability to understand and leverage it are going to have an edge.
>> Okay, I agree. All right, last question before we take a break and go on to some more of the practical applications, some of the Cohere stuff. I still can't wrap my head around the fact that
>> the AI labs are so close together in terms of the technology they produce. One builds some innovation; the next has the innovation. One seems to leap ahead; then the next seems to leap ahead.
>> Can you envision a scenario where one of the labs just hits on something and can actually open up a lead against the others, or is it going to be neck and neck forever?
>> I think it's really hard to keep ideas in a box,
>> especially because in many ways these ideas reside in people's heads. And you've seen as much as I have the movement of people between these companies. They're always ping-ponging back and forth, and they carry the ideas with them. Even if the code stays on one side, once you've seen some insight, you can't unsee it.
>> right
>> And so they may need to reimplement. They may need to articulate it in different ways. They may give it a different name. But ideas just circulate; you can't keep ideas in a box. And that's why, honestly, for many years I've been such an advocate for open science. I just don't believe you can keep these ideas boxed in unless you're willing to keep people boxed in, which we are not willing to do. So I don't think we have a way to close off the ideas. We should embrace the fact that when you let ideas circulate, all of us progress faster.
>> And then the question is, let's say all these labs do reach superintelligence. It's been asked: well, you can't hoard it, so where's the economic value in developing it?
>> Yeah, we're still in the very early days of the technology, and we're in even earlier days in terms of what the dominant economic models are going to be, what the right business strategy is in the age of AI. I think we need to give ourselves the time to experiment. You know, we now have 30 or so years of perspective on the internet and its economic impact, and it's going to take a number of years before we figure that out here. But often those who develop the technology are not necessarily the same as those who scale it, versus those who actually commercialize it, versus those who control and regulate it. So there's a pretty complex ecosystem that is all going to arise out of that.
>> Okay. Well, on the other side of this break, we're going to talk about some real economic impact of this technology already, and talk a little bit about what Cohere is up to.
>> Um, and then we'll cover a lot more this episode. We'll be back right after this.
And we're back here on Big Technology Podcast with Joelle Pineau, the chief AI officer at Cohere. And of course, this is part of our Davos series that we're hosting at the Qualcomm House here in Davos and running over the weeks following. So Joelle, it's great to have
you. Let me give you what I've gathered as the use cases in business for AI, and you tell me if I'm missing any, and then maybe what you think is the most valuable. All right, I wrote four down. One is external chatbots, the customer-engagement type of chatbot, the type Bret Taylor talked about at Sierra. The other is internal knowledge: a company has knowledge that's all fragmented, and maybe there's a bot that lets you query that internal knowledge.
>> Third is papering over systems that
don't work. I don't think that needs
much more explanation. It's like the
story of
>> skeptical about that but still
>> and then the fourth is automation.
>> Yeah.
>> Am I missing any big categories as far as AI in business, and where do you think the real value, or the biggest category, is right now?
>> I think there are different ways to slice it, and that's a perfectly reasonable way. Another way I've seen it sliced is between predictive AI and generative AI versus agentic AI, which is a whole other level of opportunity. And then the other way I've seen it sliced is by application domain: what AI is going to do in healthcare, what it's going to do for scientific discovery, in banking, in the public sector, and so on. That's the other way people have looked at the different classes of opportunity.
>> And so what do you think the biggest is?
>> There is so much potential that I hesitate to pick one. I will say, quite frankly, where Cohere has placed its chips, and the core hypothesis, is on the case of enterprise AI that needs really high privacy and security guarantees.
>> Okay.
>> I think there's a big cluster of applications, falling a little bit into the second category that you outlined, where you have a lot of internal business intelligence information, perhaps fragmented, and you want to be able to leverage all of it to empower your employees. In that case, especially when the information is something you don't want popping up on the web through an API, there's an opportunity to build agentic systems that work in-house with the local data, that inform the employees, and that are essentially close partners to the employees.
>> Can you give me a use case or a case study?
>> Yeah. We do a lot of work, for example, in financial services, because as you can imagine, a lot of that data is quite sensitive. A very concrete use case we're seeing is financial analysis. We have people whose job it is to advise various clients, and they need to pull on a diverse set of data: what's the information that's relevant to this particular customer, what's relevant in terms of the current landscape and the possibilities, and so on, and then pull all of that together to make up a personal financial plan for a client. That's the kind of application this technology can make much easier. You can then query your plan and decide: do I have enough information? Do I need to gather more sources? And you can combine the internal with the external information, but the output stays private. It stays secure. It stays in the hands of just the people who need to see it.
>> You know, I'm glad you brought that up, because I was asked recently by someone in the financial services industry: what's going to happen to entry-level employees who were doing a lot of that collating and pulling in of external information?
>> And I didn't have a great answer, because
>> you know, you pay entry-level employees less than your standard employees, and you anticipate there's going to be some learning on the job and some productive
>> things, and now the question is, what are these people going to do
>> if AI can do it for them?
>> If these entry-level employees are able to use AI properly, they're skipping ahead to the level where they can actually be fully functioning analysts. They can essentially do 10x the job with the tools, and so their ability to deliver value to the employer has just been magnified by giving them the AI tools.
>> So is the threat then really to the middle, the people who are mid-career? I mean, it's the old story of the social media intern who comes in and all of a sudden is managing PR or marketing for a company. Is it the Gen Z kid who knows how to prompt and can use Cohere, and all of a sudden the person who's been doing things a certain way for 15 years has to look over their shoulder?
>> I do think that whenever you introduce a completely disruptive technology, that is a lot of what you see. You see the younger generation, for whom the technology is native and very intuitive, learn how to use it very quickly, and that just makes them so much more effective and productive, while folks who are not able to engage with the technology as quickly find themselves at a disadvantage.
>> I just remember being early in my career, and maybe this is why I didn't last very long at a company and had to go start my own, but you know, having the energy and wanting to do things, and
>> if I would have had something that could build a prototype, and I could bring that to the meeting and show it, as opposed to asking, can I have a couple hours of a developer's time to work on this side project,
>> that would have changed things.
>> Absolutely. And to be honest, that capability is afforded to anyone in the company, right? It's not just the more junior staffers who have access to it. It's also the people in leadership positions, who, instead of writing out a memo, can suddenly go out and produce a full-fledged prototype. They don't need 10 staffers to help them produce it. They have an idea, they can quickly prototype it, and they can send that to the team to get moving on a project. So I think that kind of capability is going to open up new ways to set up projects across the organization.
>> This Claude Code thing has been interesting to watch.
>> Yeah.
>> It went almost overnight from something that would autocomplete developers' code to something that will go out on the internet and do things and build things to accomplish specific tasks. So, this idea of AI systems going out and doing things: on one hand, I see the story, and I've said this on the show a couple of times, of the former Amazon CEO of Worldwide Consumer going out and vibe coding a CRM over the weekend.
>> Um, you know, that's cool, but I'm also just like,
>> you know, how real is that? So I'm curious. Oh, okay. You're giving me a look like, yes, it is real.
>> Well, I think that goes back to my point, right? Those who are able to prototype in this way. It doesn't mean that whatever you've vibe coded in a weekend suddenly turns into a $100 million business. But it's a way to communicate your intention to your teams. So as long as you have good ideas, you're able to share them in a way that's much more real, and to start prototyping much faster.
>> Now, there are other ways to communicate your ideas, other ways to direct your teams, but that suddenly opens up so much more. It is interesting how AI, and AI is many things, is becoming a communication technology. It's becoming that, right? And this is the kind of thing that the new coding agents are opening up.
>> Does Cohere have a version of this that it's working on?
>> I would say yes, we're working on the same kinds of capabilities; we're building core generative models. It's a bit of a different experience right now that we're offering with the North platform, but there is a lot of that sort of collaborative work, a lot of going out and essentially deploying agents and leveraging external information. So there are elements that are similar, but we're less focused specifically on coding use cases right now.
>> Cohere obviously has raised a lot of money, more than a billion dollars. But, I'll just draw it out: OpenAI sneezes that over a weekend. You have a world now where AI is being developed by a handful of very big companies.
>> Your former employer Meta is a big player, Amazon, Google of course, Microsoft, and then OpenAI and Anthropic, which raise an entire year's worth of VC money in a single round now. What do you think about the risk of so much of this being concentrated in so few hands?
>> Honestly, I do think it's beneficial to the ecosystem to have multiple groups who are able to develop models and deploy them. Just to give you a concrete example, Cohere was very early to work on multilingual models: the ability to understand and digest information across multiple languages, 20, 30 and more. We had a line of models that is really well respected, open-sourced, and so on. It's just not on the radar of some of these companies that are very focused on English-centric information. That's completely fine; different spaces for different companies. But when we get into markets in Asia, when we get into markets in Europe, suddenly it matters to have a model that is actually state-of-the-art across languages, or across the local language. And so that opens up a completely new market. Right now the opportunities are so broad that there's space for up-and-coming players to keep on growing, to have very healthy revenue, to bring in talent, to actually build new things that are different from what some of these other companies are building. So I tend to think it's super healthy to have more rather than fewer companies building AI. And going back to my idea of many different AIs doing many different things, even at the company level this is what's happening: there are a number of players building different things and learning from each other.
>> But the fact that big tech has so much of it
>> is not a worry?
>> It doesn't worry me.
>> Okay.
>> And I mean, you know, we could have a much longer discussion about it,
>> but it doesn't, and it doesn't cause me to lose any sleep. What we're building at Cohere has an amazing path to being successful.
>> Okay. By the way, I mentioned Anthropic and OpenAI, which Microsoft and Amazon and Google have massive stakes in.
>> And there are many more.
>> Yes. Somebody who does worry, Dario Amodei from Anthropic, well, maybe not about the fact that he got all those billions from Google and Amazon, but he does have some things to say about the big tech companies. Here's a thing he said recently: some of these companies are essentially led by people who have a scientific background. That's my background. It's Demis's background for Google DeepMind. Some of them are led by the generation of entrepreneurs that did social media. There's a long tradition of scientists thinking about the effects of the technology they built and not ducking responsibility. I think the motivations of entrepreneurs, particularly the generation of social media entrepreneurs, are very different. The way they interacted with, you could say manipulated, consumers is very different. So basically, I don't think he wants them running AI.
>> Strong opinions from Dario, which is, I guess, not out of character for him.
>> Do you think that's a legitimate concern?
>> Because it's so interesting: you're a research scientist
>> who also worked at a social media company. So if anyone knows the answer to this, it will be you.
>> I mean, I think what's really important is that no one is going to be good at everything,
>> right? The question is, how do you get others in the room to advise you on how to build something great? You know, I spent some time at Meta, and I would say there was a very strong channel from researchers to the leadership team, and those opinions were brought into the room. I've certainly seen that at Cohere, where the research team, the modeling team, the product team, there's a room where all these points of view can come together. I go back to this thought: I can't expect one person to have all that information, but as long as they're building up teams that are diverse
>> and that are listening to these diverse voices, they will build better products at the end of the day.
>> Okay, on that note: as ads have started to enter the picture
for generative AI, there's a wonder among outsiders like me about whether these companies will do things like engagement-maxing and try to optimize for time spent so they can get those numbers up. I don't want to ask you whether you think that's going to happen or not, but I want to ask you, as a researcher, whether it's even economically feasible. Are the models now efficient enough that serving a visit with an LLM, a visit where, let's say, you show an ad, could be a profitable thing? Or is it still so expensive to serve these use cases that this notion of engagement-maxing doesn't make sense because it's not economically valid?
>> I mean, in general, through trial and error we find economic models that are viable, right? That's still how it works. So it depends a lot on the pricing model and so on.
>> Expensive ads to buy.
>> But you know, it depends on how the model is set up. I don't know that this is the way it gets rolled out initially; we'll have to see what the progression of that is. I do think we have the ability to tailor content based on the information that is there, and that is a lever that's going to continue to be used from an economic point of view.
>> AI sovereignty, before we go. Countries, and institutions like banks, are starting to build their own models rather than relying on off-the-shelf stuff. So talk a little bit about it, because this is something Cohere is working on and something I don't know a lot about. There is this push, or at least it's being discussed. What is AI sovereignty, and how is it playing out?
>> Yeah, sovereignty has been used in a few different ways. In some cases it means the ability to have your own model. In the case of financial services and banks, that is definitely something they spend a lot of time investing in, thinking about, and looking for solutions for. They see the opportunity. They were early adopters even of previous-generation AI technologies, predictive models, statistical models, and so on, and they see this as the natural evolution. So they're pretty advanced in terms of their sophistication and their readiness for AI, and often they have the means to invest in it. So we're definitely seeing a lot of interest there. Often, though, the talent gap makes it a little bit harder for them. Sometimes they've tried to build their own models, and then they come to us looking for solutions that are a little more mature out of the box. And so we have really solid partnerships going on there.
The other way to think about sovereignty that we're hearing a lot is that companies want a robust plan for AI. They want options: they may be using one model, but they actually want to have another model to be able to compare, to benchmark. If access to one model gets cut off or too expensive, they have another one. So there's an aspect of sovereignty that's really about building a robust strategy. It's not about just using your own model or using one thing; it's about having control over your access to the technology.
>> Yeah. As you speak about it, it's just amazing to me how fast this has moved. Going back to our first meeting in 2022, the fact that it's now 2026, it's been three years and change,
>> but it's just a world of difference year to year.
>> Yeah.
>> So, last question to you: can the pace keep up?
>> It is still moving very fast on so many fronts, you know, just the size of the investments. On adoption, I think we are so early in the curve, and that's going to be the next challenge: how do we enable this technology to disperse through society, through the business world, into people's lives, and how do we do that successfully? But yeah, especially when it comes to commercialization and adoption, it's really very early days. So there's a long way to go.
>> Seriously. Well, Joelle, we've spoken a handful of times. I always appreciate how you're able to take these big things that a lot of us are wondering about and ground them in the research and the practical side of things. So you're always welcome on the show, and thank you for coming on.
>> Thank you. Always a pleasure.
>> All right, everybody. Thank you so much
for watching and listening and thank you
to Qualcomm for having us here at the
space at Davos and we'll see you next
time on Big Technology Podcast.
>> All right,
>> thank you.
>> That was great. Thank you so much. Thank
you.
>> Thanks everyone.