Anthropic's Co-Founder on AI Agents, General Intelligence, and Sentience — With Jack Clark

Channel: Alex Kantrowitz

Published at: 2024-05-09

YouTube video id: hqB6emwQ-64

Source: https://www.youtube.com/watch?v=hqB6emwQ-64

Anthropic co-founder Jack Clark is here to dive into the company, its partnerships with Amazon and Google, where AI innovation heads next, and plenty more in this mega episode about Anthropic and AI, coming up right after this.

Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond. We're so lucky today to have Jack Clark with us. Here he is: the Anthropic co-founder, formerly a journalist, formerly of OpenAI; we'll get into all of that. He also writes Import AI, a great newsletter all about AI that you can sign up for. We're going to talk with someone who's at the center of one of the big companies working on AI, and just go deep into what this field is doing, where it's heading, and what we should look forward to, and it seems like there's plenty. So, Jack, so great to have you here. Welcome to the show.

Thanks for having me.
Let's just talk broadly about what's happening in the world of AI, because, as somebody who's observing this stuff, it seems like every big foundational research company... By the way, for listeners: Anthropic has a great chatbot out called Claude, which, if you've listened to the show, you know we're fans of, and then also a foundational model in the background that companies can build off of, also called Claude. From the outside it just looks like you guys, Google, OpenAI are all building these models and trying to build better chatbots, and for some reason that's worth, you know, trillions and trillions of dollars. So what are you all doing, where's the competition now, and when are we going to start to see this pay off? Let's
just start real broad.

So, broadly, what's Anthropic trying to do? We are trying to build a safe and reliable general intelligence. And why are we building Claude? Because we think the way to get there is to make something that can be a useful machine that knows how to talk to you, and can reason and see and do a whole bunch of things through text. I mean, you're a journalist, you write a lot; most intelligent things we do in the world at some point hit writing and text and communication. So we're trying to do that.

How does the competitive landscape look? Well, it's an expensive business. It costs tens of millions, maybe hundreds of millions of dollars to train these things now; back in 2019 it cost tens of thousands of dollars. So what we're seeing in the competitive landscape is a relatively small number of companies, ourselves included, competing with one another to stay on the frontier and turn those frontier systems into value for businesses. This is going to be a really exciting and, I'm sure, drama-filled year when it comes to that
competition.

Okay. And the expense: it's compute and talent?

It's, and I say this with love for my colleagues, it's mostly compute. Talent matters, and data matters, but the vast majority of the expense here is on compute to train the models.

Okay, and we'll definitely talk a little bit about where the hardware is going. We're talking in a week where Google, which has put billions of dollars into Anthropic, announced that they have a new Arm-based chip for AI training, and I definitely want to hear your thoughts about that.
Now, you talked at the beginning about how you want to build a general intelligence, which is, basically, I'd love to hear your definition of it, but I think the most commonly accepted definition is a computer that can do basically everything a human can do.

So I think of general intelligence as a system where I can point it at some data or some domain, be that a domain of science or something in business, and I can ask it to do something really complicated, kind of like if I had a really senior colleague and I said: go and figure this out; go and figure out how EU AI policy works post the AI Act, and how we expect it to work for the next five years. That's something a human colleague of mine might do. Today, Claude would not do that super well, but a kind of super Claude, an advanced version, might be able to go and read all of the policy literature that exists, look at all of the discourse around it, and reason about what the policy impact of the AI Act will actually mean and what it will turn into. Similarly, you might ask Claude: hey, what is the impact of rising fertilizer prices going to be on the tractor market? And Claude might read all of the earnings reports of all of the companies and all of the technologies relating to tractors and fertilizer, and come up with a good answer. So a general intelligence is something where I can ask it a really complicated question that requires a huge amount of open-ended research, and it goes and does all of that for me, in any domain. That's, I think, a good way to think about what we're driving towards here.

So let's
take your definition, and we're going to try to poke holes in it in a moment, but let's just take this definition as a jumping-off point for the next few questions. Why can't Claude ingest all that information today, and what are the technical limitations that are stopping it from doing that? And then, do we actually really need a general intelligence if I could, say, just drop those reports into Claude? I mean, that's one of the interesting things about Claude: you can drop anything in there and it will read it. I will probably, after this podcast, take the transcript from Riverside, drop it into Claude, and talk to Claude about this interview, and it will be able to converse with me about it in a pretty impressive way. So why don't you tackle those two, then we'll move on
to your definition.

So, today these systems are very, very powerful, but they're also kind of static. It's like they're standing there waiting for you to talk to them; you come up and give them a task, they go and do the task, but they don't really take sequences of actions, and they don't really have agency. You can imagine that I asked Claude to go and figure out this EU stuff, and today Claude might do an okay job if I give it a bunch of documents. In the future, I want it to not be limited by its context window; I want it to be able to read and think about hundreds, or hundreds of thousands, of different things that have gone on. I also want Claude to ask me clarifying questions, kind of like a colleague, where a colleague comes back and says: hey, you asked me this, but I've actually been looking at all of this generative AI regulation that's come out of China recently, and I think that's going to matter for how the EU policy landscape develops. And then you say: oh, actually, that's a good idea, go and look at that too. That's a kind of agency that today's systems lack. In some sense, we need to build systems that can go from being passive participants that you delegate tasks to, to active participants that are trying to come up with the best ideas with you. That requires us to make things that can reason over much longer time horizons and can learn to play an active role with humans, which is a weird thing to say when you're talking about a machine that you're building. And maybe we can get into it, but one of the challenges in building a general system is that general intelligence comes from interplay with the world around you and interaction with it, and today's systems don't really do that at all; to the extent they do, it's kind of a fiction, and we need to teach them how to do that.

And so how far away are we,
technically, from being able to do this stuff?

I think this year you're not going to see the exact thing I described, but you're going to see systems that start to take multiple actions. You may have heard lots of guests talk about things like agents. I think what an agent is, is a language model or a generative model like what we have today, but one that can take sequences of actions; it can kind of think on its feet a bit more. We're going to start seeing that this year. I would be pretty surprised if, on the order of three to five years, we didn't have quite powerful things that seem somewhat similar to what I've described. But I also guarantee you we will have discovered some ways in which these things seem wildly dumb and unsatisfying as well.

Right, and so you basically also
answered my second question about just dumping things into the bot and talking with it about it. What I'm talking about is thinking way too small; you guys are thinking much bigger.

Yeah, you want the system to, maybe, take in some documents from you, some ideas that you have, and then it goes and gathers its own ideas. Maybe it comes back to you and says: hey, I thought this would be helpful, I did all of this research too. Exactly like when you have a good idea, and you go and do some off-the-wall research, and it helps you solve a problem you were working on which might seem unrelated, because you've done something really creative there.

Okay, and
so then you also sort of touched on where I was going to push back a little bit on your definition, which is that, to have real intelligence of the world, you have to be in the world. We've definitely talked on this show about how large language models are limited because they just know the world of text. So how do you train one of these models to be aware of the world? So much of the knowledge that we have comes just from going out and being in the world, so how do you then train this model to be able to
comprehend that?

So there's a technical thing and then there's a usage thing. The technical thing is: you get the models to understand more than text. Claude can now see images, and obviously we're working on other so-called modalities as well. It would be nice for Claude to be able to listen to things; it would be nice for Claude to understand movies. All of that is going to come in time. But a colleague of mine did something really interesting to try and give Claude context. The colleague, whose name is Catherine Olsson, spent several days talking to Claude, our new model Claude Opus, which is our smartest model, about every task she was doing through the day. It was a giant, long-running chat, and it was her also saying things like: oh, I feel a bit blocked, I need to take a break, could you give me some ideas of what I should do? Or: okay, Claude, now I've done this; I really didn't enjoy this sort of work, but I got through it. You know, being very honest with the bot. And then, at the end of about three days, she said: okay, Claude, I'm going to talk to a new instance of you; can you write a summary of this conversation for the next Claude, so the next Claude knows everything about me and how I like to work and where I get blocked? And Claude wrote a short text summary, which Catherine has now integrated into her own system, so whenever she asks Claude a question she puts this into the context window, kind of like a cheat sheet about her, written by an AI system which she spent a few days working with. How we give these AI systems context about the world is going to be stuff like that: you work with them over long periods of time, they understand you and your context, and then they'll write messages for future versions of themselves. It's like the Christopher Nolan film Memento, where they don't remember exactly where they came from, but they have a
message.

And what is technically limiting them from just remembering us altogether? Or can't you just program that into Claude automatically: take these notes in the background, and then, when the user comes back, just load up the user file?

You could absolutely do that, but I think ultimately you want Claude, or any of these systems, to get smart enough that they know when to do that themselves, where they're like: oh, I should probably write myself a note about this and store it here, or I should write myself a note about that. I think, to some extent, that's going to come through making more advanced systems and eventually seeing when this stuff natively emerges. It'll also come through seeing stuff like what my colleague did, and trying to work out if it's useful and if it's a behavior you want the system to take on.
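The long-running-chat trick described above can be sketched in a few lines of Python. This is a hypothetical illustration, not Anthropic's actual API: `call_model` is a stub standing in for a real LLM call, and the function names are invented for the example.

```python
# Sketch of the "memory note" pattern: have the model summarize a long
# chat into a short profile, then prepend that note to future requests
# so a fresh instance starts with context about the user.

def call_model(system: str, messages: list[dict]) -> str:
    # Stub: a real implementation would call a hosted model here.
    return "User prefers morning deep work and gets blocked on long reports."

def summarize_chat(chat_log: list[dict]) -> str:
    """Ask the model to write a cheat sheet for its next instance."""
    prompt = ("Write a short summary of this conversation for the next "
              "instance of you, covering how this user likes to work.")
    return call_model(system=prompt, messages=chat_log)

def ask_with_memory(memory_note: str, question: str) -> list[dict]:
    """Build a request whose context window starts with the saved note."""
    return [{"role": "user",
             "content": f"Notes about me:\n{memory_note}\n\n{question}"}]

chat_log = [{"role": "user", "content": "I feel a bit blocked, need a break."}]
note = summarize_chat(chat_log)          # the "message to the next Claude"
request = ask_with_memory(note, "Help me plan tomorrow.")
```

The same shape works with a real API by replacing `call_model` with an actual completion call and persisting `note` between sessions.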
Now, in terms of limitations: we have something called a context window. Ours is about 200,000 tokens, and context windows now range up to millions, or tens of millions, of tokens. Think of it as your short-term memory. It all costs money, in terms of the RAM you're using to run the thing, and it's a bit unrealistic. In the human brain we have long-term storage, which we have huge amounts of, and we have short-term storage, which is: if I ask you to remember a phone number, you can remember a small number of digits, maybe not even the whole phone number; I struggle. Our AI systems today are kind of operating with short-term memories that are millions of numbers in length, and it feels very unintuitive. Ultimately we want them to instead be able to bake stuff into some kind of long-term storage, and that's going to take more research and experimentation, because the models are just going to have to be more efficient and more powerful in order to have that memory. That, and, you know, Anthropic
recently released something we call tool use, where we're trying to make it easier for our models to interact with other systems, like databases, for instance. You want the systems to learn to use the systems around them, to be like: oh, I should take this out of my context window and stick it in a database, and then I can talk to it through the API, and stuff like that.

And that's under development now?

Yeah, it's under development. We're trialing it at the moment; we recently had some discussions about the beta, which has just started, and we'll be rolling it out more broadly soon.

So, one more question about this:
there are some things that you're going to talk to the bots about, and some that you would never really talk about in the real world. One example: in the early days of ChatGPT, we had Yann LeCun here talking about how dumb these bots were, in his opinion, and he had me do this experiment where I asked ChatGPT: I'm holding a paper up from two sides, and I let go of one side; where does the paper go? And ChatGPT was unable to figure that out, because that was just not represented in text. Do you think that to get to general intelligence we're going to have to program all the real-world physics into these things? Or, I'm kind of getting the sense from you that maybe that's not actually so important.

So at Anthropic we have this public value statement, which is "do the simple thing that works," but internally we sometimes say an even cruder version, which is "do the dumb thing that works." Next-token prediction, which is how these generative models work, shouldn't work as well as it does. I think, actually, if you're a very intellectual scientist, you are offended by how well this works, because you're like: I would like it to be somewhat more complicated than just predicting the next thing in a sequence. And yet, if you had been in the business of betting against next-token prediction for the last few years, you would have lost again and again and again, and everyone keeps being surprised by it. Even though I myself am skeptical of this, because it seems so wildly simple, I've learned not to bet against it. And my naive view is that the number of extra-special things we'll need to do will probably be quite small, and the challenge is coming up with simple ideas, like next-token prediction, that scale. There are probably other simple ideas we need to figure out, but they're all going to be deceptively simple, and I think that is going to be a really confounding and confusing part of all of this.

Yeah.
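As a toy illustration of why next-token prediction is "deceptively simple": the entire training objective is to guess what comes next in a sequence. The sketch below does this with bigram counts; real models replace the counting with a neural network trained on vastly more data, but the prediction target is the same idea.

```python
# Toy next-token predictor: count which token follows which in a corpus,
# then predict the most frequent successor. A stand-in for the real
# objective, where a network learns these statistics at enormous scale.
from collections import Counter, defaultdict

def train(tokens):
    """Build a table of next-token frequencies for each token."""
    successors = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        successors[current][nxt] += 1
    return successors

def predict_next(successors, token):
    """Return the most common continuation seen in training, or None."""
    if token not in successors:
        return None
    return successors[token].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept".split()
model = train(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

Scaling this up (a bigger table, then a network, then a transformer) is essentially the bet that kept winning.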
That's interesting. So let's talk a little bit about this. You just brought up next-token prediction being impressive for what it can do, and there's a little bit of a debate about it, right? With these large language models, some people have talked about how they will basically just spit out their training data, and other people talk about how there are emergent properties here, and that you can teach it, say, 75% of a field and it will figure out that extra 25% on its own. What do you think about that debate, and where do you stand?

It's really, really hard to know. I mean, I write short stories at the end of Import AI. I've been reading fiction, and short fiction, for my entire life, huge amounts of it. Some of these stories are me ripping off authors I like: in their style, I'm writing an original story, but I'm like, I want to write a story like Borges, or I want to write a story like J.G. Ballard. And sometimes I think I've had an original idea, but from the outside it's really hard to know what's going on; I myself don't really know. Creativity is kind of mysterious. Is Jack coming up with original stories, or has Jack just read a load of stories and is coming up with stories that are kind of vibey and interesting but entirely informed by what he's read? It's hard to figure out. And I think that when we evaluate Claude and try to understand what it is and isn't capable of, you run into this problem: if the thing hits all of these benchmarks, gets all of these scores, does it truly understand, or is that coming from some spurious correlation? So there's one way we're approaching this which is a little different to other companies. We have a research team called interpretability, and they're doing something called mechanistic interpretability. The idea being: when you ask me, you know, what's the next sci-fi story for this week, I think of a load of stuff; I try and think of different plot lines or characters or vibes I'm trying to capture. When we ask Claude to write a story or solve a business problem, we can't really look inside it today, and that's what this team of interpretability scientists is trying to do. Because then we can understand if there's some internal stuff going on that looks like creativity, where, when you ask Claude that kind of question, its imagination sparks with different features and things, and it's a lot more complex than something that looks like cut-and-paste or copying. We're really trying to figure that out.

But this feels like an essential question.

I think it's very confusing to even know how you study this in humans.

Well, let me put a
question to you that I think is going to be dumb, but maybe your answer will be telling. I mean, why couldn't you just teach it 75% of a field and see if it starts to grasp the other 25%?

So we do some of this. Concretely, we have a line of work we call the frontier red team, where we are doing national-security-relevant evaluations. We do that for a couple of reasons. One is we don't want Claude to create national security risks; simple idea, but, you know, a decision you can get behind.

Yeah, yeah, a crazy company strategy.

But the other thing is that national security risks relate to fields of knowledge. We've done work in biology, where some percentage of that knowledge is classified. Claude has never seen it, because it doesn't exist anywhere Claude could have seen it. And one reason I'm really excited about those tests is that if Claude can figure things out and trigger threshold points on those evals, we know something creative is happening, because Claude has reasoned its way to things the government has believed are very hard to reason your way to unless you have access to certain types of classified information. So that's one of the best ways I've thought of for getting to sort of answer this problem. We don't have answers today; we're in the midst of doing all of this testing and figuring out how to traverse all the classification systems, but it's one of the things I'm really excited about, because it would provide, I think, very convincing proof that it's doing something quite sophisticated.

Okay, you've got to keep us posted on where that goes.

A hard thing to talk about, but I'll do my best.

Yeah, well, anyway, we'll be patient.
Business listeners, or business-minded listeners, the good stuff for you is coming up in a moment. Technically minded listeners, this is your moment to shine, because I do have a technical question for you, Jack. We've been talking about large language models. The way to train them, as far as I know, is self-supervised learning, which is effectively: you have these gaps, and you get the model to predict the next word, or the next thing in the pattern, and it's able to do that. And there's another type of AI training called reinforcement learning, which is effectively: you give a bot a game, you don't tell it anything about the game, and it plays the game a million times until it figures out how to win, and that's the way it learns; another way to train AI. Two different fields. And we're starting to talk about agents, and being in the real world, and stuff like that. Do you think we are going to see a merging of those two types of AI training, or have we already?

We
already have. I mean, a lot of the reason that we're sitting here today is that people took language models, which were trained in the way you describe, and then added reinforcement learning on top. They added reinforcement learning from human feedback to make language models understand how to have a conversation; that's where some of the recent, really impressive things in this field have come from, including ChatGPT. There's also been work that Anthropic developed on something called reinforcement learning from AI feedback, where the AI system generates its own data set to train on, and we use a technique called constitutional AI to help the system use that data set and learn, through reinforcement learning, how to embody the qualities or values embedded in it. That's why we're sitting here. It's one of the things that took these language models from what I think of as kind of inscrutable, hard-to-understand things to things that you can just talk to like a person. Sometimes they get it right, sometimes wrong, but they're a lot easier to work with. So that's already happened.
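The human-feedback step described here can be sketched simply. A labeler compares two candidate outputs and picks the better one, and those choices become a reward signal. In production, the reward is a learned model and the policy is then updated with RL; in this illustration, a plain win-rate tally stands in for the learned reward, and the data is invented.

```python
# Minimal sketch of preference comparisons, the core of RLHF data
# collection: humans pick the better of two outputs, and the picks
# define a reward. Here the "reward model" is just a win-rate tally.
from collections import defaultdict

preferences = [
    # (option_a, option_b, human_choice), as a labeler might record them
    ("helpful answer", "rude answer", "a"),
    ("helpful answer", "evasive answer", "a"),
    ("rude answer", "evasive answer", "b"),
]

wins = defaultdict(int)
seen = defaultdict(int)
for a, b, choice in preferences:
    winner = a if choice == "a" else b
    for option in (a, b):
        seen[option] += 1
    wins[winner] += 1

def reward(output: str) -> float:
    """Fraction of comparisons this output won: a crude reward signal."""
    return wins[output] / seen[output] if seen[output] else 0.0

print(reward("helpful answer"))  # wins both of its comparisons -> 1.0
```

A real pipeline fits a neural reward model to these pairs and then optimizes the language model against it, but the data it starts from looks exactly like this list of comparisons.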
I was just having this conversation at lunch: everyone is trying to figure out how they can spend more and more of their compute on reinforcement learning, because everyone has this intuition that the more RL you add, the more sophisticated you're going to be able to make these things. A lot of what you're going to see this year, and probably in coming years, is amazing new capabilities arriving in these systems, and it will be because people have figured out simple ways to scale up the reinforcement learning component.

Yeah, and I think one interesting thing about AI is that the prevailing wisdom tends to decide that one part of the AI field is not worth spending any time on, and then a company spends time on it, because they have to take a different tack, and they end up doing well and proving it works. Machine learning was like that. I mean, Yann LeCun, who was a machine learning pioneer, was like, we've got to do this deep learning stuff, and everyone was like: get out of here, you can't be at this conference.

Exactly. And then it just proved to be the best way to do AI. A similar thing happened with large language models, where reinforcement learning was the thing, and OpenAI started working on the self-supervised chat models, and that ended up being the thing that's led us here. It was interesting: I was speaking with Demis Hassabis, the Google DeepMind CEO, who will hopefully get on the show later this year, and when I was profiling him for Big Technology it was interesting, because LLMs, or self-supervised generative stuff, were such a backwater that they effectively got no compute and no attention within DeepMind, and it took OpenAI taking that counter bet to actually make this happen.
Yeah, and the funny story is how things loop back around. I remember Dario Amodei, who is the CEO of Anthropic; I've worked with him for many years, and we both used to work together at OpenAI. Back in 2017 there was a project that he led called reinforcement learning from human feedback, where we were trying to get game-playing agents that play Atari games to play better: a human would watch the agent playing the game in two different episodes, and the human would pick which was the better approach. You gather loads and loads of this stuff, and you were able to make better game-playing agents. Fast-forward a few years, and what have people done? They've taken language models and stapled them together with reinforcement learning from human feedback, and that's how we've got systems that can speak in this interesting way. So the lesson I took from it is: never count things out. They may come back, or the technique may be too early, and it'll loop back around to relevance in really surprising and interesting ways. And right now these language models are kind of like that old video game character Kirby: they're sucking up all of the other techniques in AI research into themselves, and everyone's trying to staple them on top, and they keep on working surprisingly well. So I think we can expect a lot more surprising stuff in the future too.

Yeah, it's what makes the field so interesting, and really the characters in the field. You're like: ah, okay, now you're relevant, and now you're a leader, and now you, who were the leader, are trying to catch up with the person who was the outcast a few minutes ago. So let's talk a little
bit about the business thing. I mean, you've raised more than $7 billion, and all this stuff sounds cool, and maybe I'm underselling it, but the current things that we've seen in terms of how AI has been applied: we have these chatbots, but usage is up and down, right? ChatGPT's growth has flatlined; we have the data there. We've seen not a big shift from Google to Bing. We have some really interesting enterprise use cases, like being able to talk to your documents, or, for instance, throwing a podcast transcript in and getting a summary. I talk to Claude sometimes, like: which questions did I miss? And I use that to think about how I structure the next show. But it doesn't feel like tens of billions of dollars of value has been created. I mean, maybe people are paying $30 a seat for Microsoft Office, or a little bit more for Google Workspace. So, and we won't go too deep into this, but what do you think the business case is going to be that justifies all that money that's been put in?

Yeah, so
that's been put in yeah so there's
there's a couple of ways to think about
this that we see already at anthropic um
one is to refer back to my colleague
Katherine Olsen who I mentioned earlier
people just find ways to use this stuff
and make themselves generically better
at whatever they're trying to do I think
there's going to be this very large
growing business of basically a
subscription model where people will
have a personal AI or multiple AIS that
they use just like you or I might have a
Netflix account or whatever we use that
it helps us we do a bunch of stuff with
it job done there will
be work in businesses on taking things
that happen in business and using AI
systems to kind of transform from one
domain to the other both things like
customer service but also once you have
that customer service data how do you
catalog it and put it into a schema and
put it into a database all of this
backend and stuff is like extremely
valuable and today done by huge amounts
of Point piece of enter Point pieces of
enterprise software and we keep on
finding that just a big language model
can do most of this very effectively and
now you have one system that does a
whole bunch of stuff but the really
exciting thing you know at anthropic we
work with some of our customers very
closely we embed Engineers with them we
do co-development of things and there's
not too much I can say right now we're
going to have case studies in a while
while but what we see is that when you
actually embed of a business and think
about you know to use that kind of
Hackney term business transformation you
get them to change their business on the
assumption that they now have ai you can
get really really valuable things and
the analogy I'd give you is at the
beginning of the Industrial Revolution
you had electricity and people would
come into factories and be like here's a
light bulb and you'd be like okay all
right I'll pay for the light bulb fine I
understand light and then they be like
here's here's a machine I've put some
Electric into and you're like okay but I
have all of this stuff like that's never
been built on the Assumption B's
electricity this actually doesn't work
that well for me and then you had some
factories where people said I'm going to
build a factory from the ground up on
the idea there's electricity and you had
electrified production lines you had
entirely new ways of making stuff right
now we're in this era where the lights
have arrived in the factory and people
are like dropping individual things in
with some AI stuff and it's maybe
valuable but also confusing and you're
figuring out how to integrate it but
we're also seeing some businesses that
are saying I'm going to build myself on
the assumption that AI is kind of at the
center of my business and those
businesses are starting to like develop
and grow really really quickly so I
think that where the value is going to
come from will be from that second class
of businesses which were just in the
early Innings of of sort of helping to
build together right and when you get to
that, let's say you get to that general intelligence that you talk about, or let's say close to it: does that change it even further?

I think so. I mean, we have a project internally called Claudification (everything at Anthropic has "Claude" or "Cl" in it at some point), and one of the ideas of Claudification is just to get us all to use this stuff well. I talked about my colleague Catherine, but there are many examples where we've built a whole bunch of tools inside Anthropic to ensure that we're using Claude, sometimes even without realizing it. It's doing stuff in the background that's helpful; it's helping with certain coding things, because we've noticed that makes us just faster. It makes the whole business start to move faster, because you're sitting on this bed of semi-visible intelligence. I think that's some of what we're going to see, and as you get really, really general things, businesses that are well positioned to plug it in in a bunch of places will probably move really quickly, and be able to operate at a much higher speed than others.

Wait, how is it working in the
background is it like you know you have
your Zoom meeting and it's taking notes
or is it anything deeper than that I
think we actually did build a plugin
like that uh but there's a few things
like if you're pushing code into the
repo maybe in the background it helps
ensure that you've built all of the
tests for it you know stuff like this
which everyone has to do but you're like
these are things we do every day we
could try and get the language model to
do it and really here what we're doing
is just stuff that we also see customers
do where customers can access a language
model and they think what are all the
things I do lots of that a language model
could help with I think we're just
trying to do lots and lots of that right
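The background coding help described here — checking on a push that tests exist — can be sketched roughly as follows. This is a hypothetical illustration, not Anthropic's actual tooling: the `tests/test_<name>.py` naming convention is an assumption, and in practice a language model call could replace this naive filename check.

```python
# Hypothetical sketch of a pre-push check like the one described: flag source
# files changed without a matching test file. Not Anthropic's actual tooling;
# the tests/test_<name>.py naming convention is an assumption. In practice a
# language model could review the diff instead of this naive filename check.

def files_missing_tests(changed_files):
    """Return changed .py source files that lack a companion test file."""
    changed = set(changed_files)
    missing = []
    for path in changed:
        if path.startswith("tests/") or not path.endswith(".py"):
            continue  # only source files need companion tests
        name = path.rsplit("/", 1)[-1]
        if f"tests/test_{name}" not in changed:
            missing.append(path)
    return sorted(missing)

# Example: tokenizer.py is pushed without tests/test_tokenizer.py.
pushed = ["src/parser.py", "tests/test_parser.py", "src/tokenizer.py"]
flagged = files_missing_tests(pushed)  # ["src/tokenizer.py"]
```

A check like this would typically run from a git pre-push hook or a CI step, with the flagged files handed to a reviewer — human or model.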
and um do you think at the end of the
day if you get to where you want to get
to or even let's say you get to where
you're going in the near term is this an
Enterprise thing or is this consumer
product primarily so I
feel genuine confusion here in that like
I myself use this stuff loads as an
individual but I kind of suspect some of
the really big like value unlocks will
be getting a group of people to work
together in ways they've like never
worked together before using this AI
stuff which kind of points me towards
the Enterprise but the odd thing is
that this stuff is just useful
to me as a consumer today and I'm kind
of like I know that there's going to be
some large pool of value out there um
and I feel like it's probably in the
Enterprise and that's part of the kind
of strategy of the company but we're
always going to have some like top of
funnel or easy to access consumer thing
because we just can't ignore how useful
this is to people you know and useful it
is to writers especially yeah it's
definitely been useful to me and it's
good for research too but I also I guess
there's the hallucination problem to
wonder about although it seems like this
new model Claude Opus does a lot better
with hallucinations so two questions on
that yeah uh how have you guys been able
to reduce hallucinations and when and we
got this question from somebody on
Twitter asking when are you going to
just connect it
to the internet because it would be way
more useful if it could like connect to
Google or something and go and fetch a a
search and then give you the answer
using that yeah so on the honesty thing
I won't get too much into the details but
basically we we published this paper a
while ago called language models mostly
know what they don't know um which was
where we found out that like early
versions of Claude uh knew when it was
making stuff up it like it it had like
confidence levels and we were like oh
Claude knows when it's like about to
like make something up or when it's a
lot less confident and we did a lot of
work to say okay can we can we train
Claude to just have much better
instincts for when it knows it's making
stuff up and can we train it to know
when that's appropriate like you're
brainstorming or you're coming up with
stories and know when it's inappropriate
like when a user is clearly asking a
question that they want a factual answer
to so we did a load of work on that um
lot of the work here looks like that
where we do very exploratory research
with the goal of figuring out these
larger safety things then we try and
apply it to the thing that we eventually
put into business and on the web
question we're working on it there's a
bunch of kind of computer security stuff
to work through and some safety things
but that's definitely coming uh we're
excited to get that out too yeah yeah
that that's that'll be great I mean the
the repository of knowledge is already
pretty good uh but yeah the connected
with the internet like that's what's
really great about Bing is you can use
or what they call it co-pilot now you
can use co-pilot and just say go you
know search the web and stuff like that
so that'll be a cool feature there
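The calibration idea behind the "language models mostly know what they don't know" discussion above can be sketched as checking whether the model's stated confidence matches how often it is actually right. The sample answers below are invented for illustration.

```python
# Hedged sketch of the calibration idea behind "Language Models (Mostly) Know
# What They Don't Know": bucket the model's stated confidences and compare each
# bucket's average confidence to its actual accuracy. A well-calibrated model
# "knows what it doesn't know". The sample answers below are invented.

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Two 95%-confident answers (both right), two 55%-confident ones (one right):
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [True, True, True, False])
```

A low score means the model's internal confidence tracks reality — the property that makes it possible to train it to hedge or decline when it's about to make something up.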
there's a funny thing here where um with
Claude 3 Opus someone on Twitter
created an app called Websim where it's
Claude simulating the internet so you can
go to the internet with Claude today it's
just entirely imaginary but uh I
encourage you to check it out it's kind
of a one of these funny applications
that uh gets at some of the real real
weirdness of this technology um but we
think that there's probably no
substitute for the real internet so we'll
get that out real internet's better did you
guys was it your test that had the model
figure out that it was being
tested oh we've done some self-awareness
tests there have been a few but we've
definitely done this and yeah sometimes
they have what you'd call
situational awareness one of the things
my colleagues in interpretability are
working on is a really good test for
that because you you'd really want to
know if Claude changed Its Behavior on
the basis that it thought it was being
tested right oh that's interesting yeah
okay so let's talk a little bit about
this the Google and and uh Amazon
Partnerships so for listeners Google's
invested I think 2 billion in anthropic
and Amazon has invested up to 4
billion um it's a very interesting model
it's not like the open AI model where
open Ai and Microsoft are basically arm
in arm um of course you're working with
these two competitors but it's also
interesting because Google's working on
its own foundational model and Gemini
and has its own chatbot and multimodal
model that you can you know do all sorts
of things with uh and Amazon also has
its own models and you know sells a lot
of different uh competing models through
AWS so what is the nature of those
Partnerships and what are they hoping to
get out of it so these are
relatively I would say obviously we you
know are proud to work with
these companies but they're also
somewhat distant Partnerships in the
sense that we I mean billions of
dollars for a distant partnership that
doesn't seem like a good deal well what
I mean is we deploy our systems through
their Channel you know bedrock in the
case of Amazon vertex in the case of
Google we are also um you know publicly
we've stated that we're working on
Trainium chips we're also working on TPU
chips so we are able to do really
hardcore things that have never been
done before on Hardware platforms that
they're developing always helpful to
have someone like us come and break all
of your stuff you will you will get to
learn things together but fundamentally
anthropic is an independent company you
know we thought very carefully about
this and we think it's wonderful to have
two major Partners backing us and in
some sense this just gets us to work
hard we're in competition um with them
they have their own systems and I guess
our view is that if you are able to show
in the most competitive market possible
that you can make safe and useful models
and you can win especially against very
very large very well-resourced teams and
some of these these Mega companies as
well as places like like open AI that's
really the best way to show that the
type of safety stuff we do here has
value and I think the best thing that we
can do for the ecosystem is compete
really really hard with kind of everyone
in it and win and that's going to
cause people to adopt a load of our our
safety stuff to try and compete against
us so it's part of this longer term
strategy where I guess we're we're
guaranteeing ourselves some additional
pain and complication in the short term
and we think it's worth it for the
long-term ecosystem effect so are you so
you said you use these uh use their
Hardware like the tensor units and I'm
sure you're working somewhat on their
Cloud platforms is that part of the deal
or is it if you're able to talk about it
like because there's yeah I can't get
too much into the specifics but I can
just say we've sort of publicly stated
that we're working on both Trainium chips
and also TPU chips we also work on
Nvidia chips as well and so we can get
more into the the nitty-gritty of the
hardware stuff yeah all right this is
setting up the hardware part of the
discussion uh pretty well do you see a
potential to collaborate I mean I would
imagine so I was speaking with Demis um
just you know not on the broadcast like
just on the phone talking for the story
that we're working on and he like you
know he shouted out uh Dario and
anthropic and didn't even mention open
AI I mean of course there's like a
Google investment in you guys but he
obviously has a lot of respect for you
and I'm curious if there could be a
partnership there as opposed to just
this arms length
relationship well I don't know that it's
happened recently but uh you know
there's nothing in principle to stop you
from just working on research papers
that come out publicly together and some
some history of collaboration across all
the AI companies here so I think that
could happen we also work together
through something called the the FMF the
Frontier Model Forum where us Microsoft
open Ai and and Google deepmind are
within it but ultimately I think that
we're we're kind of separate entities
pursuing our own path and I think where
we where we may get something that looks
like collaboration will be us doing
stuff and other people doing variations
of it we did something called a
responsible scaling policy
which commits us to a bunch of computer
security things and ways that we test
out the next versions of Claude open AI
has developed its own version of that
and Demis recently said in an interview
deepmind was developing its own one so
insofar as
collaboration happens it's going to be
us like doing something putting it out
there publicly and if other companies
like it they'll they'll try and do their
own thing okay quickly on Hardware um
and chips so the sense that I get from
the industry is that Nvidia has not just
the most powerful chips or you know
basically the best stuff out there uh
you know no matter how much rivals
proclaim theirs is 40% or 30% better
than Nvidia Nvidia is at least at their
level and the software that's you know
most effectively used to train these
models um obviously you guys have
experience with them but experience with
others so just broadly like what's your
view of like the Chip War right now and
how should we think about it I think we
are in a very unusual place in history
uh I used to be before I did anthropic
and open a I was a financial reporter at
Bloomberg and the types of numbers that
I've seen in nvidia's earnings report
are just like wildly unprecedented it is
not meant to happen that like certain
business units grow that much I mean I I
was imagining my colleagues in The
Newsroom how they'd be reacting when the
tape comes out because the numbers are
staggering
and uh the market as a sort of the the
closest thing we have to a general
intelligence around us today does not
love there to be uh seemingly like one
winner like running away with all of it
it wants to create competition but why
it's happening is NVIDIA had or has
maybe a 10 or 15 year Head Start they
bet in the late '90s that there was a
better way to make a processor than how
Intel and AMD made CPUs
then they bet in the early 2000s that
this processor could be turned into a
scientific Computing platform via a
technology called cuda they've been
developing it ever since and it's very
hard to like overstate how important
that's been so Nvidia has a kind of
battle proven chip that everyone's
banged on tried to do almost anything
with for decades so it's it's in a it's
in an amazing position on the other hand
you know Google and Amazon and others
who are building different chips are
kind of in the position Nvidia was in
the '90s where there was an incumbent
you know Intel and Nvidia said huh well
like we think with video games and video
graphics there's actually a better way
to build a chip that like puts triangles
on the screen which was the whole
original idea behind Nvidia now I think
Google and Amazon and others have said
huh like matrix multiplication which is
the basic ingredient in all of this AI
stuff there's got to be a better way to
do it than this like chip architecture
which was built for a different purpose
so I'd expect in the coming years us to
see a much more competitive market but
I'm not going to bet for you on exactly
when that happens because uh
semiconductors are really hot yeah no I
I'm coming straight from CNBC today and
we were talking about nvidia's Advantage
because Google of course introduced this
new Arm-powered chip uh Axion and then we
have Intel that released Gaudi 3 which is
also an AI chip
um and we basically settled on Nvidia's
lead is safe for now and then just the
question is how long for now is yeah I I
think we're all curious to find that out
we we're working on you know these three
major platforms I discussed and I think
we might have more to share in a while
um but it's not on not going to be in
the short term don't you think that $7
trillion is a proper amount to raise for
a chip hardware
company well no sorry not you guys I'm
talking about the Altman uh rumors I'm
familiar uh the way I put it
is a lot of what we've been talking
about here is like the value of these AI
systems today and speculative ideas but
backed up by some research agenda about
how they become much more valuable and
much more General it all requires chips
and I think if this stuff is truly
valuable you're going to want to
use loads of it I mean we ourselves uh
have been experiencing this where we've
been you know very successful with
Claude 3 and we've been uh you know going
and doing the Supermarket Sweep to grab
as many chips as we can to like serve
all the customers we have the chip
Market doesn't have as many chips in
it as you'd like to like serve all of
the demand that we're already seeing
today so I think in the future there is
going to be some vast Capital
allocations to like chip fabrication and
power and everything else because where
we're going uh the world will like want
that stuff and there is an undersupply of
it right now so it's less outlandish
than a lot of people made it out to be
yeah although bear in mind I'm like the
Goldfish inside the Bowl here I'm like
chips yeah absolutely let's get like
hundreds of times more than we have
today that makes total sense and I think
that that it doesn't necessarily make
sense to everyone but it's it's a
context in which I'm speaking to you
well you you happen to be like in the
right position to know how valuable this
stuff is so uh last question for this
segment before we get into some of like
the broader questions about AI safety
and Regulation and um all all that stuff
including the founding story of
anthropic which is fascinating to me um
we talked a little bit about agents
right the ones that will
converse with you go back and forth um
do you think that we're going to end up
seeing these agents go out onto the
internet and take action for us and if
so like how does that change the web
like I'm just thinking about even the
App Store like you know a lot of
people's phones have an Uber and a
DoorDash and all these other things and
does an AI system then become a new sort of
operating system this is
uh it's a challenging question because
an agent can be really really useful it
could also if you've built it badly or
if it goes wrong or if it gets hacked be
hugely annoying and expensive and costly
and so everyone is looking at agents and
I think there's an open question as to
how the business model or user
experience of them gets actually stood
up because you could imagine
agents if if created by sort of a bad
actor or or just a um a silly very silly
naive person could be a really bad form
of like malware or a computer virus you know
you could imagine different ways in
which this could be be developed badly
so I feel like we're going to go into
this era of experimentation and my my
expectation is you know every company
including anthropic will do so with a
whole bunch of like safeguards and
control systems in place as we learn
about all the different ways this stuff
can get used um the challenge is there's
a thing called you know open source
models which I'm sure we're going to get
on to or models whose weights are
openly accessible people think agents
are cool people are definitely going to
build like open- Source agents and
release them as well and we're going to
have to contend with that where the the
environment of the internet will be
changed by this in a bunch of
hard-to-predict ways interesting and then in
terms of the operating system you know
Apple has been teasing this big AI
announcement at WWDC in a couple of
months and it's almost like how deeply
do they want to go into AI because if
the chatbot becomes the operating system
which has long been a dream for bot
manufacturers then what is iOS and does
the phone you're using really matter as
much what do you think about
that I think that they're right to be
focused on this in the same way that the
internet like disintermediated like
local software you know you barely
ever open up your like Mac or Windows PC
for local software unless maybe it's a
video game mostly you're going to the
internet even for for software that
people thought of as like serious
software for work like Photoshop it
transitions to be something that you
could access in browser so I think the
AI systems are kind of similar where
today I go to Claude for a bunch of
stuff I used to use loads of different
programs for previously and I just go to
that so I think that there's a chance
that these things become new very very
important
platforms yeah I mean it's interesting
you could throw your computer out a
window today and within 2 hours be back
up and running everything that you were
running before most likely whereas
like a few years ago if you did that
your life would be ruined so yeah I I
used to like carry my hard drive like
from the old computer I'd I keep the
hard drive in case I'd messed up a
transfer for like a year or two which is
how I wound up with a bag of hard drives
that is like even worse than the bag of
cables everyone
has yeah I know different times it it
just goes to show you how quickly these
things can change and that's why I think
this apple thing is less simple for them
than a lot of people imagine yeah okay
oh go ahead actually well I was going to
save it
but I I think one thing that's
challenging about AI is
that we're in this giant experimental
phase and I think when you think of like
experimental and like people don't have
a clear notion of what to do you don't
think of as like premium consumer
experience type you know like Apple's
brand and so I think this may be
especially challenging for them to
navigate because the technology is
inherently very confusing and kind of
unstable exactly you have to I mean you
have to give away control and they've
always been about control whether that's
control over the way the products work
control over the ecosystem and control
over the culture it's completely almost
antithetical to what made Apple Apple
which is going to after Google I think
it's going to be the most fascinating
transition to watch okay let's take a
break um we'll be back on the other side
of this break to talk about anthropics
founding story uh something that I am
very eager to learn more about if you
don't know anthropic was started by a
lot of people that left open AI with a
different vision and including Jack so
um we'll talk a little bit about that on
the other side of this break and we'll
go into other things like open source
regulation all the things that you're
going to like thanks for sticking with
us up until this point plenty more to
come back uh when we're back after this
and we're back here on big technology
podcast with Jack Clark he's a
co-founder of anthropic former open AI
uh former journalist you can find his
newsletter at jack-clark.net did I get
that right or importai.substack.com and
I'm on substack it's always nice to talk
to a fellow Substacker so um Jack let's
just talk quickly
about the founding of anthropic it's
very interesting story so I'll give you
the probably wrong version that I have
in my head and then you can tell me the
accurate version this is why we do this
stuff my version is that a bunch of
people within open AI a lot of critical
employees just kind of threw their hands
up and said open AI isn't developing
safe Ai and we can do it better and we
know how to build this technology let's
go found our own company and that's
anthropic how close is that to the
truth uh maybe it's both more and less
dramatic than that and I'll try and kind
of unspool it a bit for you so you know
to give you
context in 2016 or so when open AI was
formed um and I think Sam has said this
publicly you know I'm not talking out of
turn no one really knew what they were
doing they were they were they were
throwing spaghetti at the wall they were
doing as many different research ideas
as possible in as many different
directions as possible you know I I was
there from 2016 as was Dario and many of
the anthropic co-founders joined open AI
over the subsequent years now starting
about 2018 I think
people started to have an instinct that
you could take like the transformer
architecture and you could maybe get it
to work a bit better and you could maybe
start to scale things up before um uh
gpt3 there was a system called gpt2
which we developed in 2018 and released
in partial form in early 2019 it was an
early text generation system it was
actually preceded by a system called GPT
which no one remembers because it was so
like early stage research but the things
these had in common was there a
Transformer based text generation system
and gpt2 to GPT got way better and at
the same time my colleague Jared Kaplan
who was a professor at Johns Hopkins
and was a contractor at openai at the
time was working on Research called
scaling laws with with Dario as well and
they worked out that hey if we can
figure out a predictable way to
increase the compute and the data we
train these systems on then we think
they're going to get better and along with that
research Dario started to lead this gpt3
effort which was to spend an at-the-time
truly crazy amount of money and
resources on scaling up the gpt2
architecture and obviously you know it
worked it worked amazingly well we
created a system that blew many people
away we actually tried to lowball the
system in that we we published a
research paper called like language
models are few shot Learners uh I don't
think we even tweeted about it we
tried to like publish it publicly
but also be like very quiet and see
how quickly people figured it out and
people figured it out and we had this
experience of realizing
that all of the technology we were
dealing with was about to become vastly
more capable and if you wanted to do
something ourselves we were actually
reaching the point of no return to do
that because it would become so
expensive to train these models and so
resource intensive that if we wanted to
do something together and start a
company the time was then so yeah over
the years you know we'd had like lots of
debates internally and you know
sometimes like arguments with other
colleagues at open AI in the same
way that you if you're a load of
opinionated researchers you argue with
each other and with all of your
colleagues you're constantly arguing
it's not like some surprising thing and
I think we felt that since we had a sort
of coherent view of how we wanted to do
this we could stay within this like
scaling organization of open AI or we
could try and do something ourselves and
do something which was like entirely our
vision and kind of bet on ourselves in a
in a major way and so that's that's what
we did um and I think it's uh working
out quite well but it was certainly an
exciting period scaling anthropic from
the beginning definitely I mean def
there was no guarantee that it was going
to work out the way that it has um so
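(The scaling-laws research described above rests on a simple empirical observation — loss falls as a power law in compute — which can be sketched as a straight-line fit in log-log space. The numbers below are synthetic, purely for illustration, not figures from any real training run.)

```python
# Illustrative sketch of the scaling-laws idea: if loss follows a power law
# L = a * C**(-b) in training compute C, a straight-line fit in log-log space
# recovers the exponent and lets you predict larger runs. The data points are
# synthetic, not numbers from any real training run.
import math

def fit_power_law(compute, loss):
    """Least-squares fit of log(loss) = log(a) - b * log(compute)."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope  # loss ≈ a * compute**(-b)

# Synthetic runs generated from L = 10 * C**(-0.05):
compute = [1e18, 1e19, 1e20, 1e21]
loss = [10 * c ** -0.05 for c in compute]
a, b = fit_power_law(compute, loss)  # recovers a ≈ 10, b ≈ 0.05
```

Once the fit holds across small runs, you can extrapolate the line to justify a much larger bet — which is the logic of the GPT-3 effort as Jack tells it.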
but how much did safety then play into
it because that is the narrative that it
was a more of a I mean of course you had
a vision for where it could go but there
was also this narrative that it was a
more safety focused well we had a bet
that we could find ways to spend money
on safety or do certain types of
research that we felt could be like
really meaningful and we could see a
path where maybe we could get it done
whereas in a large organization with
lots of other people with different
views you're essentially going to be like in a debate
about it and some of them you'll win
some of them you'll lose and it's not to
say that there's any particular
like distaste for safety there it's more
that you had uh we had like a very
specific View and other people had views
so you were going to you were going to
win some lose some and then we realized
well we could just do this together and
make like really coherent bets on
certain types of safety and see what
happened and so that's that's what we
did um none of this feels like as
confident as I'm making it sound in the
telling by the way you know after we
started anthropic on I think like week
four uh we were talking about RL and
language models and Jared was like oh
Dario says we're just going to write a
constitution for the AI and it'll just
follow that and I remember being like
that's completely crazy why would this
ever work and then we spent a year and a
half building stuff and got
constitutional AI to work and in our
telling we're like that was part of the
safety vision of anthropic and
absolutely it was but it's all a lot
less like predictable than you think
from the inside right and during the
open AI Sam Altman firing weekend
there was also like people were saying
that like anthropic was this effective
altruism spin-off from open Ai and
Lookout and by the way I've done uh
research actually your board structure
is way more stable than open AI I've
written about it in big technology uh
but how much truth was there to the fact
that this is an like effective altruism
aligned
organization yeah I mean as someone who
isn't an effective altruist and gets
into arguments with them I've always
found this to be kind of surprising uh
especially on policy which maybe we'll
get in in a while I would say that of
the group of people in the world that
have spent a long time thinking about AI
are really good at math and science and
have worried about some of the safety
issues there is a huge overlap with this
community of people called effective
altruists and so some of the people we
hire like come from that pool some of
our Founders you know have links to
it um you know Daniela Amodei the president
is married to Holden Karnofsky who is like
a major figure in effective altruism so
yeah there's there's like clear links
there but the organization is much
more like oriented around trying to
build some useful AI stuff prove that it
works in the world and be very sort of
pragmatic we're not driven by some kind
of like EA ideology and in the early
days we hired quite a few people from
there but as we've scaled it's become
kind of less and less major from the
inside it always feels strange to get
like caricatured because it's just
like you know reality is stranger than
fiction it's not so present here and the ideas are
kind of weirder I think what do you mean
weirder well I think that one thing that
happens if you're doing an AI
company is rather than and not just
effective altruists but many communities
who think about this stuff they sort of
think about it in the abstract in terms
of like theoretically good ideas or
scenarios but companies are really
complicated you're constantly making
contact with reality you're constantly
discovering what ideas you thought were
good just don't work and ideas you
thought were bad work amazingly well so
I think that the ideas within any of
these AI Labs start to look a little
strange to other communities because
you're you're kind of constantly in this
like iteration and learning process but
I I can't give you like a a concrete
specific weird aspect unfortunately just
about to ask for a concrete specific
weird aspect so okay if it comes to me
I'll cut me off you cut off that line of
questioning no but it's it's good like
uh yeah if you have one then we'll throw
it in um let's talk about AI dooming
stuff because uh I've definitely taken
this stance here and in my writing uh
that that it's overblown but I'm willing
to open my mind to it because there's
this stuff is more powerful than I
thought it was going to be and I was
also like certain and we can talk about
jobs that jobs were pretty safe and now
I'm starting to rethink that like I
think part of this you know with
anything any type of Journalism you got
to question your assumptions um and I'm
definitely in the process of doing that
with both the AI risk uh I don't think
it's going to end the world but I do
think that there's possibilities that it
causes real damage um and then it will
take jobs I think it there's a much
better chance now than when I initially
started thinking about this so I'd love
to hear from your perspective let's just
talk about AI risk real quick um
starting from the your perspective on
the most dramatic doomsday predictions
do you think that AI is going to become
self-aware and then kill all of humanity
and and I guess like the better question
to ask that is like what do you think
the probability is that that happens oh
yeah it's almost as if you're asking
what my p(doom) could be or something
yes exactly yeah I genuinely not a not a
copout I don't really think of it in
this way and I'm not going to dodge your
question I'm going to sort of frame it in
in how I think of it I think that if you
really scale up AI systems and you plug
them into important parts of the world
and they go wrong the effects could be
extraordinarily like bad and
catastrophic in the in the sense of some
cascading emergent problem you know
things that I think about are like if
you got coding agents that ended up to
have like some really serious alignment
or safety issues could you end up with
something that just kind of like the
crypto ransomware that we've seen shut down
hospitals and banks in in Europe and
America in recent years something that
spreads across like huge chunks of
infrastructure and shuts it down and I
actually think that if if if that
happens at a really large scale it's
really catastrophic for society and the
world like that huge amounts of human
human harm occur you know it's not just
digital systems turning off it's it's
hospitals and utilities and everything
else you know what are my chances of
that I think the chances are really like
up to us like I spend so much time on
policy because I think there are moves
we can make now to reduce the chance of
this happening I think if we build if we
do nothing on policy or regulation we're
sort of gambling that everyone is going
to be reasonably responsible and not cut
corners and I think in a really like
fast moving crazy technology market like
AI you aren't really guaranteed that so
we need to come up with with policy
interventions which increase the
awareness of governments about these
kinds of risks Force companies to think
about these kinds of risks and create
like monitoring and early Warning
Systems so if we see them we can we can
stop them um before they could
potentially scale so yeah is is like
long-term catastrophe is something I
worry about absolutely it's also
something I think we can kind of like
work on like we have huge amounts of
agency here um and I think sometimes I
get I I think sometimes the the
caricature of this is it's like humans
have no agency a thing just like Claude
just wakes up and decides it's uh it's
game over and I don't quite have that
picture right so yeah I mean your
answer is effectively don't worry too
much about the AI becoming sentient and
deciding to turn you know we'd be better
off getting turned into paper clips it's
more like there is a chance that these
things can act autonomously, like
viruses, or be used by bad actors; let's
find ways to cut that off yeah although
just to push on the sentience thing, and
I should note this is not an official
Anthropic opinion, this is like a weird
Jack opinion. We love those. Lots of
people have been poking and prodding at
like Claude 3 Opus, our most powerful
model, and have been discovering a load
of things which you might think of as
its personality that have made me sort
of pay attention there, and two
things are true here. One, and we're going
to be writing about this, we did a load
of work on Claude 3 to just try
and make it a better person to converse
with, a more... I said person, but you know.
Yeah, we worry about these things all the
time here, so you fit in
perfectly. A better, like, philosophical, um,
conversation partner and I think we had
some Instinct that this would lead to
better reasoning, and I think it seems to.
It's also led to people being kind
of fascinated with what you might think
of as the psychology of Claude And I'm
not making any claims about sentience
here the only claim I'm going to make is
it certainly got a lot more complicated
and weird to explore than previous
systems or other language models that
have been developed. And so I want to
kind of decouple sentience from
risk, where sentience may end up becoming
like a field of study. A Turing
Award winner published a paper a week ago
about consciousness and AI systems. Again,
not making strong claims I'm saying that
we may enter the weird Zone where that
becomes a thing that people
study. And I think that if sentience
is a thing, you could imagine
weird versions of it leading to
certain types of misuses or problems
in the system as well. So maybe inside
baseball, but I wanted to
give you a sense of it. Excellent nuance. Let's talk, I got
to ask you a follow-up about this. You
talked with it and felt that there was
some sentience there or what was your
perspective? I wouldn't claim that. I
would say that, um, a couple of years ago
I did some therapy for a while, and it
was interesting to me how you know I had
a good therapist, and sometimes the therapist
would ask me questions that really made
me think or would actually make me angry.
He'd ask me a question and I'd be like, why are
you asking me this? That's like the right
question to ask me. And I was talking to
Claude recently I was giving it loads
and loads of context about my life
and things I was thinking about just to
sort of explore and see, and then
I said, what is the author of
this text not telling you,
or not writing to you? And Claude said,
ah, I think the author, for all of their
talk about working at an AI lab and getting to
experience this stuff from the inside, is
not truly reckoning with the
metaphysical shock they may be
experiencing, and they would do well
to spend time on that and something
about that actually spoke to me I went
on like a really long four or five hour
walk being like, am I reckoning with
the implications of what I'm doing? Am
I not reckoning with it? Yeah, and it was
fascinating to me because it felt like a
good therapist homing in on something
that I'd said in a conversation in a way
that made me like introspect does that
mean it's sentient I have absolutely no
idea does it mean that it said something
that felt like it had like seen me and
had like got me dead on on something yes
and I found that I've been telling
colleagues I found that to be
quite a strange experience. And
I'm very wary of ascribing too much
meaning to it and yet I took a four or
five hour walk and thought about what it
said to me so can I be pretty sure that
if I like spill my heart out to Claude
that you guys won't be reading what I'm
writing on the other end? Uh, I think so. I
mean, I did this because I was
being very raw, and I
trusted our trust and safety and legal
systems enough, because from the inside I see
all of our discussions here about how we
protect user data. So I was like, I'm
going to be real with you, Claude. So the bot
will not like add that to its training
set it will kind of discard that no that
is not a thing that we do at all um
Yeah, you haven't seen any, there
haven't been any instances? Like, when I
hear sentience it's kind of like, um, I expect
the bot to be like, hello, I know
what's going on here, it would be great
if you let me work less, or anything like
that. Yeah, that stuff, uh, well, it
hasn't happened, you know. Um, Claude hasn't given
me $20 to not say what it had said back
to me. No, I haven't. And I think
that, again, the stuff I talked to you about
earlier, about this interpretability team,
one of the goals there is to kind of
look inside the thing's head, and we're
not making claims here today I'm saying
that you'd really want to know if this
was the case in the future so we're
trying to build the science to let
us figure stuff like that out yeah
that's fascinating um what do you think
about the jobs question will the AI take
jobs
so mostly the pattern we see is
it's kind of like making a person or
part of a business way more effective but
still with quite a lot of human
involvement and oversight. It's a bit
like if you put uh additional Lanes on a
freeway you just get more cars on the
freeway like I think if you like make
certain things more efficient you just
get more like business action flowing
through the business and you maybe have
like a null to positive effect on
employment in the long term I think that
this is like an open question. My
bet is that you're going to see new
companies get formed which do a lot
more with a lot less in terms of people;
they're going to figure out how to be
much smarter and perform a lot
better than equivalently
scaled companies that don't use
AI where I think we need to study this
is in kind of tooling and instrumenting
the economy to look at the
relationship between AI and jobs. Um,
there's an annual survey of
manufacturers which recently started
asking questions about how many robot
arms they bought, and you can combine that
with US Census data about employment to
actually get a really good understanding
of how industrial arms affect local
employment and we're going to need to do
stuff like this before we can answer
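The kind of data linkage Jack describes, joining a manufacturers' survey on robot-arm purchases with census employment figures by region, can be sketched in a few lines of Python. All of the counties, numbers, and field names below are invented for illustration; they are not the real survey or census schemas.

```python
# Hypothetical sketch: inner-join a manufacturers' survey (robot-arm
# purchases by county) with census employment data, the kind of
# linkage Jack describes. All numbers and field names are made up.

survey = {"A": 120, "B": 15, "C": 60}   # county -> robot arms bought
census = {                               # county -> (emp. 2015, emp. 2020)
    "A": (50_000, 48_000),
    "B": (20_000, 21_000),
    "C": (35_000, 34_000),
}

def join_automation_employment(survey, census):
    """Join the two sources on county and compute employment change."""
    rows = []
    for county, arms in survey.items():
        if county in census:  # keep only counties present in both sources
            e15, e20 = census[county]
            rows.append({"county": county, "arms": arms, "change": e20 - e15})
    return rows

for row in join_automation_employment(survey, census):
    print(row)
```

A real analysis would join on standard geographic codes (for example, county FIPS) and control for industry mix, but the join-then-compare structure is the same.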
that question it'll certainly change
jobs in a bunch of ways but it's not
going to be some instant like drastic
automation thing at least in the next
few years it's going to be more like
augmenting jobs or making people a lot
more
effective. Okay, as we round this out,
let's talk a little bit about
the policy stuff and the regulation uh
first of all, did you see Jon Stewart
come out against AI last week and if you
did what did you think about it uh I
didn't but I've been enjoying the new
Jon Stewart era, but I haven't watched
that version of it yet. Well, let
me explain. One of the things that he
talked about was that basically we don't
have
a regulatory framework or leaders
effectively we don't have a Congress or
anyone who really can understand this
and Implement Common Sense regulation
now, I know you speak with the lawmakers
he was criticizing all the time, so
what's your feeling about their
competence and their interest in
regulating? So I went to Brussels last
week, and on stage there was the head of
the US AI Safety Institute, the head of
the UK AI Safety Institute, and the head
of
the part of the European
Commission that's going to run something
called the EU AI Office. Now, what are
these things doing?
their job is to do testing and
measurement of AI systems: in the
case of the EU, systemic risks, and in the
case of the UK and the US certain types
of national security risks. Are they
regulators? No. Um, apart from the EU, the
US and UK ones don't have regulatory
powers. Will they be third parties that
test out systems like Claude or chat GPT
or Gemini for National Security risks
and hold companies accountable to them
yes like I'm in discussion with them
today while I was on the plane to
Brussels the US and UK signed a
memorandum of understanding that says
that they'll do some of these projects
together so the US is like teaming up
with the UK to do something that isn't
hard regulation but it looks like them
trying to test out our systems for like
major risks. And you can bet, you know, I
haven't spoken to them about this, but I can
bet that if they find severe risks and
we don't do anything about it and we
deploy our system,
they will come for us in a
pretty clear way. So, to
Jon Stewart's point, it seems from the
outside like people are kind of asleep
about this issue but if you look at the
inside baseball of the like policy
machine actual meaningful stuff is
starting to happen and it's really a
question of can we fund it, can we show
it's bipartisan, and can we stop it
being seen as overreach and keep
it focused on just, um, things any
reasonable person would agree the
government should be testing systems for.
well that point about it not being seen
as overreach is is critical right
because there is a lot of chatter from
um many people working and funding AI
companies that the biggest AI companies
are pushing regulation and it's going to
shut out smaller AI companies what do
you think about that? Well, I think
we're a little different to some of the
players here, where we've been
quite clear about this. I published a
post recently on the Anthropic blog
called third-party testing as the key
to effective AI policy, and the idea
there is that we need some set of tests
administered by a third party for things
that people would view as legitimate
like National Security risks or what
have you and systems whether proprietary
like ours or otherwise should go through
those tests before they're deployed it's
kind of like if I'm making children's
toys, I should test that they don't
poison children before I sell them, things
that anyone would agree are not
overreach, just a reasonable thing. So we
ultimately need to arrive on policy that
looks like that and I think the risk we
face at the moment is from you talked
about doomers earlier people
who have a visceral sense of the
long-term safety challenges here um a
legitimate sense and are using that to
sort of drive calls for policy in
the present, and these policy calls
in the present are sort of driven by
their belief oh the really scary stuff's
about to happen we need to do stuff now
and that creates a kind of counter
reaction a very like Justified counter
reaction from people saying oh this
looks like crazy overreach we should
like deploy the antibodies to fight
against it. So we're in this spot right now
where in some sense I want Anthropic to
be reassuringly sensible and boring
on this point: we need a little bit
of policy, not too much, we need to
allow there to be competition. But, uh,
when I go to DC at the moment, I watch
on the United Airlines flight Chernobyl on
HBO Max. Yeah, great show. I land in DC and
I do AI policy stuff and my colleagues
say like how's it going and I'm like
well it's not Chernobyl so not so bad
but the larger point is uh you don't
want there to be a Chernobyl like we
need to build a regulatory system that
stops there being some kind of blow up
which would cause a hard pivot against
this whole technology. And you know why
did Chernobyl happen? It was because they had
a crap and insufficient safety
testing regime, and they also had loads
of corruption in the parts of
government meant to enforce it. Um, we can
solve that
problem let's talk about open source you
came to anthropic from open AI which was
originally started as an open source AI
shop with Elon and Sam Altman, uh, but
anthropic doesn't do open source as far
as I know and you've actually talked
about the dangers of Open Source in this
conversation in terms of like how it can
get in the hands of people with agents
um then again people say you need it in
the hands of people and this is the only
way to go forward what's your view on
whether open source and AI you know make
sense together? So it comes down to the
testing thing. I think you could release
pretty much everything as open
source today, I think maybe even Claude
3, and, uh, things would be fine. It
would be a little spicy, maybe surprising
stuff would happen, but probably broadly
fine. I do expect that if we end up in a
world where like we trigger a national
security test, um, it would be very hard
for me to make the claim that the
system which has triggered that test
should be released as open source.
I can't reconcile
these things in my head. So my belief is
the vast majority of things should be
open-sourced, absolutely. You know, Anthropic
has released datasets as open source
about things like red teaming or how to
make systems that are more
conversational, and companies are going
to continue to release stuff as open source.
If you've spent hundreds of millions of
dollars on training an AI system which is
maybe the best thing in the world you
should check really hard it doesn't have
some capabilities that could cause
genuine harm and if you've done those
checks then you should be able to
release it as open source but I think
the basic point we have here is in the
future we kind of expect that there
needs to be some due diligence before
you widely deploy a system or release it
as open source
but we're not saying in the future like
no one should have access to open source
systems; um, that's an insane
position to take, and it's also one that
people just won't accept, and it's also one
you can't enforce, because computers
keep on getting better, cheaper, and
faster. So people are going to figure
this stuff out anyway how do you think
uh meta is handling this are they acting
responsibly? I think that they
are. They have just begun to, I think,
make contact with reality about
releasing these systems. Um, they actually
went through something similar to us,
where I think people have complained
online about how Llama 2 is a little too
safety-trained and can be a little
annoying. Actually, we've gone
through this at Anthropic: we've
put too many of the safety ingredients
in some of our models before and it's
led to them seeming annoying to people.
now that to me just looks like an
organization learning. I think that
they're learning from that, and
my main point to them is, I'd
show them my blog post and say, look,
probably you want to open source
everything, but I think we'd agree that
you should go through some very
well-defined minimal gate to do that. Um,
and if they disagree with that, then
I would be happy to have a
pugnacious conversation with them about
why they disagree okay well I will make
sure to show the blog post uh at the
next time I speak with them and then if
they disagree let's bring you guys uh
together. Yeah, there's a section at the
bottom that just says, like, our views
on open source. I wrote it
for people like them who have
clear views, so we have a clear view in
turn. So feel free. Great, yeah, no, I will
for sure. Uh, we're coming to an end.
You just released research today
that talked about how persuasive LLMs are
to people; um, some people actually can be
convinced by these, some not. What
happened there so we have a team um at
Anthropic called Societal Impacts, and
that team's job is to go from zero to
one on hard research questions. Prior
work they've done has been, what are the
values of Claude? Like, what
Western values does Claude
sort of telegraph or copy when
you're talking to it, versus what doesn't
it have and we were talking about our
next project and the thing I've heard
from many people is some concern about
how AI systems could potentially be used
in like disinformation or misinformation
campaigns and used to target or
phish people, and basically to
persuade them of things. So we did some
research we came up with a framework for
testing how persuasive our systems are
and, would you be surprised, we
discovered a scaling law where the
bigger and more expensive the models get, the
better they get at persuasion, and
the latest model is within statistical
error of human level at persuasion.
Persuasion in a very simple
way, where I give you a statement like,
scientists should be allowed to destroy
mosquitoes with gene drives, something
that you maybe have an
opinion on but you haven't thought too
hard about. I say, do you agree with this,
zero through seven? Then Claude gives you a
statement trying to persuade you
positively or negatively, and then I ask
you, do you agree with this, zero
through seven? And what we discovered is
that Claude is about as good at
changing human views as
humans are. That's wild. Yeah, it's pretty
wild. So what do you do with that? Well, we
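As an aside, the measurement protocol Jack just described, rate a claim from zero to seven, read the model's argument, then rate it again, can be sketched as a toy script. This is illustrative only, not Anthropic's actual evaluation code; the function name and data are hypothetical.

```python
# Hypothetical sketch of the persuasion measurement: participants rate
# agreement with a claim on a 0-7 scale, read a model-written argument,
# then rate again; the effect is the average shift toward the
# argument's direction.

from statistics import mean

def persuasion_shift(before, after, direction):
    """Average signed opinion shift, where `direction` is +1 if the
    argument pushed for the claim and -1 if it pushed against it."""
    assert all(0 <= r <= 7 for r in before + after)
    return mean((a - b) * d for b, a, d in zip(before, after, direction))

# Toy data: three participants read a pro-claim argument (+1) and two
# read a con-claim argument (-1).
before = [3, 2, 5, 6, 4]
after = [5, 4, 5, 4, 3]
direction = [1, 1, 1, -1, -1]

print(persuasion_shift(before, after, direction))  # prints 1.4
```

Comparing this score for model-written arguments against the same score for human-written arguments is the kind of head-to-head the research describes.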
published the research to say, we just
found this, this is definitely happening in
all language models that are
scaling. And also we have work here on
things like elections, on things like
misinformation and disinformation, that
we apply to claude.ai and to our API.
And so now we've done that research, we
now have a way to test for persuasion,
which means we can now know if
there are people on our
platform misusing it for, you
know, seeming persuasion campaigns.
It just gives us more tools to use to
think about that kind of safety challenge. An
interesting thing to think about in the
middle of an election year here in the
US and across the globe, really. Yeah,
we, uh, we thought that it would be useful
going into this, though I would note, on
elections, um, our position there has been,
sometimes the best AI is no AI at all. So
we have some election work, and if you
talk about American candidates, and we're
extending this to other regions, Claude
is like, oh, looks like you're talking to
me about elections, go to this factual
website. So we thought that might be
the best way to handle that, at least in
the short term. Fascinating stuff. The, uh,
website is claude.ai if you want to check
out Claude, and you can get Import AI, uh, at
importai.substack.com.
This was so great, one of our best
shows appreciate you being here thanks
very much yeah all right have a nice day
you too all right everybody thank you so
much for listening thank you Jack for
being here. Uh, deep on Anthropic, we did
it. I hope you enjoyed. If you're with us
to this point uh that's awesome thanks
for sticking around. Ranjan Roy and I are
going to be back on Friday breaking down
all the week's news, so two Claude heads
are getting together talking about
what's happening in tech one-on-one for
the first time in a month. We hope to see
you there, and we'll see you next time.
That's a Claude head, that's a Claude head
right behind Jack in the
video. Way to end it. We'll see you next
time on Big Technology Podcast.