The Unofficial Guide to Apple’s Private Cloud Compute - Jmo, CONFSEC
Channel: aiDotEngineer
Published at: 2025-07-30
YouTube video id: CCsWZ5bJlO8
Source: https://www.youtube.com/watch?v=CCsWZ5bJlO8
So, we're going to talk about Apple's Private Cloud Compute. This is an unofficial guide; I don't work at Apple, and we'll get to that in a sec. Quick background: my PhD is in data science and biomedical informatics. I've sold two companies, one in AI and data, one in cybersecurity and infrastructure. I'm at South Park Commons, and I'm building a company called Confident Security, which we'll get to at the end. But again, disclaimer: I'm not an Apple employee, and I'm not speaking on their behalf. Everything I've gleaned is from public sources. Hopefully what we'll learn today is some tools that we can use ourselves. There are really six key components, and some approaches to ensure privacy. Privacy and security are very related, not perfectly overlapping, but related.

Before I get there, I know we're in the security track, but I want to motivate why you might care about privacy, because not everyone believes you should have privacy. Let me give some examples. This year, DeepSeek leaked over a million sensitive records of chat logs. And you might say to yourself, well, that's DeepSeek, everyone knew that was going to happen. But before I show you the next piece, I want to poll the audience. How many here care about privacy? All right. How many of us use ChatGPT? All right. How many consider ChatGPT to be private? Okay, good. How many use ChatGPT's private mode? And then how many use the API with zero data retention, kind of the default state? Okay, great. Well, as of yesterday, OpenAI has to retain everything anyway, whether or not you flagged it as private. I don't want to get into why they have to do that, but the point is that they have the capability of retaining your private chats, including ones you flagged as private in the UI, among other things. Obviously, being forced to do that is not great, but this is why we should all care about privacy. Apple doesn't want headlines like these, because privacy is one of their major value props.

So let's talk about the problems Apple solved and then how we might use that work. Fundamentally, AI requires more compute than a phone has, but obviously Apple wants to bundle AI into their phones, and privacy is a major selling point. Any time you give private data to something remote, you're inherently reducing your privacy, right? Any time I give you my data, it's not as private as it was the second before. So the question Apple is trying to answer with their PCC system, which is now on all of our iPhones and used for inference, is: how do you get remote compute while remaining private? The simple way would be to buy an H100 for every iPhone and pair them in the cloud; boom, you get your own H100. But obviously AI is even more expensive than that, so that's not going to work. So the real question is: how do you get remote compute while remaining private and cheap? Otherwise it doesn't work.

So I'm going to frame the problem this way. You've got an iPhone, and you've got an untrusted remote server. You can't see inside of it; it's a black box. Once you give it your data, you have no idea what happens inside. What Apple does is try to make it not a black box, so that the iPhone has some control over what happens to the data inside Apple's remote servers, and hopefully that trusted remote service is also hard to hack.
So, for the remainder of the talk, and we're not going to be able to get into all of it in 16 minutes, we'll go through the PCC requirements Apple set up, and I'll review a conceptual architecture for how they meet those requirements. Then we'll go into two specific components of the six, because I don't have time for all six and you'd be bored by that point. And then we'll talk about some pros and cons of Apple's approach, and how we might use some of those components ourselves.

There are five key requirements Apple's Private Cloud Compute is trying to meet. The first is stateless computation. This is essentially the guarantee that when Apple receives your data, it's only used to satisfy the request; it's impossible to use it for anything else. You can't log it, anything like that. The second is enforceable guarantees: the notion that everything is enforced with code, not by some sort of policy. Not "I shouldn't SSH to the instance, but I could." No: there is no SSH on the instance, so you can't SSH to it. You don't want to save things? Well, don't have a disk, right? These are what they call enforceable guarantees, not just policies. The third requirement is non-targetability. That means that if you wanted to hack my data on PCC, you'd have to target everyone and sift through all of it, rather than having some easy way to find just me. The fourth is no privileged runtime access, which I briefly touched on earlier: there's no way to bypass these restrictions in production. And the final one, the most important one, is verifiable transparency, which essentially says: we can prove that all of the above are true.

Great. So here's a slightly bigger representation of the black box. In a classic remote system, you have some sort of auth service and then an AI engine, and on this AI engine there's some SSH user who can access it and some disk it can write to. Again, the iPhone doesn't know what it's sending its data to. So let's see how we can change this, at a conceptual level, to get to some of those requirements.

The first thing Apple does is add an anonymizer, and this anonymizer is the first of two parts of non-targetability. Ideally, Apple can't tell who the data is coming from, so it would be harder for an attacker to fish out my particular set of data. But if you're astute and looking at this, there's still auth behind the anonymizer: the iPhone provides some sort of auth credentials, and those credentials are obviously tied to the user. So the second thing Apple does is separate auth. Conceptually, think of it like going to the arcade. You want to spend your money on the arcade machines, so you first put your money into the coin machine and get some coins out. Those coins are anonymous; now you can go play, and no one knows which machines you spent your money on. That's essentially what happens here. It's called blind signatures. We won't have time to get deep into it today, but that's the idea. So now the iPhone is making an anonymous request through an anonymizer that's mixing everyone's traffic. It's kind of like Tor: it's laundering everyone's data, so that if someone were to access the system internally, they wouldn't know who anything is coming from. That gets us non-targetability.
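Since we won't get to the math on stage, here's a minimal sketch of that coin-machine trick in Python, using textbook RSA. The token string and the raw, unpadded RSA are illustrative only, not Apple's actual protocol; the formal spec alluded to above is the IETF RSA Blind Signatures scheme (RFC 9474), which adds proper padding.

```python
# Toy RSA blind signature, the "coin machine" trick: the issuer signs a
# token without ever seeing which token it is, so later use of the token
# can't be linked back to the user who obtained it.
# Textbook RSA with no padding -- illustrative only, NOT production crypto.
import hashlib
import secrets

from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pub = key.public_key().public_numbers()
n, e = pub.n, pub.e
d = key.private_numbers().d

# Client: hash the token, then blind it with a random factor r.
token = b"anonymous-credential-123"        # hypothetical token
m = int.from_bytes(hashlib.sha256(token).digest(), "big")
r = secrets.randbelow(n - 2) + 2           # blinding factor
blinded = (m * pow(r, e, n)) % n           # issuer cannot recover m

# Issuer (the coin machine): checks the user's normal auth at this step,
# then signs the blinded value. It learns nothing about the token itself.
blind_sig = pow(blinded, d, n)

# Client: unblind. The result is a valid RSA signature on m, because
# (m * r^e)^d = m^d * r, and multiplying by r^-1 leaves m^d.
sig = (blind_sig * pow(r, -1, n)) % n

# Any verifier (the arcade machine) can now check the signature without
# being able to link it to the issuance request.
assert pow(sig, e, n) == m
print("token is valid and unlinkable")
```

The design point is the separation of roles: the issuer knows who you are but not which coin it stamped; the verifier knows the coin is genuine but not whose it is.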
The next thing Apple does is change the set of requests that are happening. Before the iPhone sends its data, it asks: what are you running? If the AI engine replies with "I'm running this and only this," the iPhone might say, okay, I trust that; and as long as that remains true, you can run this AI on the data I'm submitting. This is how they achieve verifiable transparency. There's a little more subtlety to it, which we will get into, but essentially the iPhone says: I trust the code you're running, and you can only decrypt my data if you're still running that code. So the iPhone can verify what the server is doing. Next, no privileged runtime access: that one was easy. Just get rid of sshd; make there be no way of accessing those machines. Enforceable guarantees: get rid of the disk, like we talked about. And stateless computation: with no disk and no access, there's nothing to do with the data other than respond to the iPhone. And since the iPhone verified the code running on the machine, it knows the data isn't being logged anywhere before it hands the data over.

Okay, so they achieve those five guarantees using six technical components. Again, we're going to go deep on two of them, but I'll describe them all briefly.

Oblivious HTTP is a spec developed by Cloudflare, Apple, and others that lets you make anonymous requests by laundering them through a third party (there's a sketch of the idea just after this list of components). All requests to Apple's Private Cloud Compute first go through Cloudflare, so when Apple receives one, it only knows it came from Cloudflare, not from an individual user's IP address.

The second component is blind signatures. That's the arcade analogy I gave you: essentially a way to authenticate separately, and then prove you're bearing a valid credential without it being linkable to your identity. Again, we don't have time to go into it, but it's a formal spec as well, and there are lots of open source packages and libraries that let you use it.

The third component is the Secure Enclave. The equivalent in our non-Apple world is the TPM, if you've heard of those. It's a separate piece of hardware where the private keys are kept, with a guarantee that those keys can never be removed from the hardware. That's really important, because all of these interactions prove identity with those keys. If a key could be moved and held by some third party, it couldn't be trusted: you could fake everyone out that you're an official AI engine when you're actually somewhere else. The Secure Enclave prevents that. Again, we won't be getting into those; we'll get into the other two.

The last one is secure boot and a hardened operating system. This is a fairly standard technique: they run a very limited version of iOS that is very difficult to hack or modify, and everything has to be signed. If you've shipped an iOS app, you know about signing; theirs is even stricter.
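Here's the oblivious-HTTP sketch promised above: a toy version of the relay/gateway split. Real OHTTP (RFC 9458) uses HPKE key encapsulation; this stand-in approximates it with an X25519 exchange plus AES-GCM from the Python `cryptography` package, and the request body and labels are made up for illustration.

```python
# Sketch of the oblivious-HTTP split: the relay learns who is asking but
# not what; the gateway learns what is asked but not who. Real OHTTP
# (RFC 9458) uses HPKE; this stand-in uses X25519 + HKDF + AES-GCM.
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

def derive_key(shared_secret: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"ohttp-sketch").derive(shared_secret)

# Gateway publishes a long-lived public key out of band.
gateway_key = X25519PrivateKey.generate()
gateway_pub = gateway_key.public_key()

# --- Client: encapsulate the request under the gateway's key.
eph = X25519PrivateKey.generate()
aead = AESGCM(derive_key(eph.exchange(gateway_pub)))
nonce = os.urandom(12)
request = b"POST /inference {prompt: ...}"   # hypothetical request
capsule = (eph.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw),
           nonce, aead.encrypt(nonce, request, None))

# --- Relay: sees the client's IP and an opaque capsule; just forwards it.
forwarded = capsule

# --- Gateway: decrypts the request; the source it sees is the relay.
eph_pub, nonce, ciphertext = forwarded
shared = gateway_key.exchange(X25519PublicKey.from_public_bytes(eph_pub))
print(AESGCM(derive_key(shared)).decrypt(nonce, ciphertext, None))
```

The thing to notice is the split of knowledge: the relay (Cloudflare, in Apple's deployment) can identify you but reads only ciphertext, while the gateway reads the request but only ever sees the relay's address.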
Okay, so the two we're going to talk about are remote attestation, which is the flow I just described, and the transparency log. The transparency log is a record of all the software Apple deploys on their private nodes, so that you can go verify that what's on the record is actually what's being sent to you during attestation.

So let's talk about remote attestation, briefly and abstractly, not specifically with iPhones. You have some client, and the client asks, "What are you running?" The server replies with two things: a set of signed claims, and a public key. The signed claims essentially say: I'm on genuine hardware, I'm running a genuine GPU, I'm running this set of software, I used this bootloader, I'm running this version of Linux. The client then gets to look at those claims and decide whether it trusts them. It might say, I only trust this version of the Linux kernel and above, or, I only trust things signed by Apple. If it does trust them, it can use the public key that came across to encrypt the data it later sends to the server. And this is the really important part: the public key and the claims are tied together. During later interactions, the client encrypts using the public key, and the server will only be able to decrypt if it still matches those signed claims. There's a whole bunch of cryptography that makes this possible, a bunch of certificate chains and trust in vendors, but that's the fundamental idea. This is what lets you turn that black box into something a little more translucent, right? You're not just throwing data over the wall; you can kind of see what's going on inside.

The second thing is the transparency log, which is actually very simple conceptually. It's just a database with a record for each software release, or each component of a release, signed by a particular person. For example, a record might say: Bob added this binary, this piece of compiled source code, here is the hash of that binary, on November 1st of '24. That's it. It's just a declaration that this binary was signed by Bob. Why does that matter? Why would you care? First, reviewers can go through these binaries, which are made publicly available, and verify their behavior offline. Then when you get a remote attestation that says "this hash of this binary is running," you can say: I've already checked that binary, I believe it does the right thing. Second, you can check that remote attestations match what's in the log. And finally, if you ever see an attestation that's not on the log, you know the whole system has been compromised, because if it's not on the log, someone is definitely up to some shenanigans; they might have hijacked your connection or whatever. This works because only a limited set of people can write to the log, and there's no way to modify it: it's append-only, and it uses a Merkle tree so that the contents can't be changed. Great. So that is the transparency log.
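To make that append-only claim concrete, here's a toy transparency log in Python: each release record is a leaf in a Merkle tree, so any rewrite of history changes the root. The record format is invented for illustration; real logs like Sigstore, Sigsum, and Certificate Transparency use RFC 6962-style trees with inclusion and consistency proofs.

```python
# Toy append-only transparency log: records like "Bob signed binary X,
# hash H, on date D" are leaves of a Merkle tree. Any edit to history
# changes the root, so verifiers who saved an earlier root can detect
# tampering. Illustrative structure only, not Apple's actual log format.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class TransparencyLog:
    def __init__(self):
        self.leaves: list[bytes] = []

    def append(self, record: str) -> int:
        """Add a signed-release record; returns its index."""
        self.leaves.append(h(record.encode()))
        return len(self.leaves) - 1

    def root(self) -> bytes:
        """Compute the Merkle root over all current leaves."""
        level = self.leaves[:]
        while len(level) > 1:
            if len(level) % 2:                 # duplicate last node if odd
                level.append(level[-1])
            level = [h(level[i] + level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]

log = TransparencyLog()
log.append("Bob | binary=pcc-inference | sha256=ab12... | 2024-11-01")
log.append("Alice | binary=pcc-bootloader | sha256=cd34... | 2024-11-02")
saved_root = log.root()

# A verifier who saved `saved_root` detects any later rewrite of history:
log.leaves[0] = h(b"Mallory | binary=backdoored | ...")
assert log.root() != saved_root   # tampering is evident
```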
So let me tell you how this all comes together. Remote attestation is this flow again: the iPhone, through the anonymizer, first requests a remote attestation package, and if it believes that package, meaning it trusts the contents running on the server, it can then send its data. I phrase this step as "try to decrypt the data and run the AI," because if the attestation changed, the AI engine would not be able to decrypt the data. That's the most important part: the server says "I'm running this thing, trust me," the client says "okay, I trust you," and from then on the server can only decrypt as long as it's still running the exact thing the client said it trusted. The second item we talked about is the transparency log: check whether the attested claims match the log. Apple writes a lot of software onto that log and essentially says, trust what's on the log; you can verify it offline, and when attestation claims come in, just double-check that they do indeed match. I don't have time to get into all of the rest, but here's where the other items sit: oblivious HTTP is the anonymizer, blind signatures are the way to do the auth, and the Secure Enclave I've put outside the AI engine, because they're separate pieces of hardware. The hardening is just this little lock on the diagram, and we don't have time to get into it. At a very conceptual level, you could essentially do a PhD on each of these, but that's how Apple's PCC works.
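Here's a compact sketch of that attest-then-encrypt handshake. The claim fields, the JSON encoding, and the use of Ed25519 are my own illustrative choices; the real thing involves hardware-rooted certificate chains, and the session key would be an encryption key bound to the claims rather than a signing key.

```python
# Sketch of attest-then-encrypt: the server presents claims plus a
# session public key, signed together so the key can't be swapped out
# from under the claims. The client releases data only if the claims
# pass its policy. Claim names and formats are illustrative.
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Root of trust: in real systems this is the vendor's certificate chain,
# baked into the client ahead of time.
vendor_key = Ed25519PrivateKey.generate()
vendor_pub = vendor_key.public_key()

# --- Server: build and sign the attestation package.
session_key = Ed25519PrivateKey.generate()    # stand-in for an encryption key
claims = {
    "hardware": "genuine",
    "kernel": "6.8.0",
    "binaries": ["sha256:ab12..."],           # hashes to check against the log
    "session_pub": session_key.public_key()
        .public_bytes(Encoding.Raw, PublicFormat.Raw).hex(),
}
attestation = json.dumps(claims, sort_keys=True).encode()
signature = vendor_key.sign(attestation)

# --- Client: verify the signature, then apply local policy.
def client_accepts(attestation: bytes, signature: bytes) -> bool:
    try:
        vendor_pub.verify(signature, attestation)   # genuine vendor?
    except InvalidSignature:
        return False
    c = json.loads(attestation)
    # Policy: only trust measurements already vetted, e.g. ones checked
    # against the transparency log offline.
    return c["hardware"] == "genuine" and c["binaries"] == ["sha256:ab12..."]

if client_accepts(attestation, signature):
    c = json.loads(attestation)
    print("claims trusted; encrypt the request to", c["session_pub"][:16], "...")
```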
So what are the gaps? What are the downsides? Well, first, you still have to put all of your trust in Apple. On the bright side, Apple runs their whole supply chain, and they verify the nodes when they arrive at their data centers; they actually re-sign them with what are called data center identity keys, DCIKs, or something like that. But there's no guarantee that Apple doesn't share the certs with anyone, or generate them insecurely, or set the private key to 1 everywhere. I think they're making a best effort, but you still have to trust them: you've shifted the trust from the hardware into Apple's behavior. And PCC is only available on Apple devices, for consumer use, in official apps. Maybe at some point they'll make it available to everyone else, but not yet.

What trade-offs does Apple PCC make? They're limited by latency to Apple data centers. They do try local models first, and only send requests to the data center if the local models aren't adequate, but as we start to do real-time voice and other things, this adds a lot of latency to the system. The compute costs are higher: there's a lot more encryption happening. You're not seeing it, but there are something like six layers of encryption before a request even reaches the node that does the work, so you're spending more compute there. And like I said: no custom models, no fine-tuning. The client libraries are very complicated; the client has to orchestrate all of these requests, the transparency log, the auth, and that's way more complicated than a simple HTTP request, which kind of sucks. And what if your iPhone dies after it's authenticated and loses all of its authentication keys? You've essentially lost all of your state, so it's a lot more stateful. It's operationally complex: you can't SSH into the machines, and there's no logging, which is difficult; not everyone would sign up for that. You can't do any usage tracking, because if you could track usage, users could be identified. So Apple can't parcel out, say, 2,000 tokens per user. They do some fraud and abuse detection at a very coarse level, but if you wanted to run a similar architecture and pass your costs on to the customer, you wouldn't be able to know which customer was doing what. And it's not open to third-party developers.

Okay, so what can we learn from this? I gave you the list of six components Apple uses; here's what's available in our world, if you're not developing on Apple silicon and Apple hardware. You still have oblivious HTTP and blind signatures; there are libraries for both. We don't have Secure Enclaves, but we have TPMs: almost all Intel and AMD hardware now has them, and cloud environments have virtual TPMs that provide the same behavior. Again, that's where you put the private keys tied to that public key I talked about. Secure boot and hardened operating systems are available to us. Remote attestation is kind of available; it's tied to the TPM, and there aren't great standards yet, but there is a little bit of work there. For transparency logs, there are two open ones: one is called Sigsum, the other is Sigstore, if you've heard of them. And confidential VMs with GPUs are just becoming available on cloud providers. Confidential computing has been around for a while, but now you also need confidential GPUs, and only H100s and H200s support confidentiality. What that means is that their memory is encrypted: if you physically went up to an H100 and tried to look at its RAM, you wouldn't be able to see or figure out what's going on in there. And finally, what we have that Apple doesn't: open source and reproducible builds. We have the ability to link the source code to the binaries, so security researchers can review the source code as well as black-box test the binaries, and develop confidence in what the server might be running.
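As a last sketch, here's what that reproducible-builds check looks like in practice: hash your locally rebuilt binary and compare it to the hash published in a transparency-log entry. The path and the expected digest below are hypothetical placeholders.

```python
# Sketch of the reproducible-builds check: rebuild the binary from the
# audited source, hash it, and compare against the hash published in
# the transparency log. Path and expected hash are hypothetical.
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

published = "ab12cd34..."                      # hash from the log entry
binary = Path("./build/inference-server")      # your locally rebuilt binary

if not binary.exists():
    print("build the binary first, then rerun this check")
elif sha256_file(binary) == published:
    print("match: the source you audited is what the server runs")
else:
    print("mismatch: the deployed binary is not what the source produces")
```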
All right, what's next? Apple has set the standard for private AI, and the market is definitely following. PCC was announced in June of 2024 and wasn't actually released until October of 2024. Azure AI (sorry, not Azure OpenAI) is doing private inferencing as of September; they're still in private preview. And then about a month ago, Meta, of all companies (I guess I'm recorded), Meta, of all these great companies, also added Private Processing. If you read their blog post, it's like they copy-pasted Apple's; maybe they used Llama to rewrite it into their own language, but it's essentially identical, which is great for all of us thinking about privacy. And I'm sure WhatsApp also doesn't want press releases like the ones I showed earlier.

So I'll just close by saying we're building the same thing, but for everyone else: if you're not on Apple, or you're not in WhatsApp, we have it. It's called Confident Security. And if you'd like to talk more, let me know. By the way, this is an anti-AI shirt, which means that if you take pictures of me, it will confuse all the facial recognition stuff. We have others: if you have some cool questions and want to talk afterward, and I deem them worthy, I'll give you an anti-AI shirt. We also have some other privacy-based swag in the back, so come hit me up. Thanks, everyone.