The Unofficial Guide to Apple’s Private Cloud Compute - Jmo, CONFSEC
Channel: aiDotEngineer
Published at: 2025-07-30
YouTube video id: CCsWZ5bJlO8
Source: https://www.youtube.com/watch?v=CCsWZ5bJlO8
So, we're going to talk about Apple's Private Cloud Compute. This is an unofficial guide; I don't work at Apple, and we'll get to that in a sec. Quick background: my PhD is in data science and biomedical informatics. I've sold two companies, one in AI and data, one in cybersecurity and infrastructure. I'm at South Park Commons, and I'm building a company called Confident Security, which we'll get to at the end. But again, disclaimer: I'm not an Apple employee, and I'm not speaking on their behalf. Everything I've gleaned is from public sources. Hopefully what we'll learn today is some tools that we can use ourselves. There are really six key components, and some approaches to ensure privacy. Privacy and security are very related, not perfectly overlapping, but related.

Before I get there, I know we're in the security track, but I want to motivate why you might care about privacy, because not everyone believes you should have privacy. Let me give some examples. This year, DeepSeek leaked over a million sensitive records of chat logs. And you might say to yourself, well, that's DeepSeek, everyone knew that was going to happen. But before I show you the next piece, I want to poll the audience. How many here care about privacy? All right. How many of us use ChatGPT? All right. How many consider ChatGPT to be private? Okay, good. How many use ChatGPT's private mode? And then how many use the API with zero data retention, kind of the default state? Okay, great. Well, as of yesterday, OpenAI has to retain everything anyway, whether or not you flagged it as private. I don't want to get into why they have to do that, but the point is that they have the capability of retaining your private chats, including ones you flagged as private in the UI, among other things. Obviously, being forced to do that is not great, but this is why we should all care about privacy. Apple doesn't want headlines like these, because privacy is one of their major value props.

So let's talk about the problems Apple solved and then how we might use that work. Fundamentally, AI requires more compute than a phone has, but obviously Apple wants to bundle AI into their phones, and privacy is a major selling point. Any time you give private data to something remote, you're inherently reducing your privacy, right? Any time I give you my data, it's not as private as it was the second before. So the question Apple is trying to answer with their PCC system, which is now on all of our iPhones and used for inference, is: how do you get remote compute while remaining private? The simple way would be to buy an H100 for every iPhone and pair them in the cloud; boom, you get your own H100. But obviously AI is even more expensive than that, so that's not going to work. So the real question is: how do you get remote compute while remaining private and cheap? Otherwise it doesn't work.

So I'm going to frame the problem this way. You've got an iPhone, and you've got an untrusted remote server. You can't see inside of it; it's a black box. Once you give it your data, you have no idea what happens inside. What Apple does is try to make it not a black box, so that the iPhone has some control over what happens to the data inside Apple's remote servers, and hopefully that trusted remote service is also hard to hack.
So, for the remainder of the talk, and we're not going to be able to get into all of it in 16 minutes, we'll go through the PCC requirements Apple set up, and I'll review a conceptual architecture for how they meet those requirements. Then we'll go into two specific components of the six, because I don't have time for all six and you'd be bored by that point. And then we'll talk about some pros and cons of Apple's approach, and how we might use some of those components ourselves.

There are five key requirements Apple's Private Cloud Compute is trying to meet. The first is stateless computation. This is essentially the guarantee that when Apple receives your data, it's only used to satisfy the request; it's impossible to use it for anything else. You can't log it, anything like that. The second is enforceable guarantees: the notion that everything is enforced with code, not by some sort of policy. Not "I shouldn't SSH to the instance, but I could." No: there is no SSH on the instance, so you can't SSH to it. You don't want to save things? Well, don't have a disk, right? These are what they call enforceable guarantees, not just policies. The third requirement is non-targetability. That means that if you wanted to hack my data on PCC, you'd have to target everyone and sift through all of it, rather than having some easy way to find just me. The fourth is no privileged runtime access, which I briefly touched on earlier: there's no way to bypass these restrictions in production. And the final one, the most important one, is verifiable transparency, which essentially says: we can prove that all of the above are true.

Great. So here's a slightly bigger representation of the black box. In a classic remote system, you have some sort of auth service and then an AI engine, and on this AI engine there's some SSH user who can access it and some disk it can write to. Again, the iPhone doesn't know what it's sending its data to. So let's see how we can change this, at a conceptual level, to get to some of those requirements.

The first thing Apple does is add an anonymizer, and this anonymizer is the first of two parts of non-targetability. Ideally, Apple can't tell who the data is coming from, so it would be harder for an attacker to fish out my particular set of data. But if you're astute and looking at this, there's still auth behind the anonymizer: the iPhone provides some sort of auth credentials, and those credentials are obviously tied to the user. So the second thing Apple does is separate auth. Conceptually, think of it like going to the arcade. You want to spend your money on the arcade machines, so you first put your money into the coin machine and get some coins out. Those coins are anonymous; now you can go play, and no one knows which machines you spent your money on. That's essentially what happens here. It's called blind signatures. We won't have time to get deep into it today, but that's the idea. So now the iPhone is making an anonymous request through an anonymizer that's mixing everyone's traffic. It's kind of like Tor: it's laundering everyone's data, so that if someone were to access the system internally, they wouldn't know who anything is coming from. That gets us non-targetability.
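Since we won't get to the math on stage, here's a minimal sketch of that coin-machine trick in Python, using textbook RSA. The token string and the raw, unpadded RSA are illustrative only, not Apple's actual protocol; the formal spec alluded to above is the IETF RSA Blind Signatures scheme (RFC 9474), which adds proper padding.

```python
# Toy RSA blind signature, the "coin machine" trick: the issuer signs a
# token without ever seeing which token it is, so later use of the token
# can't be linked back to the user who obtained it.
# Textbook RSA with no padding -- illustrative only, NOT production crypto.
import hashlib
import secrets

from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pub = key.public_key().public_numbers()
n, e = pub.n, pub.e
d = key.private_numbers().d

# Client: hash the token, then blind it with a random factor r.
token = b"anonymous-credential-123"        # hypothetical token
m = int.from_bytes(hashlib.sha256(token).digest(), "big")
r = secrets.randbelow(n - 2) + 2           # blinding factor
blinded = (m * pow(r, e, n)) % n           # issuer cannot recover m

# Issuer (the coin machine): checks the user's normal auth at this step,
# then signs the blinded value. It learns nothing about the token itself.
blind_sig = pow(blinded, d, n)

# Client: unblind. The result is a valid RSA signature on m, because
# (m * r^e)^d = m^d * r, and multiplying by r^-1 leaves m^d.
sig = (blind_sig * pow(r, -1, n)) % n

# Any verifier (the arcade machine) can now check the signature without
# being able to link it to the issuance request.
assert pow(sig, e, n) == m
print("token is valid and unlinkable")
```

The design point is the separation of roles: the issuer knows who you are but not which coin it stamped; the verifier knows the coin is genuine but not whose it is.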
The next thing Apple does is change the set of requests that are happening. Before the iPhone sends its data, it asks: what are you running? If the AI engine replies with "I'm running this and only this," the iPhone might say, okay, I trust that; and as long as that remains true, you can run this AI on the data I'm submitting. This is how they achieve verifiable transparency. There's a little more subtlety to it, which we will get into, but essentially the iPhone says: I trust the code you're running, and you can only decrypt my data if you're still running that code. So the iPhone can verify what the server is doing. Next, no privileged runtime access: that one was easy. Just get rid of sshd; make there be no way of accessing those machines. Enforceable guarantees: get rid of the disk, like we talked about. And stateless computation: with no disk and no access, there's nothing to do with the data other than respond to the iPhone. And since the iPhone verified the code running on the machine, it knows the data isn't being logged anywhere before it hands the data over.

Okay, so they achieve those five guarantees using six technical components. Again, we're going to go deep on two of them, but I'll describe them all briefly.

Oblivious HTTP is a spec developed by Cloudflare, Apple, and others that lets you make anonymous requests by laundering them through a third party (there's a sketch of the idea just after this list of components). All requests to Apple's Private Cloud Compute first go through Cloudflare, so when Apple receives one, it only knows it came from Cloudflare, not from an individual user's IP address.

The second component is blind signatures. That's the arcade analogy I gave you: essentially a way to authenticate separately, and then prove you're bearing a valid credential without it being linkable to your identity. Again, we don't have time to go into it, but it's a formal spec as well, and there are lots of open source packages and libraries that let you use it.

The third component is the Secure Enclave. The equivalent in our non-Apple world is the TPM, if you've heard of those. It's a separate piece of hardware where the private keys are kept, with a guarantee that those keys can never be removed from the hardware. That's really important, because all of these interactions prove identity with those keys. If a key could be moved and held by some third party, it couldn't be trusted: you could fake everyone out that you're an official AI engine when you're actually somewhere else. The Secure Enclave prevents that. Again, we won't be getting into those; we'll get into the other two.

The last one is secure boot and a hardened operating system. This is a fairly standard technique: they run a very limited version of iOS that is very difficult to hack or modify, and everything has to be signed. If you've shipped an iOS app, you know about signing; theirs is even stricter.
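Here's the oblivious-HTTP sketch promised above: a toy version of the relay/gateway split. Real OHTTP (RFC 9458) uses HPKE key encapsulation; this stand-in approximates it with an X25519 exchange plus AES-GCM from the Python `cryptography` package, and the request body and labels are made up for illustration.

```python
# Sketch of the oblivious-HTTP split: the relay learns who is asking but
# not what; the gateway learns what is asked but not who. Real OHTTP
# (RFC 9458) uses HPKE; this stand-in uses X25519 + HKDF + AES-GCM.
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

def derive_key(shared_secret: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"ohttp-sketch").derive(shared_secret)

# Gateway publishes a long-lived public key out of band.
gateway_key = X25519PrivateKey.generate()
gateway_pub = gateway_key.public_key()

# --- Client: encapsulate the request under the gateway's key.
eph = X25519PrivateKey.generate()
aead = AESGCM(derive_key(eph.exchange(gateway_pub)))
nonce = os.urandom(12)
request = b"POST /inference {prompt: ...}"   # hypothetical request
capsule = (eph.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw),
           nonce, aead.encrypt(nonce, request, None))

# --- Relay: sees the client's IP and an opaque capsule; just forwards it.
forwarded = capsule

# --- Gateway: decrypts the request; the source it sees is the relay.
eph_pub, nonce, ciphertext = forwarded
shared = gateway_key.exchange(X25519PublicKey.from_public_bytes(eph_pub))
print(AESGCM(derive_key(shared)).decrypt(nonce, ciphertext, None))
```

The thing to notice is the split of knowledge: the relay (Cloudflare, in Apple's deployment) can identify you but reads only ciphertext, while the gateway reads the request but only ever sees the relay's address.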
Okay, so the two we're going to talk about are remote attestation, which is the flow I just described, and the transparency log. The transparency log is a record of all the software Apple deploys on their private nodes, so that you can go verify that what's on the record is actually what's being sent to you during attestation.

So let's talk about remote attestation, briefly and abstractly, not specifically with iPhones. You have some client, and the client asks, "What are you running?" The server replies with two things: a set of signed claims, and a public key. The signed claims essentially say: I'm on genuine hardware, I'm running a genuine GPU, I'm running this set of software, I used this bootloader, I'm running this version of Linux. The client then gets to look at those claims and decide whether it trusts them. It might say, I only trust this version of the Linux kernel and above, or, I only trust things signed by Apple. If it does trust them, it can use the public key that came across to encrypt the data it later sends to the server. And this is the really important part: the public key and the claims are tied together. During later interactions, the client encrypts using the public key, and the server will only be able to decrypt if it still matches those signed claims. There's a whole bunch of cryptography that makes this possible, a bunch of certificate chains and trust in vendors, but that's the fundamental idea. This is what lets you turn that black box into something a little more translucent, right? You're not just throwing data over the wall; you can kind of see what's going on inside.

The second thing is the transparency log, which is actually very simple conceptually. It's just a database with a record for each software release, or each component of a release, signed by a particular person. For example, a record might say: Bob added this binary, this piece of compiled source code, here is the hash of that binary, on November 1st of '24. That's it. It's just a declaration that this binary was signed by Bob. Why does that matter? Why would you care? First, reviewers can go through these binaries, which are made publicly available, and verify their behavior offline. Then when you get a remote attestation that says "this hash of this binary is running," you can say: I've already checked that binary, I believe it does the right thing. Second, you can check that remote attestations match what's in the log. And finally, if you ever see an attestation that's not on the log, you know the whole system has been compromised, because if it's not on the log, someone is definitely up to some shenanigans; they might have hijacked your connection or whatever. This works because only a limited set of people can write to the log, and there's no way to modify it: it's append-only, and it uses a Merkle tree so that the contents can't be changed. Great. So that is the transparency log.
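To make that append-only claim concrete, here's a toy transparency log in Python: each release record is a leaf in a Merkle tree, so any rewrite of history changes the root. The record format is invented for illustration; real logs like Sigstore, Sigsum, and Certificate Transparency use RFC 6962-style trees with inclusion and consistency proofs.

```python
# Toy append-only transparency log: records like "Bob signed binary X,
# hash H, on date D" are leaves of a Merkle tree. Any edit to history
# changes the root, so verifiers who saved an earlier root can detect
# tampering. Illustrative structure only, not Apple's actual log format.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class TransparencyLog:
    def __init__(self):
        self.leaves: list[bytes] = []

    def append(self, record: str) -> int:
        """Add a signed-release record; returns its index."""
        self.leaves.append(h(record.encode()))
        return len(self.leaves) - 1

    def root(self) -> bytes:
        """Compute the Merkle root over all current leaves."""
        level = self.leaves[:]
        while len(level) > 1:
            if len(level) % 2:                 # duplicate last node if odd
                level.append(level[-1])
            level = [h(level[i] + level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]

log = TransparencyLog()
log.append("Bob | binary=pcc-inference | sha256=ab12... | 2024-11-01")
log.append("Alice | binary=pcc-bootloader | sha256=cd34... | 2024-11-02")
saved_root = log.root()

# A verifier who saved `saved_root` detects any later rewrite of history:
log.leaves[0] = h(b"Mallory | binary=backdoored | ...")
assert log.root() != saved_root   # tampering is evident
```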
So let me tell you how this all comes together. Remote attestation is this flow again: the iPhone, through the anonymizer, first requests a remote attestation package, and if it believes that package, meaning it trusts the contents running on the server, it can then send its data. I phrase this step as "try to decrypt the data and run the AI," because if the attestation changed, the AI engine would not be able to decrypt the data. That's the most important part: the server says "I'm running this thing, trust me," the client says "okay, I trust you," and from then on the server can only decrypt as long as it's still running the exact thing the client said it trusted. The second item we talked about is the transparency log: check whether the attested claims match the log. Apple writes a lot of software onto that log and essentially says, trust what's on the log; you can verify it offline, and when attestation claims come in, just double-check that they do indeed match. I don't have time to get into all of the rest, but here's where the other items sit: oblivious HTTP is the anonymizer, blind signatures are the way to do the auth, and the Secure Enclave I've put outside the AI engine, because they're separate pieces of hardware. The hardening is just this little lock on the diagram, and we don't have time to get into it. At a very conceptual level, you could essentially do a PhD on each of these, but that's how Apple's PCC works.
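Here's a compact sketch of that attest-then-encrypt handshake. The claim fields, the JSON encoding, and the use of Ed25519 are my own illustrative choices; the real thing involves hardware-rooted certificate chains, and the session key would be an encryption key bound to the claims rather than a signing key.

```python
# Sketch of attest-then-encrypt: the server presents claims plus a
# session public key, signed together so the key can't be swapped out
# from under the claims. The client releases data only if the claims
# pass its policy. Claim names and formats are illustrative.
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Root of trust: in real systems this is the vendor's certificate chain,
# baked into the client ahead of time.
vendor_key = Ed25519PrivateKey.generate()
vendor_pub = vendor_key.public_key()

# --- Server: build and sign the attestation package.
session_key = Ed25519PrivateKey.generate()    # stand-in for an encryption key
claims = {
    "hardware": "genuine",
    "kernel": "6.8.0",
    "binaries": ["sha256:ab12..."],           # hashes to check against the log
    "session_pub": session_key.public_key()
        .public_bytes(Encoding.Raw, PublicFormat.Raw).hex(),
}
attestation = json.dumps(claims, sort_keys=True).encode()
signature = vendor_key.sign(attestation)

# --- Client: verify the signature, then apply local policy.
def client_accepts(attestation: bytes, signature: bytes) -> bool:
    try:
        vendor_pub.verify(signature, attestation)   # genuine vendor?
    except InvalidSignature:
        return False
    c = json.loads(attestation)
    # Policy: only trust measurements already vetted, e.g. ones checked
    # against the transparency log offline.
    return c["hardware"] == "genuine" and c["binaries"] == ["sha256:ab12..."]

if client_accepts(attestation, signature):
    c = json.loads(attestation)
    print("claims trusted; encrypt the request to", c["session_pub"][:16], "...")
```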
So what are the gaps? What are the downsides? Well, first, you still have to put all of your trust in Apple. On the bright side, Apple runs their whole supply chain, and they verify the nodes when they arrive at their data centers; they actually re-sign them with what are called data center identity keys, DCIKs, or something like that. But there's no guarantee that Apple doesn't share the certs with anyone, or generate them insecurely, or set the private key to 1 everywhere. I think they're making a best effort, but you still have to trust them: you've shifted the trust from the hardware into Apple's behavior. And PCC is only available on Apple devices, for consumer use, in official apps. Maybe at some point they'll make it available to everyone else, but not yet.

What trade-offs does Apple PCC make? They're limited by latency to Apple data centers. They do try local models first, and only send requests to the data center if the local models aren't adequate, but as we start to do real-time voice and other things, this adds a lot of latency to the system. The compute costs are higher: there's a lot more encryption happening. You're not seeing it, but there are something like six layers of encryption before a request even reaches the node that does the work, so you're spending more compute there. And like I said: no custom models, no fine-tuning. The client libraries are very complicated; the client has to orchestrate all of these requests, the transparency log, the auth, and that's way more complicated than a simple HTTP request, which kind of sucks. And what if your iPhone dies after it's authenticated and loses all of its authentication keys? You've essentially lost all of your state, so it's a lot more stateful. It's operationally complex: you can't SSH into the machines, and there's no logging, which is difficult; not everyone would sign up for that. You can't do any usage tracking, because if you could track usage, users could be identified. So Apple can't parcel out, say, 2,000 tokens per user. They do some fraud and abuse detection at a very coarse level, but if you wanted to run a similar architecture and pass your costs on to the customer, you wouldn't be able to know which customer was doing what. And it's not open to third-party developers.

Okay, so what can we learn from this? I gave you the list of six components Apple uses; here's what's available in our world, if you're not developing on Apple silicon and Apple hardware. You still have oblivious HTTP and blind signatures; there are libraries for both. We don't have Secure Enclaves, but we have TPMs: almost all Intel and AMD hardware now has them, and cloud environments have virtual TPMs that provide the same behavior. Again, that's where you put the private keys tied to that public key I talked about. Secure boot and hardened operating systems are available to us. Remote attestation is kind of available; it's tied to the TPM, and there aren't great standards yet, but there is a little bit of work there. For transparency logs, there are two open ones: one is called Sigsum, the other is Sigstore, if you've heard of them. And confidential VMs with GPUs are just becoming available on cloud providers. Confidential computing has been around for a while, but now you also need confidential GPUs, and only H100s and H200s support confidentiality. What that means is that their memory is encrypted: if you physically went up to an H100 and tried to look at its RAM, you wouldn't be able to see or figure out what's going on in there. And finally, what we have that Apple doesn't: open source and reproducible builds. We have the ability to link the source code to the binaries, so security researchers can review the source code as well as black-box test the binaries, and develop confidence in what the server might be running.
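As a last sketch, here's what that reproducible-builds check looks like in practice: hash your locally rebuilt binary and compare it to the hash published in a transparency-log entry. The path and the expected digest below are hypothetical placeholders.

```python
# Sketch of the reproducible-builds check: rebuild the binary from the
# audited source, hash it, and compare against the hash published in
# the transparency log. Path and expected hash are hypothetical.
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

published = "ab12cd34..."                      # hash from the log entry
binary = Path("./build/inference-server")      # your locally rebuilt binary

if not binary.exists():
    print("build the binary first, then rerun this check")
elif sha256_file(binary) == published:
    print("match: the source you audited is what the server runs")
else:
    print("mismatch: the deployed binary is not what the source produces")
```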
All right, what's next? Apple has set the standard for private AI, and the market is definitely following. PCC was announced in June of 2024 and wasn't actually released until October of 2024. Azure AI (sorry, not Azure OpenAI) is doing private inferencing as of September; they're still in private preview. And then about a month ago, Meta, of all companies (I guess I'm recorded), Meta, of all these great companies, also added Private Processing. If you read their blog post, it's like they copy-pasted Apple's; maybe they used Llama to rewrite it into their own language, but it's essentially identical, which is great for all of us thinking about privacy. And I'm sure WhatsApp also doesn't want press releases like the ones I showed earlier.

So I'll just close by saying we're building the same thing, but for everyone else: if you're not on Apple, or you're not in WhatsApp, we have it. It's called Confident Security. And if you'd like to talk more, let me know. By the way, this is an anti-AI shirt, which means that if you take pictures of me, it will confuse all the facial recognition stuff. We have others: if you have some cool questions and want to talk afterward, and I deem them worthy, I'll give you an anti-AI shirt. We also have some other privacy-based swag in the back, so come hit me up. Thanks, everyone.