Government Agents: AI Agents Meet Tough Regulations — Mark Myshatyn, Los Alamos National Lab
Channel: aiDotEngineer
Published at: 2025-12-06
YouTube video id: TnSGx36Ly0Q
Source: https://www.youtube.com/watch?v=TnSGx36Ly0Q
All right, good morning. My name is Mark Myshatyn, and I'm the enterprise AI architect at Los Alamos National Laboratory. Today, at an AI conference, what's a nuclear science lab doing here? The reality is we've actually been doing applied AI/ML for almost 70 years. This is one of our scientists in 1956 playing "Los Alamos chess" in front of one of our first supercomputers, MANIAC I. What's unique about it is that if you look closely, there are no bishops on the chessboard: we've been doing applied statistics and applied machine learning since before we had the memory to hold an entire chessboard in a computer at once. And it's fascinating to say that back when this photo was taken, just after the Manhattan Project, we were pushing the edge of developing the Monte Carlo methods we still use today. So for us, AI didn't come as a complete surprise, but the opportunity that's come with agents, with the things we can now do, has been incredible even to those of us who have been along for the ride for quite some time.

Let's see if I can make this full screen. What we have going on here is a demonstration (you can find it on our YouTube channel if you can't see it on the screen) of how we've looked at generative AI not only from a strict model standpoint but also from an agentic standpoint, as a way for us to move science faster. Like much of the federal government, we're under a squeeze to do better, faster, cheaper, and more to protect our country. And in this case, going from not just what a model knows to what we can let a model know was really the change. We started with a problem: design an ICF, an inertial confinement fusion capsule, for our sister lab at Livermore across the bay here. And we said, "Read a paper. 
Go read lots of papers that you think are tangential to this first paper, and then come up with a design for a fusion capsule." It created a hypothesis. And the thing that's uniquely ours is that this isn't a generic chatbot that spits back a bunch of code. What you'll see here in a second is that we're actually executing that code on our high-performance computing assets. We're running thermodynamic and hydrodynamic tests on these kinds of problems, where our model isn't just an LLM; it's backed by the 50- or 60-plus years of math and science we've built for the management and stewardship of our nuclear stockpile, with those tools now brought into an agentic era. So we're looking at this as a chance for agents to move faster, and for science to move faster, because the risk is starting to move faster at the same time. You can see it actually did come up with a design that it thought optimized yield; we were simulating a slice through an ICF capsule.

Okay, that's one nice toy problem. What does it mean for the other 20,000 researchers we have at our laboratory? For those of you not familiar, we're 40 square miles of labs, test sites, and test plants, with 13 nuclear facilities. We're huge, and we have a huge breadth of what we're trying to accomplish with AI to get our mission moving faster. For our national security AI office, what you just saw is the first thing we're charged with: push the science of AI faster. Don't just sit there and consume commercial or open-source tools; we write our own stuff, our own models. We also realize we can't do everything. We don't have the hubris to stand here and say we understand everything and don't need anyone's help. We absolutely need those partnerships with commercial industry and academia. 
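The paper-to-simulation workflow described above (an agent reads literature, proposes a hypothesis, generates code, runs it on computing resources, and keeps the best design) can be sketched as a simple loop. Everything here is a hypothetical illustration: the function names, parameters, and toy "yield" objective are stand-ins, not the actual Los Alamos system.

```python
# Hypothetical sketch of a hypothesize-simulate-evaluate agent loop.
# propose_design stands in for an LLM call; run_simulation stands in for
# submitting a physics job to an HPC scheduler. The objective is a toy.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Hypothesis:
    description: str
    parameters: dict

@dataclass
class AgentState:
    best_yield: float = 0.0
    best_design: Optional[Hypothesis] = None
    history: list = field(default_factory=list)

def propose_design(literature: list, state: AgentState) -> Hypothesis:
    """Stand-in for an LLM that reads papers and proposes capsule parameters."""
    n = len(state.history)
    return Hypothesis(
        description=f"candidate design #{n}",
        parameters={"shell_thickness_um": 100 - 5 * n, "fill_pressure_atm": 10 + n},
    )

def run_simulation(design: Hypothesis) -> float:
    """Stand-in for an HPC hydrodynamics run that returns a simulated yield."""
    p = design.parameters
    return p["fill_pressure_atm"] / p["shell_thickness_um"]  # toy objective

def design_loop(literature: list, iterations: int = 5) -> AgentState:
    """Propose, simulate, and keep the best-scoring design."""
    state = AgentState()
    for _ in range(iterations):
        design = propose_design(literature, state)
        simulated_yield = run_simulation(design)
        state.history.append((design, simulated_yield))
        if simulated_yield > state.best_yield:
            state.best_yield, state.best_design = simulated_yield, design
    return state
```

The point of the sketch is the separation of concerns: the model proposes, trusted simulation codes evaluate, and the loop records every attempt so a reviewer can trace how a design was reached.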
And then, just like the rest of you here, we're looking at how to bring AI and gen AI tools into our workflows. We have a huge footprint: we have to do payroll, procurement, cybersecurity. So our office is in there asking, how do we do that? It really comes down to some of what we're doing with our partners. We have some great academic partners. By the time these slides were released for public review we couldn't get the screenshot in, but we also announced a partnership with the UC family of schools on the academic side of developing the future of AI. And we're working with all the frontier labs. Here are a couple of press releases where we've done chem-bio safety work with OpenAI, and we've been able to acknowledge that work publicly. We're a safe place to do dangerous things, and we've been doing that for decades. So it's a neat partnership: these frontier labs, which can hire anyone they want, still come to us as a source of data and a source of partnership.

In the middle of that last picture is the science of AI in the hardware space. That's our Venado supercomputer: over 2,500 nodes of Grace Hopper superchips, built through a partnership with NVIDIA and HPE to help us push the boundaries of what it means to do AI research. More recently, we've also brought OpenAI's models onto this system, brought it up to our classified networks, and we're getting to work on the really hard problems that are unique to our data and our mission space. When we talk about agents, partnerships take trust; certainly having labs trust you with early access to their models or model weights. 
As we talk about sharing responsibility with our partners, the responsibility for what our AI tools and services do starts to matter. The previous administration had certain executive orders out; those were largely replaced in January when the new administration took over. But these OMB memoranda just came out in April: M-25-21 and M-25-22. They start to codify what the US government should worry about when fielding AI systems. They tell the government to go faster; that's important. But they also say these government workloads have real-world impacts. We are not a t-shirt company: if our data gets out, geopolitical challenges show up, kinetic challenges show up. People can die if we do this wrong. I won't bore you; it's about 25 pages, reasonably well written for an OMB memorandum as far as readability and comprehensiveness, but it says we as the US government need to move faster in bringing this into everything we do. It's not enough to buy your favorite office add-in tool and say we can type PowerPoints faster or summarize our emails faster. We have to go deeper into our mission, and that comes with trust.

So, who here is part of a software-as-a-service company or startup? Okay, a handful of hands. Then you've probably seen something like this, especially if you've been in the cloud space recently: as we as customers start to trust you with our data, your responsibility also goes up. That's easy for our open, public, unrestricted data, like the open-science work I showed with our ICF capsule agent. But it gets harder as we get into controlled and classified data in the DOE space, into restricted and formerly restricted data, where the physics of how nuclear weapons work doesn't expire. That will forever be classified; it's born classified and stays classified. 
It takes an element of trust in you all as our builders and providers. And this is where some of the most interesting and frustrating conversations happen with companies trying to sell us tools and services: "Great, you have your SOC 2 report. I have NIST 800-53." This is actually rev 4; it's over 1,000 different security controls and enhancements, and the US government has put a lot of legislation in place to do traditional cybersecurity work. FedRAMP certainly tried to make this easier by coming in and saying your 200, 300, 400 security controls have been vetted by a third-party assessor and you have some continuous monitoring. Has anyone here been downstream of the FedRAMP process? Yeah, I see a couple of smiles, so you know how much of a pain it has been. And much like everything else in the government right now, it's changing. There's a new FedRAMP program out there saying: if we're going to trust you with our data, if we're going to trust you with the outcomes of our agents, you have to start thinking about your continuous monitoring, your continuous security posture.

If you work with the DoD, it gets even harder. The DoD has its Cloud Computing Security Requirements Guide, the CC SRG, which addresses what it means to touch each level of data. It takes the three FedRAMP baselines, layers on two more of what the DoD calls impact levels, and says this is how you're going to access that service if you have PII, mission data, operational data, or finance data. And then they add another book, CNSSI 1253, on top of that. So if you're looking at this and saying it's a lot of governance: it is. But the fun part is that right now, coming out of those April 3rd memoranda, AI use cases and AI governance are still on the drawing board. 
We are in the 180-day rulemaking period those OMB memoranda put out, in which agencies have to develop their strategies and plans for AI implementations. How do you govern pilots? What's considered high-risk and low-risk in your context? There's some prescriptive guidance out there; NIST released its AI Risk Management Framework back in 2023.

[PA announcement: "Breakout sessions will begin in five minutes."] [laughter] Your choice.

But the fun part is you can develop the future with your customers right now. This is a clean sheet of paper from a technology perspective, which we largely haven't had before, and it's fun in a US government space to say we can invent part of the future together with commercial industry, and hopefully make better, less obnoxious, less obstructive decisions so we can keep moving the mission faster. If it sounds like a lot of lawyers and paperwork, it probably is; there's no getting around the fact that some of these records and artifacts have to exist. But the reason you'd want to collaborate with us is that we're doing things that are either incredibly hard or can't be done in commercial industry. At Los Alamos, we're sitting on petabytes of data that has never seen the internet and will never see the internet. We have subject matter expertise in chem, bio, materials physics, composites, certainly cybersecurity, and the design of high-performance computing, through some of the partnerships I mentioned earlier. And they can be your partnerships too. We firmly believe that if we're talking about taking care of the country and our national competitive advantage, it's not just a bunch of scientists sitting on a mountainside in Los Alamos who are going to figure that out. We really do want your help and your engagement to push the boundaries of what we know. 
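The "how do you govern pilots, what's high-risk versus low-risk" question above is essentially a tiering exercise over a use-case inventory. Here is a hypothetical sketch of such a registry; the tiering criteria and example use cases are my own simplification for illustration, not the actual rules in OMB M-25-21 or the NIST AI Risk Management Framework.

```python
# Hypothetical AI use-case registry with illustrative risk tiering.
# Criteria and examples are invented, not agency policy.
from dataclasses import dataclass

@dataclass
class UseCase:
    name: str
    data_sensitivity: str        # e.g. "public", "controlled", "classified"
    affects_safety_or_rights: bool

def risk_tier(uc: UseCase) -> str:
    """Classify a pilot for governance review (illustrative rules only)."""
    if uc.affects_safety_or_rights or uc.data_sensitivity == "classified":
        return "high-risk: full review, human oversight, continuous monitoring"
    if uc.data_sensitivity == "controlled":
        return "moderate-risk: documented review before deployment"
    return "low-risk: lightweight pilot approval"

registry = [
    UseCase("email summarization", "public", False),
    UseCase("procurement document triage", "controlled", False),
    UseCase("safety-critical facility analysis", "classified", True),
]
```

The value of even a toy registry like this is that every pilot gets an explicit, recorded tier, which is exactly the kind of artifact the rulemaking period is asking agencies to produce.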
This was originally meant to be an architecture talk, so I'll finish with an architecture slide. If you're interested in bringing agentic tools and services to the federal government, there are really four things to think about. First, we want to see that you've built for explainability. Our keynote this morning touched on that: how did you get to that decision? If something goes wrong, if we have a bad day, we don't have shareholders we're responsible to; we're responsible to US citizens, and to whatever outcome caused some press briefing. We need to be able to trust our agents the same way we trust our staff.

Second, when we talk about fielding things: again, we are not a t-shirt company. Building for isolation matters. I was looking forward to seeing Microsoft's demo of the self-hosted AI Foundry pieces, but we do that anyway. We look to and lean heavily on open-source tools, services, and models for some of this work because we can't get it from a hyperscaler cloud provider. So as you're building your tools and services, take a look at your cloud vendor's services-in-scope pages. Even if you're a SaaS startup, if you can build in a DoD Impact Level 5 environment with the limited number of services available there, you can deploy anywhere. You have the least common denominator of that entire tech stack; if you can deploy your tool or application there, that makes our job easier and makes you more portable. And along with that comes the third thing: build for governance. 
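The "least common denominator" idea (if your application runs on only the services available in the most restricted environment, it can deploy anywhere) can be made concrete as a simple manifest check. The service names and the allowlist below are invented placeholders, not an actual DoD IL5 services-in-scope list.

```python
# Hypothetical portability check: which services does an app depend on
# that a restricted (e.g. IL5-style) region does not offer?
# Both the allowlist and the manifest are illustrative placeholders.
ALLOWED_SERVICES = {"compute", "object-storage", "managed-postgres", "key-vault"}

manifest = {
    "app": "example-agent-service",
    "services": ["compute", "object-storage", "serverless-functions"],
}

def portability_gaps(manifest: dict, allowed: set) -> list:
    """Return the services the app needs that the restricted region lacks."""
    return sorted(s for s in manifest["services"] if s not in allowed)
```

Running a check like this in CI against the most restrictive target keeps a dependency on a region-unavailable service from sneaking in late in a procurement.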
We also have some awkward conversations with vendors where we say, "We need a software bill of materials as part of this procurement," and [laughter] people look at us like, "I guess we can dump whatever was in our build script." It's a bit of an awkward conversation, but it's required per our regulations. AI is moving a mile a minute; traditional cybersecurity is moving faster than it used to, but it's not quite there yet. So if you can plan to have those conversations, about how you handled open-source dependencies, what your patching plans are, and how to help us fill out this paperwork when we buy from you as software-as-a-service or platform-as-a-service, that makes the entire partnership that much faster and friendlier.

And lastly, keep up the speed. We've also had awkward conversations with some service providers asking why their federal offering is a year out of date. Why is service parity a year, three years, five years behind when you launched it in a commercial region? That's not just us having bravado that, oh, we're the government, we're a quasi-federal agency, we care about our data. No, this is rooted in export compliance law; there are things we can't buy from you unless you're in the right places. So if you can design for speed in those hard corners, that optimizes your chances of fielding your tools and services in the different places we have to operate to meet our mission. And with that: Los Alamos was founded on the idea that the right application of math and science can change the world overnight. We've done that. We're no stranger to how it feels to show up and find the world is now different. That's what we were founded to do. 
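For the software-bill-of-materials conversation above, a minimal machine-readable answer is often a CycloneDX-style JSON document. The sketch below hand-builds one for illustration only; in practice an SBOM should be generated by tooling from the actual build, and the single component listed here is a placeholder.

```python
# Minimal CycloneDX-style SBOM, hand-built for illustration.
# Real SBOMs come from build tooling; the component below is a placeholder.
import json

sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "library",
            "name": "requests",
            "version": "2.32.3",
            "purl": "pkg:pypi/requests@2.32.3",  # package URL identifying the dependency
        },
    ],
}

sbom_json = json.dumps(sbom, indent=2)
```

Even this skeleton answers the procurement questions the talk mentions: what open-source dependencies are in the product, at what versions, and in a format the buyer's tooling can diff against future patched releases.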
And when we look at AI, agentic tools, what we can do with frontier models, any of the above, we see it as both the greatest opportunity and the greatest threat to national security, but the opportunity is what keeps us showing up. We're not scared off by the downside risk; we have to be here to help develop the future. One of my favorite anecdotes: because we are a nuclear science lab, we do a lot of nuclear nonproliferation work, and because we do that type of work, we've gotten really good at specialty sensors. And what have we been able to do with that? We have a laser strapped to a car on Mars zapping rocks: we built the ChemCam sensor. So even if you're a little on the fence about engaging with the US nuclear enterprise, there's other fundamental science we do that pushes the boundaries of what we as a human species know, can do, and can grow into. So with that, thank you so much for your time today. I really appreciate it, and I'll be available on the side for questions. Thank you.