Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger
Channel: aiDotEngineer
Published at: 2025-07-27
YouTube video id: Fzb1a24hF-o
Source: https://www.youtube.com/watch?v=Fzb1a24hF-o
[Music] Okay, we're going to kick this off. We're still sorting out some internet options, but in the meantime I'll give us our little intro. First of all, I'm Kyle, this is Jeremy, and we're from Dagger. You'll see more about what Dagger is through this workshop, where we're going to build a sweet agent and actually deploy it to GitHub. So even in the worst-case scenario, if we can't get things running locally, when we push things to GitHub and see agents running on GitHub, that's out of our internet's hands and it's all going to be really cool. So, first of all, on the left side here is where we're getting started from. This is the documentation site, where we have the install instructions, so I'll walk through those real quick, and then the quickstarts, which we're going to walk through as the actual content of this workshop. Also, a shout-out: tomorrow night we have a hack night at the Cloudflare office. It's on the external events list for this conference as well, and here's a QR code for it. Okay, real quick: there's a question about whether there's a Slack for the workshop. Yes, absolutely. If you go to the Slack, there's a channel that says dagger-workshop, "Ship Agents that Ship." Got it, let me pull that up as well. So if there are questions, put them in there, or raise your hand and Jeremy will get to you. Climb over people. I will make it happen, yes, I'll do my best. Awesome. So if you're following along, awesome. If you can't, because you don't have desk room or can't get on the internet or whatever, I'm going to walk through it live, because I already have everything on my machine, and you can always check back with this later once you have a solid connection.
So if you're not able to get out your computer or follow along, just watch me and I'll go through it, and it's going to be really neat. But if you are following along, here's the installation page on docs.dagger.io. You can install the Dagger CLI from the Homebrew tap, straight from our install script, or with winget. The only other dependency is a container runtime such as Docker, Podman, or nerdctl, anything that can run containers, because Dagger itself runs its engine as a container, and I'll explain what that means in a second. If you're following along, get started on this while I talk through what we're actually doing and what all these technologies are trying to accomplish. I'll pause you really quick, yeah, just for the folks on the tech team. For some of you in the room, we're finding the Wi-Fi may or may not work for you. Use a hotspot if you've got one; if that works down in the basement, you're amazing. Also, I'm trying to do a little something through the wired connection here, but for the tech team, it's requiring a password for me to use this service. So if you have that, slip me a note at some point. Otherwise, we're working on getting more connectivity as we speak. Awesome. Yep, there we go. So, the QR code is actually for the hack night tomorrow night; the docs and what we're going through are at docs.dagger.io. That's the main content for what we're going to walk through. And real quick we can intro as well. If you want to intro yourself first, Jeremy? Yeah, sure. I'm Jeremy Adams. I look after the ecosystem; I'm part of the ecosystem team that Kyle and I are both on, and I've been at Dagger for a few years.
And so I've gotten to see a progression of folks using us for all sorts of things, most recently a lot around AI agent workflows. But what I love about this workshop is that we're going to blend together some of the classic use cases we've seen with Dagger, around CI and dev workflows, as well as giving those to agents. Awesome. Yeah, I'm Kyle, on the same team. I have a background in DevOps and platform engineering, so much more on the cloud infra side of things versus the building side. So it's cool to come at this from that perspective, of trying to deploy agents somewhere and make things work. And that's why in this workshop we're going to deploy things to GitHub, because eventually that's what you're going to want to do when you build an agent: you have to put it somewhere to run it. It can't just live on your machine all the time, depending on what the agent is. Anyway, if you've made it this far, you've made it to the docs, so now we'll talk a bit about what Dagger is and why we're building agents with it. Basically, Dagger is, like I said, a container runtime and a workflow engine. People have historically done things like build their CI/CD with Dagger, because you're building these pipelines that orchestrate containers and run all these tasks. And it runs the same on your machine as it runs in any cloud: in your Kubernetes, in GitHub, wherever your CI might run, it runs the same everywhere. So you're making these workflows. And the cool thing is, that's also what agents are, right? They're just these processes where we have a bunch of tools we want to give to an agent. Anyway, we're going to see it in action. So, let's see, we have components, right?
So, Dagger itself is made up of core components: containers, like I said, but also repos, directories, and files, and now LLMs are also a component you can work with within this toolbox of Dagger and how we're building things. So an LLM is just another building block; Dagger isn't a special framework built just for making agents. The LLM lives next to your software as another component in your toolbox, and you're bringing LLMs into these existing workflows. That's why it's a little bit different. Yeah, another way of thinking about Dagger in a nutshell: Dagger is for software engineering workflows and environments. So you're going to see us building some environments, essentially containerized environments with some functions, and all of these things can become tools that human software engineers and AI agents use, for both the development side as well as the app delivery side of things. Of course, these areas are all blending and squishing together right now; we're seeing all this stuff happen in real time. So Dagger is going to be one tool you can use for that whole range. You explained Dagger as a tool to build containerized environments. What's the distinction? Yeah, so the question was: if Dagger is a tool to build containerized environments, what's the distinction between Dagger and Docker? So, yeah, Docker has been around for a long time, and in fact the founders of Docker are the founders of Dagger. The original scope of Docker was really about containerizing an application and making that thing portable, so I can run it on my laptop or Kyle's or up in Kubernetes or anywhere. Now what we're doing is taking a whole workflow and making that a portable thing.
So yeah, there are definitely multiple containers and other types of objects, but everything's sandboxed by default. We'll see as we get into it. Great question. Yeah. So, we're writing code that is itself the workflow. That code can be Go, Python, TypeScript, Java, PHP; we have all these different languages you can write in. And the cool thing is that you're not choosing your language for Dagger and then that's the world you live in. Dagger has cross-language interop. (Oh, it's going to want to load. That's why I have it. Oh, you're amazing. An error occurred. Wow. Okay. Oh, no. So, I broke it.) If I write a cool module with Dagger that, say, does a TypeScript build or something, I can share it on the Daggerverse. Maybe I wrote that module in TypeScript and you're writing your modules in Python; you can just install my module, and you have native bindings in your language to work with Dagger modules across languages. Anyway, that's my point: when you pick a language, you still get to benefit from the whole Dagger ecosystem. And we don't have images on these slides; that's okay, there are some sweet animations of it happening. It's so cool. Yeah, like the coolest animation you could think of. Awesome. So I think we can skip forward here. Yeah. So, hopefully we've installed, or we're downloading, or maybe we're still downloading, Dagger. I'll run through the basics of Dagger real quick. Hopefully that's big enough; maybe I'll make it a bit bigger. Yeah, that's good. There we go. Cool. So, we've installed Dagger, and we've got a container runtime somewhere. So the first thing we can do is create containers.
So if I'm in Dagger Shell, which I think I am over here, that's definitely going to get bigger. Yeah. So I'm in Dagger Shell and I can say `container`, I think. I don't know, let's go over here; fighting the internet. And what Kyle's showing is that there are a few different ways of using Dagger on the command line, including a non-interactive way, where you just fire off a Dagger command to run one of these workflow functions, or you can use it in the interactive shell mode he's showing here. Yeah, and it's all about building blocks, right? With the basics of Dagger, you have the things I mentioned earlier: containers, directories, LLMs. But with our code, we're actually going to be building larger blocks out of those blocks to assemble part of a workflow, and then take those blocks and build bigger workflows out of them. So as we're using the shell, we're always interacting with some level of a workflow here. Like with `container`, I can say `from alpine`, and now I've got an Alpine container, and we can do things with it, anything you might want to do with a container. I could literally say give me a `terminal`, and now I've got a terminal in a container. And this is the exact kind of tool we're actually giving to our agent as we build these pipelines: you can give it a container, but you can also build a specialized workspace for your agent to do things like write code. So it's a lot of setup to say that we've got all these primitives we can give to agents, to build some really effective software engineering agents by giving them the exact tools they need to complete the job. And also, like we mentioned earlier, people use Dagger for CI/CD because you can create these workflows for things like running the tests for your application.
And the cool thing is that if you've done that, you can take the same code you wrote for running your tests and give it to your agent. So now your agent isn't just guessing at the code it's generating; it can actually run your real tests, the same way your developers and your CI do, to make sure the code it's generating is valid code, and it can iterate on that within the agent. This is what we're going to build right now. And we were just talking to somebody outside before the session who was telling us that in his organization, not infrequently now, because of people like a product manager who's discovered vibe coding, or a team using AI-powered IDEs or whatever, people are cranking out these massive PRs for him to review, like 25,000-line PRs. And the PRs don't even stay static: he said, "I just got this PR and have to review it, and then I come back and now there are five more commits on it." So you've got this thrash happening, and part of the reason bringing CI and AI together makes so much sense is that we actually have to bring some balance back. We've got this fire hose; we can all now just create so much code. But how do we make sure this is actually code we can test and deploy with some kind of confidence at some point? We need to balance out and make sure there are software delivery workflows there to test, build, and validate things before we put them out in production. So that's some of what we'll get into today. Awesome. So, yeah, let's actually get into writing something. I've zoomed out a bit so you can see where I landed in the docs. Here on the left side, we have the quickstart, and I clicked on "Build a CI pipeline."
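The "run the real tests and iterate" loop described above can be sketched in plain Python. Everything in this snippet is hypothetical scaffolding, not the Dagger API: `run_tests` stands in for the project's actual test function, and `fake_model` is a canned stand-in for an LLM call, so the example is self-contained and only shows the shape of the loop where test failures feed back into the next attempt.

```python
# A minimal sketch of the generate -> test -> iterate agent loop.
# `run_tests` and `fake_model` are hypothetical stand-ins, not Dagger APIs.

def run_tests(code: str) -> tuple[bool, str]:
    # Stand-in for a real test function: in the workshop this would run
    # the project's test suite inside a container.
    if "def add" in code:
        return True, "all tests passed"
    return False, "NameError: name 'add' is not defined"

def agent_loop(model, assignment: str, max_iterations: int = 5) -> str:
    feedback = ""
    for _ in range(max_iterations):
        code = model(assignment, feedback)
        ok, report = run_tests(code)
        if ok:
            return code      # only ship code that passes the tests
        feedback = report    # feed failures back into the next attempt
    raise RuntimeError("agent could not produce passing code")

def fake_model(assignment: str, feedback: str) -> str:
    # A canned "model" that fixes its mistake once it sees the failure.
    if feedback:
        return "def add(a, b):\n    return a + b"
    return "print(add(1, 2))"

result = agent_loop(fake_model, "add two numbers")
```

The point is that the validation step is just another function call, so the same tests your CI runs can gate what the agent produces.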
And that's basically to get us to a point where we have a project, and then we'll make an agent inside of it that can build new features for that project. So it's just a real quick setup of this example project, with functions that know how to build and test the project. I'm on this page, and we've already talked through installing Dagger and a bit of the basics. So now we have this example application, the hello-dagger template. If you go to GitHub and say "Use this template," you can name it whatever you want, like hello-dagger-workshop or just hello-dagger, it doesn't matter, and create a repo in your GitHub from this template. The important reason for that, versus cloning it, is that it's going to make things way easier when we actually push to GitHub in a little bit, so you can run the GitHub Actions that actually run the agent. So we're going to use that template, and I've done that over here in this repo, my hello-dagger-py, because I've done this in every language. You can use whatever language you want. I'll be walking through Python today, because I think that's probably what a lot of people here are most comfortable with. But if you're not, I can switch between languages: just raise your hand and say, "Show me Go." Okay, so I've got this application in my GitHub now, and I've cloned it to my machine. Now I can look at the code, and it's this Vue app that has a bunch of things in it. But the main thing is that we want to be able to make an agent that develops it, right? Optionally, you can configure Dagger Cloud, so let me just start loading that web page now. It's basically a visualization so you can really easily see what your agent's doing, right?
Because that's the hardest part of building agents a lot of the time: understanding what they're tripping on, what's actually going on inside the agent, how it's interacting with its tools, what tools it's even seeing. With this visualization, you're able to really easily see everything your agent is doing. And that's helped me a lot to develop my prompts and environments, right? If I see a lot of the time that the agent fails because it tries to call a tool incorrectly, I can improve the description of the tool, or maybe I need to change how the tool works completely. Being able to see how the agent behaves is a huge part of that. Whether you're using Dagger Cloud or any other thing to visualize your agents, that's the most important part of making them reliable. Okay, so we've cloned the project, and now we want to create a Dagger module. If you've installed Dagger, you'll run this command, `dagger init`, with whatever SDK you're using. We have these tabs here, and I'm going to be using Python. And the name of the module is going to be hello-dagger, and that's important because that is basically the name of the object that gets created. So I've run `dagger init`, and now I can open my dagger folder. And sorry, that's really small; I don't remember how to make that bigger in Zed, but we can. It's in the preferences to zoom the sidebar: Command-comma, and there's a UI font size setting. Change it to 25 or something, save that, bam, boom. Okay, so now hopefully we can see the sidebar a little better. So, I'm in this dagger directory, and apparently I've written Go for this one.
Is this... Oh, because that's the wrong project. Cool, let's go to the correct project; let me open that up and close all these things. Okay, so I'm in the correct project, and in my dagger directory I've got this source hello_dagger main.py. When we ran `dagger init`, it would have generated basically these files, but with some different content: the basic generated things to get you started building modules. But we're going to say `dagger functions`, and it'll show us what's available in this Dagger module that just got created. This is basically how you interact with Dagger via the Dagger CLI, and you have this code, which is just functions describing how to interact with your application. So for example, this build function: if we go down to it, you see we're just stacking building blocks. We have a function that gives us a Dagger container that is from this base image, we put these files in it, and we run this command. So when we want to do a build of our app, we can call that other function to get the container with our code in it, run another command, and then get a directory from that. This is really basic Dagger stuff: how you create your dev tools using Dagger. This is good to call out here. So originally we had this example from Kyle where he showed us running a container, and then we said: give me a container from an image, from Alpine or from Node or from whatever. And then you can layer on more things: add a directory of source code to that, exec a test command, whatever, chaining these things together. So notice he's using this same builder pattern here in code instead of in a CLI. It's all the same API under the hood; in this case he's using the Python SDK into that same API, but the same things are happening either way.
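The builder pattern and chaining described above can be mimicked in plain Python with an immutable object whose methods each return a new copy. The `FakeContainer` below is a hypothetical stand-in, not the real Dagger SDK's container type, but the chaining style (`from_` a base image, add a directory, exec a command) is the same idea, including the immutability: each call produces a new object and never mutates the old one.

```python
from dataclasses import dataclass, replace

# Plain-Python analogy of Dagger's immutable builder pattern.
# FakeContainer is hypothetical; the real SDK talks to the Dagger engine.

@dataclass(frozen=True)
class FakeContainer:
    ops: tuple = ()

    def from_(self, image: str) -> "FakeContainer":
        return replace(self, ops=self.ops + (("from", image),))

    def with_directory(self, path: str, src: str) -> "FakeContainer":
        return replace(self, ops=self.ops + (("copy", src, path),))

    def with_exec(self, args: list) -> "FakeContainer":
        return replace(self, ops=self.ops + (("exec", tuple(args)),))

base = FakeContainer().from_("node:20-slim")
built = base.with_directory("/src", ".").with_exec(["npm", "install"])

# Immutability: `base` is unchanged, `built` carries the extra steps.
assert len(base.ops) == 1
assert len(built.ops) == 3
```

Because each step is a pure value, every intermediate object can be cached and shared, which is what makes the "one unified cache" mentioned next possible.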
Same one unified cache, where all those cached operations live, and one API. That's why it becomes really easy to use different languages and different language SDKs: it's ultimately all one API under the hood. So in the next step of the guide, "Construct a pipeline," we've copied this code into that main file, and it has all of those functions: publish, build, test, and the build-env we looked at, build-env as in your build environment. When we run `dagger functions`, they'll show up here with their descriptions and everything, straight from the code. So now, at this point, we've got the project we want to build the agent in, we've got some Dagger functions that let us build and test the project, and we've got the project itself. So now let's actually get the agent started. I'll zoom out again so you can see that I've jumped to the next page here, which is "Add an AI agent to an existing project." And we're starting from exactly where we left off with that previous guide, where we pasted in that code. We have build, publish, and test in our Dagger functions. Lots of useful functions, but the expectation was that a human was probably running those, right? Or you were having them run in CI. You'd kind of set that up, but nothing is really agentic yet. We're just running our unit tests or our build and creating a production container, and it's you as a developer, or your CI environment, running these functions. But now we want to create an agent for developers to interact with, or to run anywhere, and our agent should be able to use these functions as well.
And so we're in this next guide, and we're going to create a submodule, because like I mentioned, our agents want these refined environments where we give them access to exactly the tools they need to complete their tasks, and nothing more than that. No, no, wait. I thought you were going to give agents every possible tool. You want to let them have like a thousand functions that do very powerful things and just let them run crazy. Is that not the best practice? Yeah, maybe not, based on the smiles across the room. Okay. Yes. Oh yeah, so the question is: what if the tools needed change at runtime? What about dynamic tools? In a lot of cases we're working with MCPs, where we might have a very static tool experience. So what does happen, Kyle? Well, the main thing is that you want the right amount of tools for the agent to solve its task, whatever that task is. It needs the flexibility to be able to solve complex problems. It's not just going straight down a workflow, saying "okay, I do this, and I do this, and I do this," because you don't really need an AI for that. It needs enough tools to choose its own path to solve whatever task you throw at it. But you don't want so many tools that it becomes a generalized agent that does anything, right? It needs some amount of focus so it can solve a specific set of problems really well. And we will see, in the agent loop that's going to happen, the ability for the LLM to see this menu of tools it has, and to select the right tool at the right time given the context. Yeah. But yeah, definitely a big part of iterating on and building these agents is determining the scope of the tools: the balance between flexibility and reliability, where you want it to be able to solve a breadth of problems.
So it needs a variety of tools that it might need; you don't know exactly what it's going to need ahead of time, but you don't want to give it so many that it gets lost and confused and fails half the time, right? And so that's what we're going to focus on here: we're going to create a submodule that is basically its playground, a specific set of tools that lets it edit our source code. If you've worked with agent frameworks in the past that have things like file system tools, we're actually going to build that in our own code right now, and it's just a few lines of code, so don't let me scare you with that. The idea is that we're creating these building blocks, and as you scale this up, you can consume blocks that other people have written; you don't have to write it all from scratch. But for the practice of building this as a workshop, we're going to write it all. All right, so what are we going to put in this workspace, what kind of functions? Yeah, so we do another `dagger init` here and name the module workspace. So we've created another subdirectory in our file system for the workspace module, and this is another Dagger module. This one's just going to have the functions we want the agent to have access to. You can imagine it wants to read the files in your source tree, so we have a function for that, and again, a file is one of those core components of Dagger. Our workspace has a Dagger directory, which is our source code, and we give it a function to read a file from that: it just gets the contents of the file. And that's just the Dagger API, saying this is a path to a file; I can do lots of things with a file, and one of those is to look at the contents. Another function it needs, obviously, is to be able to write files to the workspace.
And it's a very similar API here, where we say: okay, give me the path, and also the contents to write to that file. Then it needs to know what files are in the workspace, so it needs to be able to list the files, and it's literally just going to run a tree in the workspace so it can quickly see the file structure of your code. So with those three functions, plus one more we'll look at in a second, it can do all the code editing you might ask it to do within your file system, right? With more complex projects, you might need more advanced versions of these capabilities, like being able to read specific lines from a file, scan files, or insert lines into files. But for the demo agent we're building right now, it's just the most basic version, where we can read and write files and list the files. So if the agent had access to this workspace object, it would see those functions as tools? Read file, write file? Exactly. It will in a minute. Yeah, we haven't plugged the brain into the robot body yet. Right, so if you think about the agent as a robot body with a brain plugged into it, right now we're building the robot body, and the brain is going to come in just a second here. It could be any LLM, a brain in a jar, to stretch the analogy. So, our last function, finally, which I mentioned earlier, is test. When the agent generates code in its workspace, it needs to be able to test that the code it generated is correct. And if it isn't, it'll get the test failures and iterate until it's producing good code, right?
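The three workspace tools described above (read a file, write a file, list the files) can be sketched as a plain-Python class over a local directory. This is a hypothetical analogy for illustration only: the real workshop module operates on a `dagger.Directory` inside the engine, but the tool surface the agent sees has the same shape.

```python
import os
import tempfile

# Hypothetical plain-Python analogy of the workshop's workspace module.
# The real version wraps a dagger.Directory; this one uses a temp dir.

class Workspace:
    def __init__(self, root: str):
        self.root = root

    def read_file(self, path: str) -> str:
        """Return the contents of a file in the workspace."""
        with open(os.path.join(self.root, path)) as f:
            return f.read()

    def write_file(self, path: str, contents: str) -> None:
        """Write (or overwrite) a file in the workspace."""
        full = os.path.join(self.root, path)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "w") as f:
            f.write(contents)

    def list_files(self) -> list:
        """List all file paths in the workspace, like a `tree`."""
        found = []
        for dirpath, _, filenames in os.walk(self.root):
            for name in filenames:
                found.append(os.path.relpath(os.path.join(dirpath, name), self.root))
        return sorted(found)

ws = Workspace(tempfile.mkdtemp())
ws.write_file("src/App.vue", "<template>hello</template>")
```

Exposed to an LLM, each of these methods becomes one tool: `read_file`, `write_file`, `list_files`.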
And this is kind of the most important part of building a good agent: some sort of validation tool, whether that's tests, or lint, or just something to check that what it generated is correct, or maybe all of these things; there can be different levels of complexity. But anyway, now we've got this workspace, and I have the exact code over here. If I say `dagger -m`, the `-m` flag points at a specific Dagger module, and then I say `functions`. Remember, before we just ran `dagger functions`. So if I run `dagger -m` with the path to the workspace module and then `functions`, I'll see exactly the functions we just created. Okay. So the next step is that we want our main Dagger module to have that as a set of tools it can use. So we're going to run `dagger install` on that workspace module, and now it's installed as a dependency of my main module, and it has this object available. We'll see why that's really cool in a second. But basically, for all your dependencies in Dagger: like I mentioned, we have a big community of people building things with Dagger, and with that we have the Daggerverse, which is this massive index of thousands of Dagger modules that do different specialized things. But whenever you install one of these into a Dagger module, if we look at the dagger.json in this project, we have this list of dependencies. And so your Dagger module has basically its own Dagger client, which is the core Dagger API in addition to all of your dependencies, so that when you're writing code, like I mentioned earlier, you'll see all of these things available natively on the main Dagger client in your language, and you can do all these complex tasks. So basically we've built two modules already: this workspace module, and the main module where we're doing our tests and builds.
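After a `dagger install`, the dependencies list in `dagger.json` looks roughly like the sketch below. This is an illustrative fragment, not copied from the workshop repo; the exact field names and shapes (for example whether `sdk` is a string or an object) vary by Dagger version.

```json
{
  "name": "hello-dagger",
  "sdk": "python",
  "dependencies": [
    {
      "name": "workspace",
      "source": "workspace"
    }
  ]
}
```

The `dependencies` entry is what makes the workspace module's object show up as a typed client in the main module's generated bindings.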
And so now we want to create the agent that can take that workspace and our tests, so we can actually ask for new features or modifications or whatever. That's the next step in this guide we're looking at, where we want to create an agentic function. So, could we have mixed and matched? Could we have written that workspace in TypeScript or in Go and still installed it into our Python module? Yep, exactly. Any individual module can be written in any language, and you can mix and match however you want. I knew the answer to the question; I was just checking. But yeah, we see people do this a lot where they have different teams, like maybe a front-end platform team and a back-end platform team, and maybe one writes TypeScript and the other writes Go, but they can interop and use each other's stuff. Yeah. So, everything, every task or workflow that you do with Dagger, is a function in your code, and an agent is no different, right? It's just going to be another function. We're going to call this one develop, because we're going to give it an assignment to complete in our project, and it's going to complete that assignment. So the develop function is our agent. And this gives us the code to copy; I'll just open it in the editor so it looks a bit nicer. Don't worry, it's only like 500 lines. It's totally fine. And you know what, it's really short. Oh wait, it's not 500 lines; this is it right here, just a few lines. You have it all spaced out nicely. Yeah. So we have a new function called develop, and it takes in an assignment. And this Annotated thing is just a Python way of getting us these doc strings for the parameters.
But in different languages, like Go, if we go back here, your arguments just look like this, where a little comment is basically the help string: when you use the Dagger CLI and say `dagger functions`, it'll say the assignment parameter is "assignment to complete," which is really cool. And we see our source here, which is our project source. But of course, we don't want to have to pass that as a parameter when we're calling our agent. So there's this cool thing with Dagger where you just say the default path is `/`, and that's going to be the root of our git repo. If we don't explicitly pass a source parameter, it just passes in our git repo as that parameter. And so now we just have to say "develop: build me a cool new feature," and it's going to kick off our agent. So let's look at the components of the agent real quick. The environment is the main thing, right? I've used that word a lot today, and hopefully a lot of people are using the same word in the same way: your environment, your robot body for the brain like Jeremy said, is not just the tools the agent uses to complete the task, but also the inputs and outputs for the agent, any objects or state it's working with. All of this is the environment. So we want to construct this environment and then plug in the LLM, which is our brain, and say: here's your environment, here's your task/prompt, complete the task. And this is the environment we put together, where the assignment is a string input. So we have this cool, declarative way of building your prompt, where our assignment is "the assignment to complete," and this workspace input is "a workspace with tools to edit and test code."
So when we connect these things, our agent sees that as the description of the thing it can use; we're building out the prompt by annotating our code, basically. And the workspace input refers to the module we just created. Exactly: if we had called it something else, like foo-workspace, and installed that, this would be a with-foo-workspace-input. We're dynamically generating all of these functions for the environment type, so any object in my dependencies can be an input or an output of my environment. Notice we also have a workspace output, which is the completed task, because all objects in Dagger are immutable. I give it an object, it does a bunch of things, and it gives me back a different object that is its completed task. Maybe that's a boring detail, but the point is that the thing I passed in stays the same, and I get back a new version called "completed." I think a lot of people are dealing with this kind of thing now with the different APIs, doing a bunch of JSON parsing and validation, and different frameworks handle it in different ways. You can think of this as our way of saying: here are the typed inputs, and we're expecting a typed output back at the end, which gives us a way to ensure we get what we actually asked for. Next, we need our prompt. We have the environment and the prompt, and we give both of those to the agent. The prompt is just a bit lower in the guide if you're following along: it wants you to create a develop-prompt markdown file under the dagger directory, and it looks like this. I'll open it again over here in my editor.
So this is our prompt. We're saying: you're a developer on this project; you're going to get an assignment and the tools to complete it; your assignment is $assignment. The assignment from our environment gets templated right into the prompt, so the agent doesn't have to go read a separate variable in its environment. It knows: okay, my assignment is "make this cool new feature." Then we have a bit of prompt structure. If you've built a lot of these agents, you've probably refined how you build your prompts and what those structures look like. This is a really simple agent, so it doesn't have a ton of structure, but we do say: before you write code, make sure you analyze the workspace to understand the project structure. That way it's not just going to create some garbage, or say "cool, I made this new file" without looking at the project first. Don't make unnecessary changes, because some models, without the right constraints, will make the change you asked for and then change four other things and say "cool, looks good, ship it." And always run the tests. We do have to ask it to run the tests once it's made those changes; it's not necessarily going to see the test function and think "I should probably call that." We want to tell the LLM: you have a tool that can validate the code you're writing, make sure you use that tool. And then: don't stop until you've completed the assignment and the tests pass. That tells it to keep working until it's satisfied what we asked for and the tests pass. Some good reinforcement. (You kind of told it to run the tests twice.) Yeah, you'd better. This comes from experience. (Maybe a third time would help too.) I'll say it doesn't hurt at all.
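The `$assignment` substitution described above is ordinary shell-style templating. Here is a Dagger-free sketch of the idea; the prompt text paraphrases the one narrated in the workshop rather than quoting the actual file:

```python
from string import Template

# Paraphrase of the workshop's develop prompt; in the real project this
# lives in a markdown file that gets fed to the agent's environment.
PROMPT = Template("""\
You are a developer on this project. You will receive an assignment
and the tools to complete it.

Your assignment is: $assignment

- Before writing code, analyze the workspace to understand the project.
- Do not make unnecessary changes.
- Always run the tests after making changes.
- Do not stop until the assignment is complete and the tests pass.
""")

rendered = PROMPT.substitute(
    assignment="make the main page say hello workshop people"
)
print(rendered.splitlines()[3])
# Your assignment is: make the main page say hello workshop people
```

Dropping the assignment directly into the prompt means the model never has to go look it up as a separate environment variable.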
And maybe in all caps. What we find is that we end up running evals on these things, where we try different LLMs plugged in, iterate on the prompts, and keep going until we get the results and the consistency we want across the different models. It comes from experience of knowing how they veer off track, and that shapes how we write these. That's also what I mentioned earlier about using something like Dagger Cloud to see the visualization of all the work the agent is doing. If I'm frequently seeing that the agent just calls write-file and then returns, I know I have to tell it to look at the code and to test the code. That's going to be different for every model, and prompt structure especially differs between models. Question: can you implement reflection agents that police each other? Yes, and I probably have an example of that I can show at the end if we have time. But remember, each agent is just a Dagger function, so you can create agents layered on other agents. You could even put an agent in the environment and say: hey, you have this agent at your disposal if you need it. I have examples of that too. It's similar to the concept of Google's A2A, if you're not familiar with that: a structure where you tell an agent, you can do these things, but you can also talk to these other agents, and here's what each of those other agents does. If you need to, you can reach out and say: hey, other agent, I need you to tell me how to write TypeScript, and the answer comes back. So you can put agents in environments. It's all just piecing functions together.
It's just the same code we've always been writing, but now there's an LLM component. Cool. Now, this line right here, line 94, is the most important line of the workshop, because this is the agent: we've taken our Dagger client and an LLM. The LLM is another type within the Dagger client. (Make it bigger for a second? Sure. There we go.) We've said: from the Dagger client, we need this LLM type; we give it an environment; we give it a prompt; and that's the agent. Now we've got this thing, work, that is a Dagger LLM. (People want pictures of it. Center it, make it look good. I'll autograph it.) So that's the agent. We've told it in the prompt: this is your task, this is how you work, don't stop until it's done. And now this work variable in our code is the completed work. From that work, we can look back at the environment and say: I have this output called completed, because remember, in our environment we defined a workspace output called completed, and this thing should be a workspace. If it's not, somebody screwed up; that happens sometimes. It's a good final check, a type check. From that workspace, we want to grab the completed directory, which is the source. If you remember, our workspace object has an attribute called source, which is a directory. There are a few layers of complexity here, but we've said: in that workspace, we have a source that's a directory.
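To make the "typed output plus final type check" idea concrete without the Dagger SDK, here is a toy environment that declares a named, typed output and validates whatever comes back, along with an immutable workspace. This illustrates the pattern being described; none of these class or method names are Dagger's actual API:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)  # frozen mirrors Dagger's immutable objects
class Workspace:
    files: dict = field(default_factory=dict)

    def with_file(self, path: str, contents: str) -> "Workspace":
        # Returns a NEW workspace instead of mutating this one.
        return Workspace({**self.files, path: contents})

class Environment:
    """Toy typed-environment: declared outputs are checked on the way out."""
    def __init__(self):
        self._outputs = {}  # name -> (expected type, description)
        self._values = {}

    def with_output(self, name, expected_type, description):
        self._outputs[name] = (expected_type, description)
        return self

    def set_output(self, name, value):
        expected, _ = self._outputs[name]
        if not isinstance(value, expected):  # "somebody screwed up"
            raise TypeError(f"output {name!r} must be a {expected.__name__}")
        self._values[name] = value
        return self

    def output(self, name):
        return self._values[name]

env = Environment().with_output("completed", Workspace, "the completed task")
start = Workspace({"App.vue": "<template>Hello</template>"})
done = start.with_file("App.vue", "<template>Hello workshop people</template>")
env.set_output("completed", done)

# The object we passed in is unchanged; we got a new "completed" version back.
assert start.files["App.vue"] == "<template>Hello</template>"
```

The payoff is the same as in the real thing: the caller asks for a workspace and either gets a workspace or a loud, early failure.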
And ignore the node_modules folder, because maybe that would break on my machine. Now that we've got that back, just to make triple sure (remember, we did tell it three times to run the tests), in our code we say: all right, now run the tests, because this is the same code we use throughout our project to run tests. So we say: okay, completed, now manually run the tests, and if that fails, you could kick it back to the LLM and say: hey, this failed, try harder. That's pretty huge, right? That's putting the agents on rails, or giving them guardrails, whichever metaphor you like better. We're letting them do the creative, generative stuff they do, like write some code for us, but we need to enforce certain standards. It could be compliance things, it could be linting, so we don't just dump garbage back onto your machine. And remember, all the changes it made while iterating were done in a container; it's not changing your file system as it does its work. That's a key thing too, because maybe you have ten of these agents running. They all have their own sandboxed workspace where they edit files; they're not messing up your local state. And before we touch our local state, we triple-check that the tests pass and then return that completed directory. Now, checking against the guide to make sure we didn't miss anything: we run `dagger functions`, and we see this develop function show up. Now I go into dagger shell, which is hopefully what the guide asks us to do. It is; I say "hopefully," but I wrote this, so I'm just checking myself. And I can go in and type dagger.
Before I do that, one thing I didn't call out at the very start: we had to configure an LLM provider. With Dagger, you bring your own model. You can use OpenAI, Gemini, Anthropic, local models via Ollama or Docker Model Runner, Bedrock, literally anything you can hook up. You do have to configure some environment variables so Dagger can make API calls to that provider, because we're just the agent with the tools; the model lives somewhere else. This configuration page shows all the different options. One really cool thing to call out: I'm going to type something really scary. Dagger also has secrets provider integrations, so I don't have my actual API key echoed there; I just have my 1Password reference, and the key sits in 1Password somewhere. It's just pointing at that credential, and Dagger resolves it. I've configured this in my environment, so now when I run dagger, it takes a second to spin up. This is the part where, if you're struggling a bit with Wi-Fi, it might be tough, but it's okay, because we're going to push this to GitHub in a second and it's going to run on GitHub's network, so we won't be beholden to this one. Now, if I say `llm | model`, for example, you see my little 1Password prompt; it's got my key. With each model provider we have a default model, but you can also specify one, including in code. Right now, by default, it's going to use Claude 3.5, so maybe we won't get the best results, but we'll see. (A classic, yes.)
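The provider setup described here is just environment variables. A sketch along the lines of what the configuration page covers, with placeholder values; the exact variable names and the `op://` vault path are assumptions to check against the Dagger docs (the `op://` form is a 1Password secret reference, not a raw key):

```shell
# Pick one provider and export its key; Dagger reads these at runtime.
export ANTHROPIC_API_KEY="op://Private/Anthropic/api-key"  # 1Password reference (placeholder path)
# export OPENAI_API_KEY="sk-..."                           # or a raw key
# export GEMINI_API_KEY="..."

# Optional: send traces to Dagger Cloud for the visualizations shown later.
export DAGGER_CLOUD_TOKEN="dag_..."
```

Keeping only the secret *reference* in your shell config means the real key never appears in your environment files or your terminal history.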
So now I have that, and we have that new develop function. I can say help develop, and this is the thing we just made. (Can you bump that up a bit bigger? Sure. Perfect.) We have the required argument, assignment, which is the assignment to complete; an optional argument, source, which again is just going to be my repo; and this gives us back a directory. Here's how I use it: I just say develop and then the assignment. We haven't actually looked at the project we're daggerizing yet, but I promise it's a Vue.js website. The example in the guide is to make the main page blue; I'll say "make the main page say hello workshop people." It doesn't say that right now, and I've never run this exact one, so maybe it'll succeed. Now we can see it happening. We see our prompts getting passed in, the little person face for the prompt and the little robot head for the model, which is Claude 3.5 Sonnet, saying: cool, let me do these things. And we can actually see it calling tools. It's looking at the functions available; we see that workspace list-files, the ones that we made; and it figures out: okay, I can look at my files; here's the specific file I might need to edit, so let me read it. Now it sees the contents of this. While this is running, let me open up Cloud, and hopefully this loads so we can see the Dagger Cloud visualization, because it's maybe a bit easier to follow there. I'm clicking the sign-in button; I think my Wi-Fi is failing me on this auth flow, so while it's running, we'll just watch here. It's the same OpenTelemetry data in both places, streaming to your terminal UI and the web UI. We see it call write-file with some new file contents.
And now it says: now that we've made the change, let's run the tests. This is the part that really might fail on this Wi-Fi, because it's doing an npm install and downloading a bunch of node modules, but it should pass in a second. We'll let it go and talk through it. We can see that our agent wrote the files and is now running the tests, which is really awesome. Question: was the npm install part of the tool you gave it? Yes, let's look. We see it's running `npm install` and `npm run test:unit` via with-exec. If we go back to our workspace, that was part of the test function. The agent just had to call test, and we've defined what happens when you call test. So it's not like those random moments where you say "make sure to test" and it says "I'm going to try pytest with these crazy options," and you wonder why it thought that would work. Instead, you give it exactly what the command should be. We could give it more flexibility in how it runs things, but in this case we already know how tests are run in this project, so we just give it a test function. That's probably the biggest thing in creating reliable agents with Dagger: give flexibility where it's important for completing tasks, and remove it where you know exactly how things are meant to happen. You know exactly how tests need to run, so it doesn't need the freedom to run arbitrary commands in a container. All it needs to do is modify files and run this test function. For more complex agents, maybe there are some other functions too, but for this one, that's the amount of freedom we've given it. (Can we open another...) Well, hold on, we got Cloud. Okay.
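The "flexibility only where it matters" point can be sketched in a few lines: the agent's only execution tool is one fixed, known-good command, rather than a generic `exec(cmd)` escape hatch. In this sketch the project's real `npm run test:unit` is swapped for a trivial stand-in command so it runs anywhere; that swap is the only assumption:

```python
import subprocess
import sys

# The one command the tool is allowed to run. In the real workspace this
# would be the project's `npm run test:unit`; here it's a stand-in.
TEST_CMD = [sys.executable, "-c", "print('tests passed')"]

def run_tests() -> str:
    """Run the project's tests exactly the way the project defines them.

    Note there is deliberately no parameter for the command: the agent
    can decide *when* to test, never *how*.
    """
    result = subprocess.run(TEST_CMD, capture_output=True, text=True, check=True)
    return result.stdout.strip()

print(run_tests())  # tests passed
```

Exposing only `run_tests` (plus file-editing tools) is what stops the model from improvising "pytest with crazy options" in a project that doesn't use pytest.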
Well, we'll get back to my pipe dream in a second. Okay, let me see if I can expand this. This is the visibility we want when we're running these agents. We saw the prompt, and we saw the assignment is to make the main page say hello workshop people. Cool. This is the prompt we gave it. Now Claude 3.5 looks at it and says: first, let's look at what objects we have, check out the workspace, make the changes, and then run the tests. Sounds good. It runs list-objects, which lets it see what's in its environment, which is this workspace tool. Then it runs list-methods, to see what it can do with a workspace: like, what the heck is a workspace? It says it has tools to edit and test code. Then we expand that, and this is the kind of visibility into the agent's environment where we see: oh, there's this workspace write-file function that gives back a workspace type, and these are its arguments. Question: so we didn't have to write any of the JSON descriptions of tools? Right, it just gets generated from the functions. We gave it that Dagger module, and it all got wired up into the agent's environment. Then it selects those methods, so now it has them as tools to call, and then: let's see what's in the project. It calls workspace list-files, and remember, the way our workspace code does that is it creates an Alpine container and runs tree. We can see the tracing of that too, the underlying actions of the tools being called. We also see the return value, which is what the agent sees: this whole file structure. Cool. And then it says: to make it say that, we should probably modify this one or this one; let's see what's in those files.
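The "no hand-written JSON tool descriptions" point can be sketched in plain Python: a tool schema can be derived mechanically from a function's signature and docstring, which is roughly the idea behind Dagger wiring module functions into the agent's environment. This is an illustration of the principle, not Dagger's actual generator:

```python
import inspect

def write_file(path: str, contents: str) -> str:
    """Write contents to a file in the workspace."""
    return path  # toy body; only the signature matters here

def tool_schema(fn):
    """Derive a minimal JSON-style tool description from a function."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: getattr(p.annotation, "__name__", str(p.annotation))
            for name, p in sig.parameters.items()
        },
    }

schema = tool_schema(write_file)
print(schema)
# {'name': 'write_file',
#  'description': 'Write contents to a file in the workspace.',
#  'parameters': {'path': 'str', 'contents': 'str'}}
```

Because the schema is generated, renaming a function or adding a parameter updates the tool description automatically; there is no second source of truth to drift out of date.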
We can see it read the file: it sees the whole HelloWorld.vue file and says, I don't think that was it; let's see App.vue. It reads that, and eventually it says: I see that App.vue uses the HelloWorld component and passes a message to it. So now it writes the file: it changes App.vue to pass a different message. And we can expand this to see the whole thing. Nice. Hopefully this finishes; if it doesn't, that's fine, because we're going to push it to GitHub in a second and GitHub can run it for us. But now it's running those tests, which is the part it's currently at in my shell, where it's been running for like five minutes. So that's the visibility part I'm talking about: we can see exactly what the agent sees and what's happening under the hood. And to be clear, this is all running on my laptop, but it's all inside the Dagger engine, in containers, totally isolated from my laptop. Dagger Cloud is just showing me the visualization; it's not running anything for me. This is on my machine, which is why it's still running. Right, and that's because of the connection we have and the load we're putting on it. The other thing to think about: we're using Python here, and we're using Node, a bunch of different tools. The app is Node, but the workflows Kyle is writing are in Python. You could have a laptop, or any server, that just has Dagger and a connection to the internet, and you don't need any other tools installed. So the environments thing is not just for the agent developer; it kind of goes all the way through.
So you could have a brand-new laptop with just Dagger, and because it uses a Python runtime container for the workflow he wrote in Python, that's implicitly there. You don't need to install Python; you don't have to struggle with version managers or anything else. And inside of that, somewhere, there's a Node container that handled the build, again all nested in there, and cached, automatically. So you could do this with a very bare-bones machine setup and everything will just work. Now, we probably won't get to run this next part locally, and we'll come back to it if it finishes, but I'll describe the flow. We're in dagger shell: we ran that develop thing and it gave us back something. And in Dagger, like I keep saying, when you type dagger and get into this view, it's a shell just like bash, where we can create variables and chain things together. So if this finished, we could save the output of this thing (remember, it returns a directory) to a variable called completed, and then pass that to our other functions. They default to using our git source from the machine, but we can pass in that optional directory to any of our functions and say: use this directory instead. So I could actually run the whole thing locally, and see the results, before even saving it to my machine. Let me go over here (I don't know why I keep ending up in this folder), go to the correct directory, open another shell, and type in part of this command.
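The chaining flow being described would look roughly like this inside Dagger shell. This is a sketch written from the narration: the syntax is bash-like but runs Dagger functions, and the exact spelling should be checked against the Dagger shell docs:

```shell
# Inside `dagger` (Dagger shell), capture the agent's output directory:
completed=$(develop "make the main page say hello workshop people")

# Re-run the project's own functions against the agent's result
# instead of the local checkout, before anything touches local disk:
test --source $completed
build --source $completed

# Only once satisfied, write the directory back to the local filesystem:
$completed | export ./
```

The key idea is that `$completed` is a directory object living in the engine; it can be passed around, built, served, and inspected, and nothing lands on disk until the explicit export.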
Because what I can do is run the output from the agent: I can build the whole site and serve it to my machine, and see what it built before I even save it back to my disk, to say: yes, this is a good solution. Once we get this connection here (just waiting on pipes to connect to each other), we'll let that run for a second. The main thing is that we can pass this around, run all of our functions with that completed directory, and then finally say export, which saves it back to your disk, and we're done. So the next step: we're good with that; we know how to use this agent locally to complete tasks. That's fine, but the people requesting features on my site don't have this installed. They don't have Docker and Dagger on their machines, and they don't want to use dagger shell; they just want to go to GitHub and say: make this new feature. So that's the next step, and it sounds ambitious, but it's really quick. We've got plenty of time to look at the solution, and we'll look at it in Python once again. The first thing we do is install another dependency from the Daggerverse. This is my module called github-issue. (We saw it installed earlier when you showed us that dagger.json.) Exactly, because I skipped ahead. If we search the Daggerverse, we have this module, github-issue, with a bunch of functions that let us do things with GitHub issues: we can list the issues in a repo, list the comments on a particular issue, write comments, create pull request comments, all kinds of things with GitHub issues and GitHub pull requests.
This module basically uses the GitHub Go SDK, in a Go module, to connect my Dagger functions to the API calls. I can install it in my Python project, and now I have the ability to work with GitHub issues. All it needs is a GitHub token. So we add another function to our code called develop-issue. Remember we created develop; now it's develop-issue, and all it does is this: we have a GitHub issue out there with our feature request; we read that issue and give it to our agent; the agent does all its things and gives us back a directory; we take that directory and make a pull request. So it's really similar to the assignment we gave it before, except it reads the GitHub issue, and instead of us just getting the directory back, we put the directory into a PR. We can see the code here. We're not writing a new agent to do this; we're using our existing agent, just wrapping it with some other pieces that say: go here to get the assignment, and once it's done, put that completed work over there. I'll open it in the editor so it's easier to see. We get the GitHub issue; from that issue we get the assignment from the issue body; we pass that to our develop function, which is our agent, and say: here's your assignment, here's the source, which came from that same defaulted input argument. Then we get the issue title and URL, which is really cool, because in GitHub we can automatically have the new pull request linked to the issue, just by having the PR body say "closes" that issue. And this whole thing, you can run this part locally too.
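The develop-issue wrapper is plain orchestration around the existing agent. Here is a Dagger-free sketch of the flow just described; every helper is a stand-in stub (none of these names are the real module's API), so the shape of the wiring is the only thing this shows:

```python
def get_issue(repo: str, issue_id: int) -> dict:
    """Stand-in for the github-issue module: fetch title, body, and URL."""
    return {
        "title": "change the greeting",
        "body": "make the main page say hello workshop people",
        "url": f"https://github.com/{repo}/issues/{issue_id}",
    }

def develop(assignment: str, source: str) -> str:
    """Stand-in for the agent: returns the completed directory (a path here)."""
    return f"{source}-completed"

def open_pull_request(repo: str, title: str, body: str, directory: str) -> str:
    """Stand-in for PR creation: returns a fake PR URL."""
    return f"https://github.com/{repo}/pull/1"

def develop_issue(repo: str, issue_id: int, source: str) -> str:
    issue = get_issue(repo, issue_id)            # 1. the issue body IS the assignment
    completed = develop(issue["body"], source)   # 2. run the existing agent, unchanged
    body = f"Closes {issue['url']}"              # 3. auto-link the PR to the issue
    return open_pull_request(repo, issue["title"], body, completed)

print(develop_issue("org/repo", 7, "./src"))
```

The point mirrors the narration: no new agent, just a function that fetches the assignment from one place and delivers the result to another.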
You don't have to run this part in GitHub; it takes the GitHub token, an issue, and the repo name, so it knows where to put the PR, and then it does the whole flow. But we actually want it to run in GitHub, and that's super easy. We've made that function; we just saw the code. Now we create a GitHub Actions workflow. The first thing we need to do is create two repo secrets: one for a Dagger Cloud token (again, that part's optional, but if you want to see everything happen in Dagger Cloud, you put that token in the environment), and one for whichever LLM key you're using, the same one I use locally. So if I go over here in my repo (and zoom out a bit so I get all the buttons), I go to Settings, wait for the page to load, and then down under Secrets and variables, Actions, I have the two repo secrets we just saw in that screenshot. There's one more thing: a little checkbox we have to tick to let GitHub Actions create PRs, because that's disabled by default. Under Actions, General, at the very bottom, there's this checkbox: "Allow GitHub Actions to create and approve pull requests." I've done that. Now I just need to create a workflow, and the workflow is super short; it's something you can copy and paste, and I'll open it over here. Under the GitHub workflows directory we have develop. If you haven't used GitHub Actions, I'll explain it real quick: it's basically a CI platform, and with this configuration, we tell it: when these events happen, go do these things.
In this case, we say: when a GitHub issue is labeled, and the label is called develop, run this command, and this command is the dagger call to develop-issue with those arguments: the GitHub token, the issue ID, and the repo. These are all coming from GitHub Actions automatically. The workflow's GitHub token is created here, where we say this command needs a GitHub token with permissions to write contents (contents are commits to your project), read issues, and write pull requests. We've put that in the environment, and we've given it the API key for our LLM and the Cloud token. So now, just by running this dagger call, everything connects: whenever we add that label, GitHub Actions runs that Dagger function, and that function has all the capabilities to run the agent and open a PR. It's just like us in dagger shell calling the develop function or some other build function; this is GitHub Actions running the develop-issue function for us. Question: why have Actions do it? We're having GitHub Actions do it because we want this flow automated inside GitHub, around issues. I'll show the flow real quick, but it can run anywhere; it doesn't matter where it runs. This just happens to be GitHub Actions because we're already in a GitHub repo, it's free because we're not using any crazy compute to run this thing (most of the hard work happens on the LLM you're paying for), and GitHub has a better internet connection than we do today.
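A workflow along the lines of the one being described might look like this. This is a sketch reconstructed from the narration, not the exact file from the guide: the secret names, the function arguments, and the Dagger for GitHub action's inputs are assumptions to check against the action's README:

```yaml
name: develop
on:
  issues:
    types: [labeled]

permissions:
  contents: write        # push commits to the repo
  issues: read           # read the assignment from the issue
  pull-requests: write   # open the PR

jobs:
  develop:
    # Only react to the "develop" label, as in the demo.
    if: github.event.label.name == 'develop'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # The Dagger for GitHub action installs Dagger and runs one call.
      - uses: dagger/dagger-for-github@v7
        with:
          call: >-
            develop-issue
            --github-token env:GITHUB_TOKEN
            --issue ${{ github.event.issue.number }}
            --repository ${{ github.repository }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}   # your LLM key
          DAGGER_CLOUD_TOKEN: ${{ secrets.DAGGER_CLOUD_TOKEN }} # optional tracing
```

The workflow itself stays tiny because all the logic lives in the Dagger function; the YAML only decides *when* to run it and which credentials to hand over.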
So let's create a new issue, and we'll say "change the greeting." What did we ask for before? Make the main page say hello workshop people; yes. We'll create this GitHub issue, and remember, this whole thing kicks off when I add the develop label. I've already run this on this repo (and obviously made a typo at one point), but if you don't have the label yet, you'll get a button to create a new label called develop. I click that, and now my issue has been labeled, which kicks off GitHub Actions to call my Dagger function. In the Actions tab we should see something running, and it says "change the greeting." We can watch it run over here, and we can also pull it up in Cloud, because remember I put that Cloud token in there, since this stuff is all too hard to see flying by my screen in real time. So this is GitHub Actions, but it could be any kind of CI orchestration: Jenkins, GitLab CI, Azure DevOps, whatever you've got. Question: how much prompt modification do you do? Is it literally just that one markdown file, or is the agent aware that it's in Dagger? It is. Dagger adds its own system prompt that guides the model in how to use tools within Dagger, so it knows to call select-methods and list-functions and the things we saw it doing. You can add more to the system prompt, and you can get rid of the default one if you want to, but yes, there is a default one. Question: what if we have to make further edits because the agent didn't produce the right code or logic?
How do we correct develop's output before the rest of the flow? So if the agent runs develop and produces something we decide isn't right, how do we go back and say: make these changes? Can we just edit the completed source? Yes, you can edit the completed source if you see it and say: oh, it needs one more change. Or I can show you another function where we give the agent more feedback: okay, you've done this so far; here are some more changes to make, because you didn't get it quite right. We'll see that happening. Next question: does the agent have access to the test directory? I think it should. In our workspace we give it the full source of the repo, so it could get down in there if it wanted to. And that's kind of a funny thing about making sure the tests pass: sometimes, if the agent broke the tests, it will go change the tests, and sometimes that's correct, because we actually changed the behavior and the tests need to be updated. But more often it's not correct, so you might want to make that part of your prompting or your validation: make sure the agent didn't change the tests. It can be tough to decide whether that's correct or not. Next: is installing Dagger in CI a one-liner? In our workflow we installed Dagger with the Dagger for GitHub action, so it's really a three-liner. We said this version of Dagger, and that installs Dagger in your GitHub Actions runtime. We used checkout to check out the repo, and then this to install Dagger. And the dependencies? You wouldn't have to install those either.
So in our dagger.json we have all of our dependencies listed, and you don't have to run anything like npm install. When we run `dagger install`, it adds the dependency to this file, and then we just run it; Dagger knows to make sure your client is generated. And that is the nice thing about having those dependencies in a file saved in git, alongside the project. Because essentially, when we first got this Vue app project, it didn't have any Dagger in it, right? It was just an app you could run. Then we ran `dagger init`, and we got that little Dagger module where we started developing our build and test functions — kind of like our tools for development or for CI, right alongside the app. And in there is where we've been installing more modules: the workspace, the GitHub issues module, anything else you need. That's all in git, so the thing is now a fully loaded, Daggerized project. It's kind of carrying its own tools around on its back — for a developer to use, for a platform engineer to use, or for an agent to use. Yeah, we're just waiting for things to load here. Go ahead.

Question: have you done anything like Dagger-in-Dagger, where you have it spinning up agent fleets? Yeah, absolutely. As someone who writes a lot of Dagger code myself, I have agents that need to write Dagger code, and to reliably validate that output they basically need Dagger inside of Dagger. That's exactly the kind of thing you can do. And I can even pull one up — we're a bit short on time, but we're basically done with that guide, just waiting for it to run — we have an examples page here on the docs.
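For a sense of what that dependency file carries, a dagger.json looks roughly like the sketch below. The exact schema varies by Dagger version, and the module names and sources here are made up for illustration.

```json
{
  "name": "my-agent",
  "sdk": "typescript",
  "dependencies": [
    {
      "name": "workspace",
      "source": "github.com/example/workspace-module"
    },
    {
      "name": "github-issues",
      "source": "github.com/example/github-issues-module"
    }
  ],
  "engineVersion": "v0.18.0"
}
```

Running `dagger install <source>` appends an entry to `dependencies`, which is why committing this file gives the whole team (and any agent) the same toolset.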
And there are tons of examples here, but one of the really cool ones — the one I like most, because I wrote it — "I thought you were going to show mine, but that's fine." "Oh, we're going to add it to the list of examples soon." So, we'll take your question next. There's this repo under my GitHub, kpen/dagger-programmer. In the docs we saw those tabs for all the different languages, and whenever I write a new guide I have to have it in five languages. This agent can take a guide in one language and produce all the others. It's just an agent that knows how to write Dagger code. To do that, it has a lot of cool capabilities, including being able to run Dagger inside Dagger. If we look at the code for that — this one's in TypeScript — when it runs tests, it runs Dagger itself, and there's this privileged-nesting flag so that the inner container can talk to the engine. This is an agent writing Dagger code, basically.

Question here: I'm curious how this relates to MCP. Would you use Dagger to implement MCP servers? Is there some overlap, since you have all these modules — you could imagine them as multiple MCP servers via a different mechanism? Yeah, absolutely. One way to think about it is that we were doing this thing with Dagger modules before MCP came on the scene, and then obviously we realized it's super aligned with the way we think things should be, in a lot of ways.
And so today you can even take a Dagger module and run `dagger -m <module-name> mcp` to expose that Dagger module as an MCP server, for example. We've got some more things we'll probably be sharing soon about that kind of stuff, but yes, we think the visions are compatible in that way, and you can use the MCP ecosystem as well. There are a few different layers to it, right? Within the agent we just built, we installed modules, and that uses basically our internal implementation to talk between modules within Dagger. But you can also take a Dagger module and expose it as an MCP server, and then — in the near future, next week or something — you'll be able to connect to external MCP servers to bring them into Dagger as well. To be clear: that internal implementation predates MCP, so it's not MCP per se, but logically you can think of it in a very similar way. And because you can expose everything as MCP servers, it ends up being practically the same for users.

Check it out — we got our PR. Finally! So we got our PR open. It says "make the main page say..." and it closes the issue we created. We have the commit pushed up, the user is the GitHub Actions bot, and in Welcome.vue it changed the text from the documentation boilerplate to our greeting — so maybe that's right. Oh, it deleted this other thing too, because it decided that wasn't needed. Cool. So we have a really cool agent. Yeah, it still needs lots of work.
But yeah, the main thing is we were able to get it to run in GitHub, so I was able to request that feature and it ran hands-free. Exactly. Right now we only built in the one flow, where we create an issue that's a feature request. But if we look at this examples list, we have this greetings API repo, which is my main demo project, and it has a ton of stuff in it — there are like five different agents in there. One of them is for giving feedback on a PR. We could probably open one of these: the original request was "make a new endpoint for my API," and it did that; then I comment "/agent the endpoint should be authenticated," and it picks up again and pushes new changes. Then I have another agent where I comment "/review," and that creates a review of my PR with any other changes that are needed. And then I can say "okay, make those changes — and also please don't delete all the tests." Very important to add. And you don't have to insert yourself at every one of those points, right? But in this case it's great for when we're — yeah. If you want an example of how to take what this workshop just built to the next level, with all this feedback and more advanced things, this greetings API repo is a great one to look at, because it has all of these different agents doing tons of things. It even has one where, if I as a human push up a broken change — because we still have humans developing stuff sometimes, right? — so I push a broken change and the test fails, which is super annoying, because I skipped running tests since I didn't have a good prompt telling me to run the tests three times.
This agent can actually look at the test failure automatically and propose a fix that I can just click to apply. So this is all in that demo repo, where you can see how to build all these things yourself.

A question over here: there's a lot I really love here, but my question gets at the motivation for some of this. It feels like there's a world where Dagger could have prioritized just the containers and the workflows and let you bring your own AI agent. What's the motivation behind making the agent its own primitive and going down that path? There are a lot of levels to it. If you're already heavily invested in, say, Pydantic or the OpenAI Agents SDK, you can still use those container workflows with it. I'll show it — if you've done the OpenAI Agents SDK quick start — this is our agent quick start, but built with the OpenAI Agents SDK, where their SDK provides the completions model (this one is actually using Ollama). This is what their SDK looks like, but inside it I'm still using Dagger. I recreated that same workspace with read-file, write-file, and build tools, but implemented with Dagger inside the OpenAI Agents SDK. So I'm using their agent, but Dagger code for the containers. Why do it the other way, then? The main thing is that here I had to write all of these tools by hand, and describe how to use them. If it's all within Dagger, you get that cool property of the whole Daggerverse of modules: I can just plug one in, and it's given to the agent, right? Your whole method signature is instantly translated into the right form to work as tools — you get tools for free, as well as functions. And we do have some people in our community using Dagger that way.
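The point about method signatures translating into tools can be illustrated with a small, self-contained sketch. This is not Dagger's actual mechanism — just a simplified Python demonstration of the general idea that a typed function signature carries everything needed to generate an LLM tool schema automatically; the function `write_file` and the schema shape are illustrative.

```python
import inspect

def signature_to_tool_schema(fn):
    """Derive an OpenAI-style function/tool schema from a typed
    Python function signature. Simplified illustration only."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Map the annotation to a JSON Schema type; default to string
        props[name] = {"type": type_map.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value -> required argument
    return {
        "type": "function",
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": props, "required": required},
    }

def write_file(path: str, contents: str) -> bool:
    """Write contents to a file in the agent's workspace."""
    return True

schema = signature_to_tool_schema(write_file)
print(schema["name"])                    # write_file
print(schema["parameters"]["required"])  # ['path', 'contents']
```

The upshot is the same one made in the talk: if your functions are already typed and documented, an agent framework can expose them as tools with no extra glue code per tool.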
They're using it with Pydantic and other frameworks because they want the sandbox capability. They don't want another cloud sandbox vendor, but they don't want the agent loose on their file system either — they want containers, and they want it easy. But I think the sweet spot is doing it all in Dagger, because it harmonizes really well.

Question there: thanks for the great demo. Let's say I want to build an agent for programming HTML games that run in the browser. For that game-building agent, I'd need the running and testing environment to be a browser. Does Dagger have those sorts of constructs — say I want to spin up a browser environment and do some automation in it to test a game the LLM might have written? Yeah, you certainly can. I've done some headless browser stuff, and some browser stuff where you connect over VNC. You can do a lot with Linux containers. We should talk about it — come join the community and let's do it.

Great demo, and thanks for compressing a lot of information. Is my understanding right that you build the CI/CD infra and all these things once, and then let Dagger do the asynchronous job with guardrails and everything in place? That Dagger is sort of this asynchronous AI agent that does things on its own, but with guardrails — not just leaving Claude Code or something in a trust-all mode and letting it do its thing? Yeah, so the question was what Dagger is, in a certain sense. Dagger gives you this platform to create software engineering workflows that can be used for shipping software.
They can be used for developing software, with the environments we saw, and you can use them as a platform engineer or as a developer — but then you can also hand them off to agents. We think that's really powerful: the same platform does all those things and creates those guardrails, like you say.

The one thing I wanted you to show — and you've got one minute — can you show your terminal, and let's just vibe for one second? You're connected to an LLM right now, right? Let's talk to this LLM. It turns out we've been using shell mode, which lets you very declaratively say things like "container from alpine, with this file, give me a terminal into that." Now we're chatting directly with the connected LLM, and this LLM can see all the Dagger objects you have. So another way to use Dagger is to just create a container and say, "hey LLM, you see that container — why don't you write some software in it?" You can get that kind of workflow going too. There you go — he's saying "give me a Python container," so it's going to look at what methods exist in the Dagger API. Oh, there's this container method in the API, which we were using earlier, and then it decides: I'll use container, from, with-exec to execute. It's exploring the Dagger API right now. Got it. And now it's actually pulling a Python 3.11 container, and then it can do things with it. So it's using containers kind of like computer use, in a way.
So yeah — we didn't even show that side of it at first, because we were trying to show the guardrails, but you can also use it in this style. Got it. And one follow-up question: typically LLMs are good at small-to-medium tasks, and that's what we've seen here. How good is Dagger at orchestrating a large task — one that needs design, or user input, or multi-turn prompting — not a small or medium task, but a large one? Yeah, the question is what size of task Dagger is good for. I think if you decompose things and architect them right, it can handle a lot of different sizes. I know we're at time now, so we're going to end here. We'll take some more questions outside the room — in the hall, for sure. Thank you so much to everybody that attended. Thank you guys. [Music]