Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger

Channel: aiDotEngineer

Published at: 2025-07-27

YouTube video id: Fzb1a24hF-o

Source: https://www.youtube.com/watch?v=Fzb1a24hF-o

Okay, we're going to kick this off. We're trying to sort out some internet options here, but in the meantime I'll give us our little intro. So, first of all, I'm Kyle, this is Jeremy, and we're from Dagger. You'll see more about what Dagger is through this workshop, where we're going to build a cool SWE agent, and we're actually going to deploy it to GitHub. So even in the worst-case scenario, if we can't get things running locally, when we push things to GitHub and see agents running on GitHub, that's out of our internet's hands and it's all going to be really cool.
So, first of all, on the left side here we have where we're getting started from. This is the documentation site, where we have install instructions — I'll walk through those real quick — and then our quickstarts, which we're going to walk through as the content of this workshop. Also, a shout-out: tomorrow night we have a hack night at the Cloudflare office. It's on the external events list for this conference as well, but here's a QR code for it. Okay, so real quick.
So, there's a question about whether there's a Slack for the workshop. Yes, absolutely. If you go to the Slack, there's a channel — it says dagger-workshop. "Ship Agents that Ship" — got it. Okay, let me pull that up as well. So if there are questions, put them in there, or raise your hand and Jeremy will get to you. Climb over people. I will make it happen. Yes, I'll do my best.
Awesome. So, yeah, if you're following along, awesome. If you can't — because you don't have desk room, or can't get on the internet, or whatever — I'm going to walk through it live, because I already have everything on my machine, and you can always check back with this later once you have a solid connection. So if you're not able to get out your computer or follow along, just watch me and I'll go through it, and it's going to be really neat.
If you are following along, here's the installation page on docs.dagger.io. You can install the Dagger CLI from the Homebrew tap, or straight from our install script, or with winget. The only other dependency is that you need a container runtime such as Docker or Podman or nerdctl — anything that can run containers — because Dagger itself runs its engine as a container, and I'll explain what that means in a second. But if you're following along, get started on this while I talk through a bunch of stuff about what we're actually doing and what all these technologies are trying to accomplish.
I'll pause you really quick — just for the folks on the tech team. For some of you in the room, we're finding the Wi-Fi may or may not work for you. Use a hotspot if you've got one; if that works down in the basement, then you're amazing. Also, I'm trying to do a little something through the wired connection here, but for the tech team: it's requiring a password for me to use this service. So if you have that, slip me a note at some point. Otherwise, yeah, we're working on getting more connectivity as we speak. Awesome.
Yep. There we go. So, the QR code is actually for the hack night tomorrow night; the docs and what we're going through are at docs.dagger.io.
So that's the main content for what we're going to walk through. And I guess real quick we can do intros as well. Do you want to introduce yourself first, Jeremy? Yeah, sure. I'm Jeremy Adams. I
look after the ecosystem — I'm part of the ecosystem team that Kyle and I are both on — and I've been at Dagger for a few years. So I've already gotten to see a progression of folks using us for all sorts of things, most recently a lot around AI-agent workflows. What I love about this workshop is that we're going to blend together some of the classic use cases we've seen with Dagger, around CI and dev workflows, with giving those same workflows to agents. Awesome. Yeah, I'm
Kyle. I'm on the same team, and I have a background in DevOps and platform engineering — much more on the cloud-infra side of things than the building side. So it's cool to come at this from the perspective of trying to deploy agents somewhere and make things work. That's why in this workshop we're going to deploy things to GitHub: eventually, when you build an agent, that's what you're going to want to do. You have to put it somewhere to run it — it can't just live on your machine all the time. Well, depending on what the agent is. Anyway, if you've made it this far — you've made it to the docs — then we're going to talk a bit about what Dagger is and why we're building agents with it.
And so, basically, Dagger is, like I said, a container runtime — a workflow engine. People have historically done things like build their CI/CD with Dagger, because you're building these pipelines that orchestrate containers and run all these tasks. And it runs the same on your machine as it runs in any cloud — in your Kubernetes, in GitHub, wherever your CI might run — it runs the same everywhere. So you're making these workflows. And the cool thing is, that's also what agents are, right? They're just these processes where we have a bunch of tools we want to give to an agent. Anyway, we're going to see it in action. And so,
let's see — we have components, right? Dagger itself is made up of core components: containers, like I said, but also repos, directories, files — and now LLMs are also a component that you work with within this toolbox of Dagger and how we're building things. So an LLM is just another building block. It's not a special framework built just for making an agent, which then lives next to your software; it's another component within your toolbox, and you're bringing LLMs into these existing workflows. And that's why it's a little bit different.
Yeah — another way of thinking about Dagger in a nutshell is that Dagger is for software-engineering workflows and environments. So you're going to see us building some environments — essentially containerized environments with some functions — and all of these things can become tools that human software engineers and AI agents use, on both the development side and the app-delivery side of things. Of course, these areas are all blending and squishing together right now; we're seeing all this stuff happen in real time. So Dagger is going to be one tool that you can use for that whole range. You explained Dagger as a tool to build containerized environments — what's the distinction between that and Docker?
Yeah. So, the question was: if Dagger is a tool to build containerized environments, what's the distinction between Dagger and Docker? Docker has been around for a long time, and in fact the founders of Docker are the founders of Dagger. The original scope of Docker was really about containerizing an application and making that thing portable, so it can run on my laptop, or Kyle's, or up in Kubernetes, or anywhere. Now what we're doing is taking a whole workflow and making that a portable thing. So yes, there are definitely multiple containers and other types of objects involved, but everything's sandboxed by default. And we'll see as we get into it — great question. Yeah. So we're
writing code that is workflows itself.
So that code can be Go, Python, TypeScript, Java, PHP — we have all these different languages you can write in. And the cool thing is that you're not choosing your language for Dagger and then that's the world you live in: Dagger has cross-language interop. So if I write a cool Dagger module — oh, it's going to want to load. That's why I have it. Oh, you're amazing. Yeah. "An error occurred." Wow. Okay. Oh no. Okay. So, I broke it. If I write a cool module with Dagger that, say, does a TypeScript build or something, I can share it on the Daggerverse. And maybe I wrote that module in TypeScript while you're writing your modules in Python — you can just install my module, and you get native bindings in your language to work with Dagger modules across languages. So anyway, that's my point: when you pick a language, you still get to benefit from the whole Dagger ecosystem.
And we don't have images on these slides — that's okay. There were some sweet animations there. It's so cool. Yeah, like the coolest animation you could think of. Awesome. So I think we can probably skip forward here. Yeah. And so, hopefully we've installed — or we're downloading, or maybe we're still downloading — Dagger. So I'll run through the basics of Dagger real quick. Hopefully that's big enough — maybe I'll make it a bit bigger. Yeah, that's good.
There we go. Yeah. Cool. So, we've installed Dagger, and we've got a container runtime somewhere. The first thing we can do is create containers. So if I'm in Dagger Shell — which I think I am over here... that's definitely going to get bigger. Yeah. So I'm in Dagger Shell and I can say container, I think. I don't know. Let's go over here. Fighting the internet.
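For reference, the Dagger Shell session being demoed here looks roughly like this (typed at the interactive shell prompt; syntax as shown in the Dagger docs, and the exact output depends on your setup):

```shell
dagger    # start the interactive Dagger Shell

# then, at the shell prompt:
container | from alpine | terminal    # build an Alpine container, open a terminal inside it
```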
And so what Kyle's showing is that there are a few different ways of using Dagger on the command line — including a non-interactive mode, where you just fire off a Dagger command to run one of these workflows, a function, or the interactive shell mode he's showing here. Yeah. And it's all about building blocks, right? With the basics of Dagger you have, like I mentioned earlier, things like containers, directories, LLMs. But with our code we're actually going to build larger blocks out of those blocks, to assemble an actual part of a workflow, and then take those blocks and build bigger workflows out of them. So as we're using the shell, we're always interacting with some level of a workflow here. Like with container: I can say from alpine, and now I've got an Alpine container and we can do things with it — anything you might want to do with a container. So I could literally say give me a terminal, and now I've got a terminal in a container. And these are
the exact kinds of tools that we're giving to our agent as we build these pipelines. You can give it a container, but you can also build a specialized workspace for your agent to do things like write code. So that's a lot of setup to say: we've got all these primitives we can give to agents, and we can build some really effective software-engineering agents by giving them the exact tools they need to complete the job. Also, like we mentioned earlier, people use Dagger for CI/CD, because you can create these workflows for, you know, running the tests for your application or whatever. And the cool thing is that if you've done that, you can take that same code you wrote for running your tests and give it to your agent. So now your agent isn't just guessing at the code it's generating — it can actually run your actual tests, the same way your developers and your CI do, to make sure the code it's generating is valid, and it can iterate on these things within the agent. That's all we're going to build right now. And we were just talking to
somebody outside before the session who was telling us that in his organization, they now get — not infrequently, because of people like a product manager who's discovered vibe coding, or a team using AI-powered IDEs or whatever — people cranking out these massive PRs for him to review, like 25,000-line PRs. And the PRs don't even stay static. He said, "I just got this PR and I have to review it, and then I come back and now there are five more commits on it." So you've got this thrash happening, and part of the reason why bringing CI and AI together makes so much sense is that we actually have to bring some balance back. We've got this fire hose — we can all now create so much code — but how do we make sure this is actually code that we can test and deploy with some kind of confidence at some point? We need to balance things out and make sure there are software-delivery workflows there to test and build and validate things before we put them out in production. So that's some of what we'll get into today.
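That generate-test-iterate idea can be sketched in plain Python. Everything below is a toy stand-in — `generate` plays the role of the LLM and `run_tests` plays the role of the real test function; none of this is Dagger's actual API:

```python
def develop(generate, run_tests, max_attempts=5):
    """Toy agent loop: generate code, run the real tests,
    feed failures back to the generator, repeat until green."""
    feedback = None
    for _ in range(max_attempts):
        code = generate(feedback)        # stand-in for the LLM call
        ok, feedback = run_tests(code)   # stand-in for the test function
        if ok:
            return code
    raise RuntimeError("could not produce passing code")

# A fake generator that only gets it right after seeing test feedback:
def flaky_generate(feedback):
    return "fixed" if feedback else "buggy"

def fake_tests(code):
    return (code == "fixed", None if code == "fixed" else "1 test failed")

print(develop(flaky_generate, fake_tests))  # → fixed
```

The point of the loop is exactly what's described above: because the agent runs the *actual* test suite, its second attempt is grounded in real failures rather than guesses.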
Awesome. So, yeah, let's actually get into writing something. I've zoomed out a bit so you can see where I landed in the docs. Here on the left side we have the quickstart, and I clicked on "Build a CI pipeline." That's basically to get us to a point where we have a project, and then we're going to make an agent inside it that can build new features for that project. So it's just a real quick thing where we set up this example project with functions that know how to build and test the project. I'm on this page, and we've already talked through installing Dagger and a bit of the basics. So now we have this example application, the hello-dagger template. If you go to GitHub and say "Use this template," you can name it whatever you want — hello-dagger-workshop or just hello-dagger, it doesn't matter — and create a repo in your GitHub from this template. The important reason for that, versus cloning it, is that it's going to make it way easier when we push things to GitHub in a little bit, so you can run the GitHub Actions that actually run the agent. So we're going to use that template, and I've done that over here, in this repo where I have my hello-dagger-py — because I've done this in every language. You can use whatever language you want. I'll be walking through Python today, because I think that's probably what a lot of people here are most comfortable with. But if you're not, I can switch between languages — just raise your hand and say, "Show me Go." That's okay. So
let's see. I've got this application in my GitHub now, and I've cloned it to my machine. So now I can look at the code — it's this Vue app that has a bunch of things in it. But the main thing is we want to be able to make an agent that develops it, right? Optionally, you can configure Dagger Cloud — let me just start loading that web page now. It's basically a visualization, so you can really easily see what your agent's doing. Because that's the hardest part of building agents a lot of the time: understanding what they're tripping on, what's actually going on inside the agent, how it's interacting with its tools, what tools it's even seeing. With this visualization, you're able to really easily see everything your agent's doing, and that's helped me a lot in developing my prompts — prompts and environments, right? If I see that the agent often fails because it tries to call a tool incorrectly, I can improve the description of the tool, or maybe I need to change how the tool works completely. Being able to see how the agent's behaving is a huge part of that. Whether you're using Cloud or any other tool to visualize your agents, that's the most important part of making them reliable. Okay, so we've cloned the
project. We now want to create a Dagger module. So if you've installed Dagger, you'll run this command, dagger init, with whatever SDK you're using — we have these tabs here, and I'm going to be using Python. And the name of the module is going to be hello-dagger. That's important, because that is basically the name of the object that gets created. So I've run dagger init, and now I can open my dagger folder. And sorry, that's really small — I don't remember how to make that bigger in Zed, but we can. It's in the preferences to zoom the sidebar — Command-comma. Oh, you hit Command-comma and it'll — yeah. And at the top there's a — well, I guess you don't have yours set, but it's a font size. "UI font size" — that one. Change it to like 25 or something. Watch it. There you go. Save that. Bam. Boom. Okay. So now hopefully we can see the sidebar a little better. So I'm in this dagger directory, and apparently I've written Go for this one — oh, because that's the wrong project. Cool. Let's go to the correct project. Let me open that up and close all these things. Okay. So I'm in the correct
project, and in my dagger folder I've got this src/hello_dagger/main.py. When we ran dagger init, it generated basically these same files, but with different content — the basic generated scaffolding to get you started building modules. But now we're going to run dagger functions.
It'll show us what's available in the Dagger module that just got created. This is basically how you interact with Dagger via the Dagger CLI: you have this code, which is just functions describing how to interact with your application. For example, this build one — we have a container. If we go down to this function, you see we're just building building blocks: we have a function that gives us a Dagger container from this base image, we put these files in it, and we run this command. And so when we want to do a build of our app, we can call that other function to get the container with our code in it, run another command, and then get a directory out of that. This is really basic Dagger stuff — how you create your dev tools using Dagger. This is good to call out here: originally we had that example from Kyle where he showed us running a container, and we said give me a container from an image — from Alpine, or from Node, or whatever. And then you can layer on more things, like adding a directory of source code to it, or exec'ing a test command, whatever, right?
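The chaining being described here follows an immutable builder pattern. Here's a plain-Python sketch of the idea — `Pipeline` is a toy stand-in, not Dagger's real `Container` type:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Pipeline:
    """Toy immutable builder: every call returns a NEW object."""
    base: str = ""
    steps: tuple = ()

    def from_(self, image: str) -> "Pipeline":
        return replace(self, base=image)

    def with_exec(self, *cmd: str) -> "Pipeline":
        return replace(self, steps=self.steps + (cmd,))

env = Pipeline().from_("node:21-slim")
build = env.with_exec("npm", "install").with_exec("npm", "run", "build")

print(env.steps)    # () — env is untouched; with_exec returned new objects
print(build.steps)  # (('npm', 'install'), ('npm', 'run', 'build'))
```

Because each step returns a new value instead of mutating the old one, intermediate objects like `env` can be reused as the starting point for many different pipelines — the same property that makes Dagger's caching and chaining work.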
Chaining these things together — you notice I'm using this builder pattern here in code instead of in a CLI. It's all the same API under the hood; in this case he's using the Python SDK against that same API, but the same things are happening either way. Same unified cache, where all those cached operations live, and one API. That's why it becomes really easy to use different language SDKs: it's ultimately all one API under the hood. And so, in the next step of this guide, where it says "Construct a pipeline," we've copied this code into that main file, and it has all of those functions — publish, build, test, and that build-env we looked at, build-env as in your build environment. So when we run dagger functions, we'll see those show up here with their descriptions and everything, straight from the code. So at this point we've got the project we want to build the agent in, and we've got some Dagger functions that let us build and test that project. So now let's actually get the agent started.
And now I'll zoom out again so you can see, because I jumped to the next page here, which is "Add an AI agent to an existing project." We're starting from exactly where we left off with that previous guide, where we pasted in that code. We have our build, publish, and test in our Dagger functions — lots of useful functions, but the expectation was that a human was probably running those, right? Or you were having them run in CI. You'd set that up, but nothing really agentic yet. We're just running our unit tests or our build and creating a production container, and it's you as a developer, or your CI environment, running these functions. But now we want to create an agent — for developers to interact with, or to run anywhere — and our agent should be able to use these functions as well. So we're in this next guide, and we're going to create a submodule, because, as I mentioned, our agents want these refined environments where we give them access to exactly the tools they need to complete their tasks — and nothing more than that. No, no, wait — I thought you were going to give agents every possible tool! You want to let them have like a thousand functions that do very powerful things and just let them run crazy. Is that not best practice? Uh, yeah — maybe not, based on the smiles across the room. Okay. Yes?
Oh yeah. So the question is: what if the tools needed change at runtime? What about dynamic tools? In a lot of cases we're working with MCPs, where we might have a fairly static tool experience — so what happens when things change, Kyle? Well, the main thing is that you want the right amount of tools for that agent to solve its task, whatever that task is. It needs the flexibility to solve complex problems — it's not just going straight down a workflow saying "I do this, then this, then this," because you don't really need an AI for that. It needs enough tools to choose its own path to solve whatever task you throw at it. But you don't want so many tools that it becomes a generalized agent that does anything. It needs some amount of focus so it can solve a specific set of problems really well. And we will see, in the agent loop that's going to happen, the ability for the LLM to see this menu of tools it has, and to select the right tool at the right time given the context. Yeah. But definitely, a big part of iterating on and building these agents is determining the scope of the tools — the balance between flexibility and reliability. You want it to be able to solve a breadth of problems, so it needs a variety of tools it might need — you don't know exactly what it's going to need ahead of time — but you don't want to give it so many that it gets lost and confused and fails half the time, right? And so that's what we're going to focus on here: we're going to create a submodule that is basically its playground — a specific set of tools that lets it edit our source code.
And so if you've worked with agent frameworks in the past that have things like file-system tools — we're actually going to build that in our own code right now. And it's just a few lines of code, so don't let that scare you. That's the idea: we're creating these building blocks, and as you scale this up, you can consume blocks that other people have written; you don't have to write it all from scratch. But for the practice of building this as a workshop, we're going to write it all. So, what are we going to put in this workspace — what kind of functions? Yeah, so we do another dagger init here, and we name it workspace. So we've created another subdirectory in our file system — workspace, under the dagger directory — and
this is another Dagger module. This one's just going to have the functions that we want the agent to have access to. You can imagine it wants to read the files in your source tree, so we have a function for that — and again, a file is one of those core components of Dagger. Our workspace has a Dagger directory, which is our source code, and we give it a function to read a file from that — it just gets the contents of that file. That's just the Dagger API saying: this is a path to a file; I can do lots of things with a file, and one of those is to look at its contents. Another function it needs is to be able to write files to the workspace, obviously. It's a very similar API, where we say: okay, give me the path, and also the contents to write to that file. And then it needs to know what files are in the workspace, so it needs to be able to list the files — it's literally going to do a tree in that workspace so it can quickly see the file structure of your code. And
so now, basically, with those three — we have one more that we'll look at in a second — it can do all the code editing you might ask it to do within your file system, right? With more complex projects you might need more advanced versions of these: you might need to read specific lines from a file, or scan files, or insert lines into files. But for the demo agent we're building right now, it's just the most basic: read files, write files, and list files. So if the agent had access to this workspace object, it would see those functions as tools — read-file, write-file?
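Stripped of the Dagger types, the three workspace tools boil down to something like this — a plain-Python sketch over a local directory, not the actual module code, which operates on Dagger's immutable `Directory` objects:

```python
from pathlib import Path

class Workspace:
    """Toy version of the three workspace tools the agent will see:
    read a file, write a file, list all files."""

    def __init__(self, root: str):
        self.root = Path(root)

    def read_file(self, path: str) -> str:
        # Tool 1: return the contents of one file
        return (self.root / path).read_text()

    def write_file(self, path: str, contents: str) -> None:
        # Tool 2: write (or overwrite) one file
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(contents)

    def list_files(self) -> list[str]:
        # Tool 3: stand-in for running `tree` in the workspace
        return sorted(
            str(p.relative_to(self.root))
            for p in self.root.rglob("*") if p.is_file()
        )
```

Each public method becomes one tool in the agent's menu, which is exactly why keeping the method set small and well-described matters.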
Exactly. It will in a minute. Yeah.
Yeah, we haven't plugged the brain into the robot body yet. Right — if you think about the agent as a robot body with a brain plugged into it, right now we're building the robot body, and the brain is going to come in just a second here. It could be any LLM — kind of a brain-in-a-jar analogy. So,
our last function, finally, which I mentioned earlier, is test. When the agent generates code in its workspace, it needs to be able to test that the code it generated is correct. And if it isn't, it'll get the test failures and iterate until it's producing good code. This is kind of the most important part of building a good agent: some sort of validation tool, whether that's a test, or lint, or just something to check that what it's generated is correct — or maybe all of these things; there can be different levels of complexity. But anyway, now we've got this workspace. So if I go into my workspace, I have this exact code over here, and if I run — I think I have the function down here — if I say dagger -m: -m points to a specific Dagger module, and then I say functions. Remember, before, we ran dagger functions; if I run dagger -m workspace functions, I'll see exactly those functions we just created. Okay. So the next step is: we
want our main Dagger module to have that as a set of tools it can use. So we're going to dagger install that workspace module; now it's installed as a dependency of my main module, so it has this object available — and we'll see why that's really cool in a second. But basically, all your dependencies in Dagger — like I mentioned, and I can look at this real quick: we have a big community of people building things with Dagger, and with that we have the Daggerverse, which is this massive index of thousands of Dagger modules that do different specialized things. Whenever you install one of these into a Dagger module, it creates — if we look at my dagger.json in this project — this list of dependencies.
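That dependency list lives in the module's dagger.json. After the install, it looks something like this (the exact schema varies by Dagger version; the field names here are illustrative, not authoritative):

```json
{
  "name": "hello-dagger",
  "sdk": "python",
  "dependencies": [
    {
      "name": "workspace",
      "source": "workspace"
    }
  ]
}
```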
And so your Dagger module has basically its own Dagger client, which is the core Dagger API plus all of your dependencies. So when you're writing code, you see all of these things available on the main Dagger client — native in your language, like I mentioned earlier — and you can do all these complex tasks. So basically we've built two modules already: the workspace module, and the main module where we're doing our tests and builds. Now we want to create the agent, which can take that workspace and our tests, so we can actually ask it for new features, or modifications, or whatever. That's the next step in the guide we're looking at: we want to create an agentic function. Could we have mixed and matched — could we have written that workspace in TypeScript or in Go and still installed it into our Python module? Yep, exactly: any individual module can be written in any language, and you can mix and match however you want. I knew the
answer to that question — I was just checking. But yeah, we see people do this a lot when they have different teams — maybe there's a front-end platform team and a back-end platform team, and maybe these folks are TypeScript and those folks are Go, but they can interop and use each other's stuff. So yeah. Everything — every task or workflow or whatever that you do with Dagger — is a function in your code. And so
an agent is no different, right? It's just going to be another function. We're going to call this one develop, because we're going to give it an assignment to complete in our project, and it's going to complete that assignment. So the develop function is our agent. This guide gives us the code to copy, and I'll open it in the editor so it looks a bit nicer. Don't worry, it's only like 500 lines. It's totally fine. And you know what — it's really short; it's not 500 lines. This is it right here. So, we have a few
lines, maybe. You have it all spaced out nicely. Yeah. So we have a new function called develop, and it takes in an assignment. This Annotated thing is just the Python way of giving us doc strings for the parameters. In different languages — if we go back to Go — your arguments just look like this, where a little comment is basically the help string: when you're using the Dagger CLI and say dagger functions, it'll say the assignment parameter is "assignment to complete," which is really cool. And we see our source here, which is our project source. But of course we don't want to have to pass that as a parameter when we're calling our agent, so there's this cool thing with Dagger where you just say the default path is "/", and that's going to be the root of our git repo. If we don't explicitly pass in a source parameter, it just passes in our git repo as that parameter. And so now we just have to say develop "build me a cool new feature" and it's going to kick off our agent. So
let's look at the components of the
agent real quick. So the environment is
like the main thing, right? And I I've
used that word a lot today and hopefully
a lot of people are using the same word
in the same way, but you you have your
uh your robot body in the brain like
Jeremy said where your environment is
basically not just the tools that it's
using to complete the task, but also um
your your inputs and outputs for the
agent um any any objects or state that
it's working with. All of this is the
environment. And so we want to construct
this environment and then plug in the
LLM which is our brain and say here's
your environment, here's your task slash prompt, and complete the task. Um and so this is
this is the environment we put together
where the assignment is a string input.
Um so we have this cool kind of way of declaratively building your prompt, right, where our assignment is the assignment to complete. This workspace input is a workspace with tools to edit and test code. So now, when we connect these things, our agent will see this as the description of this thing that it can use. We're building out this prompt by annotating our code, basically. And so
with this workspace input thing, that's referring to the workspace module we just created. So if the workspace... Exactly, so if we called that something else, like foo workspace, and we installed that, this would be "with foo workspace input." Right.
We're dynamically generating all of these functions for the environment type, to say: any objects in my dependencies, I can have as an input or an output of my
environment. And so we notice that we
also have a workspace output which is
the completed task. Um because all
objects in Dagger are immutable. And so
you I give it an object, it's going to
do a bunch of things and give me back a
different object that's it's completed
task. Um, and maybe that's like a boring
detail, but the main thing is the thing
I passed in is still going to be the
same, but it's going to have a new
version that's given me back called
completed. I mean, I think a lot of people are dealing with this kind of stuff now, right, with the different APIs, doing a bunch of JSON parsing and validation? There are different frameworks doing it different ways, but you could just think of it as our way of saying: here are the typed inputs, and we're expecting a typed output back at the end. And this gives us a way to ensure that we're getting what we actually asked for. Now, next,
we need our prompt so we have the
environment and the prompt and we give
both of those to the agent basically um
so the prompt I believe is just a bit
lower here if you're following following
along here. So it wants you to create a develop prompt markdown file,
and it looks like this. So I'll just
open it again over here on my editor.
So this is our prompt and so we're
saying you're a developer on this
project. You're going to give you're
going to get an assignment and the tools
to complete it. Your assignment is
dollar sign assignment. And so this is basically going to be templated in by the assignment in our environment. So it's going to drop that right in that prompt.
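That dollar-sign substitution works just like Python's own string templating; a tiny illustration, with the prompt text paraphrased from the workshop's:

```python
from string import Template

# The prompt file uses $assignment as a placeholder; Dagger fills it in
# from the environment's string input, much like Template.substitute.
prompt = Template(
    "You are a developer on this project.\n"
    "Your assignment is: $assignment\n"
)
rendered = prompt.substitute(
    assignment="make the main page say hello workshop people"
)
print(rendered)
```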
So the the agent itself doesn't have to
go read this other variable in its
environment. It knows, okay, my
assignment is make this cool new
feature. And then we have a bit of
prompt structure here, right? Where uh
if you've built a lot of these agents,
you've probably kind of refined how you
build your prompts and what those
structures look like. Uh this is a
really simple agent so it doesn't have a
ton of structure but we do say uh before
you write code make sure you analyze the
workspace to understand the project
structure so it's not just going to
create some garbage or be like cool I
made this new file uh but I didn't look
at the project first. Um don't make
unnecessary changes because sometimes uh
you'll see especially certain models uh
without the right constraints will go
make the change you ask for and then
change four other things and be like
cool looks good ship it. um and always
run the test. So, we do have to ask it
to run the test once it's made those
changes. So, it's not just going to see
the test function and be like, "Oh, I
should probably call that." We want to
make sure to tell the LLM like, "Okay,
you have a tool that can validate the
code you're writing. Make sure you use
that tool." Uh, and then don't stop
until you've completed the assignment
and the test pass. So, this is telling
it, you know, keep working until you've
satisfied what I asked it to do and the
test pass. Some good reinforcement. You kind of like told it to run the test twice. Yeah, you better. And this comes from experience, right? Maybe a
third time will help too. I'll say it
doesn't hurt at all because Yeah. And
maybe in all caps because it's like what
we find we end up running evals on these
things, right? Where we'll try different
LLMs plugged in and then we'll iterate
some on the prompts and until we're
getting the results, the consistency we
want across the different the different
ones. And um and yeah, it comes from
experience of knowing like how they veer
off track and etc. How we're writing
these. Yeah. And that's what I mentioned earlier, using something like Dagger Cloud to be able to see the visualization of all the work the agent's doing. If I'm frequently
seeing like okay that the agent is just
calling write file and then returning I
know that okay I have to tell it to look
at the code. I have to tell it to test
the code. And that's going to be
different for every model and especially
like the prompt structure is different
for different models.
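That iterate-on-prompts-per-model loop can be sketched as a tiny eval harness. Everything here is a stub: run_agent stands in for a real call to the agent, and the "pass" criterion is made up for illustration.

```python
# Stub eval harness: score prompt variants per model by pass rate.
# A real harness would invoke the agent and check whether the tests passed.
def run_agent(model: str, prompt: str) -> bool:
    # pretend only prompts that insist on running tests succeed
    return "run the tests" in prompt

prompts = {
    "bare": "Complete the assignment.",
    "strict": "Complete the assignment. Always run the tests.",
}

results = {
    (model, name): sum(run_agent(model, p) for _ in range(5)) / 5
    for model in ["model-a", "model-b"]
    for name, p in prompts.items()
}
print(results)
```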
Yeah. Yeah. So the question is like can
you implement like reflection agents to
police each other and that's something I
probably have an example of that I can
show at the end if we have time. Um but
yeah like remember in the with this each
agent is just a dagger function and so
you can create all these agents layered
on other agents. Um, and even in your
environment, you could actually put an
agent in the environment and say, "Hey,
you have this a this agent at your
disposal uh if you needed to do
something, right?" And I have examples
of that, too. But it's like similar to
the the concept of like Google's A2A
where you you say uh if you're not
familiar with that, it's basically this
um structure where you tell an agent,
listen, you can do these things, but you
also can talk to these other agents, and
that's what each of those other agents
do. And so if you need to, you can reach
out to them and say, "Hey, other agent,
um, I need you to tell me how to write
TypeScript." And that comes back, right?
So you can put agents in environments.
It's all just piecing functions
together, right? It's it's just the same
code we've always been writing, but now there's an LLM component. Um, cool. So
now this line right here, line 94, most
important line of the workshop because
this is the agent where we've actually
taken our Dagger client and LLM. So this
is another type within the Dagger
client. Make it bigger just for a
second, you know, just Sure. Yeah. So
it's off the screen since it's so
important. I feel like it's not even
getting that much bigger. It's just so
huge. Yeah. There we go. Yeah. Uh cool.
So like we we've said, "All right, from
the Dagger client, we need this LLM
type. Uh we give it an environment. We
give it a prompt. And that's the agent."
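The environment-plus-prompt wiring, and the immutability mentioned earlier, can be mimicked with frozen dataclasses. This is a conceptual stub, not the real Dagger client API; the class and method names are made up to mirror the pattern:

```python
from dataclasses import dataclass, field, replace

# Conceptual stub: every with_* call returns a NEW object, the way all
# Dagger objects are immutable.
@dataclass(frozen=True)
class Env:
    inputs: tuple = ()
    outputs: tuple = ()

    def with_string_input(self, name, value, desc):
        return replace(self, inputs=self.inputs + ((name, value, desc),))

    def with_output(self, name, desc):
        return replace(self, outputs=self.outputs + ((name, desc),))

@dataclass(frozen=True)
class LLM:
    env: Env = field(default_factory=Env)
    prompt: str = ""

    def with_env(self, env):
        return replace(self, env=env)

    def with_prompt(self, prompt):
        return replace(self, prompt=prompt)

env = (
    Env()
    .with_string_input("assignment", "build a feature", "the assignment to complete")
    .with_output("completed", "the completed workspace")
)
# environment + prompt -> the agent
work = LLM().with_env(env).with_prompt("complete $assignment")

# a fresh Env() is untouched; each step handed back a new object
print(Env().inputs)  # ()
print(work.env.outputs)
```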
So now we've got this thing, work, that is a Dagger LLM. See, people want pictures
of it. You got You got center it. Yeah.
Make it look good. There you go. Boom. I
can If you need your pictures, you can
get one with Kyle and
commemorative. We've got like frames
outside. You can slide it in after. I'll
autograph it. Um, so that's the agent. That's literally because we've asked it; well, we've said in this prompt... we didn't really ask, we told it in the prompt. Uh,
this is this is your task. This is how
you work. Don't stop until it's done.
And so now this work variable in our
code is the completed work. And so from
that work we can look back at the
environment in that and say I have this
output called completed because you
remember in our environment we defined a
workspace output called completed and
this thing should be a workspace. If
it's not somebody screwed up that
happens sometimes. Um it's a good final
check and type check. Yeah. And so from
that workspace, we want to grab the
completed directory which is the source.
So if you remember in our workspace
object here, it has an attribute called
source which is a directory. And so this
is all like a few layers of complexity,
but we've said in that workspace, we
have a source thing that's a directory.
And ignore the node modules folder
because maybe that's going to break in
my machine. Yeah. Uh and then now that
we've got that just to make triple sure
because remember I mean we we did tell
it three times to run test but now we
get this back and in our code we're
saying all right now run the test
because this is all the same code that
we're using throughout our project to
run tests. So we can say okay completed
now manually run the tests, and if that fails you could maybe kick it back into the LLM and say, hey, this failed, try harder. That's pretty huge, right? So
that's like trying to put the agents on
on rails or give them guard rails,
whichever metaphor you like better. But
it's like, you know, that's pretty key
because we're trying to like let them do
the creative stuff they do, the
generative stuff they do, like write
some code for us, but we need to enforce
certain standards, right? It could be
compliance things, could be like you say
linting, so we don't just dump that garbage back to your machine.
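That kick-it-back-on-failure idea can be sketched as a plain retry loop. Both inner functions here are stubs standing in for the real agent call and the workspace's test function; this is a conceptual sketch, not the Dagger API:

```python
# Conceptual sketch: gate the agent's output on the test suite and feed
# failures back as a follow-up prompt.
def run_agent(prompt: str) -> str:
    # stand-in for: environment + prompt -> LLM -> completed workspace
    return "fixed code" if "failed" in prompt else "draft code"

def run_tests(code: str) -> bool:
    # stand-in for the workspace's test function
    return code == "fixed code"

def develop_with_guardrail(assignment: str, max_retries: int = 2) -> str:
    result = run_agent(assignment)
    for _ in range(max_retries):
        if run_tests(result):
            return result
        # kick it back into the LLM with the failure attached
        result = run_agent(assignment + "\nthe tests failed, try harder")
    if not run_tests(result):
        raise RuntimeError("agent output never passed the tests")
    return result
```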
Yeah. Uh, and remember all these changes
that it was making as it's iterating on
these things, that was all done in a
container. It's not just changing your
file system as it's doing its work. And
that's a key thing, too, because now
maybe you have 10 of these agents
running. They all have their own
sandboxed workspace where they're
editing these files. They're not messing
up your local state. And before we do
mess up our local state, we triple check
that the test pass. And then we say,
okay, return that completed directory.
And so now this function
and we'll just triple check here on the guide side that we didn't miss anything. We say dagger functions, and we have this develop one that shows here. So now I go into Dagger Shell, which is hopefully what it asks us to do. It is. I say hopefully; I wrote this, so you know, I'm just checking myself here. Um and
I can go in and say dagger. Now before I
do that um one thing I don't think I
called out at the very start here was
that we had to like configure an LM
provider. So with Dagger, you bring your own model. You can use OpenAI, Gemini, Anthropic, local models, Ollama, Docker Model Runner, literally anything; you could hook up to Bedrock. Um so you do have to configure
some environment variables to be able to
for Dagger to make API calls to that,
right? Because we're just we're just the
agent with the tools. The model is
living somewhere else. Um, and so this configuration page shows all the different options on how to configure things. Um, one really
cool thing to call out, I'm just going
to type something really scary.
Um, oh my gosh. So, Dagger also has cool
secrets provider integrations. So, I
don't have my actual API key uh echoed
there. I just have my 1Password reference, and it's just sitting in 1Password somewhere. Um and so let's see.
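The idea of the secret reference is that the environment variable never holds the real key. A small illustration; the variable name and vault path here are made up, and only the op:// URI shape comes from 1Password's secret-reference format:

```python
import os

# Illustration only: the env var holds a secret *reference*, not the key
# itself. A secrets integration resolves it only at the moment of use, so
# the real credential never shows up in shell history or an `env` dump.
os.environ["LLM_API_KEY"] = "op://Private/LLM provider/credential"

print(os.environ["LLM_API_KEY"].startswith("op://"))  # True
```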
Yeah. So it's just pointing at this
credential. Yeah. And then if I reveal
in plain text,
um
so I I've configured this in my
environment. So now when I say dagger,
um it's going to take a second to spin
up. And this is the part where if you're
struggling a bit with Wi-Fi,
this might be a bit tough, but it's okay
because if you are following along,
we're going to push this to GitHub in a
second and it's going to run in GitHub
and it's going to be on GitHub's
network. So, we don't have to be uh
beholden to that. But now, can you run
LLM? Yeah, exactly. So, now if I say LLM
pipe model, for example, where you see my little 1Password prompt. Nice. So,
it's got my key. It's going to take a
second to think about it. Uh, and so
with each model provider, we have a
default model, but you can also specify
one. Um, we can also specify one in
code, but right now by default, it's going to use Claude 3.5. Uh, so maybe we're not going to get the best results, but we'll see. Yes, classic. A classic.
Yes. Um, cool. So now I have that and I
can say, and we have that new develop
function, right? So I can say help
develop. And so this is the thing we
just made where Can you bump that up a
little bit bigger? For sure. Yeah. Yeah,
perfect. Uh, so we have that required
argument of assignment and that was our
assignment complete. We have an optional
argument source which again is just
going to be my repo and this is going to
give us back a directory. Uh, so here's
how I use it. I just say develop and
then do the assignment. So let's say
develop
and then, we didn't actually look at the project we're Daggerizing yet, but I promise it's like a Vue.js website. So
let's ask it to I think in here we say
um
the example thing is to make the main
page blue and I'll say make the main
page say hello workshop people. It doesn't say that right now. Um and I've never run this, so maybe it'll succeed. So now we can see this
happening. We see our prompts getting
passed in. We see the little person face, that's the prompting, and the little robot head of the model, which is Claude 3.5 Sonnet, saying cool, let me do these
things and we can actually see it
calling tools right so it's it's uh
looking at the functions available we
see that workspace list you said yeah
list files yeah the ones that we made
um and so it figured out okay I can look
at my files now here's this specific
file I might need to edit so let me read
that file and so it it now sees the
contents of this. And while this is
running, let me just open up Cloud, and hopefully this will load so we can actually see the Dagger Cloud visualization of this, because it's maybe a bit easier to see. It wants me to sign in. I'm clicking the button. I think my Wi-Fi is failing me on this OAuth flow, but while it's running, we'll just watch
this. It's the same it's the same open
telemetry in both places. So that you're
getting streaming to your terminal UI
and the web UI. We see it call write
file with some new file contents. And
now says now that we've made the change,
let's run the test. And this is the part that really might fail on this Wi-Fi, because it's doing an npm install and downloading a bunch of node modules. But it's uh
it should pass in a second. Uh we'll
just let it go and we'll talk through
it. But we can see that our agent actually wrote the files and then it's running the tests, which is really awesome. Uh cool. So this opened up over here. The npm install, was that part of the tool that you gave it? Oh yeah. So we see it's
saying like with exec npm install with
exec npm run test unit. If we go back to
our workspace
in our test function, that was part of
it. So this is like the agent just had
to call test and we've defined what
happens when you call test. So it's not like the random ones, where you say make sure the tests pass and it's like, I'm going to try pytest with these crazy options, and you're like, why did you think that was going to work? Instead you just give it exactly what it should be. We could give
it more flexibility in how it runs
things, but in this case like we already
know like this is how you run tests in
the project. So we just give it a test
fun. Like that's probably the biggest
thing in like creating reliable agents
with Dagger is like
giving flexibility where it's important
for completing tasks and removing it
where you know exactly how things are
meant to happen. So you know exactly how
tests need to run. Uh so it doesn't need
the freedom to just run any command in a
container. we know, okay, all you need
to do is modify files and run this test
function. Um, and for more complex
agents, maybe there's some other
functions there, too. But for this one,
like this is the amount of freedom we've
given it. Can we can we like open
another uh Well, hold on. So, we got
cloud. Okay. Okay. We got cloud. So,
yeah. Well, we'll get back to my pipe
dream in a second. Okay. So, let me see
if I can expand this. Uh, and so this is
like the visibility that we want to see
when we're running these agents. So we
saw the prompt and we saw the assignment
is to make the main page say hello
workshop people. Cool. And then so this
is the prompt we gave it. Now Claude 3.5
is looking at this and saying first
let's look at what objects we have and
check out the workspace make the changes
and then run the tests. Sounds good. It
runs list objects which lets it see uh
what it has in its environment which is
like this this workspace tool, right?
Cool. And then it's going to say list
method. So it's going to see what it can
do with a workspace. Like what the heck
is a workspace? It says it has tools to
edit and test code. And then we expand
that. And so this is like this kind of
visibility into the agents environment
where we say, "Oh, there's this
workspace write file function that gives
it back a workspace type and these are
the arguments." Oh, you mean so we
didn't have to write any of the JSON
kind of, you know, description of tools.
It just gets generated from the
functions. Yeah. So we just gave it that
that Dagger module and then it all got
wired up into the agents environment.
And so that's cool. Let me select these
methods. So now I have these as tools to
call and then let's see what's in the
project. So it's going to call workspace
list files. And remember the the way
that it does that in our workspace code
was it creates like an Alpine container
and runs tree. And so we can see the
tracing of that too which is like the
underlying uh actions of the tools being
called. We also see the return of that
which is what the agent sees and it sees
this whole file structure. Cool. And
then we can see it says, cool, to make it say that, we should probably modify this one or this one. So let's see what's in those files. We can see it read the file, and it's going to see this whole HelloWorld.vue file and say, okay, I don't think that was it. Let's see App.vue, and then it reads that file, and then eventually it says, I see that App.vue uses the HelloWorld component and passes a message to it. So now it's going to write the file. It's going to change App.vue to pass a different message to it. Um and let's see, we can expand this to see the whole thing. Yes. Awesome.
Nice. So hopefully if this ever if it
doesn't finish, it's fine because we're
going to push it to GitHub in a second.
Um and then GitHub can run it for us.
But now it's running those tests. So
this is the part that it's currently at
in my shell where it's been running for
like five minutes. Um, so yeah, that
that's the the visibility part I'm
talking about where we can see exactly
what the agent sees and what's happening
under the hood. Um, so this is to be
clear, right? So this is all running on
your laptop and yet it's all inside that
Dagger engine in containers totally
isolated from your laptop. Dagger cloud
is just showing me the visualization.
It's not running anything for me. This
is on my machine, which is why it's
still running. Well, right. And and this
is like because of the connection we
have and because of, you know, whatever
the load we're putting on it. But it's
the other thing to think about is it
could be like uh we're using Python
here. We're using Node, right? We're
using a bunch of different tools. So
like the app is Node, but the the uh the
workflows that Kyle's writing are in
Python. You could have a laptop say or
any server that just has Dagger and a
connection to the internet, and you don't need any tools installed. So that's why the environments thing is not just for the agent developer; I mean, it kind of goes all the way through. So you
could have a brand new laptop with just
Dagger and it would because it's using a
Python runtime container for the
workflow he wrote in Python. That's just
implicitly there. So you don't need to
install Python. You don't need to
struggle with VMs or any other versions
or whatever. It just it's done. And then
inside of that somewhere there's node
container that happened, right? In order
to create this environment, the build
and the build and all that. And that
again, it's all just nested inside of
there and and cached and everything else
automatically. So you can you could kind
of just do this with a very bare bones
machine setup and everything will just
work. Yeah. So we probably won't get to run this part locally; we'll come back to it if it finishes, but
anyway, I'll just describe this flow here. We're in the shell; we ran that develop thing and it gave us back something. But in Dagger, like I keep saying, when you type dagger and get into this view, it is a shell, just like bash, where we can do things like create variables and chain things together. And so what we
could do if this finished is say, okay,
let's actually save that the output of
this thing because remember it returns a
directory. Save that to a variable
called completed. And then we could pass
that to our other functions because
remember they they default to using our
git source from our machine. But we
could we could pass in that optional
directory to all of our functions to say
use this directory instead. So now I
could actually run the whole thing uh as
like a local like I could see the
results of this before even saving it to
my machine. So let me just go over here.
I don't know why I keep ending up in
this folder but we'll go uh to the
correct directory
and we'll open another shell here and
I'll just type in part of this command.
Um because what what I can do is I can
run the output from the agent as like I
can run the whole site. I can build it
and serve it to my machine. Uh and I can
see what it's built before I even save
it back to my disk to say yes, this is a
good solution. Um so once we uh get this
connection here,
just waiting on pipes to connect to each
other. Um
and we'll we'll let that run for a
second. But uh the main thing is we we
can pass this around. We can run all of
our functions with that completed
directory and then finally say all right
we say export that saves it back to your
disk and we're done. So the next step is
all right, we're good with that. We know how to use this agent locally to ask it to do cool tasks. That's fine.
But my my people requesting features on
my site they don't have this installed.
They don't have Docker and Dagger
installed on their machine. They don't
want to use Dagger shell. they just want
to go to GitHub and say make this new
feature. So that's the next step here
and and it sounds ambitious but it's
really quick. Um so we've got plenty of
time to to look at the solution here and
we'll look at it in Python once again.
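The shape of the solution can be stubbed in a few lines before looking at the real code: read the issue, hand the body to the agent, open a PR. Here read_issue and open_pull_request are hypothetical stand-ins for the GitHub module's functions, and develop stands in for the agent:

```python
# Stub sketch of the develop-issue flow: issue -> agent -> pull request.
def read_issue(issue_id: int) -> str:
    # stand-in for the GitHub issue module: returns the issue body
    return "make the main page say hello workshop people"

def develop(assignment: str) -> str:
    # stand-in for the agent: returns the completed source directory
    return "directory-with: " + assignment

def open_pull_request(directory: str, body: str) -> str:
    # stand-in for the PR-creating function
    return "PR opened (" + body + ")"

def develop_issue(issue_id: int) -> str:
    assignment = read_issue(issue_id)  # assignment comes from the issue body
    completed = develop(assignment)    # hand it to the existing agent
    # "Closes #N" in the body links the PR back to the issue on GitHub
    return open_pull_request(completed, f"Closes #{issue_id}")
```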
And so the first thing we're going to do
is actually install another dependency
from the Daggerverse. And this is my
module called GitHub issue. And it's
basically, if we go to Daggerverse... We saw it installed earlier, when you showed us that dagger.json. Exactly, but that's because I skipped ahead. Oh, I see. Yeah. Nice.
Um, so if we search for that and we have
this module called GitHub issue, uh,
it's got a bunch of functions that let
us do things with GitHub issues like um,
we can list GitHub issues in a repo. We
can list the comments on a particular
issue. We can write comments. Uh we can
create pull request comments. Um all
kinds of things with GitHub issues and
GitHub pull requests. So with this
module where I've just basically used
the uh GitHub Go SDK in this Go module
to connect my Dagger Functions to the
API calls. I can install this in my
Python project. And now I I can have the
ability to work with GitHub issues. And
so all it needs is a GitHub token. And
so we create we add another function to
our code um called develop issue. So
remember we created develop now it's
develop issue and all this is going to
do is say we have a GitHub issue out
there with our feature request. We want
to read that GitHub issue give it to our
agent. The agent's going to do all its
things then give us back a directory.
We're going to take that directory and
make a pull request. Oh, so like really
similar to like the assignment that we
gave it, instead it's going to be
reading the GitHub issue and instead of
just getting the directory back
ourselves, we put the directory into a
PR. So we can see the code here. Um, and
so this is the entire thing here where
we're not writing a new agent to do
this. We're using our other agent. We're
just we're we're wrapping it with some
other pieces to say go here to get the
assignment. Once it's done, put that
completed work over here. Uh, which is, from here it was like, read a GitHub issue, and then we get that assignment. And I can open it in the editor so it's probably easier to see. Okay, so we get that GitHub issue, from that issue we get the assignment from the issue body, we pass that to
our develop function because this is our
agent and say here's your assignment
here's the source uh which came from
that same defaulted uh input argument.
Uh and then we
uh get the issue title and URL uh which
is going to be really cool because then
we can actually in GitHub automatically
have the new pull request linked to the
GitHub issue uh just by having this the
body say closes this issue and that's
going to create a pull request. And so
this whole thing like you can run this
part locally too. You don't you don't
have to run this part in GitHub, but it
takes the GitHub token and an issue and
the repo name so it knows where to put
the PR and then it
does that whole flow. But we actually
want that to run in GitHub and that's
super easy. Um, so we've made that
thing. We just saw the code. Uh, now we
create a GitHub actions workflow. Uh,
the first two things we need to do: in the repo, we need to create two repo secrets, one for a Dagger Cloud token. Again,
that part's optional. Um, but if you
want to see all those things happen in
Dagger Cloud, you just put that token in
the environment. And then whatever LLM
key you're using, so the same one uh I
use locally is going to be in that repo
secret. So if I go over here in my repo
and I say
and I zoom out a bit so I get all the
buttons, I say settings.
We wait for the page to load. And then
down here under
secrets and variables
actions,
I have two repo secrets here that we
just saw from that screenshot. Make it
big again. Sure. Um and then there's one more thing. Uh, let's see, that's where we get our Dagger Cloud token and paste it in there. Um there's a little
check box we have to press over here to
let GitHub actions create PRs. Uh
because that's disabled by default. Uh
so if I go under
um okay under actions
general
and then at the very bottom there's this
checkbox allow GitHub actions to create
and approve pull requests. So I've done
that. Uh now I just need to create a
workflow and the workflow is super
short. Um this is a thing you can copy
paste and I'll open it up over here.
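A workflow of that shape looks roughly like this. This is a sketch assuming the dagger-for-github action's verb and args inputs; the exact action version, function arguments, secret-passing syntax, and secret names are placeholders, so copy the real file from the guide:

```yaml
# Sketch only; treat versions, argument names, and secret names as
# placeholders and copy the real workflow from the quickstart.
name: develop
on:
  issues:
    types: [labeled]

jobs:
  develop:
    # only fire when the label that was added is "develop"
    if: github.event.label.name == 'develop'
    runs-on: ubuntu-latest
    permissions:
      contents: write        # commits to the project
      issues: read           # read the issue body
      pull-requests: write   # open the PR
    steps:
      - uses: actions/checkout@v4
      - uses: dagger/dagger-for-github@v6   # installs Dagger in the runner
        with:
          verb: call
          args: >-
            develop-issue
            --github-token=env://GITHUB_TOKEN
            --issue-id=${{ github.event.issue.number }}
            --repository=${{ github.repository }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          DAGGER_CLOUD_TOKEN: ${{ secrets.DAGGER_CLOUD_TOKEN }}
```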
Um under GitHub workflows we have
develop and so now we have this is
GitHub actions. If you ever haven't used
GitHub actions, I'll explain this real
quick, but it's basically uh a CI
platform and we have with this
configuration we tell it uh when thing
when events happen uh go do these
things. So in this case we say: when a GitHub issue is labeled and the label is called develop,
then run this command and this command
is the dagger called develop issue with
those arguments like github token the
issue ID and the repo and these things
are all coming from github actions
automatically. So the environment's GitHub token is created here, where we say this command needs a GitHub token with permissions to write contents (contents are like commits to your project), read the issues, and write pull requests. Uh and so we've put that in the environment. We've given it the API key for our LLM and the cloud token. Um and so now just by
running this dagger call that connects
the dots where github actions whenever
we create that label is going to run
that dagger function and that dagger
function has all the capabilities to run
the agent and open a PR. So that's like
us in the dagger shell when we call when
we are running like the develop function
or some other build function or
whatever. This is just having github
actions run the develop issue function
for us. Yep. Why are you having GitHub Actions do it? Um, so we're
having GitHub actions do it because we
want this flow to be automated inside
GitHub. So I'll show the flow real
quick, but it can run anywhere. So you
can run it as long as it doesn't matter.
Doesn't matter where it runs. Yep. Uh, this just happens to be GitHub Actions because we're already in a GitHub repo. It's free because we're not using any crazy compute to run this thing; most of the hard stuff's happening on your LLM that you're paying for. And they have a better internet connection at GitHub than we do today. So let's create a
new issue and we'll say change the
greeting and we want to what did we ask
for before we asked for like make the
main page say hello workshop something
like that hello workshop people yes okay
so we'll create this GitHub issue
and remember this this whole thing kicks
off when I add the label develop and so
I've already run this on this repo, and obviously made a typo as well at one point. Um, but if you don't have it there, you can just type it, and you'll have a button to create a new label called develop. So we want to call it develop.
Um, so I click that and now my issue has
been labeled and so now that kicks off
GitHub actions to call my dagger thing.
So, let's go over here in the actions
tab and we should see something running
and it says change the greeting and we
can watch this run over here. We can
also pull it up in cloud because
remember I put that cloud token in there
because this stuff is all too hard to
see uh flying by my screen in real time.
So, let's go back here and this is
GitHub actions, right? But it could be
any kind of you know orchestration or CI
orchestration could be Jenkins could be
Gitlab CI it could be anything Azure
DevOps you know whatever whatever you
got. Yeah. Question: how much, if any, prompt modification do you guys do? Is it literally just what's in that one markdown file, or is it aware that it's in Dagger? It is. Yeah. So the question is, how much prompt modification does the agent have? Uh,
Dagger has its own system prompt that it
adds that kind of guides it towards like
how you use uh tools within Dagger. So,
it knows like call the the select
methods and list functions and those
those things we saw doing. You can add
more to the system prompt. You can get
rid of that system prompt if you want
to. But yeah, there is a default one.
Question: what if we have to make further edits because the agent wasn't able to develop the right code or logic? How do we correct the develop output before the rest of the flow? Yes. So if it calls develop, and it runs, and it produces something where we say, okay, that's not right, how do we go back and say, make these changes? Um can we just
edit the completed source? Oh yeah. So
yeah so you you can edit the completed
source if you want. If you say if you
see the source and say, "Oh, it needs
one more change." Or I can show you
another function where we say, uh, we
have an ability to give it more feedback
to say, "Okay, you've done this so far.
Here's some more changes to make because
you didn't get it quite right." Um, and
so we'll see that happening. Uh, yeah,
go ahead. Possible
test. So that
doesn't write
uh the test directory
it I think it should um yeah so I the
question was giving uh the agent access
to the test directory. I think in test
it runs that and I think in our
workspace we just give it the we give it
like the full source full source of the
repo so it could get down in there if it
wanted to. Yeah. I think it it's kind of
a funny thing like making sure the test
passed because sometimes if the agent
broke the test, it'll go change the test
and sometimes that's correct, right?
Sometimes we actually change the
behavior and the tests need to be
updated. But maybe more often that's not
correct. So you might want to maybe have
that as part of your prompting or part
of your validation to say make sure the
agent didn't change the test or or how
it kind of tough to decide like whether
that's correct or not. Yeah.
So it's in a one-liner? Yeah, in our workflow we installed Dagger, and really it's just that there's a Dagger for GitHub action, so we said it in this three-liner. We said this version of Dagger, and this installs Dagger in your GitHub Actions runtime, basically. So we used checkout to check out the repo, and then this step to install Dagger. And dependencies, like, you wouldn't have to... Oh yeah.
Yeah, exactly. So in our dagger.json we have all of our dependencies listed, so you don't have to run anything extra: when we say dagger install, it adds the module to this file, and then we just run it. We don't have to do anything like npm install; it just knows to make sure your client is generated. And that is the nice thing about having those dependencies in a file saved in Git, alongside the project. Because look at what we've done, essentially: when we first got this Vue app project, it didn't have any Dagger, it didn't have anything, right? It was just an app that you could run. Then we said, oh well, let's dagger init this thing, and we got that little Dagger module where we started developing our build and test functions, kind of like our tools for development or for CI, right alongside. And in there is where we've been installing more modules, like the workspace, the GitHub issues module, anything else you would need. So now that's all in Git, and the thing is this fully loaded, daggerized project. It's kind of carrying around its own tools on its back, for a developer to use, or a platform engineer to use, or an agent to use.
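For reference, a dagger.json with installed modules looks roughly like this; the exact fields shift a bit between Dagger versions, and the module names and sources here are placeholders rather than the workshop's real ones:

```json
{
  "name": "my-app",
  "sdk": "typescript",
  "engineVersion": "v0.18.5",
  "dependencies": [
    {
      "name": "workspace",
      "source": "github.com/example/workspace@v0.1.0"
    },
    {
      "name": "github-issue",
      "source": "github.com/example/github-issue@v0.1.0"
    }
  ]
}
```

Running dagger install with a module reference appends an entry here, so the whole toolchain travels with the repo.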
Yeah, we're just waiting for things to load here. Yeah, go ahead. Have you gotten into anything like Dagger-in-Dagger, where you have it spinning up agent fleets?
Yeah. So, as someone who builds a lot of Dagger code myself, I have agents that need to write Dagger code, and to reliably validate those things they basically need Dagger inside of Dagger. That's exactly a thing that you can do, and I can even pull one up. We're a bit short on time, but we're basically done with that guide, just waiting for it to run. We have an examples page here on the docs, and there are tons of examples here, but one of the really cool ones, the one I like the most because I wrote it... I thought you were going to show mine, but that's fine. No, it's fine. Oh, it's not on there? Okay, we're going to add it to the list of examples; the list will be even cooler soon. And we have your question next. So, there's this repo under my GitHub, kpen/dagger-programmer. It's something I use because in the docs we saw those tabs with all the different languages, and whenever I write a new guide, I have to have it in five languages. This agent can take a guide in one language and produce all the languages. It's just an agent that knows how to write Dagger, and to do that it has a lot of cool things in it, in addition to being able to run Dagger in Dagger. So if we look at the code for that, this one's in TypeScript.
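Before looking at the real code, here's a minimal sketch of the shape being described. This is not the actual repo code: the base image, class, and function names are placeholders, and it only runs inside a Dagger TypeScript module with an engine available:

```typescript
import { dag, Directory, object, func } from "@dagger.io/dagger"

@object()
export class DaggerProgrammer {
  // Validate generated code by running the inner project's own
  // `dagger call test`; experimentalPrivilegedNesting lets the
  // nested dagger CLI talk to the outer engine.
  @func()
  async validate(source: Directory): Promise<string> {
    return dag
      .container()
      .from("alpine:3.20") // placeholder; needs the dagger CLI installed
      .withDirectory("/src", source)
      .withWorkdir("/src")
      .withExec(["dagger", "call", "test"], {
        experimentalPrivilegedNesting: true,
      })
      .stdout()
  }
}
```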
Yeah, this is the TypeScript one. When it runs tests, it runs the Dagger pipeline, and there's this experimentalPrivilegedNesting flag so that the inner container can talk to the engine. And this is writing Dagger code, basically.

Question here: curious how this relates to MCP. Would you use Dagger to implement MCP servers? Is there some overlap? Because you have all these modules, you could maybe imagine having multiple MCPs as a different mechanism. Yeah,
absolutely. So, one way to think about it is that we were doing this thing with Dagger modules before MCP came on the scene, and then obviously we were like, oh, this is super aligned with the way we think things should be, in a lot of ways. So today you can take a Dagger module and you can say dagger -m, the name of the module, mcp, and expose that Dagger module as an MCP server, for example. We've got some more things that we'll probably be sharing soon about that kind of stuff, but yes, we think the visions are compatible in that way, and you can use the MCP ecosystem as well. So there are a few different layers to it, right? Within the agent that we just built, we installed modules, and that uses basically our internal implementation of MCP to talk between modules within Dagger. But you can also take a Dagger module and expose it as an MCP server. And then, in, I don't know, the near future, next week or something, you could connect to external MCP servers to bring them into Dagger as well. And to be clear, the internal implementation predates MCP, so it's not MCP per se, but logically you can think of it in a very similar way. And because you can expose everything as MCP servers, it ends up being practically very much the same for users. Check it
out! We got our PR, finally. So we've got our PR open. It says "make the main page say..."; it closes the issue we created; we have the commit pushed up, and we see the user is the GitHub Actions bot. And in welcome.vue it changed the text from "documentation" to the new greeting, so maybe that's right. Oh, it deleted this other thing too, because it decided that's not needed. Cool. So we have a really cool agent. Yeah, it still needs lots of oversight, just like most agents. But yeah, the main
thing is that we were able to get it to run in GitHub, so I was able to request that feature and it ran hands-free. Right now we only built in the one flow, where we create an issue that's a feature request. But if we look at this examples list, we have this greetings-api, which is my main demo project, and it has a ton of stuff in it; there are like five different agents in there. One of them is for when I want to give feedback on a PR. So we could probably open one of these PRs, and I give it some feedback: I say "/agent, add this other feature." This one, the original request, was something like "make a new endpoint for my API," and it did that. Then I say, okay, here's some feedback: the endpoint should be authenticated. And then it picks up again and pushes some new changes.
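A slash-command flow like that "/agent" comment is typically wired with an issue_comment trigger; this is an illustrative sketch, not the actual greetings-api workflow (the feedback function and its flag are placeholders):

```yaml
# Illustrative sketch: react to "/agent ..." comments on a PR
name: agent-feedback
on:
  issue_comment:
    types: [created]
jobs:
  feedback:
    # only fire for PR comments that start with /agent
    if: github.event.issue.pull_request && startsWith(github.event.comment.body, '/agent')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dagger/dagger-for-github@v7
        with:
          verb: call
          args: feedback --comment="${{ github.event.comment.body }}"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```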
Then I have another agent where I say "/review," and that will create a review for my PR with any other changes that I need. And then I can say, okay, make those changes, and also please don't delete all the tests; very important to add. And you don't have to be inserting yourself at every one of those points, right? But in this case it's great for when we're iterating. Yeah. If you want an example of
how you could take what this workshop just built to the next level, with all this feedback and more advanced things, this greetings-api is a great repo to look at, because it has all of these different agents doing tons of things. It even has one for when I, as a human, push up a broken thing, because we still have humans developing stuff sometimes, right? So I pushed a broken change and the tests failed, which is super annoying, because, you know, I skipped running the tests; I didn't have a good prompt that told me to run the tests three times. This agent can actually look at the test failure automatically and propose a fix that I can just click on to fix that test change. So this is all stuff in this demo repo, where you can see how to build all these things yourself. A question over
here. There's a lot I really love here. I just had a question almost getting at the motivation for some of this stuff. It feels like there's a world where Dagger could have prioritized just the containers and the workflows, and let you bring your own AI agent. What's the motivation behind making the LLM its own primitive and going down that path?

I think there are a lot of levels to it, right? If you're already really baked into, say, Pydantic AI or the OpenAI Agents SDK, you can still use those container workflows in that, and I'll show it; maybe I shouldn't, but I have an example. If you've done the OpenAI Agents quickstart: this is the agent quickstart we have, but built with the OpenAI Agents SDK, where I say, here's my completions model (this one is actually using Ollama), and this is what their SDK looks like. But in that SDK I'm actually still using Dagger: I recreated that same workspace where we have read file, write file, and build, but I created it with Dagger inside the OpenAI Agents SDK. So I'm using their agent, but using Dagger code for the containers.

So why use the built-in one? The main thing is that here I had to write all of these tools myself, and describe how to use them. If it's all within Dagger, you get that cool thing where we have the whole ecosystem of Dagger modules: I can just plug one in, and it's given to the agent, right? Yeah, your whole method signature is instantly translated into the right form to work with tools; you get tools for free, as well as functions. And we do have some people in our community using Dagger with Pydantic and other things, where they just want the sandbox capability: they say, I don't want to use another cloud sandbox vendor, but I don't want it on my computer, in my file system, either; I want containers, and I want it easy. But I think the sweet spot is kind of doing it all, because it just harmonizes really well.
Question there: thanks for the great demo. I had a question. Let's say I want to build an agent for programming HTML games, which run in the browser. For that game-building agent, I would need the running and testing environment to be a browser. Does Dagger have those sorts of constructs? Say I want to spin up a browser environment and then do some kind of automation in it, to test the game the LLM might have written.

Yeah, you certainly can. I've done some headless browser stuff, and I've also done browser stuff where you then connect over VNC, or other setups; you can do a lot of stuff with Linux containers. So yeah, we should talk about it; you should come join the community and let's do it.
Great demo, and thanks for compressing a lot of information. So is my understanding right that you build the CI/CD infra and all these things once, and then let Dagger do the asynchronous job, with guardrails and all the things in place? That Dagger is sort of this asynchronous AI agent that does things on its own, but with guardrails, not just leaving Claude Code or something in a trust-all mode and letting it do its thing? Is that right?

I think so, yeah. Dagger gives you this platform to create these software engineering workflows that can be used for shipping software and for developing software, you know, with the environments that we saw. You can use them as a platform engineer or as a developer, but then you can also hand them off to agents. And we think that's really powerful: the fact that you can use that same platform to do all those things, and to create those guardrails, like you say.

The one thing I wanted you to show, and you've got one minute: can you just
show your terminal? Let's get in the vibe for just one second. You're connected to an LLM right now, right? So let's just talk to this LLM. It turns out that we've been using the shell mode, which lets you very declaratively say things like: I want a container from alpine with this file, and give me a terminal into that, or whatever. Now what we've done is switch to chatting directly with the connected LLM, and this LLM can see all the Dagger objects you have. So another way you can use Dagger is to say, all right, I'm just going to create this container, and then, hey LLM, you see that container? Why don't you write some software in it. You can get that kind of workflow going too. So, there you go: he's actually saying, hey, give me a Python container, and it's going to look and see what methods exist in the Dagger API. Oh, there's this container method in the API, which we were using earlier, and then it's going to decide: I'm going to use container, maybe from, maybe withExec, to execute. So it's exploring the Dagger API right now. Got it. And now it's actually pulling a Python 3.11 container, and then it can do things with that. So it's using containers kind of like computer use, or something like that. We didn't even show that side of it, because we were trying to show the guardrails, but you can also use it in this kind of style too. Got it.
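For a feel of the two modes being contrasted, here's a rough Dagger Shell sketch. The pipeline form is the declarative shell mode; the Env/LLM form hands objects to the model. Function names like with-container-input and last-reply are my recollection and may differ in your Dagger version:

```shell
# Declarative shell mode: build objects with pipelines
container | from alpine | terminal

# LLM mode sketch: bind a container into an environment and prompt the model
llm |
  with-env $(env | with-container-input sandbox $(container | from python:3.11) "a Python sandbox") |
  with-prompt "Write a hello-world script and run it in the sandbox" |
  last-reply
```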
And one follow-up question: typically LLMs are good at small to medium tasks, and that's what we've seen here, a small to medium task. How good is Dagger at orchestrating a large task, one that needs design, or some user input, or multi-turn prompting? Not a small or medium task, but a large one; how is Dagger with that?

Yeah, the question is about the size of task that Dagger is good for. I think if you decompose things down and you can architect things right, it can handle a lot of different sizes. And I know we're at time now, so we're going to end here. We'll take some more questions outside the room, in the hall for sure. Thank you so much to everybody that attended. Thank you, guys.
[Music]