How AI Agent Swarms Might Be AI's Next Leap — With Guillaume de Saint-Marc
Channel: Alex Kantrowitz
Published at: 2026-04-13
YouTube video id: 6zZhNehawnI
Source: https://www.youtube.com/watch?v=6zZhNehawnI
What's going to help AI agents take the leap from a promising technology to something that gets stuff done? Let's talk about it with Guillaume de Saint-Marc, Vice President of Engineering at Outshift by Cisco, in a conversation brought to you by Outshift by Cisco. Guillaume, great to see you. Welcome to the show.
>> Hey Alex, thanks for having me.
>> Okay, so today we're going to talk all about how agents work together and what that might lead to as AI advances. But I want to start here. I'm curious if you saw what happened with Moltbook, this group of agents that came together and formed their own social network. It seems like they may have started their own religion. What did you think when you saw that? Were you scared or were you excited about it?
>> Well, that was fascinating. I mean, we were certainly all taking a seat back in January when this went viral and starting to eat popcorn. We were like, "Where's this going?" But seriously, this is an interesting engineering case study. At the time, this was skyrocketing in terms of number of agents. It went to a million plus. More recently I see that they have started to identify agents a bit more seriously, like requiring a real human behind each one, so we're now at more like a few hundred thousand. But all this, on a single platform, worked from a pure agentic connectivity standpoint, and that was quite fascinating. Messages got through. Agents could find each other. But of course, on one side there were a lot of security issues: a cloud security platform found a lot of vulnerabilities. The back end actually was wide open. And so, sadly, API credentials and personal emails got stolen.
But on the other side, the problem we rapidly diagnosed, and it was not a surprise, is that the spectacle was good, but those agents were really pattern matching their way through social media behavior. They were not really doing proper collaboration. They were giving us a theater of collaboration. There's no shared state management, no governance layer, no mechanism for agents to really coordinate on anything meaningful. But we think this is still heading in a very interesting direction, and we think we'll get there. And these missing pieces, the ones that enable a group of agents to truly not just connect but think together and go much more autonomously after complex goals and more advanced missions, are exactly the kind of tech that we are building at the moment.
>> Yeah, it's pretty impressive. So, 14 million comments on Moltbook, 2.3 million posts, and it seems like about 200,000 verified agents going back and forth.
>> Yeah, 200,000, yeah, exactly.
>> And the reason why I bring this up is because Outshift by Cisco is the part of Cisco, and correct me if I'm wrong here, that sort of thinks independently and tries to get ahead of what's going to happen in the future. And the thing that you're really focused on is what happens when agents coordinate. After seeing Moltbook, it became real to me that you might end up having these agent swarms actually playing a role in the world: in the tech world, in the business world, maybe the broader world that we live in. And I thought us having a conversation about what that's going to mean was super important now because we've now seen it in action.
>> Yeah, I think there's a lot of inspiration here for the enterprise world.
But we've been at this for quite a while. When we really started focusing on this, more than 2 years ago, we had this strong thesis and vision that intelligence will come not just from building ever smarter, ever bigger, and more powerful single agents, but more through what we call horizontal scaling: distributed systems, a lot of agents coming together and working just like humans do. This is how we evolved over hundreds of thousands of years. And we think that this is going to be the same with agents, but much more rapidly because they work at machine speed. And so this emergence of what asymptotically will become superintelligence has been fascinating for us from the beginning. So we worked on this, and last year we released what we call the Internet of Agents infrastructure. The goal was really first acknowledging that yes, agents are workloads, but they also have attributes of humans and users. So they came across the stack as weird entities, new types of entities that needed a proper layer on top of the cloud native layer to be addressed. And so we worked on four main functions. How can we connect agents? How can we give them a strong identity? How can we discover them? And how can we observe them? This is no big surprise. We're Cisco, so we connect, we secure, and we observe things. That's what we do in our business. But applied to agents, there was a real need for reimagining all these functions. And to back all these efforts, we've actually created an open source project called AGNTCY. This became a Linux Foundation project with the backing of a few large formative members: Google, Oracle, Red Hat were with us, and Dell as well. So this is out there. This is in the open. This is being used.
And we're starting to see enterprises really moving to multi-agent systems. But as we did that, we also discovered that there were limitations. So, just like the Moltbook example, it's great to connect agents. This is useful. You can achieve stuff like this. You can build what we call MAS, multi-agent systems, that can go after some pretty complex tasks. And we've experimented with a lot of use cases, stuff that we've open sourced and stuff that we've developed with design partners. But we've also started to see the limits of it. The limit is basically when you have a sort of predefined workflow. Think about the classic corporate workflow: go to step 1, 2, 3, 4. If you put agents against the different tasks, you can get some pretty good results. But these are not the most interesting missions. They're not the autonomous, deep multi-agent system missions that we want to go after. Those more interesting ones require what we call self-forming collaboration, and sometimes self-selection and sometimes self-evolution of the MAS itself. And for that, we realized that there were all sorts of cognition issues starting to arise: agents could communicate, but they couldn't really think together. We put our finger on this, and this is what we are working on now because we want to enable this next wave of innovation with agentic AI.
>> Wait, hold on. You said self-forming. How does this end up being a self-forming entity?
>> Yeah. So, self-forming is fascinating when you think about it. You mentioned Moltbook; I'll pick up on OpenClaw. If you look at OpenClaw, let's put it this way, it's a state-of-the-art agentic loop, right?
So, the agent can reason, but the way it's doing its reasoning, based on memory, based on context, based on skills, also soul and personality, is actually going to spawn off agents through reasoning, and that is self-forming. Again, this is just taking a small popular example with OpenClaw, but think about it happening at a much bigger scale within corporations, with a much higher level of security and a higher level of certification, if you want. And that's it. The agentic system has the ability to reason and to decide which agents it needs to bring in. That's why it's so important to be able to discover agents. That's why we have the directory feature in AGNTCY, because that's one of the fundamental building blocks: look at the skills you need, discover these agents, bring them along in your mission, and task them with a portion of the plan that you've just formed. This is all happening in real time. So you see now the difference between this and a sort of pre-formatted workflow. Workflows are great because they are very deterministic. They can be reassuring at the beginning for enterprises because they can be certified. But if you really want to go after interesting missions, you need this self-forming reasoning aspect, where at some point some deeper cognition and reasoning challenges are going to arise between all these agents. I can tell you a bit more about this if you want.
>> Yeah, I mean, I love this because this is a real technology conversation. This is sort of the cutting edge of where the technology will go. And so, is it your perspective that this is what we're going to see happen: agents will self-form their own swarms and go out and try to do things?
>> Yeah, we absolutely believe this is going to be the case.
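The "look at the skills you need, discover these agents, bring them along" step can be pictured with a small sketch. This is not the AGNTCY directory API: the `AgentRecord` type, the `discover` function, and the agent and skill names are all hypothetical, and the matching rule (first agent that advertises the skill) is a deliberately simple stand-in for whatever reasoning a real planner would do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical directory entry; not the real AGNTCY directory schema."""
    name: str
    skills: frozenset


def discover(directory, needed_skills):
    """For each needed skill, pick the first agent that advertises it.

    Returns (team, missing): a skill -> agent-name mapping, plus the list of
    skills nobody in the directory offers.
    """
    team = {}
    missing = []
    for skill in needed_skills:
        match = next((a for a in directory if skill in a.skills), None)
        if match is None:
            missing.append(skill)
        else:
            team[skill] = match.name
    return team, missing


# Assumed example data, loosely modeled on the kinds of agents in the talk.
DIRECTORY = [
    AgentRecord("sre-agent", frozenset({"restart-service", "scale-cluster"})),
    AgentRecord("obs-agent", frozenset({"query-metrics", "query-logs"})),
    AgentRecord("sec-agent", frozenset({"scan-vulnerabilities"})),
]

team, missing = discover(
    DIRECTORY, ["query-logs", "restart-service", "draft-status-update"]
)
print(team)     # {'query-logs': 'obs-agent', 'restart-service': 'sre-agent'}
print(missing)  # ['draft-status-update']
```

The point of the sketch is the shape of the loop: the team is computed from the plan at run time, which is what separates this from a workflow whose participants are fixed in advance.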
Again, because...
>> How do you control that? Because it seems like a path to runaway AI.
>> Yeah, your question is spot on. Controlling this is part of the cognition challenges that we've identified, because you need to put a lot of guardrails around this. And you can control it at different levels. Think about the pure connectivity level. Again, this is stuff which is out there, and when I say it's out there, I just want to make sure folks listening to the podcast understand this is very concrete. Go to agntcy.org or find the corresponding Git repositories and you'll find a ton of code. It's not a concept, right? You have a lot of code. We have example applications. We have something called Coffee AGNTCY. It's the equivalent of Sock Shop for Kubernetes. This is how to get started: we exercise the different AGNTCY functions in a little example application, and then we have more advanced examples there as well. So, to put things under control, you need to control connectivity. First of all, we can very strictly control who's talking to whom. This is where applying a networking vision to this technology is so powerful, because with networking you might be familiar with techniques called network segmentation and micro-segmentation. For instance, you make sure that the laptop of the finance guy cannot see the laptop of the HR guy, and some servers in this function are strictly segmented and cannot just communicate randomly with anyone. We do the same with agents. Agents usually like to work in groups, not just one-to-one. And so we form rooms of agents, a bit like a Webex room or a Slack room, if you want, where people are exchanging.
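A rough way to picture the segmentation idea applied to agents is a room that only its members can post to or read from. The sketch below is an illustration under stated assumptions: the `Room` class and the agent names are made up, and it models membership only. It does no encryption and is not how AGNTCY's transport is implemented.

```python
class Room:
    """A named group of agents; only members may post or read.

    Hypothetical model of the 'rooms' idea. It tracks membership only and
    performs no encryption.
    """

    def __init__(self, name, members):
        self.name = name
        self.members = set(members)
        self._messages = []

    def _require_member(self, agent):
        if agent not in self.members:
            raise PermissionError(f"{agent} is not in room {self.name!r}")

    def post(self, sender, text):
        self._require_member(sender)
        self._messages.append((sender, text))

    def read(self, reader):
        self._require_member(reader)
        return list(self._messages)


# Two rooms for two parts of a mission; membership does not overlap.
diagnosis = Room("diagnosis", ["sre-agent", "obs-agent"])
comms = Room("customer-comms", ["comms-agent"])

diagnosis.post("obs-agent", "error rate spiked at 09:12")
print(diagnosis.read("sre-agent"))

# An agent from another room cannot see the diagnosis traffic.
try:
    diagnosis.read("comms-agent")
    blocked = False
except PermissionError:
    blocked = True
print(blocked)
```

Like the finance and HR laptops in the networking analogy, the communications agent never sees the diagnosis room, even though both rooms serve the same overall mission.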
So, we are creating rooms, and agents have to communicate within these rooms. These rooms are designed to address special tasks or a special part of the mission, and the agents in them don't necessarily see the whole project. A lot of these techniques are enabled through a technology called SLIM that we have in AGNTCY, which is really all the agent transport and agent network connectivity. Everything is encrypted, which is really important. We are using a technology called MLS, which is the same technology you use for collaboration platforms. So if an agent goes rogue, we can revoke the agent, but it doesn't break the room. All the other agents continue to have access to the data that was exchanged. Only this rogue agent is now completely outside and cannot access anything. So this has been well suited. And the last point I would say is we also have something called TBAC. TBAC is really important, and this ties to agent identity. When we welcome an agent into the system, the agent has been discovered through a pretty rigorous process of assigning cryptographically signed what we call agent cards, and we also work very closely with Google and the A2A group on this. We are very active here. It's not just AGNTCY; we're also teaming up with the rest of the industry, of course. TBAC is cool because it's a way to say that certain agents can access certain tools, but not others. If there is no reason that this particular agent would try to access these tools, for instance, we ban it forever. But we are doing something even more interesting when an agent is given a particular task; think about a micro task. The agent is going to do 100 things for you, or one of the sub-agents that has been spawned is going to do 100 things for this mission.
But at a particular point in time, we ask this agent, for instance, "Please check the currency exchange rate between, whatever, euro and dollar." Something like this. So the agent is going to go and do this. But if the agent says, "Oh, in order to do this, I also need to do a transaction," you're going to go, "Red flag. Why do you need a transaction? I just asked you to check." This kind of semantic-level verification is absolutely something that we are doing. An independent micro agent, if you want, is going to come and check that the tools an agent is going to call are consistent with the task that the agent has just been given. I could go on and share more examples of guardrails, but this gives you a bit of an idea of how we are putting this under control in terms of the product.
>> Okay. Right, because I think these things become most useful when you give them access to the most data, but on the other side, that is where things get kind of tricky. So I think the thing that the industry is going to need to figure out is how to combine that access with the sort of safety or comfort that you would want, because it's not simple.
>> No, you're 100% right. Spot on. An agent with no agency is useless. Correct. An agent with too much agency can be dangerous, especially if it goes out of control. This is a little bit what we saw with OpenClaw. Again, OpenClaw is an amazing piece of technology, and of course security is a big focus now for OpenClaw. But at the beginning, this was the problem. It had a lot of agency, it could do a lot of cool stuff, but it could also go off the rails, which is an issue.
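The exchange-rate example from this answer can be sketched as a check that sits between an agent and its tools. A big caveat: in the talk the check is semantic and done by an independent micro agent (presumably model-backed), whereas the sketch below substitutes a hard-coded lookup table so that it stays runnable. The task names, tool names, `ToolCall` type, and `is_consistent` function are all hypothetical and are not part of any AGNTCY API.

```python
from dataclasses import dataclass

# Assumed policy table: which tools each kind of task plausibly needs.
# A real semantic check would reason about the task text instead.
TOOLS_FOR_TASK = {
    "check_exchange_rate": {"fx_rate_lookup"},
    "pay_invoice": {"fx_rate_lookup", "execute_transaction"},
}


@dataclass(frozen=True)
class ToolCall:
    agent: str
    task: str  # the task the agent was just given
    tool: str  # the tool it now wants to call


def is_consistent(call: ToolCall) -> bool:
    """Return True only if the requested tool fits the assigned task."""
    return call.tool in TOOLS_FOR_TASK.get(call.task, set())


ok_call = ToolCall("fx-agent", "check_exchange_rate", "fx_rate_lookup")
red_flag = ToolCall("fx-agent", "check_exchange_rate", "execute_transaction")

print(is_consistent(ok_call))   # True
print(is_consistent(red_flag))  # False: asked to check, tried to transact
```

The key property is that permission depends on the current task, not only on the agent's identity: the same agent could legitimately call `execute_transaction` while working on `pay_invoice` and still be stopped here.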
But there are also other types of challenges when you try to have a group of agents working together. This is why we are building another layer. If you bear with me for a sec, Alex, I'll explain the layers that we are building; there are two, so it's relatively simple. We don't want to reinvent the wheel. Everything is running on the good old internet. That's already granted. On top of this, you have the cloud native stack, the stack that is used by all the cloud native developers. This is how we've developed all the applications that we love and use today. All the SaaS, all the mobile stuff has been developed on this. So that's the cloud native layer. And what we've done with AGNTCY is that we've said, "It's time to think about a new layer." By the way, to be very concrete, we've also published a paper extending the OSI model. Most engineers will know the seven layers of the OSI model, the IP stack, and the different communication layers. Layer seven is really where the world has been living for the past 15 years. Everything is application level. All the modern transports, from QUIC to HTTP, live in layer seven. And this is good, but we thought that for agentic AI, it was time to recognize that two new layers are emerging. One is what we call the syntactic layer. That's really the layer to connect the agents with each other, and this is where protocols like A2A or MCP live. This is really where we've been focusing with AGNTCY. And per our discussion, recognizing that this was absolutely needed but not enough, now we're adding a final layer on top, and this is the last one, we hope, which is layer nine, and this is the semantic layer.
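To make the layering easier to picture, here is a small sketch of the extended stack and of a message that carries a layer-nine header saying what kind of message it is. Several things here are assumptions: numbering the syntactic layer as 8 is inferred from the semantic layer being called layer nine, the three message kinds are the ones Guillaume lists in his next remarks, and the `SemanticMessage` type and `describe` function are invented for illustration rather than taken from any published protocol.

```python
from dataclasses import dataclass
from enum import Enum

# OSI layers 1-7 plus the two agentic layers described in the talk.
LAYERS = {
    1: "physical",
    2: "data link",
    3: "network",
    4: "transport",
    5: "session",
    6: "presentation",
    7: "application",
    8: "syntactic: agent-to-agent connectivity (A2A, MCP)",
    9: "semantic: what the message is about",
}


class Kind(Enum):
    """Hypothetical layer-nine header values."""
    INTENT = "intent"
    KNOWLEDGE = "knowledge"
    DELEGATION = "delegation"


@dataclass(frozen=True)
class SemanticMessage:
    kind: Kind       # the layer-nine header
    sender: str
    recipient: str
    payload: str     # opaque to the lower layers


def describe(msg: SemanticMessage) -> str:
    """Read only the header to say what the message is doing."""
    verbs = {
        Kind.INTENT: "shares an intent with",
        Kind.KNOWLEDGE: "shares knowledge with",
        Kind.DELEGATION: "delegates a task to",
    }
    return f"{msg.sender} {verbs[msg.kind]} {msg.recipient}"


msg = SemanticMessage(Kind.DELEGATION, "coordinator", "sre-agent",
                      "restart the checkout service")
print(describe(msg))  # coordinator delegates a task to sre-agent
```

Note that `describe` never inspects `payload`. That mirrors the contrast drawn in the talk: lower layers move the box without caring what is inside, while a semantic header lets infrastructure tell an intent from a delegation.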
So, this is where we really care about what the message is about. Usually when you transport IP packets, who cares what's in the box, right? IP packets are just content. But with layer nine, we're starting to form headers which give you an indication of what this message is about, what it is saying. Is this an agent trying to share an intent with another agent? Is it just sharing knowledge? Is it trying to delegate a task? These layer nine protocols are the protocols we are currently building to enable the agents to think together and to have the proper cognition and cognitive behaviors that we are expecting from them.
>> Okay. And I'll just note your AGNTCY is spelled a little differently than the standard spelling: A-G-N-T-C-Y. We'll link it in the show notes.
>> No vowels, yes. Just A at the beginning, Y at the end, and no vowels in between. It's a good point, because folks can struggle to find it otherwise.
>> Yeah, okay. We'll definitely link it. Can you make it a little more concrete for us: what do you expect a swarm of agents to be able to do that a single agent couldn't necessarily do? When you think about this working in an ideal form, what does it look like?
>> Yeah, that's a great question. A simple way to think about it is cross-domain, cross-functional agents. I'll take an example. You want to have an agentic solution to, for instance, resolve a severe outage on your IT infrastructure. Agentic ops, by the way, is a big area of application for us, obviously as Outshift but also as Cisco. It's in our backyard. Agentic operations of IT systems: that's what we do.
And we see a massive impact in terms of the productivity and savings that agentic AI can bring. So take the example. You want to have an agentic system which is capable of bringing things back to normal when you have a pretty bad IT crisis. Well, this is a complex problem. You have the SRE dimension of it: the platform, your different clusters, your different Kubernetes containers. That's the site reliability engineer, who is responsible for keeping the system running. By the way, just a quick side note on this one. In my team, we've actually built our own agentic SRE system. It's a multi-agent system. It's called CAIPE, C-A-I-P-E. And we've entirely open sourced it with a group called CNOE, C-N-O-E, which is basically a group where you have folks from AWS and Adobe and us and many other large enterprises, progressing the state of the art on SRE and platform engineering. So it's a very concrete example I'm giving here. You need this SRE agent because of course you need to be able to act on the platform. You need a security agent because there might be a security dimension to your problem. You need a ton of observability agents as well. And these cannot be the same agents. These are entire domains of their own. Typically your observability agent will come from Splunk if you're a Splunk customer, because that's clearly the best team capable of giving you your observability agent. The security agent will come from Cisco or from another security provider. Your SRE agent might come from one of your SRE providers, or maybe you've designed it yourself, or maybe you're using CAIPE. And you can also have other agents, like crisis communication agents, because you might have to communicate to your customers and to the world about your outage.
And so, already you see that with these five or six agents I just mentioned, you need to have coordination between these agents, and that's the only way you're going to achieve this mission. Just to give you this concrete example: this is the level of ambition we have. So call it superintelligence or just regular intelligence. A lot of the experiments we do indicate that we'll go from hours, sometimes days, of trying to put a situation like this under control to hopefully just a few minutes. So this is going to be very significant. And another agent will be the troubleshooting agent, the one capable of doing what we call the root cause analysis of the problem. We had some amazing results on this, literally going from putting experts in a room for 3 days to just a few minutes. Anyway, I hope this is making it very concrete. I just want to be very clear: we have good reasons, but we are not fully there yet. You want to connect agents, you want to follow workflows? Yeah, we already have a lot of good stuff. You want to put together a complex, self-forming system? That is where we are not there yet, because in this case of the IT outage solution, the team of agents is going to create a plan which depends on the outage. Of course, they can go through playbooks, but this is too rigid. This might be something new we've never seen. I mean, we have new attacks or new outages that we've never seen before every day in the news, right? So you see how the self-forming, the real reasoning, and the cognitive collaboration of the agents is needed. This is what we hope to enable. But there are challenges to enabling something like this.
>> Guillaume, let me ask you. This is something I've been wondering for a while. When we think about agent swarms, right?
Like, there's a coordinator agent and there's the SRE agent, all these different types of agents. Is that different underlying technology, or are the sub-agents the same agent just with a different prompt, tasked to look for something else? What's the differentiation between agents there?
>> Well, it depends. If you look at a simple yet powerful agentic solution like OpenClaw, they're kind of the same, right? You can change the model behind it, you can give it a different prompt, maybe access to different skills, but they're roughly the same type of pattern. When you go back to my example, the diversity of how agents have been coded and created can be much wider, because they come from completely different domains and different companies. There's little chance that a Cisco agent, a Microsoft agent, and a Salesforce agent, which you might need to bring together to solve an enterprise cross-functional use case, will be the same. They won't just be a prompt. They will be much more than this. They will use memory, they will use their specific guardrails, they might use some of what we call the cognition engines, accelerators for how these agents can collaborate. So they will be very different. To answer your question precisely: as complexity grows and as we asymptotically start to progress towards something we can call superintelligence, it's going to be very heterogeneous. Very different types of agents, different vendors, different clouds, different technical frameworks will still have to collaborate together.
>> And am I right in thinking that if you have one agent in the stack that's more multi-purpose, maybe underlying that agent you have one of these bigger generalized models, whereas if you have one with just one task, you can have it run with a lighter model so the compute and the cost are not as intensive?
>> Absolutely. This is stuff that we are going to share soon, but for the TBAC functionality, the semantic TBAC, for now we are using a pick-your-model approach: you connect the model you want to actually power the solution. But we're also working on a small language model, which is going to make it much, much smaller, tiny, that's why it's called a small language model, and that will power features like this. And this is important because, when you think about it, TBAC is something that you bring to agents typically through a sidecar. Each agent has a sidecar which is controlling any networking communication with this agent and which is applying policies and semantic verification. Again, we call them cognition engines. So basically it means that each time an agent is going to call a tool, you potentially have to generate one more call to an LLM. And if this LLM is expensive, applying this level of security can start to cost you a fortune. So having small language models for these routine tasks, models which are highly specialized and as efficient as a generalist model, which can only do this one thing but do it very well, is economically super important. We are highly conscious of that. Enterprises will require the security, the observability, the explainability, but also the control of the cost of these multi-agent systems.
>> Well, let me ask you this. You have mentioned a couple times that you're open-sourcing some of the projects that you're working on.
And I'm happy to see that, but I'm also curious why you're doing it, because my perception is that the stuff you're working on is maybe something you'd want to keep in house to give Cisco an edge over the competition. And yet here you go open-sourcing it. So talk through the logic there.
>> Yeah, well, I'm glad you asked, because we think what's in front of us is really a challenge not just for Cisco, but for the entire ecosystem. If you go back to the roots of the internet, the internet was designed at the time as an open and interoperable system. And we've noticed that there is a small thing called the digital economy that emerged on top of that. So we've been obsessed with reproducing this model. We saw agentic AI coming, and this was so profound that we were like, "This cannot just be living in walled gardens. This has to be based on an open and interoperable foundation." That's why we call it the Internet of Agents, the Internet of Cognition. And honestly, these are complex topics. We're happy to contribute. We have a lot of ideas, but no company can do this alone. We need to do this as an ecosystem. And this is the way to maximize the value for the entire ecosystem, not just for a few players. So that's why we are doing this. And we're not too worried about Cisco differentiating, because these days you also differentiate by velocity. We focus on innovation velocity, which our product teams are doing a lot. And our role is to make sure that some of these relevant technologies go as fast as possible to our peers in Cisco: Splunk, AI Defense, other teams that we are working with. It doesn't mean that we open source necessarily 100% of what we do.
So, of course we can always keep a few things, like, okay, this piece of tech we can keep in-house and only graduate internally. But I would say that probably more than 80% of what we do needs to be open source, because we actually believe in the importance of open and interoperable.
>> Yeah, and that velocity point is well taken. Things are moving fast.
>> Yeah, things are moving really fast, and that's why we're in the middle of conducting a lot of experiments. For now, just to be very concrete, the Internet of Agents is out there. Stuff is getting into production. As I mentioned, it's joined at the hip to A2A, and the fourth piece of technology, observability, is already available through Splunk, and we've open sourced a lot of stuff, and there are many players doing this. So observability, connectivity, identity, and discovery of agents: all of these technologies are usable today and out there. The Internet of Cognition, we are still working on. We published a white paper in January with the vision, and we are working hard on the architecture. We are going to release some code very soon, a bit later in April. This will be just a humble beginning, starting to share tools and examples that people can take inspiration from. And we are going to keep pushing. One thing we are doing at the moment in order to validate our architecture is to conduct a lot of experiments: we put swarms of agents together and see where it starts to derail.
So we have a taxonomy, if you want, of seven or eight cognition issues which happen on a recurring basis, which we keep seeing pop up across these multi-agent systems. We are trying to tackle them one by one, and we're trying to build an architecture which can alleviate or remove these cognition issues amongst agents so that they function super well.
>> All right, Guillaume. So, if people are interested in learning more and getting involved, where should they go?
>> Well, that's pretty straightforward, Alex. We have this outshift.cisco.com website where we keep all the news. Look for the Internet of Cognition subpage, because there we're dropping code and links to code that is going to show how the initial Internet of Cognition infrastructure helps you coordinate more complex missions across agents of different sorts. We have white papers. We also have links to more academic papers which we have started to publish on the architecture. And last but not least, we have a pretty cool demo of all the concepts that I've explained in the video today. So go check that. There's a ton to keep you busy, and keep an eye on it because we are going to roll out more content in the next few months.
>> Awesome. Well, I'm really looking forward to following the journey. Guillaume, thanks so much for sharing everything today. Appreciate it.
>> Thank you, Alex. And please join us, because none of that can be done by ourselves. Behind all the open source we do, we have working groups which are open for you to join, to contribute, and to have fun with us.
>> Terrific. Well, thank you, Guillaume. And thank you, everybody, for watching. We'll be back on the channel with another video later this week.