How to Build Planning Agents without losing control - Yogendra Miraje, Factset
Channel: aiDotEngineer
Published at: 2025-07-23
YouTube video id: sl3icG-IjHo
Source: https://www.youtube.com/watch?v=sl3icG-IjHo
Hi everyone, I'm Yogi. I work at FactSet, a financial data and software company, and today I'll be sharing some of my experience building agents.

In the last few years we have seen tremendous growth in AI, and especially in the last couple of years we have been on an exponential curve of intelligence growth. And yet, when we develop AI applications, it feels like driving a monster truck through a crowded mall with a tiny joystick. So AI applications have not seen their ChatGPT moment yet. There are many reasons why agents don't behave, but one reason that stands out is that the agent is missing the right context. In the case of enterprises, that often means it does not have knowledge of enterprise-specific workflows. But before that, we need some common context, because just like agents, humans also need a common context. So let's start with some key definitions.

As you know, LLMs are limited by their knowledge at the time of training, so we enhance their functionality with tools. When you combine an LLM with tools and memory, we call it an augmented LLM. When you place this augmented LLM on a static, predefined path, we call it a workflow. And if these augmented LLMs have high autonomy and a feedback loop, we call it an agent. Now, workflows are controllable and reliable, while agents are flexible and highly autonomous. So the question is: can we get the best of both worlds? The answer is yes. With agentic workflows, we can plan and execute workflows based on the goal, context, and feedback.

I see these terms being used very loosely and at times interchangeably, so I would like to make a key distinction between a workflow agent and an agentic workflow. A workflow agent is a predefined workflow run by an agent, while an agentic workflow is a workflow planned and run by an agent. I know these terms are quite confusing, and in AI we are very bad at naming things.
So if you are confused, don't worry. In the case of a workflow agent, just remember that the workflow is in control and the workflow is static. In the case of an agentic workflow, the agent is always in control and the workflow is dynamic. It is also important to view these systems as agentic systems. As Andrew correctly pointed out, on the agentic spectrum, agentic workflows generally have more agenticness than workflow agents.

So why does all of this matter? Apart from control, reliability, and predictability, agentic workflows give enterprises a way to automate workflows at scale. And perhaps the most important thing is that enterprises can build on top of their existing microservices, in which some of them have invested years, if not decades. Before diving deep, I would like to say that even though I'm speaking in an enterprise context here, the concepts are generally applicable.

So where do we begin? In the last few years the focus has really been on ReAct-based agents, and to build agentic workflows we need to move on from ReAct-based agents to proactive agents. By the way, a great philosophy for life as well. For building agentic workflows you need tools, memory, and reflection. But more importantly, you will need a design pattern called planning by sub-goal division, sometimes also referred to as task decomposition. It is just a fancy way of saying: take your goal and break it down into simpler steps.

Here are some specific agentic architectures and research papers that you will find useful, each with its own pros and cons. LangChain has done a fantastic job of writing a blog post about these and has also shared the code, so I highly recommend checking it out.

So how does it look in practice?
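Before the speaker's own architecture, the idea of planning by sub-goal division mentioned above can be illustrated with a minimal sketch. This is not code from the talk: the `SubGoal` structure, the `execution_order` helper, and the example goal are all hypothetical, and the decomposition is hard-coded where a real agent would generate it with an LLM. It only shows a goal broken into simpler steps that are then ordered by their dependencies.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class SubGoal:
    """One simple step carved out of a larger goal (hypothetical structure)."""
    name: str
    description: str
    depends_on: list[str] = field(default_factory=list)


def execution_order(subgoals: list[SubGoal]) -> list[str]:
    """Return sub-goal names in an order that respects their dependencies."""
    graph = {sg.name: set(sg.depends_on) for sg in subgoals}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical decomposition of the goal "write a research brief on a company".
subgoals = [
    SubGoal("summarize", "Summarize the latest filing"),
    SubGoal("retrieve", "Retrieve key financial figures"),
    SubGoal("draft", "Draft the brief", depends_on=["summarize", "retrieve"]),
]

order = execution_order(subgoals)
```

Steps without dependencies on each other ("summarize" and "retrieve" here) could run in parallel, while "draft" has to wait for both.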
In fact, what we have done is take the LLMCompiler architecture and adapt it to our problems. You can see some components here that you will also find in your organization: microservices, and the tools you build around those microservices. When a user asks a question, it goes to the blueprint generator. I will get to that in a bit, but consider it a high-level plan, which we call a blueprint, that gets fed to the planner. The planner is your low-level task planner. It gives the plan to the executor, the executor executes it, and the joiner combines the outputs from the different tasks. Based on your replanning logic, you either replan or terminate and give the response back to the user. You can also set a recursion limit so that your agent doesn't go into a loop. In LangGraph, we implement each of these components as a node, so the blueprint generator, planner, executor, and joiner are all nodes in the graph.

Building tools around the microservices in your enterprise is probably where you will spend most of your time, and it's important to consider how tools relate to microservices. Here the relationship is definitely not one-to-one or one-to-N; it's N-to-N. It's up to you how to design your tools around your microservices so that your agent knows how to use them. Perhaps the key point here is that you need to really put yourself in the agent's shoes, so that the agent understands what tool to use and has that knowledge of your microservices. Always follow standards. I know MCP is everyone's favorite, so build an MCP server for your tools. And when providing tool details, think from the agent's point of view: you need to provide the tool's purpose, its description, and its input and output contracts. The tool purpose helps the agent decide which tools to select.
The detailed tool description tells the agent when a tool needs to be invoked, and the input and output contracts tell it how to use the tool. Lastly, add some validation checks, which act as brakes for your agent.

Now I would like to zoom in a little on the blueprint, because this is one of the key architecture changes that we made. A blueprint is just a series of workflow steps, written in natural language in terms of the tool capabilities, and it gets fed to the planner. But why are we doing this? What we realized was that the planner gets cognitively overloaded when you put too much onto it. So introducing a blueprint, which is just a natural-language breakdown of the task, is very helpful. But we also noticed that it brings a lot of other benefits. For example, it gives finer control over task planning. It limits the in-context tools for the planner: with a blueprint you can select which tools are given to the planner. Sometimes a planner has a lot of tool descriptions, and you run into all sorts of problems, such as context window limits and the planner getting overloaded. With a blueprint you can limit which tools actually go to the planner, and that really helps the planning. It also helps in interpreting the agentic behavior. And lastly, when you need to collaborate with non-technical people, it's really helpful, because natural language is less intimidating.

Let's see a concrete example. In financial research, preparing for a company's earnings call is a common workflow. This is a very simplified version of that workflow; as an example, we are showing preparation for NVIDIA's earnings call. You can see that in the blueprint each step has a tool and a task, and in the plan each step has a tool and a function call.
So how does it look? In the blueprint you have two tools. The first step is summarizing NVIDIA's previous earnings call. The next step is retrieval: gathering some financial data for NVIDIA. Then comes reasoning: suggesting some questions for the earnings call. And finally reporting: generating a comprehensive report from all the information. In the plan there are corresponding function calls, and as you can see, context is being fed from task A. As a concrete example of the response: before you implement the agentic workflow, the response is pretty much vanilla, but afterwards it can easily capture your workflow and give a very structured response.

None of what we talked about will really work without writing proper evals. So always make sure to invest in, build, and maintain your eval framework. You should have at least component-level and end-to-end evals. You should use the right techniques, such as code-based evals, LLM-as-a-judge, and human-in-the-loop, and more importantly, write evals for the metrics you really care about. Aspect-based evals are something we should really think about. For example, for the blueprint, you can check an aspect such as whether the blueprint resembles a golden blueprint, and you can use LLM-as-a-judge for that. If you want to see whether the tools were selected correctly, you should leverage code-based evals. If you want to check whether the plan is in line with the blueprint, LLM-as-a-judge is probably the right technique. And in some cases, such as report formatting, human-in-the-loop is the best approach.

So when should you not use agentic workflows? In some cases an agentic workflow definitely doesn't make sense. For fixed and repetitive tasks, just go for ETL pipelines.
If your use case cannot really be captured as a workflow, agentic workflows are probably not worth it. If a deterministic outcome is paramount, as in strict compliance and safety-critical contexts, you should probably not go with an agentic workflow. And in low-latency or cost-sensitive environments, you should also probably avoid agentic workflows.

So, wrapping up, some learnings. Start with simple blueprints and work your way up to building a complex RAG system for the blueprints. Use the blueprint to reduce the in-context tools and to provide the high-level plan to the planner. Design tools from the agent's point of view, and always aim for simplicity of tool usage. Implement safety guardrails, evals, observability, and all the good software engineering practices. That should help you a lot.

From the whole presentation, the key takeaways are: an agentic workflow is planned and run by an agent. Agentic workflows bring reliability at scale. Planning by sub-goal division is a key design pattern. Plan-and-execute is a key agentic architecture. Build your tools to complement your microservices, and always try to leverage your microservices in the tools. Modify the architecture to solve your problems; don't shy away from taking a research paper and experimenting with it. And finally, treat your evals like first-class citizens. With that, thank you very much for your time.

All right, thank you. Any questions? We have a little bit of time to spare.

I have a question. Do you have in mind any GitHub project or reference that we can follow?

Sure. If you go back here, I shared some links. The LangChain one should have all the code for these research papers, and that's probably the best place to start with these plan-and-execute agents.

Thank you.

Any other questions? All right.
I guess one question I would have for you is: when you talk about MCP and other forms of orchestration, what do you foresee being the primary method of orchestration going forward? Is it going to be LangGraph or something else?

I think the answer is probably everything. You use MCP to provide a standard across the org, and MCP really helps an organization build once and use everywhere. Oftentimes in organizations we see people trying to use the same functionality in different AI apps, but if you build an MCP server around it, you can keep reusing it. For orchestration, LangGraph is great, and so is whatever other tool you find that solves your problem. So the answer is that there will probably be multiple useful options, and the most suitable framework depends on your use case.

Amazing. Thank you so much, Yogi.
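As a companion to the architecture described in the talk, the blueprint generator, planner, executor, and joiner loop can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the speaker's implementation: every name here (`Tool`, `TOOLS`, `generate_blueprint`, `plan`, `execute`, `join`, `run_agent`) is hypothetical, the LLM-driven components are replaced by hard-coded stubs, and plain functions stand in for the LangGraph nodes mentioned in the talk. It shows three ideas from the talk: tool details written for the agent (purpose, description, input and output contract, validation), a blueprint that limits which tools reach the planner, and a recursion limit on replanning.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Tool details written from the agent's point of view (hypothetical)."""
    name: str
    purpose: str                 # helps decide which tool to select
    description: str             # explains when to invoke the tool
    run: Callable[[str], str]    # input/output contract: text in, text out

    def __call__(self, task: str) -> str:
        if not task.strip():     # validation check acting as a brake
            raise ValueError(f"{self.name}: empty task")
        return self.run(task)


# Hypothetical registry; real tools would wrap enterprise microservices.
TOOLS = {
    "summarizer": Tool("summarizer", "Summarize documents",
                       "Use for call transcripts", lambda t: f"summary({t})"),
    "financials": Tool("financials", "Fetch financial data",
                       "Use for company metrics", lambda t: f"data({t})"),
    "news": Tool("news", "Fetch news", "Use for recent headlines",
                 lambda t: f"news({t})"),
}


def generate_blueprint(question: str) -> list[tuple[str, str]]:
    """Stub blueprint generator: ordered (tool name, natural-language task)."""
    return [
        ("summarizer", f"Summarize the previous earnings call for {question}"),
        ("financials", f"Gather financial data for {question}"),
    ]


def plan(blueprint, tools):
    """Stub planner: only tools named in the blueprint are put in context."""
    in_context = {name: tools[name] for name, _ in blueprint}
    return [(in_context[name], task) for name, task in blueprint]


def execute(steps):
    """Executor: run each planned call and collect the outputs."""
    return [tool(task) for tool, task in steps]


def join(outputs):
    """Stub joiner: merge outputs and report whether replanning is needed."""
    return " | ".join(outputs), False


def run_agent(question: str, recursion_limit: int = 3) -> str:
    blueprint = generate_blueprint(question)
    for _ in range(recursion_limit):   # guard against endless replanning
        answer, needs_replan = join(execute(plan(blueprint, TOOLS)))
        if not needs_replan:
            return answer
    raise RuntimeError("recursion limit reached without a final answer")


result = run_agent("NVIDIA")
```

In this sketch the `news` tool is registered but never reaches the planner, because the blueprint does not name it. That is the in-context tool reduction the talk attributes to blueprints.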