How BlackRock Builds Custom Knowledge Apps at Scale — Vaibhav Page & Infant Vasanth, BlackRock
Channel: aiDotEngineer
Published at: 2025-08-23
YouTube video id: 08mH36_NVos
Source: https://www.youtube.com/watch?v=08mH36_NVos
Hi everyone, thank you for having us. I'm Infant, director of engineering at BlackRock. This is my colleague Vaibhav, principal engineer, and we both work on the data teams at BlackRock. Today we're going to talk about how we scale building custom applications at BlackRock, specifically AI applications and knowledge apps.

Just to level set before I get into the details: BlackRock is an asset management firm, the world's largest asset manager. Our portfolio managers and analysts get a torrent of information on a daily basis. They synthesize this information, develop an investment strategy, and then rebalance portfolios, which ultimately results in a particular trade. The investment operations teams are the backbone, the engine that makes sure all of the activities the investment managers perform day to day run smoothly. These teams are responsible for everything from acquiring the data you need, to executing a trade, to running through compliance, all the way to the post-trade activities. All of these teams have to build internal tools that are fairly complex for each of their domains, so building these apps and pushing them out relatively quickly is of the utmost importance to us.

If you move to the next slide and classify what kind of apps we're talking about, you'll see they fall into four buckets. The first is everything to do with document extraction: I have a document and I want to extract entities out of it. The second is everything to do with defining a complex workflow or automation.
So I could have a case where I want to run through some number of steps and then integrate with my downstream systems. Then you have the normal Q&A-type systems, your chat interfaces, and finally the agentic systems. In each of these domains, we see a big opportunity to leverage models and LLMs to either augment our existing systems or supercharge them. That's the domain we're speaking about.

I'll move quickly to one particular use case, one that came to us about three to four months back. We have a team within the investment operations space known as the new issue operations team. This team is responsible for setting up securities whenever there is a market event: a company goes IPO, or there's a stock split for a particular organization. The team has to take the security and set it up in our internal systems before our portfolio managers or traders can act on it. So we had to build a tool for the investment operations team to set up a particular security. Honestly, this is a super simplified version of what happens, but at a very high level, we have to build an app that can ingest a prospectus or a term sheet and push it through a pipeline. Then you talk to your domain experts (these are your business teams, your equity teams, ETF teams, and so on, who know how to set up these complex instruments) and you get some kind of structured output. Now that team works with the engineering teams to build the transformation logic and then integrate it with your downstream applications. So you can see this process takes a long time.
You're building an app, you're introducing new model providers, you're trying out new strategies; there are a lot of challenges to getting a single app out. We tried this with agentic systems and it doesn't quite work right now, because of the complexity and the domain knowledge that lives in people's heads.

The big challenges with scale fall into three categories. The first is that we spend a lot of time prompt engineering with our domain experts. In the first phase, where we have to extract from these documents, the documents are very complex. In our simplest case the prompt started as a couple of sentences, and before you knew it you were describing a financial instrument in three paragraphs. So there's this challenge of iterating over prompts: how do I version and compare them, and how do I manage them effectively? And as the previous speaker mentioned, you need evals and a dataset to know how well your prompt is performing. That's the first set of challenges in creating AI apps.

The second set of challenges is around LLM strategies. When you're building an AI app, you have to choose a strategy: am I going to use a RAG-based approach, or a chain-of-thought-based approach? Even for a simple task like data extraction, this varies a lot depending on the instrument. If you take a vanilla investment-grade corporate bond, it's fairly simple: if the document is small, I can pass it in-context to the model and get my results back. But some documents are thousands of pages long, even ten thousand pages. Now suddenly you're thinking: I don't know if I can pass more than a million tokens into, say, the OpenAI models. What do I do then?
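The prompt iteration problem described above can be sketched minimally. The speakers don't show their internal tooling, so every name here (`PromptVersion`, `eval_prompt`, the stub extractor) is an illustrative assumption, not BlackRock's API:

```python
# A minimal sketch of prompt versioning with an eval dataset.
# All names here are hypothetical, for illustration only.
from dataclasses import dataclass, field

@dataclass
class PromptVersion:
    version: int
    text: str
    scores: dict = field(default_factory=dict)

def eval_prompt(extract_fn, prompt: PromptVersion, eval_set) -> float:
    """Score a prompt version against labeled (document, expected) pairs."""
    hits = 0
    for document, expected in eval_set:
        if extract_fn(prompt.text, document) == expected:
            hits += 1
    accuracy = hits / len(eval_set)
    prompt.scores["accuracy"] = accuracy
    return accuracy

# Usage: a stub extractor stands in for a real LLM call.
stub = lambda prompt, doc: "AAA" if "triple-A" in doc else "unrated"
v1 = PromptVersion(1, "Extract the credit rating from the document.")
eval_set = [("bond rated triple-A", "AAA"), ("no rating given", "unrated")]
print(eval_prompt(stub, v1, eval_set))  # → 1.0
```

The point is only that each prompt version carries its own scores, so versions can be compared side by side as the prompt grows from two sentences to three paragraphs.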
Then I need to choose a different strategy. Often we choose different strategies and mix them with our prompts, so there's an iterative process of playing with prompts and playing with different LLM strategies, and we want to make that process as quick as possible. That's a challenge. Then you obviously have context limitations, model limitations, different vendors, and you're trying and testing things for quite a while; this runs into months.

Then the biggest challenge: fine, I've built this app, now how do I get it deployed? That's a whole other set of challenges. You have your traditional challenges around distribution and access control: how am I going to federate the app out to the users? But in the AI space there's a new challenge: what type of cluster am I going to deploy this to? Our equity team might come and say, hey, I need to analyze 500 research reports overnight, can you help me do this? Okay, for that I probably need a GPU-based inference cluster I can spin up. For the new issue setup use case I described, I don't really want to use my GPU inference cluster; instead I use a burstable cluster. All of those have to be defined so that our app deployment phase is as close to a CI/CD pipeline as possible. Then you have cost controls. This isn't an exhaustive list; what I'm trying to highlight is the challenges of building AI apps.
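The strategy choice walked through above (small documents in-context, retrieval once the context window is exceeded) can be sketched as a simple router. The thresholds and strategy names below are assumptions for the sketch, not the team's actual configuration:

```python
# Illustrative strategy router keyed on document size in tokens.
# Thresholds and strategy names are assumptions, not BlackRock's config.
def choose_strategy(token_count: int, context_limit: int = 1_000_000) -> str:
    """Pick an extraction approach based on how large the document is."""
    if token_count <= context_limit // 10:
        return "in-context"        # whole document fits comfortably in the prompt
    if token_count <= context_limit:
        return "chain-of-thought"  # still fits, but large enough to reason stepwise
    return "rag"                   # exceeds the window: retrieve relevant chunks

print(choose_strategy(5_000))       # → in-context
print(choose_strategy(10_000_000))  # → rag
```

In practice, as the talk notes, strategies get mixed per instrument type rather than chosen by size alone; a real router would also key on the instrument and the fields being extracted.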
So here's what we did at BlackRock. I'll give you a high-level architecture, and then Vaibhav will dive into the details and mechanics of how this works and how we're able to build apps relatively quickly. A single app for a complex use case used to take us somewhere between three and eight months, and we were able to compress that down to a couple of days. We achieved that by building this framework.

What I want to focus on is the top two boxes you see: the sandbox and the app factory. The data platform and the developer platform are, as the names suggest, for ingesting data and so on. You have an orchestration layer with a pipeline that transforms the data into a new format, and then you distribute that as an app or a report. What accelerates app development is federating out the pain points, the bottlenecks: prompt creation, extraction templates, choosing an LLM strategy, extraction runs, and then building out the logic pieces, which we call transformers and executors. If you can get that sandbox into the hands of the domain experts, your iteration speed becomes really fast. You're saying: I have these modular components, I can move through the sandbox iterations really quickly, and then pass the result along to an app factory, which is our cloud-native operator that takes a definition and spins out an app. That's the super high-level view. With that, a quick demo.

>> Perfect.

>> All right, cool. What I'm going to show you is a pretty slimmed-down version of the actual tool we use internally. To start with, we have two core components: one is the sandbox, the other is the factory.
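The "definition in, app out" step of the app factory could be sketched like this. The schema keys, cluster names, and the `spin_up` stand-in are purely hypothetical; the actual operator and definition format aren't shown in the talk:

```python
# Hypothetical app definition an "app factory" operator might consume:
# the sandbox artifacts (template, strategy, transformers) plus deployment hints.
app_definition = {
    "name": "new-issue-setup",
    "extraction_template": "new_issue_v3",
    "llm_strategy": "rag",
    "transformers": ["normalize_dates", "map_security_ids"],
    "executor": "downstream_security_master",
    "deployment": {"cluster": "burstable", "access": ["new-issue-ops"]},
}

def spin_up(definition: dict) -> str:
    """Stand-in for the cloud-native operator: validate, then 'deploy' the app."""
    required = {"name", "extraction_template", "llm_strategy", "executor"}
    missing = required - definition.keys()
    if missing:
        raise ValueError(f"incomplete app definition: {sorted(missing)}")
    return f"deployed {definition['name']} to {definition['deployment']['cluster']} cluster"

print(spin_up(app_definition))  # → deployed new-issue-setup to burstable cluster
```

The design point the talk makes is that everything above the `deployment` key comes out of the sandbox iteration, so deployment reduces to something close to a CI/CD step.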
Think of the sandbox as a playground for the operators to quickly build and refine extraction templates, run extractions on a set of documents, and then compare and contrast the results of those extractions. To start with the extraction template itself: you might have seen a similar concept in other tools, both closed and open source, under prompt template management, where you have certain fields you want to extract out of the documents, their corresponding prompts, and some metadata you can associate with them, such as the data type you expect for the final result values. But when these operators try to run extractions on these documents, they need far greater configuration capabilities than just configuring prompts and expected data types. They need multiple QC checks on the result values, lots of validations and constraints on the fields, and inter-field dependencies among the fields being extracted. As Infant mentioned with new security issuance onboarding, there could be a case where the security or bond is callable, and then other fields such as call date and call price now need to have a value. So there are inter-field dependencies that operators need to be able to take into consideration and configure. Here is what a sample extraction template looks like.
This is an example template where we have issuer, callable, call price, and call date set up as fields. To add a new field, we define the field name, the data type expected for it, and the source: whether it's extracted or derived. You don't always want to run an extraction for a field; there might be a derived field the operator expects, which is populated through some transformation downstream. We also define whether the field is required, plus the field dependencies (this is where you define what dependencies this field has) and the validations. That's how they set up the extraction.

The next thing is document management itself. This is where documents are ingested from the data platform; they're tagged according to business category, labeled, embedded, all of that.

>> Okay, while Vaibhav brings that up: in essence, what we're saying is that we built this tool, with a UI component and a framework, that lets you take these different modular components and put them in the hands of the domain expert to build out their app really quickly.

>> I think something happened, so let me just walk you through what happens next. Once you have set up the extraction templates and document management, the operators run the extractions. That's where they see the values they expect from these documents and review them.
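The inter-field dependency described above (a callable bond requiring call date and call price) can be sketched as a template plus a validator. The field specs and the `validate` helper are illustrative assumptions, not the internal template format:

```python
# Sketch of an extraction template with inter-field dependencies.
# "required_if" encodes the callable-bond rule from the talk; all names
# and the schema shape are hypothetical.
template = {
    "issuer":     {"type": str,   "source": "extracted", "required": True},
    "callable":   {"type": bool,  "source": "extracted", "required": True},
    "call_price": {"type": float, "source": "extracted", "required_if": ("callable", True)},
    "call_date":  {"type": str,   "source": "derived",   "required_if": ("callable", True)},
}

def validate(result: dict) -> list:
    """Check an extraction result against the template's types and dependencies."""
    errors = []
    for name, spec in template.items():
        value = result.get(name)
        required = spec.get("required", False)
        dep = spec.get("required_if")
        if dep and result.get(dep[0]) == dep[1]:
            required = True  # dependency triggered: this field must now be present
        if required and value is None:
            errors.append(f"missing required field: {name}")
        elif value is not None and not isinstance(value, spec["type"]):
            errors.append(f"wrong type for {name}")
    return errors

print(validate({"issuer": "Acme Corp", "callable": True, "call_price": 101.5}))
# → ['missing required field: call_date']
```

A non-callable bond passes with just issuer and callable populated, which is exactly the kind of conditional QC check the operators configure in the sandbox.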
The thing we've seen with these operators is that most of the tools they've used in the past do a pretty good job at extraction, but when it comes to taking the result that's been presented and passing it to downstream processes, the process today is very manual: they have to download a CSV or a JSON file, run or add a transformation by hand, and then push it to the downstream process. So what we've done (and I can't show it here) is build a low-code/no-code framework where the operators can essentially build these transformation and execution workflows and have an end-to-end pipeline running.

>> I think we'll conclude with our key takeaways. I'd say there are three. First, invest heavily in prompt engineering skills for your domain experts, especially in the financial space; defining and describing these documents is really hard. Second, educate the firm on what an LLM strategy means and how to pick the right pieces for your particular use case. Third, all of this is great in experimentation and prototyping mode, but if you want to take it further, you have to really evaluate your ROI: is spinning up an AI app going to be more expensive than an off-the-shelf product that does it quicker and faster? Those are the three key takeaways for building apps at scale.
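The extract, transform, execute flow that the low-code framework wires together could be sketched as below; the stub stages stand in for the real LLM extraction and downstream integrations, and every name here is hypothetical:

```python
# Minimal sketch of the end-to-end pipeline the low-code framework assembles:
# extraction, then a chain of transformers, then an executor that pushes the
# result downstream. All stage names are illustrative stubs.
def run_pipeline(document: str, extractor, transformers, executor):
    record = extractor(document)
    for transform in transformers:
        record = transform(record)
    return executor(record)

# Usage with stub stages standing in for LLM extraction and downstream systems.
extract = lambda doc: {"coupon": doc.split("coupon ")[1].split("%")[0]}
to_float = lambda rec: {**rec, "coupon": float(rec["coupon"])}
publish = lambda rec: f"published coupon={rec['coupon']}"
print(run_pipeline("bond with coupon 4.25% due 2030", extract, [to_float], publish))
# → published coupon=4.25
```

The value the talk claims is that operators compose these stages themselves, instead of downloading a CSV and handing it to engineering for each new transformation.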
One more thing I'll add is that human-in-the-loop is super important. We're all tempted to go fully agentic with this, but in the financial space, with compliance and regulations, you need that four-eyes check and you need the human in the loop. So design for human-in-the-loop first if you're in a highly regulated environment.

>> Yeah, and as Infant said, one thing we couldn't show is the whole app factory component. Everything the operators do through the sandbox iteration cycle, all that knowledge, the extraction templates, the transformers and executors they build through the workflow pipeline, goes through our app ecosystem within BlackRock to build custom applications that are then exposed to users. The users of these apps don't have to worry about configuring templates or figuring out how to integrate the result values into final downstream processes; they're presented with a whole end-to-end app where they can just upload documents, run extraction, and get the whole pipeline running.

>> Yeah. With that, we'll open up for questions. I think we have a minute or two left.

>> Good morning, I have a question which may be directly related to the architecture you developed. You can tell me now, or we can discuss it later. One of your key takeaways was to invest heavily in prompt engineering. You've essentially automated the process from the leaf level, for example a company coming to an IPO, all the way through cataloging via ETL processes and finally to the data analytics.
Now your CEO, who looks at the balance sheet, at assets and liabilities, will be using your AI the most. At the lowest level you have features like term, maturity, duration; there are so many metrics at that level. How are you transforming those features from the lowest level to the highest level? I'm looking for an answer in reference to decentralized data.

>> Yeah, I can give you a quick answer and then we can discuss in detail offline. The framework we built was specifically targeting the investment operations domain experts who are trying to build applications. To your question of what the CEO cares about (can I construct a memo that gives me my assets, liabilities, and so on?), those would be different initiatives which may or may not use our particular framework. But yes, there are many reusable components here that people can use.

>> I do a lot of document processing for an insurance company, pretty much the same problems you've run into. So I wonder how you build a wall around your information extraction from the documents, because there are so many things that can go wrong, starting from OCR that doesn't understand what all these terms actually mean, no matter how you prompt it.
That's the reason for the question.

>> Again, we had all of that to show, but yeah.

>> A short answer to your question on information security, and the boundaries we put in place against data leakage or errors: you can think of it as different layers, all the way from the infrastructure to the platform, the application, and the user levels. There are different controls and policies in place at each layer, and it all runs within our network; there are policies across the stack, which we can get into in detail later, that address your concerns.

>> And also, to your point, we have different strategies that we use based on the use case at hand. It's not just one RAG approach versus another; there are multiple model providers we use, multiple different strategies, different engineering tweaks. So it's quite a complex process.

>> All right. Very cool. Awesome. Thank you.