How BlackRock Builds Custom Knowledge Apps at Scale — Vaibhav Page & Infant Vasanth, BlackRock
Channel: aiDotEngineer
Published at: 2025-08-23
YouTube video id: 08mH36_NVos
Source: https://www.youtube.com/watch?v=08mH36_NVos
Hi everyone, thank you for having us. I'm Infant, director of engineering at BlackRock. This is my colleague Vaibhav, principal engineer, and we both work on the data teams at BlackRock. Today we're going to talk about how we scale building custom applications at BlackRock, specifically AI applications and knowledge apps.

Just to level set before I get into the details: BlackRock is an asset management firm, the world's largest asset manager. Our portfolio managers and analysts get a torrent of information on a daily basis. They synthesize this information, develop an investment strategy, and then rebalance portfolios, which ultimately results in a particular trade. The investment operations teams are the backbone, the engine that makes sure all of the activities the investment managers perform day to day run smoothly. These teams are responsible for everything from acquiring the data you need, to executing a trade, to running through compliance, all the way to the post-trade activities. All of these teams have to build internal tools that are fairly complex for each of their domains, so building these apps and pushing them out relatively quickly is of the utmost importance to us.

If you move to the next slide and classify what kind of apps we're talking about, you'll see they fall into four buckets. The first is everything to do with document extraction: I have a document and I want to extract entities out of it. The second is everything to do with defining a complex workflow or automation.
So I could have a case where I want to run through some number of steps and then integrate with my downstream systems. Then you have the normal Q&A-type systems, your chat interfaces, and finally the agentic systems. In each of these domains, we see a big opportunity to leverage models and LLMs to either augment our existing systems or supercharge them. That's the domain we're speaking about.

I'll move quickly to one particular use case, one that came to us about three to four months back. We have a team within the investment operations space known as the new issue operations team. This team is responsible for setting up securities whenever there is a market event: a company goes IPO, or there's a stock split for a particular organization. The team has to take the security and set it up in our internal systems before our portfolio managers or traders can act on it. So we had to build a tool for the investment operations team to set up a particular security. Honestly, this is a super simplified version of what happens, but at a very high level, we have to build an app that can ingest a prospectus or a term sheet and push it through a pipeline. Then you talk to your domain experts (these are your business teams, your equity teams, ETF teams, and so on, who know how to set up these complex instruments) and you get some kind of structured output. Now that team works with the engineering teams to build the transformation logic and then integrate it with your downstream applications. So you can see this process takes a long time.
You're building an app, you're introducing new model providers, you're trying out new strategies; there are a lot of challenges to getting a single app out. We tried this with agentic systems and it doesn't quite work right now, because of the complexity and the domain knowledge that lives in people's heads.

The big challenges with scale fall into three categories. The first is that we spend a lot of time prompt engineering with our domain experts. In the first phase, where we have to extract from these documents, the documents are very complex. In our simplest case the prompt started as a couple of sentences, and before you knew it you were describing a financial instrument in three paragraphs. So there's this challenge of iterating over prompts: how do I version and compare them, and how do I manage them effectively? And as the previous speaker mentioned, you need evals and a dataset to know how well your prompt is performing. That's the first set of challenges in creating AI apps.

The second set of challenges is around LLM strategies. When you're building an AI app, you have to choose a strategy: am I going to use a RAG-based approach, or a chain-of-thought-based approach? Even for a simple task like data extraction, this varies a lot depending on the instrument. If you take a vanilla investment-grade corporate bond, it's fairly simple: if the document is small, I can pass it in-context to the model and get my results back. But some documents are thousands of pages long, even ten thousand pages. Now suddenly you're thinking: I don't know if I can pass more than a million tokens into, say, the OpenAI models. What do I do then?
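The prompt iteration problem described above can be sketched minimally. The speakers don't show their internal tooling, so every name here (`PromptVersion`, `eval_prompt`, the stub extractor) is an illustrative assumption, not BlackRock's API:

```python
# A minimal sketch of prompt versioning with an eval dataset.
# All names here are hypothetical, for illustration only.
from dataclasses import dataclass, field

@dataclass
class PromptVersion:
    version: int
    text: str
    scores: dict = field(default_factory=dict)

def eval_prompt(extract_fn, prompt: PromptVersion, eval_set) -> float:
    """Score a prompt version against labeled (document, expected) pairs."""
    hits = 0
    for document, expected in eval_set:
        if extract_fn(prompt.text, document) == expected:
            hits += 1
    accuracy = hits / len(eval_set)
    prompt.scores["accuracy"] = accuracy
    return accuracy

# Usage: a stub extractor stands in for a real LLM call.
stub = lambda prompt, doc: "AAA" if "triple-A" in doc else "unrated"
v1 = PromptVersion(1, "Extract the credit rating from the document.")
eval_set = [("bond rated triple-A", "AAA"), ("no rating given", "unrated")]
print(eval_prompt(stub, v1, eval_set))  # → 1.0
```

The point is only that each prompt version carries its own scores, so versions can be compared side by side as the prompt grows from two sentences to three paragraphs.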
Then I need to choose a different strategy. Often we choose different strategies and mix them with our prompts, so there's an iterative process of playing with prompts and playing with different LLM strategies, and we want to make that process as quick as possible. That's a challenge. Then you obviously have context limitations, model limitations, different vendors, and you're trying and testing things for quite a while; this runs into months.

Then the biggest challenge: fine, I've built this app, now how do I get it deployed? That's a whole other set of challenges. You have your traditional challenges around distribution and access control: how am I going to federate the app out to the users? But in the AI space there's a new challenge: what type of cluster am I going to deploy this to? Our equity team might come and say, hey, I need to analyze 500 research reports overnight, can you help me do this? Okay, for that I probably need a GPU-based inference cluster I can spin up. For the new issue setup use case I described, I don't really want to use my GPU inference cluster; instead I use a burstable cluster. All of those have to be defined so that our app deployment phase is as close to a CI/CD pipeline as possible. Then you have cost controls. This isn't an exhaustive list; what I'm trying to highlight is the challenges of building AI apps.
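The strategy choice walked through above (small documents in-context, retrieval once the context window is exceeded) can be sketched as a simple router. The thresholds and strategy names below are assumptions for the sketch, not the team's actual configuration:

```python
# Illustrative strategy router keyed on document size in tokens.
# Thresholds and strategy names are assumptions, not BlackRock's config.
def choose_strategy(token_count: int, context_limit: int = 1_000_000) -> str:
    """Pick an extraction approach based on how large the document is."""
    if token_count <= context_limit // 10:
        return "in-context"        # whole document fits comfortably in the prompt
    if token_count <= context_limit:
        return "chain-of-thought"  # still fits, but large enough to reason stepwise
    return "rag"                   # exceeds the window: retrieve relevant chunks

print(choose_strategy(5_000))       # → in-context
print(choose_strategy(10_000_000))  # → rag
```

In practice, as the talk notes, strategies get mixed per instrument type rather than chosen by size alone; a real router would also key on the instrument and the fields being extracted.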
So here's what we did at BlackRock. I'll give you a high-level architecture, and then Vaibhav will dive into the details and mechanics of how this works and how we're able to build apps relatively quickly. A single app for a complex use case used to take us somewhere between three and eight months, and we were able to compress that down to a couple of days. We achieved that by building this framework.

What I want to focus on is the top two boxes you see: the sandbox and the app factory. The data platform and the developer platform are, as the names suggest, for ingesting data and so on. You have an orchestration layer with a pipeline that transforms the data into a new format, and then you distribute that as an app or a report. What accelerates app development is federating out the pain points, the bottlenecks: prompt creation, extraction templates, choosing an LLM strategy, extraction runs, and then building out the logic pieces, which we call transformers and executors. If you can get that sandbox into the hands of the domain experts, your iteration speed becomes really fast. You're saying: I have these modular components, I can move through the sandbox iterations really quickly, and then pass the result along to an app factory, which is our cloud-native operator that takes a definition and spins out an app. That's the super high-level view. With that, a quick demo.

>> Perfect.

>> All right, cool. What I'm going to show you is a pretty slimmed-down version of the actual tool we use internally. To start with, we have two core components: one is the sandbox, the other is the factory.
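The "definition in, app out" step of the app factory could be sketched like this. The schema keys, cluster names, and the `spin_up` stand-in are purely hypothetical; the actual operator and definition format aren't shown in the talk:

```python
# Hypothetical app definition an "app factory" operator might consume:
# the sandbox artifacts (template, strategy, transformers) plus deployment hints.
app_definition = {
    "name": "new-issue-setup",
    "extraction_template": "new_issue_v3",
    "llm_strategy": "rag",
    "transformers": ["normalize_dates", "map_security_ids"],
    "executor": "downstream_security_master",
    "deployment": {"cluster": "burstable", "access": ["new-issue-ops"]},
}

def spin_up(definition: dict) -> str:
    """Stand-in for the cloud-native operator: validate, then 'deploy' the app."""
    required = {"name", "extraction_template", "llm_strategy", "executor"}
    missing = required - definition.keys()
    if missing:
        raise ValueError(f"incomplete app definition: {sorted(missing)}")
    return f"deployed {definition['name']} to {definition['deployment']['cluster']} cluster"

print(spin_up(app_definition))  # → deployed new-issue-setup to burstable cluster
```

The design point the talk makes is that everything above the `deployment` key comes out of the sandbox iteration, so deployment reduces to something close to a CI/CD step.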
Think of the sandbox as a playground for the operators to quickly build and refine extraction templates, run extractions on a set of documents, and then compare and contrast the results of those extractions. To start with the extraction template itself: you might have seen a similar concept in other tools, both closed and open source, under prompt template management, where you have certain fields you want to extract out of the documents, their corresponding prompts, and some metadata you can associate with them, such as the data type you expect for the final result values. But when these operators try to run extractions on these documents, they need far greater configuration capabilities than just configuring prompts and expected data types. They need multiple QC checks on the result values, lots of validations and constraints on the fields, and inter-field dependencies among the fields being extracted. As Infant mentioned with new security issuance onboarding, there could be a case where the security or bond is callable, and then other fields such as call date and call price now need to have a value. So there are inter-field dependencies that operators need to be able to take into consideration and configure. Here is what a sample extraction template looks like.
This is an example template where we have issuer, callable, call price, and call date set up as fields. To add a new field, we define the field name, the data type expected for it, and the source: whether it's extracted or derived. You don't always want to run an extraction for a field; there might be a derived field the operator expects, which is populated through some transformation downstream. We also define whether the field is required, plus the field dependencies (this is where you define what dependencies this field has) and the validations. That's how they set up the extraction.

The next thing is document management itself. This is where documents are ingested from the data platform; they're tagged according to business category, labeled, embedded, all of that.

>> Okay, while Vaibhav brings that up: in essence, what we're saying is that we built this tool, with a UI component and a framework, that lets you take these different modular components and put them in the hands of the domain expert to build out their app really quickly.

>> I think something happened, so let me just walk you through what happens next. Once you have set up the extraction templates and document management, the operators run the extractions. That's where they see the values they expect from these documents and review them.
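The inter-field dependency described above (a callable bond requiring call date and call price) can be sketched as a template plus a validator. The field specs and the `validate` helper are illustrative assumptions, not the internal template format:

```python
# Sketch of an extraction template with inter-field dependencies.
# "required_if" encodes the callable-bond rule from the talk; all names
# and the schema shape are hypothetical.
template = {
    "issuer":     {"type": str,   "source": "extracted", "required": True},
    "callable":   {"type": bool,  "source": "extracted", "required": True},
    "call_price": {"type": float, "source": "extracted", "required_if": ("callable", True)},
    "call_date":  {"type": str,   "source": "derived",   "required_if": ("callable", True)},
}

def validate(result: dict) -> list:
    """Check an extraction result against the template's types and dependencies."""
    errors = []
    for name, spec in template.items():
        value = result.get(name)
        required = spec.get("required", False)
        dep = spec.get("required_if")
        if dep and result.get(dep[0]) == dep[1]:
            required = True  # dependency triggered: this field must now be present
        if required and value is None:
            errors.append(f"missing required field: {name}")
        elif value is not None and not isinstance(value, spec["type"]):
            errors.append(f"wrong type for {name}")
    return errors

print(validate({"issuer": "Acme Corp", "callable": True, "call_price": 101.5}))
# → ['missing required field: call_date']
```

A non-callable bond passes with just issuer and callable populated, which is exactly the kind of conditional QC check the operators configure in the sandbox.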
The thing we've seen with these operators is that most of the tools they've used in the past do a pretty good job at extraction, but when it comes to taking the result that's been presented and passing it to downstream processes, the process today is very manual: they have to download a CSV or a JSON file, run or add a transformation by hand, and then push it to the downstream process. So what we've done (and I can't show it here) is build a low-code/no-code framework where the operators can essentially build these transformation and execution workflows and have an end-to-end pipeline running.

>> I think we'll conclude with our key takeaways. I'd say there are three. First, invest heavily in prompt engineering skills for your domain experts, especially in the financial space; defining and describing these documents is really hard. Second, educate the firm on what an LLM strategy means and how to pick the right pieces for your particular use case. Third, all of this is great in experimentation and prototyping mode, but if you want to take it further, you have to really evaluate your ROI: is spinning up an AI app going to be more expensive than an off-the-shelf product that does it quicker and faster? Those are the three key takeaways for building apps at scale.
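The extract, transform, execute flow that the low-code framework wires together could be sketched as below; the stub stages stand in for the real LLM extraction and downstream integrations, and every name here is hypothetical:

```python
# Minimal sketch of the end-to-end pipeline the low-code framework assembles:
# extraction, then a chain of transformers, then an executor that pushes the
# result downstream. All stage names are illustrative stubs.
def run_pipeline(document: str, extractor, transformers, executor):
    record = extractor(document)
    for transform in transformers:
        record = transform(record)
    return executor(record)

# Usage with stub stages standing in for LLM extraction and downstream systems.
extract = lambda doc: {"coupon": doc.split("coupon ")[1].split("%")[0]}
to_float = lambda rec: {**rec, "coupon": float(rec["coupon"])}
publish = lambda rec: f"published coupon={rec['coupon']}"
print(run_pipeline("bond with coupon 4.25% due 2030", extract, [to_float], publish))
# → published coupon=4.25
```

The value the talk claims is that operators compose these stages themselves, instead of downloading a CSV and handing it to engineering for each new transformation.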
One more thing I'll add is that human-in-the-loop is super important. We're all tempted to go fully agentic with this, but in the financial space, with compliance and regulations, you need that four-eyes check and you need the human in the loop. So design for human-in-the-loop first if you're in a highly regulated environment.

>> Yeah, and as Infant said, one thing we couldn't show is the whole app factory component. Everything the operators do through the sandbox iteration cycle, all that knowledge, the extraction templates, the transformers and executors they build through the workflow pipeline, goes through our app ecosystem within BlackRock to build custom applications that are then exposed to users. The users of these apps don't have to worry about configuring templates or figuring out how to integrate the result values into final downstream processes; they're presented with a whole end-to-end app where they can just upload documents, run extraction, and get the whole pipeline running.

>> Yeah. With that, we'll open up for questions. I think we have a minute or two left.

>> Good morning, I have a question which may be directly related to the architecture you developed. You can tell me now, or we can discuss it later. One of your key takeaways was to invest heavily in prompt engineering. You've essentially automated the process from the leaf level, for example a company coming to an IPO, all the way through cataloging via ETL processes and finally to the data analytics.
Now your CEO, who looks at the balance sheet, at assets and liabilities, will be using your AI the most. At the lowest level you have features like term, maturity, duration; there are so many metrics at that level. How are you transforming those features from the lowest level to the highest level? I'm looking for an answer in reference to decentralized data.

>> Yeah, I can give you a quick answer and then we can discuss in detail offline. The framework we built was specifically targeting the investment operations domain experts who are trying to build applications. To your question of what the CEO cares about (can I construct a memo that gives me my assets, liabilities, and so on?), those would be different initiatives which may or may not use our particular framework. But yes, there are many reusable components here that people can use.

>> I do a lot of document processing for an insurance company, pretty much the same problems you've run into. So I wonder how you build a wall around your information extraction from the documents, because there are so many things that can go wrong, starting from OCR that doesn't understand what all these terms actually mean, no matter how you prompt it.
That's the reason for the question.

>> Again, we had all of that to show, but yeah.

>> A short answer to your question on information security, and the boundaries we put in place against data leakage or errors: you can think of it as different layers, all the way from the infrastructure to the platform, the application, and the user levels. There are different controls and policies in place at each layer, and it all runs within our network; there are policies across the stack, which we can get into in detail later, that address your concerns.

>> And also, to your point, we have different strategies that we use based on the use case at hand. It's not just one RAG approach versus another; there are multiple model providers we use, multiple different strategies, different engineering tweaks. So it's quite a complex process.

>> All right. Very cool. Awesome. Thank you.