MCP UI: Extending the frontier — Liad Yosef and Ido Salomon, MCP Apps

Channel: aiDotEngineer

Published at: 2026-05-06

YouTube video id: o-zkvb0iFDQ

Source: https://www.youtube.com/watch?v=o-zkvb0iFDQ

[music]
>> Okay, hi everyone.
We built this talk, well, not really
yesterday, we built this talk this
morning, and it might already be out of
date. I'm Ido Salomon. I'm the creator
of MCP-UI,
co-creator and maintainer of MCP apps.
Also the creator of Agent Craft, if you were
at the previous session.
>> I'm Liad. I work with Ido on MCP-UI,
co-created the MCP apps spec with Ido, and
I'm also co-founder of Ergo Labs, which
is a company building human-agentic
interfaces.
So MCP apps are all around us. You might
not even realize it but the interactive
applications you see today in ChatGPT,
Claude and others are actually based on
MCP and the MCP app spec.
But why do we need MCP apps, and what are
MCP apps? I mean, you heard David this
morning talk about it a little bit.
We used to have
MCP tools sending text to our chat
agents.
But that's not ideal, right? Because
text is really bad,
and actually this was one of the main
blockers that kept companies or
tools from sending their data to ChatGPT,
because they didn't want to be reduced
to
this thing, this wall of text where you
don't have identity. You don't know
if this information came from Shopify,
Booking, Expedia or any other company.
But what if
every tool or every company could just
send its own UI to the chat? So instead
of us looking at
this thing, we can just imagine that for
the parts that are relevant
we can have the relevant UI: the
relevant UI from Shopify, from Hugging
Face, from Monday. And this can be
not only presentational, it can be
interactive. So we want to be able to
respond to a user click on this Hugging
Face widget.
So we don't have to imagine it anymore.
So back in May last year I released
MCP-UI. The concept was pretty simple.
There was a bunch of stuff around it,
but the concept was pretty simple: how
do we take UI and find some way to pass
it over MCP? We need some general way to
do that so we can both
have UI over MCP and have the
communication between the UI and the
host. Obviously it also had community
SDKs, and the general motivation was that
we don't need to throw away everything
we know about UI and UX just to get into
this new world of agents.
We can simply adapt and use that and
preserve our branding and identity and
still be practical.
And
just a few months back,
MCP-UI partnered with Anthropic and
OpenAI to really put this into the MCP
standard as the first official extension,
called MCP apps.
And as you can see here, it made kind of
a big splash. We had support from a
bunch of hosts. Now VS Code and Cursor
and Claude and ChatGPT and
Microsoft Copilot and a bunch of others
have already adopted it, and you have these
really cool interfaces built right into
your assistants.
And going back a little bit, there were
early adopters for MCP-UI. So these are
some of the companies that adopted MCP-UI
(shout out to Hugging Face, Sean, if
you're here). So even
a year ago Shopify was already sending
MCP-UI chunks: all of the millions
of Shopify online stores send MCP-UI
chunks. And at Hugging Face, all of the
Hugging Face Spaces were MCP-UI widgets.
And now that it's standardized, we have
much bigger adoption.
So we have VS Code, we have Cursor,
we have Copilot,
GitHub, and ChatGPT supporting MCP apps.
Not only that, ChatGPT is recommending
MCP apps as the way to build ChatGPT
apps. So it's really standardized.
Obviously, shout out to Postman and Goose
and Claude,
the first one that released Claude apps
that actually supported MCP apps. But
it's not just support from the big
companies or the big hosts. We also have
huge community adoption.
We have people building plugins
around MCP apps, building workshops
around MCP apps,
building all kinds of support around
MCP apps. Spy just announced support for
MCP apps, which is amazing. It's like a
terminal, right? But we have UI in the
terminal right now, and we have all of
these advocates that are speaking
about MCP apps. There are even companies
that are built around MCP apps to help
other businesses build those apps.
We have an official MCP apps repo
with Anthropic and OpenAI. There's
amazing community engagement. We
recommend that you check it out. We
have
public work group meetings,
and we're meeting once every
3 weeks just to push the standard
forward, because as we can see, it's going
to be the global standard for UI inside
chat apps. And we're going to talk a
little bit about the concepts behind
MCP-UI.
Yeah, so let's talk about the core
concepts. So the first and obvious one is:
how do we even pass UI over MCP?
So in the old world of a few months back,
when we wanted to do whatever, let's say
we wanted to create the best playlist
ever, we would type something into the
chat and it would send out a tool call
to our MCP server.
So far so good. What we would get
back in response
would be text,
and as you know, text is suboptimal. But
if we are using MCP apps, what we can do
is instead return a resource. So we can
return this actual HTML back to the
host. The host supports MCP apps, so it
can take that HTML and transform it into
an interactive application.
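
The difference between a text result and a UI result can be sketched roughly as follows. This is only an illustration: the type names, the `ui://playlist/best-ever` URI, and the helper functions are assumptions made up for this example, not quoted from the MCP apps spec or SDK.

```typescript
// Hypothetical shapes for illustration; field names are assumptions,
// not quoted from the MCP apps spec.
type TextContent = { type: "text"; text: string };
type UiResourceContent = {
  type: "resource";
  resource: { uri: string; mimeType: string; text: string };
};
type ToolContent = TextContent | UiResourceContent;

// Minimal HTML escaping so song titles cannot inject markup.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// A playlist tool that answers with an HTML resource instead of plain text.
function buildPlaylistResult(songs: string[]): ToolContent[] {
  const items = songs.map((s) => `<li>${escapeHtml(s)}</li>`).join("");
  return [
    {
      type: "resource",
      resource: {
        uri: "ui://playlist/best-ever", // made-up URI
        mimeType: "text/html",
        text: `<ul>${items}</ul>`,
      },
    },
  ];
}

// A host that understands UI resources picks the HTML; otherwise it falls back to text.
function pickRenderable(content: ToolContent[]): { kind: "html" | "text"; body: string } {
  for (const c of content) {
    if (c.type === "resource" && c.resource.mimeType.indexOf("text/html") === 0) {
      return { kind: "html", body: c.resource.text };
    }
  }
  const text = content
    .filter((c): c is TextContent => c.type === "text")
    .map((c) => c.text)
    .join("\n");
  return { kind: "text", body: text };
}
```

A host without UI support would simply never take the HTML branch, which is why the same tool can still serve text-only clients.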
And when we say interactive, we mean
interactive. So this is not just
presentational. MCP apps also
standardize the way that this UI can
talk to the user and to its back end.
Just imagine the user wants
to favorite this song. The
suboptimal thing that could happen is
for this UI to speak directly to Spotify's
back end and favorite this song.
Then later, when the user asks Claude,
"Remind me which song I favorited," Claude
wouldn't know, because the UI spoke
directly to the back end. But MCP apps
standardize this message passing so
that every UI chunk sends a message back
to the host. The host gets this
message, in this case a tool call,
and the host decides what to do. In this
case it decides to actually call the
server tool, but the control is in the
hands of the host and everything stays
in context.
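
The favorite-song flow above can be sketched like this. It is a toy model under stated assumptions: `SketchHost`, `UiToolCall`, and the `favorite_song` tool are hypothetical names, and a real host would talk to an MCP server rather than to local functions.

```typescript
// Hypothetical message and host types; names are assumptions for illustration.
type SongParams = { title: string };
type UiToolCall = { type: "tool"; toolName: string; params: SongParams };
type ServerTools = { [name: string]: (params: SongParams) => string };

class SketchHost {
  // Everything the model can later see, in order.
  readonly context: string[] = [];

  constructor(private tools: ServerTools, private allowed: string[]) {}

  // The UI never calls the back end itself; it asks the host, and the host decides.
  handleUiMessage(msg: UiToolCall): string {
    const tool = this.tools[msg.toolName];
    if (!tool || this.allowed.indexOf(msg.toolName) === -1) {
      this.context.push(`denied ${msg.toolName}`);
      return "denied";
    }
    const result = tool(msg.params);
    // Recording the call is what lets the model answer
    // "which song did I favorite?" later on.
    this.context.push(`${msg.toolName}(${JSON.stringify(msg.params)}) -> ${result}`);
    return result;
  }
}

const favoriteSongs: string[] = [];
const musicHost = new SketchHost(
  {
    favorite_song: (p) => {
      favoriteSongs.push(p.title);
      return `favorited ${p.title}`;
    },
  },
  ["favorite_song"],
);
```

Because every interaction passes through `handleUiMessage`, the host can refuse a call, ask the user for confirmation, or forward it, and in each case the conversation context reflects what happened.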
Okay, so seeing is believing. Let's see
a quick example of what that looks like.
So this is Claude, like, actual Claude.
And let's say that I want to do
something like analyze my funnel. So I
type that in. In the old world, it would
go out to, let's say, PostHog, and I would
get this textual response, which is
accurate,
but it doesn't really help me understand
what's going on. I have to read this
whole thing now and kind of try and see
what's the deal.
But with MCP apps, instead of doing this
I can just say "show me."
And now...
Okay, the clicker is not yet up to par,
but now what we'll have is this nice UI
visualization actually created by
PostHog. So they control the identity,
the experience. It's actually their
component that you would see on the
website.
And now I have a really cool way to
just see the funnel at a glance.
That's not all. I mean the MCP app isn't
just UI generated by the server.
There are also really cool innovations
from Anthropic and other companies
to do generative UI on top of MCP apps
or even first party UI in general.
So for instance, with this Claude feature,
let's say that I don't know what a
funnel is, which is
reasonable.
I can ask what a funnel is, and instead of
again getting that long textual answer,
what would happen is that Claude would
be able to generate this UI for me,
explain exactly what I need, or create
some UI that I need to do some action,
and present it to me in a way that is
very digestible.
This is obviously applicable to a bunch
of other stuff and you'll see it in
other hosts as well.
So
another cool thing here is that
this is not just presentational. Like we
said, it's interactive, so I can just
click on it and it would give me a
follow-up on the specific step of the
funnel
that I have a question about. So you
can imagine how this goes in a
bunch of other directions when you want
to do interactive exploration.
So how does it work
in general?
So
let's go over the stages.
We went to the host and we prompted
something. We asked for
funnel data. What happened is that it
sent out a tool call to our MCP server.
And again instead of just returning text
that tool was actually pointing to a
resource.
That resource was our UI.
So if you look at the
code for it, it's super simple:
you just register a resource and you
have it. So we return that resource
back to the host.
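
As a rough sketch of "register a resource, then have the tool point to it", here is a tiny in-memory stand-in for a server. It is not the official SDK: `SketchServer`, `uiResourceUri`, and the `ui://funnel/view` URI are all hypothetical names chosen for illustration.

```typescript
// A tiny in-memory stand-in for an MCP server; this is not the official SDK,
// and every name here is hypothetical.
type UiResource = { uri: string; mimeType: string; html: string };
type ToolDef = { name: string; uiResourceUri: string; run: (input: string) => string };

class SketchServer {
  private resources: { [uri: string]: UiResource } = {};
  private tools: { [name: string]: ToolDef } = {};

  registerResource(resource: UiResource): void {
    this.resources[resource.uri] = resource;
  }

  // A tool may only point at a UI resource that was registered first.
  registerTool(tool: ToolDef): void {
    if (!this.resources[tool.uiResourceUri]) {
      throw new Error(`unknown UI resource: ${tool.uiResourceUri}`);
    }
    this.tools[tool.name] = tool;
  }

  // The result carries the data plus the UI that should render it.
  callTool(name: string, input: string): { data: string; ui: UiResource } {
    const tool = this.tools[name];
    if (!tool) throw new Error(`unknown tool: ${name}`);
    const ui = this.resources[tool.uiResourceUri];
    if (!ui) throw new Error(`missing UI resource: ${tool.uiResourceUri}`);
    return { data: tool.run(input), ui };
  }
}

const funnelServer = new SketchServer();
funnelServer.registerResource({
  uri: "ui://funnel/view", // made-up URI
  mimeType: "text/html",
  html: '<div id="funnel"></div>',
});
funnelServer.registerTool({
  name: "analyze_funnel",
  uiResourceUri: "ui://funnel/view",
  run: (input) => `funnel data for ${input}`,
});
```

Keeping the UI as a separate, addressable resource means the host can fetch and cache it independently of any single tool call.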
The host, because it also supports MCP
apps, can take that and
transform it. Code-wise, if you want to
build a host, it's just
a
React component that accepts that
resource and also this callback, which is
the way that we handle messaging between
the UI and the host.
So you take that and you render it
inside a sandbox, so it's secure.
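
A minimal sketch of the host side, assuming the sandbox is an HTML iframe. The helper names and the message shape are hypothetical; only the `sandbox` and `srcdoc` iframe attributes are standard HTML. A real host would pass these props to an iframe from its React component and feed incoming `postMessage` data into the handler.

```typescript
// Hypothetical host-side helpers; names and the message shape are assumptions.
type UiMessage = { type: string; payload: string };
type FrameProps = { sandbox: string; srcdoc: string };

// "allow-scripts" without "allow-same-origin" lets the UI run its own code
// while keeping it away from the host page's origin, cookies, and storage.
function sandboxedFrameProps(html: string): FrameProps {
  return { sandbox: "allow-scripts", srcdoc: html };
}

// Wraps the host's callback so only well-formed messages from the UI get through.
function makeMessageHandler(
  onUiMessage: (msg: UiMessage) => void,
): (raw: unknown) => boolean {
  return (raw) => {
    if (typeof raw !== "object" || raw === null) return false;
    const candidate = raw as { type?: unknown; payload?: unknown };
    if (typeof candidate.type !== "string" || typeof candidate.payload !== "string") {
      return false;
    }
    onUiMessage({ type: candidate.type, payload: candidate.payload });
    return true;
  };
}
```

Validating at the boundary matters because the embedded UI is third-party code: the host should treat anything it sends as untrusted input.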
Like we said, it's not just presentational, so
we can also click on it. And once you click,
what happens is that there's a bunch of
events going back from the UI, from this
view,
all the way back to the model.
So, it can actually go out and make other
tool calls or even send follow-up messages on
your behalf or fetch additional
resources,
really completing this end-to-end
bidirectional flow.
So, when we look at that, when we look
at this flow, when we look at this
architecture, it's not just a technical
change. It's not just the technology
that's changing. It's also how we
perceive the web. Because this is
ushering in a new web, a web where
we don't need websites. We don't need
all of those tabs just to organize
an anniversary. We don't need to
familiarize ourselves with a
bunch of different UIs.
We don't need to
force ourselves to pass our intents to
dashboards of companies where 90% of
the UI is not relevant for an agent. If
I have a personal assistant, I don't
need most of it. I can just take this
and decompose it into atoms and
let my agent build them for me, right?
Because I have my assistant that I convey
my intent to. So, for example, my agent,
my proactive assistant, can say, "Yeah, I
see that you have an important
anniversary coming." And instead of
Google just sending the data, Google can
actually send a chunk of
Google Calendar. And now, this is a
win-win-win, because for Google, it's
amazing. It gets to keep its identity.
For me, it's good because I know
this interface. I recognize that it's
Google.
But it's good for the host as well,
because the host doesn't need to render
that.
We have domain experts. We have
companies that spend decades in
perfecting user journeys and we can't
expect Claude or ChatGPT or any host to
automatically generate all those UIs.
And if I continue and I ask for something
from Amazon, then instead of Amazon just
sending me the data of the product and
thereby reducing itself to just a
database, it can just send this chunk of
Amazon. And I look at it and say, "Oh,
it's Amazon. Okay, I know."
And then, I can complete the entire
planning of my anniversary,
the entire planning, in just one
assistant chat, right? And you can see
that it pulled just the relevant parts
of it, because it pulled
the venue from Booking, but it knows me.
It knows that I prefer something that's
close to nature and not in the city. So,
it also knew to pull the map from
Booking. That's because the assistant
knew me. Booking doesn't know me that
well, but Booking knows how to
book a venue. So, this is real synergy
between those.
And we have to think about this new
interaction mindset. Why? Because we
have to remember that in this flow,
the apps, the services, the tools
no longer own my journey in the
platform, right? If I click something in
Booking, it doesn't go to Booking's
back end. It goes to the host,
like we said. So,
what we did with MCP apps is that every
click, every interaction actually sends
this kind of message back to the
host.
And like we said, this is breaking
the model for all the
companies. So, this is like a new
philosophy.
But the messages can be put on a
spectrum.
So, this spectrum represents how much
control the UI wants for itself and how
much control it gives to the host. So,
for example, a notification: that's the
highest level of control the UI has. It
just notifies the host that something
happened. For example, if I increase the
number of items in my cart, it doesn't
need to go through the host. It goes back
to Shopify and just notifies the host
that something happened. A tool call is
the UI telling the host to call a tool. And
a prompt is where the UI
releases all control and tells
the host, "Just run this prompt and see
what happens."
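
The three points on that spectrum can be sketched as a small tagged union with a router. The message type names (`notify`, `tool`, `prompt`) and the `HostAction` shape are assumptions made for this illustration, not names taken from the spec.

```typescript
// Hypothetical message kinds, ordered from most UI control to least.
type UiToHostMessage =
  | { type: "notify"; message: string } // UI already acted; host is only informed
  | { type: "tool"; toolName: string; args: string } // UI asks the host to call a tool
  | { type: "prompt"; prompt: string }; // UI hands the decision to the model

type HostAction =
  | { action: "log"; note: string }
  | { action: "callTool"; toolName: string; args: string }
  | { action: "runPrompt"; prompt: string };

// The host maps each message kind to what it will do next.
// A real host could also decline or ask the user before acting.
function routeUiMessage(msg: UiToHostMessage): HostAction {
  switch (msg.type) {
    case "notify":
      return { action: "log", note: msg.message };
    case "tool":
      return { action: "callTool", toolName: msg.toolName, args: msg.args };
    case "prompt":
      return { action: "runPrompt", prompt: msg.prompt };
  }
}
```

In the cart example, the quantity change would arrive as a `notify` message: Shopify's back end has already been updated, and the host only adds the fact to its context.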
So, MCP apps really standardize this
new software flow, and that's something
that we need to remember.
Perhaps in 2 years, we won't have
browsers as we know them. We won't
have websites as we know them. We'll
have a personal assistant that accepts
only small chunks of UI, and this will
replace our web journey.
2026 is going to be the year that
we're going to standardize
MCP apps as a global standard for UI.
Yeah, but the spec is still evolving. I
mean, there's a bunch of stuff
happening. Just in the last, I think, 2
months, we shipped all of those, or
almost all of those, based on community
feedback and on community work done
by the work group, which you can join.
So, you're encouraged to do that.
There's the official SDK, ext-apps. You can
just use that to build your
applications. It's very simple. There
are built-in skills, so you can just let
your coding agent do it for you. You
don't actually need to code anything,
God forbid.
So, you have this, and then
it's important to remember that the
reason to use this SDK is that it's
always compliant with the spec. We
always update both.
So, feel free to use it.
You can see the issues and
stuff that people open on it. So, please
feel free to do that too.
So, what's next for MCP apps? Obviously,
there's a bunch of stuff in the
pipeline, but just to give you a
taste.
So, we have reusable views. The idea
here is that today, for simplicity,
whenever you render an app, we actually
render a new one. So, let's say that
you're working with the same app
multiple times. If you keep re-rendering
it and you have some heavy applications
(for example, Autodesk had this
problem),
it just takes a really long time and
your experience will be bad. So, we are
working on ways to solve it. The first
one is: why can't we just reference
that same view and push data into it?
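
The "reference the same view and push data into it" idea might look something like the sketch below. Since this feature is described as still in progress, everything here is hypothetical: `ViewPool`, `show`, and `renderCount` are invented for illustration and do not reflect any published API.

```typescript
// Hypothetical sketch of view reuse; none of these names come from the spec.
type View = { id: string; html: string; data: string; renderCount: number };

class ViewPool {
  private views: { [id: string]: View } = {};

  // Renders the view the first time, then only pushes new data into it.
  show(id: string, html: string, data: string): View {
    const existing = this.views[id];
    if (existing) {
      existing.data = data; // cheap update, no re-render of the heavy app
      return existing;
    }
    const created: View = { id, html, data, renderCount: 1 };
    this.views[id] = created;
    return created;
  }
}
```

The expensive step (loading and rendering a heavy application) happens once per view id; later calls only change the data the view displays.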
But the second one is actually to take
this and flip the script.
So, another thing that we've been
working on is interactions where it's not
the user interacting with the app
and then the app sending it to the model,
which we just saw.
What if we want the model to be able
to interact with the view? We want
Claude to be able to click on buttons or
fill in forms or do anything inside the
UI. So, today we have solutions like
WebMCP and things like that. We are
working on a standardized way, so that when
the user interacts with the model,
the app can actually expose tools
for the model to interact with it, thus
closing this loop. And you
can check out the PR. It's still an open
PR, but that's something that we're working on
in the committee.
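
To make "the app exposes tools for the model" more concrete, here is a speculative sketch. The proposal is still an open PR, so `FormView`, `exposedTools`, and `invokeViewTool` are hypothetical names and the `name=value` argument format is an assumption made only for this example.

```typescript
// Hypothetical: a view that exposes its own tools so the model can drive it.
type ViewTool = { name: string; description: string; run: (arg: string) => void };

class FormView {
  values: { [field: string]: string } = {};
  submitted = false;

  // The tools the view advertises to the host, and through it to the model.
  exposedTools(): ViewTool[] {
    return [
      {
        name: "fill_field",
        description: "Set a form field, given as 'name=value'",
        run: (arg) => {
          const i = arg.indexOf("=");
          if (i > 0) this.values[arg.slice(0, i)] = arg.slice(i + 1);
        },
      },
      {
        name: "submit",
        description: "Submit the form",
        run: () => {
          this.submitted = true;
        },
      },
    ];
  }
}

// Host side: look up a tool the view exposed and invoke it on the model's behalf.
function invokeViewTool(view: FormView, name: string, arg: string): boolean {
  for (const tool of view.exposedTools()) {
    if (tool.name === name) {
      tool.run(arg);
      return true;
    }
  }
  return false;
}
```

This mirrors the earlier direction of travel: before, the UI asked the host to call server tools; here, the model asks the host to call tools that live inside the UI.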
And the most important thing is that MCP
apps support all the ways
of generating UI. Because that's a
question that we always get asked: what
about generative UI? So, if we put it on
a spectrum, then we have the predefined
UI. That's like the classic MCP app.
That's like Airbnb building its own UI and
sending it to Claude or to ChatGPT.
That's predefined. That's a black box.
That's good for 8% of the cases. But we
have things that are a little bit more
structured, like declarative UI. If
you know JSON render or
things like that, where the app can just
declare the structure of the UI, but
the components are rendered by the
host. So, the host and the app are
sharing the UI functionality and
visibility. That's good for hosts that
want to control the look and feel of the
apps. For example, just imagine: Claude
probably doesn't want to have a Booking
UI, then an Airbnb UI, then an Expedia UI
in the same chat flow, right? So, this
is a pretty good middle ground. And
on the other end, you have the fully
generative UI, which is what
Anthropic released a few weeks
ago,
where the model just generates the UI
out of thin air. Now,
the nice thing is that MCP apps is
really agnostic to how you generate the
UI. MCP apps doesn't assume that Airbnb
created the UI.
Any part of this process can create
the UI, and the feature that Claude
released, which is the generative UI on
the fly, actually uses MCP apps under
the hood, right? So, it's a generative
UI that's being streamed into an MCP app,
and then MCP apps closes
that loop. So, it's good for third-party
UI, which is the black box, but also for
first-party UI. So,
that's something that we're working on
standardizing. We're doing a lot of
work on interoperability with other
UI protocols like A2UI, which is the
generative UI protocol by Google, and WebMCP,
like we said, and we just want to build
a unified standard for UI in chat apps.
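
The declarative middle ground described above, where the app declares structure and the host renders the components, can be sketched as follows. The node shape and the `theme` parameter are assumptions for illustration only; this is not the format used by JSON render, A2UI, or the MCP apps spec.

```typescript
// Hypothetical declarative UI tree; the node shape is an assumption.
type UiNode =
  | { kind: "text"; value: string }
  | { kind: "button"; label: string }
  | { kind: "stack"; children: UiNode[] };

// Minimal escaping so declared strings cannot inject markup into the host page.
function escapeText(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// The host owns the look and feel: the same tree renders with whatever theme it picks.
function renderNode(node: UiNode, theme: string): string {
  switch (node.kind) {
    case "text":
      return `<p class="${theme}-text">${escapeText(node.value)}</p>`;
    case "button":
      return `<button class="${theme}-button">${escapeText(node.label)}</button>`;
    case "stack": {
      const inner = node.children.map((c) => renderNode(c, theme)).join("");
      return `<div class="${theme}-stack">${inner}</div>`;
    }
  }
}

// What an app might declare: structure only, no styling of its own.
const venueCard: UiNode = {
  kind: "stack",
  children: [
    { kind: "text", value: "Cabin near the lake" },
    { kind: "button", label: "Book" },
  ],
};
```

With this split, a Booking card and an Airbnb card would both come out in the host's own styling, which is the consistency benefit mentioned in the talk; the trade-off is that the app gives up control over its branding.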
Yeah, and that's a cool
summary of MCP apps. If you build an
MCP app, it runs everywhere. LibreChat
is an MCP app client. ChatGPT is a
ChatGPT app client, but the same
application, the same codebase,
works for
every host.
Cool.
So,
if you think about it, this isn't just
some tech, right? This isn't some
protocol. This is a new way to
distribute applications.
>> [clears throat]
>> So, if you look at just a few months
back, Sam Altman said, I think it was
October, that 800 million people are
using ChatGPT on a weekly basis. That's
10% of the world's population.
It's insane. The internet took like 13
years to get to that number of users.
And
now, it's not even 800 million. It's a
billion. And it's not just ChatGPT. It's
also Claude and VS Code.
You have a potential audience that is at
least 160 times the number of users that
iPhone had when the App Store launched.
So, how do you get started? There are
two main ways. As a server, if
you're developing an app, then like I
said, you go to the ext-apps repo. There's a
QR code if you want to do it
quickly. You have the skills. Just use
that.
The other way
is if you're a host. So if
you're building an application that
actually hosts applications, you can just
take MCP-UI's SDK, which is the
recommended client SDK. It's also
fully compliant with the spec. You just
take that React component and you're
done. It just supports apps out of the
box, and then you get hundreds of apps from
Booking and other providers out of the
box.
So just to emphasize,
there was a slide about skills. It's
really easy to create an MCP app.
If you visit the site (we just passed
through it), it's just a skill. You
just push it to Claude Code and you
generate an MCP app out of thin air. If
you want to get involved in the spec
itself, in how MCP apps are going to
operate, and if you want to help build
the future of UI in agents,
then obviously visit the official MCP
apps repo, that's ext-apps. Open an issue,
open a PR, participate in the
discussion. We also have the official
Discord for the MCP apps committee, where
we do surveys and we interact
with the community and with other hosts
to decide on things that relate to
MCP apps. And there's the community
Discord, which is I think the coolest
place to be, because you have all of the
users of MCP apps, be it
people that build servers or companies
that build hosts, who just talk to each
other, share tips,
troubleshoot,
and ask for features.
That's the place to be if you're
interested in MCP apps.
So we said some scary stuff along the
way,
like the web is dying and
all the websites are meaningless at
this point, but
I kind of hope that you don't look at
this as a threat but more as an
opportunity, basically a
once-in-20-years opportunity to think about your
apps again and think about what the core
user experience is that we're looking to
get, and imagine it not as a monolithic
single app where people go, but
actually as part of a new web of
applications, these chunks of UI that
communicate with each
other using a smart model in between.
That's pretty insane.
And with MCP apps, even this early in the
ecosystem, this early in how agentic
apps work, we already have
standardization, and they all work the
same. You can write your app once and it
will run everywhere.
So what does the future look like?
We're not yet at Jarvis
but with MCP in general and MCP apps in
particular you can bring experiences
that were impossible just a few months
ago to every host in the world including
your own.
So thank you. Thank you very much.
>> [applause]
[music]