MCP UI: Extending the frontier — Liad Yosef and Ido Salomon, MCP Apps
Channel: aiDotEngineer
Published at: 2026-05-06
YouTube video id: o-zkvb0iFDQ
Source: https://www.youtube.com/watch?v=o-zkvb0iFDQ
[music] >> Okay, hi everyone. We built this talk, well, not really yesterday, we built this talk this morning, and it might already be out of date. I'm Ido Salomon. I'm the creator of MCP-UI, co-creator and maintainer of MCP Apps, and also the creator of Agent Craft, if you were at the previous session. I'm Liad. I work with Ido on MCP-UI, co-created the MCP Apps spec with Ido, and I'm also co-founder of Ergo Labs, a company building human-agentic interfaces.
So MCP Apps are all around us. You might not even realize it, but the interactive applications you see today in ChatGPT, Claude and others are actually based on MCP and the MCP Apps spec. But why do we need MCP Apps, and what are MCP Apps? You heard David talk about it a little this morning. We used to have MCP tools sending text to our chat agents. But that's not ideal, right? Because text is really bad, and this was actually one of the main blockers keeping companies and tools from sending their data to ChatGPT: they didn't want to be reduced to this wall of text where you don't have identity. You don't know if this information came from Shopify, Booking, Expedia or any other company. But what if every tool or every company could just send its own UI to the chat? So instead of looking at this wall of text, we can imagine that for the parts that are relevant we have the relevant UI, from Shopify, from Hugging Face, from Monday, and this can be not only presentational but interactive. So we want to be able to respond to a user click on this Hugging Face widget.
So we don't have to imagine it anymore. Back in May last year I released MCP-UI. There was a bunch of stuff around it, but the concept was pretty simple: how do we take UI and find some way to pass it over MCP?
We need some general way to do that, so we can both have UI over MCP and have communication between the UI and the host. Obviously it also had community SDKs, and the general motivation was that we don't need to throw away everything we know about UI and UX just to get into this new world of agents. We can simply adapt and use that, preserve our branding and identity, and still be practical. And just a few months back, with MCP-UI, we partnered with Anthropic and OpenAI to really put this into the MCP standard as the first official extension, called MCP Apps. As you can see here, it made kind of a big splash. We had support from a bunch of hosts. VS Code, Cursor, Claude, ChatGPT, Microsoft Copilot and a bunch of others have already adopted it, and you have these really cool interfaces built right into your assistants.
Going back a little bit, there were early adopters for MCP-UI. These are some of the companies that adopted MCP-UI. Shout out to Hugging Face, Sean, if you're here. So even a year ago, Shopify was already sending MCP-UI chunks: millions of Shopify online stores send MCP-UI chunks. On Hugging Face, all of the Hugging Face Spaces were MCP-UI widgets. And now that it's standardized we have much bigger adoption. We have VS Code, we have Cursor, we have Copilot, GitHub, ChatGPT, all supporting MCP Apps. Not only that, ChatGPT recommends MCP Apps as the way to build ChatGPT apps. So it's really standardized. Obviously, shout out to Postman and Goose, and to Claude, the first one that released Claude apps that actually supported MCP Apps.
But it's not just support from the big companies or the big hosts. We also have huge community adoption. We have people building plugins around MCP Apps, building workshops around MCP Apps, building all kinds of support around MCP Apps. Spy just announced support for MCP Apps, which is amazing. It's like a terminal, right?
But we have UI in the terminal right now, and we have all of these advocates speaking about MCP Apps. There are even companies built around MCP Apps to help other businesses build those apps. We have an official MCP Apps repo with Anthropic and OpenAI. There's amazing community engagement, and we recommend you check it out. We have public work group meetings, and we meet tri-weekly, once every 3 weeks, just to push the standard forward, because as we can see it's going to be the global standard for UI inside chat apps. And we're going to talk a little bit about the concepts behind MCP-UI.
Yeah, so let's talk about the core concept. The first and obvious one is how do we even pass UI over MCP? In the old world of a few months back, when we wanted to do something, let's say create the best playlist ever, we would type something into the chat and it would send out a tool call to our MCP server. So far so good. What we would get back in response would be text, and as you know, text is sub-optimal. But if we are using MCP Apps, we can instead return a resource. So we can return this actual HTML back to the host. The host supports MCP Apps, so it can take that HTML and transform it into an interactive application.
And when we say interactive, we mean interactive. This is not just presentational. MCP Apps also standardizes the way this UI can talk to the user and to its back end. Just imagine the user wants to favorite this song. The sub-optimal thing would be for this UI to speak to Spotify's back end directly and favorite the song, because later, when the user asks Claude, "Remind me which song I favorited," Claude wouldn't know, since the UI spoke directly to the back end. But MCP Apps standardizes this message passing so that every UI chunk sends a message back to the host.
The host gets this message, in this case something like a tool call, and the host decides what to do. In this case it decides to actually call the server tool, but the control is in the hands of the host and everything stays in context.
Okay, so seeing is believing. Let's see a quick example of what that looks like. This is Claude, like actual Claude. Let's say that I want to do something like analyze my funnel. I type that in, and in the old world it would go out to, let's say, PostHog, and I would get this textual response, which is accurate but doesn't really help me understand what's going on. I have to read this whole thing and try to see what the deal is. But with MCP Apps, instead of doing this, I can just say "show me." And now, okay, the clicker is not yet up to par, but what we have now is this nice UI visualization actually created by PostHog. So they control the identity and the experience. It's actually their component, the one you would see on their website. And now I have a really cool way to see the funnel at a glance.
That's not all. An MCP App isn't just UI generated by the server. There are also really cool innovations from Anthropic and other companies to do generative UI on top of MCP Apps, or even first-party UI in general. For instance, take this Claude feature. Let's say I don't know what a funnel is, which is reasonable. I can ask what a funnel is, and instead of again getting that long textual answer, Claude is able to generate this UI for me that explains exactly what I need, or create some UI that I need to do some action, and present it to me in a way that is very digestible. This is obviously applicable to a bunch of other stuff, and you'll see it in other hosts as well. Another cool thing here is that this is not just presentational.
Like we said, it's interactive, so I can just click on it and it gives me a follow-up on the specific step of the funnel that I have questions about. You can imagine how this goes in a bunch of other directions when you want to do interactive exploration.
So how does it work in general? Let's go over the stages. We went to the host and we prompted something: we asked for funnel data. What happened is that it sent out a tool call to our MCP server. And again, instead of just returning text, that tool was actually pointing to a resource. That resource was our UI. If you look at the code for it, it's super simple: you just register a resource and you have it. So we return that resource back to the host. The host, because it also supports MCP Apps, can take that and transform it. Code-wise, if you want to build a host, it's just a React component that accepts that resource and also a callback, which is the way we handle messaging between the UI and the host. So you take that and you render it inside a sandbox, so it's secure. Like we said, it's not just presentational, so we can also click on it. And once you click, a bunch of events go back from the UI, from this view, all the way back to the model. So it can actually go out and make other tool calls, or even send follow-up messages on your behalf, or fetch additional resources, really completing this end-to-end bidirectional flow.
So when we look at this flow, at this architecture, it's not just a technical change. It's not just a technology that's changing. It's also how we perceive the web. Because this is ushering in a new web, a web where we don't need websites. We don't need all of those tabs just to organize an anniversary. We don't need to familiarize ourselves with a bunch of different UIs.
We don't need to force ourselves to pass our intents to company dashboards where 90% of the UI is not relevant for an agent. If I have a personal assistant, I don't need most of it. I can just take this, decompose it to atoms, and let my agent build them for me, right? Because I have my assistant that I convey my intent to. So, for example, my agent, my proactive assistant, can say, "Yeah, I see that you have an important anniversary coming." And instead of Google just sending the data, Google can actually send a chunk of Google Calendar. And now this is a win-win-win. For Google, it's amazing: it gets to keep its identity. For me, it's good because I know this interface; I recognize that it's Google. But it's good for the host as well, because the host doesn't need to render that. We have domain experts. We have companies that spent decades perfecting user journeys, and we can't expect Claude or ChatGPT or any host to automatically generate all those UIs.
And if I continue and ask for something from Amazon, then instead of Amazon just sending me the product data and reducing itself to being just a database, it can send this chunk of Amazon. I look at it and say, "Oh, it's Amazon. Okay, I know." And then I can complete the entire planning of my anniversary in just one assistant chat, right? And you can see that it pulled just the relevant parts, because it pulled the venue from Booking, but it knows me. It knows that I prefer something that's close to nature and not in the city. So it also knew to pull the map from Booking. That's because the assistant knew me. Booking doesn't know me that well, but Booking knows how to book a venue. So this is real synergy between them. And we have to think about this new interaction mindset. Why?
Because we have to remember that in this flow, the apps, the services, the tools no longer own my journey in the platform, right? If I click something in Booking, it doesn't go to Booking's back end. It goes to the host, like we said. So what we did with MCP Apps is that every click, every interaction, actually sends this kind of message back to the host. And like we said, this is breaking the model for all of the companies. So this is like a new philosophy.
But the messages can be put on a spectrum. This spectrum represents how much control the UI wants for itself and how much control it gives to the host. So, for example, a notification is the highest level of control the UI has. It just notifies the host that something happened. For example, if I increase the number of items in my cart, that doesn't need to go through the host. It goes back to Shopify, and the UI just notifies the host that something happened. A tool call is the UI telling the host, "Call a tool." And a prompt is the UI releasing all control and telling the host, "Just run this prompt and see what happens."
So MCP Apps really standardizes this new software flow, and that's something we need to remember. Perhaps in 2 years we won't have browsers as we know them. We won't have websites as we know them. We'll have a personal assistant that accepts only small chunks of UI, and this will replace our web journey. 2026 is going to be the year that we standardize MCP Apps as a global standard for UI.
Yeah, but the spec is still evolving. There's a bunch of stuff happening. Just in the last, I think, 2 months, we shipped all of those, or almost all of those, based on community feedback and community work done by the work group, which you can join. So you're encouraged to do this. There's the official SDK, ext-apps. You can just use that to build your applications. It's very simple.
There are built-in skills, so you can just let your coding agent do it for you. You don't actually need to code anything, God forbid. So you have this, and it's important to remember that the reason to use this SDK is that it's always compliant with the spec. We always update both. So feel free to use it. You can see the issues and things that people open on it, so please feel free to do that too.
So what's next for MCP Apps? Obviously, there's a bunch of stuff in the pipeline, but just to give you a taste: we have reusable views. The idea here is that today, for simplicity, whenever you render an app, we actually render a new one. So let's say you're working with the same app multiple times. If you keep re-rendering it and you have a heavy application, and Autodesk, for example, had this problem, it just takes a really long time and your experience will be bad. So we are working on ways to solve it. The first one is simply: why can't we just reference that same view and push data into it?
But the second one is actually to take this and flip the script. So far we've seen the user interacting with the app and the app sending that to the model. But what if we want the model to be able to interact with the view? We want Claude to be able to click on buttons, fill in forms, or do anything inside the UI. Today we have solutions like WebMCP and things like that. We are working on a standardized way, so that when the user interacts with the model, the app can actually expose tools for the model to interact with it, thus closing this loop. You can check out the PR. It's still an open PR, but that's something we're working on in the committee.
And the most important thing is that MCP Apps supports all the ways of generating UI. Because that's a question we always get asked: what about generative UI?
So if we put it on a spectrum, at one end we have predefined UI. That's the classic MCP App. That's Airbnb building its own UI and sending it to Claude or to ChatGPT. That's predefined. That's a black box. That's good for 8% of the cases. But we have things that are a little more structured, like declarative UI. Think of JSON render or things like that, where the app just declares the structure of the UI but the components are rendered by the host. So the host and the app share the UI functionality and visibility. That's good for hosts that want to control the look and feel of the apps. For example, Claude probably doesn't want to have a Booking UI, then an Airbnb UI, then an Expedia UI in the same chat flow, right? So this is a pretty good middle ground. And at the other end, you have fully generative UI, which is what Anthropic released in Claude a few weeks ago, where the model just generates the UI out of thin air.
Now, the nice thing is that MCP Apps is really agnostic to how you generate the UI. MCP Apps doesn't assume that Airbnb created the UI. Any part of this process can create the UI, and the feature that Claude released, the generative UI on the fly, actually uses MCP Apps under the hood, right? It's generative UI that's being streamed into an MCP App, and then MCP Apps closes that loop. So it's good for third-party UI, which is the black box, but also for first-party UI. That's something we're working on standardizing. We're doing a lot of work on interoperability with other UI protocols, like A2UI, which is the generative UI protocol by Google, and WebMCP, like we said, and we just want to build a unified standard for UI in chat apps.
Yeah, and that's a cool summary of MCP Apps. If you build an MCP App, it runs everywhere. LibreChat is an MCP app client.
ChatGPT is a ChatGPT app client, but the same application, the same codebase, works for every host.
Cool. So if you think about it, this isn't just some tech, right? This isn't just some protocol. This is a new way to distribute applications. If you look just a few months back, Sam Altman said, I think it was in October, that 800 million people are using ChatGPT on a weekly basis. That's 10% of the world's population. It's insane. The internet took something like 13 years to get to that number of users. And now it's not even 800 million. It's a billion. And it's not just ChatGPT. It's also Claude and VS Code. You have a potential audience that is at least 160 times the number of users the iPhone had when the App Store launched.
So how do you get started? There are two main ways. The first is as a server, if you're developing an app. Like I said, you go to the ext-apps repo. There's a QR code if you want to do it quickly. You have the skills. Just use that. The other way is if you're a host, so if you're building an application that actually hosts applications. You can just take MCP-UI's SDK, which is the recommended client SDK. It's also fully compliant with the spec. You just take that React component and you're done. It supports apps out of the box, and then you get hundreds of apps from Booking and other providers out of the box.
Just to emphasize, there was a slide about skills. It's really easy to create an MCP App. If you visit the site, which we just passed through, it's just a skill. You push it to Claude Code and you generate an MCP App out of thin air. If you want to get involved in the spec itself, in how MCP Apps are going to operate, and if you want to help build the future of UI in agents, then obviously visit the official MCP Apps repo, that's ext-apps. Open an issue, open a PR, participate in the discussion.
We also have the official Discord for the MCP Apps committee, where we run surveys and interact with the community and with other hosts to decide on things that relate to MCP Apps. And there's the community Discord, which is I think the coolest place to be, because you have all of the users of MCP Apps, be it people who build servers or companies that build hosts, talking to each other, sharing tips, troubleshooting, asking for features. That's the place to be if you're interested in MCP Apps.
So we said some scary stuff along the way, like the web is dying and all the websites are meaningless at this point, but I kind of hope you don't look at this as a threat but as an opportunity, basically a once-in-20-years opportunity to think about your apps again, to think about what core user experience you're looking to deliver, and to imagine it not as a monolithic single app that people go to but as part of a new web of applications, these chunks of UI that communicate with each other through a smart model in between. That's pretty insane. And with MCP Apps, even this early in the ecosystem, this early in how agentic apps work, we already have standardization, and they all work the same. You can write your app once and it will run everywhere.
So what does the future look like? We're not yet at Jarvis, but with MCP in general and MCP Apps in particular you can bring experiences that were impossible just a few months ago to every host in the world, including your own. So thank you. Thank you very much. >> [applause] [music]
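The notification / tool call / prompt spectrum described in the talk can be sketched in TypeScript. This is only a minimal illustration of the idea that every UI message goes to the host, which decides what to do and keeps the result in context. It is not the MCP Apps SDK or wire format: the names `UiMessage`, `HostActions`, `routeUiMessage` and the `favorite_song` tool are all hypothetical, and the real spec's message shapes may differ.

```typescript
// Hypothetical message shapes, NOT the actual MCP Apps wire format.
type UiMessage =
  | { kind: "notify"; text: string } // UI keeps control; host is only informed
  | { kind: "tool"; name: string; args: Record<string, unknown> } // UI asks the host to call a tool
  | { kind: "prompt"; text: string }; // UI hands control to the host and model

// Hypothetical capabilities a host might wire in.
interface HostActions {
  callServerTool(name: string, args: Record<string, unknown>): string;
  runPrompt(text: string): string;
}

// The host receives every message, decides what to do with it,
// and appends an entry to the conversation context.
function routeUiMessage(msg: UiMessage, host: HostActions, chatContext: string[]): void {
  switch (msg.kind) {
    case "notify":
      chatContext.push(`ui notified: ${msg.text}`);
      break;
    case "tool": {
      const result = host.callServerTool(msg.name, msg.args);
      chatContext.push(`tool ${msg.name} -> ${result}`);
      break;
    }
    case "prompt": {
      const reply = host.runPrompt(msg.text);
      chatContext.push(`prompt "${msg.text}" -> ${reply}`);
      break;
    }
  }
}

// Usage with stub host actions standing in for a real server and model.
const chatContext: string[] = [];
const stubHost: HostActions = {
  callServerTool: (name, args) => `${name}:${JSON.stringify(args)}`,
  runPrompt: (text) => `model saw: ${text}`,
};
routeUiMessage({ kind: "notify", text: "cart quantity changed" }, stubHost, chatContext);
routeUiMessage({ kind: "tool", name: "favorite_song", args: { id: "42" } }, stubHost, chatContext);
routeUiMessage({ kind: "prompt", text: "Tell me about step 2" }, stubHost, chatContext);
```

Because all three message kinds pass through one host-side function, the favorited song from the talk's example ends up in `chatContext`, so a later "which song did I favorite?" question could be answered from context rather than from a back end the model never saw.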