The State of Generative Media - Gorkem Yurtseven, FAL
Channel: aiDotEngineer
Published at: 2025-07-16
YouTube video id: P370D8Kmlkw
Source: https://www.youtube.com/watch?v=P370D8Kmlkw
[Music] It's so nice to see a generative media track at the AI Engineer conference this year. I work at a company called fal.ai. We call ourselves a generative media platform. The term has been around for a while, but we kind of owned it and made it the description of our company. The way we define it, at least, generative media is generated video, audio, or images. Our company serves all these kinds of models using our inference engine, and we are partnering with some closed-source model providers as well. I've been doing this for a couple of years, but it is a really, really new market, and throughout the talk I'm going to walk you through how we got here, a little bit of the history, and what's next.
I remember in 2022 when Sam Altman started tweeting about DALL-E 2. I was working at home. It was the end of COVID. I know COVID took a little longer in San Francisco, but I remember sitting on the floor, and I could not believe my eyes. People were tweeting at him, and he was tweeting back pretty high-definition images of the incredible things people were tweeting. Looking back, obviously it all looks kind of low quality, but at the time I thought this was the most incredible technology ever. And I was in the industry. I knew what was going on, not as much as today, but I thought at that time OpenAI was so far ahead of everyone else that it was going to be so hard for normal people to catch up to this technology. I remember this was one of the biggest WTF moments of my life.
But then you can tell me, "Hey Gorkem, this was all going to happen." There were other AI waves before this last big wave. There was the GAN breakthrough, and people did similar things using GANs. DeepDream from Google went through a phase, and there was even a viral consumer AI application called Prisma.
People uploaded their selfies and were able to change their avatars. But the capabilities and applications of that technology were nowhere near what generative media can be used for today.
And it is not only the previous AI wave. Generative media, or being able to create art with computers, has been around about as long as computers have. This is Harold Cohen, and this is a recreation of his project. Basically, he created these massive computers to draw on huge canvases, to create art similar to how a human would draw. Then we have computer graphics, generative graphics, things like that. Throughout the years, people have tried to generate visuals and art using different computing technologies all along.
Right after Sam Altman's tweets, the playing field evened out really, really quickly. DALL-E 2 was April 6. Right after that, Midjourney released their initial model in beta as a Discord bot. Very quickly after that, Stable Diffusion was open-sourced, which was a huge, huge thing. People were now able to run a technology similar to DALL-E 2 in their homes, on their home GPUs, and people started building services around it. Then SDXL came out, and now there are many different image models, open and closed source. Most recently, Flux was released early in the summer last year.
With this playing field evening out, the marginal cost of creation is approaching zero. I'm very careful when I choose my words here. I'm not saying marginal cost of creativity. It's marginal cost of creation. I think storytelling is still really important. Creativity is still really important. But once you have that set up, the cost of creating that next new thing is approaching zero. And we believe this is going to have a huge impact on different kinds of industries and markets.
So anything from social media, advertising, marketing, and fashion to, obviously, film and movies, gaming, and e-commerce is going to be transformed by generative media. And this transformation is going to continue until all content, one way or another, is impacted by AI.
If you've been following, software has been eating media all along. YouTube, basically just from ads, is generating more revenue than any other media company except Disney. This is pretty remarkable. And Disney's revenue clearly includes non-media revenue. They have parks, they have cruise ships, they have other things. So it's not too hard to say YouTube is right now one of the highest revenue-generating media companies in the world, and it is happening through ads.
Whenever the ad industry is impacted by technology, it usually grows in volume. We believe the same thing is going to happen with generative media and ads. We believe the ad industry is going to be one of the first industries to be impacted at a large scale by generative media, and that the size of the industry is going to increase. It's really funny: ad spend has grown about three times since 2000, but all of that growth happened in software ads. So we believe something similar is going to happen with AI-driven ads. The ad industry is going to grow, and most of that growth is going to come from AI.
There are several different ways this can happen. We believe ads themselves are going to be hyperpersonalized. This might mean you are generating many different versions of the same ad, maybe for 10,000 different demographics, really quickly. Or it can mean the ad is targeted toward a certain individual: if you are coming from a certain website, the ad can be generated on the fly. And it could also be interactive, with things generated on the fly as I just mentioned, and that might mean many different things in the industry.
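The idea of generating many versions of the same ad for something like 10,000 demographics can be sketched as a prompt fan-out. This is a minimal illustration and not how any particular platform does it: the attribute lists, the template wording, and the `build_ad_prompts` helper are all hypothetical, and the step that would send each prompt to an image or video model is left out.

```python
import itertools

# Hypothetical audience attributes (placeholders, not from the talk).
# 10 age groups x 20 regions x 50 interests = 10,000 segments.
AGE_GROUPS = [f"age group {i}" for i in range(10)]
REGIONS = [f"region {i}" for i in range(20)]
INTERESTS = [f"interest {i}" for i in range(50)]

# Hypothetical prompt template; a real campaign would write its own.
TEMPLATE = "{item} ad for {age}, living in {region}, interested in {interest}"


def build_ad_prompts(item: str) -> list[str]:
    """Return one generation prompt per demographic segment."""
    return [
        TEMPLATE.format(item=item, age=age, region=region, interest=interest)
        for age, region, interest in itertools.product(
            AGE_GROUPS, REGIONS, INTERESTS
        )
    ]


prompts = build_ad_prompts("running shoes")
print(len(prompts))   # number of ad variants to generate
print(prompts[0])     # first variant's prompt
```

Each prompt would then go to whichever generative model the campaign uses. The per-visitor case mentioned above is the same idea with a single segment built at request time.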
One other reason I think generative media fits the ad industry very well is the abundance of content. For example, I probably won't watch a blockbuster movie every single day. Even if we have a thousand more movies this year, I would probably have to sit down and watch a movie a day to go through all of them, and I probably won't be able to do that. But with ads, there can be kind of unlimited content. Every time I'm on my phone, I'm seeing ads. On TV, there are ads all the time. And it doesn't matter if the ad is different each time. Maybe there needs to be some consistency, but the ad industry can actually work with a lot more content, and things can get a lot more creative.
We were a little ahead of our time. Last year we did an ad promo for A24's movie Civil War, and it was one of those interactive ideas I was talking about. If you've seen the movie, it's about an imaginary civil war in the US, and they had this campaign with little green toy soldiers. They created a live marketing site where you could upload a selfie, and we created a little toy soldier from your selfie and your description. They put this up in Times Square, so people were able to display their own faces on these little green toy soldiers. AI is going to help us create experiences like this that are interactive and personalized.
The other trend we are watching really closely is e-commerce. If you've been paying attention, e-commerce is gaining about one percentage point of the US retail industry every year. This is a trend that's happening with or without AI, and we believe generative media is going to play a big part in e-commerce's growth as well. There are already many companies trying to redefine how people shop online, and because online shopping is very visual, AI can add a lot of interactivity to the experience. In fact, it's one of the earliest product-market fits I've seen in generative media.
This has been happening for a couple of months, maybe a year: virtual try-on is one of the clearest product-market fits that I see in the AI industry. Many different retailers and e-commerce websites are adopting this technology, and many different startups are being built on it. So I believe this is going to be everywhere. Every retailer, every e-commerce website is a potential generative media user.
And then there is video. When Sam tweeted about DALL-E 2, I thought OpenAI was so far ahead that no one would be able to catch up. People caught up incredibly fast. Then he did the same trick with Sora, when Sora was released basically a year and a half ago. This time around, Sora was maybe even more impressive than DALL-E 2 in terms of how far ahead things looked. But this time I was incredibly excited that researchers at OpenAI were able to do things like this, because from past experience I know that if this is possible in one place, people are going to be able to do similar things in others. So I was incredibly excited when Sam Altman started tweeting about Sora, because I knew that very soon a technology like this was going to be everywhere.
In fact, it started happening. This is a little snapshot of our company's revenue, which I think is a good proxy for the entire market. In October, we barely had any video model usage on the platform, and by February this went all the way up to 18%. I didn't get time to update the chart, but I looked yesterday and it's around 30% today. So it is growing really fast, even though it's expensive, even though it still doesn't work as well. Video models are going to completely take over the generative media market, and I have some predictions about how much bigger the video market is going to be compared to the image market. It's rough math, but we believe video models are 20x more compute intensive.
And let's say video is 5x more engaging, and it's going to impact more industries because it's going to be more useful to them. All said and done, we believe the generative video market is going to be 100x to 250x bigger than the image generation market. And we are just scratching the surface here. I believe the image generation market has a ton of growth coming in the next couple of years as well, but video is growing much, much faster than that, and when all is said and done, it's going to be a much bigger market.
And yes, video models are leveling up as well. You have probably all seen Veo 3, the newest model from Google DeepMind. New capabilities keep getting added to video models. First it was consistency, and now it's sound. The things people are generating with it are incredible, and every time a new capability is added, it unlocks a different use case in the industry. It's not on our platform yet, but I'm very curious to see how people are going to start creating with Veo 3 and what different use cases it's going to unlock in the ad industry or the e-commerce industry.
So where is the video market going? We believe there is so much to improve. We are going to have faster and cheaper video generation until video generation basically becomes real time, meaning generating one second of video in one second. You'll be able to stream generated content to the user, and this is going to have very different implications for how people interact with this technology. Everything potentially becomes interactive. The line between games and movies gets blurred. So how is this going to impact social apps? How is it going to impact live events? If you play Fortnite or similar games, people are already having live events there.
Is it going to become more lifelike? Are people like our parents, people who are not used to playing video games, going to be part of this experience? I'm really curious about the future of this technology.
And image models are not done yet either. There have been a lot of improvements to image models in the past couple of months. FLUX Kontext and GPT-4o introduced new editing capabilities and better text rendering. At one point people thought, okay, maybe this is as good as image models are going to get, but these new releases and new capabilities are opening up more use cases in the industry. Whenever we see a technological shift like this, we see a lot of more mature players in the industry picking up these technologies. So we believe something similar is going to happen with FLUX Kontext and GPT-4o, and it's going to blend into more of the enterprise use cases people are trying to build.
And this is pretty much it. We are hiring, so please visit our website, fal.ai/careers. We are hiring machine learning engineers, inference engineers, product engineers, all sorts of positions. I'll be hanging around for the rest of the day, so find me and talk to me. I would love to discuss anything related to generative media or the industry in general. Thank you so much. [Music]
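The rough market math from the talk (20x more compute intensive, 5x more engaging, 100x to 250x overall) and the "one second of video in one second" definition of real time can be written out as a short calculation. This is a sketch under stated assumptions: it treats the factors as simply multiplying, which the talk implies but does not spell out, and the 2.5x "breadth" factor is not a number from the talk, only what the upper bound would imply. The clip timings in the last lines are made up for illustration.

```python
def video_market_multiple(compute_factor: float,
                          engagement_factor: float,
                          breadth_factor: float = 1.0) -> float:
    """Video market size relative to the image market, assuming the
    factors multiply (an assumption, not a stated model)."""
    return compute_factor * engagement_factor * breadth_factor


def real_time_factor(video_seconds: float, generation_seconds: float) -> float:
    """Seconds of video produced per second of generation time.
    A value of 1.0 or more means output can be streamed as it is made."""
    return video_seconds / generation_seconds


# Figures quoted in the talk: 20x compute, 5x engagement.
low_estimate = video_market_multiple(20, 5)

# Assumption: the extra factor needed to reach the quoted 250x upper bound,
# standing in for "it's going to impact more industries".
implied_breadth = 250 / low_estimate
high_estimate = video_market_multiple(20, 5, implied_breadth)

print(f"{low_estimate:.0f}x to {high_estimate:.0f}x")  # 100x to 250x

# Hypothetical timings: a 5-second clip that takes 60 seconds to generate.
print(real_time_factor(5, 60) >= 1.0)  # False, not yet real time
```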