Spotify Co-President Gustav Söderström on their future with Generative AI
Channel: Alex Kantrowitz
Published at: 2024-11-13
YouTube video id: jV4MinWX39o
Source: https://www.youtube.com/watch?v=jV4MinWX39o
We have a great show for you today, because we're sitting here in Four World Trade Center, Spotify's New York City headquarters, with the company's co-president, chief product officer, and chief technology officer. Yes, all that in one. Gustav Söderström is here. Gustav, great to see you. Welcome to Big Technology. Thank you for having me, Alex. It's a pleasure to be here. I mean, we're in a beautiful studio in your office. I've been looking around, and I just can't believe how amazing the studio is. And it's cool for me to be sitting here with you because I'm using your app every day. Spotify is the place where I touch one of my most beloved experiences, which is music. I wouldn't even call it a possession, since I'm subscribed to it. So many of us use Spotify all the time, but we hear from you rarely. So I do appreciate the opportunity to speak with you. Me too. I appreciate that, and I'm very glad to hear it. I'd love to share as much as I can about how Spotify actually works. It's a passion of mine to try to explain things and how they work, so I actually love these podcasts. In some ways, an app will determine how people experience a format, but in some ways a moment in time will determine how an app has to deal with the content within it. Yeah. And Spotify is going through both of those, and both regard artificial intelligence. I don't know if you've heard of Suno. In fact, I'm sure you have. It's one of our favorite things to use on Big Technology Podcast. Ranjan and I do this show on Friday. We built a theme song with Suno and played it, and it was a good time. And I'm curious, from your perspective running product at Spotify, how do you feel about AI-generated music? Because the songs, they're not amazing, but they're good. There have been some big hits. Do you view this as an opportunity, a threat?
Do you want it on your platform? So, the way I think about it: I'm a technologist, so obviously I'm very excited about the technology itself, and I love AI. I think it's a super impressive product. It works amazingly well, and philosophically it's very interesting that something we thought was impossible just a few years ago, that a machine could sound like something a human made, could be creative, is legitimately incredible. You prompt it and out comes a great-sounding song. It is incredible. So I think that technology is amazing. Now, my interest is to think of these technologies as tools. If you think about music, it's gone through a journey of ever more capable tools. If you go way back, if you were a musical genius, like a Bach or someone, you literally needed access to an orchestra to be able to realize that genius. Even if you could play multiple instruments yourself, you couldn't play them at the same time, so you actually needed an orchestra. Then we got to recorded music, and you could record one instrument at a time, so you got more and more independent. Then somewhere around the '80s, the synthesizer came along, and it meant that you didn't have to be able to play all the instruments yourself. You could, quote unquote, fake the drums using the synthesizer, and the guitar, and so forth. So I think there's been this progression of more and more powerful tools that enabled more and more creativity. Then somewhere in the '90s, the DAW, the digital audio workstation, came along, and, being a Swede, I'm very proud of this, someone like Avicii came along. What is interesting with Avicii is that he was not very proficient at any one instrument, nor as a singer. So in a previous world he would not have been considered a very creative person, because he couldn't have realized that creativity without access to this tool, the digital audio workstation. It turns out he was one of the most creative people we had, and we are very, very proud of him.
So for him, the digital audio workstation was, as Steve Jobs would say, a bicycle for the mind. It meant that he could get more productive and express his genius. And the big question with this next round of tools is the same: is it amplifying creativity, or is it replacing people? I think it's amplifying creativity. It is giving more and more people access to being creative. You need even less motor skill on a piano or something; you need less technical skill in a digital audio workstation. So I think of them as tools. And there's this interesting question of what AI music is. I think when people say AI music, they mean something that was prompted without too much of a prompt and without too much work, so 100% AI. But the truth is that much of the music being made today is a combination. I think many of the big artists are using AI for parts of their songs, or parts of the track, or the drums, et cetera. So there's actually a scale between zero AI and 100% AI, and I think we're on a progression where it's going to be very difficult to say what an AI song is. Does it have to be 100%, 99%, 70%, 50%? But the real question is, do you welcome this stuff on your platform? Let's say somebody does prompt 100% AI. Spotify could fill up with songs that are AI-prompted. It's very easy to create these songs and then upload them to the internet. How do you feel about those? Do you want them? So there are two questions there. One is, what is Spotify about? We're a tool for creators, and if creators want to use AI to enhance their music, as long as we follow legislation and copyright law, we want them to be able to monetize their music and get paid out, right? So for us, we are trying to support creators, and the music catalog has grown tremendously since we started, from tens of millions of tracks to hundreds of millions of tracks, and I think it's going to keep expanding.
But what I think is important for us to figure out, and I think this is our job and the rest of the music industry's, is this: if you go back to the years of piracy, there was this technology called peer-to-peer file sharing that was amazing. We worked on that early on. Exactly that. We actually incorporated that technology into Spotify. But before Spotify, the technology preceded the business model. It was great for consumers, who could now get all of this music for free, but it didn't work for creators. And I think we're in the same period of time now, where the technology has preceded the business model. So I think the technology is great, but I do think we need to find a way for the creators who have participated in this to be reimbursed. That's something that we are thinking about, and the rest of the industry is thinking about, because if we can find the business model, I think we could unlock a tremendous amount. Then there's a separate question, which is whether the way these models were trained will be considered legal or not. That's a legal question that will be decided over some time period; for example, in the US these companies are now being sued. So I think that question will be decided by legislation. But let's assume there is one of these models, whether it has to be retrained on other data or not: is that an interesting tool for us? If it was trained legally, yes, if creators can participate in it. So first of all, it's good to hear that you're already thinking about issues of compensating creators and musicians, because, you know, I write text in addition to podcasting, and I know that models have trained on my text, and presumably I'm not going to see a dime on that. It's a little different with music, but yeah, if you can channel different musicians, there should be, I think, some remuneration. But I'm going to just ask one last time on this point, and then we're going to move on.
So, Meta, for instance, they have AI generators. The feeds have, I won't say filled, but there are lots of AI-generated images. They're engaging. Meta seems to be okay with this; it doesn't ban it. And now some of the top content on a Meta platform is Shrimp Jesus, which sort of combines two of people's great loves, which are God, Jesus, and seafood. And I've seen that shrimp. It's massive. These types of images are massive on Meta. Yeah. So from a Spotify perspective, if these songs generated by AI music generators become engaging, and let's say they follow the rules, is that good for Spotify? Well, I think of it like this. If creators are using these technologies, creating music in a legal way that we reimburse, and people listen to them and they are successful, we should let people listen to them. What is different, though, is that I don't think it's our job to generate that music instead of the creators, right? That's a key difference. We are a platform for creators. Then we can have a discussion on which tools they are allowed to use. Could they use the digital audio workstation but not an LLM? Maybe. But actually, we shouldn't decide that for them. There is a question of whether we should generate all the music ourselves, and that's where we're saying no, we're not going to generate that music, and other platforms maybe will, because it's cheap content, right? So that's the key difference: we decided what we want to be in this world, and it's a platform for creators. Then there's a question of which tools they are allowed to have, which is partially a legal question and partially up to the creators, I think. Okay. So there's a potential world where one of these tools seems to have violated copyright, and you might ban creators from uploading music that used that tool. We are already taking action there; we have detection systems for whether something is a derivative work of something that already exists.
So we have systems to take these down. If you're creating something completely new that isn't a derivative of anything, there isn't a copyright infringement; if there is, the labels tell us. So that's the other question, what these models were trained on, and we're not creating these models. So we're watching what happens there, and we're going to follow the law. But at a high level, I think this should be a very exciting tool for creators, for musicians, for authors, for podcasters. If you look at something like Notebook LM, for example, it was actually created by a journalist and a writer as a tool. So my bet is that these are bicycles for the mind, bicycles for the mind on steroids even, and that when those shifts happen, there's always tension between the people who didn't use these tools, who feel like this is a little bit like cheating, and the people who are saying, no, I want to be creative too. It's always a difficult transition period. It's just the story of technology. And by the way, we're going to get to Notebook LM in a bit, so I definitely want to hear your perspective on that. But let me ask this one. First of all, what you're describing is just what happens in tech companies: you think you have something figured out, and then the next thing you know, there's a new innovation you have to account for. That's kind of what makes it exciting. That's what makes it fun, that it happens. And you already have addressed where this is going. Remember, you started talking about this by saying we never could have anticipated that this was possible, and now it feels like magic: a prompt, and you get a song out. I called them great earlier. They're not great, but they're good enough. And this is literally the first generation of this stuff. It's going to get better.
And as you think deeper about it, do we get to a place where you can start to prompt music that is going to be better than any song that you might listen to, music that has been created for certain moods? For instance, let's say you're in an introspective mood, or a loving mood, or an angry mood, and you're just able to prompt it and create the song that perfectly touches the heart at that moment. I started off talking about how this format is beloved. Music is beloved. It touches the heart. And if AI can do that, does that become the future of music? You've already said you don't want to play in it, but is that something that you can discount from coming in? So, I think two things. Music is used for many different things, right? For example, music that you're using to study is, I think, a good example. The extreme version of that is people listening to white noise. So, would white noise be generated? It actually already is artificially generated. It's one of the top podcast formats, right? So there's a scale here, and I think you're right for certain things. Maybe you could create better white noise. Maybe you could create better, always-varying ambient music for your studying. Maybe for gaming, maybe that music should automatically adjust to what's happening on the screen. So I think we're going to see lots of AI-generated music for those use cases. But there's another use case which I think is very important. A lot of people use music to build their identity, right? Especially when you're a teenager. You go to a concert, you buy the jacket from that concert. Why did you buy that jacket? Well, it's like a pin. You're identifying with this band. You're building your own identity through this band. I don't think that will work with AI-generated music, because there is no one behind it. So I think, for some music, and I'm sure this is happening already...
I'm sure many publishers are generating music for coffee shops and so forth. That will probably happen. But I do think there's a human need to have someone to believe in, an actual artist that you care about. I don't think Taylor Swift will be replaced by an AI, not because the music couldn't sound similar, but because the whole point is Taylor Swift and belonging to something. So it's not a binary answer, like, is this going to happen or not? I think both will probably happen. You know, two years ago I might have fully agreed with you that there's always going to be that need for the story and the human connection, and now I'm not so sure, because I do think this stuff can be good enough. It's already exceeded some of our greatest expectations. And I think we would like to think that we want that connection with the human. But all right, let's go right into Notebook LM. But one thing to say that I think is interesting: what tends to happen in these worlds is that the thing that is scarce gets even more valuable. So one bet would be that true human connection gets more valuable than ever, when a lot of what you talk to in the future may be LLMs. That would be my bet. I'm hoping that's the case, because part of the business that I'm running is predicated on the idea that connecting to a human who can dissect and break stuff down is valuable. So I'm hoping that is the case. But I'm also not as sure as I used to be, and I think it's wise to not be sure of anything right now, given the pace of progress. And I think that brings us right into Notebook LM, which I was planning to leave for later, but you set it up perfectly. It's this Google product where you can put notes in, and then it will actually generate a podcast with two co-hosts that sound ridiculously human. Yeah.
They don't sound like robots. In fact, people have fed them scripts where they realize that they're actually not real people, they're AIs, and they just have this kind of breakdown, and it's insanely entertaining. But the bottom line is, they're not quite where they need to be. They're still a little hokey, I think: if you listen for a minute, you're blown away; if you listen for five minutes, you start to cringe. But they also do a good enough job of breaking things down that they can pass. And I've started to see them showing up in the second half of episodes, where people say, we're going to do the episode, and in the second half we're going to give you the AI version to listen to. But what happens if they end up being the first half? Spotify's made a big move into podcasts. What do you think about the rise of these AI podcast hosts? So, I think Notebook LM is very impressive, and you know, you could predict, given the evolution of voice quality and the language understanding of these models, that this would happen. So I'm not at all surprised, in a sense, that you can generate talk audio that is engaging to listen to. But what I think was the great innovation of Notebook LM was this: people had been generating monologues, and what humans really respond to are dialogues. In retrospect, it's pretty obvious; almost all podcasts are dialogues. If I sat here alone for one hour, it's not that interesting. So I think the big hack was to go through a piece of material, present it as a dialogue, and prompt it the right way. There was also, obviously, the internal Gemini model at Google, which is probably very good, and the voice models got better. But I actually think what they found was product-market fit for the actual audio format, and it turned out to be the podcast format, quite literally. That's pretty crazy.
I mean, somebody on Threads tagged me and said the male voice sounds like you. And I listened, and I thought, not the same tone, but the cadence and the type of questions, yes. So does that mean I'm just the blend of all of this, kind of the unremarkable middle of it, or did they copy my voice? I'm hoping it's the second one. It'll be interesting to see whether people get tired of hearing the same two people talk about everything, or the opposite: they get used to the same two people, would prefer to hear them, and build trust. I don't know. I think humans are very quick and prone to anthropomorphizing. It's sort of a hack on the human brain. You feel like you know these people because you've heard them talk about so many things. So I think it's very interesting. It's hard to predict where we'll go. As a platform, we view it the same way. Of course, people are uploading these podcasts to Spotify as well, and I don't know off the top of my head whether anyone has super high engagement, but certainly people are listening to them. So it's the same question. Does this turn into a tool for creative people who can write stories but don't want to build the podcast around them, or who just have no one interviewing them, so they do an interview around their own material? I think you're going to run into the same problem where, if you just ask it to talk about something, it's not going to be very good. You need good source material. So it's the same question: is this a tool for creative people to get even more productive and creative, or is it a replacement for creative people? My bet is it's another tool. It's pretty interesting, because it sort of broadens out the long tail.
And for those not familiar with the industry jargon, it's basically that a lot of listening is concentrated in a small number of shows, and then there's this great long tail, right? Like a bar chart that sweeps out into lots of seldom-listened-to shows. And the thing about these podcast generators, Notebook LM in particular, is you can take one and create a podcast for something so niche that it would never have a show. It's similar with AI code, right? You can start coding things. I think you spoke about this in your interview with Tom on Building One, another LinkedIn Podcast Network show: now you'll code things that you never would have coded before, because you can. And it might go the same way with podcasts. For instance, before I was heading down to Menlo Park to interview Andrew Bosworth, I just dumped in all my source material and created a podcast about his current statements. There were like seven interviews that he and Zuck did before I showed up there, and I was able to get the summary. That podcast never would have actually made sense to produce, but for me it made sense. And maybe that's where this goes. Yeah, I love that framing. One useful framing of these technologies is a financial framing: the cost of something goes to zero. The cost of writing code goes to zero, the cost of doing a podcast goes to zero, the cost of prediction goes to zero. What happens? Usually, the substitutes for that good get challenged, but the complements to that good explode. You have the famous example: what if the price of coffee goes to zero? Then tea is going to be replaced, but sugar, its complement, is going to explode. So I like that way of thinking about it, and I think what's going to happen is exactly what you're saying.
We're going to have enormous amounts of content around niches where it didn't make sense to produce a podcast. One way to think about it is just that the cost went to zero. So I do think the catalog is going to explode. And then what does that mean? Well, it probably means the recommendation problem becomes even more important, because now it's even harder to keep track of everything that is uploaded. I also think that if you have this vast sea of the perfect discussion around any topic, the recommendation problem becomes more valuable to solve the bigger the catalog is. But I also think you're going to see the same thing we see in music: the superstars will actually also get bigger. This is what I find fascinating. People ask, is Netflix winning or YouTube? Well, the truth is both: the tail is getting bigger, but the big shows are getting bigger too. They ask, are the indies winning, or Taylor Swift? Well, the indies are winning, but Taylor Swift is bigger than ever. I tend to see both things happening at the same time, which is why I'm hesitant to say, this is going to happen, but not that. Yep. Okay, let's talk about AI recommendation. It's a big part of Spotify, and we're going to start at the end for this conversation. Right now, we'll go into Spotify, and there'll be some algorithmic recommendation and some stuff that we choose to listen to. Your vision, if I have it right, is that eventually you want Spotify to be this ambient friend for us that knows the context of the situations we're in. Maybe AR; we were just talking about Orion glasses before we started recording. Maybe they know the context of where we are and can chime in and suggest some music that we might want to listen to. Is that right? Why would you be pursuing that?
Well, I do think of it that way. When we started Spotify... I was not part of founding Spotify. I joined in late 2008, 2009; Spotify was founded in 2006. But I was there pretty early on. And it's interesting that this was before machine learning became a thing. So Spotify was quite focused on social features for purposes of recommendation. We needed social features because that's how most people discover music, through a friend. So we wanted you to connect to people. And then AI came along, or what was called machine learning back then, and we realized something about all the playlisting data we had. One way to think about the playlisting data is almost as labeling: users creating a set for themselves were telling Spotify, these tracks go well together, these tracks go well together. So we got a lot of labeled data, basically, and we said internally: some people have a musical friend who happens to know their taste and so forth, but most people don't. Now we can build this friend for everyone. That was the AI. But the interesting thing is that this idea of building a friend for everyone who can give music recommendations, like Discover Weekly, was always an analogy. People did not think of Discover Weekly as a friend. They thought of it as a set, a service, and so forth. I think what's happening now with AI is that the analogy is actually becoming reality. So you can see us moving a little bit in that direction, and you have the AI DJ, which starts to give Spotify a voice that talks to you. And I think what is going to happen with these LLMs is that, at least for some brands, you will start having literal relationships with them. And I would love it if you come to think of Spotify as actually a friend, not an analogy anymore, but reality. This is a thing that knows me well.
This is a musical intelligence, a podcast intelligence, a book intelligence, and I actually like hearing it tell me about new things and suggest things I'm interested in. So I think that is where we're moving, and I think other brands are moving there as well. If you look at someone like Duolingo, they've actually only communicated through four characters all along. When you get a push notification, it's not from Duolingo, it's from Lily or Star or something. They really give me a hard time if I'm away for a couple of hours. And that was also kind of an analogy, but now with AI you can actually talk to these characters. So I think this is a journey many companies are on, and it's interesting to play that out. It means that part of what was called branding before is now: what personality do you want your company to have? Not as an analogy, but literally, what personality should Spotify have? I think it's a fascinating time to work in tech, and it's something we're thinking a lot about. And I think you might be underrating how much people view Discover Weekly as a friend. For folks who don't use Spotify, Discover Weekly will basically take into account your listening and your preferences and give you a playlist of 30 songs on a Monday morning, new songs for you to discover. And people will say, Discover Weekly really got me this week, or, Discover Weekly is inflicting some pain on me this week, or, what happened? I thought we had a close relationship, and now you don't know me at all. And you also have this AI DJ. You can find it in the app. It's okay, I think. [laughter] I'm curious: the feedback I've heard is that people were excited about it initially and have gradually moved away from it. So now I'm sitting in front of the person running product at Spotify. What is actually happening with this AI DJ?
Is the experience there, and are people using it? Yeah. In the numbers, they're not moving away from it. It's actually very successful. So my friends are just pretty snobby music listeners. Well, for the people who use it, it's actually their biggest set; it's bigger than their Discover Weekly usage. So it's quite a binary experience. For people who don't know what they want to listen to and just want to put something on, it's working very, very well. What I would say, though, is that when we launched the AI DJ, the big innovation there was that we managed to digitize the voice of a real person and make it sound very believable. But the things it said around the music were, to some extent, heuristics, and kind of repetitive after a while. So what we've done since then, and this is quite recent and rolling out now, is invest quite a lot in LLMs that actually tell interesting stories about the music, and we see very strong effects from this on the retention of the application. Whereas the thing used to say, here's this and this song from this and that artist, I think you'll like it, now we can say things like, this artist was just in Copenhagen, or has played here before. You're starting to get interesting stories. It's starting to feel more personal. The other thing that I think is missing, which I hope we can do someday: it can talk to you, and you can talk back by skipping. But obviously, in the age of talking to machines, you would like to be able to just talk to it and say, no, this was not very good, my Discover Weekly this week was not what I wanted, and give actual feedback. That is technically very possible now with these LLMs. So that's what I'm hoping will happen. This should not be a one-way relationship, which Spotify has been for technical reasons. It should turn into a two-way relationship. Okay, I have questions about that coming up.
And to introduce that segment, I want to talk to you a little bit about how much we should allow the algorithms to dictate our music and podcast experience, versus how much should be dictated by us. How much agency should we have over our own choices? Kyle Chayka, a New Yorker writer, recently wrote about how he's leaving Spotify. I'm just going to put the argument out there and hear what you think, and I'll read it straight from the story. He writes: "Through Spotify, I can browse many decades of published music more or less instantly. I can freely sample the work of new musicians. It has become aggravatingly difficult to find what I want to listen to." With a recent product update, he says, it became clearer than ever what the app has been pushing him to do: listen to what it suggests, not choose his music on his own. What do you think about that argument? Well, I think this is one individual's feedback, but generally you have very different types of users. So, I'm going to get this person back on Spotify, one hundred percent. I think there is an interesting trade-off here that is real. People want less friction. They want to spend less time searching. You want to make things as easy as possible, right? But there is this end of the line where you sit there and you just receive. You're kind of force-fed, and you don't give any signal back, maybe a few clicks and so forth. And that's something that we want to avoid. I think this is where the industry is going: more towards distraction content, towards just sitting and receiving. And that's a little bit of a dystopian end of the line. So what is interesting with Spotify, which we are re-emphasizing, is that it was actually a platform where you invested quite a lot in your own playlisting, right?
And there's a trade-off here. You could have as a vision: we should be so good at machine learning that you never need to playlist again. That would be the goal, because then you've supposedly done the user a great service. But then you also receive no signal, and the user makes no investment. So we're actually re-emphasizing playlisting, your own investment, quite a lot. And, you know, over the years we've gone more towards machine learning and algorithms because it works: people listen more, and they appreciate the service more. But we need to cater to everyone, including this reporter. The Spotify user base is divided into many different kinds of people. You have the track listeners, who only listen to playlists. You have the hardcore album listeners: I just want to listen to an album the way the creator thought about it; I don't want songs in between. You have the artist-radio listeners, who only listen to one type of artist. And it's actually a big challenge to build a service that serves everyone when people are very different. So we try our best to make sure that the music aficionados who want their library to be albums can have their service, but then you have the other people who just want their daily mix to play in their ears, who just want to collect tracks. They also need to be successful. So we're trying to build and cater for both. You can never please everyone one hundred percent, but we're trying to be statistical about it, to make sure that it is vastly better for the majority of people. Still, our goal is to cater to everyone. And I do think there's a real point here: going to zero user investment seems good in the short term, but I don't think it's good in the long term, because you actually lose signal from that user, and in the end, I think they feel less participatory in the experience.
Even if the engagement looks high, if you've given no feedback, I don't know how much you feel this is actually your service. Definitely. And look, I'll confirm that Spotify does listen to user feedback. I sent a tweet out a couple years ago talking about how sometimes I'm baffled by Spotify product decisions, and maybe it was because I was a reporter, but someone from your team reached out. I talked about how I wanted to see recently played. Often I'll be listening to something, then I'll go away from it and can't find it in the app, and a couple months later there's a recently played button in the app. There are some great updates coming for you as well on that topic, because this is a big user need. Maybe it takes a little bit longer than we want, but obviously our goal is to listen to user feedback and try. But we sometimes get completely opposing user feedback. That's the tricky thing. Who do you listen to the most, the people who want this desperately or the people who hate it desperately? And there's a lot of both types of feedback. So product development at this scale is sort of a statistical experience, but you still have to have a bit of an opinion. If you only treat it as statistics, the application is going to be very weird at the end of the day. You have to combine some sort of vision and conviction, but you still have to be very data-driven. I think an interesting example of user investment and AI that we launched recently is something called AI playlisting. This is, I think, a good example of the first time you can talk to Spotify. The AI DJ talks to you, and it's getting better, but it doesn't listen; it listens to the clicks, maybe. But with AI playlisting, we built this experience where you can prompt what is effectively an LLM with what kind of playlist you want.
So we have an LLM, and LLMs have a set of world knowledge about music, but then we have the music catalog and we have your listening history. So this is an LLM that understands your particular taste, and you can ask it for a playlist with, you know, big drops and EDM for driving fast at night or something, and it will try to do that. And then you can say, no, a bit more upbeat, or not that artist, and so forth. This, I think, is a good mix of using AI, but not to force-feed you stuff. It's actually very high signal: you are literally telling us what you want, and then when we say, here it is, you say that one yes, no, no, yes, and then you can reprompt. So it's back to, I think it should be a two-way conversation. The first wave of machine learning allowed us to do the one-way push; the next wave, generative, allows us to actually listen to you, even in clear text. Communicating with Spotify just through skip buttons is a pretty narrow signal, so it's kind of hard for us to understand: when you skipped, was it because you hated it, or because you liked it but had heard it too many times? Now you can actually say, I really don't like this, remove it. So I was DMing with Kyle last night. I was like, "Hey, I'm going to meet with Gustav. What should I ask him?" And one of the things he said is, should Spotify users be able to tweak the recommendations? And your answer here is a resounding yes, and you're working on it. Absolutely. Absolutely. We are working on these things, both the obvious things where you can say, I didn't like this particular thing, but I think the free-text element is very interesting. If you could talk to it, it would probably learn much more, but you would probably also get more trust. Definitely. Let me ask you one broader question about this, because I won't stick on Kyle's stuff for the entire conversation, but I thought it was really interesting, and he wrote a book called Filterworld.
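Gustav doesn't describe Spotify's internals, but the interaction he sketches (prompt, candidate playlist, free-text refinement, reprompt) can be illustrated with a toy loop. Everything here is hypothetical: the function names, the tiny catalog, and the keyword match that stands in for the LLM.

```python
# Toy sketch of a prompt-and-refine playlisting loop (hypothetical, not Spotify's API).
# A real system would call an LLM grounded in the catalog and the user's
# listening history; a keyword/tag match stands in for the model here.

CATALOG = [
    {"title": "Night Drive", "artist": "A", "tags": {"edm", "upbeat"}},
    {"title": "Big Drop",    "artist": "B", "tags": {"edm", "drops"}},
    {"title": "Slow Burn",   "artist": "B", "tags": {"ambient"}},
]

def generate_playlist(prompt_tags, exclude_artists=frozenset()):
    """Stand-in for the LLM step: pick catalog tracks matching the prompt."""
    return [t for t in CATALOG
            if t["tags"] & prompt_tags and t["artist"] not in exclude_artists]

# First prompt: "big drops and EDM for driving fast at night"
playlist = generate_playlist({"edm", "drops"})
# Refinement: "not that artist" -> reprompt with an exclusion
playlist = generate_playlist({"edm", "drops"}, exclude_artists={"B"})
print([t["title"] for t in playlist])  # ['Night Drive']
```

The point of the sketch is the shape of the interaction: each refinement is an explicit, high-signal statement of preference, unlike a bare skip.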
The main argument (he's been on the show, I'll link it in the show notes) is that our world, mediated by algorithms, has become too bland, that the algorithms have effectively flattened out what used to be a more vibrant experience with things like music. Do you see that at all? I think this is a really interesting argument. There are two ways I want to address it. One is for Spotify specifically. We've seen the feedback that people feel like, it's great for the kind of stuff I already listen to, but I feel like I'm in a bubble. I'm getting more of the same. I'm not getting new stuff. This is sort of a Spotify-specific challenge, because most of the time your phone is in your pocket and you're listening. And when you're listening, you're listening to a session. Let's say you're listening to indie folk or something. Then it's quite easy for us to say, here's another indie folk song, and you're going to say, oh, that's a good recommendation. But if we start playing Metallica there, you're going to be like, what is this? So most of the recommendation inventory we have is naturally constrained to what you're already listening to, because we can't put in very random things; you would say this is a bad recommendation. So this is a challenge for us when we want to show you something completely new. The favorite example is: I love reggaeton, but you wouldn't have seen that from my listening history. How do we solve that problem? So we started investing about two years ago in other types of foreground recommendation. Sort of like the feeds that you see on social media, but you can literally say, okay, I'm bored, I want to go wide. Then you can go into these foreground feeds of music where you can swipe through many tracks, and they're very efficient.
The hit rate is going to be low, because now we're in a territory where the whole point is that we don't know whether you like this. So our hit rate is going to be low. Then I think you need a very efficient UI to evaluate lots of content, right? Because the hit rate may be one in 20, and you're not going to listen to 20 songs; that's over an hour of music. You need to go quick. So we try to solve that problem for when Alex is bored and wants to branch out, as soon as we see that signal. We didn't have tools for that before, so we built them. So that's part of the answer: Spotify being an audio service made it a bit harder to go explore, so now we have these foreground feeds. They have music videos, not in the US yet, but in much of the rest of the world we have music videos. They're very helpful when you're evaluating new music. But the more philosophical part of this answer is: did the algorithms flatten things out, because they are to some extent trying to find statistical patterns and averages? I think if you look at recommendation technology, and I don't think this is widely known yet, these deep-learning-based systems had flattened out: if you added more user data or more parameters, they did not get better the way LLMs do. There were no scaling laws. It is what it is, and you could move it 2%. There's something that has happened recently called generative recommendations, where you use a sort of large language model instead of these old deep learning models, and you basically think of user actions as a language. So for us you have a sequence: they click this, they listen to that, they click this, they listen to that. And if you turn that into tokens, just as you can turn a language into tokens, then just as you can try to predict the missing word in a sentence, you can try to predict the missing action in a sequence.
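The "user actions as a language" idea above can be sketched minimally: each interaction becomes a token, and a model predicts the next token in the sequence, exactly as a language model predicts the next word. This is an illustration of the concept only, not Spotify's system; a real generative-recommendation model would use a transformer, while a bigram count table stands in for it here.

```python
# Toy illustration of "user actions as a language" (not Spotify's system).
from collections import Counter, defaultdict

# Hypothetical action tokens: (verb, item) pairs flattened into strings.
history = ["click:trackA", "listen:trackA", "click:trackB", "listen:trackB",
           "click:trackA", "listen:trackA"]

# A bigram count table stands in for the large sequence model a real
# generative-recommendation system would train on billions of such tokens.
transitions = defaultdict(Counter)
for prev, nxt in zip(history, history[1:]):
    transitions[prev][nxt] += 1

def predict_next(token):
    """Most likely next action after `token`, by observed frequency."""
    return transitions[token].most_common(1)[0][0]

print(predict_next("click:trackA"))  # listen:trackA
```

The claim in the interview is that, unlike the older deep-learning recommenders, models of this sequence-prediction form keep improving as you add data and parameters, the same scaling behavior seen in LLMs.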
And it turns out that these generative recommendations do scale with more user data and more parameters, just like the LLMs. So this is a long-winded way of saying I think he's right that the recommendations did flatten out. It's also true that people are now changing their recommendation stacks, and it's unclear why they couldn't continuously get better. So I'm hoping that the recommendations do get more intelligent, because now it's not just a statistical average. They can look at your specific user history going years back, and they could potentially understand that it's actually, you know, Christmas again, and last year at Christmas you did this. So I'm hoping it gets more intelligent. And one last question about recommendations, or maybe I have two, but one important one that comes from Ranjan Roy, who's on the Friday show with us. He would like there to be a parent mode on Spotify, where if you have kids you can say, I'm on child mode, recommend kid music, and then parent mode, and don't blur my recommendations. What do you think about that? So we have a bunch of different solutions for this. Obviously, there's a family plan, so hopefully your kid can have their own account, and then it doesn't cost more. And the recommendations? Exactly. But what are you going to do for your three-year-old? Exactly. The other thing is you can create a playlist for your kid, and if you click into the settings, you can say, do not include in my recommendations, and then it actually doesn't destroy your recommendations at all. So there are those solutions. We're also trying to understand when something is kids music, so that while it is part of your taste profile, we should not play it in your other sets, because it's probably something you're doing for a specific use case. So you probably want a kids music playlist in there, but you don't want that music to affect your other sets. There's an algorithmic component.
There's a subscription plan component, and then it's back to more user control. You can already say that this playlist should not be considered my taste, and we're going to build more of those controls. Okay, Ranjan will be happy to hear that. Yeah. Okay, really last question about recommendations, then we're going to go into podcasts and some other formats. I don't know if you have seen this YouTuber, his name is Fantano. He did this thing about the Shaboozey song being the song of the summer, explaining why. And he made an observation there that was interesting to me, talking about how we used to hear music on the radio often, and the music that was played there was music that would often be played when we're with other people, with friends, having a good time. And it led to more, you know, dance songs, rock anthems, and stuff like this. And today we're mostly accessing music via streaming platforms, and he says those are much more individualized recommendations, which has kind of shifted the way that music is made, and even the hits in music. What do you think about that argument? So there is a philosophical question there, which has been researched a few times, which is: do you have an innate taste in your brain, and our job is to search for it and find it, or does what we play actually affect what you like? There are all these experiments in colleges where you play different songs to different groups and then you see what they like, and it seems like it's a bit of both. You have some sort of innate taste, but you're also affected by what you hear, which supports this argument that the radio can change your taste. So I think there's truth to that argument. What I think is interesting about our music listening is that when we survey users and ask them what percentage of your listening is with others, it's a huge percentage, a double-digit percentage.
So music is actually still a very social activity, and in some cases we see this. We have this feature called Jam that is taking off like a rocket for us. It's doing very well. Jam is essentially: we can detect when two phones are close to each other, and it's just like, hey, do you want to join Alex's jam, and now we have a joint queue. So at a party, the way you party right now with Spotify is you don't go and interrupt. You just bring up your phone, you join the queue, and then you can queue things up. So we have a lot of joint listening, and like I said, I don't want to say the exact percentage, but it's a double-digit percentage of listening happening in groups. It just looks like individual listening to us. So I think it's actually happening more than maybe people think. It's not 100% individual listening. But because we don't see them as group listening, we're still treating them as individual listening. So now that we're getting more data on what is good group music, that becomes a different category. So I think the radio use case is happening. You're hearing songs at parties, with others, when you're riding in the car, and so forth. It just looks to these services like lonely listening, but it's actually quite social, right? Okay, let's take a quick break and come back to talk about podcasts, audiobooks, and see how many random questions I can get to before our time is out. We'll be back right after this. And we're back here on Big Technology Podcast with Gustav Söderström. He's the chief product officer, chief technology officer, and co-president of Spotify. So Spotify has invested heavily in podcasts. This has been going on for a long time, first largely through an originals strategy, and now less so. Also audiobooks. You can find my book, Always Day One, on Spotify if you're a premium listener, which I'm happy about, because more people can listen to the book.
What has gone into the decision to bring all these formats together in one app? And are they good businesses for you, podcasts and audiobooks? Yes. If we start with the first one, how did we come to this decision? What happened is that we saw internally at Spotify a lot of our developers hacking podcasts, using RSS, into the Spotify experience. We saw it again and again at hack weeks, and at first we thought maybe it's a niche, random need, but we saw it again and again. It's like user feedback or user research; you know, Spotify is many thousands of employees, so it's not a very representative sample of society, but it is some sample of society. So if you see the same user need many times, you should take it seriously. So we started looking at that, and then we looked at podcasts, which we saw had a lot of potential and were growing, but we didn't think anyone was doing something very interesting with them. So we decided to approach it, because we saw the user need internally, we saw the market growing, we sized it, and we saw that there was no one really investing in it. Apple hadn't invested in it, and they had like 98% of the market. So that's how we came to it. And then the question is... Yeah, that Apple podcast app needs work. Okay. But sorry, go ahead. But we were grateful for that. So then the question is, why in the same application? Why not as a separate application? There are two views of that. One is that it's a strategic decision. The biggest barrier to something new right now, unfortunately, isn't necessarily the quality of the application. It's the user acquisition cost. Distribution is everything. Distribution is still everything. At the beginning of the iPhone era, there was a lot of organic distribution. People went to the App Store every day. No one goes there anymore.
So you almost have to pay for your users. User acquisition cost is probably the biggest inhibitor to most business plans. So if we built a separate app, we would have to reacquire our own users, and that would make it very expensive. We have seen all of these big American tech companies launching app after app, and basically nothing worked. Then we look at China, which has a different strategy, the super apps, where they double down on their own distribution, so you can think of it like podcasts pre-installed. So that was the strategic angle for why this made sense. But I actually have a user angle on this, where I think it is the better experience. I think in 2024 the user should not have to adapt to the software; the software should adapt to the content. So if you play a piece of music, there should be skip buttons. If you play a podcast, it's not rocket science to change the skip buttons to a 15-second scrub. And if you play an audiobook, to change them to chapters. Like, come on, it's 2024. Why do you have to switch apps for that? Right. So we actually both believe that it was strategically the best for us, because then we could double down on our own distribution, but we also think this is, long-term, the right user experience. It is the easiest for the user. Now we have these beautiful connections between the audiobook and the author being interviewed in a podcast on the same topic, where it's seamless, instead of: now you should switch the app and go somewhere else. So that's the reason we do it in the same application. And talk a little bit about discoverability, because that's the biggest issue for podcasts. As a company that's an expert in recommendations, which we've spent most of this show talking about, that should be something that you get done pretty well.
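Gustav's point that the software should adapt to the content (skip buttons for music, a 15-second scrub for podcasts, chapter controls for audiobooks) is simple to sketch as a lookup from content type to transport controls. The names below are illustrative, not Spotify's actual UI code.

```python
# One player surface whose transport controls depend on the content type.
# Control names are hypothetical, chosen to mirror the examples in the interview.

CONTROLS = {
    "music":     ("previous_track",   "play_pause", "next_track"),
    "podcast":   ("back_15s",         "play_pause", "forward_15s"),
    "audiobook": ("previous_chapter", "play_pause", "next_chapter"),
}

def controls_for(content_type: str):
    """Return the transport controls the player should render for this content."""
    # Fall back to music controls for unrecognized content types.
    return CONTROLS.get(content_type, CONTROLS["music"])

print(controls_for("podcast"))  # ('back_15s', 'play_pause', 'forward_15s')
```

The design point is that one app can serve all three formats by switching controls at render time, rather than asking the user to switch apps per format.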
But for instance, if I'm listening to tech shows and I'm not listening to Big Technology Podcast, I probably want to see that there's a show called Big Technology Podcast out there. And from what I've heard, discoverability, both from product people and from podcast producers, has been the biggest issue. Probably because there's a huge investment that goes into listening to even the first 5 minutes of a show; that's like 2 minutes longer than your average song to try out a new show. I mean, I actually changed my show so that we could do a really information-rich intro, which you just experienced, and then take a break and come back in, because if people are going to try it out, I want them to know what they're getting, versus the typical long way.