Books reimagined: AI to create new experiences for things you know — Lukasz Gandecki, TheBrain.pro
Channel: aiDotEngineer
Published at: 2025-07-22
YouTube video id: Kcka7rzcxLk
Source: https://www.youtube.com/watch?v=Kcka7rzcxLk
So, my name is Lukasz Gandecki, I've been programming since I was a little kid, and I want to tell you about my newest project: Books Reimagined, or how to use AI to create new experiences for things you already know.

So, how did it all start? I was reading a book about Donald Trump's re-election and, since as you can hear I'm not from the United States, there were a few too many characters for me. I couldn't follow everyone. So I decided to vibe code my way to understanding. I built a little AI companion application. It looked terrible, but it gave me context for the people who were on the page: it found images for them and gave me a short summary in the context of the page I was on. And a month later it turned into something different.

So, this is one of the first experiences we've built. This is the Snow Queen book, and this is the part where the sorcerer's apprentices are flying away with the mirror that distorts reality. All right. So, it tells the story of how they're flying and flying and heaven is so far away. There's music and it reads aloud, but you can't hear it. I'm sorry. But then the crash happens, the mirror shatters, and it distorts everything all around.

So, this is one of the first experiences we've built, but it's all in Polish. So I want to demonstrate one that we built just for this conference that's in English. This is 1984, and what's interesting here, which I don't think I'll be able to show you, is that you can send a quick voice note to the book and ask what's going on in this scene right now. I don't really have audio. But the point is that there are many different AI voice assistants, and almost all of them, if not all of them to be honest, are just terrible. Siri is terrible. We had a demo from Google yesterday.
They were saying up front that it works 50% of the time. Usually there's a delay. They start talking at the wrong time. They interrupt you. So we've built a system here where you hold it down to specify when you are speaking, then you let go, and it responds to you immediately, in 100 milliseconds. And then you could scroll further and ask a question like, "What happened between the last time I asked you a question and now?" and it can summarize what's going on. So, you'll have to believe me; you can check later on bookgenius.net.

Another thing we were thinking about is search. Searching is a very common thing. The most normal search would be just exact search. But the way our brains work, they don't memorize the pages. So if you want to find the scene where Winston met O'Brien, exact search is not going to work, but embeddings work. You can quickly find the scene you were thinking about this way, go to that spot, read a bit more, and then go back to the place where you were reading.

But there's also one more thing you can do. You can basically say, "Talk about all the ways the Party's propaganda works," and you can do deep research: it's going to actually read the whole book up to the point you've reached to give you the answer. So that's very useful. It's going to take a couple of minutes. I'm going to go back to the presentation.

So, I started with vibe coding: vanilla JS, very confusing code, but it gave me the freedom to iterate very quickly. You basically don't know what you don't know, and especially right now, the time it takes to plan everything up front is often wasted, because you can much more quickly just tell your thinking to the AI and generate something that works, and then you see, oh, that's actually not that great. Let's try this and that.
And I realized that throwing away code that you poured your heart into often feels terrible. You're invested, you've spent so much time. But throwing away code written by AI actually feels great.

So I would describe this as waves of changes. Basically, once I start feeling that I don't rewrite the whole code base day after day, that the amplitude of the waves is getting lower and lower, there comes the time when I can start old-school engineering. I can start adding tests and refactoring.

But there are traps in refactoring. Do I refactor the worst piece of code? I would suggest that it's better to focus on low-hanging fruit. For example, I had a piece of code for OpenAI audio processing. It's JavaScript, very quickly written, no types, very confusing, but I never have to touch it. So I'm not refactoring it, although it was very tempting. We often think about refactoring by adding these up: how bad, how painful, how easy. But if something is very bad and very easy to change, but it's not painful at all, then it's probably not a good idea to change it. So I would suggest that it's better to look at how bad the code is, multiplied by how painful, multiplied by how easy. When all those factors are taken into consideration, then it starts making sense to make a decision.

So, a lot of the AI experiences that we see and talk about are basically either ChatGPT wrappers, or image generators, or half-working, useless voice assistants, including Siri. Our approach was to hide the AI from the user. When we produce the books, the AI does the initial draft and we do the rest, and I would argue that the human touch is invaluable in situations like this. AI cannot tell if the music that it generated is not good. It cannot say if the graphics are good-looking, or if the avatar actually matches the vibe of the person the book is talking about. So, you want to make the AI disappear.
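The refactoring heuristic described above (how bad, multiplied by how painful, multiplied by how easy) can be sketched as a small scoring function. This is only an illustration: the 0 to 1 scales, the module names, and the numbers are assumptions made up for the example, not values from the talk. It compares the additive view with the multiplicative one.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str       # hypothetical module name
    bad: float      # how bad the code is, 0..1 (assumed scale)
    painful: float  # how much it hurts day to day, 0..1
    easy: float     # how easy it is to change, 0..1


def sum_score(c: Candidate) -> float:
    # The additive view: a zero in one factor barely matters.
    return c.bad + c.painful + c.easy


def product_score(c: Candidate) -> float:
    # The multiplicative view: any factor at zero removes the candidate.
    return c.bad * c.painful * c.easy


def rank(candidates, score):
    # Highest score first.
    return sorted(candidates, key=score, reverse=True)


# Illustrative numbers only.
CANDIDATES = [
    # Ugly and easy to change, but never touched, so it causes no pain.
    Candidate("audio_processing.js", bad=0.9, painful=0.0, easy=0.9),
    # Mediocre code that gets in the way regularly.
    Candidate("scene_player.js", bad=0.5, painful=0.6, easy=0.6),
]

if __name__ == "__main__":
    for label, score in (("sum", sum_score), ("product", product_score)):
        print(label, [c.name for c in rank(CANDIDATES, score)])
```

With these made-up numbers, adding the factors puts the never-touched audio code first, while multiplying them drops it to the bottom because its pain factor is zero.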
And multiple things connected together, simple building blocks, make for a magical experience for the reader. There's nothing really new here. You could already ask a friend a question about the book. But is your friend available 24/7 and all-knowing? Probably not. You can already search, but is the search spoiler-free? Is it natural-language search or exact match?

I think that beautiful graphics help you get into the mood of the book and help you with character recall. And music that matches the scene makes the experience like watching a movie. We know that music influences emotions hugely, and it's very nice when you're reading the book and the music just flows with the book and gives you this great experience.

So, nothing new, but at the same time completely new, which is what AI allows us to do nowadays, and I would encourage everyone to think about those tiny little niches where we can create completely new experiences on top of something that we've known for such a long time. In thousands of years it was never possible to read books like this, nor even really to produce books like this, because if I had to do all those graphics and music for every single book, it would cost me, I don't know, 100,000 dollars per book. So it never made sense to do this.

So, how do we do this? The process is: we use a combination of LLMs to do the scene analysis and book character detection. We give the AI an overall music theme. So for Sherlock Holmes books, for example, we say that it's Victorian London and all that, and it should be noir music, kind of on a sad note. So with scene analysis plus mood detection we do music generation, and we also produce structured XML with metadata.
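The talk does not show the actual XML schema, so the following is only a sketch of what consuming such metadata could look like. The element and attribute names (`scene`, `character`, `ref`, `mood`) and the sample sentence are assumptions invented for this example. It uses Python's standard `xml.etree.ElementTree` to pull out the characters present in a scene (for avatars), the mood (for music), and the plain reading text.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated scene; the real format is not given in the talk.
ANNOTATED = """
<scene id="s1" mood="tense">
  <p><character ref="winston">Winston</character> looked up as
  <character ref="obrien">O'Brien</character> crossed the room.
  <character ref="winston">He</character> said nothing.</p>
</scene>
"""


def scene_info(xml_text: str) -> dict:
    """Extract what a player might need from one annotated scene."""
    scene = ET.fromstring(xml_text.strip())

    # Unique character references, in order of first appearance.
    characters = []
    for element in scene.iter("character"):
        ref = element.get("ref")
        if ref not in characters:
            characters.append(ref)

    # Reading text with the markup removed and whitespace normalized.
    text = " ".join("".join(scene.itertext()).split())

    return {
        "id": scene.get("id"),
        "mood": scene.get("mood"),
        "characters": characters,
        "text": text,
    }


if __name__ == "__main__":
    info = scene_info(ANNOTATED)
    print(info["characters"], info["mood"])
    print(info["text"])
```

Keeping the annotations inline with the text means the same file can drive both the reading view and the list of avatars to display for the scene.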
So, for example, we have a text like this, and AI is very good at doing this kind of mapping, which is then very easy for us to use in the book: we can, for instance, display the avatars of the characters that are in the scene. It would be very time-consuming for a person to go through the whole book and map every single thing like this.

So, today we are open sourcing the player, so anyone can create Netflix-style experiences for books. And if you want AI that feels like magic, not like chatbots, come talk to me. We build AI experiences that ship and delight, not slides. Although I hope the slides were nice. So, thank you, and you can find me at those places.
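The embedding-based, spoiler-free scene search from the 1984 demo can be sketched roughly as follows. This is not the project's implementation. To stay runnable without dependencies, `embed` here is a toy word-count vector standing in for a real embedding model, so it only matches shared words and does not capture meaning the way real embeddings do. The passages are made-up one-line summaries, and `read_up_to` is an assumed parameter for the reader's current position. What the sketch does show is the retrieval step: score each passage by cosine similarity to the query, and only consider passages the reader has already reached.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, passages: list, read_up_to: int, top_k: int = 1) -> list:
    """Return indices of the best-matching passages at or before read_up_to."""
    query_vec = embed(query)
    # Spoiler-free: ignore everything after the reader's position.
    visible = list(enumerate(passages[: read_up_to + 1]))
    ranked = sorted(
        visible,
        key=lambda pair: cosine(query_vec, embed(pair[1])),
        reverse=True,
    )
    return [index for index, _ in ranked[:top_k]]


# Made-up passage summaries, in reading order.
PASSAGES = [
    "Winston writes in his diary at home.",
    "Winston meets O'Brien in the corridor at the ministry.",
    "Winston and O'Brien meet again much later in the story.",
]

if __name__ == "__main__":
    print(search("where Winston met O'Brien", PASSAGES, read_up_to=1))
```

In a real system the passage vectors would be computed once and stored, and `embed` would be swapped for calls to an embedding model, which is what lets a query like "met" find a passage that says "meets" or describes the meeting in other words.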