Often Wrong

Thoughts on Personalized Podcasts

Podcasts have exploded in popularity (about half of Americans have listened to a podcast in the past year). What started as a niche medium has become a platform for everyone from celebrities to politicians to business leaders. We're even seeing podcasts dig into deep technical content, including scientific papers. Recently, tools such as NotebookLM and ElevenLabs Reader have appeared that let you turn pieces of text into podcast episodes using AI.

Despite this innovation, the core experience of listening to a podcast episode, regardless of how it was generated, remains static. Once the hosts finish recording, editing, and publishing, you get the same show as every other listener; the episode does not adapt to your interests. For example, as a long-time listener of a podcast, say ATP FM, you get to know the hosts and their personalities over time. After a certain point, some of the content in a weekly episode might be repetitive to you. The hosts might share their opinions about a new product that just came out or discuss some news, but because they are also catering to new listeners, they understandably have to preface the opinion with some amount of context. With podcasts today, you have two choices: listen to the segment in full, or skip the entire chapter about the product (assuming the hosts have enabled chapters). Neither is desirable.

Now, imagine an advanced AI system that reconfigures each podcast on the fly, sculpting, editing, and remixing the actual audio in real time to suit your preferences. This goes beyond asking your voice assistant to play, pause, or skip a chapter: the podcast itself becomes an interactive experience, shaped by your prompts. This post collects my thoughts on how this deep integration could work and what benefits it could provide.

Current Use of AI in Podcasts

Two big AI use cases have gained attention so far:

  1. Automatic Generation: Tools like NotebookLM generate a fresh podcast-style dialogue from a provided document. Simon Willison has a few examples in this post.
  2. Transcript Services: Apple Podcasts recently introduced interactive transcripts, letting you scroll through text and tap any sentence to jump right to that point in the audio. To be fair, this is a great feature. If you remember a specific moment in a show, you can skim the transcript and tap to play from there rather than manually hunting with the scrub bar. The downside is that it still requires you to look at your phone and interact with it.

An AI-integrated Podcast Experience

Imagine a podcast player that is always listening for a wake word or phrase, similar to how your phone listens for "(Hey) Siri". In this deeply integrated system, the AI could remix the content itself on your command in a couple of different ways. For now, I'm assuming the hosts give consent for this player to modify their content in a few limited ways. This could allow, for example, skipping a host's recap of context you already know, or trimming a segment that has become repetitive to you.
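To make this concrete, here is a minimal sketch in Python of the kind of remixing such a player might perform. Everything here is my own invention for illustration: the `Segment` structure, the toy text-command grammar, and the sample episode data. A real player would operate on actual audio and chapter metadata from the feed, and would route spoken input through speech-to-text and an intent model rather than matching strings.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    title: str
    kind: str   # "recap" (background context for new listeners) or "main"
    audio: str  # stand-in filename; a real player would hold audio samples

def remix(segments, command):
    """Return the play queue after applying a listener command.

    Supports two toy commands: "skip recaps" drops all background-context
    segments, and "skip <topic>" drops segments whose title mentions
    the topic. Anything else leaves the episode untouched.
    """
    if command == "skip recaps":
        return [s for s in segments if s.kind != "recap"]
    if command.startswith("skip "):
        topic = command[len("skip "):]
        return [s for s in segments if topic not in s.title.lower()]
    return list(segments)

# Hypothetical episode layout: a recap chapter sits between two main chapters.
episode = [
    Segment("Intro", "main", "intro.wav"),
    Segment("What the new product is (recap)", "recap", "recap.wav"),
    Segment("Our opinions on the new product", "main", "opinions.wav"),
]

queue = remix(episode, "skip recaps")
```

A long-time listener saying "skip recaps" would then hear only the intro and the opinions chapter, without having to skip the whole product discussion by hand.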

Concerns

The obvious one is that creators may not be okay with AI rearranging or splicing their content in ways they did not intend. Even if they do grant consent to such a tool, I feel the AI-generated bits would need to be marked clearly: by using a distinctive voice instead of mimicking the hosts' voices, or by announcing before and after an AI-generated clip plays.
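The announce-before-and-after idea could be mechanical on the player's side. Here is a small sketch, again with invented names: each queued clip carries a flag saying whether it was AI-generated, and the player wraps flagged clips with pre-recorded disclosure markers spoken in a distinctive non-host voice.

```python
def with_disclosure(queue):
    """Wrap AI-generated clips with spoken disclosure markers.

    `queue` is a list of (clip, is_ai_generated) pairs; clips are
    stand-in filenames. The marker clips are assumed to be
    pre-recorded in a voice clearly distinct from the hosts'.
    """
    AI_START = "ai_disclosure_start.wav"  # "The following is AI-generated."
    AI_END = "ai_disclosure_end.wav"      # "End of AI-generated clip."
    out = []
    for clip, is_ai in queue:
        if is_ai:
            out.extend([AI_START, clip, AI_END])
        else:
            out.append(clip)
    return out

playback = with_disclosure([("host_intro.wav", False), ("ai_summary.wav", True)])
```

The listener always knows when the hosts stop talking and the AI starts, which keeps the personalization from blurring into impersonation.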

Assuming this is carefully implemented, I do believe it could add value to the listening experience by providing absolute personalization to the user, without taking away the creative voice of the host(s).

#Podcasts