YouTube takes freely uploaded podcast clips and charges us outrageous premium fees to view them.
The obvious alternative is a podcast app, but I like YouTube's 10-20 minute clips and recommendation engine. So I built podtoc.com to serve LLM-generated podcast clips, YouTube-feed style.
The Tech:
- LLM Pipeline: I built a pipeline to extract meaningful clips from long-form content, each designed to capture one of the core insights covered in the podcast (a rough sketch of the prompt side follows after this list).
- Recommendation Engine: Suggests clips based on previous listening to solve the discovery problem.
- The App: A React Native (Web/iOS) app featuring a "swipe to next" UI for seamless browsing.
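For anyone curious about the pipeline, here is a minimal sketch of how the clip-extraction call could be structured, assuming the google-generativeai Python SDK and Gemini's native audio input. The model name, prompt wording, and JSON schema are illustrative, not the exact production setup:

    import json
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")

    # The prompt spells out what "meaningful" means: one self-contained
    # insight per clip, no intros, ads, or housekeeping.
    PROMPT = """You are selecting clips from a podcast episode.
    Return a JSON array of 3-5 clips. Each clip must:
    - cover exactly one core insight and stand on its own without outside context
    - be roughly 10-20 minutes long
    - exclude intros, ad reads, and housekeeping
    Each element: {"start": "HH:MM:SS", "end": "HH:MM:SS", "title": "...", "insight": "..."}"""

    def extract_clips(audio_path: str) -> list[dict]:
        audio = genai.upload_file(path=audio_path)       # Gemini's native audio understanding
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(
            [audio, PROMPT],
            generation_config={"response_mime_type": "application/json"},
        )
        return json.loads(response.text)                 # timestamps still need validation downstream

Most of the leverage is in the prompt: pinning "meaningful" to a single self-contained insight with hard length constraints matters more than any other knob.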
If this sounds like a problem you’ve faced, I’d love to hear:
1. Which podcasts would you like to see added to the library?
2. Any feedback on the UI or bugs you encounter?
3. Any questions about the pipeline or suggestions for the recommendation logic? (I'd love to open-source it after some cleanup)
Check it out: https://podtoc.com/app
Nice! I would love some insight into how you identify the meaningful clips (i.e., how to explain to the LLM what "meaningful" means for a given piece of content). I have to build a similar tool internally, and that's the question I'm trying to find a good answer to right now.
Regarding your UI, it's nice. I would suggest adding a basic volume control in the player. Also, adding a search bar with autocomplete or suggested queries could make the interface more engaging for new users and more practical for returning ones.
At the next level, you could try to make a TikTok for audio: a scrollable vertical view with animated audio waveforms (listening while seeing something nice is a good way to hook people in) and generated subtitles. Seeing the text of what you're listening to increases focus.
Thanks for checking it out!
A couple of tips on the audio front:
1. Gemini has native audio understanding, so I'd recommend uploading there and iterating on the prompt until its output matches what you're after.
2. For audio over an hour, I found that chunking it into 45-minute segments made it easier for Gemini to return reliable timestamps.
3. You do need to check the LLM's output for valid timestamps - it can go off the rails. (A rough sketch of the chunking and validation steps is below.)
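A minimal sketch of those two steps, assuming pydub for splitting the audio; the segment length and validation rules mirror the tips above but the exact code is illustrative:

    from pydub import AudioSegment

    SEGMENT_MS = 45 * 60 * 1000  # 45-minute chunks keep Gemini's timestamps reliable

    def chunk_audio(path: str) -> list[str]:
        """Split long audio into 45-minute segments and export each as mp3."""
        audio = AudioSegment.from_file(path)
        paths = []
        for i, start in enumerate(range(0, len(audio), SEGMENT_MS)):
            out = f"{path}.part{i}.mp3"
            audio[start:start + SEGMENT_MS].export(out, format="mp3")
            paths.append(out)
        return paths

    def to_seconds(ts: str) -> int:
        h, m, s = (int(x) for x in ts.split(":"))
        return h * 3600 + m * 60 + s

    def valid_clip(clip: dict, segment_seconds: float) -> bool:
        """Reject clips whose timestamps are reversed or fall outside the segment."""
        start, end = to_seconds(clip["start"]), to_seconds(clip["end"])
        return 0 <= start < end <= segment_seconds

    # Note: timestamps come back relative to each segment, so add the segment's
    # offset to recover episode-level start/end times.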
I'll add search (reusing the existing vector embeddings from the recommendation system) and audio waveforms to the feature list - great ideas!
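For the curious, a minimal sketch of how that search could work over the clip embeddings, assuming Gemini's text-embedding-004 model and a simple in-memory cosine-similarity lookup; both are assumptions for illustration rather than the production setup:

    import numpy as np
    import google.generativeai as genai

    def embed(text: str, task_type: str) -> np.ndarray:
        # text-embedding-004 via the google-generativeai SDK (API key assumed configured)
        result = genai.embed_content(model="models/text-embedding-004",
                                     content=text, task_type=task_type)
        return np.array(result["embedding"])

    def search(query: str, clip_texts: list[str], top_k: int = 5) -> list[int]:
        """Return indices of the clips whose title/summary best matches the query."""
        docs = np.stack([embed(t, "retrieval_document") for t in clip_texts])
        q = embed(query, "retrieval_query")
        scores = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q))
        return list(np.argsort(-scores)[:top_k])

The same document embeddings can double for recommendations: average the embeddings of a user's recently played clips and rank the catalog by similarity to that vector.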