That's a great point. I'm already parsing out ads to determine sponsorships, so filtering them from search should be straightforward. Thanks for the feedback :)
Do I understand Banana's pricing correctly: it costs $1.87 per hour, so an hour of audio with the large model costs you about $1? That's probably a bit too expensive compared to cloud providers.
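For anyone checking the arithmetic in the comment above, here is a rough sketch. It assumes billing is strictly proportional to compute time at the quoted $1.87 per hour, which may not match how the provider actually bills; the function names are made up for illustration.

```python
# Rough cost arithmetic, assuming cost scales linearly with compute time.
# The $1.87/hour rate and the ~$1 per audio hour figure come from the comment above;
# everything else here is a hypothetical helper.

def cost_per_audio_hour(gpu_rate_per_hour: float, compute_hours_per_audio_hour: float) -> float:
    """Cost to transcribe one hour of audio."""
    return gpu_rate_per_hour * compute_hours_per_audio_hour


def implied_compute_hours(gpu_rate_per_hour: float, observed_cost_per_audio_hour: float) -> float:
    """Compute time (in hours) per hour of audio implied by an observed cost."""
    return observed_cost_per_audio_hour / gpu_rate_per_hour


if __name__ == "__main__":
    factor = implied_compute_hours(1.87, 1.00)
    print(f"Implied compute time: {factor:.2f} h ({factor * 60:.0f} min) per audio hour")
```

In other words, about $1 per audio hour at that rate would mean the large model spends a bit over half an hour of compute on each hour of audio.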
Currently I'm using Algolia but others have pointed out some alternatives I'll have to check out. Would love to hear any feedback/ideas you have from your project!
Thanks! The search function is built with Algolia. I'm sure they support boolean operators like "AND", but I'll need to dig into their API. I think if you search for both terms, transcripts containing both should be ranked higher.
I’m doing a similar personal product. Highly recommend switching to Typesense before your Algolia trial is up. I’ve heard good things about Meilisearch but Typesense has been rock solid for me.
You might want to try semantic search instead of fiddling with keywords. Disclaimer: I'm building a plug-and-play semantic search API at https://kailualabs.com
Building out the search a little more to support exact matches would also be super useful in this flow. For example, I've been on several podcasts talking about Notebook.ai, but searching for the name also matches "notebook", which results in an unusable signal-to-noise ratio (seeing every podcast that says the word "notebook"). Likewise, it'd be great to quote-search exact matches for "Andrew Brown", instead of seeing all podcasts that mention "Andrew" or "brown".
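To make the quote-search idea concrete, here is a minimal sketch of how quoted phrases could be treated as exact matches. This is not how the site or Algolia works internally; `tokenize`, `parse_query`, and `matches` are hypothetical helpers, and the tokenizer simply assumes that dotted names such as "Notebook.ai" should stay a single token. Unquoted terms are treated as "all must appear", which is also an assumption.

```python
import re


def tokenize(text: str) -> list[str]:
    # Lowercase and keep dotted names ("notebook.ai") together as one token.
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", text.lower())


def parse_query(query: str) -> tuple[list[list[str]], list[str]]:
    # Quoted parts become phrases (token sequences); the rest become loose terms.
    phrases = [tokenize(p) for p in re.findall(r'"([^"]+)"', query)]
    loose = tokenize(re.sub(r'"[^"]+"', " ", query))
    return [p for p in phrases if p], loose


def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    # True only if the phrase tokens appear consecutively, in order.
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def matches(transcript: str, query: str) -> bool:
    tokens = tokenize(transcript)
    phrases, loose = parse_query(query)
    return (all(contains_phrase(tokens, p) for p in phrases)
            and all(term in tokens for term in loose))


if __name__ == "__main__":
    print(matches("I built Notebook.ai for writers", '"Notebook.ai"'))
    print(matches("I wrote it down in my notebook.", '"Notebook.ai"'))
    print(matches("Andrew painted the fence brown", '"Andrew Brown"'))
```

With this approach a transcript that only says "notebook", or that mentions "Andrew" and "brown" separately, no longer matches the quoted query. A hosted search engine will likely have its own phrase-query option, so it is worth checking the engine's docs before hand-rolling anything like this.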