I built an application for exactly this. It's called A Personal Search Engine, APSE for short.[0]
It OCRs screenshots and stores the text in a search index, so you can query by keyword, date, boolean operators, the whole shebang.
It's all local. It is really useful for me - yesterday it saved me after Firefox wigged out and lost all my tabs. It's in a great place to try out, and I am actively developing it.
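For anyone curious what "OCR text in a search index" looks like in the small, here is a toy sketch of the general idea. It is not APSE's actual code. It assumes the OCR step has already produced plain text, and the `ScreenshotIndex` class, its method names, and the sample screenshots are all made up for illustration.

```python
import re
from collections import defaultdict
from datetime import date


class ScreenshotIndex:
    """Toy inverted index over text that is assumed to be OCR'd already."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of screenshots containing it
        self.dates = {}                   # screenshot id -> capture date

    def add(self, doc_id, text, captured):
        self.dates[doc_id] = captured
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[token].add(doc_id)

    def search(self, all_of=(), any_of=(), none_of=(), since=None, until=None):
        ids = set(self.dates)
        for term in all_of:    # boolean AND
            ids &= self.postings.get(term.lower(), set())
        if any_of:             # boolean OR
            ids &= set().union(*(self.postings.get(t.lower(), set()) for t in any_of))
        for term in none_of:   # boolean NOT
            ids -= self.postings.get(term.lower(), set())
        # Date filter: keep only screenshots captured inside [since, until].
        return sorted(
            i for i in ids
            if (since is None or self.dates[i] >= since)
            and (until is None or self.dates[i] <= until)
        )


# Hypothetical sample data.
index = ScreenshotIndex()
index.add("shot-1", "Firefox tabs: Hacker News and Rust docs", date(2024, 1, 2))
index.add("shot-2", "Terminal: cargo build finished, Rust 1.75", date(2024, 1, 5))
index.add("shot-3", "Firefox crash reporter", date(2024, 1, 9))

print(index.search(all_of=["rust"], none_of=["firefox"]))
print(index.search(any_of=["crash", "cargo"], since=date(2024, 1, 6)))
```

A real tool would want stemming, ranking, and on-disk storage, but keyword, date, and boolean queries all reduce to set operations like these.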
Maybe stop and think before you ask this. Someone offers up an example of their hard work and you instantly accuse them of being a malware author who steals your data. Nice.
I would rather call it the newfound paranoia most legal departments now have when the IT department mentions rolling out new, unknown, non-auditable software from a new vendor that wasn't grandfathered in (i.e. the legacy Windows/Cisco deployments). I'm happy about it. I'm just waiting for them to forbid closed-source firmware/hardware.
I made an application[0] that does capture your view. It's screenshot-based. It works outside of a browser too - anything on your screen. All local too, so no privacy concerns :)
Having to pay $15/month ($180/yr!) to be able to search stuff on my own computer for years seems awfully expensive. I'd rather depend on some simple open-source piece of software that I can understand and maintain if necessary.
Yeah, the sheer idea of paying a subscription for software that is running on my computer to index local resources is crazy. This kind of software should be sold as a one-time-purchase license.
Decades ago, when I worked at Lotus, there was an amazing piece of software from them called Magellan. I remember the first time I saw someone search and find results in text documents, spreadsheets, and many other common formats of the day.
That was in 1989 and today I mostly search my computer using find and grep commands, since that's what just keeps working.
I'm building something that does this. It lets you search text in images, video, local files, anything that comes across your screen. You can manually add e.g. web pages too.
It's at https://apse.io. It has been working really well for me. I'm happy to answer questions if that sounds interesting.
Yes, this model could cause issues such as the one you describe. With phrase queries/multi-token queries this becomes less of a problem. Phrases aren't anagrams that often.
A secondary index might be needed for the most popular terms, to resolve which anagram is the right one.
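A rough sketch of what I mean, under my own assumptions: terms are keyed by an order-insensitive signature (sorted letters here, which may not match the exact scheme being discussed), so anagrams collide. The `AnagramIndex` class and the sample documents are hypothetical, and for simplicity the secondary index covers every term rather than only the popular ones.

```python
from collections import defaultdict


def signature(word):
    """Order-insensitive key: all anagrams of a word share it."""
    return "".join(sorted(word.lower()))


class AnagramIndex:
    def __init__(self):
        self.by_sig = defaultdict(set)  # primary index: signature -> doc ids
        self.exact = defaultdict(set)   # secondary index: exact term -> doc ids

    def add(self, doc_id, text):
        for word in text.split():
            self.by_sig[signature(word)].add(doc_id)
            self.exact[word.lower()].add(doc_id)

    def search(self, query, resolve=False):
        # resolve=True consults the secondary index to tell anagrams apart.
        table, key = (self.exact, str.lower) if resolve else (self.by_sig, signature)
        result = None
        for word in query.split():
            docs = table.get(key(word), set())
            result = set(docs) if result is None else result & docs
        return result or set()


# Hypothetical sample documents.
idx = AnagramIndex()
idx.add("a", "please listen carefully")
idx.add("b", "a silent night")
idx.add("c", "listen to the night")

print(idx.search("listen"))                # also matches "silent" in doc b
print(idx.search("listen carefully"))      # the extra token filters out the collision
print(idx.search("listen", resolve=True))  # the secondary index drops doc b
```

Multi-token queries only make collisions rarer, they don't eliminate them: "listen night" would still match "silent night" without the secondary lookup.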
I work a lot with image and video processing - this sounds like the kind of thing I'm interested in. Would you want to chat some time? Email is in my profile.
SEEKING WORK - Southern WI/Chicago - Remote OK - Travel possible
I'm Ryan, a software developer specializing in computer vision.
I help businesses work with imagery or video and use that imagery to create data products or get actionable information.
I have worked on projects ranging from NASA's James Webb Space Telescope, to drone startups, to enterprise software shops.
Typical computer vision projects might involve object detection/recognition, deep learning, photogrammetry, LIDAR, remote sensing, drones, or more!
Whether you're looking to build an MVP or scale an existing system, I will help you ship a product that helps you hit your next milestone.
I'd love the opportunity to work together if things are a good fit. If you'd like to know more, check out my work or shoot me an email: https://foxrow.com.
[0] https://apse.io