Shareable conversations would definitely make the tool more useful, yeah.
I really like the query-parameter approach over UUIDs, since it would make links human-readable.
On the limited dataset: Completely agree - the public files are a fraction of what exists, and I should have been clearer that this covers not all files, only the publicly available ones. But that's exactly why making even this subset searchable matters. The bar right now is people manually ctrl+F-ing through PDFs or relying on secondhand claims. This at least lets anyone verify what is public.
On LLMs vs traditional NLP: I hear you, and I've seen similar issues with LLM hallucination on structured data. That's why the architecture here is hybrid:
- Traditional exact regex/grep search for names, dates, identifiers
- Vector search for semantic queries
- LLM orchestration layer that must cite sources and can't generate answers without grounding
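The routing logic above could be sketched roughly like this. Everything here is hypothetical (the document store, the bag-of-words stand-in for real embeddings, the function names); it just illustrates exact-match-first, semantic-fallback, refuse-without-grounding:

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical corpus standing in for the indexed public files.
DOCS = {
    "doc1.pdf": "Flight log dated 2002-03-14 lists passenger J. Doe.",
    "doc2.pdf": "Deposition transcript discussing the Palm Beach property.",
}

def exact_search(query):
    """Regex/grep-style pass for names, dates, identifiers."""
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [doc_id for doc_id, text in DOCS.items() if pattern.search(text)]

def _bow(text):
    return Counter(re.findall(r"[a-z]+", text.lower()))

def semantic_search(query, top_k=2):
    """Toy cosine similarity over bags of words, standing in for a real
    embedding-based vector search."""
    q = _bow(query)
    scored = []
    for doc_id, text in DOCS.items():
        d = _bow(text)
        dot = sum(q[w] * d[w] for w in q)
        norm = (sqrt(sum(v * v for v in q.values()))
                * sqrt(sum(v * v for v in d.values())))
        if norm:
            scored.append((dot / norm, doc_id))
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]

def answer(query):
    """Orchestration: exact match first, semantic fallback, and a refusal
    path when no retrieved document grounds the answer."""
    sources = exact_search(query) or semantic_search(query)
    if not sources:
        return {"answer": None, "sources": [], "note": "no grounding found"}
    # In the real system an LLM would summarize the retrieved passages here.
    return {"answer": f"(LLM summary of {sources})", "sources": sources}
```

Note the refusal path only guarantees the pipeline won't *return* an ungrounded answer; it says nothing about what the LLM does with the passages it is given, which is exactly the objection raised below.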
"can't" seems like quite a strong claim. Would you care to elaborate?
I can see how one might use a JSON schema that enforces source references in the output, but I'm not aware of any technique that constrains a model to draw only on the grounding docs, as opposed to answering from its pretrained data (or hallucinating outright) while still listing the provided RAG results as references.
Your "can't" would be tantamount to having single-handedly solved hallucinations, which, if true, would be a billion-dollar-plus unlock for you, so I'm unsure that level of certainty is warranted.
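To make the distinction concrete, here's a minimal sketch (all names hypothetical) of the kind of citation validation a JSON schema or output parser can enforce, and why it doesn't rule out fabrication:

```python
import json

# Required top-level fields in the model's structured output.
SCHEMA_REQUIRED = {"answer", "sources"}

def validate_response(raw, retrieved_ids):
    """Check that the model's JSON output cites only retrieved documents.

    This enforces the *shape* of the answer (fields present, citations
    drawn from the retrieval set), not that the answer's content was
    actually derived from those documents.
    """
    resp = json.loads(raw)
    if not SCHEMA_REQUIRED <= resp.keys():
        raise ValueError("missing required fields")
    if not resp["sources"]:
        raise ValueError("answer must cite at least one source")
    unknown = set(resp["sources"]) - set(retrieved_ids)
    if unknown:
        raise ValueError(f"cites documents never retrieved: {unknown}")
    return resp

# A response can pass every check while the answer text is confabulated:
fabricated = json.dumps({
    "answer": "The meeting took place in 1997.",  # could be hallucinated
    "sources": ["doc1.pdf"],                      # yet the citation validates
})
validate_response(fabricated, ["doc1.pdf", "doc2.pdf"])  # passes
```

So structural enforcement catches missing or out-of-set citations, but a hallucinated claim with a valid-looking citation sails through.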
Trump famously told New York Magazine in 2002: "I've known Jeff for 15 years. Terrific guy. He's a lot of fun to be with. It is even said that he likes beautiful women as much as I do, and many of them are on the younger side."
Trump and Epstein were social acquaintances in Palm Beach and New York circles from the 1990s through the early 2000s. They socialized together at Mar-a-Lago and other venues.
If you want to understand these people, watch the Daily Beast podcast "Inside Trump's Head" with Michael Wolff. It's a little slow, but it paints a picture of their motivations, friendship, falling out, etc.
The economics are simple. When an agent guesses, it produces wrong code, failed runs, and wasted time. External context is the biggest source of those mistakes, because IDEs only index what’s in your repo. We are a complement to Cursor, ChatGPT, and Claude, not a replacement.
JetBrains is great at indexing your local codebase and understands it deeply. We don’t try to replace that. Nia focuses on external context: docs, packages, APIs, and other remote sources that agents need but your IDE can’t index.