Not at the moment -- we're currently searching the abstracts of most major journals (abstracts are public even for paywalled papers), as compiled in the Semantic Scholar database (https://www.semanticscholar.org/about/publishers).
In short, yes, though it's geared toward topic search.
From a strategy perspective, we designed it for topic search because it makes more sense to find everything on a topic first, then filter for the most recent, if recency is what you want. There is a lot of useful information in older articles (citation connections, what people discuss, and how), and gathering all of it helps uncover the most relevant results. Conversely, if you only ever filtered to articles from the last year, you might discover a few things, but you wouldn't have as much information for the search to adapt and improve.
So, you can ask for articles on coffee (though ideally something a bit more specific, or there will be thousands of results). Our system will carefully find all the articles, and then you can filter for 2024 articles or look at the timeline.
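To make the find-everything-then-filter workflow concrete, here's a minimal client-side sketch. The records and field names are invented placeholders, not real search output:

```python
from collections import Counter

# Hypothetical records standing in for a completed broad topic search on "coffee".
results = [
    {"title": "Coffee and cardiovascular health", "year": 2021},
    {"title": "Caffeine metabolism pathways", "year": 2024},
    {"title": "Arabica genome assembly", "year": 2024},
    {"title": "Espresso extraction kinetics", "year": 2019},
]

# Step 1 happened above: gather everything on the topic first.
# Step 2: filter the complete set down to a single year...
papers_2024 = [r for r in results if r["year"] == 2024]

# ...or build the timeline view: paper counts per year.
timeline = Counter(r["year"] for r in results)

print([r["title"] for r in papers_2024])
print(sorted(timeline.items()))
```

The point of ordering it this way is that the filter runs over the complete topic set, so nothing relevant is missed before you narrow down.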
The few-minute delay is primarily due to the sequential processing steps by high-quality LLMs, not database access times. The system reads and generates paragraphs about papers, then compares them, and we have to use the highest-quality LLMs, so token generation times are perceptible. We repeat this many times for accuracy. We find it's impossible to be accurate without GPT-4-level models and the delay that comes with them.
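As a rough back-of-envelope for why sequential generation steps dominate the wall-clock time -- every number below is invented for illustration, not a measured figure for any real system:

```python
# Invented illustrative numbers -- NOT measured figures for any real system.
sequential_steps = 6      # assumed: reads, summaries, and comparisons run in sequence
tokens_per_step = 800     # assumed tokens generated per step
tokens_per_second = 40    # assumed throughput of a large, high-quality model

# Sequential steps can't be parallelized, so their latencies add up.
latency_seconds = sequential_steps * tokens_per_step / tokens_per_second
print(f"{latency_seconds / 60:.1f} minutes")  # prints "2.0 minutes"
```

Even with generous throughput assumptions, a handful of dependent generation steps lands in the minutes range, which is why the delay tracks model quality rather than database speed.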
Ours is slow but accurate, even for complex topics. The rest are fast but generally can't handle complex topics. (There are more nuanced explanations in other comments.)
Semantic Scholar seems more focused on
1. being the data provider/aggregator for the research community, and
2. long term, I think they plan to develop software at the reading interface that learns as a researcher uses it to browse papers (a rich PDF reader with hyperlinks, TLDRs, citation contexts, and a way to track your interactions over time and remind you of what you have and haven't seen).
Their core feature right now is a fast keyword search engine. They also have a few advanced search features through their API (https://api.semanticscholar.org/api-docs/), like recommendations from positive/negative examples, but neither keyword search nor these other systems is currently high enough quality to be very useful for us.
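For anyone curious, the recommendations-from-examples feature mentioned above can be called roughly like this. This is a sketch based on the endpoint shape in their API docs; the paper IDs are placeholders, and field names may differ from what you need:

```python
import json
import urllib.request

# Recommendations endpoint, per https://api.semanticscholar.org/api-docs/
RECS_URL = "https://api.semanticscholar.org/recommendations/v1/papers/"

def build_request(positive_ids, negative_ids, fields="title,year", limit=20):
    """Build the URL and JSON body for a positive/negative-example query."""
    url = f"{RECS_URL}?fields={fields}&limit={limit}"
    payload = {
        "positivePaperIds": positive_ids,
        "negativePaperIds": negative_ids,
    }
    return url, json.dumps(payload).encode("utf-8")

if __name__ == "__main__":
    # Placeholder IDs -- the API accepts prefixed forms like DOI:... and ArXiv:...
    url, body = build_request(["ArXiv:1706.03762"], [])
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        for paper in json.load(resp).get("recommendedPapers", []):
            print(paper.get("year"), paper.get("title"))
```

The positive/negative lists steer the recommender toward and away from regions of the literature, which is a different interaction model from a single keyword query.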
FYI our core dataset for now is provided by Semantic Scholar, so hugely thankful for their data aggregation pipeline and open access/API.
Do you plan on adding an API? I already have an in-house knowledge discovery, annotation, and search system that could be augmented by your service. Not super critical at this point, but it would be nice.
And yes, Semantic Scholar is a wonderful part of the academic commons. Fingers crossed they don't go down the JSTOR/OCLC path.
I've used Undermind for literature search and it was very precise! Thanks for the product! I wonder how you plan to extend the search to full paper content (will the Semantic Scholar API allow this), and do you plan to connect more datasets (which ones)? Many of them are paid...
We'll certainly be able to include open-access full texts, which are already a substantial fraction of published papers, and a growing one, as the publishing industry rapidly moves toward open access. Paywalled full-text search would require working with the publishers, which is more involved.
Great! I can definitely ask Undermind for an overview paper of the scientific information landscape, unless you have a favourite on hand to share?
I think the biggest difference is our focus on search quality, and being willing to spend a lot on compute to do it, while they focus on systematic extraction of data from existing sources and on being fast. It's a bit of an oversimplification (they of course have search, and we also have extraction).
Feature-wise, we definitely have a lot of work to do :) What crucial pieces do you think we're missing?
From what I understand, that's not the case. They are working on both. I'd be concerned about how you can differentiate yourselves and compete with them. They have a big head start.