It depends on the difficulty of the puzzles. If you read the article by the Leela author (linked in another comment), you'll see a very different picture: the new DeepMind model is better than AlphaZero but much worse than the best open-source model. A transformer that isn't specially trained, like GPT-4o (and even 5o), has absolutely no chance of solving the more difficult puzzles.
Bit confused what the value add is over a framework like DSPy. This still requires you to create an eval dataset with ground truth, which is basically the only hard part of using DSPy. Easily getting the optimized prompt and having some metrics out of the box isn't worth nearly $1k/mo IMO.
Side note: I’ve had a lot of luck combining automatic prompt optimization with finetuning. There is definitely some synergy https://raw.sh/posts/chess_puzzles
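For readers unfamiliar with what "automatic prompt optimization" means in practice, here is a toy sketch of the core idea: search over candidate prompts and keep whichever scores best against a labeled eval set. This is not DSPy's actual API (its optimizers like BootstrapFewShot do considerably more); `fake_llm` and the candidate prompts are made up for illustration.

```python
# Toy illustration of automatic prompt optimization (not DSPy's actual API):
# greedily search candidate prompt prefixes, keeping whichever scores best
# against a small labeled eval set. `fake_llm` is a hypothetical stand-in
# for a real model call.

def fake_llm(prompt: str, question: str) -> str:
    # Stand-in behavior: a more verbose prompt makes the "model" answer correctly.
    return question.upper() if "step by step" in prompt else question

def evaluate(prompt: str, dataset: list[tuple[str, str]]) -> float:
    # Fraction of eval examples answered correctly -- this is where the
    # ground-truth dataset (the hard part) comes in.
    correct = sum(fake_llm(prompt, q) == gold for q, gold in dataset)
    return correct / len(dataset)

def optimize(candidates: list[str], dataset: list[tuple[str, str]]) -> str:
    # Pick the highest-scoring candidate prompt.
    return max(candidates, key=lambda p: evaluate(p, dataset))

dataset = [("e4", "E4"), ("d4", "D4")]
candidates = ["Answer:", "Think step by step, then answer:"]
best = optimize(candidates, dataset)
```

The finetuning synergy mentioned above fits naturally here: the same eval metric that scores prompts can also score finetuned checkpoints.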
Thanks for the feedback, and I love your article diving deep into DSPy! Here's how our platform is different:
1. You are absolutely right, the dataset is a big hurdle for using DSPy. That's why we offer a synthetic dataset generation pipeline for RAG, agents, and a variety of LLM pipelines. More here: https://docs.relari.ai/getting-started/datasets/synthetic
2. Relari is an end-to-end evaluation and optimization toolkit. Real-time optimization is just one part of our data-driven package for building robust and reliable LLM applications.
3. Our tools are framework agnostic. If you can build your entire application on DSPy, that's great! But often we see AI developers who want to maintain the flexibility and transparency to have their prompts / LLM modules work across different environments.
4. We provide well-designed built-in metrics as well as custom metrics learned from user feedback. We find good metrics are key to making any optimization process (prompt optimization or fine-tuning) work.
Is it possible to use this for hybrid search in combination with pg_embedding? My understanding is that hybrid search currently requires syncing with Postgres.
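For context on what the fusion step involves: hybrid search typically merges a vector-similarity ranking (e.g. from pg_embedding) with a keyword ranking (e.g. Postgres full-text search), and reciprocal rank fusion (RRF) is a common way to do it. A minimal sketch, with made-up document IDs:

```python
# Sketch of hybrid search via reciprocal rank fusion (RRF): merge a
# vector-search ranking with a keyword-search ranking into one list.
# Doc IDs and hit lists below are hypothetical.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    # score(d) = sum over rankings of 1 / (k + rank of d); higher is better.
    # k=60 is the conventional smoothing constant from the RRF paper.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]    # e.g. from embedding similarity
keyword_hits = ["doc1", "doc9", "doc3"]   # e.g. from tsvector matching
fused = rrf([vector_hits, keyword_hits])
```

Documents appearing high in both rankings (doc1, doc3 here) float to the top, which is the whole point of hybrid search.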
Nope, it’s a serious project; I mostly made it for personal use during my last semester of college. I rewrote it a few times and packaged it up because I think it’s genuinely useful. Langchain gets you 80% of the way there but you run into issues with it very quickly.
DankGPT is able to draw context from a library of documents (textbook, papers, class slides) to explain any topic and answer complicated reasoning problems.
It’s very similar to ChatPDF, but you can include multiple documents and it has much better context selection. In practice this leads to better answers (fewer “the source does not contain information on…” responses and fewer hallucinations).
On lichess puzzles, GPT-4o with the compiled prompt scores around 70%; I think the 270M transformer is around 95%.