More

botro · on April 3, 2025

This is damn near prescient, I'm having a hard time believing it was written in 2021.

He did get this part wrong though, we ended up calling them 'Mixture of Experts' instead of 'AI bureaucracies'.

robotresearcher · on April 3, 2025

We were calling them 'Mixture of Experts' ~30 years before that.

https://ieeexplore.ieee.org/document/6215056

stavros · on April 3, 2025

I think the bureaucracies part is referring more to Deep Research than to MoE.

botro · on March 29, 2025

I read the article on archive and figured there was a big chunk missing. It really does not make any sense.

Sutskever and Murati were methodical, they waited until the board was favorable to the outcome they wanted, engaged with board members individually laying the groundwork... and then just changed their mind when it actually happened!?

jdminhbg · on March 29, 2025

The article says Sutskever was blindsided by the rank-and-file being on Sam's side. Presumably he thought the outcome was going to be business as more-or-less usual but with Murati or someone as CEO and then panicked when that didn't happen.

ethbr1 · on March 30, 2025

Or someone said "If you don't switch and back me, I am going to fight every bit of your compensation. Or you can back me and leave with favorable terms."

Panic is a less likely driver.

botro · on March 22, 2025

Thanks for writing this out, it's helpful for me as a layman.

Isn't part of the prohibition on trades among officers and directors also because of the inside knowledge they have? Public companies generally report quarterly but the insiders presumably have up to the minute information on sales etc.

And while we wait on the quarterly data, consistent insider selling is indicative of ... something.

botro · on Dec 20, 2024

The LLM community has come up with tests they call 'Misguided Attention'[1] where they prompt the LLM with a slightly altered version of common riddles / tests etc. This often causes the LLM to fail.

For example I used the prompt "As an astronaut in China, would I be able to see the great wall?" and since the training data for all LLMs is full of text dispelling the common myth that the great wall is visible from space, LLMs do not notice the slight variation that the astronaut is IN China. This has been a sobering reminder to me as discussion of AGI heats up.

[1] https://github.com/cpldcpu/MisguidedAttention

kizer · on Dec 20, 2024

It could be that it “assumed” you meant “from China”; in the higher level patterns it learns the imperfection of human writing and the approximate threshold at which mistakes are ignored vs addressed by training on conversations containing these types of mistakes; e.g Reddit. This is just a thought. Try saying: As an astronaut in Chinese territory; or as an astronaut on Chinese soil. Another test would be to prompt it to interpret everything literally as written.

dwaltrip · on Dec 23, 2024

Interesting... It took me 3 different attempts, but I found a set of custom instructions that allowed Claude to get the right answer on the initial prompt. Here's the instructions (I tried to keep them as general and non-specific as I could):

Carefully analyze questions to not overlook subtle details. Take each question "as-is", don't guess what they mean -- interpret them as any reasonable person would.

botro · on Dec 7, 2024

I made https://aimodelreview.com/ to compare the outputs of LLMs over a variety of prompts and categories, allowing a side by side comparison between them. I ran each prompt 4 times for different temperature values and that's available as a toggle.

I was going to add reviews on each model but ran out of steam. Some users have messaged me saying the comparisons are still helpful to them in getting a sense of how different models respond to the same prompt and how temperature affects the same models output on the same prompt.

adrianomartins · on Dec 8, 2024

Hey, this is pretty insightful! Wonder if, in the course of researching to build this website you reached any conclusions as to what’s the AI assistant currently ahead.

rtsil · on Dec 8, 2024

I can confirm, it's still very helful, thank you!

botro · on Oct 22, 2024

And to take a historic analogy, cars today are as wide as they are because that's about how wide a single lane roadway is. And a single lane roadway is as wide as it is because that's about the width of two horses drawing a carriage.

falcor84 · on Oct 23, 2024

The story goes that this two horses width also limited the size of the space shuttle's boosters (SRB), so we ended up taking this sort of path-dependence off to space.

botro · on Oct 1, 2024

Which LLM is used for the book summaries and which TTS for the audio?

jazz3020 · on Oct 1, 2024

llm: prefer not to answer tts: open ai

opdahl · on Oct 1, 2024

How do you expect anyone to trust the summaries if you are not even going to disclose which AI model generated it?

jazz3020 · on Oct 1, 2024

The most common first action people take on our site is reading a summary of a book they've already read to assess its quality themselves. I don't think they care whether it was written by OpenAI or a monkey, as long as it's good.

botro · on Oct 1, 2024

This is a great video, thank you for sharing. My favorite part:

"...next we have this rubber sheet, which is very clever, and very patented!"

botro · on Sept 27, 2024

"Suddenly, the chat window on Sequoia’s side of the Zoom lights up with partners freaking out.

“I LOVE THIS FOUNDER,” typed one partner.

“I am a 10 out of 10,” pinged another.

“YES!!!” exclaimed a third.

What Sequoia was reacting to was the scale of SBF’s vision....We were incredibly impressed, Bailhe says. “It was one of those your-hair-is-blown-back type of meetings.”

This is 'smart money' in reference to Sam Bankman Fried.

s1artibartfast · on Sept 28, 2024

Not really the dig you think it is. They were right but got unlucky that SBF broke the law, which is hard to predict.

botro · on Sept 5, 2024

"The task consists of going from English-language specifications to Wolfram Language code. The test cases are exercises from Stephen Wolfram's An Elementary Introduction to the Wolfram Language."

I think this benchmark would really only tell me whether Wolframs book was in the training data.

bufferoverflow · on Sept 5, 2024

It's available online in HTML form, for free:

https://www.wolfram.com/language/elementary-introduction/3rd...

smusamashah · on Sept 5, 2024

Yeah, may be should skip that benchmark.

HN For You