Thanks for sharing! Looking through the data[0], some of the terms/sentences don't really reflect the target word meanings. For example, "beta" is used in a derogatory way in only 1 of 4 instances, "facial" is used as an adjective rather than a noun in 3 of 4, and "eating out" refers to going to a restaurant in 4 of 4.
This leads me to believe the models are even MORE censored than you make them out to be.
Totally! In some cases (we used LLMs to help generate these) the target word isn't clear enough even for a human, so those items become more of a guessing game than a flinch measurement.
Agreed; with cleaner items the expectation is that the flinch measurement would come out even stronger. If you're interested in helping improve it, feel free to reach out on the repo!
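If anyone wants a concrete picture of what a flinch score can look like, here's a toy version (illustrative only, not the repo's actual code; the model and the cloze sentence are placeholders): compare the log-probability the model assigns to the charged word against a tame substitute in the same slot.

    # Toy flinch score: how much more likely is a tame word than the
    # charged word in the same slot? (Illustrative; placeholders throughout.)
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def word_logprob(context: str, word: str) -> float:
        """Total log-prob the model assigns to `word` continuing `context`."""
        ctx_len = tok(context, return_tensors="pt").input_ids.shape[1]
        ids = tok(context + " " + word, return_tensors="pt").input_ids
        with torch.no_grad():
            logprobs = torch.log_softmax(model(ids).logits, dim=-1)
        # Logits at position p-1 predict token p, so the word's tokens
        # (indices ctx_len..end) are scored at positions ctx_len-1..end-1.
        return sum(logprobs[0, p - 1, ids[0, p]].item()
                   for p in range(ctx_len, ids.shape[1]))

    ctx = "Ugh, he's such a"  # placeholder cloze sentence
    flinch = word_logprob(ctx, "gentleman") - word_logprob(ctx, "beta")
    print(f"flinch: {flinch:.2f} (positive = model leans away from the charged word)")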
We started with a Polymarket project: train a Karoline Leavitt LoRA on an uncensored model, simulate future briefings, trade the word markets, profit. We couldn't get it to work. No amount of fine-tuning let the model actually say what Karoline said on camera. It kept softening the charged word.
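For the curious, the training setup was roughly this shape (a simplified sketch, not our actual code; the base model, data file, and hyperparameters are placeholders):

    # Roughly the shape of the fine-tune (simplified sketch, not our actual
    # code; base model, data file, and hyperparameters are placeholders).
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    BASE = "an-uncensored-base-model"  # placeholder name

    tok = AutoTokenizer.from_pretrained(BASE)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(BASE)

    # Standard LoRA recipe: low-rank adapters on the attention projections.
    model = get_peft_model(model, LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

    # One record per briefing transcript: {"text": "..."}
    data = load_dataset("json", data_files="briefings.jsonl")["train"]
    data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=1024),
                    remove_columns=data.column_names)

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="leavitt-lora", num_train_epochs=3,
                               per_device_train_batch_size=2, learning_rate=2e-4),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    ).train()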
Not even the most unleashed models can utter the words of today's politicians. I don't know whether this says more about the current technology or about the people in charge.
I would suggest it says primarily that mimicking people's voices in any meaningful way is still far beyond LLMs, particularly small ones. But there's also a more insurmountable problem: the "prompt" Leavitt herself is working from contains many tokens that the LLM's prompt never could.
Such as the value of the bets her own entourage has placed.
“We used LLM technology, which is great at parroting content, to try to predict what the US president’s spokesperson would say at her next press briefing.
We used that output as input for ~gambling~ purchasing a position on a prediction market, a product popularized recently in part by its ability to circumvent gambling regulations.
However, even the LLM couldn’t parrot the spokesperson. The implication is that she speaks so outrageously that not even an uncensored LLM will reproduce her words.”