More

CqtGLRGcukpy · 2026-02-23T05:00:01 1771822801

It's worth keeping in mind that Loops is made by dansup which has also made and runs Pixelfed, FediDB, and has a history of being hostile to developers.

You can see the recounting of his hostility at https://dansup-open-letter.github.io/appendix/

(I'm not a signature of the open letter)

boriskourt · 2026-02-23T09:31:41 1771839101

This comes up here and there to discredit the developer but having followed all the drama for many years now I just want to add that Dansup has apologized multiple times, and has been far more open about his process. His communication has also changed for the better over the last two years especially. Its not easy being human, and I think its a good sign to see that he takes this seriously.

Kovah · 2026-02-23T07:38:45 1771832325

Unfortunately, I can second this, both as a developer and a user. His IMHO childish behavior has ruined his image for me, and is not a good lighthouse for the Fediverse itself. Also, as a OSS veteran myself, I see it extremely critical that he is starting new projects all along, denies to get proper help and build up a maintainer team, and leaves older projects in the dust. Pixelfed is the one product he might should focus on, jet it feels like the platform is in maintenance only mode. Pixelfed is a wonderful addition to the Fediverse and deserves to be on good hands.

Maybe, and this is a very personal opinion, his product success and the Kickstarter campaign raising over 100k made him feel like he's better than everybody else. And one can see the effects.

vyr · 2026-02-23T10:30:07 1771842607

yeah he has a long history of saying dumb shit in public and then trying to cover his tracks.

also, having had to figure out some of the Pixelfed code for previous projects, i wonder if he's up to the task of maintaining any of this once the next shiny thing comes along. Fedi software has a lot of quirks in general (comes with the nearly nonexistent budgets) but as a representative issue, the dude managed to build a photo blogging service with no way to export or back up your photos and that hasn't been fixed in seven years.

ultimately, though, if we ignore software quality and developer reputation, Loops is going to live or die based on whether anyone on Fedi actually wants to make short-form video. given existing Fedi culture, plus how expensive it can be to produce and how the RoI is basically zero, i don't think we're going to see much native to Fedi. some might get crossposted by TikTok/Shorts/Reels creators that want a backup location that won't get erased the second someone makes a spurious copyright claim, but i suspect we're just going to see a few months of stolen TikToks and then not much after that.

creamyhorror · 2026-02-23T17:10:33 1771866633

> given existing Fedi culture, plus how expensive it can be to produce and how the RoI is basically zero, i don't think we're going to see much native to Fedi.

Yeah, actual adoption will require getting the actual people to come onboard who want to entertain/influence others, plus the viewers (two-sided market problem). When weighing that against network effects of the big players, the chances look a little slim.

Probably need another more low-effort or attractive angle to grow the Fediverse, tbh.

prmoustache · 2026-02-23T15:59:41 1771862381

the thing is anyone can already host short-form videos on many other fediverse/activitypub apps like Mastodon.

Same for image, I never really understood the point of pixelfed as other fediverse/activitypub apps can already host pictures.

vyr · 2026-02-23T17:07:09 1771866429

that's pretty much correct: these apps all do the same thing under the hood. the difference is UX. Mastodon's built for text and the media features are really barebones. no filters, no crop, no EXIF metadata, no albums. so that was Pixelfed's pitch.

RamblingCTO · 2026-02-23T13:42:51 1771854171

And this needless drama is relevant why? Can we keep that on the fediverse please?

ParadisoShlee · 2026-02-23T15:51:32 1771861892

Quick question: even if this is/was true, do you think isolating him is the correct response? or maybe engaging and taking some of the pressure off him might actually help.

People love to bully people who slip up... dudes a hot mess but I think he needs community more than being openly attacked.

CqtGLRGcukpy · 2026-02-23T18:15:33 1771870533

In my honest opinion, I think the correct response is to stop using anything that dansup makes until he realizes he has to give up some of the control, or the project stops being developed.

While it's nice if people want to help take some pressure off, it only works if the main developer (dansup in this case) is willing to accept that help. And based off the link I gave, and some of the comments here, it doesn't look like dansup is willing to accept help.

CqtGLRGcukpy · 2026-02-22T00:11:18 1771719078

FYI, the HN guidelines state, "Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity." I would encourage you to submit content from others rather than just your own.

CqtGLRGcukpy · 2026-02-19T20:10:44 1771531844

For when the article goes behind a paywall: https://archive.ph/EOL7V

CqtGLRGcukpy · 2026-02-19T03:23:50 1771471430

Radio stations realized they could make more money by having less DJs, or by having one DJ cover a bigger area.

It's not just YouTube Live, DJs moved to Tiktok and other live stream platforms.

I know I listen to less radio because I pay for a music streaming service that allows me to listen to exactly what I want, with no ads, and pays the artist more.

CqtGLRGcukpy · 2026-02-19T03:15:50 1771470950

I'm just waiting until it's cheaper for companies to hire humans than to pay AI to do what's needed. It will come, or I believe there will be a day AI is no more.

CqtGLRGcukpy · 2026-02-18T20:13:41 1771445621

"Support for the draft specification is available now in Pebble, a miniature version of Boulder, our production CA software. Work is also in progress on a lego-cli client implementation to make it easier for subscribers to experiment with and adopt. Staging rollout is planned for late Q1 2026, with a production rollout targeted for some time in Q2 2026."

CqtGLRGcukpy · 2026-02-14T20:09:39 1771099779

The AI companies won't just scrape IA once, they're keeping come back to the same pages and scraping them over and over. Even if nothing has changed.

This is from my experience having a personal website. AI companies keep coming back even if everything is the same.

giancarlostoro · 2026-02-14T21:12:00 1771103520

Weird, considering IA has most of its content in a way you could rehost it all idk why nobody’s just hosting a IA carbon copy that AI companies can hit endlessly, and then cutting IA a nice little check in the process, but I guess some of the wealthiest AI startups are very frugal about training data?

This also goes back to something I said long ago, AI companies are relearning software engineering poorly. I can think of so many ways to speed up AI crawlers, im surprised someone being paid 5x my salary cannot.

mlnj · 2026-02-14T21:26:15 1771104375

Unless regulated, there is no incentive for the giants to fund anything.

cm2187 · 2026-02-15T07:22:20 1771140140

There is no problem that cannot be solved with creating a bureaucracy and paperwork!

jniles · 2026-02-15T14:34:54 1771166094

I understand this is tongue-in-cheek, but do you have an alternative/better proposal?

cm2187 · 2026-02-15T15:00:48 1771167648

Let the market do. If good data is so critical to the success of AI, AI companies will pay for it. I don't know how someone can still entertain the idea that a bureaucrat, or worse, a politician, is remotely competent at designing an efficient economy.

mlnj · 2026-02-15T17:13:14 1771175594

All the world's data was critical to the success of AI. They stole it and fought the system to pay nothing. Then settled it for peanuts because the original creators are weak to negotiate. It already happened.

hn_go_brrrrr · 2026-02-15T15:19:33 1771168773

No they won't pay for it, unless they believe it's in their best interests. If they believe they can free-ride and get good data without having to pay for it, why would they lay down a dollar?

cm2187 · 2026-02-15T15:33:33 1771169613

Because the companies in control of that data won't let them have it for free, like what is happening in the article.

hn_go_brrrrr · 2026-02-15T15:46:21 1771170381

Or, they'll just create more technically sophisticated workarounds to get what they want while avoiding a bad precedent that might cost them more money in the long run. Millions for defense, not one cent for tribute.

AnthonyMouse · 2026-02-15T20:09:20 1771186160

Now apply the same logic to laws, except that laws are a lot slower to change when they find the next workaround.

And it's a lot harder to get the law to stop doing something once it proves to cause significant collateral damage, or just cumulative incremental collateral damage while having negligible effectiveness.

Nathan2055 · 2026-02-14T21:36:58 1771105018

That already exists, it's called Common Crawl[1], and it's a huge reason why none of this happened prior to LLMs coming on the scene, back when people were crawling data for specialized search engines or academic research purposes.

The problem is that AI companies have decided that they want instant access to all data on Earth the moment that it becomes available somewhere, and have the infrastructure behind them to actually try and make that happen. So they're ignoring signals like robots.txt or even checking whether the data is actually useful to them (they're not getting anything helpful out of recrawling the same search results pagination in every possible permutation, but that won't stop them from trying, and knocking everyone's web servers offline in the process) like even the most aggressive search engine crawlers did, and are just bombarding every single publicly reachable server with requests on the off chance that some new data fragment becomes available and they can ingest it first.

This is also, coincidentally, why Anubis is working so well. Anubis kind of sucks, and in a sane world where these companies had real engineers working on the problem, they could bypass it on every website in just a few hours by precomputing tokens.[2] But...they're not. Anubis is actually working quite well at protecting the sites it's deployed on despite its relative simplicity.

It really does seem to indicate that LLM companies want to just throw endless hardware at literally any problem they encounter and brute force their way past it. They really aren't dedicating real engineering resources towards any of this stuff, because if they were, they'd be coming up with way better solutions. (Another classic example is Claude Code apparently using React to render a terminal interface. That's like using the space shuttle for a grocery run: utterly unnecessary, and completely solvable.) That's why DeepSeek was treated like an existential threat when it first dropped: they actually got some engineers working on these problems, and made serious headway with very little capital expenditure compared to the big firms. Of course they started freaking out, their whole business model is based on the idea that burning comical amounts of money on hardware is the only way we can actually make this stuff work!

The whole business model backing LLMs right now seems to be "if we burn insane amounts of money now, we can replace all labor everywhere with robots in like a decade", but if it turns out that either of those things aren't true (either the tech can be improved without burning hundreds of billions of dollars, or the tech ends up being unable to replace the vast majority of workers) all of this is going to fall apart.

Their approach to crawling is just a microcosm of the whole industry right now.

[1]: https://en.wikipedia.org/wiki/Common_Crawl

[2]: https://fxgn.dev/blog/anubis/ and related HN discussion https://news.ycombinator.com/item?id=45787775

ccgreg · 2026-02-15T01:37:26 1771119446

Thanks for the mention of Common Crawl. We do respect robots.txt and we publish an opt-out list, due to the large number of publishers asking to opt out recently.

There's a bit of discussion of Common Crawl in Jeff Jarvis's testimony before Congress: https://www.youtube.com/watch?v=tX26ijBQs2k

radiator · 2026-02-15T10:49:23 1771152563

So perhaps the AI companies will go bankrupt and then this madness will stop. But it would be nice if no government intervenes because they are "too big to fail".

miki123211 · 2026-02-15T13:19:11 1771161551

Are you sure it's the AI companies being that incompetent, and not wannabe AI companies?

What I feel is a lot more likely is that OpenAI et al are running a pretty tight ship, whereas all the other "we will scrape the entire internet and then sell it to AI companies for a profit" businesses are not.

polytely · 2026-02-15T18:27:08 1771180028

They run a tight AI ship but it is in their interest to destroy the web so that people can only get to data through their language model

Peaches4Rent · 2026-02-15T14:06:21 1771164381

OpenAI cannot possibly running a tight ship, even if they have competent scientists and engineers.

zmmmmm · 2026-02-14T23:26:10 1771111570

yeah, they should really have a think about how their behavior is harming their future prospects here.

Just because you have infinite money to spend on training doesn't mean you should saturate the internet with bots looking for content with no constraints - even if that is a rounding error of your cost.

We just put heavy constraints on our public sites blocking AI access. Not because we mind AI having access - but because we can't accept the abusive way they execute that access.

disposition2 · 2026-02-14T23:36:14 1771112174

Something I’ve noticed about technology companies, and it’s bled into just about every facet of the US these days, is the consideration of if an action *can* be executed upon vs *should* an action be executed upon.

It’s very unfortunate and a short sighted way to operate.

HWR_14 · 2026-02-15T04:09:54 1771128594

The main issue is a well behaved AI company won't be singled out for continued access, they will all be hit by public sites blocking AI access. So there is no benefit to them behaving.

AnthonyMouse · 2026-02-15T20:24:30 1771187070

> So there is no benefit to them behaving.

That's assuming they're deriving a benefit from misbehaving.

There is no benefit to immediately re-crawling 404s or following dynamic links into a rabbit hole of machine-generated junk data and empty search results pages in violation of robots.txt. They're wasting the site's bandwidth and their own in order to get trash they don't even want.

Meanwhile there is an obvious benefit to behaving: You don't, all by yourself, cause public sites to block everyone including you.

The problem here isn't malice, it's incompetence.

CrossVR · 2026-02-15T05:49:38 1771134578

Why should a well-behaved AI company be singled out for continued access? If the industry can't regulate itself then none deserve access no matter if they're "well-behaved".

Receiving a response from someone's webserver is a privilege, not a right.

goku12 · 2026-02-15T10:39:01 1771151941

Honestly, has any of these AI companies ever offered a compensation for the data they pillage, except in case of large walled up information silos like reddit? This is like asking why the occasional burglars are not singled out for direct access into your house, compared to the stripmining marauders out there.

Why does any of them deserve any special treatment? Please don't try to normalize this reprehensible behavior. It's a greedy, exploitative and lawless behavior, no matter how much they downplay it or how long they've been doing it.

miki123211 · 2026-02-15T13:23:52 1771161832

No single piece of content (unless you're a really large website) is worth the paper that such a contract would be written on.

This is the problem with AI scraping. On one hand, they need a lot of content, on the other, no single piece of content is worth much by itself. If they were to pay every single website author, they'd spend far more on overhead than they would on the actual payments.

Radio faces a similar problem (it would be impossible to hunt down every artist and negotiate licensing deals for every single song you're trying to play). This is why you have collective rights management organizations, which are even permitted by law to manage your rights without your consent in some countries.

energy123 · 2026-02-15T09:12:58 1771146778

This is just tragedy of the commons.

dawnerd · 2026-02-15T16:49:31 1771174171

It’s insane actually how fast to re-request the same pages, even 404s. They’re so desperate for data they’re really hurting smaller hosts. One of our clients site became unusable when one of the ai bots started spamming the Wordpress search for terms that I’m guessing users were searching for but were unrelated to the sites content. Instead of building a search index they’re just hammering sites directly. So annoying.

nickpsecurity · 2026-02-15T15:26:27 1771169187

It can be 10,000 requests a day on static HTML and non-existent, PHP pages. That's on my site. I'd rather them have Christ-centered and helpful content in their pretraining. So, I still let them scrape it for the public good.

It helps to not have images, etc that would drive up bandwidth cost. Serving HTML is just pennies a month with BunnyCDN. If I had heavier content, I might have to block them or restrict it to specific pages once per day. Maybe just block the heavy content, like the images.

Btw, anyone tried just blocking things like images to see if scaping bandwidth dropped to acceptable levels?

iririririr · 2026-02-14T21:48:31 1771105711

> The AI companies won't just scrape IA once, they're keeping come back to the same pages and scraping them over and over. Even if nothing has changed.

Maybe they vibecoded the crawlers. I wish I were joking.

terminalshort · 2026-02-15T02:21:40 1771122100

Isn't this just how crawlers work? How do you know if a page has changed if you don't keep visiting it?

heavyset_go · 2026-02-15T12:32:26 1771158746

HEAD requests

Findecanor · 2026-02-15T14:16:34 1771164994

Scrapers could also use the "Cache-Control", "Expires" and "If-Modified-Since" headers properly, to reduce their traffic a little, but do they?

_zagj · 2026-02-15T03:16:42 1771125402

> The AI companies won't just scrape IA once, they're keeping come back to the same pages and scraping them over and over. Even if nothing has changed.

Why, though? Especially if the pages are new; aren't they concerned about ingesting AI-generated content?

dragonwriter · 2026-02-15T03:26:10 1771125970

Possibly because a lot of “AI-company scraping” isn't traditional scraping (e.g., to build a dataset of the state at a particular point in time), its referencing the current content of the page as grounding for the response to a user request.

CqtGLRGcukpy · 2026-02-11T01:22:51 1770772971

Right now I pay $0, I don't see the benefit in paying for it when it provides enough in their free plans.

CqtGLRGcukpy · 2026-02-07T03:31:12 1770435072

What do you define as an international job board? On weworkremotely.com I see roles for those in Europe (sometimes you will have to dig into the job posting or go to the company website to see where exactly where they are hiring people from)

15charslong · 2026-02-07T06:11:39 1770444699

Pretty much Non-US persons for Non-US Remote jobs. Almost all Remote Job boards prioritize US-based Remote Jobs, which can get a bit difficult for someone who just wants a job.

I am currently job hunting, and I also use WeWorkRemotely.

I was just wondering there are any others that are getting missed when I search on Google.

And yeah I've had to do that as well, even though they list it as Worldwide, it really is just from like 5 countries in Europe, no hate. I get it.

CqtGLRGcukpy · 2026-02-06T03:25:12 1770348312

Original story at https://edition.cnn.com/2026/02/04/food/pizza-hut-closures

HN For You