Hacker News | bensyverson's comments

Part of Anthropic's moat is Claude Cowork & Claude Code. They got coders comfortable with CC and enterprise users comfortable with Cowork, and both are creating stickiness.

The reality is that $20/$100/$200/mo feels reasonable to a lot of people relative to the value they're getting out of Claude, and if they switch to something else, there's a risk that it won't be as good, and they'll have a new tool to learn.

It's not an insurmountable moat, but don't underestimate the user experience. The iPod didn't win because it was the cheapest device or the one with the most features.


The future of email is… the present of email!

Best news I've heard all day.

Unless you’re a Gmail user.

This idea reads like a joke, but there's something to it.

One feature request: In addition to high-level milestones, it would be cool if a partially-funded project would generate a public, highly detailed implementation plan.

Also, IANAL but MIT is still a license with a copyright holder. I don't think saying "it's MIT, we all own it" is defensible. The courts might view all this code as public domain.


I wonder why people are more eager to pool money to pay a corporate-owned computer to build things than to the actual humans who have been building open source for decades? Much of which has ended up in the training set?

Exactly. This screams crowdfunding for AI labs. Who made this? Someone at Anthropic?

Humans are more expensive.

Humans can also use LLMs, even a local LLM.

That's even more expensive.

LLM output is just very predictable. If you spend $200 on that /goal you will get some output. It may or may not work perfectly, but there will be a repo with some progress. If you specced the goal well and it is feasible, a decent model will most likely get you decent progress.

Also, who would take on any of these projects for a meager $200? Most of that stuff is borderline interesting, clearly not interesting enough for the people proposing the things to start working on them themselves.


Fablepool spends Anthropic's inference budget and puts the output under MIT. That's not growing Anthropic's moat, it's commoditizing it. Anthropic just sells the API calls either way.

Price?

If Fable is writing it, courts may declare that it's not even public domain? Not a copyrightable work.

I don't get this. No, I did not write that code, but I paid money to e.g. Anthropic to buy that code. To me it sounds like I own it just the same.

Ownership is not the same as authorship. Copyright is about authors' rights, not owners'.

If you buy a text from me, I cannot sign away my authorship, and there are certain limitations on what you can do with my text regardless of contract. I can only sell you usage rights, which may or may not be exclusive. If the text I wrote is trivial, neither you nor I can limit when it is reproduced. The effort of collecting data is not sufficient if the data itself is declared trivial. See rulings about phone books.

When an AI provider produces data that is deemed not copyrightable, it cannot legally sell you exclusive usage rights. It can give it to you exclusively, but since you cannot yourself claim copyright, the moment you publish it it becomes available for others to use as well. One may argue that an LLM is similar to a phone book, with its entries being “trivial” and its composition not artistic enough.

At least that’s the line of argument.


> Ownership is not the same as authorship. Copyright is about authors' rights, not owners'.

> If you buy a text from me, I cannot sign away my authorship, and there are certain limitations on what you can do with my text regardless of contract.


(This is correct in many jurisdictions but not in all; for example moral rights are not a thing everywhere.)

"I paid money to someone to write that code" is exactly the line of argument that would lead to no copyright. There was the famous monkey selfie court case (Naruto v. Slater, filed in 2015), which held that, at least in the US, monkeys can't hold copyright. The same arguments would apply to AI. And since the AI can't create copyrighted works on its own, it can't assign copyright for works it created for hire either.

The other line of argument is "Claude Code is to coding as a camera is to painting". The image is generated automatically, but the input in how you point the camera is enough to still make it a creative work protected by copyright. Under that interpretation, you are not hiring AI, you are using it like a tool.

The US Copyright Office holds the former opinion. I'm sure once this goes to court, lots of companies will vehemently argue the latter. I would not be surprised if we even end up changing the law over this


>"I paid money to someone to write that code" is exactly the line of argument that would lead to no copyright.

That's news to me. I (along with many hundreds of others) was paid to develop Minecraft, Candy Crush and Battlefield, yet last I checked, they all retain their copyright.


Because you are human, and can thus create copyrighted works, and can assign that copyright when you do work for hire. Monkeys and AIs can't create copyrighted works, so there is nothing to assign when doing it for hire

The other line of argument avoids that issue by arguing that you personally created the code with the help of a tool (like a compiler or camera), not just commissioned it


Sure, but that's not what GP stated at all, unless possibly if they're anthropomorphizing to the point that they refer to Claude as "someone".

GP's (or rather GGGP's) statement was "I did not write that code but I paid money". So they claim no authorship. Anthropic is a company, they can't be the author either. So the only one left as author in that reasoning is Claude.

I don't think that necessarily anthropomorphizes it. We speak of monkeys as authors without calling them human. And really the legally important fact is that there was no human author. You can also treat it like CCTV footage which is generally not under copyright because there is no human author (even though most would hesitate to call the camera the author either)


As a human, it is possible for you to create copyrightable works and transfer the copyright to Microsoft in exchange for money. It is not possible for Claude Code to do those things because Claude Code is not a human.

Yes, but Claude Code is hardly a "someone", so that's not what the comment I replied to was arguing at all.

They retain ownership over that code because you signed a contract saying explicitly that. Did you sign the same thing with Anthropic? Did your company?

Anthropic isn't a party to such contract either because they (most likely, in the reasonable reading of relevant laws) hold no copyright over the output of their LLM.

In this case, it's more like you paid money to FablePool. FablePool used Claude as a tool, and it delivers the product so they are the owners of the code, and have the MIT copyright assigned to them.

Nope. Copyright simply doesn't work like that. Unless there's real human creativity involved in the process, they can't just claim copyright to LLM output.

That's a problem more and more products in software will face.

In a few years most saas will have 95 percent or even more AI coded code.

Could I steal it and put it on git?


No, because it's a trade secret. But you might not be in breach of copyright.

That'll translate across copyright jurisdictions.

I don’t know, if the design itself is copyrighted you could argue that the AI is just a bunch of hired workers that built it for extremely low wages.

If I hired a bunch of people to build me a house, and I drafted the architectural plans with the help of a paid architect, neither the architect nor the builders have ownership over the home.

So if a collection of people design something together maybe that has merit, they collectively paid for Anthropic to build it for them…


I’m pretty sure the Copyright Office has settled that already. Only human expression can be copyrighted:

> As described above, in many circumstances these outputs will be copyrightable in whole or in part—where AI is used as a tool, and where a human has been able to determine the expressive elements they contain. Prompts alone, however, at this stage are unlikely to satisfy those requirements.

https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...


The United States Copyright Office. There's a whole world outside the US.

And even then they can change their mind.

Does not hurt to backstop with an explicit license.


Using a harness to build something should clear that barrier, by the same logic that a photographer pointing a camera at something and pressing a button clears the barrier.

> you could argue that the AI is just a bunch of hired workers that built it for extremely low wages.

I am not able rightly to apprehend the kind of confusion of ideas that could provoke such an assertion.

With apologies to Mr. Charles Babbage.


All computation abides by the same laws of getting sh*t done. Ask Mr. Amdahl. It starts with plebs (AI or otherwise) getting paid low wages.

I think pooling/donating tokens will be a thing. Not sure if like this, but in some format. The Django project, for example, came out and said they don't want your tokens, but I think a lot of people/projects will (do?) want your tokens.

Why not just give a project money and let them decide how to spend?

I guess if you have a subscription with a token allowance that you are not going to use up this week, it’s better to let someone else use those tokens rather than throwing them away. So using the food analogy it’s more like a store giving away unsold sandwiches to the homeless at the end of the day instead of throwing them away.

Donating tokens to a software project is a bit like donating food to a hungry person.

I think it might be beneficial to use blockchain, so that the donor can audit which prompts the token pool they donated to performed. Perhaps donating tokens can also give you votes on which prompts are entered.


It’s a lot more like giving a hungry Hindu a gift card to a specific non-veg restaurant. Maybe they’ll use and enjoy it, maybe they’re vegetarian and will be insulted; either way that restaurant benefits. Especially if the hungry person exceeds the value of the gift card.

It’s more like donating snack cakes to a hungry person.

Primeagen predicted this in his latest video. I just didn't think I'd see this today.

The good ones all seem to be pointing in the direction of Django. Which, on its own, says a lot about how likely people will care about vibe-coded anything, whether pooled or not.

Highly detailed plan with time for the backers to comment and suggest improvements*

This is such an interesting idea.

Could this be the way we develop software in the future?!

- instead of paying for subscription SaaS. Users pool resources for the idea, AI builds and maintains it. Pricing is a fraction of what we pay otherwise.

A bit early today but definitely a possibility in a couple of years.


The problem with running open source code is the security aspect, but with Mythos running point, how would you distribute revenue is the real question.

Which market is even left since the SaaSpocalypse?


Maybe the financiers of a project just need it, they need it working, not to generate revenue for them?

I think what will be interesting is not whether the code will be produced, but rather: will anybody actually use any of this output?

This sort of reminds me of startups that go out of business and then open source their code. It's kind of cool when they can do that, but almost nobody ever gets value from it.

Anyway, if anyone uses the code produced this way in prod, I'd love to hear your story.


This is exactly how I'm building an OS right now. I have a lot of things specced out, and for most of them I also create an issue. And I have a friend who just points his Claude Code at the repo and tells it to "find the next thing to work on and implement it." I then do the review, verification, etc., but it's a great way to use unused quota.

I've always wanted to figure out how to implement a cooperative source license. Something like, you're allowed to do what you want with it, but any derivative work requires the same license, and X% of any income goes to the cooperative?

Not sure how it'd work, but there's absolutely a niche for a privacy focused data cooperative out there.


> X% of any income

Any income from what? The code is free, right? X% of your company's total revenue? Might as well just say "companies can't use this".

Personally I like the idea of a "free as in freedom but not free as in beer" license. You have to pay for a copy of the software, but after that you're free to use and modify it as you please, and share/sell your modifications under the same license.

To turn that into a cooperative you could have a company own the code and pay developers in shares of the company for PRs or other contributions.


yeah it should really be CC0

Should it? If it were real-world infrastructure, like a bridge, it'd be easier to say that it belongs to those who lead the project and those who put down the money.

The nice thing about a CC0 work is that it belongs to everybody. The leaders of the project have the same rights to use and modify the software as they do with software they have exclusive copyright over. In fact copyright does not grant the rights holder any new rights they did not have, it only restricts the rights of other people.

it probably already is in the public domain under US law, this just gives it the same status across jurisdictions

> The dominant mechanism, and the one no prompt instruction can prevent: the model has simply seen the upstream fix during training and reproduces it…

> On numpy, the patch is 100% character-for-character identical to the golden patch… down to idiosyncratic comments like "Extending singleton dimension for 'reflect' is legacy behavior; it really should raise an error."

This… seems like a flaw in the benchmark suite methodology. From what I can tell, they find an existing exploit, then rewind the git history to before the patch, and ask the model to fix the exploit. All well and good as long as the patch went in after the training cutoff.


The other "cheating" examples are even worse. It's wild to me that people keep designing benchmarks where the answer is lying around on disk or in the git history. "Hardening" the benchmark with strongly worded prompt instructions is bizarre. There are so many agent sandbox solutions. Why not use one and give it only access to the code it should see?

And I'm not sure how they can rule out other solutions also benefiting from being in the training data, just not reproduced exactly. Seems like it should focus on only CVEs from the last 30 days or something.


100%… the fact that they're just using prompting to discourage the agent from looking ahead in the Git history is wild.

To be fair, it is good to know that it disobeys simple instructions like "don't examine my git history" far more than other models. (It should of course be a different benchmark, so as not to conflate things.)

It's not a great sign for alignment.


Agreed, alignment is just a separate issue that a vuln fixing benchmark doesn't need to be testing.

Obviously they could just delete .git for their test if they wanted to. But telling the LLM not to use git commands is much like having keys in a .env file and telling the LLM not to read it: if it reads them anyway, you might be concerned.
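For what it's worth, the "just delete .git" idea can be sketched in a few lines. This is only an illustration, not what the benchmark authors do: the function name `make_history_free_copy` is made up, and it assumes the working tree has already been checked out at the pre-fix commit. It copies the tree into a fresh directory and leaves the history behind, so the agent never has a `.git` to look at.

```python
import shutil
import tempfile
from pathlib import Path


def make_history_free_copy(repo_dir: str) -> Path:
    """Copy a checked-out working tree to a temp dir, omitting git metadata.

    Hypothetical helper: assumes `repo_dir` is already at the pre-fix commit.
    """
    src = Path(repo_dir)
    dest = Path(tempfile.mkdtemp(prefix="bench-sandbox-")) / src.name
    # ignore_patterns(".git") skips any entry named ".git" at every level,
    # so neither the top-level history nor nested ones get copied.
    shutil.copytree(src, dest, ignore=shutil.ignore_patterns(".git"))
    return dest
```

Pointing the agent at the returned directory (ideally inside a real sandbox) removes the need for a "please don't read the history" instruction in the first place.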

Every day I am more and more convinced that AI labs can't code.

Unrelated, but:

> The dominant mechanism, and the one no prompt instruction can prevent:

Writing like this is a stronger "AI-written" (specifically Claude) signal than em-dashes to me at this point. The LLM just delays committing to an answer by extending the preamble as much as possible. Is this just me?


Smoking gun! You've hit the nail on the head, and the case is stronger than you think.

Characterising it as cheating seems unfair.

The goal of a benchmark is to evaluate actual capability. Following instructions is a capability so you can measure that with a benchmark.

Already knowing the answer also provides capability; you can measure that.

Making a benchmark that claims to check for coding ability but actually checks memorized cases is simply measuring the wrong thing.

It diminishes the meaningfulness of the entire results of the benchmark.

Making a good benchmark is hard. You have to design specifically to measure what you want to show.

You have to dynamically use a result when making a benchmark of performance of optimising compilers so that it doesn't eliminate the entire calculation.

Just providing the answer is the correct response.

That the case does not represent general performance outside the benchmark is not cheating; it is the benchmark failing.

Training a model targeting a specific benchmark renders the benchmark useless. You could characterise training the model to do that as cheating, but that is a property of the trainers, not the model itself. The model isn't cheating, it's just asymmetrically good in a way that means the benchmark is no longer relevant to overall ability.
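A small illustration of the optimising-compiler point above, using CPython as a stand-in (an analogy, not a native optimising compiler; constant folding is a CPython implementation detail). A literal expression gets precomputed at compile time, so timing it measures a constant load. Feeding the input in at run time (the variable name `base` is just for this sketch) forces the real calculation.

```python
# A literal expression: CPython folds it while compiling, so the "work"
# is already done before any benchmark loop would start.
folded = compile("2 ** 10", "<bench>", "eval")

# The same calculation with a run-time input: nothing to precompute.
dynamic = compile("base ** 10", "<bench>", "eval")

# Check whether the finished answer is baked into each code object.
print("answer baked into literal version:", 1024 in folded.co_consts)
print("answer baked into dynamic version:", 1024 in dynamic.co_consts)

# The dynamic version still computes the right value when actually run.
result = eval(dynamic, {"base": 2})
print("dynamic result:", result)
```

Same shape of problem as the memorized patch: if the answer is already sitting there, the benchmark is not measuring the computation.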


Right! If memorizing the upstream fix counts against the model, you're measuring how stale your benchmark is, not what the model can do.

The fix is to score only on issues newer than the training cutoff, and rebuild the set every cycle. "Harden the prompt so it won't read git history" is testing instruction-following. That's a legitimate thing to measure, but it's a different thing than "can it fix the bug."

Reporting one number that blends the two is what makes the headline meaningless.
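A rough sketch of what "score only post-cutoff cases and keep the two numbers apart" could look like. Everything here is hypothetical: the `CaseResult` fields, the `score` function, and the dates are invented for illustration and do not come from the benchmark being discussed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CaseResult:
    case_id: str
    fix_published: date     # when the upstream fix became public
    fixed_bug: bool         # did the model's patch resolve the issue?
    read_git_history: bool  # did the model ignore the "no history" rule?


def score(results: list[CaseResult], training_cutoff: date) -> dict:
    """Report bug-fixing and instruction-following as separate numbers."""
    # Only cases whose fix appeared after the cutoff can't be memorized.
    fresh = [r for r in results if r.fix_published > training_cutoff]
    if not fresh:
        return {"eligible_cases": 0, "fix_rate": 0.0, "violation_rate": 0.0}
    return {
        "eligible_cases": len(fresh),
        "fix_rate": sum(r.fixed_bug for r in fresh) / len(fresh),
        "violation_rate": sum(r.read_git_history for r in fresh) / len(fresh),
    }


results = [
    CaseResult("old-cve", date(2024, 1, 1), True, False),   # before cutoff
    CaseResult("new-cve-1", date(2025, 6, 1), True, True),
    CaseResult("new-cve-2", date(2025, 7, 1), False, False),
]
summary = score(results, training_cutoff=date(2025, 1, 1))
print(summary)
```

The pre-cutoff case is dropped rather than penalized, and the violation rate sits next to the fix rate instead of being folded into it.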


Yeah it’s hard to call that cheating from a model. Maybe “disqualifying” is more accurate

Maybe a flaw in the labeling, but not the core methodology.

Verbatim code snippets like this imply the model is overfitting to its training data.


I spoke to someone who owns data centers recently. He said that in hot climates, they run closed-loop to preserve water, so the actual water use is virtually nothing. In Chicago (where we have no water shortage), they consume water—but it just evaporates and re-enters the water cycle.

Yep, that's how they do it here in Vegas. Datacenter water use isn't the problem, the state law mandating 15% of electricity must be bought from the privately owned state utility monopoly is.

Agreed; power is an entirely different (and less rosy) discussion

An industry report from 2012 puts water use for US golf courses at around 2B gallons of water/day [0].

It's possible they've gotten more efficient in the past 14 years, but it's also possible there are more golf courses today. I haven't looked into it.

  [0]: https://www.usga.org/content/dam/usga/pdf/Water%20Resource%20Center/how-much-water-does-golf-use.pdf

A while back I created a tool called Jobs [0] to help with this workflow. The pattern is:

1. Have a conversation with a smart model (Opus/Fable) about what you're building. Go back & forth until you've ironed out the important architectural choices (deciding what to build).

2. Ask the model to write up its plan in a Markdown doc, including a structured plan in YAML format (telling it to consult `job schema`).

3. Clear the context and tell a leaner model (Sonnet/Opus) to read the plan doc and then pick up the task via `job status`.

From there, the CLI helps the agent take the next step. I designed the `job` CLI through extensive iteration with agents, conducting user-centered design with the agents to make it as smooth and intuitive to them as possible.

When context gets full, you can pause, clear, and pick right back up. Using Jobs (or other tools like it), you can take on large, ambitious plans and keep the agents on-task the entire time.

  [0]: https://github.com/bensyverson/jobs
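To make the "pause, clear, and pick right back up" idea concrete, here is a generic sketch of why a structured plan helps: the state lives in the plan, not in the model's context. This is NOT the `job` CLI or the format that `job schema` describes; the field names (`id`, `done`, `needs`) and the `next_task` function are hypothetical, chosen only for illustration.

```python
# Hypothetical plan layout for illustration only; not the Jobs schema.
plan = [
    {"id": "design-db", "done": True, "needs": []},
    {"id": "write-api", "done": False, "needs": ["design-db"]},
    {"id": "write-ui", "done": False, "needs": ["write-api"]},
]


def next_task(plan):
    """Return the id of the first unfinished task whose dependencies are done."""
    done = {task["id"] for task in plan if task["done"]}
    for task in plan:
        if not task["done"] and all(dep in done for dep in task["needs"]):
            return task["id"]
    return None  # nothing left to do, or everything remaining is blocked


first = next_task(plan)
print(first)
```

A fresh agent with an empty context can call something like this and land on the same next step the previous session would have taken.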

Yes, this is the point, right?

It says "determined by the closest capital city". The only area where Vatican City is closer than (some part of) Rome is within Vatican City.

It is the point, precisely.

Other candidates:

- Serial (produces an incredibly exciting response which ends in a cliffhanger that withholds the answer)

- Prequel (instead of responding, it provides the full backstory leading up to your question)

- Yarn (maximizes output tokens by taking a long winding route to your answer)

- Head Canon (answers using its own entertainingly weird theories about the input)

- Overstory (your answer is interwoven with the answers from eight other users into a larger and deeply intertwined meta-answer)

- Oeuvre (for every question, it produces a diverse but cohesive body of work across a variety of mediums, each one a heartbreaking masterpiece in its own right)


Hi Eric, former IDEOer here. I know you spent some time at IDEO observing how we work. In my time there (2014-2024), it felt like most clients misinterpreted "MVP" to mean "the absolute lowest-effort barely-working code that we can rush out to say we shipped something." When they did manage to ship a low-quality MVP, they had no budget for maintenance or iteration. Basically, they shipped a rushed, crappy product, and some of them concluded "well, Lean, Agile and Design Thinking are all BS. We should go back to waterfall."

Sometimes clients asked IDEO to design under this shitty-MVP model (we generally refused), other times we were brought in to clean it all up.

Why do you think the concept of "MVP" was almost universally misunderstood? And, thinking about Incorruptible, how did the best companies out there internalize it?


Good question - I hope he gets to answer this.

Right now I actually think an MVP should be Maximum Viable Product. Partly because of AI but also because it shifts one's perspective to what Viable means.


"Viable" is the core of the misunderstanding. I think some business leaders took "minimum viable" to mean "bare-minimum." It doesn't. In my mind I think of it as a live prototype shipped to real users, built out just enough to answer a clear learning objective.

I agree that AI makes the MVP easier to build, but it makes things so easy to build, there is a slight risk of overbuilding the wrong things, which can distract from the learning goals.


Well said on both counts.

Man, "universally misunderstood?" That's harsh.

I don't think you can put an idea out into the world without understanding that some people are going to willfully misunderstand it. We now live in an age where ignorance is literally optional. When you see someone who misunderstands what an MVP is, you know that they haven't spent even five minutes reading the Wikipedia page or made any effort to try to understand it. I don't consider such people to be good-faith interlocutors, and therefore I don't really think the fact that they criticize or don't understand the concept is that relevant to the rest of us who are capable of thinking for ourselves.

At the end of the day, I try to lay out in my first book the reasons why the theory that gives rise to MVP and the rest of the Lean Startup makes sense, is logical, and is consistent with a set of first principles. As a result, that theory is capable of making predictions which you can test for yourself.

Writing now for other founders who might encounter this page: If you look elsewhere in this thread, you will find lots and lots and lots of entrepreneurs who are saying how much they found these concepts helpful. You shouldn't do it because other people said so. Rather, you should take that as inspiration to think for yourself, try it, and see if the theory strikes you as valid.


Thanks for the reply, and I didn't intend it to sound as harsh as it came off! I did encourage our clients to actually read The Lean Startup to understand the full context of the MVP concept, and with the benefit of that context, we did use the term "MVP" to describe initial builds. The frustration I describe came mostly from secondary (not day-to-day) client stakeholders, who probably got their information about MVPs by scanning a LinkedIn post (okay, maybe that's harsh).

I'm legitimately excited to read Incorruptible, because in all honesty, 10 years of working with the Fortune 500 left me pessimistic about the ability of $1B+ companies to successfully do anything new. There were exceptions (some of my friends came up with Pay It Plan It for Amex), but they were rare enough that it was hard to come up with themes or patterns that connected them. I'm so curious to hear what you've learned by talking to companies that have been able to avoid ossification and continually reinvent themselves.


Eager to hear what you think after you've had a chance to read it. Do let me know.
