Hacker News | btown's comments

Never underestimate the power of an LLM that's spent its entire context passing its own self-generated strings to `bash`, to think "maybe the quickest way to get this done is to pass a self-generated string to `bash`."

The S&P 500 may not be a fund itself, but Standard & Poor's is a business whose ability to sell services is correlated with the continued relevance of the S&P 500. It absolutely does balance interests - namely, its own - beyond simply being an academic vehicle for communication of a stable thesis.

It seems entirely reasonable to say: "if we make a certain decision, we correlate both our reputation and a nontrivial portion of the U.S. economy with the whims of one of the most volatile personalities in industry, and we should likely pay attention to this trial balloon that shows such anticipatory fear of the decision that we might lose our reputation as an index altogether."


> absolutely does balance interests - namely, its own - beyond simply being an academic vehicle for communication of a stable thesis

As a business, sure. As a committee, it’s still a deeply technical process. I can say with a lot of confidence that optics weren’t considered in any of this, possibly to a fault.

> and a nontrivial portion of the U.S. economy

This vastly overstates the amount of assets tied to the S&P 500. It’s a lot. But it’s a strong minority of equity exposures.


> I can say with a lot of confidence that optics weren’t considered in any of this, possibly to a fault.

How can you possibly know that? Do the people on that committee have a cast-iron tenure guarantee?


> How can you possibly know that?

I know folks who have been on these. They don’t have tenure. But they’re basically emeritus. If S&P wanted to do something that would cause chaos, it would be fucking with those folks because they made a decision that looks bad.


It’s a public benchmark fund that has much of its value based on its decisions being publicly stable and publicly consistent.

Who would want to invest in a benchmark fund with arcane(the literal term as opposed to mundane) rules that were privately decided? If your statement is accurate it sounds like moving out of such a fund would be prudent. I feel like it’s not accurate, since they are sticking to their guns and not changing the rules to benefit oligarchs like Musk, as Nasdaq is doing.


> Who would want to invest in a benchmark fund with arcane(the literal term as opposed to mundane) rules that were privately decided?

There are lots of rules-based funds. S&P is transparently committee based. It’s why dual-class new entrants are banned, but Google and Berkshire are grandfathered in.

There is a genuine debate on rules versus committees in the index world. But S&P has stuck to its guns as a bastion of the latter. And it works. Everyone picking the S&P 500 over its competitors chooses that.


> Everyone picking the S&P 500 over its competitors chooses that.

I'm fairly confident most people deciding to allocate to s&p trackers have no idea about rules-based vs committee-based governance. They just pick the default. And that default can quickly change if the S&P starts making weird/unpopular decisions in a highly publicized situation.


> most people deciding to allocate to s&p trackers have no idea about rules-based vs committee-based governance. They just pick the default

A lot of retail goes into S&P lookalikes. And at the end of the day, they've consistently picked one over the other.

> that default can quickly change if the S&P starts making weird/unpopular decisions in a highly publicized situation

Unlikely. Nobody has dropped NASDAQ 100-tracking funds. If anything, these guys will see long-term net inflows due to this move. S&P probably would have if they’d changed rules—this was an econometric, not business, decision.


Just an FYI, S&P rolled back the dual-class rule. It was in place from 2017 to 2023.

Doesn't https://advisors.vanguard.com/investments/products/vfiax/van... have $1.7T in AUM alone? That single fund is already a significant part of the $20.8T in AUM across index funds.

https://www.ici.org/research/stats/combined_active_index_042...


There's overlap between strong minority and nontrivial, so not sure how it can be vastly overstated. Do you have numbers you can add to this, or any explanation of equity exposure etc?

> Facts are the meaning pulled out of each episode, stored as subject-predicate-object records with a plain summary and timestamps for when the fact was introduced and when it was invalidated (subject=person, predicate=works_at, object=company). Facts form a graph with typed edges between them: X is in tension with Y, A is derived from B, J supersedes K.

I've always thought that knowledge graphs/expert systems, and even the broader concept of entity-attribute-value storage, got an unfairly bad reputation because of the 1970s/1980s "AI Winter."

And I think that perhaps this reputation is why so much of the oxygen in the RAG space has been consumed by the notion that "RAG = retrieval of fragments by vector similarity."

The difference now from decades ago, of course, is that now LLMs can do both the job of maintaining that graph at scale, and being able to agentically run successive queries to explore for best practices in any situation! And these have reached the scalability where any small business can build and use their own expert system.

I really want to see this approach win, because I think there's such an opportunity to explore even more data structures and approaches from the past and how their impact can be reimagined. If LLMs do indeed approach AGI, it will be in large part due to the ability to use tools (there's some evolutionary irony there, too) - and we should be trying every kind of underlying storage for those tools that we can, standing on the shoulders of giants.

(And curious what database you use for the knowledge graph - those are also a place where we stand on the shoulders of giants!)
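To make the quoted fact model concrete, here is a minimal sketch of subject-predicate-object records with validity timestamps and typed edges. Every name here (`Fact`, `Edge`, `supersede`, `current_facts`) is hypothetical and invented for illustration; the quoted project's actual schema is not shown anywhere in this thread.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    # Hypothetical record shape: subject-predicate-object plus a plain summary.
    id: str
    subject: str
    predicate: str
    object: str
    summary: str
    valid_from: datetime                       # when the fact was introduced
    invalidated_at: Optional[datetime] = None  # when it stopped being true


@dataclass(frozen=True)
class Edge:
    kind: str  # e.g. "in_tension_with", "derived_from", "supersedes"
    src: str   # id of the fact the edge starts from
    dst: str   # id of the fact the edge points to


def supersede(old: Fact, new: Fact, edges: List[Edge]) -> None:
    """Mark `old` as invalidated by `new` and record a typed edge."""
    old.invalidated_at = new.valid_from
    edges.append(Edge("supersedes", new.id, old.id))


def current_facts(facts: List[Fact], at: datetime) -> List[Fact]:
    """Return the facts that were valid at time `at`."""
    return [
        f for f in facts
        if f.valid_from <= at
        and (f.invalidated_at is None or at < f.invalidated_at)
    ]


old = Fact("f1", "alice", "works_at", "Acme", "Alice works at Acme",
           datetime(2023, 1, 1))
new = Fact("f2", "alice", "works_at", "Globex", "Alice works at Globex",
           datetime(2024, 6, 1))
edges: List[Edge] = []
supersede(old, new, edges)
```

Keeping the invalidated fact around, rather than overwriting it, is what lets a later query ask "what did we believe at time T?"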


really great perspective. A lot of techniques from the past aren't conceptually wrong, we just have the tools today to make them efficient. The intuition behind them was always reasonable, if you could amortize the cost of making them work at scale. Appreciate the vote of confidence!

And re: the graph -- Postgres stays king here. There are a lot of fancy database mechanisms for building systems like this, but the convenience of a SQL data structure that can tie the graph into structured metadata is pretty unbeatable. This may evolve with time as well.


Yay for Postgres! Curious if you find yourself using recursive queries in Postgres to traverse the graph - or is there an LLM in the mix that's looking at the "frontier" of relevant facts and choosing whether to go deeper, and whether an entity has an alias?

(Along those lines, I recall lots of this getting messy in a pre-LLM project the moment someone said "merge these two CRM accounts and their histories, but oh whoops turns out they were different all along, and only some of the updates should have applied" - there's a whole set of interesting challenges around attributing EAV when the very notion of object identity evolves over time. Whether a fact is relevant is really a judgment that can only be made with full context - but we now have tools that eat context for breakfast!)


Very sharp questions, love 'em. Yes, your intuition is correct. By default we gather information k layers removed from the "frontier", and then have a shallow agentic step that can determine if we need to go further and at what nodes (essentially doing a graph traversal without a fixed termination condition). Relevance detection is a hard problem; we think we have something good, and we're experimenting/iterating towards something great.
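A rough sketch of that two-phase traversal, under stated assumptions: the graph is a plain adjacency dict, and `choose_next` is a hypothetical stand-in for the agentic step (in practice an LLM call) that picks which already-seen nodes deserve a deeper look. The `max_rounds` cap is my own safety addition and not part of the description above.

```python
from typing import Callable, Dict, Iterable, List, Set

Graph = Dict[str, List[str]]  # node -> neighbours (assumed representation)


def gather_k_hops(graph: Graph, frontier: Iterable[str], k: int) -> Set[str]:
    """Collect every node within k hops of the frontier (breadth-first)."""
    seen = set(frontier)
    layer = set(frontier)
    for _ in range(k):
        layer = {n for node in layer for n in graph.get(node, [])} - seen
        if not layer:
            break
        seen |= layer
    return seen


def explore(
    graph: Graph,
    start: Iterable[str],
    k: int,
    choose_next: Callable[[Set[str]], Iterable[str]],
    max_rounds: int = 5,
) -> Set[str]:
    """Gather k hops by default, then let choose_next decide where to go deeper."""
    seen = gather_k_hops(graph, start, k)
    for _ in range(max_rounds):
        picks = [n for n in choose_next(seen) if n in seen]
        if not picks:
            break  # the chooser says we have enough context
        expanded = gather_k_hops(graph, picks, k)
        if expanded <= seen:
            break  # nothing new was reachable from the chosen nodes
        seen |= expanded
    return seen


graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
# Stub chooser: always asks to expand "b" once it has been seen.
result = explore(graph, ["a"], k=1, choose_next=lambda seen: seen & {"b"})
```

With the stub chooser, the default pass reaches `a` and `b`, and one further round from `b` adds `c`; a real chooser would make that call from the content of the facts rather than from node names.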

On top of this, there's a vast difference between "what do you mean that team spent $1000 on AI in their expense report, what did we get for that?" vs. "oh, the company-wide AWS bill went up by a few percent, let's look into that when we have time." The latter makes projects far more viable.

But note that this difference is the result of bad accounting.

Well, as framed it's bad accounting.

OTOH, the other form is that instead of generic AI spend going up, it is total spending for a particular AWS account within the umbrella of the firm's AWS organization, so that the spending is attributed to a specific project whose use case, other costs, and (presumably) benefit and/or revenue can be considered.

Of course, if your AWS stuff is just one undifferentiated bucket, that’s a problem, but AFAICT AWS (like GCP) is much better set up for tracking use and costs by project than OpenAI (or Anthropic), because it's an enterprise cloud provider where fitting into how large organizations track things at multiple levels is as much a core competency as any technical feature, whereas OpenAI and Anthropic are AI technology providers that are much less mature as enterprise vendors.


There are so, so many things that NPM could do.

It could require a 48 hour cooldown period on any package update that wants to add an install script that didn't have one before, and has a certain number of downloads. And it could publish the list of these so security researchers have an opportunity to scan them.
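As a sketch of how small that rule is, assuming the registry already knows each release's `scripts` map and a download count: the `Release` type, the function names, and the 10,000-download threshold are all hypothetical. The hook names are npm's install-time lifecycle scripts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict

# npm lifecycle scripts that run at install time.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


@dataclass
class Release:
    scripts: Dict[str, str]  # the "scripts" section of package.json
    published_at: datetime


def has_install_script(scripts: Dict[str, str]) -> bool:
    return any(hook in scripts for hook in INSTALL_HOOKS)


def needs_cooldown(previous: Release, candidate: Release,
                   weekly_downloads: int, min_downloads: int = 10_000) -> bool:
    """True when a popular package adds an install script it did not have before."""
    return (
        weekly_downloads >= min_downloads  # hypothetical popularity threshold
        and has_install_script(candidate.scripts)
        and not has_install_script(previous.scripts)
    )


def installable_at(previous: Release, candidate: Release, weekly_downloads: int,
                   cooldown: timedelta = timedelta(hours=48)) -> datetime:
    """Earliest time the candidate release would be served to installers."""
    if needs_cooldown(previous, candidate, weekly_downloads):
        return candidate.published_at + cooldown
    return candidate.published_at


prev = Release({"test": "jest"}, datetime(2025, 1, 1))
cand = Release({"postinstall": "node setup.js"}, datetime(2025, 2, 1, 12))
```

The list of releases for which `needs_cooldown` is true is exactly what would be published for security researchers to scan during the 48 hours.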

It could add an optional key to package.json that allows someone to whitelist which packages can run install scripts.

It could add a Hardened Security program where (1) package maintainers could opt into a program where multi-factor confirmation by maintainers is required on every publish, even those triggered by CI; (2) this hardened package status would be public, and (3) a developer could set a flag in their package.json that causes any npm action to act as if all non-hardened packages had frozen versions.

And so much more.


You realize that "dependency cooldowns" as a popular concept are extremely new, right? npm manages the installation of dependencies for millions upon millions of users across the globe.

> It could add a Hardened Security program where (1) package maintainers could opt into a program where multi-factor confirmation by maintainers is required on every publish, even those triggered by CI;

Great, they did this.

> And so much more.

This shit takes time. Yes, they should have done this on day 1. Acting like any of this is easy to retrofit is just nuts though.


What is being said is that a new flag like '--minimum-release-age' would take, realistically speaking, tops 4 hours to implement (without AI assistance), plus a good 1 week of thorough testing, and maybe a 1 month period of progressive deployment. Come on, let's give it a total of 1.5 months, for good measure.
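For scale, the core of such a flag could be roughly this. It is a sketch only: `pick_version` is a hypothetical helper, the input is an assumed map of version to publish time, and it picks the most recently published eligible version rather than doing real semver range resolution.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def pick_version(publish_times: Dict[str, datetime], now: datetime,
                 minimum_release_age: timedelta) -> Optional[str]:
    """Return the newest version that has been public for at least the minimum age."""
    eligible = {
        version: published
        for version, published in publish_times.items()
        if now - published >= minimum_release_age
    }
    if not eligible:
        return None  # nothing is old enough yet; the caller decides what to do
    return max(eligible, key=eligible.get)


times = {
    "1.0.0": datetime(2025, 1, 1),
    "1.0.1": datetime(2025, 3, 1),
    "1.0.2": datetime(2025, 3, 30),  # too fresh for a 7-day minimum on April 1
}
chosen = pick_version(times, datetime(2025, 4, 1), timedelta(days=7))
```

The hard part is not this filter but the testing and rollout around it, which is what the week and the month in the estimate are for.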

Of course this should have been started at the beginning of the major recent stream of supply chain attacks, circa 2024 or 2025... but even assuming the most backwards calendaring possible - starting after the last big compromise (Axios, on March 31st) - that new flag should have already been shipped a couple weeks ago.

Shit does take time, but where there's a will there's a way, and nobody buys that this shit would take that much time.


Have you ever managed software as critical and ubiquitous as npm?

Not infra, but final product. I know, corporations move slow. But when there is a critical issue, and an actual desire to solve it from someone in a suit, it suddenly turns out that the cogs were always able to speed up and move fast...

Pranks aside, this becomes remarkably scary when you think about all the ways that a malicious/compromised device could cause chaos.

The last screenshot in the OP article mentions that "a browser extension... adding random noise to canvas data" can be detected. Which isn't to say this perfectly detects all such randomization, but it's certainly an active part of the arms race.

Yes, but the idea is that the protection should be part of the browser itself; then it becomes the expected norm AND isn't really "detectable", since there's no extension redefining JavaScript variables. Scraper-friendly solutions like Camoufox or CloakBrowser make such changes to avoid having the same fingerprint every time while still appearing normal.

Or it can run as part of a checkout wizard's "verifying your browser and processing your payment, don't close your tab" step.

At which point it exists solely to punish real human users? What scraper bot is going through checkout?

The credit card tester bots go through the checkout process.

PoW wouldn't be a big issue for them though since their volume is much lower.


Arguably, these vehicles do exist... in the form of 501(c)(3) university endowments. They endow professorships and graduate fellowships, pay for facility buildouts and infrastructure, and provide a strong pipeline of financial aid to allow talented undergraduates to pursue research rather than needing to repay debt immediately after graduation. And unused funds are invested in public and private markets, ensuring minimal waste and sustainable capital growth. And non-profit universities have strong and time-tested governance rules on many if not all of the dimensions specified.

But these very endowments have been special-cased as additionally taxable, despite that status, under the 2025 OBBBA, resulting in research budget cuts [0].

Would independent endowments as you describe them be more immune?

[0] https://www.pbs.org/newshour/education/college-endowment-tax...


The real question is: can you incentivize a non-tokenmaxxing Uber to spend the same amount on AI as they were when tokenmaxxing, just with fewer tokens and higher per-token costs? Even with plateauing improvement in frontier models? I think the answer may be yes.

And part of my reasoning for this is: the only system capable of actually fixing bugs in vibe-created code is an LLM. If we humans couldn't write it without assistance, we certainly won't be able to debug it without assistance. So there's a real stickiness here.

We're signing pacts with demons - we have to, if we want to outcompete the other warlocks - and those pacts are written in the very size of our codebases.

