Hacker News | softwaredoug's comments

Increased defense spending actually makes the US less, not more, safe. Everyone we're going to fight is prepared for an asymmetric, cheap war. We're vulnerable in how much they can make us spend to wage that war. A million dollar patriot missile to shoot down a cheap drone, etc.

I agree to a point.

But also look at Ukraine. They are punching well above their weight with asymmetrical tactics, but Russia is not defeated.

Drones and other cheap, autonomous weaponry change a lot. Smaller states and non-state actors can inflict more serious and expensive damage than ever before.

Large weapons still matter though. If we ever were to enter an existential battle you would quickly see how big, expensive systems can still be advantageous. I am sure people will take issue with this comment but look at the relative restraint of Russia in Ukraine or the US in Iran vs, say, WWII. Modern morality prevents such scale and tactics until it does not. Then suddenly what matters are big weapons and the huge supply chains powering a war machine.

Both the US and Russia are also pivoting heavily towards drones, and they've been developing them for decades. Yes we have big, expensive weapons programs but we also have a lot of stuff ready or soon to be ready which is much, much cheaper.


> I am sure people will take issue with this comment but look at the relative restraint of Russia in Ukraine [...] vs, say, WWII.

They have been bombing civilian infrastructure, abducting children, torturing and executing civilians and POWs, executing deserters or wannabe deserters the entire fucking Ukraine war. See https://en.wikipedia.org/wiki/War_crimes_in_the_Russo-Ukrain...

Restraint, my unbleached asshole.


No one is fire-bombing cities yet, despite Ukraine pulling a WWII Japan and distributing weapons production amongst residences.

Russia has been keeping its expensive equipment in the rear for years now because it's afraid to lose it. It would be fire-bombing cities if it could. Russia has already used white phosphorus in this war. The only reason it isn't killing more civilians with missiles and drones is that it can't build more of them.

> No one is fire-bombing cities yet

That was mainly the Americans, British, and the Germans, not the USSR.

Also, what makes you think they could in this war? Do you think they can send bombers over Ukrainian cities and drop a shitton of ordnance?

The Russians aren't deploying nukes; that is the only actual 'restraint' to date.



Russia has been attacking Ukrainian cities with missiles and drones since the beginning of the conflict. But Russia simply lacks the capacity to fire-bomb cities on a large scale. They only have a handful of operational heavy bombers left and no real ability to manufacture more so they're unwilling to risk them.

The civilian-to-military casualty ratio is about 1:20 for the Russia-Ukraine war and 2:1 for WWII. The difference is huge. Whether this is actual restraint I can't say, but if it quacks like a duck ...

The uncomfortable truth is that is absolutely restraint.

For the areas of Ukraine Russia controls, the Russians have distributed food and in some cases identifying documents.

From what I can tell Russia’s goal is assimilation, not annihilation.

After all is said and done, I suspect some Ukrainians will become Russian citizens or be granted the opportunity to leave.

Or the war will continue forever


Yep, apparently Ukraine still cannot affect fuel production in Russia to any significant degree. Drones carrying less than 100 kg of explosives do not do particularly significant damage. You really need to deliver a ton or more of explosives, and for that you need bombers that can penetrate air defenses, or very expensive stealth cruise missiles, or big ballistic missiles.

Of course it has had a significant impact. The reason Russia has repeatedly halted fuel exports every couple of months for the past couple of years, despite high global prices, is that Ukraine keeps disabling enough of its refining capacity to cause shortages.

Hah. Ukraine has cut Russian petroleum production 40%...

Ukraine dramatically reduced Russian fuel export revenue, and the sanctions did so even more.

It was really coming to the point of urgent existential threat to the Putin regime this spring, before Trump and Netanyahu bailed him out, first by doubling the global oil price and then by relaxing sanctions.

And Ukraine's drone / cruise missile portfolio includes things like the Flamingo, more than twice the payload and range of a Tomahawk.


If Ukraine had access to Tomahawks, the Russian oil industry would not exist at this point. With drones, after two and a half years of attacks with multiple hits on the same refineries, Ukraine has reduced Russian fuel production by at best 20%.

Flamingo is still mostly vaporware. For precise strikes against Russian factories, Ukraine uses either Storm Shadow or the domestic Neptune.

But that just shows again that drones are not particularly effective against most industrial targets and even against oil installations the damage is not lasting.

Or consider how the US was able to destroy that facility in Iran, yet the Crimea bridge and the bridges in Rostov that are absolutely vital to Russian war logistics still stand.


Why do you think this bridge is vital when there is a land bridge (Kherson) with multiple rail links all in Russian controlled territory containing the entry and exit points of the bridge?

That bridge is A) incredibly expensive and something a postwar Ukraine would prefer to exist for economic reasons, B) extremely overbuilt in certain ways, and C) not strictly required if Russia can keep rail going on the landbridge.

It might be in play if the land bridge fell.

It would be almost trivial in terms of range to make it a target of any number of strike munitions. If you can hit the Baltic ports or factories in the Urals...

As for drones vs cruise missiles - at this point every missile strike is associated with drone accompaniment, it's part of the counter SHORAD proposition.


I’m surprised that they are not dropping thermite on oil refineries. Most things there will burn if hit enough.

You think “morality” is what’s preventing the US or Russia from dropping atomic bombs on their smaller targets?

> Modern morality prevents such scale and tactics until it does not.

In the sense that the tide of geopolitics means that if someone tried that they'd mark themselves as a defector in the current scheme of morality and would stand to lose a lot when the rest of Europe inevitably treats that as an example of how they are about to be treated.


Shot exchange is indeed a problem, but it's far more complex than this makes it sound. The opportunity cost of _not_ shooting down the drone isn't the cost of the drone, it's the cost of whatever it's going to destroy if you don't shoot it down.

Sometimes it makes sense to use a million dollar missile to destroy a $5,000 drone, if that drone would otherwise destroy an even more expensive air defense radar or energy production facility. This says nothing about the cost and value of the lives that might be lost in an enemy strike.

We would not be safer if the enemy had cheap drones and we had no weapons capable of fighting back.

The main problem is that air defense interception is incredibly challenging and expensive: a mid-course defensive interceptor needs considerably greater capabilities than the weapon it is intercepting, since it has to catch up to the incoming missile or drone mid-flight.

Sure, this can lead to massive overkill problems. Yes, the US should invest more in the low end of the high/low mix. But no, this does not mean there's no place for the high end, or that they should never be used to destroy lower end targets if that's all that is available.

A more interesting challenge, if you ask me, is in the naval domain. Imagine a capital ship has two options for defending against incoming threats - either fire an expensive and limited stock interceptor missile with a 99% kill chance, or wait until the threat is inside the range of a cheap cannon or laser system with a 95% kill chance. There's a real command level tradeoff to be made here. If you shoot every drone with interceptors, you lose shot exchange badly, and you just run out of interceptors. But if you let every target through into the engagement range of your close range systems, you run the risk that one makes it through to your ship, potentially causing damage and casualties.
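The tradeoff described above can be framed as an expected-cost comparison. Here is a minimal sketch in Python, with all prices and kill probabilities being hypothetical illustration values, not real figures:

```python
# Expected-cost framing of the interceptor-vs-close-in tradeoff above.
# Every number here is a hypothetical illustration value, not real pricing.

def expected_cost(shot_cost, p_kill, leak_damage):
    """Cost of taking the shot plus expected damage if the threat leaks through."""
    return shot_cost + (1 - p_kill) * leak_damage

LEAK_DAMAGE = 500_000_000  # notional cost of a hit on a capital ship

interceptor = expected_cost(shot_cost=2_000_000, p_kill=0.99, leak_damage=LEAK_DAMAGE)
close_in = expected_cost(shot_cost=5_000, p_kill=0.95, leak_damage=LEAK_DAMAGE)

print(f"interceptor: ${interceptor:,.0f}")  # interceptor: $7,000,000
print(f"close-in:    ${close_in:,.0f}")     # close-in:    $25,005,000
```

On these notional numbers the expensive shot actually wins per engagement, because the leak cost dominates; what flips the calculus is magazine depth, since interceptors run out long before a large drone salvo does. That is exactly the command-level tradeoff described above.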

The future of war is going to be wild one way or the other.


> Sometimes it makes sense to use a million dollar missile to destroy a $5,000 drone, if that drone would otherwise destroy an even more expensive air defense radar or energy production facility.

If that $5,000 drone came alone, then sure. But if they launch 200 drones (the money equivalent of one missile) you'd be looking at a totally different picture. They also usually launch a combo: a few missiles and a whole bunch of drones. Even worse.


I disagree on air defense inherently being very costly.

Old school was guns. The price per round was cheap. But since the expensive missile kills the platform holding the cheap gun, you had to go with missiles. The drone war is a different beast entirely: drones can't shoot back. Thus the answer is guns again. How well would their light drones fare against a Cessna armed with an automatic shotgun? How would the jet drones fare against a WWII warbird?

Lots of cheap, mobile guns. No meaningful self-defense, but the doctrine is to always depart after shooting.

The naval one is much harder because you're not free to disperse your ship into many pieces. But, still, consider your cannon. Let's step down a bit: a cheaper cannon with a 90% kill rate, but you mount several of them.


There are videos on the internet of drones being shot down with an assault rifle out of a 50 year old training plane, 1914 style.

Also, it seems that having a very capable military that lets you project power around the world invites that power to be used. See, for instance, the Iran war: quite pointless by all accounts, and it wouldn't have happened if the US didn't have aircraft carriers to send around the world.

So perhaps thriftiness in defense spending would also invite a prioritization in actual defensive capabilities?


I think the likely result would be more war. It wouldn't be with America, but without America providing protection to its allies in the region, the various countries there would probably be emboldened to fight it out themselves. (I'm assuming in this scenario that Russia and other great powers are also incapable of force projection. Obviously Russia is busy right now, but historically they were knee-deep in the Middle East, and much of US involvement now is a legacy of the Cold War.)

Even in a hypothetical situation where the USA had no aircraft carriers our military probably would have conducted some raids to delay Iran building nuclear weapons. The initial strikes against nuclear facilities were done with B-2 bombers launched from Missouri.

Not to mention US air bases dotted all over the Middle East, the Near East, Europe, the Indian Ocean, the Pacific Ocean, the Arctic…

Iran wouldn't have started work on nuclear weapons if Bush hadn't credibly threatened to invade them.

Hell, Iran hadn't actually moved to building them before Trump decided to attack them.


This is laughably false. The leadership in Tehran is smart, and they know that nukes are the most assured path to the regime's self-preservation. (see: North Korea). They've been trying to build nukes since the 80s.

The threat that President Bush issued in 2002 was due to Iran being a state sponsor of terrorist groups, which was true then and is still true today. Historians can argue over whether that threat was a good idea at the time but it's too late to retract it now. We have to deal with the actual situation as it obtains today.

As for what Iran's leadership decided and when, we really have very little visibility into that so don't believe anything you hear. We're not even certain which faction is really in control of nuclear weapons policy. (This isn't an endorsement of the recent attacks.)


That's bullshit. He denounced half of all developing countries for sponsoring terrorism, and forgot to denounce the ones that sponsored the terrorists who had just attacked the US.

> As for what Iran's leadership decided and when, we really have very little visibility into that so don't believe anything you hear.

They had elections at the time, and voted in a candidate promising nuclear weapons the next year. So no, that's lying propaganda again.


Half? There were approximately 133 developing countries (depending on how you count) during the George W. Bush presidency, so please give us a list of 66 that he "denounced" for sponsoring terrorism.

Of course the reality is that going back to 2001 the US government has only ever designated seven countries as state sponsors of terrorism. Those were: Iran, Syria, North Korea, Cuba, Sudan, Libya, and Iraq.

Elections in Iran don't necessarily mean much in terms of nuclear weapons policy. It's not clear whether Mahmoud Ahmadinejad actually had much power to impact weapons development one way or another. The real decision making authority appears to lie elsewhere.


> Also seems that having a very capable military that lets you project power around the world also invites that power to be used.

I assure you that is a much better problem than the alternative.


Thanks for the assurance!

To be fair, the US is making steps into this realm and it's definitely a known issue: their Shahed derivative, laser weapons becoming more ubiquitous. I'm surprised how many drones countries are starting to manufacture. E.g. the UK delivered 150k drones to Ukraine recently; given the current state of the UK armed forces that kind of surprised me, and it definitely shows a change in ethos in how modern first-world militaries will wage war in the future.

There’s credible evidence that the Shahed is itself a derivative of a late 20th century German drone designed as a loitering anti-radar munition.

Which has roots that can be traced to the V-1.

> A million dollar patriot missile to shoot down a cheap drone...

I guess it is a good thing then that this isn't something they actually do.

They use cheap weapons to shoot down cheap drones. Their primary anti-drone missile was developed in the 2010s and costs less than a Shahed.


Yet these cheap and effective weapons failed to protect high value targets, esp. radars.

That's a question of deployment, not capability. They've been used widely in the Middle East against drones since the 2010s with considerable success.

Which system are you talking about?

APKWS.

The US took the old Vietnam-era unguided rocket pods (Hydra 70), of which they produce hundreds of thousands every year, and slapped a dirt-cheap guidance kit to the front of each rocket. Supposedly 90-95% effective. A bunch of countries are developing their own clones of the concept.

A single F-16 can carry 42 missiles. They've been rapidly expanding the number of platforms they can attach these to.


Yes, APKWS is a good solution, but it really wasn't used much for C-UAS before VAMPIRE was shipped to Ukraine.

Yes, a 99% success rate versus something like 600 incoming still means some of them will get through.
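The arithmetic behind that, as a quick sketch (assuming independent engagements, which a real coordinated salvo won't be):

```python
# Leakers from a defended salvo, assuming independent engagements.
# Numbers follow the hypothetical 99% kill rate / 600 drones above.
n, p_kill = 600, 0.99

expected_leakers = n * (1 - p_kill)  # ~6 get through on average
p_at_least_one = 1 - p_kill ** n     # near-certain that at least one leaks

print(round(expected_leakers, 1))  # 6.0
print(round(p_at_least_one, 4))    # 0.9976
```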

Which is the same reason no level of military power is going to keep the Strait of Hormuz open (or at least, no level beyond a truly absurd one and even then - see the Kerch bridge in Crimea).


Iran's stuff is short range.

But Orange Dementia didn't even think about that.


> A million dollar patriot missile to shoot down a cheap drone, etc.

Except this is more propaganda than truth. In general, America does not use Patriots to shoot down drones except in exceptional circumstances.

Not that the economics of missile defense isn't a problem. It can be. But some of it has been highly exaggerated.


The US just blew through a large percentage of its PAC-3/PAC-2 inventory fighting Iran. Other than Patriot, the US doesn't really have much GBAD anymore. A few Avenger systems, some Stinger MANPADs, etc. It's either Patriot or THAAD; and hopefully they're not dumb enough to be using THAAD against drones.

I'm sure they burned through quite a few AMRAAM and Sidewinders doing intercepts as well. Patriot is much more expensive than $1M (try $4M), Stinger is around 250K depending on who the customer is ($750K if you're non-US). AMRAAM is over $1M, Sidewinders $500K.

Even APKWS is $40K, and Shahed prices are around $30K? So even that low-cost option is losing.


There are many different flavors of APKWS; that is the most expensive type. There is real tension between reducing unit costs by maximally leveraging whatever systems the rockets are attached to, rather than putting everything in the guidance package, and the overhead cost of the customization that entails.

They've been experimenting with variants for many years. There is some belief that they may be able to get unit costs down to $5k for some common variants. Everyone believes $10k is achievable.


> Increased defense spending actually makes the US less, not more, safe.

It just makes us spend more money on defense, which is the entire point.

The industry obviously wants more and more profits.

They are never going to recommend getting rid of $200M F-22s and replacing them with 30 $300K drones that would be more effective and cost 5% as much.

That's 5% as much profit for them. They're not interested.

They are interested in profits, not national security.

And as you pointed out, they'd prefer a LESS secure world that inherently demands more money going to security.

You could spend more on security to actually be more secure. It's just that no one with any power is interested in that world.

They're only interested in making more money.


There are so many companies working on this now (cheap anti-drone tech, cheap cruise missiles, cheap missile interceptors), what you're saying is kind of moot.

Is location sharing something you can disable in iOS?

Yes. You can turn it off for Camera if you don't want the geotag to be included in the photo when taken. You can also, as part of the share media picker, opt to include or exclude location data on the photo.

You can also, for example, ask the phone by voice to turn off location services, then take your photos.

As one can imagine, even when turning location services back on, the photo will never contain location data.


Yeah, toggling in any manner you see fit (a Shortcut would be useful in this case) the location services in its totality or in the context of the Camera app would accomplish the same result.

So glad I just pay by the token.

Highly recommend people learn the history of the Industrial Revolution. I recently discovered the Industrial Revolutions Podcast[1] and have been enjoying it. What's happening today isn't unprecedented. The pace of change that's happening IS similar to periods of the industrial revolution.

For example, the spinning jenny, overnight, basically put an entire craft industry of hand spinning into question. Probably more dramatically than anything Claude Code ever did.

It took A LOT, including several world wars, before the brief period of normalcy post-WW2, which is probably the exception, not the rule.

1 - https://industrialrevolutionspod.com/


At the start I think everyone thought the industrial revolution could be a useful historical reference.

But what AI is selling is the obliteration of human knowledge work.

It just isn’t informative for that.


This is the key point. It threatens nearly everything in the limit, not one particular industry. There will be no "leveling up" into higher-order jobs, because the machines will be better at those too.

They thought that too in the industrial revolution. You can look back and see the jobs that came out of it. But at the time, it wasn't obvious to the people affected that there would be jobs again.

We may have hindsight bias in evaluating something that happened, but to the people that it happened to it was terrifying.


MIT's motto is mens et manus: mind and hand. These are, basically, the two primary attributes of human labor. They're the reason almost anyone gets hired to do anything. Our brains and our opposable thumbs are what set us ahead of the animal kingdom.

The industrial revolution first attempted to replace our hands. But the labor that was displaced had places to go: into smaller-scale manual work, where mass-production machinery was too expensive, and into knowledge work.

Now the AI is coming for knowledge work, and robots are getting better at small-scale work. We're not at that point yet, but looking down the road I'm not sure there will really be anything competitive left that flesh-and-blood humans can offer an employer.

The only exceptions I can think of are, maybe, athletics, live music performances, and escort services. But with only a few wealthy people as customers, I don't think there will be many job opportunities even in those fields. And I'm not sure that robots won't come for those too.


Why do you need a job in the scenario where machines have replaced all human labor?

You're forgetting that work is a means, not an end, for humanity.


You don't need a job, if you can maintain purchasing power without selling your labor. I didn't forget that, I just didn't take such an outcome for granted.

In other comments I have expressed support for UBI, as well as for paying parents to stay home and spend time with their children. I think the more automated our society gets, the less people should need to work to earn a living. But I look around and I just don't see anyone implementing such policies.


> But I look around and I just don't see anyone implementing such policies.

Because we have low unemployment. As long as we have jobs for people, you shouldn't expect UBI.


Again, this betrays a strong hindsight bias.

Nobody had any idea what was coming with the industrial revolution. There wasn't obviously other work for people. And for long periods of time nobody had an answer to that question for a large percentage of the population.

In hindsight, we know the answers NOW, but then they did not know what was going to happen. We also don't know what's going to happen; it could go as you hypothesize. Or the Jevons paradox people might be right and there's way more work to do.

The uncertainty is the historical lesson, not that "it'll all work out"


Your comment betrays a lazy survivorship bias.

I guess the people in Wall-E didn't really seem unhappy so perhaps you're right. My gut instinct though, is that there is a qualitative difference in the level of abundance and concentration of wealth, power, and influence we have today that needs to be taken seriously on its merits and not hand-waved away with tenuous historical analogies.

Yes, two hundred years ago, many people thought reading was a dangerous distraction for young people, just as film, radio, TV and the internet became later. But there is a qualitative difference to having social media in your pocket with vibrating notifications. Pretending it's just more of the same honestly feels like slightly willful blindness at this point.


A guild craftsman weaving or making firearms pre-Industrial Revolution would have seen their work as being as much about “knowledge” as manual labor.

Much of that got obliterated by automation.

History doesn’t repeat itself, but it certainly rhymes


Nah, it's obliterating the distinction -- made by middle class folks and only temporarily true -- between physical labour and intellectual labour.

You as a blue collar machine operator, shoving punch cards in and getting answers out, is precisely what your boss always saw you as, or wanted you to be.

Our necessity as pseudo-craftsmen holding an intellectual high ground and wizardly/magical skills was always resented by investors, owners, and sometimes customers.

Blacksmithing and leather tanning and shoe making and seamstressing and furniture making was human knowledge work, too.

The Alvin Toffler stuff was always bullshit, but it's even more bullshit now.


The jobless French artisans who couldn't compete with English industrial imports were one of the causes of the French Revolution.

Thomas Piketty does indeed argue in Capital in the Twenty-First Century that the post-WW2 period is an exception, with inequality lower than it historically was, and that we are now reverting to the mean of greater inequality. Yet people bemoan the idea of not being able to live off a single job when in reality that was never guaranteed.

Still, that's no reason to accept a return to that state of affairs. Progress happens slowly and imperceptibly, but on a grand scale it does happen.

Much as we'd like that to be true, does it actually happen (in terms of inequality reducing)? I see no evidence of that; it ebbs and flows across time periods and civilizations. One can try to resist that reversion to the mean, but historically they'd be proven wrong.

For a start, by not tearing down the systems that kept inequality in check in the past, like union membership or banking regulation, just to name some examples.

The tearing down of those institutions IS the ebb.

That's like accepting that burnt coffee is the average, so why try to make a good-tasting cup. Nonsense.

The circumstances are similar, but the people are different. Look outside: this is WALL-E.

I recently heard the saying “if you wait to the last minute, it’ll only take you a minute!”

For some tasks I have, I need to live this way. Obsessing over a task next week takes away from what I’m doing now. I have to trust in my ability to pull it together efficiently. And when it’s due, the work often needs to be fresh in my mind - not something from weeks ago.

Sometimes that means unexpected late nights. But it’s mostly worked out.


This is why I just use OpenCode and see the dollar amount of tokens next to my session. I can decide if the task is worth the cost. Or if I should hand code / use a cheaper model.

The real thing I think people are rediscovering with file-system-based search is that there’s a type of semantic search that’s not embedding-based retrieval. One that looks more like how a librarian organizes books onto shelves by domain.

We’re rediscovering forms of search we’ve known about for decades. And it turns out they’re more interpretable to agents.

https://softwaredoug.com/blog/2026/01/08/semantic-search-wit...
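A minimal sketch of that librarian-style idea, with invented directory and file names: documents are filed into a topic hierarchy, and an agent can navigate it with ordinary file operations instead of a vector index.

```python
# Librarian-style retrieval: a hand-built topic hierarchy that an agent
# can walk with plain file operations. All names are invented examples.
import os
import tempfile

docs = {
    "search/lexical/bm25.md": "BM25 ranks documents by term statistics.",
    "search/semantic/embeddings.md": "Dense vectors place similar texts nearby.",
    "agents/tools/grep.md": "Agents can grep a corpus directly.",
}

root = tempfile.mkdtemp()
for path, text in docs.items():
    full = os.path.join(root, path)
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as f:
        f.write(text)

# The agent "browses the shelves": list top-level topics, then descend.
print(sorted(os.listdir(root)))                          # ['agents', 'search']
print(sorted(os.listdir(os.path.join(root, "search"))))  # ['lexical', 'semantic']
```

The folder names themselves carry the semantics, which is why an agent reading directory listings can reason about where to look in a way it can't with opaque vector IDs.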


Someone simply assumed at some point that RAG must be based on vector search, and everyone followed.


It’s something of a historical accident

We started with LLMs when everyone in search was building question answering systems. Those architectures look like the vector DB + chunking we associate with RAG.

Agents’ ability to call tools, using any retrieval backend, calls that into question.

We really shouldn’t start RAG with the assumption that we need that. I’ll be speaking on the subject in a few weeks:

https://maven.com/p/7105dc/rag-is-the-what-agentic-search-is...


Right. R in RAG stands for retrieval, and for a brief moment initially, it meant just that: any kind of tool call that retrieves information based on query, whether that was web search, or RDBMS query, or grep call, or asking someone to look up an address in a phone book. Nothing in RAG implies vector search and text embeddings (beyond those in the LLM itself), yet somehow people married the acronym to one very particular implementation of the idea.


Yeah, there's a weird thing where people get really focused on whether something is "actually doing RAG" when it's pulling in all sorts of outside information, just not via purpose-built RAG tooling or embeddings.

Now, the pendulum on that general concept seems to be swinging the opposite direction where a lot of those people just figured out that you don't need embeddings. That's true, but I'd suggest that people don't overindex on thinking that means embeddings are not actually useful or valuable. Embeddings can be downright magical in what you can build with them, they're just one more tool at your disposal.

You can mix and match these things, too! Indexing your documents into semantically nested folders for agents to peruse? Try chunking and/or summarizing each one, and putting the vectors in sidecar files, or even YAML frontmatter. Disks are fast these days; you can rip through a lot of files indexed like that before you come close to needing something more sophisticated.
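A sketch of that sidecar-file idea. Each document keeps its embedding in a JSON sidecar next to it, and search is a brute-force cosine scan over disk. The three-dimensional "embeddings", the filenames, and the `.vec` suffix are all invented for illustration; a real system would call an embedding model.

```python
# "Vectors in sidecar files": each document stores its embedding in a JSON
# sidecar next to it; search is a brute-force cosine scan over the directory.
# The 3-d vectors are fake stand-ins for real embedding-model output.
import json
import math
import os
import tempfile

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

root = tempfile.mkdtemp()
corpus = {
    "dragons.md": [0.9, 0.1, 0.0],
    "romance.md": [0.1, 0.9, 0.1],
    "biography.md": [0.0, 0.2, 0.9],
}
for name, vec in corpus.items():
    with open(os.path.join(root, name), "w") as f:
        f.write(f"contents of {name}")
    with open(os.path.join(root, name + ".vec"), "w") as f:
        json.dump(vec, f)

def search(query_vec, k=2):
    scored = []
    for fn in os.listdir(root):
        if fn.endswith(".vec"):
            with open(os.path.join(root, fn)) as f:
                scored.append((cosine(query_vec, json.load(f)), fn[:-4]))
    scored.sort(reverse=True)  # highest cosine similarity first
    return [name for _, name in scored[:k]]

print(search([0.8, 0.2, 0.0]))  # ['dragons.md', 'romance.md']
```

A linear scan like this handles a surprising number of files before an ANN index is worth the operational cost.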


> yet somehow people married the acronym to one very particular implementation of the idea.

Likely due to the rise in popularity of semantic search via LLM embeddings, which for some reason became the main selling point for RAG. Meanwhile keyword search has existed for decades.


I'm still using the old definition, never got the memo.


That’s OK. Most got REST wrong, too.


You seem like someone who knows what they're doing, and I understand the theoretical underpinnings of LLMs (math background), but I have little kids that were born in 2016 and so the entire AI thing has left me in the dust. Never any time to even experiment.

I am active in fandoms and want to create a search where someone can ask "what was that fanfic where XYZ happened?" and get an answer back in the form of links to fanfiction that are responsive.

This is a RAG system, right? I understand I need an actual model (that's something like Ollama), the thing that trawls the fanfiction archive and inserts whatever it's supposed to insert into one of these vector DBs, and a front-facing thing I write that takes a user query, sends it to Ollama, which can then search the vector DB and return results.

Or something like that.

Is it a RAG system that solves my use case? And if so, what software might I go about using to provide this service to me and my friends? I'm assuming it's pretty low in resource usage since it's just text indexing (maybe indexing new stuff once a week).

The goal is self-hosting. I don't wanna be making monthly payments indefinitely for some silly little thing I'm doing for me and my friends.

I am just a stay at home dad these days and don't have anyone to ask. I'm totally out of the tech game for a few years now. I hope that you could respond (or someone else could), and maybe it will help other people.

There's just so many moving parts these days that I can't even hope to keep up. (It's been rather annoying to be totally unable to ride this tech wave the way I've done in the past; watching it all blow by me is disheartening).


In the definition of RAG discussed here, the workflow looks something like this (simplified for brevity): When you send your query to the server, it first normalises the words, then converts them to vectors, or embeddings, using an embedding model (there are also plain stochastic mechanisms to do this, but today most people mean a purpose-built LLM). An embedding is essentially an array of numeric coordinates in a high-dimensional space, e.g. [1, 2.522, …, -0.119]. The server can now use that to search a database of arbitrary documents with pre-generated embeddings of their own. Those embeddings are usually generated when the documents are inserted into the database, following the same process as for your search query, so every record in the database has its own discrete set of embeddings to be queried during searches.

The important part here is that you now don’t have to compare strings anymore (like looking for occurrences of the word "fanfiction" in the title and content), but instead you can perform arbitrary mathematical operations to compare query embeddings to stored embeddings: 1 is closer to 3 than 7, and in the same way, fanfiction is closer to romance than it is to biography. Now, if you rank documents by that proximity and take the top 10 or so, you end up with the documents most similar to your query, and thus the most relevant.

That is the R in RAG; the A as in Augmentation happens when, before forwarding the search query to an LLM, you also add all the results that came back from your vector database, with a prefix like "the following records may be relevant to answer the user's request". And that brings us to G as in Generation, since the LLM now responds to the question aided by a limited set of relevant entries from a database, which should allow it to yield very relevant responses.
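The R-A-G steps above can be sketched in a few lines of Python. To keep it self-contained, embed() here is a toy letter-frequency stand-in for a real embedding model, so treat this as an illustration of the shape of the pipeline, not production code:

```python
import math

def embed(text):
    # Toy "embedding": normalised letter frequencies as a 26-dim vector.
    # A real system would call an embedding model here instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are unit-length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# Documents get their embeddings once, at insert time.
docs = ["a romance fanfiction story",
        "a biography of a scientist",
        "romantic fiction written by fans"]
index = [(d, embed(d)) for d in docs]

# Retrieval: rank stored embeddings by proximity to the query embedding.
query = "fanfiction"
q = embed(query)
top = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)[:2]

# Augmentation: prepend the retrieved records to the prompt for the LLM.
prompt = ("The following records may be relevant to answer the user's "
          "request:\n" + "\n".join(d for d, _ in top) +
          "\n\nUser request: " + query)
```

The generation step would then just be sending `prompt` to whatever LLM you host.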

I hope this helps :-)


I think the example you give is a little backwards — a RAG system searches for relevant content before sending anything to the LLM, and includes any content retrieved this way in the generative prompt. User query -> search -> results -> user query + search results passed in same context to LLM.


Honestly, just from this question, I think you know enough that I’d go spend $20/month for a subscription to Codex, Claude Code, or Cursor, and ask them to teach you all this. I bet if you put in your comment verbatim with Opus 4.6 and went back and forth a bit, it could help you figure out exactly what you need and build a first version in a couple hours. Seriously, if you know the fundamentals and can poke and prod, these tools are amazing for helping expand your knowledge base. And constraints like how much you want to pay are excellent for steering the models. Seriously, just try it!


You don't need to pay an external crowd for that.

You can run Claude Code against a local instance of a ~recent Ollama just fine, and it'll do the teaching job perfectly well with (say) Qwen 3.5.

Doesn't even need to be one of the large models; one of the mid-size ones that fits in ~16GB of RAM when given a 128k+ context size should be fine.


> Honestly, just from this question, I think you know enough that I’d go spend $20/month for a subscription to Codex, Claude Code, or Cursor, and ask them to teach you all this.

Paying $20/m sounds like overkill. I have tabs open for all of the most well-known AI chatbots, and despite trying my hardest, I haven't been able to exhaust the free options just by learning.

Hell, just on the chatbots alone, small projects can be vibe-coded too! No $20/m necessary.


Yeah, but when it comes to actually building stuff, using Codex is night and day different from using ChatGPT.


> Yeah, but when it comes to actually building stuff, using Codex is night and day different from using ChatGPT.

Sure, but that wasn't what you recommended Codex for, was it?

>>> Honestly, just from this question, I think you know enough that I’d go spend $20/month for a subscription to Codex, Claude Code, or Cursor, and ask them to teach you all this.


Stuck it on my calendar, looking forward to it.


We were given a demo of a vector based approach, and it didn't work. They said our docs were too big and for some reason their chunking process was failing. So we ended up using a good old-fashioned Elastic backend because that's what we know, and simply forwarding a few of these giant documents to the LLM verbatim along with the user's question. The results have been great: not a single complaint about accuracy, results are fast and cheap using OpenAI's micro models, and Elastic is mature tech everyone understands, so it's easy to maintain.
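The "keyword search, then forward the documents verbatim" approach can be sketched roughly like this; search_docs is a crude stand-in for the actual Elastic query, and the docs and question are made up for illustration:

```python
def search_docs(question, docs, k=2):
    # Crude keyword scoring: count how many query terms appear in each doc.
    # A real Elastic backend would do BM25 scoring instead.
    terms = set(question.lower().split())
    scored = [(sum(t in d.lower() for t in terms), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

docs = [
    "Billing guide: invoices are generated on the first of each month.",
    "API guide: authenticate with a bearer token in the header.",
    "Billing guide: refunds are processed within five business days.",
]
question = "when are invoices generated"
hits = search_docs(question, docs)

# Forward the matching documents verbatim, plus the user's question,
# as one prompt to the LLM. No chunking, no embeddings.
prompt = "\n\n".join(hits) + "\n\nQuestion: " + question
```

The appeal is that every moving part here is ordinary, debuggable search infrastructure; the only LLM-specific step is string concatenation at the end.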

I think this turned out to be one of those lessons about premature optimization. It didn't need to be as complex as what people initially assumed. Perhaps with older models it would have been a different story.


> They said our docs were too big and for some reason their chunking process was failing.

Why would the size of your docs have any bearing on whether or not the chunking process works? That makes no sense. Unless of course they're operating on the document entirely in memory which seems not very bright unless you're very confident of the maximum size of document you're going to be dealing with.

(I implemented a RAG process from scratch a few weeks ago, having never done so before. For our use case it's actually not that hard. Not trivial, but not that hard. I realise there are now SaaS RAG solutions but we have almost no budget and, in any case, data residence is a huge concern for us, and to get control of that you generally have to go for the expensive Enterprise tier.)


I agree it makes no sense. The whole point of chunking is to handle large documents. If your chunking system fails because a document is too big, that seems like a pretty glaring omission. I just chalked it up to the tech being new and novel and therefore having more bugs/people not fully understanding how it worked/etc. It was a vendor and they never gave us more details.

Not all problems have to be solved. We just fell back to using older, more proven technology, started with the simplest implementation and iterated as needed, and the result was great.


That's good. I think if you can get the result you need with a technology that's already familiar to you then, in cases where that tech is still supported, that's going to be a win.

RAG worked well for us in this recent case but, in 3+ years of developing LLM backed solutions, it's the first time I've had to reach for it.


Doesn't have to be tho, I've had great success letting an agent loose on an Apache Lucene instance. Turns out LLMs are great at building queries.


I don't think this was a simple assumption. LLMs used to be much dumber! GPT-3-era LLMs were not good at grep, they were not that good at recovering from errors, and they were not good at making follow-up queries over multiple turns of search. Multiple breakthroughs in code generation, tool use, and reasoning had to happen on the model side to make vector-based RAG look like unnecessary complexity.


It was the terminology that did that more than anything. The term 'RAG' just has a lot of consequential baggage. Unfortunately.


Certainly a lot of blog posts followed. Not sure that “everyone” was so blinkered.


RAG is like when you want someone to know something they're not quite getting so you yell a bit louder. For a workflow that's mainly search based, it's useful to keep things grounded.

Less useful in other contexts, unless you move away from traditional chunked embeddings and into things like graphs where the relationships provide constraints as much as additional grounding


My intuition is that since AI assistants are fictional characters in a story being autocompleted by an LLM, mechanisms that are interpretable as human interactions with language and appear in the pretraining data have a surprising advantage over mechanisms that are more like speculation about how the brain works or abstract concepts.


This is also why LLMs get 80% of the way there and crap out on logic. They were trained on all the open source abandonware on GitHub.


Similar effort with PageIndex [1], which basically creates a table of contents like tree. Then an LLM traverses the tree to figure out which chunks are relevant for the context in the prompt.

1: https://github.com/VectifyAI/PageIndex


I spent a while working on a retrieval system for LLMs and ended up reinventing a concordance (which is like an index).

It's basically the same thing as Google's inverted index, which is how Google search works.

Nothing new under the sun :)


This kind of circles back to ontological NLP, that was using knowledge representation as a primitive for language processing. There is _a ton_ of work in that direction.


Exactly. And LLMs supervised by domain experts unlock a lot of capabilities to help with these types of knowledge organization problems.


Exactly. Traditional library science truly captured deep patterns of information architecture.

https://x.com/wibomd/status/1818305066303910006

Disney got this right in Ralph Breaks the Internet.

https://x.com/wibomd/status/1827067434794127648


> Our documentation was already indexed, chunked, and stored in a Chroma database to power our search, so we built ChromaFs

It's obvious from that sentence that these guys neither understand RAG nor realized that the solution to their agentic problem didn't need any of these further abstractions, including vectors or grep.


I've got to say, people also seem to be missing really simple tricks with RAG that help. Using longer chunks and appending the file path to the chunk makes a big difference.

Having said that, generally agree that keyword searching via rg and using the folder structure is easier and better.


> I got to say people also seem to be missing really simple tricks with RAG that help. Using longer chunks and appending the file path to the chunk makes a big difference.
>
> Having said that, generally agree that keyword searching via rg and using the folder structure is easier and better.

It depends on the task, no? Codebase RAG, for example, arguably has a different setup than text search. I wonder how much the FS-"native" embedding would help.


I think it's cool that LLMs can effectively do this kind of categorization on the fly at relatively large scale. When you give the LLM tools beyond just "search", it really is effectively cheating.


Aren’t most successful RAGs using a combination of embedding similarity + BM25 + reranking? I thought there were very few RAGs that only did pure embedding similarity, but I may be mistaken.
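One common way to do that combination is reciprocal rank fusion (RRF), which merges the embedding-similarity and BM25 result lists before any reranking step. A minimal sketch, with the two input rankings hard-coded for illustration:

```python
def rrf(rankings, k=60):
    # Each ranking is a list of doc ids, best first. RRF scores a doc by
    # summing 1 / (k + rank) over every list it appears in; k=60 is the
    # constant commonly used in the literature.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

embedding_hits = ["doc_a", "doc_c", "doc_b"]   # ranked by vector similarity
bm25_hits      = ["doc_b", "doc_a", "doc_d"]   # ranked by keyword score
fused = rrf([embedding_hits, bm25_hits])
```

Documents that rank well in both lists (doc_a here) float to the top; a cross-encoder reranker would then typically rescore just the fused top-k.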


Inverted indexes have the major advantage of supporting Boolean operators.
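That advantage falls out of the data structure itself: a toy inverted index maps each term to a set of doc ids, and AND/OR are then just set intersection and union.

```python
docs = {
    1: "rust borrow checker tutorial",
    2: "python tutorial for beginners",
    3: "rust async tutorial",
}

# Build the inverted index: term -> set of doc ids containing it.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

# "rust AND tutorial" is an intersection; "rust OR python" is a union.
rust_and_tutorial = index["rust"] & index["tutorial"]
rust_or_python = index["rust"] | index["python"]
```

Embedding similarity gives you no equivalent of this: there is no clean way to say "must contain X but not Y" in cosine space.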


And next, we’ll get to tag based file systems


Yep, I was using RAG for all sorts of stuff and now moved everything to just rg+fd+cd+ls, much faster, easier, etc.


More and more often you see "new discoveries" that are very old concepts. The only discovery that usually happens is the author discovering the concept for himself, but nowadays it seems essential to post it as if you'd discovered something new.


Turns out the millions of people in knowledge work aren't librarians and they wing shit everywhere


[flagged]


I built tilth (https://github.com/jahala/tilth) much for this reason. Couldn't be bothered with RAG, but the agents kept using too many tokens - and too many turns - to find what they needed. So I combined ripgrep and tree-sitter and some fiddly bits, and now agents find things faster and with ~40% less token use (benchmarked).


There are a lot of methods in IR/RAG that maintain structure as metadata and use it in a hybrid fusion to augment search. Graph databases are an extreme form, but some RAG pipelines pull out the metadata and embed it together with the chunk. In the specific case of code, other layered approaches like ColGrep (late interaction) show promise. The point is that most search, most of the time, will benefit from a combination of approaches rather than a silver bullet.


Just like the approach in the article.

Everything is based on the metadata stored with chunks, just allowing the agent to navigate that metadata through ls, cd, find and grep.


> Switched to just letting the agent browse the directory tree and read files on demand -- it figured out the module structure in about 30 seconds

Guess what the difference is between code and loosely structured text...


[flagged]


Parent may or may not be AI generated or AI edited. As such it MAY breach one of the HN commenting guidelines.

Your comment however definitely breaches several of them.


Understood. I’m willing to defy guidelines and take the consequences. I still think it’s worth pointing out slop so people don’t waste their time talking to a machine.

indeed. moltbook vibes


I'd rather read a hundred comments like that than one more like yours.


Each to their own. I appreciate you writing that comment yourself.

It’s not the noncompetes that are the problem, it’s confidentiality agreements with extremely broad language.

Learn about the legal principle of “inevitable disclosure”. It’s the idea that you can’t work for a competitor because you can’t help but violate the NDA.


I haven't heard much about it, but I am incredibly curious about how this is currently shaking out in the AI craze.

It seems these labs are revolving doors, and any kind of breakthrough knowledge would immediately make you incredibly valuable to other labs or incredibly valuable as a spinoff start-up. Never mind these researchers all knowing each other and certainly having more than a few common spaces (digital or IRL). And the excitement of working in a fresh field still littered with low hanging fruit.

I can't help but feel that a large part of the reason why the labs are neck and neck is because everyone is talking to everyone else.

I can't substantiate any of this though, it seems to have largely dodged anything besides internal conversation.


They're all in California where the law is very pro-employee. As long as you're not taking actual documents or code with you, there's nothing your former employer can do about what's in your head.


This is a huge part of how SV as a whole works. People figure out what works and point out how to do things better at their next roles. It's mostly a good thing. The main downside is that it exacerbates tendencies to cargo cult apply solutions for problems that come from a particular organizational scale to orgs without them.


Inevitably, it's just the need for lawyers to intervene in "common sense" negotiations. It may never be legal to do X, Y, Z, but if the business has all the lawyers and the employee has none, then it doesn't really matter what's legal; it's about who's willing to exhaust the cash to fight the issue.

Which of course, is why unions are what's needed to properly negotiate employee-employer relationships, the same way a strong government is needed to negotiate corporate-civil relationships.

Americans, however, have decided that "individual freedom" is _soooooo_ valuable, that it only exists for people with enough cash to defend it.


Have fun trying that in CA.


“The Great Depression: A Diary” is a great day-by-day, first-person account of someone living through the Depression. It’s also a reminder that we don’t have a monopoly on insane politics.

https://www.goodreads.com/book/show/6601224-the-great-depres...


I read this more than 10 years ago, so I don't remember a lot, but I do appreciate it for being the only account of the crash that doesn't have historical hindsight. It was interesting to hear someone trying to make sense of things on a near daily basis during the fog of uncertainty. It makes me want to find other such accounts of historical events without the inevitable-seeming cause and effect sequence of events you normally read about history.


Check out The Demon of Unrest, which covers the few months before the American Civil War. It follows Lincoln, other Union leaders, and Confederate leaders, and their understandings and misunderstandings of what the other side thought.

It's remarkable what assumptions people can make without talking to people from other places.


George Orwell's Homage to Catalonia is about his experience in the Spanish Civil War, published in 1938. He was there between '36 and '37 I think. It's pre WWII, and I found it very interesting for the same reason you say here: his account doesn't have the benefit of hindsight. The civil war wasn't even over when the book was published. It's very interesting to see his perspective, what things he saw coming, and what things he didn't.


The Wind is Rising, by HM Tomlinson. It's a diary of the first year or so of the second world war. It has an unforgettable first line: "All we hear from Berlin is the music of marrow bones and cleavers," and is similarly vivid throughout.

It looks like you can borrow it from archive.org, but I suggest buying a physical copy. It was printed in 1941 - and I don't believe ever had a second edition - so it's on thin, wartime paper, which adds to the experience of reading it. It's like something pulled out of a time-capsule, a tangible relic of the time it covers.


Interesting how there is so little information about this book online. It’s a good reminder of how a ton of stuff basically still isn’t on the internet and is still only accessible in old books.


(in no small part due to copyright law)


> It has an unforgettable first line: "All we hear from Berlin is the music of marrow bones and cleavers," and is similarly vivid throughout.

A nice example of the power of media to bring something to life in the reader.


Commenting to save this for later.


There's the diary of Samuel Pepys, 1660-1669, which covers the Great Plague of London, the Second Anglo-Dutch War and the Great Fire of London.


Good recommendations here, thanks. I was aware of Orwell's account of the Spanish civil war, so maybe I'll start there.


A book along similar lines: https://www.amazon.com/Not-Nickel-Spare-Sally-Cohen/dp/04399...

(haven’t read it yet so I can’t vouch for it)


I have some algorithms I absolutely must know. So I’m hand coding them and asking the agent to critique me.

I do a very similar thing in writing - I need feedback, don’t rewrite this!

In both cases I need the struggle of editing / failing to arrive at a deeper understanding.

The future dev will need to know when to hand code vs when to not waste your time. And the advantage will still go to the person willing to experience struggle to understand what they need to.

