My observation has been that there are a lot of personal styles to engaging with the LLMs that work, and "hold the hand" vs "in-depth plan" vs "combination" doesn't really matter. There is some minimum level of engagement required for non-trivial tasks, and whether that engagement comes mid-development, at the early design phase, or after isn't really that big of a deal. E.g., "just enough planning" is a fine way of approaching the problem if you're going to be in the loop once the implementation starts.

Any higher yield comes from higher risk. If a startup feels it is not risky enough already and really wants higher yield for higher risk, it can just put the money in a bond ETF that suits its risk appetite. Crazy that YC funds things that make a simple thing more complex and more costly for zero upside.

Most prompt engineering is done by changing words, running the model, and squinting at the output. Over the past few weeks I've built this toolkit which lets you measure what's actually happening inside the model instead.

You define regions of your prompt (instructions, examples, constraints, whatever), run the pipeline on any HuggingFace model, and get back per-layer attention heatmaps, cooking curves showing how attention to each region evolves through the network, and logit lens snapshots. Supports Llama, Qwen, Mistral, Gemma out of the box. Self-contained engine script you can scp to a GPU box and run with no dependencies beyond transformers. The repo is designed so that Claude can handle the whole pipeline end-to-end including interpreting results in a grounded domain-specific way.

I built it to tune system prompts for another project and realized the general approach was useful enough to extract. The "before and after" comparison tooling ended up being the part I use most.
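
For anyone wondering what a per-region curve boils down to, here is a minimal sketch, not the toolkit's actual code. It assumes you already have one head-averaged attention matrix per layer (with transformers, passing output_attentions=True to the forward call returns one tensor per layer of shape (batch, heads, seq, seq), which you would average over heads first). The helper name region_attention_curves, the region names, and the toy numbers are all made up for illustration.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]   # one layer: attention[query][key], each row sums to 1
Region = Tuple[int, int]     # half-open token span [start, end)


def region_attention_curves(
    layers: List[Matrix],
    regions: Dict[str, Region],
    query_pos: int = -1,
) -> Dict[str, List[float]]:
    """For each region, return the attention mass it receives from the
    token at query_pos, one value per layer (hypothetical helper)."""
    curves: Dict[str, List[float]] = {name: [] for name in regions}
    for layer in layers:
        row = layer[query_pos]  # attention paid by the chosen query token
        for name, (start, end) in regions.items():
            curves[name].append(sum(row[start:end]))
    return curves


# Toy input: 2 layers, 4 tokens. Only the last row is read when query_pos=-1.
toy_layers = [
    [[1.0, 0.0, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.2, 0.3, 0.5, 0.0],
     [0.1, 0.2, 0.3, 0.4]],
    [[1.0, 0.0, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.2, 0.3, 0.5, 0.0],
     [0.4, 0.4, 0.1, 0.1]],
]
toy_regions = {"instructions": (0, 2), "examples": (2, 4)}

curves = region_attention_curves(toy_layers, toy_regions)
for name, values in curves.items():
    print(name, [round(v, 3) for v in values])
```

Plotting each list against the layer index gives one curve per region, and running the same thing on two prompt variants gives a crude before/after comparison.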


I have been having a lot of success with Cursor. I like being able to switch between Anthropic and OpenAI models. Claude Code does give way more tokens/$ than Cursor right now though.

They are two orthogonal issues. One doesn't make the other irrelevant.

> Software is everywhere and thus the value of maintaining software and the value of software engineering remains high.

This is an unfinished argument. What if we get coding agents to maintain software? What if frequent rewriting becomes cheap enough? Something that's a tenth or one hundredth of your salary doesn't have to be good to make for a good business decision. Why do you think every native application has been replaced by slop made up of 10 layers of JS frameworks on top of electron? Nothing matters as long as the product is cheap and fast to pump out, barely works on modern hardware, and makes dough.

> AI does not reduce software, it increases the amount of software.

There's not infinite demand for software. If AI inference costs take 50% of the prior payroll expenses, while making a company twice as efficient, that means we need 4 times as much demand in software engineering at the same salary for everyone to keep their job. What new or improved subscription, app, website, device, or other software product does the world need right now? 99.9% of people use the same 5 apps. Most of their free time, attention, and disposable income has already been captured by trash that is unbeatable due to network effects. Are we all going to sell shitty LLM frontends to businesses until they notice they could have done the same thing themselves? There might be an explosion in new software, but no one there to care about using it.

> I believe there is a huge chasm that will likely never be crossed between the human intent of systems and their implementation that only human engineers can actually bridge.

Maybe, or the AI might just be missing context. Think of all the unwritten culture, practices, and conversations the LLM hasn't been made aware of.

> In short they want a throat to choke.

You're responsible for those under you anyway, this doesn't help. Banking on those in charge being irrational forever in a way that is bad for business, and without ever noticing, is a bad gamble.

> The other factor is that while AI can clearly replace rote coding today [...], X is not something that can be solved for without a lot of knowledge and guardrails.

I'm talking about the world the AI-maximalists predict is rapidly approaching, not where we are today. None of that knowledge and none of those guardrails are hard to grasp intellectually, compared to advanced mathematics for example. Put your institutional knowledge in a .md file and add another agent that enforces guardrails in a loop. The only way out I see is a situation where there are complex patterns that we intuitively grasp, but can't articulate. Patterns that somehow span too much data or don't have enough examples for LLMs to pick up on.
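
A minimal sketch of what "a .md file plus an enforcing agent in a loop" could look like. Everything here is hypothetical: generate and check stand in for calls to a coding agent and a reviewing agent, and the rules are plain strings that would in practice be loaded from the .md file.

```python
from typing import Callable, List

Generate = Callable[[str, List[str]], str]        # (task, feedback) -> draft
Check = Callable[[str, List[str]], List[str]]     # (draft, rules) -> violated rules


def run_with_guardrails(
    task: str,
    rules: List[str],
    generate: Generate,
    check: Check,
    max_rounds: int = 3,
) -> str:
    """Ask for a draft, have the checker list violated rules, feed them back,
    and repeat until the draft passes or the round budget runs out."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        violations = check(draft, rules)
        if not violations:
            return draft
        feedback = violations
    raise RuntimeError(f"guardrails still violated after {max_rounds} rounds")


# Stub agents so the loop can be run without any model (illustrative only).
def fake_generate(task: str, feedback: List[str]) -> str:
    # Pretend the agent fixes its output once the reviewer has complained.
    if not feedback:
        return "DROP TABLE users"
    return "ALTER TABLE users ADD COLUMN age INT"


def fake_check(draft: str, rules: List[str]) -> List[str]:
    # Only one toy rule is understood here: no DROP statements.
    return [r for r in rules if r == "no DROP statements" and "DROP" in draft]


rules = ["no DROP statements"]
result = run_with_guardrails("add an age column", rules, fake_generate, fake_check)
print(result)
```

The hard part the sketch skips is exactly the one in question: whether the rules can be written down well enough for check to be reliable.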

> There will be engineers who maintain all the same code, they'll just cover more scope with LLM assisted tools.

So fewer jobs with lesser qualifications?

> Ultimately I don't see the boundary the same way you do, as software engineers we have always had to justify our systems by their real world interaction.

I've seen the way engineers design products, and I like products designed by engineers, but no layperson does. Laypeople don't want power, privacy, or agency. They care about how things work, and they lie to themselves and others about what they really want. They don't want a native desktop app that streams high-quality audio from a self-hosted collection, they want a subscription that autoplays algorithmic slop through a react native app on their iPhone. Do you really think you're better at appealing to/fleecing customers than people with actual UX, marketing, and behavioral psychology experience? This example only applies to mass-market software, but I'm sure it's not much different in other fields. Engineers keep thinking they could do everyone else's job, but they don't do so well in practice.


I can’t tell if this is a genuine quote or not. Can you provide a citation?

(I think something like this comes up in the Phaedrus)


Maybe I'm reading the graph wrong, but the decrease comes after years of continuous growth, so total employment numbers in tech should still be absolutely massive, compared to 18 years ago?

If it continues, then yes it could be bad, but so far it seems like a correction for over-hiring in 2021 - 2023. Seems a little weird to be focusing on a decline in 2024 - 2026, without addressing the large increase right in the years before.


That’s a really interesting way to frame it — showing the flow of prompts and responses rather than just the final result.

I’ve mostly been using it for demos and sharing sessions with teammates, but the training / best-practices angle is a great point.

On navigation: you can already step through turns with the arrow keys or jump around the timeline, so you don’t have to sit through long generations. But I agree that smarter defaults (skipping or collapsing long runs) could make it smoother.

And the Loom comparison is interesting — I hadn’t thought about the workspace/permission side yet since this started as a small CLI tool for sharing sessions, but that’s a good direction to think about.


Core memories of carefully setting my fisherman on a boat with `ezmacro` before I got ready for school. I'd come home to either a boat full of fish (to later cook into fish steaks), or be dead from a player killer who found my boat and killed my macroing guy to try and steal the boat.

> Exactly - go give someone you love a hug, that's worth infinitely more than flexing an expensive watch.

Hugs are massive signals of status (who, where, initiator, awkwardness, yada yada).

My fascination with the politics of hugs might be called autistic by some.

I wonder whether my own status avoidance is on account of being bad at playing at ladders.


During the Paleogene, the terrestrial plants and animals were very different from those of today.

Now on all continents and islands most of the big animals and plants are humans, domestic animals and cultivated plants. The wild animals and plants, even if they are much more varied, with many thousands of times more species than the domestic ones, are much smaller in quantity, with only a few kinds that are non-negligible, e.g. ants, termites, rodents.

So if we return to the Paleogene climate in a short time, the main question is how this will affect the few dominant animal species, like chickens, humans, pigs, sheep, cattle, and dogs, and the main cultivated plants, none of which are adapted to a Paleogene climate and none of which will be able to adapt in such a short time.

It is likely that places like Canada, Alaska, Greenland, Siberia, and Antarctica will become nicer places to live and practice agriculture, but the few people who live there now would not welcome invaders coming from places that are no longer habitable.


It's like the evil twin of "code is data"

The perception seems to be that AI only causes security vulnerabilities (see: openclaw injection in npm (Clinejection)). But the article's optimistic tone closely reflects my own, and if it were all bad, then nobody would be using AI. But it's mostly good, and with the benchmarks, it's a statistical fact that it helps more than it hurts. It's just math at a certain point.

Sure, those things can happen. A lot of younger people will decide to just accept the risk, and then if they get hit by a bad and expensive health issue then they'll go to the ER anyway. Due to EMTALA, most hospitals have to treat them regardless of ability to pay. This is one of the factors causing the US healthcare financing system to collapse.

hah, point well taken. In hindsight, my use of "actual job" to mean "job that contributes to the economy rather than simply speculating on it or skimming off the top", wasn't very clear.

Not the OP but curious why you think so?

If this gives an extra 1% per se, I imagine that is more worth it to a company fresh off a large fundraise with a ton of cash in the bank.

Startups otherwise are lean and won't hold enough cash to get a meaningful return from the 1%.


Agreed. Which is why I think the OS level is dumb. Kids can just live boot or launch a vm or keylog their parents' account.

If it's windows, they can just live boot into the OS and get access to pretty much all the files anyway, if the parent didn't encrypt things.

My point is, if the implementation is trivial to bypass, why do we need this legislation? Just let the parents use the existing tools we have and parent.


>it is unthinkable for some professional with a master's degree to become a warehouse sorter.

I mean I think you do actually have a salient point, but I also think there's a material difference between telling someone that's maybe been paid not a lot for half a of labor that they need to change industries and someone who's tens of thousands of dollars into debt that the implied social contract encouraging that debt was a house of cards and they need to start from scratch with 0 experience even ever being employed.


Yes, backups are great but a 'dumb robot' or a 'mistaken junior' shouldn't have access to prod.

And a sleep-deprived senior? Even then. They shouldn't have access to destructive effects on prod.

Maybe the senior can get broader access in a time-limited scope if senior management temporarily escalates the developer's access to address a pressing production issue, but at that point the person addressing the issue shouldn't be fighting to stay awake nor lulled into a false sense of security as during day-to-day operations.

Otherwise it's only the release pipeline that should have permissions to take destructive actions on production and those actions should be released as part of a peer reviewed set of changes through the pipeline.


I think you’re right about the impetus for the piece. I’ll say that branding is for things consumed in public, I expect vendor lock-in before branding. But as you know, this site is awash in speculation about how the lack of differentiation will play out.

Getting seed funding from Fluke is a very PNW detail. RIP to a founder from a different age.

Nah, the analogy for your argument is:

Two Americans and ten Chinese are on a lifeboat. The Americans are each eating two sandwiches a day and the Chinese are eating one. Supplies are low. You do the math and note that the Chinese sure are eating a lot of sandwiches.


Great news. Haven't played UO in forever. What kind of client are people using on modern systems, these days? Is there a client working well on linux?

I can’t wait for ChatGPT to control the autonomous weapons, screw it put it in charge of the nukes!

Yep, you're not insane, they were amateur.

As the tool gets better, people trust it more. It's like Tesla's self-driving: "almost" works, and that's good enough for people to take their hands off the wheel, for better or for worse.

The "almost" part of automation is the issue + the marketing attached to it of course, to make it a product people want to buy. This is the expected outcome and is already priced in.


Oh, arguably the best part (which I forgot to mention) is that our terminals continue running in the cloud, so dev work isn't blocked by our computers going to sleep.

Time for Remy to make another video: https://www.youtube.com/watch?v=YQJ7E140-SQ

Is this available for Non Profits?

I've had an easy time setting up treasury accounts with Rho & Mercury for 2 co's, but the latter gave me a no-go on an account for a non profit.

