I can still remember trying Python 1.5 on a first-generation Mac (which was probably already a dinosaur at the time) and not being able to get it to do anything (even obvious things to try at the REPL would just crash). I don't know if there was a problem with that build or what.
Or they tried some simple alternatives and didn't find clear benefits?
> The key is to give the agent not just the ability to pull things into context, but also remove from it.
But then you need rules to figure out what to remove. Which probably involves feeding the whole thing to a(nother?) model anyway, to do that fuzzy heuristic judgment of what's important and what's a distraction. And simply removing messages doesn't add any structure, you still just have a sequence of whatever remains.
What I'm thinking is: when the agent wants to open more files or more messages, eventually there will be no context left. The agent is then essentially forced to hide some files and messages in order to proceed. Any other commands are refused until the agent makes room in the context. Maybe the best models will be able to handle this responsibility. A bad model will just hide everything and then forget what it was working on.
For a few years I worked on the team that wrote software for an embedded audio DSP. The power draw to do something was normally more important than the speed. E.g. when decoding MP3 or SBC you probably had enough MIPS to keep up with the stream rate, so the main thing the customers cared about was battery life. Mostly the techniques to optimize for power were the same as those for speed. But I remember being told that add/sub used less power than multiply even though both were single cycle, and that loops of fewer than 16 instructions used less power, because a simple 16-instruction program memory cache saved the energy required to fetch instructions from RAM or ROM. (RAM and ROM access was generally single cycle too.)
Nowadays, I expect optimizations that minimize energy consumption are an important target for LLM hosts.
Sibling posted a good example. But I know of (without details) things where you have to insert nops to keep peak power down, so the system doesn't brown out (in my experience, the 68hc11 won't take conditional branches if the power supply voltage dips too far; but I didn't work around that, I just made sure to use fresh batteries when my code started acting up). Especially during early boot.
Apple got in a lot of trouble for reducing peak power without telling people, to avoid overloading dying batteries.
In a clockless CPU design you'd indeed expect xor to be faster. But in a regular clocked CPU, you either waste a bit of xor's potential by making xor and sub both take the same number of ticks, or you speed up the clock enough that the speed difference between them justifies sub being at least a full tick slower.
Even if they take the same number of ticks, shouldn't the fact that xor fundamentally needs less work also mean it can be performed while drawing less power and generating less heat? That's just as much an improvement in the long run.
> but xor took a slight lead due to some fluke, perhaps because it felt more “clever”.
Absolutely. But I can also imagine that it feels more like something that should be more efficient, because it's "a bit hack" rather than arithmetic. After all, it avoids all the "data dependencies" (carries, never mind that the ALU is clocked to allow time for those regardless)!
I imagine that a similar feeling is behind XOR swap.
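For reference, the XOR swap trick mentioned above, sketched in Python (where it works on plain integers). On modern hardware an ordinary temp-variable swap is usually at least as fast; this version mostly survives as the "clever" bit hack described in the parent comment:

```python
def xor_swap(a, b):
    """Swap two integers using three XORs, no temporary.

    Relies on x ^ y ^ y == x. Classic hazard: if both operands
    are the same storage location, the first XOR zeroes it.
    """
    a ^= b  # a now holds a ^ b
    b ^= a  # b ^ (a ^ b) == original a
    a ^= b  # (a ^ b) ^ original a == original b
    return a, b

print(xor_swap(3, 7))  # → (7, 3)
```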
> Once an instruction has an edge, even if only extremely slight, that’s enough to tip the scales and rally everyone to that side.
Network effects are much older than social media, then....
It'd be nice if it described up front what kind of information is available per panel.
For that matter, I'd be interested in details of how "a team of researchers including alumni from NOAA, NASA and the USGS" (from the previous article) actually collected the data.
In the abstract: “We use these newly compiled and delineated solar arrays and panel-rows to harmonize and independently estimate value-added attributes to existing datasets including installation year, azimuth, mount technology, panel-row area and dimensions, inter-row spacing, ground cover ratio, tilt, and installed capacity.”
Honestly, this has to just be rage bait.
You can't honestly not understand the difference: a car sits in large parking lots, where you can't tell from the outside whether the door is locked, while a laptop is in your office or on your lap and spends 0% of its time in a pile of 500 other laptops.
I've heard that story a few times (ironically enough) but can't say I've seen a good example. When was over-architecture motivated by an attempt to reduce duplication? Why was it effective in that goal, let alone necessary?
I think there is often tension between DRY and "thing should do only one thing." E.g., I've found myself guilty of DRYing up a function, but the use is slightly different in a couple places, so... I know, I'll just add a flag/additional function argument. And you keep doing that and soon you have a messed up function with lots of conditional logic.
The key is to avoid the temptation to DRY when things are only slightly different and find a balance between reuse and "one function/class should only do one thing."
For sure. I feel I need all of my experience to discern the difference between “slightly different, and should be combined” and “slightly different, and you’ll regret it if you combine them.”
One of my favorite things as a software engineer is when you see the third example of a thing, it shows you the problem from a different angle, and you can finally see the perfect abstraction that was hiding there the whole time.
Buy me a beer and I can tell you some very poignant stories. The best ones are where there is a legitimate abstraction that could be great, assuming A) everyone who had to interact with the abstraction had the expertise to use it, B) the details of the product requirements conformed to the high level technical vision, now and forever, and C) migrating from the current state to the new system could be done in a bounded amount of time.
My view is over-engineering comes from the innate desire of engineers to understand and master complexity. But all software is a liability, every decision a tradeoff that prunes future possibilities. So really you want to make things as simple as possible to solve the problem at hand as that will give you more optionality on how to evolve later.
I’ll give a simplified example of something I have at work right now. The program moves data from the old system to the new system. It started out moving a couple of simple data types that were basically the same thing under different names. It was a great candidate for reusing a method. Then a third type was introduced that required a little extra processing in the middle. We updated the method with a flag to do that extra processing. One at a time, we added 20 more data types that each had slightly different needs. Now the formerly simple method is a beast with several arguments that change the flow enough that there are probably just a few lines that run for all the types. If we hadn’t happened to start with two similar types, we probably wouldn’t have built this spaghetti monster.
> Then a third type was introduced that required a little extra processing in the middle. We updated the method with
A callback to do the processing?
> a flag
Oh.
> Now... several arguments... probably just a few lines that get run for all the types
Yeah, that does tend to be where it leads when new parameters are thought of in terms of requesting special treatment, rather than providing more tools.
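A minimal sketch of that distinction, with hypothetical names: a flag parameter hard-codes each special case inside the shared method, while a callback lets each caller supply its own step without the shared code knowing about it.

```python
# Flag style: the shared method accumulates one branch per special case.
def migrate_flag(records, fixup_dates=False):
    out = []
    for r in records:
        if fixup_dates:  # each new data type tends to add another branch here
            r = {**r, "date": r["date"].strip()}
        out.append(r)
    return out

# Callback style: the shared method stays generic; the caller
# provides the extra processing as a tool.
def migrate(records, transform=lambda r: r):
    return [transform(r) for r in records]

rows = [{"date": " 2024-01-01 "}]
print(migrate(rows, transform=lambda r: {**r, "date": r["date"].strip()}))
# → [{'date': '2024-01-01'}]
```

Both do the same work for one case; the difference shows up at case twenty, when the flag version has twenty interacting branches and the callback version still has one loop.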
Yes, yes, "the complexity has to go somewhere". But it doesn't all have to get heaped into the same pile, or mashed together with the common bits.
I saw a fancy HTML table generator that had so many parameters and flags and bells and whistles that it took IIRC hundreds of lines of code to save writing a similar amount of HTML in a handful of different places.
Yes, the initial HTML looked similar in those few places, but the resulting usages of the abstraction did not look similar.
But it took a very long time reading each place a table existed and quite a bit longer working out how to get it to generate the small amount of HTML you wanted to generate for a new case.
Definitely would have opted for repetition in this particular scenario.