
captainkrtek · 2025-12-03T21:34:19 1764797659

As a customer of GitHub Actions, it anecdotally feels like GitHub experiences issues frequently enough to make this not a problem.

captainkrtek · 2025-12-03T21:29:18 1764797358

I've lived in Seattle my whole life, and have worked in tech for 12+ years now as a SWE.

I think the SEA and SF tech scenes are hard to differentiate perfectly in a HN comment. However, I think any "Seattle hates AI" sentiment has more to do with the incessant pushing of AI into every tech space.

It's being claimed as the next major evolution of computing, while also being cited as reasons for layoffs. Sounds like a positive for some (rich people) and a negative for many other people.

It's being forced into new features of existing products, while adoption of said features is low. This feels like cult-like behavior where you must be in favor of AI in your products, or else you're considered a luddite.

I think the confusing thing to me is that things which are successful don't typically need to be touted so aggressively. I'm on the younger side and generally positive to developments in tech, but the spending and the CEO group-think around "AI all the things" doesn't sit well as being aligned with a naturally successful development. Also, maybe I'm just burned out on ads in podcasts for "is your workforce using Agentic AI to optimize ..."

captainkrtek · 2025-12-03T02:46:45 1764730005

I think the obvious things are:

- Deviation in consistency/texture/color/etc.

- Obvious signs related to the above (eg: diarrhea, dehydration, blood in stool).

Ultimately though, you can get the same results by just looking down yourself and being curious if things look off...

tldr: this feels like literal internet-of-shit IoT stuff.

captainkrtek · 2025-11-28T23:11:28 1764371488

Do you think they're using the guise of "it's solar radiation" as cover to do a software update that fixes a more problematic "bug", and perhaps tangentially there are some changes in said update to improve some error-correcting code (eg: related to detecting spurious bit flips)?

maximilianburke · 2025-11-28T23:56:32 1764374192

Not in aviation.

superxpro12 · 2025-11-29T08:08:21 1764403701

Counterpoint: Boeing MCAS tho

Bratmon · 2025-11-29T03:13:41 1764386021

Does the 737-Max not count as aviation anymore?

edoceo · 2025-11-29T07:47:46 1764402466

It does, but the Max issue was quite different from this one.

aunty_helen · 2025-11-28T23:52:53 1764373973

No, that would be straight to jail.

vlovich123 · 2025-11-29T01:07:12 1764378432

Remind me who from Boeing went to jail?

stackghost · 2025-11-29T01:22:19 1764379339

Airbus is in Europe where the Rule of Law still exists

vlovich123 · 2025-11-29T01:32:07 1764379927

That’s what we naively thought here too.

kakacik · 2025-11-29T08:39:39 1764405579

Look at how the US government treats financial behemoths that actively harm all of mankind vs how the EU treats them. There is way more to this topic obviously (who wants to harm their local company), but generally the US is pro-company while Europe is pro-people.

vlovich123 · 2025-12-01T09:00:46 1764579646

Deutsche Bank and HSBC, two major European banks, have repeatedly admitted they have engaged in money laundering activities for Russia, drug cartels and terrorists and have consistently failed to meet their AML obligations. The US is the only entity going after these banks for these issues and winning significant judgments, and even with that backdrop you don’t see any EU enforcement.

fooker · 2025-11-28T23:36:13 1764372973

Yeah I don't buy it either.

If it was really 'solar radiation' there would be more small details.

dboreham · 2025-11-29T02:28:29 1764383309

Reading the Airbus press release, I wonder if this is what happened:

Solar radiation event led to alpha particle induced data corruption in a flight control computer memory (could be DRAM, SRAM, on-chip cache, registers...). These failures are supposed to be transient (reboot and all is well).

This is an anticipated failure mode. Only one (of three?) computers should be affected by such a failure and therefore the remaining two keep on running the plane.

But what happened is <something> went wrong with the failover/voting mechanism (as often happens with one-off seldom-executed failover code). The result was no flight control computer functionality until the entire system was rebooted. Hence the emergency landing.

The fix is to address that software error, with perhaps a secondary fix TBD to harden the hardware (add some shielding perhaps).

The fact that they talk about data corruption and not just a malfunction suggests alpha bit flip rather than latch-up.

Then send the whole statement through a French to English translator to make it a bit more confusing.
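
The voting mechanism speculated about above can be sketched roughly as follows. This is a minimal illustration of 2-out-of-3 majority voting, assuming three redundant channels that each report an integer value; the function name and the values are hypothetical and say nothing about how Airbus's flight control computers actually work.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(outputs: Sequence[int]) -> Optional[int]:
    """Return the value reported by a strict majority of channels, else None.

    Hypothetical helper: a None result means the channels disagree so much
    that no value can be trusted and the caller must fall back to another mode.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None


# Assumed scenario: one channel suffers a single bit flip (bit 4 toggled),
# and the two healthy channels outvote it.
healthy = 0b1010_0000
corrupted = healthy ^ (1 << 4)
print(majority_vote([healthy, corrupted, healthy]))
print(majority_vote([1, 2, 3]))
```

The interesting failures are in the paths this sketch leaves out: what the system does when the voter returns no majority, and how a faulted channel is readmitted after a reboot.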

chasing0entropy · 2025-11-29T01:05:08 1764378308

I would say it's pretty detailed: an unknown interference caused a single CRC-protected 32-bit word to be corrupted simultaneously, by timestamp, in both the flight controller hardware and the black box data recorder.

My concern would be what error correction mechanism did or did not catch the corruption in memory and why did it not recover without critical impact to operations?

fooker · 2025-11-29T01:17:47 1764379067

> corrupted simultaneously

This sounds like a software bug.

Something like - {copy a to b, checksum a--b}

Instead of - {copy a to t, checksum a--t, copy t to b, checksum a--b}

I bet the fix is along these lines, with the caveat of real time systems/etc.
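
A rough sketch of the second variant, for illustration only: copy through a temporary buffer and verify a checksum after each hop, so corruption is caught before it reaches the destination. It assumes CRC-32 via Python's `zlib.crc32` as the checksum and a non-empty payload; `checked_copy` and its `corrupt_stage` test hook are hypothetical names, not anything from the actual avionics software.

```python
import zlib
from typing import Optional


def checked_copy(source: bytes, corrupt_stage: Optional[str] = None) -> bytes:
    """Copy source via a temporary buffer, verifying a CRC after each hop.

    corrupt_stage is a hypothetical test hook that simulates a single bit
    flip during the named hop ("temp" or "dest"). Assumes source is non-empty.
    """
    expected = zlib.crc32(source)

    temp = bytearray(source)              # copy a -> t
    if corrupt_stage == "temp":
        temp[0] ^= 0x01                   # simulated bit flip
    if zlib.crc32(temp) != expected:      # checksum a vs t
        raise ValueError("corruption detected in temporary copy")

    dest = bytearray(temp)                # copy t -> b
    if corrupt_stage == "dest":
        dest[0] ^= 0x01                   # simulated bit flip
    if zlib.crc32(dest) != expected:      # checksum a vs b
        raise ValueError("corruption detected in destination copy")
    return bytes(dest)
```

Calling `checked_copy(word)` returns an identical copy, while `checked_copy(word, corrupt_stage="temp")` raises before the destination is ever written. A real-time system would also need to bound the cost of the extra copy and checksum.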

londons_explore · 2025-11-29T06:40:31 1764398431

My guess is they haven't managed to point to the single memory bit which was flipped to cause this result.

The software update is probably more along the lines of 'lets just introduce a watchdog task which resets the system if the output deviates too far from the input for too long'.
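
As a sketch of what such a watchdog might look like (purely illustrative, with hypothetical names and thresholds, and no claim about the real update): count consecutive control cycles in which the output deviates from the commanded input by more than a tolerance, and request a reset once that count reaches a limit.

```python
class DeviationWatchdog:
    """Requests a reset when output deviates from input for too many cycles.

    max_deviation and max_bad_cycles are assumed tuning parameters.
    """

    def __init__(self, max_deviation: float, max_bad_cycles: int) -> None:
        self.max_deviation = max_deviation
        self.max_bad_cycles = max_bad_cycles
        self.bad_cycles = 0

    def check(self, commanded: float, actual: float) -> bool:
        """Record one control cycle; return True if a reset should be requested."""
        if abs(actual - commanded) > self.max_deviation:
            self.bad_cycles += 1
        else:
            # Any in-tolerance cycle clears the streak, so brief
            # transients do not trigger a reset.
            self.bad_cycles = 0
        return self.bad_cycles >= self.max_bad_cycles
```

With `DeviationWatchdog(max_deviation=1.0, max_bad_cycles=3)`, three consecutive out-of-tolerance cycles trip the watchdog, while a single good cycle in between resets the count.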

schmuckonwheels · 2025-11-29T05:25:50 1764393950

No, because aerospace is not garden-variety Silicon Valley webshittery.

There is a slightly different level of discipline and engineering ethics at play.

captainkrtek · 2025-11-28T17:34:29 1764351269

This is excellent and aligns with my own experience.

During my day I try to minimize interruptions by batching them. I will largely ignore Slack, and as notifications come in I glance and determine quickly if it really is urgent or if it can wait. If it can wait, I will punt all of those messages to a "remind me later" of a few hours, and get back to my task. I think this keeps my "recovery time" small as I'm not looking too close at these messages. It's not perfect, but definitely helps over pausing my "real work" to fully dive into each notification or ask.

skirmish · 2025-11-28T21:40:17 1764366017

Then in your next performance review you get dinged as "not responsive", "not a team player". Trying to work in peace is an instant loss nowadays; just play the performative visibility game as all the quickly promoted people in the office do. Why do you think your management cares about getting things done? If they did, they would reward it.

captainkrtek · 2025-11-28T23:09:07 1764371347

This has not been my experience at least at the more remote-friendly places I worked. However, I can see this at companies with different culture / pace / attitude.

In my most recent role, the entire company of ~200 was remote, and so there was rarely the expectation of immediacy in a response. If something was truly urgent you'd be paged.

cancan · 2025-11-29T17:17:55 1764436675

I have to agree — in general, most people have a good sense of what's urgent or not and with a few kind nudges, they align quickly.

captainkrtek · 2025-11-28T16:29:28 1764347368

No no, we just need to put even more money in.

captainkrtek · 2025-11-26T16:43:12 1764175392

It's so sad how much money leaders will effortlessly pump into something like this, when we still have existential threats of climate change, incurable diseases, poverty, housing, and so on.

Meanwhile, ungodly amounts of money are being used so some boomer can generate an AI video of a baby riding a puppy.

captainkrtek · 2025-11-24T05:06:43 1763960803

A company I worked at also did this, though there were no limits. Some folks would choose to spend the whole week working on a larger refactor. For example, I unified all of our Redis usage onto a single modern library, replacing the mess of 3 libraries of various ages across our codebase. This was relatively easy, but tedious, and required some new tests/etc.

Overall, I think this kind of thing is very positive for the health of building software, and morale to show that it is a priority to actually address these things.

captainkrtek · 2025-11-24T05:00:50 1763960450

I've struggled throughout my life with anxiety, ADHD, and bouts of depression.

I've done years of therapy, and use some medication to help with my ADHD. I will say though, singlehandedly, the best (and hardest) thing I've had to do is fight my own phone/internet/computer usage.

I grew up with computers and still work professionally with them day to day, but have made a serious effort in the last year to cut down my usage in an extreme manner:

- Using a 'brick' device to control which apps work on my phone, requiring I physically tap my phone to the brick to lock/unlock the restricted mode. This is always on.

- Blocking tons of sites via multiple means (iOS screen time, eero network profiles)

- Turning my iPhone into a very very basic phone: in "bricked" mode, which I will find myself using for continuous days at a time, I can only use: gmail, photos, notes, weather, maps, spotify, telephone, imessage. No news, no internet browsing, no social media.

- I've deleted all social media accounts (except LinkedIn, but this too is blocked on my phone).

From all of this, the initial realization was, "wow, I'm bored..", which is hard at first to sit with as a feeling, when normally my first instinct to that feeling was "let's open some app/youtube/etc." Then you slowly find positive things creeping in to occupy that boredom time: reading, calling friends/family, getting chores done, etc. And for anything "restricted" that I can't do on my phone, I largely can do on my computer (though I still block sites like reddit, youtube). But this is much healthier as I'm much less likely to pick up my laptop for hours on end, vs. opening my phone at every moment I'm bored.

captainkrtek · 2025-11-24T04:53:18 1763959998

> It would be a good thing, if it would cause anything to change. It obviously won't.

I agree wholeheartedly. The only change is internal to these organizations (eg: Cloudflare, AWS). Improvements will be made to the relevant systems, and some teams internally will also audit for similar behavior, add tests, and fix some bugs.

However, nothing external will change. The cycle of pretending like you are going to implement multi-region fades after a week. And each company goes on continuing to leverage all these services to the Nth degree, waiting for the next outage.

Not advocating that organizations should/could do much, it's all pros/cons. But the collective blast radius is still impressive.

chii · 2025-11-24T05:26:32 1763961992

The root cause is customers refusing to punish this downtime.

Check out how hard customers punish blackouts from the grid - both via wallet, but also via voting/gov't. It's why they are now more reliable.

So unless the backbone infrastructure gets the same flak, nothing is going to change. After all, any change is expensive, and the cost of that change needs to be worth it.

MikeNotThePope · 2025-11-24T05:48:14 1763963294

Is a little downtime such a bad thing? Trying to avoid some bumps and bruises in your business has diminishing returns.

Xelbair · 2025-11-24T07:20:21 1763968821

Even more so when most of the internet is also down.

What are customers going to do? Go to competitor that's also down?

It is extremely annoying and will ruin your day, but as the movie quote goes: if everyone is special, no one is.

throwaway0352 · 2025-11-24T12:50:19 1763988619

I think you’re viewing the issue from an office worker’s perspective. For us, downtime might just mean heading to the coffee machine and taking a break.

But if a restaurant loses access to its POS system (which has happened), or you’re unable to purchase a train ticket, the consequences are very real. Outages like these have tangible impacts on everyday life. That’s why there’s definitely room for competitors who can offer reliable backup strategies to keep services running.

mallets · 2025-11-24T13:46:37 1763991997

Those are examples where they shouldn't be using public cloud in the first place. Should build those services to be local-first.

Using a different, smaller cloud provider doesn't improve reliability (likely makes it worse) if the architecture itself is wrong.

esseph · 2025-11-25T05:19:15 1764047955

It makes credit card transactions risky (offline)

mallets · 2025-11-25T13:06:04 1764075964

Talking more about some unrelated function taking down the whole system, not advocating for "offline" credit card transactions (is this even a thing these days?). Ex: If the transaction needs to be logged somewhere, it can be built to sync whenever possible rather than blocking all transactions if the central service is down.

Payment processor being down is payment processor being down.
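
The "sync whenever possible" idea can be sketched as a small store-and-forward outbox. This is a minimal illustration under stated assumptions: the `Outbox` class and the `send` callable are hypothetical, the queue lives in memory (a real POS would persist it to disk), and an unreachable central service is assumed to surface as `ConnectionError`.

```python
from collections import deque
from typing import Callable, Deque


class Outbox:
    """Queues log entries locally and forwards them when the backend is reachable."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send                      # hypothetical uploader
        self._pending: Deque[dict] = deque()

    @property
    def pending(self) -> int:
        return len(self._pending)

    def record(self, entry: dict) -> None:
        """Accept the entry immediately; never block on the central service."""
        self._pending.append(entry)
        self.flush()

    def flush(self) -> int:
        """Send queued entries in order; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                          # still offline, retry later
            self._pending.popleft()
            sent += 1
        return sent
```

The caller records each transaction and carries on; a periodic `flush()` drains the backlog once the central service is back. Entries are removed only after a successful send, so a failed attempt loses nothing.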

Xelbair · 2025-11-28T13:03:13 1764334993

Cash exists, physical tickets exist.

Those things shouldn't be fully tied to the internet/intranet anyways.

wongarsu · 2025-11-24T14:12:43 1763993563

Do any of those competitors actually have meaningfully better uptime?

At a societal level, having everything shut down at once is an issue. But if you only have one POS system targeting only one backend URL (and that backend has to be online for the POS to work) then Cloudflare seems like one of the best choices.

If the uptime provided by cloudflare isn't enough then the solution isn't a cloudflare competitor, it's the ability to operate offline (which many POS have, including for card purchases) or at least multiple backends with different DNS, CDN, server location etc.

immibis · 2025-11-24T08:50:47 1763974247

They could go to your competitor that's up. If you choose to be up, your competitor's customers could go to you.

dewey · 2025-11-24T09:02:12 1763974932

If it’s that easy to get the exact same service / product from another vendor then maybe your competitive advantage isn’t so high. If Amazon were down I’d just wait a few hours as I don’t want to sign up on another site.

MikeNotThePope · 2025-11-24T10:23:46 1763979826

I agree. These days it seems like everything is a micro-optimization to squeeze out a little extra revenue. Eventually most companies lose sight of the need to offer a compelling product that people would be willing to wait for.

immibis · 2025-11-25T18:59:35 1764097175

Why can't we just take pride in doing a good job?

krige · 2025-11-24T06:14:11 1763964851

What's "a little downtime" to you might be work ruined and day wasted for someone else.

bloppe · 2025-11-24T09:07:50 1763975270

I remember a Google cloud outage years ago that happened to coincide with one of our customers' massively expensive TV ads. All the people who normally would've gone straight to their website instead got 502. Probably a 1M+ loss for them all things considered.

We got an extremely angry email about it.

fragmede · 2025-11-24T07:24:51 1763969091

It's 2025. That downtime could be the difference between my cat pics not loading fast enough, or someone's teleoperated robot surgeon glitching out.

cactusplant7374 · 2025-11-24T14:42:47 1763995367

I have a lot of bad days every year. More than I can count. It's just part of living.

aaron_m04 · 2025-11-24T06:08:36 1763964516

Depends on the business.

tjwebbnorfolk · 2025-11-24T18:40:54 1764009654

> The root cause is customers refusing to punish this downtime.

ok how do I punish cloudflare -- build my own globally-distributed content-delivery network just for myself so that I can be "decentralized"?

Or should I go to one of their even-larger competitors like AWS or GCP?

What exactly do you propose?

niutech · 2025-11-25T11:31:29 1764070289

Why not just boycott CDNs like Cloudflare and instead host your website on a decentralized network like Bluesky (https://danielmangum.com/posts/this-website-is-hosted-on-blu...) or IPFS (https://pinme.eth.limo/) for free?

chii · 2025-11-25T03:08:11 1764040091

you are not a customer of cloudflare.

You need to be punishing the services you "paid" to use, but had downtime. So did you terminate any of those services for downtime, or had any sort of punishment done to them as a result?

tjwebbnorfolk · 2025-11-25T18:09:27 1764094167

Ok but the price I am paying includes some % of downtime in the SLA, and I am ok with that.

If I wanted 100.00000% uptime, I would have to pay much more, but I don't want to.
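
For a sense of scale, the downtime an uptime percentage permits is simple arithmetic. The sketch below uses illustrative targets and a 30-day billing period as assumptions; they are not taken from any actual Cloudflare or other vendor SLA, and `downtime_budget_minutes` is a hypothetical helper.

```python
def downtime_budget_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed in the period for a given uptime target."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# Illustrative targets only; real SLAs define their own terms and periods.
for target in (99.0, 99.9, 99.99):
    budget = downtime_budget_minutes(target)
    print(f"{target}% uptime allows {budget:.2f} minutes of downtime per 30 days")
```

Each extra nine cuts the budget by a factor of ten, which is roughly why the price climbs so steeply as the target approaches 100%.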

whatevaa · 2025-11-24T06:54:44 1763967284

Grid reliability depends on where you live. In some places, UPS or even a generator is a must have. So it's a bad example, I would say.

LoganDark · 2025-11-24T12:21:06 1763986866

> Check out how hard customers punish blackouts from the grid - both via wallet, but also via voting/gov't.

What? Since when has anyone ever been free to just up and stop paying for power from the grid? Are you going to pay $10,000 - $100,000 to have another power company install lines? Do you even have another power company in the area? State? Country? Do you even have permission for that to happen near your building? Any building?

The same is true for internet service, although personally I'd gladly pay $10,000 - $100,000 to have literally anything else at my location, but there are no proper other wired providers and I'll die before I ever install any sort of cellular router. Also this is a rented apartment so I'm fucked even if there were competition, although I plan to buy a house in a year or two.

heartbreak · 2025-11-24T13:25:06 1763990706

The hyperscalers definitely vote with their wallets.

mopsi · 2025-11-24T06:15:28 1763964928

Downtimes happen one way or another. The upside of using Cloudflare is that bringing things back online is their problem and not mine like when I self-host. :]

Their infrastructure went down for a pretty good reason (let the one who has never caused that kind of error cast the first stone) and was brought back within a reasonable time.

tracker1 · 2025-11-25T00:24:51 1764030291

And even in multi-region, you experience a DNS failure and it all goes up in flames anyway. There's always going to be something.
