Oh wow, a 23-page write-up about how the author misunderstood AWS Lambda's execution model [1].
> It emits an event, then immediately returns a response — meaning it always reports success (201), regardless of whether the downstream email handler succeeds or fails.
It should be understood that after Lambda returns a response, the MicroVM is suspended, interrupting your background HTTP request. There is zero guarantee that the request will succeed.
It took me a couple of reads of the PDF, but I think you're right. The author creates an HTTP request Promise and then immediately returns a response, thereby shutting down the Lambda. They have logging which shows the background HTTP request was in the early stages of being sent to the server, but the server never receives anything. They also have an error handler that is supposed to catch errors during the HTTP request, but it isn't executed either. The reason for both seems quite obvious: it's completely expected that a Lambda being shut down wouldn't finish making the request, and it certainly wouldn't stick around to execute error-handling code after the request was cancelled.
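A minimal sketch of the pattern being described (hypothetical names, not the author's actual code, which wasn't published): the request Promise is created but never awaited, so the handler returns and the execution environment is frozen with the request still in flight:

    const https = require('https');

    exports.handler = async (event) => {
      // Fire-and-forget: the Promise is created but never awaited.
      const sendEmail = new Promise((resolve, reject) => {
        const req = https.request(
          { hostname: 'email-service.example.com', method: 'POST', path: '/send' }, // hypothetical endpoint
          (res) => resolve(res.statusCode)
        );
        req.on('error', reject); // never fires: the VM is frozen, not errored
        req.end(JSON.stringify(event));
      });
      sendEmail.catch((err) => console.error('email failed', err)); // also never runs

      // Returning here lets Lambda freeze the MicroVM with the request still in flight.
      return { statusCode: 201 };
    };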
As an aside I find it strange that the author spent all this time writing this document but would not provide the actual code that demonstrates the issue. They say they wrote "minimal plain NodeJS functions" to reproduce it. What would be the reason to not show proof of concept code? Instead they only show code written by an AWS engineer, with multiple caveats that their code is different in subtle ways.
The author intends for this to be some big exposé of AWS Support dropping the ball but I think it's the opposite. They entertained him through many phone calls and many emails, and after all that work they still offered him a $4000 account credit. For comparison, it's implied that the Lambda usage they were billed for is less than $700 as that figure also includes their monthly AWS Support cost. In other words they offered him a credit for over 5x the financial cost to him for a misunderstanding that was his fault. On the other hand, he sounds like a nightmare customer. He used AWS's offer of a credit as an admission of fault ("If the platform functioned correctly, then why offer credits?") then got angry when AWS reasonably took back the offer.
I don’t know about Node, but a fun abuse of this is that background tasks can still sometimes run on a busy Lambda, since the runtime will unsuspend and resuspend the same process between invocations. So you can sometimes abuse this for non-essential background tasks and for keeping things like caches in process. You just can't rely on it, since the runtime might instead cycle out the suspended Lambda.
Absolutely, I do this at $dayjob to update feature flags and refresh config. Your code just needs to understand that such execution is not guaranteed to happen, and that in-flight requests may get interrupted and should be retried.
TiTiler does exactly that. Geospatial rasters are stored in S3, and the Lambda retains an in-memory cache of data loaded from S3. So if the same Lambda execution environment is reused, it can return cached data without hitting S3.
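For reference, a minimal sketch of that warm-container caching pattern (illustrative names, not TiTiler's actual code): anything held at module scope survives between invocations for as long as the same execution environment is reused:

    const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3');

    const s3 = new S3Client({});
    const cache = new Map(); // lives as long as this execution environment stays warm

    exports.handler = async (event) => {
      const key = event.rasterKey; // hypothetical input field
      if (!cache.has(key)) {
        const obj = await s3.send(new GetObjectCommand({ Bucket: 'raster-bucket', Key: key }));
        cache.set(key, await obj.Body.transformToByteArray());
      }
      // A cold start or a recycled environment simply repopulates the cache.
      return { statusCode: 200, body: `bytes: ${cache.get(key).length}` };
    };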
Uhhmm… it seems both you and @frenchtoast8 are misrepresenting the code flow. The author doesn't return early; the handler is async and explicitly awaits an https.request() Promise. The return only happens after the request resolves or fails. Lambda is terminating mid-await, not post-return.
That’s the whole point and it’s not the behavior your comments describe.
Page 5: "It emits an event, then immediately returns a response — meaning it always reports success (201), regardless of whether the downstream email handler succeeds or fails."
This is not the right way to accomplish this. If you want a Lambda function to trigger background processing, you invoke another lambda function before returning. Using Node.js events for background processing doesn't work, since the Lambda runtime shuts down the event loop as quickly as possible.
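A minimal sketch of that approach, assuming AWS SDK v3 and a hypothetical downstream function name: the handler performs an asynchronous ("Event") invocation and awaits the invoke call itself before returning, so the background work no longer depends on this environment staying alive:

    const { LambdaClient, InvokeCommand } = require('@aws-sdk/client-lambda');

    const lambda = new LambdaClient({});

    exports.handler = async (event) => {
      // InvocationType 'Event' queues the target function and returns immediately;
      // the email work is now decoupled from this function's lifetime.
      await lambda.send(new InvokeCommand({
        FunctionName: 'send-email-handler', // hypothetical
        InvocationType: 'Event',
        Payload: Buffer.from(JSON.stringify({ to: event.email })),
      }));
      return { statusCode: 201 };
    };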
The comments here are focusing on the author's description of code he did not post, and on his confusing explanations, while ignoring the incorrect but demonstrative test case from AWS.
If we're going to talk about documentation (and in this thread there have been many mentions of it without citing specific paragraphs), we could start with this one from here [1]:
"Using async/await (recommended)"
"...The async keyword marks a function as asynchronous, and the await keyword pauses the execution of the function until a Promise is resolved..."
Or
"Using callbacks"
"...The function continues to execute until the event loop is empty or the function times out. The response isn't sent to the invoker until all event loop tasks are finished. If the function times out, an error is returned instead. You can configure the runtime to send the response immediately by setting context.callbackWaitsForEmptyEventLoop to false..."
It actually seems that the most likely thing that happened here was random disconnects from the VPC, which neither the author nor AWS support were able to contextualize.
> I won’t get sucked into the session ID vs. JWT argument, but honestly, using JWTs in cookies is a win because you don’t have to fuss with storing session data on the server.
Okay, but then you implement storage of the refresh token and build this bespoke JWT re-issuance logic. So where's the win here? Just use sessions.
1. JWTs in cookies remain valid after logout (unless you implement a revocation mechanism). This can be mitigated by making the JWT very short-lived (e.g. 5 minutes), but you still don't get immediate expiry. This may not be acceptable if you need to comply with certain standards such as PCI DSS, SOX or HIPAA.
2. Cookie size could grow much larger. It's generally not a huge concern if you don't stuff too much data into your JWT claims, but if you want to optimize traffic for slow networks, JWT is not a great choice.
3. Since 50% of the JWT standard (and 90% of the JWE standard) is made of broken and semi-broken algorithms, you need to be careful about which keys and libraries you use. Unfortunately, we cannot see the implementation for this article, so we cannot judge how safe it is, but JWT just requires extra care compared to Session IDs.
4. If you're using refresh tokens because your JWT is short-lived, you're not really saving yourself any work, because you'd still need a database to store the refresh tokens. This is what GP was referring to.
> using JWTs in cookies is a win because you don’t have to fuss with storing session data on the server.
This is the author's claim, but the code clearly DOES store data on the server, for the refresh token.
Using JWT for "simplicity" is unfortunately a common mistake. With JWT you can choose between simple (no storage required, no token revocation) and secure (immediate or delayed session revocation at the price of requiring storage). You cannot have both. OP chose security (by using a refresh token database), but this is no longer a simple solution. If you've already deployed a DB, Session IDs would always be simpler:
1. You could remove the refresh and redirect logic entirely.
2. You would only need one type of cookie.
3. You can remove the dependency on JWT libraries, and no longer need to choose the right cryptographic algorithm or configure (and secure!) JWT signing keys.
This comes at no cost, since the refresh token DB is already there.
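A minimal sketch of that simpler flow, assuming a generic DB client with hypothetical insertSession/findSession/deleteSession helpers: one opaque cookie, one lookup per request, and revocation is just a delete:

    const crypto = require('crypto');

    async function createSession(db, userId) {
      const sessionId = crypto.randomBytes(32).toString('base64url'); // opaque and unguessable
      await db.insertSession({ id: sessionId, userId, expiresAt: Date.now() + 8 * 3600 * 1000 });
      return sessionId; // set as an HttpOnly cookie; no signing keys or JWT library involved
    }

    async function authenticate(db, sessionId) {
      const session = await db.findSession(sessionId);
      if (!session || session.expiresAt < Date.now()) return null;
      return session.userId;
    }

    async function logout(db, sessionId) {
      await db.deleteSession(sessionId); // revocation is immediate
    }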
There can be good reasons to use JWT (although there are better standards out there for stateless tokens[1]), but simplicity is not one of them. Stateless tokens can be useful in the following cases:
1. Performance (increasing throughput or reducing latency by skipping the database for most calls).
2. Reliable cross-regional tokens (you don't need to replicate session IDs across regions and suffer from inconsistencies during failovers).
3. Decentralized verification (without relying on a central database).
4. Offline attenuation without network calls (with Macaroons or Biscuits only).
Unfortunately, it seems like in 99% of cases JWT is chosen as the access token format because of its perceived simplicity and popularity, and the result is either a toy implementation that doesn't support token revocation, or an implementation that is more complex than it would have been with classic Session IDs.
> This is the author's claim, but the code clearly DOES store data on the server, for the refresh token.
You're right, but I was making a different point about session IDs versus JWTs.
With session IDs, each user request requires a server-side lookup to validate the session. For that reason, their ideal place would be something in-memory, and that's what I was referring to with "data on the server".
While storing session IDs in a database is an option, in my case it would introduce noticeable latency because I self-host my projects on a cluster of Pis at home and even though I have a fast connection, a roundtrip to my external (I don't self-host it) database still takes a few milliseconds under low load.
JWTs allow me to avoid frequent server-side lookups. I can trust the client's data without hitting the database, except when issuing a new JWT - but even then, that happens every 2-3 minutes per user. While verifying JWT signatures and decoding claims does consume some CPU cycles, this overhead is minimal on my setup compared to the latency of database roundtrips.
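A minimal sketch of that flow, using the common jsonwebtoken package (the article's actual implementation isn't shown, so the names here are assumptions): authentication is a CPU-only signature check with no database round trip:

    const jwt = require('jsonwebtoken');

    const JWT_SECRET = process.env.JWT_SECRET; // hypothetical config

    function authenticate(req) {
      const token = req.cookies.access_token; // hypothetical cookie name
      try {
        // Signature + expiry check only; no round trip to the external database.
        const claims = jwt.verify(token, JWT_SECRET, { algorithms: ['HS256'] });
        return claims.sub; // user id embedded in the token
      } catch {
        return null; // missing or expired token -> fall back to the refresh-token flow
      }
    }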
Nothing against session ids, but I feel JWTs are better suited for my resource constrained setup.
1. What if you store the JTI in your database and have the ability to immediately mark it as invalid, so that the user is logged out on their next request?
2. Storing just the JWT and things like the user ID should not be that big of a deal for performance. If you're referring to, for example, Apple sending massive JWT payloads for their IAP service, then I can see what you mean.
3. A standard has broken algorithms? This is news to me.
4. Don't most apps have databases? I don't see why this is a bad thing.
If JWTs in HttpOnly cookies are bad, what do you suggest in place of them? For companies running multiple mobile + web apps.
The major benefit of this approach is that you can avoid a database lookup when you have a valid JWT available. Only in cases where it's missing or expired would you need to hit the database for a refresh token.
This might not matter for trivial apps, but for very high-concurrency web apps this will matter a lot.
Yes, that's why a long time ago we figured out how to store those in an extremely fast memory store like memcached or Redis. I hope people who work at a scale where it's not a viable option don't need to read such blog posts to set this up.
To store 1M sessions, let's say each is 2 KB, you're going to need 2 GB. Somewhere around 15K session lookups per second will saturate the server (+/- 5K ops/s depending on the CPU). If you want more sessions, use more RAM; if you want more ops/s, use more servers (a cache cluster).
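A minimal sketch of that setup with the Node `redis` v4 client (the host name and TTL are illustrative): sessions live in an in-memory store with a TTL, and each lookup is a single GET:

    const { createClient } = require('redis');

    const client = createClient({ url: 'redis://sessions.internal:6379' }); // hypothetical host
    // call `await client.connect()` once at startup

    async function saveSession(sessionId, data) {
      // ~2 KB per session; a million sessions is roughly 2 GB of RAM, as noted above
      await client.setEx(`sess:${sessionId}`, 8 * 3600, JSON.stringify(data));
    }

    async function loadSession(sessionId) {
      const raw = await client.get(`sess:${sessionId}`);
      return raw ? JSON.parse(raw) : null;
    }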
Another reason to use JWT: cross-region authentication (replicating a giant session store across the globe isn't a viable option).
Linking to a collection of context-less clips from a streamer with an active sexual harassment suit [1], whose community is engaging in a harassment campaign against the subject in question [2], isn't the unbiased source you think it is.
> Yes it is. But the 1st amendment does not say that any speech can have zero consequences for you.
The Government taking action against a citizen for voicing opinions it does not agree with falls pretty clearly into what the 1st amendment covers. This is not a "yelling fire in a crowded theater" type scenario where speech is being restricted for the benefit of many.
A large fraction of UHG profits come from their Optum subsidiary selling software to other payer and provider organizations. This is separate from the health insurance business. If broken out separately they would be one of the 20 largest US tech companies.
Unfortunately that is rolling a boulder uphill, and even if you fire all of Congress (on this issue, you'd need to get rid of all the Republicans and at least a third of the Democrats) and replace it with people who give a crap, all it takes is one executive to stop enforcing the rules.
> you'd need to get rid of all the Republicans and at least a third of the Democrats
And then most voters: “71% of U.S. adults consider the quality of healthcare they receive to be excellent or good, and 65% say the same of their own coverage. There has been little deviation in these readings since 2001” [1].
> Quality of healthcare and quality of insurance experience are not the same statistic
The question was specifically about “the quality of healthcare you receive/your healthcare coverage.” Coverage doesn’t cover the insurer on cost, but it does on claims denials.
Most Americans like their coverage. If you want to reform the system, you have to start with that fact and convince them they aren’t risking what they have unnecessarily.
Sure. We want the same system but cheaper. That’s an important difference for someone advocating to scrap the system to make it cheaper. We want those lower costs. But loss aversion marries us to the good enough.
Correct. You've identified why this debate has been frozen in American politics for decades. One man's leech is another's adored Cadillac insurance policy, trusted provider or prescribed placebo. Healthcare reform keeps dying on the rocks of conspiracy theories about the Congress of whatnot. The problem is surfacing a solution the electorate trusts and endorses.
> One man's leech is another's adored Cadillac insurance policy, trusted provider or prescribed placebo.
No, the plans aren’t leeches, the people using the plans aren’t leeches, the entire administrative / leadership staff at health insurance orgs are the leeches.
A majority of the electorate wants government provided healthcare.
> A majority of the electorate wants government provided healthcare
No. (Come on, read your own source.)
A majority say "it is the responsibility of the federal government to make sure all Americans have healthcare coverage." There is a 10-point preference for a "system based on private insurance" versus a "government-run system."
79% of Democrats want a government-run system. But only 46% of independents and 13% of Republicans. Which explains the gridlock. If one side only proposes government-provided healthcare as its solution, it will waste a bunch of energy on it and then be predictably shot down.
> In 2017, 23% of the company’s insurance revenue went toward the provider unit called Optum Health, and 69% went toward OptumRx. So far in 2022, 38% of that money went toward Optum Health, while 56% was captured by OptumRx.
#1 on the list is Walmart, which has a similarly low on-paper 2-3% profit margin, but I don't think anyone is deluded into thinking the company, the Waltons, or its investors are barely scraping by.
#2 is Amazon. Again, low profit margin. Again, plenty of profit.
Nobody is arguing that you can't buy a luxury yacht or whatever with UNH's profits. It's pretty obvious from the original comment [1] that the argument is: even if all the profits were plowed back into approving more claims, it would only increase the approval rate by 5-6%, which is a totally minor amount.
Walmart is amazingly efficient and basically the case for how economies of scale benefit the consumer. If you broke it up, prices would go up, not down.
The Waltons do great because of the scale, but Walmart is a very lean and efficient organization. Take them entirely out of the picture and the consumer would hardly notice.
Does AWS need to build anything on the site as part of this deal, or are they simply leveraging the power added to the grid in their existing us-east-1 footprint? The distance from Ashburn is too far for this to be an extension of a region.
> Transamerica is getting worse service because the contractor that they prefer wanted to look at documentation for a system they paid for. This is why enterprise IT is a legally mandated mess.
No, I don't believe that's at all what was happening. I really recommend reading the original complaint [1]. TCS was leveraging their employees with access to CSC's documentation and source code to glean information about how a particular feature was implemented, _not_ for supporting Transamerica, but for reimplementing the feature in their own product.
From paragraph 29 of the complaint:
> A TCS employee, who upon information and belief is part of the U.S. BaNCS development team, wrote in an email: “Quite honestly, I’m not sure how VTG [Vantage] does this today, so maybe we should engage [TCS employees with access to the Vantage source code] if we want to emulate that?”
The complaint goes on to describe the engineers sending the actual source code to the team. This is pretty clear cut theft IMO.
Yes, I have stated that I understand it is legally a no-no because they are violating the terms under which they have access to that documentation. But ethically, I think it is a lot less clear cut, because it allows CSC/DXC to hold their platform hostage and not really have to try for re-competes because they know that they can try their luck in court. They gave TCS access to the documentation, is it really a good use of the American justice system to enforce what they do with it? Is it a matter of ethical concern? I don't think so.
This is why enterprise IT is a mess. Every time you need to do anything, you have to triple check that you aren't violating some clause buried in some obscure legal agreement that you probably don't have access to. Or else, you could cost your firm a quarter billion dollars or more. So you end up with reams of dead software that is unusable by design since it is being held hostage by various different commercial interests. I understand that is just the world we live in when millions of dollars are involved, but we can do better.
On the other hand, I get the concern: you would want some legally enforceable agreement with TCS that they won't use confidential information you share with them in good faith to steal business, nor go and hire a bunch of staff to poach your contract. But documentation of the software that Transamerica paid for is secret, are you kidding me? They would have been fine, legally, if they got the materials directly from Transamerica. Because they got it from someone who merely used to work there, it is a quarter-of-a-billion-dollar mistake. Seems like a pretty narrow difference to me, hardly some kind of grand ethical quandary.
> I mean, you didn't even consider implementing a simple fetch of an already cloned repository in your mirroring server code. So yeah, I'd argue that the bad faith part is actually justified.
> We did consider caching clones, but it has security implications and adds complexity, so we decided not to. It is certainly not trivial to do and not something we are likely to do based on this issue.
Drew continues to act as though he is always correct, and any viewpoint that isn't his is just moronic. I've repeatedly seen this behavior from him in multiple venues over the years, and I'm happy to see the wider community start calling this out as childish.
I don't particularly care for Drew, but the issue he's reported here seems totally valid. And if he requested that he be excluded from getting hit by the crawler, wouldn't that mean it would be impossible for people to use packages from sr.ht unless they change their config?
Plus, it does seem reasonable to think that only one of the crawlers needs to hit the site. The global replication can happen at the FS level or, heck, the crawlers can just perform pulls from each other.
No. According to the Go project, adding his site to the exclusion list would reduce traffic to his site at the cost of freshness of the data the proxy collects; it would not make it "impossible" for people to use packages from sr.ht.
This is all in the thread that DeVault linked to from his post.
Drew can often be very abrasive, but does it really matter in this case? His site is basically being DDoS'd.
Yes, there are decent arguments why the golang infra doesn't cache or respect typical norms like robots.txt, but they don't change the unreasonableness of the underlying situation. Surely some mitigation could have been worked out in the year since the ticket was filed?
That doesn’t seem like a solution at all, and it's actually kind of punitive, as it would make sr.ht bad for hosting Go.
I think this is just an example of Google being a jerk and not caring enough to do proper software engineering.
Go seems really interesting but I have avoided using it because it’s so tied to Google. And I don’t trust Google to make good decisions for developers or users.
Can you articulate why it isn't a solution, and how it would be punitive? There are people on this thread who appear to believe Google's workaround would mean that repositories hosted on sr.ht would be unusable as Go modules, which is not at all the case.
A full git clone that DDoSes a host just to check whether the user experience is still first-class, and to fill a proxy, is not an acceptable solution for a module host who has to pay the hosting bills himself.
If they want to know whether their proxy is still up to date, a cheap latest-change request 8x/hour would be appropriate.
> Have you considered the robots.txt approach, which would simply allow the sysadmin to tune the rate at which you will scrape their service? The best option puts the controls in the hands of the sysadmins you're affecting. This is what the rest of the internet does.
> Also, this probably isn't what you want to hear, but maybe the proxy is a bad idea in the first place. For my part, I use GOPROXY=direct for privacy/cache-breaking reasons, and I have found that many Go projects actually have broken dependencies that are only held up because they're in the Go proxy cache — which is an accident waiting to happen. Privacy concerns, engineering problems like this, and DDoSing hosting providers: this doesn't look like the best rep sheet for GOPROXY in general.
You didn't answer my question. What's the problem with the Go team's workaround? I get that DeVault would like to redesign the Go modules system to suit his own preferences, but that's not on the table.
The issue isn't the Go module system but rather their proxy, which is not part of Go and should respect server resources but doesn't. The workaround makes any third-party host for Go modules a bad choice, as packages will always be stale.
The issue has nothing to do with DD other than him raising the issue and being ignored. The fault is with Google.
What would be the problem with the Go team doing a quick `git ls-remote` instead of jumping to a full clone? All it would take is tracking the last `ls-remote` result in any of Google's many database options, and only doing a clone when the remote updates.
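A rough sketch of that idea (not the Go team's code; the storage layer here is a hypothetical `store`): ask the remote for its current HEAD with `git ls-remote` and only fall back to a full clone when the hash changes:

    const { execFile } = require('child_process');
    const { promisify } = require('util');
    const execFileAsync = promisify(execFile);

    async function remoteHead(repoUrl) {
      // A single ref advertisement instead of a full pack transfer
      const { stdout } = await execFileAsync('git', ['ls-remote', repoUrl, 'HEAD']);
      return stdout.split('\t')[0];
    }

    async function refreshIfChanged(repoUrl, store) {
      const head = await remoteHead(repoUrl);
      if (head === (await store.getLastSeen(repoUrl))) return; // nothing new, skip the clone
      await store.setLastSeen(repoUrl, head); // hypothetical persistence layer
      await execFileAsync('git', ['clone', '--mirror', repoUrl, `/cache/${encodeURIComponent(repoUrl)}`]);
    }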
Maybe there's no problem? It's totally fair to critique the design of the current module proxy. The odds of them developing precisely the right proxy were low; of course we'll be able to come up with things they can do better. That's how open source works.
It's when we turn this into a morality play that we go off the rails.
Considering they don't want caching, which is considered basic rudimentary politeness (and noting that the change increased traffic, so the proxy was intentionally doing busywork that no one wanted or needed), the odds of them developing an unacceptable proxy were 100%.
It's akin to putting up an open, exploitable DNS resolver in 2022 despite the entire world (discoverable with 2 seconds of research) telling you not to do that.
No one needs to do a full clone twice a minute just to check whether a security update exists.
one of the top money generating machines does not need a devil's advocate
Yeah so basically none of this is true? "Caching" is not considered basic rudimentary politeness, and what they did is not at all like putting up an "open exploitable DNS resolver" (also: putting up an "open exploitable DNS resolver" is pretty much still an industry norm). The "money generating machines" stuff doesn't add to the credibility of this argument.
I get that you feel like you could design a better Go module proxy. It would be weird if you couldn't, because you have the benefit of seeing what happened when we deployed this one†. Congratulations? It's an achievement, I guess?
For my part, I'd be thrilled if just a single person could articulate exactly what the impact to DeVault's code hosting service would be if he took the Go project's offer up on just not having them clone modules hosted on his service so often. Anybody at all, if you could just explain what the problem would be here, I'd be forever grateful.
† (to wit: Go got a lot better, and apparently just 2 source hosts on the entire Internet had a problem with it, one of whom was fixed directly and the other of whom is apparently still mulling whether they want to burn the extra bandwidth for a caching benefit that literally nobody on the Internet seems to be able to describe).
> I get that you feel like you could design a better Go module proxy.
I add "sleep 1" at minimum to all my scraping loops, and refuse to use clients that don't keep-alive and cache (if I'm hitting the same endpoint). I've always done this, so yes, I could have designed a better proxy today, and a year ago, and two years ago, and three years ago, et al.
> Money generating
They can pay people, I'm not doing it for free for a language I don't use
> if he took the Go project's offer up on just not having them clone modules hosted on his service so often.
Do you know that he hasn't tried? Considering how well Google responds to emails...
Some people have principles that you might not understand, I too would not ask someone to not DDoS me, because I feel that is an absurd request.
So not an impact on the technical aspects but on the administrative ones. People are paying DeVault to run the service as DeVault would; it would be improper for him not to run the service as expected of him.
> burn the extra bandwidth for a cacheing benefit that literally nobody on the Internet seems to be able to describe
bandwidth and CPU time both cost money
- - -
There are alternative approaches that existed before 1.16. Deno, with its pull-once-then-pull-only-when-told approach, has not had a security crisis. Every other language works on demand; NPM at most checks a dedicated security-issues API designed for mass consumption. These things existed, and none of them were referenced.
> Do you know that he hasn't tried? Considering how well Google responds to emails...
You can't reasonably build an argument by just making stuff up. Maybe, against all evidence, Google offered to stop hitting sr.ht with Go proxy cache refresh requests and then ignored DeVault when he took them up on it. But I'm not required to assume that very weird situation is what actually happened.
In this case, it's really hard to see thrashing other people's servers relentlessly to collect data you already have as anything but incredibly, incredibly poor engineering. Y'all should write him a check for that much resource waste.
>At Google we were told to stop thinking about all this stuff, that the storage hardware and software people were responsible for hiding things like wearout from application developers.
Something tells me this team was told to "stop thinking about all this stuff, that the network people were responsible for hiding things like speed, latency and cost from application developers." aka network is infinite, keep pounding that repo and we will scale accordingly (our side of the equation, sucks to be other people)
without knowing anything about this situation outside of this thread and the post it links to, it comes across as willful negligence to screw over someone who was a bother in past community transgressions
The 4G-a-day case was a different user, who hosted a Go module for which he was the single user, on his own server; this was not DeVault.
I'd be pretty pissed if I hosted a Go module essentially for myself and suddenly had a $200 bill because Google decided to clone my repository 500 times a day. If it doesn't bother you, how about you donate $200 a month to a charity of my choosing, since it doesn't matter to you.
Should every language be responsible for paying the bandwidth bills for dependencies?
You might look at the most recent comment from the Go team on the issue: there have been no additional requests or events since they last resolved it for both of the affected parties.
Plenty of bootstrapped businesses have better things to spend $200 / month on, let alone the time spent trying to figure out where all the anomalous traffic is coming from. As I understand it, it's not simple file fetches either. It's cloning a repo, which involves two-way communication, consumes CPU and RAM, and causes disk seeks. You're not slapping it on CloudFront and calling it a day. Finally, it looks to me like the costs are going to scale the more people he has using sourcehut and writing Go modules.
I don't really understand turning this around on him. Why should he have to subsidize Google? If it's not a problem, why do we have robots.txt at all? Just let bots hammer your site and cope with it.
The current situation can't be the optimal solution. It wasn't even present prior to Go 1.16. Only one company has the ability to change that. What should he do differently here? Why should he have to spend any of his time or money working around an issue he didn't create?
That was a different user. The fact that a user not running a git hosting service is potentially eating $200 a month should clue you in to the fact that the cost to Drew is likely drastically higher than that.
Google should be sending reimbursement checks for the damage done here on this issue.
Drew is running a code hosting business and this is a cost of providing a feature to the users. He can pass the costs on if it is a problem. He has lots of options and his competitors are not making a big deal out of this.
I suspect he's drawn his line in the sand and wants to keep it going rather than finding a solution that works without requiring upstream changes.
If I provide a paid service and someone abuses it I must deal with it because my larger competitors deal with it? It's good to know that small businesses have no place in the modern world.
> but it has security implications and adds complexity
Read: we prefer to use your servers for caching. Not good enough. Maybe the issue is people making silly evasive arguments like these while the server load piles on?
There's a _lot_ of assuming happening on this post so far. My understanding was the Twitter account previously had different handles, including her real name. I don't think that level of sleuthing requires internal access, using old/archived tweets can unmask the account.
You’re correct but even if Twitter is not responsible here, I’d still like to hear the reporter clarify why she thought the real name of the owner of the account was newsworthy (if that’s indeed what happened, it seems corroboration is scarce thus far)