The Netherlands is arguably the country that most heavily embraces the 4-day workweek. There, employees can request a shorter work week (4 days) for proportionally less pay (e.g. 80% of wages for 4 days).
More than 80% of working moms utilise this option and around 10% of dads. [1] Among my friends I have parents where both of them work 4 days, taking the 5th day off on separate days, and their kid goes to daycare 3 days a week.
Note that most government subsidies (for childcare) are set up to encourage working at least 32 hours per week.
The outrageous cost of daycare means that as a young parent switching to 32h, you're giving up 20% of your pay but saving a few hundred euros a month in daycare costs. So the pay cut ends up being closer to 10%, depending on your income.
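The back-of-the-envelope math works out like this (all figures below are illustrative assumptions, not actual Dutch wages or daycare tariffs):

```python
# Illustrative sketch of the effect described above;
# every number here is a made-up assumption.
gross_monthly = 4000.0            # hypothetical full-time (40h) gross pay
pay_cut = 0.20 * gross_monthly    # dropping to 32h forfeits 20% of pay
daycare_saved = 400.0             # hypothetical saving from one fewer daycare day/week

net_cut_fraction = (pay_cut - daycare_saved) / gross_monthly
print(net_cut_fraction)  # 0.1, i.e. an effective ~10% pay cut
```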
Then the kids grow up and go to school and the parents, if they have enough money, have essentially no incentive to work more hours.
Lazy ass. Back in my day, before walking up hill both ways to school, we had to wake up before we went to sleep and our parents would kill us every night!
In the Netherlands, they're also looking at high labor immigration, and a (probably at least in part) related housing shortage. There are real questions about the sustainability of this part-time "culture".
I used to work for a Dutch bank (in London) ~20 years ago, and pretty much all of our colleagues in the Netherlands worked a four-day week, although with full pay and hours: they just worked longer days across the four days.
In France primary schools are traditionally off on Wednesdays and when I was there it wasn't uncommon for mostly women to have that arrangement: 80% pay but Wednesdays off.
However, they also traditionally had school on Saturday, right? For some reason, that made me irrationally upset when I thought of my kids going to school on Saturdays. I get the argument behind it (it lets the parents take care of things outside of work), but it seems like it would also prevent family outings.
Best I can tell, most schools no longer have classes on Saturday, but they do have Wednesday morning at least. And, at least at my daughter's school, most are there (or at another _recreation center_) in the afternoon.
I've been using Kagi as my default search engine for about 2 months now. I love it, and so far feel it's well worth the $108 per year, just by not having to spend mental energy scrolling past the first several sponsored results, or trying to decide whether a result is paid or not.
I set Kagi to be my default search in my browser (Chrome). For specific searches like stocks, maps or restaurant reviews, I still use the "!g" bang to go to Google.
Never thought I'd pay for search, but I'm very happy with my choice so far. Great to see others on this page agree - I hope they can maintain this as a viable business and stick to the principle that users, not advertisers, are their customers.
I was recently in Palo Alto and bumped into a newly founded startup (I don't remember the name, unfortunately) who had set themselves the grand vision of exactly this: winning a gold medal at the International Math Olympiad using AI. Their plan was to build mostly on LLMs as a start, and iterate as they go. In their barebones office space they had a poster counting down the number of weeks until the event: it was 36 at the time.
It was interesting to wonder how far they could get with this kind of approach. I thought they were shooting for the moon, but I also respected the boldness and determination. They had the funding to operate for at least a year, and were very focused on getting there.
Seems like this prize will spur hundreds (or thousands) of teams to compete in exactly this space. Perhaps it will have an effect similar to the $1M Netflix Prize for recommendation algorithms, awarded in 2009!
Well, math solving is exactly what the rumored Q* is said to be aiming at too.
I don't think it'll take more than 2 years before some LLM + RL system can take the gold medal.
I think companies like OpenAI are aiming for something far more ambitious, like solving a millennium prize problem (even with human assistance). That's the kind of news release that'll add another $100 billion to your market cap.
Their current ambition is to be able to solve school math, which is quite far away from solving unsolved conjectures or math olympiads. I really doubt that any of this is within LLM/transformer scope, except maybe in some auxiliary sense to other, much different architectures.
I don't disagree, in the sense that if we achieve school math in a way that is not mere overfitting on a language-based training set, going on to more advanced mathematics is definitely conceivable. My problem is that people are talking about solving unsolved conjectures while we are not even at a point where we know how to tackle math at all.
Imo we are not currently at the beginning of an exponential curve re solving math with AI, and definitely not on the path to AGI. I understand that if one believes we are on the path to AGI soon, then these math-AI advancements should follow quite soon, but I disagree with the premise.
Art isn't an easier problem than math. An artbot would have sounded more sci-fi than a mathbot only 2 years ago.
Yet it only took the AI world 1.5 years to go from drawing child scribbles to replicating top artists with like 90% similarity (I can barely tell the difference between AI and human drawn art anymore with the new NovelAI model). It won't be long before AI starts to go superhuman in art skills.
It won't take long to go from a school-math model to a math-olympiad model (I'd say 1 year is enough), and going on to unsolved conjectures won't take that much longer either (2-3 years?). We know from AlphaGo that it's possible to make AI systems far superhuman at solving some abstract math problems.
Not sure. Art is about approximate pattern recognition and if you have a large enough dataset it seems that you can definitely reproduce some of that.
For math... it involves consistent reasoning from A to Z, which does not allow for any kind of mistake along the way. In art you won't feel too bad if the shadows or the lights are a little weird, or if a character has 7 fingers instead of 5 on one hand, but these kinds of mishaps break everything in math.
Fingers are already mostly solved. The latest models draw hands better than most human artists.
I think artists would disagree with your assessment of 'approximate pattern recognition'. It's more like:
1. Given a set of words describing what the user wants.
2. Arrange pixels in a grid
3. That maximizes the user rating
On one hand it is tolerant of small errors. On the other hand it's an extremely broad problem. Also, to get a good user rating, it has to do 99 things right for every 1 thing it draws wrong.
Well, about this thing with fingers etc. in drawings, let's put it this way: to us mere mortals, the generated images look very much okay. Artists and people who actually draw, though, very often spot inconsistencies throughout the whole production, including how fingers, arms, overall body posture, etc. are presented. So it is exactly what we can expect: good enough on average, but actually a mediocre result of commonality. Thing is, the wide audience chews on mediocrity all the time, and nobody seems to have been able to change this for ages...
> Mediocrity is exactly what we'd expect from such an approach.
No no, you see I prompted the AI with "masterpiece, photorealistic, 35 mm photography, cinematic, dslr, volumetric lighting, trending on artstation, 4k, 8k, hyper-detailed, epic digital painting by Greg Rutkowski" /s
Someone can tell you when a math problem is solved. Someone else can't tell you when you've successfully art'd with remotely the same degree of confidence. In so far as they can, however, many experts claim that AI cannot, indeed "do an art".
Also, it's easier to formulate a hard math question (there are plenty of unsolved problems), but that's (IMHO) harder to do for art. Sure, you may think this is the first time the phrase "Astronaut riding a Llama and holding an avocado" was writ, but those are all well represented concepts in the dataset. For more abstract prompts, there really isn't a way to verify "correctness".
I think art is much easier for LLM-style AI models to do compared to writing.
To make a nice picture you just need to place pixels near each other in a way that looks good, and we all know LLMs are phenomenal at this.
Good text, on the other hand, is not just text that has good flow and fits the prompt. It must follow a line of thought, and LLMs don't do that by design. We could argue whether they have that capability as an emergent one, but I don't believe that at all.
i.e. the object returned by a Python function with no return statement
...
I could just submit an empty sheet of paper, & an artist would argue that my empty sheet of paper represents any/all of the above.
Now, if I turn in the same empty paper at a math qualifier and argue that it represents the infinite set of real and complex numbers, ergo the answer to the posed qual problem must be in there, I'll get kicked out of that phd program in a jiffy.
In a way it is, in a way it isn't. You have to remember that what is easy for a machine isn't going to correlate with what is easy for us humans. Look at AI art. Closely. No, closer than that. All the detail is fucked up. Not just the hands, but the tiniest of things: strokes, lighting, reflections, consistency, and all that. But can I turn my friend into a convincing werewolf? Yes. Can I turn my cat into a human or Wonder Woman? No. The system isn't a "fancy copier", but it is a compression algorithm, and the aforementioned tasks were only possible because of lots of work training LoRAs, textual inversions, ControlNets, and so on. (You could seriously improve GANs, VAEs, hell, even Boltzmann machines would probably do pretty well if given the same research investment that diffusion has received. GANs come close, though with nuances like having an order of magnitude fewer parameters.)
But let's look at math: can I consistently add numbers? No. The problem is that in math all those tiny, intricate details matter, and they matter at every single step. These are still pattern-recognition machines, but they aren't generalized machines. You can't really derive all of math from probability distributions (or at least not cleanly, and I'm still not convinced you can at all). For math to work in AI we have to address the elephant in the room: math itself. ML people don't like it, but we have to address the axioms we're operating under. How do we move on from machines operating on manifolds? How do we make it so data are not distributional? Moving away from a number of unmentioned axioms remains a large open problem in AI research, and one that does not get anywhere near a serious enough conversation, especially within the community. Sure, maybe transformer circuits can learn some addition by learning to do FFTs and adding in FFT space, but you're not going to get to abstract algebra that way. Ideally the AI can solve problems that have no algorithms, pun intended.
AlphaGo didn't solve Go (i.e., can the first mover guarantee a win?). However, it understood Go at a far, far superior level to any human.
A mathbot doesn't have to solve math in general. It merely has to be better at solving math than any human mathematician to be considered ASI. And it only has to be better than the 'average' human mathematician to be extremely useful in accelerating math research.
> solve Go (i.e., can the first mover guarantee a win?)
Solving Go means determining by what margin the first player can win,
or equivalently, at what komi for white the game is a theoretical draw.
That also makes it dependent on the exact rule set used.
E.g. 2x2 go is a +1 first player win with the Tromp-Taylor rules of Go, while the Japanese rules are not even sufficiently formalized to allow scoring a 2x2 game.
To nitpick a little further, it actually is not possible that the second player has a winning strategy. For that to be the case, P2 would need a winning path of play no matter what P1's first move is.
Suppose that P1 passes on their first move (which is a valid move). Then P2 has a winning path of play in which they put down the first stone. But P1 could have made that move and then they would be on the winning path.
I'm not a game theorist, not in RL, and not a big Go player, but I am having a hard time finding this argument convincing. Isn't the whole reason Go is impressive the enormous set of possible moves? We know that no computer could run every game in the lifetime of the universe, even performing millions of moves a second. From what I understand, the number of possible legal and playable games is north of 10^500. So a difference of 1 doesn't seem meaningful. Even if player 1 always locks out 90% of those possible future moves, that's still an absurdly large search space, and it doesn't seem meaningfully different. Is there something I'm missing, or a more convincing argument?
It's a proof by contradiction, like the halting problem proof. It doesn't rely on the actual playable games at all, but what the existence of a winning strategy would imply. If there was a guaranteed winning strategy for player 2 it would be contradictory because player 1 could execute it by passing their first turn then using the winning P2 strategy. In that game both players can't be guaranteed to win, so there must be some flaw in P2's supposed guaranteed win strategy. https://en.m.wikipedia.org/wiki/Strategy-stealing_argument
I am familiar with strategy stealing and things like tit-for-tat. But even the wiki article you linked suggests that Go is not a symmetric game, which is the requisite condition for strategy stealing to work (which was my underlying belief albeit (very) poorly worded). The wiki suggests both ladder and ko fights create an asymmetry as well as central control. Not to mention Komi explicitly making it asymmetric.
First player does not always have the advantage. Nim is the best example, where the setup can be either a first-player-winning game (the nim-sum of the heap sizes is nonzero) or a second-player-winning one. My understanding is also that chess (another perfect-information, turn-based game) has not been shown solved, nor even proven to have a first-player advantage (though in practice it looks that way).
So I get the argument, I just don't buy it. I would be inclined to lean in that direction, but it's a tough claim theoretically and probably not meaningful in practice (unless a generalized strategy such as strategy stealing can be employed; otherwise a lookup table is impractical, as it would contain more bits than there are atoms in the universe, even for 100-move games).
I think we have to consider far more than strategy stealing, which is not even a strategy that generalizes to all two-person, perfect-information, turn-based games.
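The Nim claim above is easy to check mechanically. A minimal sketch for standard (last-to-take-wins) Nim, where the first player has a winning strategy exactly when the nim-sum is nonzero:

```python
from functools import reduce
from operator import xor

def first_player_wins(heaps):
    # Standard Nim (last to take an object wins): the first player has a
    # winning strategy iff the XOR (nim-sum) of the heap sizes is nonzero.
    return reduce(xor, heaps, 0) != 0

print(first_player_wins([1, 2, 3]))  # False: 1 ^ 2 ^ 3 == 0, second player wins
print(first_player_wins([1, 2, 4]))  # True:  1 ^ 2 ^ 4 == 7, first player wins
```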
The passing is the important part. In chess you can't pass, so going first might be bad: you could be the first to reach a forced zugzwang. If you have the option of passing your first turn, then having a turn before the 2nd player gets to go is at worst neutral for you. Likewise, in Nim you can't pass, and taking your turn might be bad for you, unlike in Hex.
Sorry, I'm not quite following, there seems to be a disconnection.
First, I thought going first in chess is generally considered an advantage. Even the wiki article states that. Or at least says there's a 10% increased win rate.
Second, I still don't get why passing is the important aspect; I thought the important aspect is symmetry. I can understand this in Nim, since symmetry is the killer aspect that makes for the easy analysis of a solution.
When I said I'm not a game theory person, I didn't mean I have no game theory experience, just that it's not what I study. I'm on the mathy side of ML, but not so much in RL. You can use math with me if that makes things easier (in fact, I love math. Please do. RL notation doesn't scare me; rather, it weirds me out that it scares others), because I think we're getting lost in the conditions.
Oh yeah, in chess going 1st is 100% definitely an advantage. For example, pawn to a4 is a really bad opener because it's so close to throwing away that advantage by effectively passing your first turn. It's just that the rules permit that taking a turn can force you into a worse position than you started in, and you can't prove the starting position isn't like that. So from a proof perspective, the empirical truth that going first is really good is not particularly helpful. The passing thing is weird because it empirically is a bad idea; it just closes a theoretical loophole.
I'm not a mathematician myself, just got into this stuff when I was working on a boardgame solver. I find it difficult to map the 'symmetric game' definition "the payoffs for playing a particular strategy depend only on the other strategies employed, not on who is playing them" onto a turn based game, but if it can work for Hex it must be compatible.
If you consider a strategy to be "a decision tree for how to place stones, when I'm playing as 2nd", then there's perfect symmetry between being 1st and choosing to immediately pass, and being 2nd. The possible strategies and resulting payoffs are the same. You add on top the extra move, which the possibility of passing means is at worst neutral for 1st player, and they cannot be at a disadvantage.
More intuitively for me: being allowed to pass your first move is the same as getting to pick which side you want to play. There's no way the side who can pick to play 1st or 2nd at their option can be at a forced loss to the side who just has to accept their decision. The picker would just pick the other side and now they have a forced win.
(I'm always assuming above any kind of infinite-pass-standoff is a draw, and not some kind of weird other thing).
If I haven't expressed my thoughts clearly enough that's probably about as well as I can manage I'm afraid.
> I find it difficult to map the 'symmetric game' definition "the payoffs for playing a particular strategy depend only on the other strategies employed, not on who is playing them" onto a turn based game,
Yeah, so probably the better way to think about it is via the payoff matrix, because symmetry is actually about the strategy; that's why there are the notes about laddering in Go. The payoff matrix of a symmetric zero-sum game satisfies A = -A^T. So for a 2x2 game, a symmetric zero-sum payoff matrix might look like [[0, 1], [-1, 0]]: something like an inverse-identity matrix (strictly speaking, antisymmetric) with opposite signs off the diagonal. Maybe it is best to think about this from a geometric perspective: this matrix is a rotation matrix - that's what it does when applied to a vector. Recall the standard form [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]: with theta = -90 degrees we get cos = 0 and sin = -1, which gives exactly [[0, 1], [-1, 0]], so our matrix is a rotation by -90 degrees (a quarter turn). Okay, yeah, maybe that's confusing lol. But I find it helpful to see matrices as transforms, and I wish this was stated more clearly and more often.
So now that we maybe understand that: symmetry is about a __strategy__, not a player, because our payoff matrix is strategy-based. For example, our strategy for rock-paper-scissors is to pick each option 1/3 of the time, which gives us this symmetric payoff. But if we pick rock every time, we don't get that payoff, right? So it's actually not just about who goes first or second; it also includes the strategy aspect.
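A quick numpy check of both claims: the rock-paper-scissors payoff matrix satisfies the antisymmetry condition A = -A^T, and the uniform 1/3 mix earns expected payoff 0 against any opponent mix (the opponent's probabilities below are an arbitrary made-up example):

```python
import numpy as np

# Row player's payoff; rows/cols ordered (rock, paper, scissors).
A = np.array([
    [ 0, -1,  1],   # rock ties rock, loses to paper, beats scissors
    [ 1,  0, -1],   # paper
    [-1,  1,  0],   # scissors
])

print(np.array_equal(A, -A.T))  # True: antisymmetric, so the game is symmetric

p = np.full(3, 1 / 3)              # uniform mixed strategy
q = np.array([0.5, 0.3, 0.2])      # arbitrary opponent mix (made up)
print(np.isclose(p @ A @ q, 0.0))  # True: expected payoff is 0 regardless of q
```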
At least that's my understanding which a lot is prompted by this conversation (thanks!)
The reason I'm finding the Go argument hard is that I'm thinking of a basic "entropy"-based strategy (it'll serve you well in board games, especially when sight-reading): if you don't know the best move, play the move that gives you the most future moves. It'll trick you into thinking this strategy is simple; it isn't. In the game of Go, this isn't meaningfully different from making a random move, because there are just so many. And realistically your strategy is going to be the composition of many different strategies. Like you said, pull out the decision tree, but we can abstract this a bit more and have a decision tree over strategies, a superset of each strategy played in response (e.g. a ladder is set up, so we play the laddering strategy). The reason I'm not buying the argument isn't the logic, it's the size of the possible move sets. Even with super-ko (the board cannot return to a state it has previously been in at any point in the game - must be fun to keep track of...), and forgetting about all the extras that are played in Go, passing shouldn't result in a meaningful change in the number of possible strategies. But this might actually be an argument in favor of symmetry, not against it. Coming back to chess: we know that game __is not__ symmetric. Why? Because white has different strategies than black. If instead the first "move" is to flip a coin that decides who plays white and who plays black, then the game actually becomes symmetric. Kinda wild...
I didn't read this, but a glance suggests that black dominates in smaller games
I would assume this is possible for any sufficiently complex game. Would you mind answering a few questions from someone near-completely ignorant about Go?
Does second mover in Go have some sort of artificial benefit in scoring or playing? As in -- is there something to compensate for moving second?
On the face it seems like first mover would have an advantage in any turn-based game. But maybe, in some games, seeing an opponent's strategy is more helpful than executing the strategy.
Also, are there examples of real games where second mover can always win? (Real as in, not made up with weird rules just to demonstrate it's possible.)
I have a couple friends who did the Math tripos at Cambridge (so a pretty high level!) who work in tech and have unanimously said they have 0% expectations of an LLM doing a millennium problem anytime soon
Yeah, millennium problems almost certainly require truly novel nontrivial ideas to solve.
That's a tough thing for AI to do.
On the other hand, Terence Tao had an interesting article on his blog a while back where he was trying to solve a problem and asked ChatGPT about it in a high-level strategy sense. ChatGPT suggested several reasonable approaches, one of which turned out to work.
That's nowhere near solving a millennium problem, but it is very interesting and suggests fairly sophisticated conceptual understanding of mathematics nevertheless.
I don't think current architectures and training methods are enough to get there. However, with enough compute, I can plausibly envision some sort of meta-training of LLMs, using an analogy to GANs, where one network tries to synthesize new correct ideas and the other shoots them down as not novel, not correct, or not sufficiently interesting.
Such an approach I think could perhaps work, but the compute needed would probably be pretty high.
And if you could combine that with a proof assistant like Lean that can validate proofs, perhaps you could run a targeted search of the problem space and, with a combination of a general high-level strategy and maybe brute force on some sub-problems, find novel solutions. My understanding is that this is the common human approach: gain some kind of intuition for the problem, try a few things, and then iteratively refine. Sometimes that works; sometimes you need to find a new starting point. It seems plausible we could automate that workflow, with likely mixed but still useful results.
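That intuition-then-refine loop can be sketched as a bare propose-and-verify cycle. The "conjecture" below is deliberately trivial (find a Pythagorean triple) and the proposer is a random number generator standing in for a model; only the shape of the loop matters:

```python
import random

def propose(rng):
    # Stand-in for a model suggesting a candidate (here: random guessing).
    return tuple(rng.randint(1, 50) for _ in range(3))

def verify(candidate):
    # Stand-in for a trusted checker (e.g. a proof assistant): cheap and exact.
    a, b, c = sorted(candidate)
    return a * a + b * b == c * c

def search(seed=0, budget=100_000):
    rng = random.Random(seed)
    for _ in range(budget):
        cand = propose(rng)
        if verify(cand):
            return cand
    return None

print(search())  # some Pythagorean triple, e.g. a permutation of (3, 4, 5)
```

The systems discussed here would swap `propose` for an LLM and `verify` for Lean-style proof checking; the untrusted-generator-plus-trusted-checker structure is the point.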
I'm not trying to understate what you can do with LLMs, or with formal proof tooling attached so they can randomly try things until reaching a solution to a novel problem. But some solutions you see even to lesser problems are so pie-in-the-sky that I think stumbling on one is barely better than a random walk, and mathematicians like Tao are far better at narrowing that down. As an aid I can see its use; I just don't believe we're going to have LLMs and tooling solve these grand problems until compute power is orders of magnitude better, and even then I'm still sceptical. BUT I'm not a mathematician, and this is based on my intuition from pub talks :)
Yeah, but none of that gets you a solution to a Millennium problem. One needs an AI to do something like what Peter Scholze did in inventing perfectoid spaces or Perelman in introducing his entropy function. You can treat axiom invention as a game, I suppose, but the space of possible moves is uhh rather large.
The field of LLM reasoning is far from stable atm. I think it's pretty hard for anyone to give confident predictions about what they can or cannot do in 5 years. In that light, I am skeptical about any claims that they cannot do something anytime soon.
What does an LLM really do? And can it create sometimes entirely novel math that goes against its training set to solve something, and know it’s right using a proof tool that may not even accept that? I agree but on a different order of magnitude of years.
That's 4-5 years for solving Olympiad problems. Those are just very tricky high school math problems. They have solutions and can generally be solved by applying some combination of standard tricks. It's very much the sort of thing an LLM should be good at.
Solving Millennium problems is a whole different ballgame. It's not known if these problems are solvable within ZFC axioms. (In one case, the Yang-Mills prize, stating the problem mathematically is part of the challenge.) All of the obvious applications of known tricks have been tried and failed. To solve such problems, one probably has to invent new and surprising mathematical definitions, building a framework in which the problem becomes solvable. This is something that LLMs will be crap at; the process of invention is not represented in any training data we have access to.
We can look at it this way: there are ~1000 chess Grandmasters and one World Champion. It took a very short time for AI to go from beating an average GM to beating the World Champion.
There are ~1000 IMO winners and 1 (one) Millennium problem solver...
We can. But doing so frames math research as the same sort of activity as math problem solving... it's not. Many IMO champions struggle to do any successful math research, and many successful math researchers (e.g., all of the most recent batch of Fields medallists) never did the Olympiad at all.
Here's what Andrew Wiles, the only other person to have solved a Millennium-class problem, has to say about math competitions: "Let me stress that creating new mathematics is a quite different occupation from solving problems in a contest. Why is this? Because you don't know for sure what you are trying to prove or indeed whether it is true."
I'm sure lots of people have been thinking/working in that direction. I think the idea of combining LLMs with formal verification / proof assistant tools (Lean, Coq, Isabelle, ...) is particularly interesting. Is anyone aware of any major groups working on open-source solutions for this?
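For anyone unfamiliar with these tools: a proof assistant mechanically checks every step, so a model's proposed proof is either accepted or rejected outright. A trivial Lean 4 example (relying on the stock `Nat.add_comm` lemma from Lean's library):

```lean
-- The kernel either accepts this proof term or rejects it;
-- there is no "almost right" in between.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```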
your sense of humor is much appreciated. couldn't have said it better. hah :) basically i would suggest the study to expand to everyone doing anything titled AI at this moment.
While I get your point and agree, there apparently is no Dunning-Kruger effect. It was determined to be yet another case of faulty data analysis in behavioral psych.[1]
That paper was thoroughly rebutted when posted to HN.
But Gwern has written about some earlier debunkings of the D-K effect, and furthermore D-K was never about the popular misconception of D-K ("incompetent people think they are more competent than competent people").
That first article cites more behavioral psych research, which has the absolute worst track record of data analysis. After so many retractions and scandals, I find it difficult to know what to believe anymore.
I don't think the actual winning algorithm itself was used, because real world systems have more constraints/requirements than what the recommender was trained on. But that was in 2009, pre deep-learning/AI summer, and $1 mil clearly helped stimulate interest in that area.
Today we see multiple billion dollar recommender systems, like Tiktok. Netflix ironically benefits the least from recommenders due to the nature of its dataset (Very expensive, low sample size).
My understanding was that their research on what drove engagement shifted quite a bit. Things like social proof, and product patterns like auto-loading the next episode to binge drove engagement metrics. Recently there were some articles about their team custom-identifying which cuts of a video to show as a trailer maximized engagement on a personal level. In some sense that is a recommendation, but it is a broader problem space.
I have a more cynical take; the recommendations declined when Netflix started producing their own content. Prior to this, what constituted a "good recommendation" was aligned between Netflix and the customer, but afterwards not so much.
Today Netflix is in the "how do we get our customers to use our service as little as possible but still pay us every month" phase of their media-company hypocrisy. From a business standpoint, that is their best optimization. They are the AOL/Time Warner of 20 years ago.
If I recall correctly, this was the same frustration out of which Lobsters[1] was created. In the case of Lobsters, moderation decisions have been carefully considered and implemented.
My two cents is that it's when it comes to comments - and moderation - that things get challenging. It's also where both HN and Lobsters managed to find a (hopefully) sustainable model. Good luck with this!
Threads.com is owned by an ex-Facebook employee who launched a Slack alternative in 2017. I would be surprised if they were to give up a strong domain name (and brand that they surely have trademarks for) that easily!
people keep talking about threads like it's failing, but i use it every day and it seems alive to me. most of the people i followed on twitter are there, and there's enough content to keep me generally interested every time i open the app and to keep me coming back.
i think something like knowing whether or not they have a desktop app should be a prerequisite for having an opinion on whether they're failing or not.
As someone living in the EU who wants to use Threads but cannot - Meta has blocked it here while they work out GDPR compliance, which has not been solved since launch - it's amusing to me to read how Meta is planning to do various growth hacks.
My humble suggestion would be to first, perhaps, roll out to the EU? On one hand: sure, we are “only” talking about 450M potential users. On the other: anyone who has friends in this region or an interest in someone based here: well, that person needs to go to Twitter/Mastodon/BlueSky etc.
I am not saying this definitely explains all the growth struggles, but surely it doesn't help true, global adoption?
If you feel like you're being left out on access to Threads, open a Google image search of some of your favorite brand logos - Wendy's, Dr Scholl's, etc. And then imagine some hilarious corporate-culture-friendly conversations that could happen between those brands as written between social media interns.
e.g.
Wendy's - "Boy, my feet are tired from doing Fortnite dances!"
Dr Scholl's reply - "We've got just the products to keep you hitting the griddy for hours! LMBO!"
The remainder of the experience is engagement farming.
I'm curious if anyone there has explained exactly what the blocker is. Like, why can BlueSky operate there but not Threads? What work are they doing to get launched in the EU?
surely the smart thing to do would have been just to roll the whole thing out without selling anyone's data at all, then implement GDPR-compliant monetisation post hoc?
“According to the Amazon Prime Day blog post, DynamoDB processes 126 million queries per second at peak. Spanner on the other hand processes 3 billion queries per second at peak, which is more than 20x higher, and has more than 12 exabytes of data under management.”
This comparison doesn't seem exactly fair? My read is that Amazon's 126 million queries per second came purely from Amazon's own services serving Prime Day traffic on DynamoDB, not from all of AWS.
What would perhaps have been a fairer comparison is the peak load that Google's own services generated on Cloud Spanner, and not the sum of all Spanner usage across all of GCP and all of Google (including Spanner on non-GCP infra).
I will say that it would show a massive vote of confidence to say that Photos, Gmail and Ads heavily rely on GCP infra: which would be brand new information for me! It would add confidence to learn more about how they use it, and whether Cloud Spanner is on the critical path for those services.
What is confusing, however, is how this article consistently uses "Cloud Spanner"... except when talking about Gmail, Ads and Photos, where it states that these products use "Spanner", not "Cloud Spanner"! As if they were not using the Cloud Spanner infra, but their own. It would help to know which is the case, and what the load on Cloud Spanner is - as opposed to Spanner running on internal Google infra that is not GCP.
At Amazon, practically every service is built on top of AWS - a proper vote of confidence! - and my impression was that GCP had historically been far less utilised by Google for their own services. Even in this post, I'm still confused and unable to tell if those Google products listed use Cloud Spanner or their own infra running Spanner.
> DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers. Over the course of Prime Day, these sources made trillions of calls to the DynamoDB API. DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 126 million requests per second.
Amazon was very, very clear on this. For Google to use that number without the caveat is just completely underhanded and dishonest. Whoever wrote this is absolutely lacking in integrity.
I used DynamoDB as part of the job a few years ago and never got single-digit millisecond responses - it was 20ms minimum and 70+ on a cold start, though I can accept that optimising Dynamo's various indexes is a largely opaque process. We had to add hacks like setting the request timeout to 5ms and keeping the cluster warm by submitting a no-op query every 500ms to keep it even remotely stable. We couldn't even use DAX because the Ruby client didn't support it. At the start we only had a couple of thousand rows in the table, so it would have legit been faster to scan the entire table and do the rest in memory. Postgres did it in 5ms.
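For anyone curious what that keep-warm hack looks like in practice, here's a minimal sketch (in Python rather than the Ruby the commenter used; the function names and intervals are hypothetical, not the project's actual code): a background thread fires a cheap no-op query on a fixed interval so the client's connection pool never goes cold.

```python
import threading
import time

def keep_warm(ping, interval_s=0.5):
    """Fire a cheap no-op query every `interval_s` seconds on a background
    thread so the client's connection pool never goes cold."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval_s):
            try:
                ping()  # e.g. a GetItem on a known dummy key
            except Exception:
                pass  # a failed warm-up ping should never crash the app

    threading.Thread(target=loop, daemon=True).start()
    return stop  # call stop.set() to shut the pinger down

# Usage with a stand-in for the real client call:
calls = []
stopper = keep_warm(lambda: calls.append(time.time()), interval_s=0.05)
time.sleep(0.3)   # let a few warm-up pings fire
stopper.set()
```

Whether this is worth the extra request volume depends entirely on how much of the observed latency was actually connection setup, which is hard to know from the outside.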
If Amazon said they didn't use DAX that day I would say they were lying.
The average consumer or startup is not going to squeeze out the performance of Dynamo that AWS is claiming that they have achieved.
In fact, it might have been fairer if the Ruby client didn't hard-code the HTTP client (Net::HTTP). I imagine performance could have been boosted by injecting an alternative.
What a cool lil side project/company! Going to circulate this among friends...
Little bit of well meaning advice: This needs copy editing -- inconsistent use of periods, typos, grammar. Little crap that doesn't matter in the big picture, but will block some from opening their wallets. :) ("OpenTeletry", "performances", etc.)
All in all this is quite cool, and I hope you get some customers and gather more data! (a 4k object size in S3 doesn't make sense to measure, but 1MB might be interesting. Also, check out HDRHistogram, it might be relevant to your interests)
Nice dash - if you don't mind a drive-by recommendation: I use Grafana for work a lot and it's nice to see a table legend with min, max, mean, and last metrics for these kinds of dashboards. Really makes it easy to grok without hovering over data points and guessing.
What is more important for me when using Grafana (though a summary helps too) is units: knowing whether a number is seconds, milliseconds, or microseconds, and whether 0.5 is a quantile or something else.
Numbers without units are dangerous in my opinion.
> We had to add on hacks like setting the request timeout to 5ms and keeping the cluster warm by submitting a no-op query every 500ms to keep it even remotely stable.
This sounds like you're blaming dynamo for you/your stack's inability to handle connections / connection pooling.
Been using DynamoDB for years and haven’t had to do any of the hacks you talk about doing. Not using ruby though. TCP keep-alive does help with perf though (which I think you might be suggesting.)
I don’t have p99 times in front of me right this second but it’s definitely lower than 20ms for reads and likely lower for writes. (EC2 in VPC).
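For reference, the TCP keep-alive being suggested is a socket-level option; the AWS SDKs expose it as a config flag (botocore's `Config(tcp_keepalive=True)`, if memory serves). A minimal sketch of the underlying knob:

```python
import socket

# Create a TCP socket and enable keep-alive on it - the same option the
# SDK flag ultimately sets on its pooled connections.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
sock.close()
```

This keeps idle connections from being silently dropped by intermediaries, which is often the real cause of the "cold" latency spikes people attribute to the database itself.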
They very well know that people don't read sh* anymore. Just throw numbers there, PowerPoint them and offer an "unbiased" comparison where Google shines - buy Google.
Worst case scenario, it's Google you're buying, not a random startup etc.
Just as a hand in the air... be careful about what you're comparing here. The number of API calls over a period of time is largely irrelevant in the face of QPS. I can happily write a DDoS script that massively bombards a service, but if that halts my QPS then it doesn't matter. So sure, trillions of API calls were made (still impressive in the scope of the overall network of services, I'm not downplaying that), but ultimately, for DynamoDB and Spanner, it's the QPS that matters when comparing DB scaling and performance.
Google calls API calls “queries” because of their history as a search engine. QPS == API calls per second == requests per second.
That said, I can’t imagine these numbers mean much to anyone after a certain point. It’s not like either company is running a single service handling them. The scale is limited by their budget and access to servers because my traffic shouldn’t impact yours. I feel like the better number is RPS/QPS per table or per logical database or whatever.
Yes, but QPS vs. "queries to the API". The difference is the time slice. I should have been more explicit. The key here really is the time function between the numbers. That the AWS blog calls out trillions of API calls isn't relevant because there wasn't a specific time denominator. The 126M QPS is the important stat.
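To make the time-denominator point concrete, a quick back-of-envelope (the "5 trillion" figure is purely illustrative, since AWS only said "trillions"; the 48-hour window is the typical Prime Day length):

```python
# Peak rate AWS published for Prime Day on DynamoDB:
peak_qps = 126_000_000

# Suppose "trillions of API calls" means ~5 trillion over the event,
# and the event runs for roughly 48 hours:
total_calls = 5_000_000_000_000
seconds = 48 * 3600

# Averaging total calls over the window gives a much smaller number
# than the peak - which is why the total, with no time denominator,
# tells you little about scaling headroom.
avg_qps = total_calls / seconds
print(f"average: {avg_qps/1e6:.0f}M QPS vs peak: {peak_qps/1e6:.0f}M QPS")
```

Same dataset, two numbers that differ by roughly 4x: the peak is the stat that actually stresses the database.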
We shared some details about Gmail's migration to Spanner in this year's developer keynote at Google Cloud Next [0] - to my knowledge, the first time that story has been publicly talked about.
I tried to find it in this video, but failed. Could you please share a time stamp on where to look?
It’s a pretty big deal if Gmail migrated to GCP-provided Spanner (not to an internal Spanner instance), and it sounds like the kind of vote of confidence GCP and Cloud Spanner could benefit from: might I suggest writing about it? It’s easier to digest and harder to miss than an hour-long keynote video with no time stamps.
And so just to confirm: Gmail is on Cloud Spanner for the backend?
It's almost certainly not the case that Gmail uses Cloud Spanner rather than internal Spanner. I don't think Cloud Spanner (or most of Google's cloud products) has the feature set required to support loads like Gmail's (both in terms of technical capability and security/privacy features).
When I worked at Google I tried to get more services to migrate to the cloud but the internal environment that was built up over 25 years is much better at supporting billion+ users with private data.
And yet, if they do, that's probably one of the best sales pitches they could have - dogfooding. After all, isn't that also how AWS started, just reselling the services and servers they already use themselves?
It doesn't make much sense to have a 'better' version of a product you sell but keep it internal.
Yet Amazon Retail still doesn't use DynamoDB for its critical workloads. They still rely on an internal version of DynamoDB (Sable) which is optimised for Retail workloads.
looks like it starts at 50:45. youtube recently made it so you can click "show transcript" in the description then ctrl-f takes you to all the mentions. very helpful for long videos like this.
In the timestamped video link shared downthread, the speaker does seem to strongly imply that Google Workspace doesn't manage the infra. When he finishes explaining the migration he declares (around 55:18): “[…] we can focus on the business of Gmail and Spanner can choose to improve and deliver performance gains automagically [sic]”, which would imply, to me at least, that it's on GCP.
That's not what it implied to me. To me, it meant that they adopted an internal managed Spanner with its own SRE team, instead of running their own Spanner. In the past, Gmail ran their own [[redacted]]s and [[redacted]] even though there were company-wide managed services for those things.
Agree, but with the caveat that [[redacted]] and [[redacted]] were old and originally designed to be run that way. All newer storage systems I can recall were designed to be run by a central team after many years of experience doing it the other way. And many tears shed over migrating to those centralized versions.
Source: I was on the last team running our own [[redacted]].
Wow, almost content-free presentation! How obnoxious!
This wasn't the first time Gmail has replaced the storage backend in-flight. The last time, around 2011, they didn't hype it up; they called it "a storage software update" in public comms. That other migration is the origin of the term "spannacle": during it, the accounts that resisted moving from [[redacted]] to [[redacted]] were called barnacles.
> I will say that it does show a vote of confidence to say that Photos, Gmail and Ads use GCP infra,
I'm not sure? I guess I'm mostly not sure what "gcp infra" means there. The blog post says
"Spanner is used ubiquitously inside of Google, supporting services such as; Ads, Gmail and Photos."
But there's google-internal spanner, and gcp spanner. A service using spanner at Google isn't necessarily using gcp. (No clue about photos, Gmail, etc)
Granted, from what I gather, there's a lot more similarity between spanner & gcp spanner than e.g. borg and kubernetes.
Which can be the difference between 99.99% availability and 99% availability with data corruption issues. Not saying that's the case here but one should not downplay the difference deployments can make.
Surely in a post about Google Cloud Spanner, all examples mentioned use Google Cloud Spanner? It would be moot listing them as examples if they would not: so my assumption is they are all using GCP infra already for Spanner.
I really want to give Google the benefit of the doubt: but it doesn't help that they did not write that eg Gmail is using "Cloud Spanner." They wrote that it uses Spanner.
This is putting a lot of faith in GCP advertising. I strongly doubt the idea that the Google workloads discussed are deployed on GCP instead of internal Borg infrastructure.
Years ago they did a reorg and moved all infrastructure services under Cloud even though they are not Cloud products. That would enable this kind of obfuscation because Cloud is literally responsible for both Cloud Spanner and non-Cloud Spanner and they can conflate these two in their marketing copy. They probably feel justified in doing so because they share so much code.
Infra and Cloud Spanner are the same stack. Having those services run on infra is more about the legacy of tooling to shift it rather than anything around performance or ability to handle it.
>This comparison seems to be not exactly fair? Amazon’s 126 million queries per second was purely for Amazon-related services serving Prime Day generating this on DynamoDB, and not all of AWS is my read.
There's no indication that google is talking about ALL of spanner either? The examples they list are all internal google services, and they specifically say "inside google".
I'm also dubious that even with all of the AWS usage accounted for that DynamoDB tops Spanner if Amazon themselves are only at 126 million queries per second on Prime Day.
> At Amazon, practically every service is built on top of AWS - a proper vote of confidence!
Not only this, but practically most, if not all, of the AWS services use DynamoDB, including use cases that are usually not for databases, such as multi-tenant job queues (just search "Database as a Queue" to get the sentiment). In fact, it is really, really hard to use any relational DB inside AWS: a team would have to go through CEO approval to get an exception, which says a lot about the robustness of DDB.
Eh, this isn't accurate. Both Redshift and Aurora/RDS are used heavily by a lot of teams internally. If you're talking specifically about the primary data store for live applications, NoSQL was definitely recommended/pushed much harder than SQL, but it by no means required CEO approval to not use DDB
Edit: It's possible you're limiting your statement specifically to AWS teams, which would make it more accurate, but I read the use of "Amazon" in the quote you were replying to as including things like retail as well, etc.
When I was at AWS, towards later part of my tenure, DynamoDB was mandated for control plane. To be fair, it worked, and worked well, but there were times when I wished I could use something else instead.
> What would have perhaps been a more fair comparison is to share the peak load that Google services running on GCP generated on Spanner, and not the sum of their cloud platform.
Not necessarily about volume of transactions, but this is similar to one of my pet-peeves with statements that use aggregated numbers of compute power.
"Our system has great performance, handling 5 billion requests per second" means nothing if you don't break down how many RPS per instance of compute unit (e.g. per CPU).
Scales of performance are relative, and on a distributed architecture, most systems can scale just by throwing more compute power.
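A tiny sketch of the normalization being asked for here (all numbers made up for illustration): divide the aggregate throughput by the compute that produced it before comparing systems.

```python
def per_unit_throughput(total_rps: float, instances: int,
                        vcpus_per_instance: int) -> float:
    """RPS per vCPU: a crude but comparable efficiency metric."""
    return total_rps / (instances * vcpus_per_instance)

# "5 billion RPS" sounds impressive...
big = per_unit_throughput(5_000_000_000,
                          instances=500_000, vcpus_per_instance=16)

# ...but a modest service may be doing far more work per core:
small = per_unit_throughput(50_000,
                            instances=2, vcpus_per_instance=8)

print(big, small)  # the small system wins on efficiency per vCPU
```

The point isn't that per-vCPU is the one true metric, only that an aggregate with no denominator lets you "scale" a claim simply by adding hardware.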
Yeah, I've seen some pretty sneaky candidates try that on their resumes. They aggregate the RPS for all the instances of their services even though those don't share any dependencies or infrastructure - they're just independent instances/clusters running the same code. When I dug into those impressive numbers and asked how they managed coordination/consensus, the truth came out.
True, but one would hope that both sides in this case would be putting their best foot forward. Getting peak performance out of right sizing your DB is part of that discussion. I can't imagine AWS would put down "126 million QPS" if they COULD have provided a larger instance that could deliver "200 million QPS", right? We have to assume at some point that both sides are putting their best foot forward given the service.
The 126M QPS number was certainly just the parts of Amazon.com retail that power Prime Day, not all of DDB's traffic. If we were to add up all of DDB's volume, it would be way higher - at least an order of magnitude, if not more.
Large parts of AWS itself uses DDB - both control plane and data plane. For instance, every message sent to AWS IoT will internally translate to multiple calls to DDB (reads and writes) as the message flows through the different parts of the system. IoT itself is millions of RPS and that is just one small-ish AWS service.
Put yourself in the shoes of who they're targeting with that.
Probably dealing with thousands of requests per second, but wants to say they're building something that can scale to billions of requests per second to justify their choices, so there they go.
it does depend on what you mean. By 2020/2021, effectively everything was on top of AWS VMs/VPC and perhaps LBs at that point? Most if not all new services were being built in NAWS.
SPS was heavily MAWS and I got sick of being the NAWS person from years prior pushing for NAWS in our dysfunctional team, and quit. The good coworkers also quit.
Yet I still see the very deep stack of technically incapable middle manager sorts dutifully posting "come join us" nonsense on LinkedIn.
(I had the luxury of having worked in one of the inner sanctums of Apple hardware for years prior, so was immune to nonsense, and didn't need the job.)
It's pretty amusing how a bunch of teams are posting videos on how they built a specific, bespoke hardware/software component or two for MrBeast videos. This one was about the ~100 hardware button sets and LED strips required for the video. And here is another one [1] on how another team built 456 "detonator units" for the Squid Game video: also finished in the nick of time.
It seems clever for MrBeast to hire DIY YouTubers for these bespoke jobs: it's a win-win even after the video, as the teams create their own "how we built this" videos. It also gave me an appreciation both for how much work these massive videos are (we only saw one small part of the video, the hardware, being put together) and for how chaotic it can all feel!
Unfortunately we are not going to see the behind-the-scenes of the VFX in the recent videos. I saw the video about the train and the big hole; it's full of VFX, but many people argue the video is genuine, so revealing the use of VFX would kinda ruin it for many people. This wasn't true for the Squid Game video, because the fun was the players challenging themselves - or at least I hope the viewers didn't actually think the players were doing tug of war on a 100m-high cliff.
I agreed after watching Will Osman's video last year, since MrBeast was in it a few times, but in this one MrBeast didn't meet with the dude once, even to say thank you?
I have to imagine he didn't get paid other than in "exposure" (which is admittedly probably worth something coming from MrBeast), so it feels a little scummy for the person who's getting rich not to give at least a little in-person acknowledgement to the folks doing all the work behind the scenes.
I'm happy with the way things turned out. Obviously can't disclose detail on our arrangement, but I'll say there was nothing scummy here. I just hope the MrBeast org can get to a phase where they have more time for planning out any video requiring technical integration.
The 'struggle' and the lack of time is implicit in the entertainment value of watching what unfolds. I'd argue most of this type of content is specifically geared towards dramatising what should otherwise just be planned accordingly with a decent producer.
These folks are smarter than smart when it comes to gamifying the human psyche.
Note that a competitor to VanMoof, Cowboy Bikes, has also created an iOS app - with a beta on the App Store - that supposedly does the same (saving the encryption keys locally).
I say supposedly because the source is not available to inspect. However, the linked FAQ states that no data is stored outside the app.
(Note this is not an endorsement of Cowboy - about whom I know nothing - just pointing to the only iOS app I currently know of that saves these same encryption keys for VanMoof owners.)
Going from Vanmoof to Cowboy Bikes is like switching from the plague to cholera.
Same modus operandi: excessively VC funded, lots of hype, proprietary parts, mounting losses, losing money with every sale... and it's getting worse every year. Cowboy Bikes is just another bankruptcy waiting to happen.
I'm riding a Cowboy right now, and I chose it because they didn't have the service and quality issues that Vanmoof had. I've been very happy with it and hope they will not go under. I was annoyed with the app but I created a bluetooth key fob so I don't have to use the app to unlock it. Now I'm even happier with my daily rides.
That being said: if I were in the market today I think I'd go for the Veloretti Ace. Less proprietary parts, great design and owned by a stable company.
They spent a lot of money on expansion, opening stores in places like NYC and Tokyo. I'm also betting that since all the parts in their bikes are custom-produced for them, the production cost of their bikes is quite high as well (that is, they're not making that much money per bike).
Then why do they do it? What is the supposed gain from producing your own parts, when mass produced parts like Shimano are available all over the world? Is it to increase sales of their repair services after the warranty ends?
They want to "re-invent the bike". They do (or did) produce some pretty new/innovative designs; I haven't really followed them since they went electric because I'm not that keen on electric bikes, but the original (non-electric) had the lock roll up inside the frame for example, which is pretty nifty. Lots of people also like the design from an aesthetic point of view.
"Use standard stuff" has a lot of value, obviously, but it also limits you. This is true for everything: from bike parts to the POSIX API or POSIX shell.
The team behind Cowboy also has a history of going bankrupt and not doing the right thing. Their previous venture was an Uber Eats competitor, and they took end users' money until the end but never paid the restaurants or the deliverers. That's the main reason why I can't get myself to buy one of their bikes. But it seems VanMoof wasn't any better.
Author here. I noticed this tweet made it to Hacker News. (I didn’t even notice that another one did yesterday as well)
I don’t care about the views or clicks or “engagement” or “driving attention” or similar. I understand talk is cheap so I have asked dang to blacklist all tweets from my Twitter account (this URL) going forward on HN, which should significantly reduce such views and clicks.
The tweets are not editable, and I often type them out as I go. I shared this, for example, after talking with a current Googler who was very, very frustrated exactly because of this. I thought it’s an interesting angle, especially as I’m also a Domains customer.
“Sensationalist” is something I would definitely like to avoid. I used to “break” layoff news at tech companies in the fall of 2022 (before or as the layoffs happened), which had very high “engagement” but sat increasingly poorly with me - it did feel sensationalist - so I stopped doing all of this, regardless of anything I learn ahead of some other outlet sharing it. I’m happier for it.
I do have my own opinions and experiences with Google as a customer, going back all the way to the massive GAE price increases in 2011 when I was an early customer, and of course this contributes to my - necessarily biased - outlook.
There is also truth to Twitter takes often feeling sensationalist - brevity doesn’t help with nuance - and I don’t want to get more views/clicks/“engagement” on any of these or contribute to “outrage.”
(By the way, thank you for an earlier criticism that I took to heart.)
You don't seem to have any knowledge at all about how personal data transfers work in the case of a merger or acquisition. Which is a radically different scenario from two standing independent companies exchanging personal data.
It's common practice and common sense that you get access to customer data when you take over a company, how else would you even run it? This has been true and normal for centuries. The concept still stands when applying the strictest of modern privacy laws, it's called "legitimate business interest". Meaning: there is no business without this data, the personal data is required to deliver the service at all.
So no, it's not a selling of your data, it's a change of control. And no, it does not require your consent.
If you don't want to see accusations of "sensationalist" you shouldn't confidently spread hot takes on topics you don't even seem to comprehend the basics of. And if you then go wrong and get corrective feedback as is happening on the Twitter spread, perhaps not double down.
With a company being sold: yes, customer data is also sold.
Google had not been in the business of selling its businesses until now (and thus was not selling customer data either). They were in the business of building own products (sometimes shutting them down) and buying other companies.
This is the first time they are shutting off a division and also selling off customers’ data.
Taking control of a company and thereby getting ACCESS to customer data is not the same thing as selling customer data. They are radically different concepts.
As to whether Google has done this before or not is completely irrelevant.
But this is NOT a company being sold! And not even a division or a product sold. It’s not a merger or acquisition. It is, what you described as this: “two standing independent companies exchanging personal data.”
Google Domains and Google Cloud Domains is shutting down: no technology or people transfer happens. Customer accounts are what are sold, as laid out in black-and-white by Google:
“On June 15, 2023, Google entered into a definitive agreement with Squarespace, where they intend to purchase all domain registrations and related customer accounts from Google Domains.”
We can argue what it means to sell “customer accounts from Google Domains” from Google to a third party company.
My interpretation is that this is selling customer data (data the buyer needs in order to operate these accounts: it is selling, nonetheless, to a third party who is not Google). To me, this sounds like the start of a new era where customer accounts owned by Google can be sold (and this is the first instance). Which is fine: but then Google cannot claim they will never do this going forward. This was the deal Google promised its customers - which implied they won’t sell off e.g. Gmail accounts, YouTube accounts, or Google Domains accounts to a third party.
Google didn’t write that Squarespace is “getting access to customer data”. They wrote Squarespace is purchasing these accounts (likely this wording to avoid writing that Google is selling this: but same difference). Google’s wording: not mine.
>> But this is NOT a company being sold! And not even a division or a product sold. It’s not a merger or acquisition. It is, what you described as this: “two standing independent companies exchanging personal data.”
This is just semantics on your part. Whether it's a company being sold, a division, or only a product, then reasonable companies will consider the data handling in a similar way with multiple gap analyses to mitigate concerns. Context is going to dictate what happens. Lesser companies will not consider this heavily and you can end up with the data being more of the value for the transaction. Whether we like it, or not, we are left to trust that these companies will make efforts to enact good policies and follow them.
What I'd expect to happen here is that Google and Squarespace will familiarize themselves with the posture on both sides and Google will want the standards to at least be maintained to a level which removes liability to them. They are very aware of the scrutiny and a big player here so they can force the acquirer to step up to a certain degree.
I don't know enough about Squarespace's security, or complete business model, but they'll be trying to work out what gaps they may need to fill. Google may have clauses that require that this data can't be co-mingled, or must be handled in specific ways for certain countries. The actual handling and what is communicated can take some time as the teams work out how they deal with any gaps. It's also possible that Google find some gaps on their side and have to resolve them before any transfer could actually happen.
Given the implications of service continuity if domains aren't transferred or operational, I can't imagine that they would ask customers to take some action. It creates a support nightmare with confused customers talking to support and then being unhappy they still took the wrong action when you ask them to approve the transfer.
Thank you. After having gone through more of this - and receiving more long-form input - I am with you that this is, almost certainly, not "selling of customer data."
Thanks for your comments, I appreciate it. Now is the part where I would want to go back and edit all my comments here (and on Twitter), but cannot, thanks to the append-only nature. I can delete, but then e.g. this post points to nowhere and the context is lost. I deleted the original tweet given it clearly has incorrect/missing information, and replaced it with: https://twitter.com/GergelyOrosz/status/1671959124337217536?...
> As to whether Google has done this before or not is completely irrelevant.
Isn't that the whole issue in discussion? Google never did this before, and as a result they could always claim "we will never, ever, sell your information". This just changed as now they decided to sell that data to Squarespace. So they can't claim that anymore, right?
I understand that Google had no other option. As far as I know, ICANN requires a registrar going out of business to transfer its domains to another registrar. So either they support the service forever, or they sell their user data to another registrar and retract the "we don't sell your data" statement, as they can no longer uphold it.
They got themselves into that business. No one forced them to sign an agreement with ICANN that could one day force them to transfer that information to someone else, but they did.
[1] https://money.cnn.com/gallery/news/economy/2013/07/10/worlds...