This is correct, but for substitution-permutation networks there are similar theoretical results about the minimum number of rounds under various assumptions, just as there are for Feistel networks.
The arguments of OP are also applicable to this kind of cipher.
Besides the possibility that quantum computers might be able to accelerate attacks against iterated block ciphers whose number of rounds falls below some threshold, there is also a risk that is specific to AES, not to other ciphers.
Recovering the secret key of any cipher when you have a small amount of known plaintext is equivalent to solving a huge system of equations, much too big to be solved by any known method.
In order to ensure that this system of equations is very big, most ciphers that are built by composing simple operations take care to mix operations from distinct algebraic groups, typically from 3 or more. The reason is that an operation that appears simple in one group appears very complex when expressed in the other groups. So if you mix simple operations from 3 groups, the corresponding system of equations is very complex no matter which of those groups you write it in. This technique of mixing simple operations from at least 3 algebraic groups was introduced by the block cipher IDEA, as a more software-friendly alternative to non-linear functions implemented with look-up tables, as in DES.
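As an illustration of this mixing, here is a toy sketch (not the real IDEA round function; the function names and the `mix` combination are illustrative) of the three group operations that IDEA combines on 16-bit words: XOR, addition modulo 2^16, and multiplication modulo 2^16+1 with 0 standing for 2^16:

```python
MASK16 = 0xFFFF

def xor16(a, b):
    # group 1: bitwise XOR, i.e. addition in (Z/2)^16
    return a ^ b

def add16(a, b):
    # group 2: addition modulo 2^16
    return (a + b) & MASK16

def mul16(a, b):
    # group 3: multiplication modulo 2^16 + 1 (a prime),
    # with 0 encoding the value 2^16, as in IDEA
    a = a or 0x10000
    b = b or 0x10000
    return (a * b % 0x10001) & MASK16

def mix(x, k1, k2, k3):
    # toy combination of all three groups: simple in each group
    # separately, complex when written as equations in any one of them
    return mul16(add16(xor16(x, k1), k2), k3)
```

Each operation is a single instruction or a few instructions in software, yet expressing `mix` as equations over any single one of the three groups produces a very complicated system.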
An example of such algebraic groups are the 3 groups used in the so-called ARX ciphers (add-rotate-xor, like ChaCha20), where the 3 groups correspond to arithmetic modulo 2^N, modulo 2^N-1, and modulo 2.
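For a concrete example, the actual ChaCha20 quarter-round (as specified in RFC 8439) uses only these three kinds of operations:

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    # 32-bit left rotation
    return ((x << n) | (x >> (32 - n))) & MASK32

def quarter_round(a, b, c, d):
    # The ChaCha20 quarter-round: only 32-bit addition,
    # rotation, and XOR (RFC 8439, section 2.1)
    a = (a + b) & MASK32; d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32; d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 7)
    return a, b, c, d
```

Running it on the test vector from RFC 8439 section 2.1.1 reproduces the published output words.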
Unlike such ciphers, AES uses algebraic operations in a single finite field, GF(2^8), but instead of using only simple operations it also uses a rather complex non-linear operation, inversion in GF(2^8), and it relies on it to ensure that the system of equations for key recovery becomes big enough when sufficient rounds are performed.
Because of this rather simple algebraic structure of AES, it has been speculated that someone might discover a method to solve systems of equations of this kind. For now, it seems very unlikely that anyone will succeed in doing this.
Even if solving this system of equations seems infeasible by classical means, perhaps someone might discover a quantum algorithm that accelerates the solution of this particular kind of system of equations.
I have mentioned this risk for completeness, but I believe that this risk is negligible.
AES could be modified in a trivial way, requiring no hardware changes in most CPUs, only software changes, in order to make that system of equations much more complex, so that it would defeat any possible quantum improvement. An example of such a change would be to replace some XOR operations in AES with additions modulo 2^64 or modulo 2^32. The only problem is that there may be devices whose firmware cannot be updated, and old encrypted data recorded in the past would not benefit from future upgrades.
However, as I have said, I believe that the risk of AES being affected by some equation-solving algorithm discovered in the future remains negligible.
As explained in an article linked at the bottom of TFA, the weights of an LLM follow a normal (Gaussian) distribution.
Because of that, when the weights are quantized to a few levels, the best compromise is to place the points encoded by the weights' numeric format according to a Gaussian distribution, instead of placing them uniformly on a logarithmic scale, as the usual floating-point formats attempt to do.
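A minimal sketch of this idea: place the 2^k representable levels at equally spaced quantiles of a standard normal distribution, so that roughly Gaussian-distributed weights fall into equally populated bins. The function names and the exact quantile placement are illustrative; real formats such as NF4 choose the levels somewhat differently.

```python
from statistics import NormalDist

def gaussian_levels(bits):
    # place 2^bits levels at equally spaced Gaussian quantiles
    n = 1 << bits
    nd = NormalDist(0.0, 1.0)
    # quantiles strictly inside (0, 1) to avoid +/- infinity
    qs = [(i + 0.5) / n for i in range(n)]
    levels = [nd.inv_cdf(q) for q in qs]
    # normalize to [-1, 1], as quantization formats typically do
    m = max(abs(levels[0]), abs(levels[-1]))
    return [x / m for x in levels]

def quantize(w, levels):
    # round a weight to the nearest representable level
    return min(levels, key=lambda v: abs(v - w))
```

The resulting levels are dense near 0, where most weights lie, and sparse in the tails, unlike a uniform or purely logarithmic spacing.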
You can set x87 to round each operation result to 32-bit or 64-bit.
With this setting it operates internally at exactly those sizes.
Operating internally on 80 bits is just the default setting, because it is the best choice for naive users, who are otherwise prone to computing erroneous results.
This is the same reason why the C language has made "double" the default precision in constants and intermediate values.
Unless you do graphics or ML/AI, single-precision computations are really only for experts who can analyze the algorithm and guarantee that it is correct.
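A small demonstration of the danger, emulating FP32 arithmetic via `struct` round-tripping (Python floats are FP64): summing 0.1 a hundred thousand times, once in double precision and once with every intermediate result rounded to single precision.

```python
import struct

def to_f32(x):
    # round a Python float (FP64) to the nearest FP32 value
    return struct.unpack('f', struct.pack('f', x))[0]

def sum_f64(n, step):
    # accumulate in double precision
    s = 0.0
    for _ in range(n):
        s += step
    return s

def sum_f32(n, step):
    # accumulate with every intermediate rounded to single precision
    s = to_f32(0.0)
    step = to_f32(step)
    for _ in range(n):
        s = to_f32(s + step)
    return s

n = 100_000
exact = 10_000.0
err64 = abs(sum_f64(n, 0.1) - exact)
err32 = abs(sum_f32(n, 0.1) - exact)
# err32 is larger than err64 by many orders of magnitude
```

The naive FP32 accumulation drifts visibly, while the FP64 accumulation stays accurate far beyond FP32 precision, which is exactly why the default of computing intermediates in a wider format protects naive users.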
The Intel 8087 design team, with Kahan as their consultant (he was the author of most of its novel features, based on his experience with the design of the HP scientific calculators), realized that instead of keeping their much-improved floating-point format proprietary, it would be much better to agree with the entire industry on a common floating-point standard.
So Intel initiated the discussions about the future IEEE standard with many relevant companies, even before the launch of the 8087. AMD was convinced by Intel immediately, so AMD was able to introduce an FP accelerator (Am9512) based on the 8087 FP formats (which were later adopted in IEEE 754), also in 1980 and a few months before the launch of the Intel 8087. So in 1980 there were already 2 implementations of the future IEEE 754 standard. The Am9512 was licensed to Intel, which sold it under the 8232 part number (it was used in 8080/8085/Z80 systems).
Unlike AMD, the traditional computer companies agreed that an FP standard was needed to solve the mess of many incompatible FP formats, but they thought that the Kahan-Intel proposal would be too expensive for them, so they came up with a couple of counter-proposals, based on their tradition of giving priority to implementation costs over usefulness for computer users.
Fortunately the Intel negotiators eventually succeeded in convincing the others to adopt the Intel proposal, by explaining how the new features could be implemented at an acceptable cost.
The story of IEEE 754 is one of the rare stories in standardization where it was chosen to do what is best for customers, not what is best for vendors.
Like the use of encryption in communications, the IEEE standard has been under continuous attack throughout its history, coming from each new generation of logic designers who think they are smarter than their predecessors and are reluctant to implement some features of the standard properly. Older designs have demonstrated that those features can in fact be implemented efficiently, but the newcomers prefer the easy path of implementing them inefficiently, on the assumption that users will not care.
Intel 8087, which introduced the 80-bit extended floating-point format in 1980, could store and load 80-bit numbers, avoiding any alterations caused by conversions to less precise formats.
To be able to use the corresponding 8087 instructions, "long double" was added to the C language; so, to avoid extra roundings, one had to use "long double" variables and also be careful that intermediate values used in computing an expression were not spilled to memory as "double".
However, this broke in some newer C compilers, where, due to the deprecation of the x87 ISA, "long double" was made synonymous with "double". Some better C compilers have chosen to implement "long double" as quadruple precision instead of extended precision, which ensures that no precision is lost, but which may be slow on most computers, where no hardware support for FP128 exists.
By definition, a document that is written is historic, not prehistoric.
Prehistoric information could be preserved by an oral tradition, until it is recorded in some documents (like the Oral Histories at the Computer History Museum site).
The C keywords "float" and "double" are based on the tradition established a decade earlier by IBM System/360 of calling FP32 "single precision" and FP64 "double precision".
This IBM convention has been inherited by the IBM programming languages FORTRAN IV and PL/I and from these 2 languages it has spread everywhere.
The C language has taken several keywords and operators from IBM PL/I, which was one of the three main inspiration sources for C (which were CPL/BCPL, PL/I and ALGOL 68).
So "float" and "double" are really inherited by C from PL/I.
A feature that is specific to C is that it has changed the default format for constants and for intermediate values to double-precision, instead of the single-precision that was the default in earlier programming languages.
This was done with the intention of protecting naive users from mistakes, because when computing in FP32 it is very easy to obtain erroneous results unless you analyze the propagation of errors very carefully. Except in applications where errors matter very little, e.g. graphics and ML/AI, the use of FP32 is suitable mainly for experts, while the bigger formats are recommended for normal users.
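For instance, even the constant 0.1 cannot be represented exactly in either format, but its representation error in FP32 is enormously larger than in FP64 (FP32 emulated here via `struct`, since Python floats are FP64):

```python
import struct
from decimal import Decimal

# 0.1 as stored in double precision (Python's native float)
f64 = 0.1
# 0.1 rounded to single precision and back
f32 = struct.unpack('f', struct.pack('f', 0.1))[0]

# exact decimal error of each binary representation
err64 = abs(Decimal(f64) - Decimal('0.1'))
err32 = abs(Decimal(f32) - Decimal('0.1'))
# err32 is roughly 10^8 times larger than err64
```

Every FP32 constant and intermediate carries this much larger error into subsequent operations, which is why C's choice of "double" as the default was a safety measure.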
The argument in favor of Cato's cheesecake being of Greek origin is that it had a Greek name.
Cato's cheesecake is named in Latin "placenta", which comes from a Greek word whose approximate meaning is "flat cake".
It was called "flat" because it was made from stacked flat sheets of baked dough, between which a filling was put. In the recipe of Cato, the main ingredients mixed in the filling were cheese and honey.
The name "placenta", with various phonetic alterations, continues to be used until today in some European languages, for this kind of cake.
Nevertheless, a Greek name does not necessarily mean that this kind of cake came from Ancient Greece. Before the Romans conquered all of Italy, there were many Greeks in Southern Italy and especially in Sicily. After the Romans also conquered the Greek peninsula, there were a lot of Greeks in Rome, including many Greek slave cooks.
So the name of the cake could have its origin in some Greek cook from Italy or Rome.
Your comment, like most in this thread, confuses ordinary bromine with semiconductor-grade pure bromine.
The semiconductor industry does not use ordinary chemical substances, but only special semiconductor-grade pure substances, which are many orders of magnitude purer than the so-called "pure" substances used elsewhere in the chemical industry.
It is absolutely irrelevant that substances like ordinary bromine and ordinary silicon are very abundant and very cheap. The semiconductor industry cannot use them, and the corresponding semiconductor-grade pure substances are far more expensive; their availability is limited by the production capacities of the very few producers that exist for them around the world.
If the few existing production plants for any semiconductor-grade pure substance were destroyed, semiconductor device manufacturing would stop for a few years, until new purification plants were built.
TFA argues that, in order to avoid such risks, there should be more purification plants in geographically diverse locations; for instance, one such purification plant should be built in the USA, where local producers of ordinary bromine would provide the raw material.
TFA is not about ordinary bromine, but about semiconductor-grade pure bromine, which is very expensive and difficult to produce, so there are very few producers and apparently none in the USA.
Nobody will ever run out of bromine or silicon. But if the very few purification plants for silicon or bromine were destroyed today, semiconductor manufacturing would be suspended for a few years, until other purification plants were built.
Your USGS article does not say a single word about semiconductor-grade pure bromine, so it is irrelevant for this discussion.