The earliest Cray models (starting with the Cray-1 in 1976) had only 64-bit floating-point numbers. 128-bit numbers were a later addition, and I do not think they were implemented in hardware, only in software. Very few computers, except some from IBM, have implemented FP128 in hardware, while software libraries for quadruple-precision or double-double FP128 arithmetic are widespread.
The Cray 64-bit format was a slight increase in size over the 60-bit floating-point numbers that had been used in the previous computers designed by Seymour Cray, at CDC.
Before IBM increased the size of a byte to 8 bits, which led all numeric formats to use sizes that are multiples of 8 bits, computers with 6-bit bytes typically used floating-point sizes of 60 bits in high-end models, 48 bits in cheaper models, or 36 bits in the cheapest models.
> In ancient times, floating point numbers were stored in 32 bits.
This was true only for cheap computers, typically after the mid sixties.
Most of the earliest computers with vacuum tubes used longer floating-point number formats, e.g. 48-bit, 60-bit or even weird sizes like 57-bit.
The 32-bit size has never been acceptable in scientific computing with complex computations where rounding errors accumulate. The early computers with floating-point hardware were oriented to scientific/technical computing, so bigger number sizes were preferred. The computers oriented to business applications usually preferred fixed-point numbers.
The IBM System/360 family definitively established the 32-bit single-precision and 64-bit double-precision sizes. 32-bit is adequate for input and output data, and it can be sufficient for intermediate values when the input data passes through only a few computations; otherwise double precision must be used.
A few years after 1980, especially after 1985, the computers with coprocessors like Intel 8087 or Motorola 68881 became the most numerous computers with floating-point hardware, and for them the default FP size was 80-bit.
So the 1990s were long after the time when 32-bit FP numbers were normal. FP32 was revived only by GPUs, for graphic applications where precision matters much less.
As early as 1974, the C programming language made double precision the default FP size, not 32-bit single precision, for the same reason the Intel 8087 introduced extended precision. Single-precision computations for traditional applications are suitable only for experts, not for ordinary computer users.
While the programming languages before C used 32-bit single precision as the default size, the recommendation was already to use only double precision wherever complicated expressions were computed.
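The reason double precision was recommended for complicated expressions can be sketched in a few lines, emulating binary32 by rounding every intermediate result through a 32-bit encoding (an illustration of error accumulation, not how any particular machine of that era computed):

```python
import struct

def f32(x):
    """Round a Python float (binary64) to IEEE 754 binary32 and back."""
    return struct.unpack('f', struct.pack('f', x))[0]

N = 1_000_000
acc32, acc64 = 0.0, 0.0
step32 = f32(0.1)  # 0.1 is not exactly representable in either format
for _ in range(N):
    acc32 = f32(acc32 + step32)  # every intermediate rounded to a 24-bit significand
    acc64 += 0.1                 # intermediates kept in a 53-bit significand

err32 = abs(acc32 - 100_000)  # error grows to hundreds of units
err64 = abs(acc64 - 100_000)  # error stays far below 0.001
print(err32, err64)
```

The single-precision sum drifts by hundreds of units after a million additions, while the double-precision sum stays correct to many digits, which is exactly why accumulating in 32 bits was considered unsafe for non-experts.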
I started using computers by punching cards for a mainframe, but that was already at a time when 32-bit FP numbers were not normally used, only 64-bit FP numbers.
The best chance of seeing 32-bit single-precision numbers in use was in the decade from 1965 to 1975, among users of cheap mainframes or minicomputers without hardware floating-point units, where floating-point arithmetic was emulated in software and emulating double precision was significantly slower.
Before the mid-sixties, one was more likely to see 36-bit floating-point numbers as the smallest FP size.
Yeah. I know. I'm not disagreeing with your diagnosis, I'm just trying to gently rib you that your correction is misaimed. It's a joke, ya know?
>Single-precision computations for traditional applications are suitable only for experts, not for ordinary computer users.
Lots of ordinary computer users did compute in single precision! The reason I picked the 1990s as 'ancient' and not 1980 (when the 8087 was taped out) or 1985 (when IEEE754 was finally approved) was because those microprocessors were now in the hands of users who weren't under the supervision of 'experts'. That, along with the lack of fast 64 bit registers + the desire for high throughput at low fidelity led to a lot of 32 bit code!
And, frankly, if you want to get real technical, the ability of non-experts to program in FP in 64 bit is enforced NOT ONLY by the doubled bits but by the implicit ability (absent now in many implementations) to use the 80 bit extended precision format for intermediate calcs. It's the added bits in that format for scratch that let lots of 64 bit programs just work.
The CPU of Strix Halo has good BF16 acceleration, like any other Zen 4/Zen 5 CPU (the future Zen 6 will add FP16 acceleration).
I do not know about its GPU, which might have only FP16.
So it is likely that the right inference strategy would be to run any BF16 computations on the Strix Halo CPU, while running the quantized computations on its GPU.
The GPU has INT4, INT8, BF16 and FP16, but notably no FP8 or FP4. The official GPTQ-Int4 release from Qwen is a great quant for this, but custom kernels are still rare for this hardware.
One should no longer use the word "channel" because the width of a channel differs between various kinds of memories, even among those that can be used with the same CPU (e.g. between DDR and LPDDR or between DDR4 and DDR5).
For instance, now the majority of desktops with DDR5 have 4 channels, not 2 channels, but the channels are narrower, so the width of the memory interface is the same as before.
To avoid ambiguities, one should always write the width of the memory interface.
Most desktop computers and laptop computers have 128-bit memory interfaces.
The cheapest desktop computers and laptop computers, e.g. those with Intel Alder Lake N/Twin Lake CPUs, and also many smartphones and Arm-based SBCs, have 64-bit memory interfaces.
Cheaper smartphones and Arm-based SBCs have 32-bit memory interfaces.
Strix Halo and many older workstations and many cheaper servers have 256-bit memory interfaces.
High-end servers and workstations have 768-bit or 512-bit memory interfaces.
It is expected that future high-end servers will have 1024-bit memory interfaces per socket.
GPUs with private memory usually have memory interfaces between 192-bit and 1024-bit, but newer consumer GPUs usually have narrower memory interfaces than older ones, to reduce cost. The narrower interface is compensated by faster memory, so the available bandwidth in consumer GPUs has increased much more slowly than the increase in GDDR memory speed would have allowed.
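The relation between interface width and bandwidth is simple arithmetic: bytes per transfer times transfers per second. A minimal sketch (the memory speeds below are illustrative assumptions, not vendor specs):

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_s):
    """Peak DRAM bandwidth: bus width in bytes times transfer rate."""
    return bus_width_bits / 8 * transfers_per_s / 1e9

# Illustrative configurations:
bw_desktop = peak_bandwidth_gb_s(128, 6.4e9)  # 128-bit desktop, DDR5-6400
bw_strix   = peak_bandwidth_gb_s(256, 8.0e9)  # 256-bit, LPDDR5X-8000 class
bw_gpu     = peak_bandwidth_gb_s(384, 21e9)   # 384-bit GPU, GDDR6X class
print(bw_desktop, bw_strix, bw_gpu)
```

This is why stating the interface width (plus the transfer rate) is unambiguous, while "dual channel" is not: a 256-bit interface at the same transfer rate delivers exactly twice the 128-bit figure, regardless of how the channels are subdivided.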
Sadly motherboards, tech journalists, and many other sources confuse DIMMs with channels. The trick is that in the DDR4 generation they were the same, 64 bits wide. However, a standard DDR5 DIMM is not 1x64-bit; it's actually 2x32-bit. Thus 2 DDR5 DIMMs = 4 channels.
For some workloads the extra channels help, despite having the same bandwidth. This is one of the reasons that it's possible for a DDR5 system to be slightly faster than a DDR4 system, even if the memory runs at the same speed.
>However a standard DDR5 dimm is not 1x64 bit, it's actually 2x32 bit. Thus 2 DDR5 dimms = 4 channels.
Uh, surely that depends on how the motherboard is wired. Just because each DIMM has half the pins on one channel and the other half on another, doesn't mean 2 DIMM = 4 channels. It could just be that the top pins over all the DIMMs are on one channel and the bottom ones are on another.
I think there's a standard wiring for the DIMM, and some parts are shared. Each normal DDR5 DIMM has 2 subchannels that are 32 bits each, and there's a new specification, the HUDIMM, which will enable only 1 subchannel and have only half the bandwidth.
I don't think you can wire up DDR5 DIMMs willy-nilly as if they were 2 separate 32-bit DIMMs.
Well, I don't know what to tell you. I'm not a computer engineer, but I assume Gigabyte has at least a few of those, and they're labeling the X870E boards with 4 DIMMs as "dual channel". I feel like if they were actually quad channel they'd jump at the chance to put a bigger number, so I'm compelled to trust the specs.
In computer manufacturer speak, dual channel = 2 x 64 bits = 128 bits wide.
So with 2 dimms or 4 you still get 128 bit wide memory. With DDR4 that means 2 channels x 64 bit each. With DDR5 that means 4 channels x 32 bit each.
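The accounting above can be sketched in a few lines (assuming each DIMM gets its own channel wiring; populating extra DIMMs on the same channel does not widen the bus, and ECC bits are ignored):

```python
# Per-DIMM channel layout: DDR5 UDIMMs expose two 32-bit subchannels,
# DDR4 DIMMs a single 64-bit channel.
ddr4 = {"channels_per_dimm": 1, "bits_per_channel": 64}
ddr5 = {"channels_per_dimm": 2, "bits_per_channel": 32}

def bus(cfg, dimms):
    """Channel count and total data width for `dimms` independently wired DIMMs."""
    channels = cfg["channels_per_dimm"] * dimms
    return channels, channels * cfg["bits_per_channel"]

print(bus(ddr4, 2))  # DDR4, 2 DIMMs: classic "dual channel", 128 bits
print(bus(ddr5, 2))  # DDR5, 2 DIMMs: same 128-bit width, twice the channels
```

Same 128-bit total either way; what changes between generations is the number of independent channels that width is split into.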
Keep in mind that the memory controller is in the CPU; that's where the DDR4/DDR5 controller lives. The motherboard's job is to connect the right pins on the DIMMs to the right pins on the CPU socket. The days of an off-chip memory controller/north bridge are long gone.
So if you look at an AM5 CPU it clearly states:
* Memory Type: DDR5-only (no DDR4 compatibility).
* Channels: 2 Channel (Dual-Channel).
* Memory Width: 2x32-bit sub-channels (128-bit total for 2 sticks).
Why are you quoting something that contradicts you? It clearly states it's a dual channel memory architecture with 32-bit subchannels. The fact the two words are used means they mean different things.
>In computer manufacture speak dual channel = 2 x 64 bit = 128 bits wide.
Yes, because AMD64 has 64-bit words. You can't satisfy a 64-bit load or store with just 32 bits (unless you take twice as long, of course). That you get 4 32-bit subchannels doesn't mean you can execute 4 simultaneous independent 32-bit memory operations. A 64-bit channel capable of a full operation still needs to be assembled out of multiple 32-bit subchannels. If you install a single stick you don't get any parallelism with your memory operations; i.e. the system runs in single channel mode, the single stick fulfilling only a single request at a time.
AM5 is the AMD standard, so it's accurate; it seems rather pedantic to differentiate between saying 2 subchannels per DIMM and saying 4 32-bit channels for a total of 128 bits.
However, the motherboard vendors annoyingly hide that from you by claiming DDR4 is dual channel (2 x 64-bit, which means two outstanding cache misses, one per channel) and just glossing over the difference by also calling DDR5 dual channel (4 x 32-bit, which means 4 outstanding cache misses).
> Yes, because AMD64 has 64-bit words.
It's a bit more complicated than that. First you have 3 levels of cache, the last of which triggers a cache-line load, which is 64 bytes (not 64 bits). That goes to one of the 4 channels, and there's a long latency before the first 64 bits arrive. Then there are the complications of opening the row, which makes its columns available and can speed things up if you need more data from the same row. But the general idea is that you get at most one cache line per channel after waiting for the memory latency.
So DDR4 on a 128-bit system can have 2 cache lines in flight, i.e. 128 bytes per memory latency. On a DDR5 system you can have 4 cache lines in flight per memory latency. Sure, you need the bandwidth, and 32-bit channels have half the bandwidth per clock, but the trick is that the memory bus spends most of its time waiting on memory to start a transfer. So waiting 50 ns and then getting 32 bits @ 8000 MT/s isn't that different from waiting 50 ns and getting 64 bits @ 8000 MT/s.
Each 32-bit subchannel can handle a unique address, which is turned into a row/column and a separate transfer when done. So a normal DDR5 system can look up 4 addresses in parallel, wait for the memory latency, and return a cache line of 64 bytes from each.
Even better, something like Strix Halo actually has a 256-bit wide memory system (twice that of any normal tablet, laptop, or desktop), but it also has 16 channels x 16 bits, so it can handle 16 cache misses in flight. I suspect this is mostly to keep its aggressive iGPU fed.
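The latency-bound argument above can be put into a toy model: if every access is an independent cache miss and the transfer time is negligible next to the latency, throughput scales with the number of independent channels, not the bus width. The 50 ns figure is the illustrative number used in the comments, not a measured spec:

```python
def random_access_rate(latency_ns, independent_channels):
    """Cache lines fetched per microsecond, assuming every access pays the
    full latency and transfer time is negligible (a deliberate simplification)."""
    return independent_channels * 1000 / latency_ns

# Same 128-or-wider bus, same 50 ns latency:
r_ddr4  = random_access_rate(50, 2)   # DDR4: 2 x 64-bit channels
r_ddr5  = random_access_rate(50, 4)   # DDR5: 4 x 32-bit subchannels
r_strix = random_access_rate(50, 16)  # Strix Halo: 16 x 16-bit channels
print(r_ddr4, r_ddr5, r_strix)
```

Under this model DDR5 doubles the random-access rate of DDR4 at equal bus width, and a 16-channel design like Strix Halo's is 8x DDR4, which is the point about keeping many cache misses in flight.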
The hertz is not dimensionally identified as 1/T, despite the fact that this was voted in 1995 by the clueless delegates to the CGPM.
The hertz is a name for cycle per second and the physical quantity that is measured in hertz is plane angle per time a.k.a. phase angle per time, not reciprocal time.
I have explained in another comment why this is so.
This misconception about angles being dimensionless quantities is unfortunately taught in many school textbooks and it cripples the thinking about physics of many people.
The foundation of physics is the theory of the measurement of physical quantities. However, this is frequently taught either badly or not at all, instead of following classical examples like that of James Clerk Maxwell, who began his Treatise on Electricity and Magnetism with an exposition of the theory of the measurement of physical quantities that was complete and up-to-date for its time.
The plane angle a.k.a. phase angle is not only not a dimensionless quantity; its unit of measurement plays a crucial role in determining the other base units of any modern system of units of measurement.
The reason why the unit of plane angle is so important is that plane angle is the only fundamental continuous quantity whose measurement can be approximated by counting: the numeric value of a plane angle measured in cycles can be decomposed into a sum of a discrete quantity (the integer part of the numeric value) and a continuous quantity (the fractional part). The discrete integer part can be obtained by counting.
This unique property is the reason why, for the highest possible precision in measurements, all continuous quantities are converted by various methods into phase angles before the analog-to-digital conversion.
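The decomposition into a countable whole-cycle part and a continuous fractional part can be sketched with Python's `math.modf` (the angle value is an arbitrary illustration):

```python
import math

# An angle measured in cycles splits into a countable whole-cycle part
# and a continuous fractional part (the phase):
angle_in_cycles = 12345.678
frac, whole = math.modf(angle_in_cycles)
print(int(whole))  # 12345 whole cycles: obtainable by pure counting
print(frac)        # ~0.678 of a cycle: the continuous remainder
```

A cycle counter only needs to resolve the integer part; all the analog difficulty is confined to the fractional part, which is the property the comment describes.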
The fact that plane angle has a natural unit, the cycle (or its integer multiples or submultiples), is exploited in defining almost all units of other continuous quantities.
For instance, the unit of length is said to be the wavelength of a certain wave. Because wavelength is a physical quantity equal to the ratio between length and plane angle, that definition, stated correctly, says that the unit of length is equal to the unit of plane angle multiplied by a certain wavelength (i.e. 1 meter = 1 cycle multiplied by a wavelength measured in meters per cycle). Similarly for many other continuous quantities, whose units are also derived in one form or another from the unit of plane angle, because only it can be defined intrinsically, instead of being derived from other units.
The modern SI contains several serious mistakes, and the fact that they are the result of votes demonstrates that democracy is inapplicable in sciences like mathematics and physics, because the majority of the people are incompetent enough to vote things equivalent with saying that "2 + 2 = 5", but regardless if such things are voted by a majority of humans, they remain false.
The hertz is not "1/s" and everyone who has been brainwashed by school education to believe this misses some of the most important concepts on which physics is based.
Originally, "hertz" was defined as a name for "cycle/s", not for "1/s", and that is the correct definition, which has always been the one used in practice, regardless of what is written in the SI brochure.
"cycle" is a unit for plane angle a.k.a. phase angle, so "hertz" is the SI unit for the physical quantity "angle/time", not for the quantity "reciprocal time". SI is an inconsistent system of units, because it uses 2 units for plane angles: cycles and radians. In practice, the right choice is to always measure angles in cycles, not in radians. Using cycles ensures both higher accuracy and faster computations (because the argument range reductions in trigonometric functions become exact and fast), and all high-precision sensors and actuators must use cycles, not radians, so using radians in computations requires additional conversions.
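The range-reduction point can be illustrated numerically: reducing an angle held in cycles is `x % 1.0`, which is exact in binary floating point, while converting to radians first multiplies the rounding error of pi by the size of the argument. A small sketch (the specific angle is an arbitrary illustration):

```python
import math

turns = 1e10 + 0.125           # ten billion whole cycles plus 1/8 cycle
exact = math.sin(math.pi / 4)  # sine of 1/8 cycle

# Reduce in cycles first: x % 1.0 is exact, since the modulus 1.0
# and the fractional part are exactly representable.
reduced = turns % 1.0                        # exactly 0.125
via_cycles = math.sin(2 * math.pi * reduced)

# Convert to radians first: the huge product 2*pi*turns magnifies
# the representation error of pi before any reduction can happen.
via_radians = math.sin(2 * math.pi * turns)

err_cycles = abs(via_cycles - exact)
err_radians = abs(via_radians - exact)
print(err_cycles, err_radians)
```

The cycle-reduced path stays accurate to roughly machine precision; the radian path typically loses many digits once the whole-cycle count is large, which is the accuracy argument for working in cycles.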
The current wrong definition of the hertz is a consequence of an outrageous and shameful resolution voted in 1995, "Resolution 8 of the 20th CGPM", which declared that the units of plane angle and solid angle are not base units a.k.a. fundamental units, so these quantities should not be used in dimensional formulae.
This resolution is just an example of human stupidity. You cannot establish by vote whether a unit of measurement is fundamental or derived. Any unit of measurement that cannot be derived from other units is a base unit a.k.a. fundamental unit.
There are 3 fundamental units that are missing from SI (the units of logarithms, plane angles and solid angles), despite the fact that their use is extremely frequent in practice, and one of them, the unit of plane angle, is actually the most important fundamental unit, because for the highest precision in measurements all other continuous physical quantities are eventually converted into phase angles before the analog-to-digital conversion (because when phase angles are measured in cycles, the measurement can be reduced to counting, i.e. to cycle counting).
In theory, one could choose as fundamental only one of the 3 units of logarithms, plane angles and solid angles, and derive the others from it. The unit of solid angle can be derived from the unit of plane angle by setting to "1" the proportionality factor in Albert Girard's theorem (by this method, the steradian is derived from the radian, the hemisphere from the cycle, and the solid angle degree from the sexagesimal degree, where a full sphere has 720 solid angle degrees; during the first 2 centuries after the beginning of the 17th century, when it was defined how to measure solid angles, the solid angle degree was the main unit of measurement). The unit of logarithms can be derived from the unit of plane angle by setting to "1" the ratio between the proportionality factors that appear in the formulae for the derivatives of the exponential and trigonometric functions.
Choosing only 1 of these 3 units as base unit and deriving the other 2 from it results in the 3 units neper, radian and steradian. However, neper and radian are very inconvenient units, so in practice it is far better to choose independently the units of logarithms, plane angles and solid angles.
Regardless whether one chooses independent units for logarithms and solid angles, or they are derived from the unit of plane angle, it is impossible to derive the unit of plane angle from the currently official units of SI.
Some publications claim that the plane angle unit can be derived from the unit of length, supposedly because the measure of an angle is the ratio between the length of a corresponding arc and the radius of the circle.
This pseudo-argument demonstrates a complete lack of understanding of how physical quantities are measured. There are 2 variants of this pseudo-argument, which contradict each other and are equally false. First, saying that this shows that the unit of angle is derived from the unit of length is obviously false: when you divide two lengths, the unit of length is divided by itself, so the result of the division is independent of the unit of length; therefore this formula cannot derive anything from the unit of length.
The second formulation of the pseudo-argument is that this formula shows that the plane angle is an "adimensional quantity" a.k.a. "dimensionless quantity". This claim is equally ridiculous. As understood very well more than a century and a half ago (e.g. the theory of measurements was explained more clearly by James Clerk Maxwell than by most modern handbooks of physics), the measurement of any physical quantity is expressed by the product of 2 factors, a unit of measurement and a dimensionless numeric value.
The ratio between the length of an arc and the radius is not the complete value of a plane angle. It is just the dimensionless numeric value of an angle, in the special case when the angle is measured in radians. To obtain the complete angle value you must multiply that ratio by 1 radian. The "radius" in that formula is not truly the radius, but it is the length of the arc corresponding to the unit angle. When angles are measured in cycles, the length of a unit angle is the perimeter, so the dimensionless numeric value of an angle is equal to the ratio between the length of the arc and the perimeter.
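The point that the ratio is only the numeric value, while the unit must still be attached, can be shown with a small numeric sketch (a quarter-circle on a circle of arbitrary radius):

```python
import math

radius = 2.0
arc = math.pi * radius / 2     # arc length of a quarter circle
perimeter = 2 * math.pi * radius

# The same angle yields different dimensionless numeric values,
# depending on which unit the length ratio implicitly encodes:
value_in_radians = arc / radius     # numeric value when the unit is the radian
value_in_cycles = arc / perimeter   # numeric value when the unit is the cycle

print(value_in_radians)  # pi/2, about 1.5708
print(value_in_cycles)   # 0.25
# Same angle: 1.5708 radian = 0.25 cycle. The bare ratio does not fix
# the angle until the unit it was divided by is attached.
```

The two ratios differ by exactly the conversion factor 2*pi between radian and cycle, just as two numeric values of a length differ by the meter-to-inch factor.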
Applying to length the pseudo-argument that plane angle is dimensionless "proves" that length is dimensionless, because the length of any object is obtained by dividing its length by the length of a 1 m ruler. Similarly for any other physical quantity.
The fact that the unit of plane angle is fundamental is demonstrated beyond any reasonable doubt by the fact that you can choose any angle unit you want, and many such units are really used in practice, e.g. cycle, radian, sexagesimal degree, centesimal degree, right angle, etc., and for each choice of a unit of plane angle you obtain a different system of units for the physical quantities. Between the numeric values measured in these different systems of units there are conversion formulae that are constructed exactly in the same way as when changing, for example, the unit of length from meter to inch.
In order to be able to generate automatically the conversion formulae when the unit of plane angle is changed, you need to have the plane angle in the dimensional formulae of the physical quantities. The fact that most physics books usually omit the plane angle from dimensional formulae is a source of mistakes, especially when comparing some modern numeric values with values from old publications, where non-SI systems of units were used, which sometimes were based on implicit assumptions that were different from the implicit assumptions used by SI.
In SI, implicit assumptions are made everywhere. In most cases it is assumed that angles are measured in radians, and when other angle units are used the displayed formulae become wrong, despite the fact that it is claimed that they are written in a form that does not depend on which units are used. Nevertheless, wherever "hertz" is involved, it is implicitly assumed that plane angles are measured in cycles, not in radians.
The term "frequency" has 2 meanings in physics. Originally, "frequency" was the name for the ratio between the number of some random events that occur during some time interval and the duration of that time interval.
Earlier authors, before the last years of the 19th century, used "frequency" only with this meaning. It was never used for periodic phenomena, which were described using period and wavelength.
Some time around 1890, "frequency" began to be used with a second meaning, as the ratio between the number of cycles of a periodic phenomenon and the duration of a time interval. After that time it became more common to describe periodic phenomena using frequency and wave number (i.e. number of waves), instead of period and wavelength.
The two kinds of frequencies are distinct physical quantities, with distinct units of measurement. Frequency with the second meaning is the ratio between plane angle and time. Possible units are radian per second and cycle per second, where the latter is named hertz. Frequency with the first meaning is the ratio between a number of events (which is a discrete quantity, not a continuous quantity, like for the other kind of frequency) and time. SI has the name becquerel for this unit of measurement.
The author of the parent article is perfectly right that the becquerel is the appropriate unit of measurement for the frequency a.k.a. rate of any random events.
When you see "frequency" in a physics text, you must always pay attention to recognize which of its 2 meanings is used. Similarly, "phase" has 2 meanings, which must be distinguished. Originally, "phase" was the fractional part of a rotation angle measured in cycles. This meaning was more useful, but later "phase" began to be used for the integral of frequency with the second meaning, i.e. for the total rotation angle, not only for its fractional part.
English can be read in a different order than the normal order when the sentences contain words for which it is easy to guess whether they are agents or patients, e.g. when the agents are animate nouns and the patients are inanimate nouns, or when pronouns are used for the agents or patients.
Otherwise, the non-standard order can be understood incorrectly. While the distinction between agents and patients is the most important one that depends on word order in English, there are also other order-dependent distinctions, e.g. between beneficiary and patient when the beneficiary is not marked by a preposition, or between a noun and its attribute: "police dog" is not the same as "dog police", and unless there is a detailed context you cannot know what is meant when the word order is wrong.
English is one of the languages with the most rigid word order. There are languages, especially among older languages, where almost any word order can be used without causing ambiguities, because all the possible roles of the words are marked by prepositions, postpositions or affixes (or sometimes by accentuation shifts).
Originally, maths/mathematics meant "things that are taught", like physics meant "natural things" and similarly for other such names.
However, nowadays a word like physics is understood not as "natural things", but as an implicit abbreviation for "the science of natural things". Similarly for mathematics, mechanics, dynamics and so on.
So such nouns are used as singular nouns, because the implicit noun "science" is singular.
With -π to π radians you get absolute error of approximately 4e-16 radians. With -180 to 180 degrees you get absolute error of approximately 2e-14 degrees.
Even though the first number is smaller than the 2nd one, they actually represent the same angle once you consider that they are different units. So there's no precision advantage (absolute or relative) to converting degrees to radians.
Note that I'm not saying anything about fixed vs floating point, only responding to an earlier comment that radians give more precision in floating point representation.
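The absolute-error figures above can be checked directly with `math.ulp` (Python 3.9+), which gives the spacing of representable doubles near a value:

```python
import math

# Spacing of representable doubles near the top of each range:
ulp_rad = math.ulp(math.pi)  # near pi radians, 2**-51 ~ 4.4e-16
ulp_deg = math.ulp(180.0)    # near 180 degrees, 2**-45 ~ 2.8e-14

print(ulp_rad, ulp_deg)

# Relative spacing is what matters, and it is essentially identical:
print(ulp_rad / math.pi)  # ~1.4e-16
print(ulp_deg / 180.0)    # ~1.6e-16
```

The degree grid looks coarser in absolute terms only because a degree is a smaller unit; per unit of angle the two representations are equally fine, which is the parent comment's point.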
Yep, it was a long time ago but I think that's exactly what we ended up with, eventually: An int type of unit 2π/(int range). I believe we used unsigned because signed int overflow is undefined behavior.
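An integer angle type like the one described (sometimes called binary angular measurement) can be sketched in a few lines; here a hypothetical 16-bit variant is emulated with masking, since Python integers don't wrap on their own:

```python
import math

BITS = 16
MASK = (1 << BITS) - 1  # emulate a uint16: the unit is 1/2**16 of a turn

def bam_add(a, b):
    """Adding angles wraps modulo one full turn for free."""
    return (a + b) & MASK

def bam_to_radians(a):
    return (a & MASK) * 2 * math.pi / (1 << BITS)

quarter = 1 << (BITS - 2)     # 0.25 turn
three_quarters = 3 * quarter  # 0.75 turn
print(bam_add(three_quarters, quarter))           # 0: wrapped past a full turn
print(bam_to_radians(bam_add(quarter, quarter)))  # half a turn, i.e. pi
```

In C this is just unsigned addition with no mask needed, and as the parent notes, unsigned is the right choice because the wraparound is defined behavior while signed overflow is not.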