It is a particular sense of "nondeterminism", but it's not specific to functional programming; I think it's the usual one in theoretical CS as a whole. It's the same sense in which "nondeterminism" is used in P vs NP, for example.
Think of a computation as a process of changing state. At a given point in time, the computer is in a certain state, the current state. The computation can be described in terms of a function that acts on the current state.
In a deterministic computation, the function takes in the current state, and produces a single state as output which will be the state the computer enters on the next tick.
In a non-deterministic computation, the function takes in the current state and produces a set of states as output. These states are all the possible states the computer might enter on the next tick. We don't know (or just don't care) which one of these states it will enter.
You can model a non-deterministic computation as a deterministic one, by using a list `currentStates` to store the set of all possible current states of the computation. At each "tick", you do `currentStates = flatMap(nextStates, currentStates)` to "progress" the computation. In the end `currentStates` will be the set of all possible end states (and you could do some further processing to choose a specific end state, e.g. at random, if you wish, but you could also just work with the set of end states as a whole).
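As a rough sketch of this deterministic simulation in Python (the transition function `next_states` here is a made-up toy, not anything from the original comment):

```python
def next_states(s):
    # A toy nondeterministic step (made up for illustration):
    # from state s the machine may either increment or double.
    return [s + 1, s * 2]

def run(start, ticks):
    current_states = [start]
    for _ in range(ticks):
        # flatMap: apply next_states to every possible current state
        current_states = [t for s in current_states for t in next_states(s)]
        # deduplicate: we care about the *set* of reachable states
        current_states = sorted(set(current_states))
    return current_states
```

Deduplicating after each tick keeps the list a faithful representation of the set of possible states.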
It's in this sense that "a list is a non-deterministic result", although this is really just one thing a list can represent; a list is a generic data structure which can represent all sorts of things, one of which is a non-deterministic result.
As a non-arachnophobe, I don't find spiders of any kind to be cute. But I also don't find anything about their appearance or behaviour to be unpleasant or scary. They're just aesthetically unremarkable. Similar to, say, fish---I don't think most people will look at a fish and think either "ooh, so cute" or "weird alien thing, get it away from me", they just see a fish and don't really have any emotions about it.
As somebody with autism, one thing I'd say from my experience (I don't know how many people will agree) is that interviewing has felt like a much more severe stress test of my soft skills than anything I've had to do while actually being employed. While employed, the vast majority of my social interactions are oriented around some technical task that I need to work on with other people, and conveying information effectively so as to bring about the completion of this task. This is precisely the kind of social interaction that I feel most competent in--I feel like I'm pretty good at it, actually! What I struggle with are social interactions that are more open-ended, that are more about emotional connections and getting people to like you, and I feel like interviewing is an interaction of the latter type.
In this respect interviewing is a bit like LeetCode. LeetCode problems and writing code to satisfy business requirements are both "coding" but they're quite different kinds of coding; someone being able to do the former is probably good evidence they can do the latter, but there are also plenty of people who can do the latter without being able to do the former. So it is, in my view, with interviewing vs. interacting with people on the job.
I think the author made a typo, and they actually meant to refer to 2.92, not 2.29. The table preceding the part mentioning 2.29 only includes 2.92. And 2.29 doesn't include the text "This is not easy!", whereas 2.92 does.
That makes a lot more sense. It also matches the times they record for the exercise. For context, 2.92 is:
> Exercise 2.92. By imposing an ordering on variables, extend the polynomial package so that addition and multiplication of polynomials works for polynomials in different variables. (This is not easy!)
2.29 is what I described in my other comment: it's a pair of functions walking a binary tree, with conditional logic based on whether a node is interior or exterior, returning either a sum or a true/false value. That's not hard at all, and their time for it was only 83 minutes, versus their claimed 2400 minutes for 2.92.
A spoken English sentence is a finite string of phonemes. The set of allowable phonemes is finite. Given a finite set X, the set of all finite strings of elements of X is countable.
(The latter statement holds because for any given n, the set X_n of all strings of length n is finite. So you can count the members of X_0, then the members of X_1, and so on; by continuing in that way you'll eventually count every finite string. You never run out of numbers to assign to the next set, because at each point the set of numbers you've already assigned is finite (it's no larger than X_0, ..., X_n combined, for some n).
In fact, even if you allow countably infinitely many phonemes to be used, the X_n sets will still be countable, if not finite, and in that case their union is still countable: to see that, take an enumeration of each set and put them together as the columns of a matrix. Even though the matrix is infinite in both dimensions, its diagonals are finite, so you can enumerate its cells by going one diagonal at a time, like this (the numbers reflect the cells' order in the enumeration):
1 3 6 10 15
2 5 9 14
4 8 13
7 12
11
However if you allow sentences to be countably infinitely long, then even when you only have finitely many phonemes, the set of all sentences will be uncountable, because in that case each countably infinitely long sentence can be mapped to a real number represented as an expansion in some base, and you can apply Cantor's diagonal argument. The "just count out each X_n separately" argument doesn't work in this case because it only applies to the sentences of finite length.)
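The diagonal traversal above can be made concrete; here is a small sketch (the function name `diagonal_index` is mine) that computes each cell's position in the enumeration and reprints the triangle:

```python
def diagonal_index(row, col):
    # Position (starting at 1) of cell (row, col) in the diagonal
    # enumeration: diagonal d = row + col is walked from its bottom-left
    # cell (d, 0) up to its top-right cell (0, d).
    d = row + col
    return d * (d + 1) // 2 + (d - row) + 1

# Reproduce the triangle from the comment above.
for row in range(5):
    print(" ".join(str(diagonal_index(row, col)) for col in range(5 - row)))
```

Because every cell gets a distinct positive integer and none is skipped, this is exactly the bijection needed to show the union is countable.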
Probably what Abbott in Understanding Analysis calls the axiom of completeness: every set that is bounded above has a least upper bound.
Making this stipulation distinguishes the reals from the rationals: e.g. the set of numbers whose square is less than 2 is bounded above by any number whose square is greater than or equal to two, but among the rationals it has no least upper bound, since given any rational whose square is greater than two we can always find a smaller rational whose square is also greater than two.
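To make the "no rational least upper bound" point concrete, here is a minimal sketch (the Newton-step trick and the helper name are my own choice, not from the comment): given any rational upper bound of the set {x : x² < 2}, it produces a strictly smaller one.

```python
from fractions import Fraction

def smaller_upper_bound(q):
    # Given a rational q with q*q > 2 (an upper bound of the set),
    # return a strictly smaller rational whose square is still > 2.
    # This is one Newton step toward sqrt(2).
    assert q > 0 and q * q > 2
    return (q + 2 / q) / 2

q = Fraction(3, 2)           # (3/2)^2 = 9/4 > 2
r = smaller_upper_bound(q)   # 17/12: still an upper bound, but smaller
```

Since no rational squares to exactly 2, every rational upper bound satisfies q² > 2, so the step can always be repeated.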
Assuming the axiom of completeness, we define the square root of two as the least upper bound of the set of numbers whose square is less than two.
But that is an axiom, not a construction! The point of Dedekind cuts is that they give a construction of the real numbers, and one can prove that this satisfies the Axiom of Least Upper Bounds.
You don't need a construction for a calculus class. If you do need one, Cauchy-sequence completion is more generalizable and somewhat easier to work with.
I don’t really know what a “calculus” class is since here (the UK) that term isn’t really used for university-level mathematics; we’d usually say “analysis” instead, but I know that “analysis” is a class in the US too, so I don’t know if calculus is closer to what we would do just prior to university (a bit of limits, differentiation, Riemann integrals, a bit of vector calculus).
Virtually every first year UK undergraduate analysis course will start with a construction of the reals via Dedekind cuts, and this is about the level that this book is pitched at.
The original commenter suggested that “least upper bounds” is a simpler approach, and that Hardy’s book is outdated by using Dedekind cuts; it may be that constructing the reals is not something that would be done at “calculus”-level in the US, but clearly the book isn’t aimed at that level.
Dedekind cuts (or Cauchy sequences) are totally standard, and I don’t think it’s fair to criticise their use at all.
In the U.S., there is typically a separation between calculus and real analysis. Though, the amount of difference between the two depends on the university.
In calculus, there is more emphasis on learning how to mechanically manipulate derivatives and integrals and their use in science and engineering. While this includes some instruction on proving results necessary for formally defining derivatives and integrals, it is generally not the primary focus. Meaning, things like limits will be explained and then used to construct derivatives and integrals, but the construction of the reals is less common in this course. Commonly, calculus 1 focuses more on derivatives, 2 on integrals, and 3 on multivariable. However, to be clear, there is a huge variety in what is taught in calculus and how proof based it is. It depends on the department.
Real analysis focuses purely on proving the results used in calculus classes and would include a discussion on the construction of the reals. A typical book for this would be something like Principles of Mathematical Analysis by Rudin.
I'm not writing this because I think you don't know what these topics are, but to help explain some of the differences between the U.S. and elsewhere. I've worked at universities both in the U.S. and in Europe and it's always a bit different. As to why or what's better, no idea. But now you know.
Side note, the U.S. also has a separate degree for math education, which I've not seen elsewhere. No idea why, but it also surprised me when I found out.
There's a POV that learning math and learning how to teach math effectively are two orthogonal things.
If one only took the method of teaching that is most common in US university lecture halls, and applied it to a small class of pre-teens or teenagers, it probably wouldn't be very effective.
I went to a University of California school which had 3 calculus tracks: one for life/social sciences students (e.g. biology, econ), one for physical sciences (chemistry, physics, math, ...), and an honors track.
High school went up through what we call Algebra II. Calculus is an Advanced Placement (AP) course that most students don't take.
I took physical sciences calc + multivariate calc (1 year including summer), an intro to proofs and set theory course, and then finally a rigorous construction of reals was taught in our upper division real analysis course. So somewhere in my second year as a math major. Though I had already researched the constructions myself out of curiosity.
Apart from the material being extraneous for anyone outside the major, I think they were in a sense trying to be more rigorous by first requiring set theory which included constructions of the integer and rational number systems.
Where we define the real numbers as the least upper bounds of special sets: there is a bijection between these sets and the set of real numbers as we commonly think of it, and that bijection sends each set to its least upper bound.
I haven't looked at Hardy's, but the presentation in Spivak is also Dedekind cuts. Perhaps Hardy uses a different approach and OP misnamed it? Rudin's chapter 1 annex also uses Dedekind cuts.
It looks like Hardy used Dedekind cuts starting with the second edition (1914), but not in the first edition (1908).
What's the advantage of Dedekind cuts over say equivalence classes of Cauchy sequences of rational numbers? Particularly if you start out by introducing the integers and rational numbers as equivalence classes as well.
The equivalence classes of Cauchy sequences are vastly larger and more misleading than those of the integers and rational numbers. You can take any finite sequence and prepend it to a Cauchy sequence and the result will represent the same real number. For example, a sequence of 0, 0, 0, ..., 0 where the number of zeros is the count of all the atoms in the universe, followed by the decimal approximations of pi: 3, 3.1, 3.14, 3.141, ... The key component is the error clause of getting close, but when that kicks in can vary greatly from sequence to sequence. The cute idea of being able to look at a sequence and see roughly where it is converging just is not captured well by the reality of the equivalence classes.
More or less, one can think of a Cauchy sequence as generating intervals that contain the real number, but it can take arbitrarily long before the sequence gets to "small" intervals. So comparing two Cauchy sequences could be quite difficult. Contrast that with the rational numbers, where a/b ~ c/d if and only if ad = bc. This is a relatively simple thing to check if a, b, c, and d are comfortably within the realm of human computation.
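A tiny illustration of the prefix point, using 1/3 in place of pi so everything stays exact rational arithmetic (the names and the prefix length are of course made up):

```python
from fractions import Fraction

def a(n):
    # Partial decimal expansions of 1/3: 3/10, 33/100, 333/1000, ...
    return Fraction(10**(n + 1) // 3, 10**(n + 1))

PREFIX = 1000  # stand-in for "the number of atoms in the universe"

def b(n):
    # The same sequence preceded by a long prefix of zeros.
    return Fraction(0) if n < PREFIX else a(n - PREFIX)

# a and b are equivalent Cauchy sequences: their difference tends to 0,
# so they represent the same real number (1/3) despite the huge prefix.
```

Nothing in the first thousand terms of `b` tells you anything about which real number it represents; only the tail does.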
Dedekind cuts avoid this, as there is just one object and it is assumed to be completed. This is unrealistic in general, though the nth roots are wonderful examples that make it all seem okay and explicit. But if one considers e, it becomes clear that one has to do an approximation to get bounds on what is in the lower cut. The (lower) Dedekind cut can be thought of as the set of lower endpoints of intervals that contain the real number.
My preference is to define real numbers as the set of inclusive rational intervals that contain the real number. That is a bit circular, of course, so one has to come up with properties that say when a set of intervals satisfies being a real number. The key property is based on the idea behind the intermediate value theorem, namely, given an interval containing the real number, any number in the interval divides the interval in two pieces, one which is in the set and the other is not (if the number chosen "is" the real number, then both pieces are in the set).
There is a version of this idea which is theoretically complete and uses Dedekind cuts to establish its correctness[1] and there is a version of this idea which uses what I call oracles that gets into the practical messiness of not being able to fully present a real number in practice[2].
> The equivalence class of Cauchy sequences is vastly larger and misleading compared to those of integers and rational numbers. You can take any finite sequence and prepend it to a Cauchy sequence and it will represent the same real number. ...
This can be addressed practically enough by introducing the notion of a 'modulus of convergence'.
> The equivalence class of Cauchy sequences is vastly larger and misleading compared to those of integers and rational numbers. You can take any finite sequence and prepend it to a Cauchy sequence and it will represent the same real number.
What's the misleading part of this supposed to be?
The equivalence classes of integers: pairs of naturals with (a, b) ~ (c, d) := (a + d) = (b + c).
The equivalence classes of rationals: pairs of integers with (a, b) ~ (c, d) := ad = bc.
It’s “easy” to tell whether two integers/rationals are equivalent, because the equivalence rule only requires you to determine whether one pair is a translation/multiple resp. of the other (proof is left to the reader).
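These two equivalence checks really are this short; a sketch in code (function names are mine):

```python
def int_equiv(p, q):
    # Integers as pairs of naturals: (a, b) stands for a - b.
    (a, b), (c, d) = p, q
    return a + d == b + c

def rat_equiv(p, q):
    # Rationals as pairs of integers, b and d nonzero: (a, b) stands for a/b.
    (a, b), (c, d) = p, q
    return a * d == b * c
```

Each is a single comparison of finite data, which is exactly what "decidable" buys you and what the Cauchy-sequence representation lacks.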
Cauchy sequences, on the other hand, require you to consider the limit of an infinite sequence; as the GP points out, two sequences with the same limit may differ by an arbitrarily large prefix, which makes them “hard” to compare.
We can formalise this notion by pointing out that equality of integers and rationals is decidable, whereas equality of Cauchy reals is not. On the other hand, equality of Dedekind reals isn’t decidable either, so it’s not that Cauchy reals are necessarily easier than Dedekind reals, but more that they might lull one into a false sense of security because one might naively believe that it’s easy to tell if two sequences have the same limit.
It is easy if you know the limits; if you don't, it's still true that two sequences {r_n}, {s_n} have the same limit if and only if the limit of the difference sequence {r_n - s_n} is zero, which conveniently enough is an integer and can't mess up our attempt to define the reals without invoking the reals.
That won't help you much if you don't know what you're working with, but the same is true of rationals.
I'm missing something as to this:
> equality of Dedekind reals isn’t decidable either
Two Dedekind reals (A, B) and (A', B') are equal if and only if they have identical representations. [Which is to say, A = A' and B = B'.] This is about as simple as equality gets, and is the normal rule of equality for ordered pairs. Can you elaborate on how you're thinking about decidability?
> Two Dedekind reals (A, B) and (A', B') are equal if and only if they have identical representations. […] Can you elaborate on how you're thinking about decidability?
Direct:
Make one of the sets uncomputable, at which point the equality of the sets cannot be decided. This happens when the real defined by the Dedekind cut is itself uncomputable. BB(764) is an integer (!) that I know is uncomputable off the top of my head. The same idea (defining an object in terms of some halting property) is used in the next proof.
Via undecidability of Cauchy reals:
Equality of Cauchy reals is also undecidable. The proof is by negation: consider a procedure that decides whether a real is equal to zero; consider a sequence (a_n) with a_n = 1 if Turing machine A halts within n steps on all inputs, 0 otherwise; this is clearly Cauchy, but if we can decide whether it’s equal to 0, then we can decide HALT.
Cauchy reals and Dedekind reals are isomorphic, so equality of Dedekinds must also be undecidable.
Hopefully those two sketches show what I mean by decidable; caveat that I’m not infallible and haven’t been in academia for a while, so some/all of this may be wrong!
> BB(764) is an integer (!) that I know is uncomputable
I meant BB(748) apparently.
To elaborate on this point a bit, I specifically mean uncomputable in ZFC. There may be other foundations in which it is computable, but we can just find another n for which BB(n) is uncomputable in that framework since BB is an uncomputable function.
Your method for deciding whether two rationals are or aren't equal relies on having representations of those rationals. If you don't have those, it doesn't matter that there's an efficient test of equality when you do.
But you're arguing that equality of Dedekind reals is undecidable based on a problem that occurs when you define a particular "Dedekind real" only by reference to some property that it has. If you had a representation of the values as Dedekind reals, it would be trivial to determine whether they were or weren't equal. You're holding them to a different standard than you're using for the integers and rationals. Why?
Let's decide a question about the integers. Is BB(800) equal to BB(801)?
The important point for me is that the equivalence for Cauchy sequences is part of the definition of real numbers as Cauchy sequences. This ought to imply that one has to be able to decide the equivalence of two sequences for the definition to make sense. For Dedekind cuts, the crucial aspect is being able to define the set, and that is something that can be called into question. But if that is done, comparing two Dedekind cuts is just a computational question, not a definitional one.
The intuition of a sequence is that the terms get closer to the convergence point. Looking at the first trillion elements of a sequence feels like it ought to give one some kind of information about the number. But without the convergence information, those first trillion elements of the sequence can be wholly useless and simply randomly chosen rational numbers. This is an "of course", but when talking about defining a real number with these sequences, as opposed to approximating them, this gives me a great deal of unease.
In particular, it is quite possible to prove that a sequence is Cauchy while having no way to explicitly figure out N for a given epsilon. The sequence is effectively useless. This is presumably possible, and common, when using the Axiom of Choice. One can even imagine an algorithm for such a sequence that produces numbers and eventually converges, but the convergence is not knowable. Again, if this is just approximating something, then we can simply say it is a useless approximation scheme. But defining real numbers as equivalence classes of Cauchy sequences suggests taking such a sequence seriously, in some sense, as the answer.
In contrast, with the integer and rational number versions it is quite immediate how to reduce a representative to its canonical form, assuming unlimited finite arithmetic ability. For example, 200/300 ~ 2/3, and one recognizes that 200/300 and 2/3 are different forms of what we take to be the same object for most purposes. There is no canonical Cauchy sequence to reduce to, and concluding that two sequences are equivalent could take a potentially infinite number of computations/comparisons. While that is somewhat inherent to the complexity of real numbers, it feels particularly acute when it is something that must be done in defining the object.
Dedekind cuts have the opposite problem. There is only one of them for an irrational number, but it is not entirely clear what we would be computing out as an approximation, particularly if the lower cut viewpoint is adopted.
Intervals, on the other hand, inherently contain the approximation information. By dividing them and picking out the next subinterval, one also has a method for computing a sequence of ever better approximations. I suppose one could prove the existence of the family of containment intervals without being able to explicitly produce them, but at least the emptiness of the statement would be quite clear (nothing is produced), in contrast to sequences that could produce essentially meaningless numbers for an arbitrarily large number of terms.
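As a small sketch of the interval idea, assuming bisection as the refinement rule (my choice; the comment doesn't prescribe one), here is √2 represented by a shrinking family of rational intervals:

```python
from fractions import Fraction

def refine(interval):
    # One bisection step: halve the interval, keeping the half whose
    # endpoints' squares still straddle 2, so sqrt(2) stays inside.
    lo, hi = interval
    mid = (lo + hi) / 2
    return (mid, hi) if mid * mid < 2 else (lo, mid)

I = (Fraction(1), Fraction(2))   # 1^2 < 2 < 2^2, so sqrt(2) is inside
for _ in range(20):
    I = refine(I)
# I is now a rational interval of width 2**-20 containing sqrt(2),
# and every interval in the family is an explicit approximation.
```

Each interval in the family directly bounds the error of the approximation, which is the information a bare Cauchy sequence need not carry.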
Wringing our hands so much about the art as Art. This thing in general where people feel the need to validate or justify AI art in the same terms one does other art is like the antithesis of anything you could remotely call the avant garde. To, in fact, take surveys about it, calculate conclusions... To care about what people think at all, about what they simply see with their retinas, as if art is somehow for them. It all smacks of precisely the kind of thing he hated, or at least tried to distinguish himself from.
Technically, `a :: Num` would be declaring, or defining that `a` is of type `Num`. After you see `a :: Num`, you can assume from then on as you're reading the program that `a` has type `Num`; if something is incompatible with that assumption it will result in a compiler error. This is different from `Num a`, which is making the assertion that `a` is of type `Num`, but that assertion may evaluate as true or false. It's similar to how assignment is different from equality, so that most programming languages with C-style syntax make a distinction between `=` and `==`.
There's also the fact that `Num` is technically not a type, but a type class, which is like a level above a type: values are organized into types, and types are organized into classes. Though this is more of a limitation of Haskell: conceptually, type classes are just the types of types, but in practice, the way they're implemented means they can't be treated in a uniform way with ordinary types.
So that's why there's a syntactic distinction between `Num a` and `a :: Num`. As for why `Num` comes before `a`, there's certainly a reasonable argument for making it come after, given that we'd read it in English as "a is a Num". I think the reason it comes before is that it's based on the usual function call syntax, which is `f x` in Haskell (similar to `f(x)` in C-style languages, but without requiring the parentheses). `Num` is kind of like a function you call on a type which returns a boolean.
As I understand it, the main advantage of having separate CUSTOMER and ORDER tables, rather than just a field on CUSTOMER which is a list of order IDs, is that the latter structure makes it easy for someone querying the database to retrieve the list of order IDs for a given customer ID (just look up the value of the list-of-order-IDs field), but difficult to retrieve the customer ID for a given order ID (one would have to go through each customer and search through the list-of-order-IDs field for a match). The former structure, on the other hand, makes both tasks trivial to do with a JOIN.
What you're describing is a straightforward indexing problem. There's nothing stopping the database building an index which can look up customer IDs for a given order ID. You could even, if the database allows it, build a virtual ORDER table projection that you could query directly.
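A minimal sketch of this with SQLite (table and column names are illustrative): the index makes the orders-for-customer lookup cheap, and a JOIN answers the reverse question just as easily.

```python
import sqlite3

# Illustrative schema: separate customer and order tables, with an
# index so both lookup directions are cheap. ("order" is a reserved
# word in SQL, hence the quoting.)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "order" (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
    CREATE INDEX order_customer ON "order"(customer_id);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO "order" VALUES (10, 1), (11, 1), (12, 2);
""")

# Order IDs for a given customer: a single indexed lookup.
orders = [row[0] for row in conn.execute(
    'SELECT id FROM "order" WHERE customer_id = 1 ORDER BY id')]

# Customer for a given order: equally trivial with a JOIN.
(customer,) = conn.execute(
    'SELECT c.name FROM customer c JOIN "order" o ON o.customer_id = c.id'
    ' WHERE o.id = 12').fetchone()
```

The index on `customer_id` is exactly the reverse mapping that the list-of-order-IDs design makes awkward.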