I wasn't as convinced by the mathiness and speculation/explanation arguments as by the others. Sure, there may be a few examples of obfuscation via math, but most papers, in my view, don't do that. Adding a couple of math equations doesn't really hurt (it doesn't add anything either). I also interpret the explanations in most deep learning papers as speculation, since short of rigorous experimentation or theory (not found in DL papers), there's no way to prove an explanation is correct.
Comparatively, the other problems are orders of magnitude more important and troubling. Experiments which mislead people into thinking the technique matters more than the compute (like in OpenAI's Glow paper) and intentional anthropomorphization to get media coverage (OpenAI Dota bot's "cooperation and teamwork") are much more serious and could have used more attention in your paper. I think these are more serious because the former can fool even experts in the field, and the latter seriously misinforms the lay public on the topic of AI.
These are excellent questions. I often ask Google researchers, "How Deep is the Mind?" They don't seem to know. You see, it all comes down to democratizing the blockchain, and the only way to do that is with AI in each and everyone's hands. Together, we can make a more open AI, for everyone! And for you too! The only trust required for CryptoMinds is trust in proof of work, am I right?
Specifically, the field of homomorphic encryption. It's a somewhat new field. Some libraries and methods require hours per bit operation, whereas OpenMined and PySyft are claiming to do similar things with orders of magnitude less overhead.
I ask because, if their claims are valid, they are doing some very interesting things.
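For anyone unfamiliar with what "computing on encrypted data" even looks like, here's a toy sketch of the Paillier scheme, which is additively homomorphic: you add two plaintexts by multiplying their ciphertexts. (Purely an illustration with insecure demo primes; it's not what OpenMined/PySyft actually use, and fully homomorphic schemes that support arbitrary circuits are far heavier, hence the "hours per bit" figures.)

```python
# Toy Paillier additively homomorphic encryption.
# WARNING: demo-sized primes, completely insecure; real use needs ~1024-bit primes.
import math
import random

p, q = 10007, 10009
n = p * q
n2 = n * n
g = n + 1                                  # standard choice of generator
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow(lam, -1, n)                       # valid because g = n + 1

def encrypt(m):
    while True:
        r = random.randrange(2, n)         # random blinding factor coprime to n
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    x = pow(c, lam, n2)
    return ((x - 1) // n * mu) % n

# Homomorphic addition: Dec(E(a) * E(b) mod n^2) = a + b
print(decrypt((encrypt(20) * encrypt(22)) % n2))   # -> 42
```

The multiplicative-ciphertext trick is what makes schemes like this attractive for things like private aggregation, even before full FHE enters the picture.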
Author here: Grisha Perelman's proof of the Poincaré conjecture has never (to my knowledge) been submitted by him to, or published in, any journal. He decided, as is his right, that he could not care less about the professional community of publishing mathematicians or their protocols. That does not invalidate his achievement.
I should note that, since I am not and do not expect to be at the level of mathematician that Perelman is, I have not actually read his proof. So I defer to other, superior mathematicians for this assessment and come by it as hearsay. :)
I think the logic that his work is unpublished is itself wrong: he posted his work on arXiv for the world to see. He has very strong opinions about the current scientific publishing status quo, hence he did not go through the usual route of submitting to a journal, but he did post it on arXiv. Who are we to decide that posting your work in a blog post, on arXiv, etc. does not constitute publishing?
It becomes a problem when the people with the money insist on the distinction. Grant proposals usually require you to list your important and relevant "published" papers.
There are actually two aspects to "published". One is archival: people can expect to access the work decades later (whether they have to pay for that is another discussion). The second aspect is peer review, a.k.a. quality control.
Personally, I once submitted a paper to a workshop. After submission, peer review, and acceptance, the workshop committee decided that they would not publish proceedings. I could have submitted the paper elsewhere, which I found a strange position to be in; instead, I published it as a techreport. However, it is now unusable for proposals, because a techreport is "not published" even if it is properly archived and went through peer review.
> However, it is now unusable for proposals, because a techreport is "not published" even if it is properly archived and went through peer review.
IMO, the definition of 'published' is a huge issue. I have always read "published" as archived, peer-reviewed research. Your paper is both, yet it remains in a 'not published' state, which I think is wrong and hinders future research.
Hey Zack, are there any intermediate/advanced "mathematics for machine learning" books you'd recommend? I find the classic recommendations are not exhaustive enough to cover the kind of math recent papers have started getting into.
You should keep in mind that probably none of those machine-learning researchers has studied only math specific to that domain, so their papers are likely to include whatever math they have a background in, plus any new techniques they had to learn to get their results.
That said, everything I saw in the papers you linked was linear algebra, calculus or probability theory plus the usual smattering of background notation and set theory.
Once you have a solid background in those areas, it is likely more productive to look up the specific concepts mentioned in a paper (such as the Kullback-Leibler divergence or the Bellman equation), because by then you are probably too deep in the woods to find one resource that adequately covers all those different directions.
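Those one-off concepts are often quicker to absorb from a worked example than from a whole textbook chapter. For instance, here's the discrete KL divergence as it typically appears in papers (pure Python, made-up distributions):

```python
# KL divergence between two discrete distributions:
# KL(P || Q) = sum_i p_i * log(p_i / q_i), measured in nats.
import math

def kl_divergence(p, q):
    # terms with p_i = 0 contribute 0 by convention
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(kl_divergence([0.5, 0.5], [0.9, 0.1]))   # ~0.511 nats
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))   # 0.0 -- identical distributions
```

Seeing it computed makes the asymmetry (KL(P||Q) != KL(Q||P)) and the "zero iff equal" property much more concrete than the formula alone.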
That's mostly linear algebra, probability theory and calculus. You're going to have a difficult time self-studying all of that if you haven't had much exposure to it.
Books are probably a less efficient method of learning the mathematics if you have targeted subjects you want to learn about. They're typically suited to introductions and breadth-wise coverage of fields, but once you get higher up, "linear algebra" (for example) can get fuzzy with things like abstract algebra. That means you'll end up with several tome-like books to work through which can be productive, but it'll take a while and you'll need to map the material to the applications you're interested in on your own. It's more efficient to develop a good baseline of understanding about a broad subject area, learn the foundational theorems, then move on to the specific areas you need to learn. This is typically doable if you've developed the requisite mathematical maturity overall and if you have learned the "essentials."
Practically speaking: maybe pick up foundation texts like Strang's (linear algebra), Spivak's (calculus) and Ross' (probability theory). You're going to want a solid foundation in analysis before moving on to higher order probability theory, so drill down on that after you do a refresher on the calculus. From there you should attempt to read each paper (even if you struggle a lot), take notes on what confuses you or doesn't make sense, read the prior art on those topics and then come back to it.
I don't particularly read machine learning papers often, but I read mathematical cryptographic ones very often (at least once per day I find myself in a new one). It's not typical that I read a research paper introducing a novel primitive or construction where I follow the math immediately on a single pass, and I often come across things I need to read about first. From a thirty thousand foot view the math for both of these subjects is broadly similar in rough topical surface area, so I think this methodology for academic reading is fairly applicable to most subjects that involve a lot of mathematics understanding.
Basically: don't approach learning the heavy math with a monolithic, brute-force approach as if you were in university. That's a slog and it's demotivating. Learn the minimum foundation for each area you need, then proceed to more advanced topics as you need them.
For calculus books, why Spivak over Swokowski? Can you compare and contrast, and suggest why someone might choose one over the other? I don't have a preference myself, but it would be good to understand the differences.
The linear algebra is obviously key, and I wish we'd done more of that in my advanced high school classes instead of elementary analysis.
Thanks a lot for your comment. I do have exposure to all three topics: I self-studied with Strang's MIT OCW course in high school, and took calculus and probability in high school and undergrad. So I'm not really looking for big introductory books, for two reasons: I don't really have time to go through big books, and since I already have some exposure, it becomes hard to find new things to learn from them. I was looking for something more concise that efficiently covers such mathematics.
EDIT: I think the main topic missing from my background is this so-called "analysis"; I never formally studied it. Is there a more efficient way to study analysis than Spivak, for someone who otherwise has a decent background?
Analysis is basically "really rigorous calculus". Basic analysis courses are also usually where you learn to do proofs.
(To some reasonable generality, "calculus" stands for "rules of manipulation", while analysis is the mathematical theory of calculus. So I can teach you stochastic calculus in a couple of two-hour sessions, but understanding what the hell is going on (stochastic analysis) requires measure theory, some functional analysis, and much courage.)
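For a concrete sense of the gap: calculus hands you rules like d/dx x² = 2x, while a first analysis course makes you prove the definitions those rules rest on, starting with the ε-δ definition of a limit:

```latex
% The epsilon-delta definition of a limit: typically the first thing
% a basic analysis course makes rigorous.
\lim_{x \to a} f(x) = L
\iff
\forall \varepsilon > 0 \;\, \exists \delta > 0 \;\, \forall x :
  0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon
```

Everything else in a basic analysis course (continuity, derivatives, Riemann integrals) is built by quantifier-chasing on definitions of this shape, which is also where you learn to write proofs.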
I don't know why, but people always seem to forget that optimization is an important topic in machine learning that requires study. Boyd's book is the canonical source (and free online). If you want to get some functional analysis background at the same time, you can look at Optimization by Vector Space Methods. It's an older book but it is still worth a read and provides more theoretical foundations than Boyd.
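Agreed, and the core idea is approachable even before opening Boyd. A minimal sketch of gradient descent on a one-dimensional convex quadratic (a hypothetical toy function, not an example from the book):

```python
# Gradient descent on the convex quadratic f(x) = (x - 3)^2,
# whose gradient is f'(x) = 2 * (x - 3).
def minimize(grad, x0, lr=0.1, steps=200):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)          # step downhill along the negative gradient
    return x

x_star = minimize(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_star, 4))            # -> 3.0, the unique minimizer
```

Convexity is what guarantees this converges to the global minimum; Boyd's book is largely about which problems have that structure and which algorithms exploit it.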
> I don't particularly read machine learning papers often, but I read mathematical cryptographic ones very often (at least once per day I find myself in a new one).
Where do you find new ones? I'd like to get into this.
The IACR eprint archive, which is essentially arXiv for cryptography. Essentially everything worth reading in cryptography is either in the IACR eprint archive or a conference proceeding. All proceedings from the IACR conferences can be read online for a fairly cheap membership fee. More often than not, everything is cross-posted to the eprint archive even if it's published in a journal (of which there's basically one: the Journal of Cryptology) or a conference.
It's problematic that the "experts in AGI" are generally self-designated, and have mostly produced little that is concrete, either in the form of theory or engineering. We should be concerned about long-term outcomes of technological progress, but I'm not convinced the current conversation centering on the term "AGI" is productive.
While I don't take exception to the term "machine learning" (it's reasonably well defined), I agree with most of the points here. Some are grossly exaggerated (e.g. needing petabytes of data to do machine learning): in my research on medical time series, we get strong results with a few thousand examples (megabytes), and the same goes for natural language datasets, which are modestly sized. But yes, I absolutely agree that function approximation for supervised learning and the kind of artificial consciousness portrayed in the media are very far apart.
Can you also put into words why? Sincere question, not being snarky; I also know the feeling of something feeling off but not knowing how to express it in words (yet).
Not OP, but also from an economics background. For me, the issue with economics in public perception is that it has become a modern "religion"/ideology and moved far from proper science, especially "mainstream" economic theory.
So the issue is not that it's being misinterpreted due to a lack of understanding. The issue is that it's being made unscientific and politicised on purpose.
A great book on this is James Kwak's "Economism", about the abuse of purported economic insight for political purposes.
As just one simple example, good old Heckscher-Ohlin trade theory (a neat general equilibrium model with 2 countries, 2 goods, and 2 factors of production, capital and labor) "shows" that free trade is a good thing, leading to a Pareto improvement.
But of course, that's predicated on a whole host of assumptions that might or might not hold, and furthermore predicated on the assumption that the "winners" compensate the "losers", through redistribution.
So, the economic case for free trade more or less includes the case for redistribution and compensation of those negatively affected by it - but that's often conveniently left out by proponents.
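To make the "gains from trade" half of that argument concrete, here's the logic with hypothetical numbers in a Ricardian setup (labor the only factor — simpler than the full 2x2x2 Heckscher-Ohlin model, but the same flavor of result):

```python
# Hypothetical Ricardian example: two countries, two goods, labor the
# only factor. Entries are labor hours needed per unit of output;
# each country has 100 hours of labor. All numbers are made up.
hours = {"Home":   {"cloth": 1, "wine": 2},
         "Abroad": {"cloth": 4, "wine": 2}}
labor = 100

# Autarky: each country splits its labor evenly across both goods.
autarky = {g: sum(labor / 2 / hours[c][g] for c in hours)
           for g in ("cloth", "wine")}

# Home's opportunity cost of cloth (0.5 wine) is below Abroad's (2 wine),
# so Home specializes in cloth and Abroad in wine.
trade = {"cloth": labor / hours["Home"]["cloth"],
         "wine":  labor / hours["Abroad"]["wine"]}

print(autarky)  # {'cloth': 62.5, 'wine': 50.0}
print(trade)    # {'cloth': 100.0, 'wine': 50.0} -- more cloth, no less wine
```

World output rises, but notice what the model doesn't show: who within each country gets the extra cloth. That's exactly the redistribution step that gets quietly dropped.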
Popular economics and popular "AI" as described in the article both seem like situations where everybody with an idea to push cites popularisations of research conclusions without understanding the limitations of the models, and where people with actual expertise in the field are often happy to play along with these hyperbolic, caveat-free popularisations because it helps their end goals.
Sure, the economics profession and its popularisers might be a little more incentivised by political aims, and AI research and AI's hypers and commercialisers a little more by money. But both fields suffer from the same problem: whilst researchers agonise over tradeoffs between tractability and predictive accuracy, between fitting and overfitting, and wonder whether the class of problem they're looking at is even soluble, the people with the most confidence in their assertions tend to get the column inches, even if they barely know what they're talking about.
I only wonder whether it will ultimately lead to similarly widespread middlebrow dismissals[1] of the entire field of AI...
[1]for the avoidance of doubt, not an accusation I'm levelling at the poster above
Right, but this is a political problem. H-O (or Ricardian trade) is about the simplest model of trade you can come up with to show the concept.
It's not like economists don't know there are problems in distribution of wealth -- one of the most discussed papers last year (see these podcasts [1] [2]) talks about the effects of China massively expanding trade with the US in early 2000s on some parts of the country.
Pretty much everyone has known for half a century what trade does; the problem with redistribution is political and logistical (counties tend to get devastated, and people don't like to move).
Interestingly, trade has extremely similar labor market effects to automation.
Is your "economics background" a BA, by any chance?
The people who say economics "became religion" are usually those whose only exposure to economics is through secondhand knowledge (ie. they read about it in media) or took a handful of undergraduate classes.
Take a look at what modern economics research looks like [1]. Seriously, read __any__ of those articles and come tell me with a straight face it's not doing normal science (come up with theory, test with empirical data).
Economics is the most scientific it's ever been. Most graduate curricula, and even some undergrad ones, are veering towards the applied-statistics arm of economics, because it's what's been most successful in the last 20 years. Granted, there are empirical problems in, say, macro, but that's mainly due to a lack of data.
Economics is also probably the most politicized it's been in a long time at the moment, I agree with you in that. Apart from a few venues like Planet Money or Freakonomics, there isn't much pop-economics like there is pop-science in other fields like physics. Moreover, the incentive to politicize economics is much greater than other sciences.
Yup, I'm quite familiar with what modern economics research looks like. My problem starts when I get to "We now use formal econometric techniques": they take a very limited model with a huge number of assumptions and then draw conclusions from it.
The math checks out, no question about that (although there are always stories about the reproducibility of results). But after that, there's a huge gap when you try to extrapolate the model's conclusions to the real world, precisely because of the model's assumptions/limitations.
I am rather sceptical of this approach. It's formal to the point of "my model of my virtual world totally works, due to the ideal nature of this world and mathematical logic", or "we managed to fit our model to a carefully selected dataset". Yes, it's a formal 'scientific methodology', and yes, small-scope research is the practical way to publish paper after paper (the only real 'currency' in the scientific world). But it also seems to be a 'safe haven' for economists, abstracted from the real world just enough not to interfere with real politics. Much as with biology in Darwin's time, when it was fine to be a clergyman and study biology (mainly in the taxonomic sense: listing all of 'god's creatures' and not thinking about the origin of species).
Also, re "The people who say economics "became religion" are usually those whose only exposure to economics is through secondhand knowledge (i.e. they read about it in the media) or took a handful of undergraduate classes" - yup, the same goes for some Nobel laureates, like Krugman or Hayek.
Can we agree that the state of a scientific field is only as good as the decisions taken based on its achievements?
This is actually very similar to the issues HCI suffered from a few decades ago, which indirectly led to Interaction Design trying to separate itself from it. The former was too dogmatic about trying to apply quantitative models to everything, and the latter consciously said "nope, we're going to look at the humanities, anthropology, and all the other fields known for qualitative research, and see what we can learn from them to make better human-oriented designs".
(mind you, this was before the term "UX design" got diluted to "graphic design for webpages and buttons on touch interfaces")
> Can we agree that the state of a scientific field is only as good as the decisions taken based on its achievements?
Certainly not, or else climate science would still be in thorough disagreement about whether or not there's climate change and/or how anthropogenic it is.
Even if you have massive amounts of evidence pointing to something, if it goes against entrenched interests, nothing is going to be done.
Also, you need to differentiate Krugman the political commentator and Krugman the economist. He's much more of the former these days, posts labeled "wonkish" are generally from the latter. The fault is on him, though, for discrediting himself that way.
> Certainly not, or else climate science would still be in thorough disagreement
Looking at Europe, I see decisions, policies, and agreement. Looking at the USA, I see a conscious decision to ignore scientific predictions when ignoring them suits short-term political goals, and not to ignore them when it's vital, like when planning military strategy in the Middle East.
The US (and Canada and Australia) are bad on climate change because they have entrenched interests in fossil fuels which leads to political lobbying.
Europe doesn't have those specific problems, but it has others.
Politicians will do whatever has political benefits to them. Economic benefits, or even reality, are purely secondary. You might get the occasional exceptionally altruistic politician, but the average one will act on his incentives.
This is why you see stuff like the Reinhart and Rogoff paper, which never passed peer review, touted for austerity politics even long after it was discredited. Austerity is popular with a part of the population, and politicians will use whatever they can to justify whatever they're trying to do.
Judging anything by its political success is nonsense.
Sure. The first NBER paper in [1] features the following jewel (p.34):
"B. Calibration
To quantitatively decompose the contribution of different factors to the growth of shadow banks and fintech firms, we first have to calibrate the model to the conforming loan market data."
I can tell you with a straight face that is not normal science. Economists themselves increasingly recognize so-called "calibration" is a farce.
If the paper gets traction and the model specification is indeed not robust, the first thing you'll see in a few months is something like "paper xyz: comment" poking holes in the methodology, and getting even more traction. Empirical microeconomics is fairly open about methodological flaws and critiques.
Also, by the way, you see pretty much the same type of thing in a ton of fMRI neuroscience, medical, and psychology studies (even the ones you'll later see on NPR or in TED talks). You shouldn't ever believe any single empirical result in basically anything, except maybe CERN-style particle physics.
Your response, and the ostensible fact my critique went completely over your head, probably shows you should definitely spend more time familiarizing yourself with the discipline, before going around defending it.
You should start with 'calibration' in economics. No, it is not quite what you (seem to) think it is. No, it is not quite "the same type of thing" as p-hacking and low-powered studies in psychology. (Whose poor reliability, by the way, is almost common knowledge by now.)
My point is that it's very easy to check "model calibration" (e.g. "plug in values from outside data"): just run the code with different values.
Because it's so easy to check those things (assuming the data is not proprietary or whatnot) I'd argue it's a much lesser problem than the "garden of forking paths" in lab experiments where it's much harder to test robustness of the result.
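The kind of robustness check I mean is trivial to script. A toy sketch: take one "calibrated" parameter, sweep it over a range of plausible values, and watch how much the headline number moves. (The model, parameter names, and numbers here are all made up for illustration; they're not from the NBER paper.)

```python
# Hypothetical toy model: the share of lending done by shadow banks
# as a function of a capital requirement, with a "calibrated"
# elasticity parameter. Everything here is invented for illustration.
def shadow_bank_share(capital_requirement, elasticity=1.5):
    # toy mechanism: higher capital requirements push lending
    # toward the less-regulated shadow-banking sector
    return 1 - 1 / (1 + elasticity * capital_requirement)

# Sensitivity sweep over the calibrated elasticity at an 8% requirement.
for e in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(e, round(shadow_bank_share(0.08, elasticity=e), 3))
```

If the headline number barely moves across the sweep, the calibration is doing little work; if it swings wildly, the "quantitative decomposition" is mostly an artifact of the chosen value.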
Moreover, I don't think anyone intelligent is foolish enough to take the coefficients in an econometric study literally; at least I would hope not.
In more established fields, it tends to be people outside the field who exaggerate things, because they can't put them into the context of the field, i.e. they lack domain knowledge. In "tech", it seems to be the people in the field who exaggerate, because they can't put things into the context of the world.
I generally agree with this. I used "machine learning" almost exclusively in the article. Yet "AI misinformation" seemed more appropriate for the title, since the misinformation is generally in reference to "artificial intelligence".