Using Google Shopping is a cheat code for getting cheaper prices. Companies will mark their products down far more than any coupon you can find online, just to appear at the top of the results. It's great for saving money, if you already know what you want to buy.
While the idea behind it is nice, the number of possible outcomes isn't high enough to justify the complex decision tree the user has to go through. Here's a much easier way to represent it:
I put all paper documents I receive on top of a stack. Therefore it's roughly sorted by date.
If I need to get an invoice from last December, I just look up around that date in my stack.
Time spent to store information: 0; time spent to find something: a few minutes, once every other month.
Every five years, I take the bottom of the stack and file it in the cellar. And I come back from the cellar with 10-year-old documents I can either trash (in my office's secured bin) or keep in my filing box.
I also keep contracts (insurance, bank, ...) in this filing box.
Last thing: all documents that will be used for my tax return (or at least the equivalent of it in France) go in one folder. I use it once a year, then file it in the "taxes" box.
The key lesson you need to remember about orbit (in KSP or real life) is that orbit is sideways speed, not up distance or up speed. Going up, you will come back down. Go sideways fast enough, you'll come down but keep missing the planet below you, so you stay in space.
The most efficient way to do this would be to just go sideways really fast from the ground. Problem: there's a lot of air in the way. So the right thing to do (again, both in Kerbal and in real life) is to initially go just up, but as the air reduces, gradually switch to moving sideways instead.
Then once you're up there, you need to know the basics of manipulating orbits. Lesson number 1: you can increase the height of your orbit on the opposite side of the planet by increasing your speed in the direction you're currently going. So once you get your apoapsis (highest point) above 100km or so, you can turn off your engines, wait until you get there, then burn hard in the direction you're going, and your periapsis (the lowest point of the orbit) will rise like magic.
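The numbers behind that prograde burn fall out of the vis-viva equation, v² = GM(2/r − 1/a). A rough Python sketch, using Kerbin's constants as published on the KSP wiki (GM and radius are assumptions taken from there, not from this comment):

```python
import math

GM = 3.5316e12   # Kerbin's gravitational parameter, m^3/s^2 (from the KSP wiki)
R = 600_000      # Kerbin's radius, m (from the KSP wiki)

def visviva(r, a):
    """Orbital speed at distance r from the body's center, for semi-major axis a."""
    return math.sqrt(GM * (2 / r - 1 / a))

# Circular 100 km parking orbit:
r_park = R + 100_000
v_circ = visviva(r_park, r_park)
print(f"circular speed at 100 km: {v_circ:.0f} m/s")  # about 2246 m/s

# Burning prograde at 100 km raises the opposite side of the orbit.
# Here: an ellipse with apoapsis at 500 km.
r_apo = R + 500_000
a = (r_park + r_apo) / 2
v_burn = visviva(r_park, a)
print(f"speed needed at periapsis: {v_burn:.0f} m/s "
      f"(delta-v ~ {v_burn - v_circ:.0f} m/s)")
```

Going faster at one point of the orbit only ever raises the other side, which is exactly the "lesson number 1" above.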
If you are actually generating that much on AdSense, I think you'd likely make 20-30% more if you moved your ads to a high-quality agency like AdThrive or Mediavine (those 2 are the only ones I'd firmly recommend over AdSense). I'm sure you'd be approved by at least one of them based on the description of your site.
I used to use AdSense on all my side sites too and just focus on the content and UX, but once you reach a certain volume (which you are well above), you are leaving a lot of money on the table by not investing the effort into improving your ad monetization.
Blockchain was invented to solve one particular problem: distributed consensus on a sequence of transactions, where the choice of which transaction to include from a set of conflicting transactions is irrelevant.
The latter property here is key to understanding where blockchain is useful. It was created to solve the "double spend problem", i.e. two transactions that spend the same coin but send it to different recipients (and so they conflict and cannot both be included in the canonical list of transactions). A double spend is the result of the sender either (a) making a mistake, or (b) attempting fraud. In both cases the important property is that as long as only a single one of these conflicting transactions is included, the system works.
Only if your problem exhibits the above property (and it's a distributed system) does using a blockchain make sense.
I am a high-schooler and I understand dependent types. (I am not the person you are replying to, but I am the person who first mentioned typed holes.)
In type theory, there are things called "terms" and "types." A term is something like `fun x -> x`, or `(1, 2)`, or `1 + 1`. In short, a term is basically an expression. Each term has a type. For example, `fun x -> x` might have the type `a -> a` and `(1, 2)` might have type `Nat * Nat`. If a term has a type, it "inhabits" the type. "x has type T" is written `x : T`.
According to an idea called the Curry-Howard correspondence, types can encode logical statements such that terms that inhabit the type are its proofs. Therefore, by inhabiting a type, you are really proving a theorem.
With dependent types, the term and type languages are the same language, so you can mix them together. In short, types are terms.
Here are some types in Martin-Löf type theory:
---------
() : Unit
The unit type is a type with one inhabitant, (). It can be interpreted as the type of the empty tuple. In the Curry-Howard correspondence, the unit type corresponds with truth.
x : A    y : B
--------------
(x, y) : A * B
A * B, the product type of A and B, is the type of pairs. It is based on the Cartesian product operation (https://en.wikipedia.org/wiki/Cartesian_product). In the Curry-Howard correspondence, the product type corresponds with logical conjunction (and). By proving A and proving B, you can prove A * B (AKA A /\ B).
x : Empty
----------------
elim-empty x : a
The empty type is the type with no inhabitants. If you have an inhabitant of the empty type, you can eliminate it to derive anything. The empty type corresponds with falsity, and its elimination means that false implies anything.
x : A
--------------
Inl x : A + B
x : B
--------------
Inr x : A + B
The sum type is the tagged union, meaning "this or that." The type A + B means that the Inl tag holds an A and the Inr tag holds a B. The sum type corresponds with logical disjunction (or).
f : A -> B    x : A
-------------------
f(x) : B
The exponential type, the type of functions, means logical implication. If you have f, a proof that A implies B, and x, a proof of A, you can prove B by doing f(x).
With dependent types, the product and exponential types are generalized into the dependent sum (sigma) and the dependent product (pi) types respectively.
With the dependent sum type, or sigma type `Σ (x : A) B(x)`, the second component in the type is actually a function of A (meaning that B : A -> Type). The dependent sum type generalizes the product type so that the type of the second element of the pair is actually dependent on the first element. In the Curry-Howard correspondence, the sigma type corresponds with existential quantification, "there exists" or "for some." To prove Σ (x : A) B(x), you must inhabit A (show that A exists) and B(x) (where B is a logical statement about x, the inhabitant of A).
With the dependent product type, or pi type `Π (x : A) B(x)`, B is also a function of A that returns a type. The dependent product type generalizes the exponential type so that the codomain of the function, B(x), is dependent on the function's input, x. The dependent product type corresponds with universal quantification, "for all." To prove Π (x : A) B(x), you must prove B(x) for every possible x, where B(x) is a logical statement about x.
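All of these correspondences can be checked mechanically in a proof assistant. A small sketch in Lean 4 syntax (the particular examples are illustrative, not from the original comment):

```lean
-- conjunction = product type: a proof of A ∧ B is a pair of proofs
example (A B : Prop) (a : A) (b : B) : A ∧ B := ⟨a, b⟩

-- implication = function type: apply a proof of A → B to a proof of A
example (A B : Prop) (f : A → B) (a : A) : B := f a

-- falsity = the empty type: from False, derive anything
example (A : Prop) (h : False) : A := h.elim

-- universal quantification = dependent function (pi type)
example : ∀ (n : Nat), n + 0 = n := fun _ => rfl

-- existential quantification = dependent pair (sigma type)
example : ∃ (n : Nat), n + 1 = 3 := ⟨2, rfl⟩
```

Each `example` is a term inhabiting a type, i.e. a proof of the corresponding logical statement.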
Why the dependent product type generalizes the exponential type and the dependent sum type generalizes the product type seems confusing at first glance, but it really isn't:
Σ for x = 1 to n of B(x) = B(1) + B(2) + ... + B(n), and if B(x) is some constant C, then Σ for x = 1 to n of C = n * C. The same rule applies to types, as multiplication (the product) is just repeated addition (the dependent sum of a constant).
Π is like Σ, but for repeated multiplication instead of repeated addition. After all, exponentiation is just repeated multiplication.
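For finite types, you can check this counting argument directly. A sketch in Python, modeling "types" as sets (the particular sets are made-up examples):

```python
from math import prod
from itertools import product as cartesian

# A finite index "type" A, and a family B assigning each x in A a "type" (a set).
A = {0, 1, 2}
B = {0: {"a"}, 1: {"a", "b"}, 2: {"a", "b", "c"}}  # |B(x)| = x + 1

# Sigma type: dependent pairs (x, y) with y drawn from B(x).
# Its size is the SUM of the fiber sizes: 1 + 2 + 3 = 6.
sigma = {(x, y) for x in A for y in B[x]}
assert len(sigma) == sum(len(B[x]) for x in A)

# Pi type: dependent functions choosing one element of B(x) for each x.
# Its size is the PRODUCT of the fiber sizes: 1 * 2 * 3 = 6.
pi = list(cartesian(*(B[x] for x in sorted(A))))
assert len(pi) == prod(len(B[x]) for x in A)
```

When every fiber B(x) is the same set, the sigma count collapses to n * C (the plain product type) and the pi count to C^n (the plain exponential type), matching the generalization described above.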
Another type that you should be aware of is the equality type, `x = y`. In vanilla Martin-Löf type theory, `x = y` is inhabited by `Refl` if `x` and `y` have the same normal form (reduce to the same simplest form). However, there is ongoing research in a field called Homotopy Type Theory, which redefines the equality type in various ways.
Finally, types themselves form a hierarchy. With dependent types, the types themselves are first-class values, and they have their own types. However, `Type : Type` leads to Girard's paradox (similar to Russell's paradox). The solution is for each type to be in a "universe" so that `Type l : Type (l + 1)`, where `l` is the universe level.
- I have a nanorc I copy around but am quite comfortable without. Here it is. You can use it under WTFPL-1. Feel free to fork it, but I'm not accepting pull requests.
set multibuffer
set nohelp
# Defaults for older versions
set nowrap
set smooth
set morespace
unset boldtext
- I use ^K and ^U to cut and paste lines of text.
- If I need to edit multiple files, I use ^R.
- If I need to find and/or replace, I use ^W (find), optionally with ^R (replace) and/or M-R (Regexp).
- Sometimes I need to go to a specific line/column, so I use ^_.
Basically there's nothing else I want my text editor to do. I want minimal magic.
> It's theoretically possible to build up so many extra cities that you can actually take a 30 minute break in the game
It's not only possible, it is fairly routine for good players. This is because of an interesting bug in the game which has a big effect on the skill progression of developing players.
The bug is commonly called the "810" bug, because on most machines it occurs when you get 810k points. It actually occurs at 800k + B, where B is the score that gives a bonus city, but 10k was by far the most common setting for B [1]. Hence, "810" bug.
What the "810" bug does is give you a huge number of bonus cities--something in the neighborhood of 170. Even if you just barely made it to 810, with 170 cities you've got around 20 minutes of play even if you never save another city.
The typical progression of a developing player went something like this. Your first few games are overwhelming, and you die quickly. After a while, you are reaching the 6x stage (Missile Command has 6 levels of difficulty. It starts with, if I recall correctly, 2 stages of 1x, then 2 of 2x, then 2 of 3x, and so on, until you reach 6x, which is the top difficulty).
Once your games start regularly reaching 6x, you go through a long stage of steady improvement, as you get better at 6x play. Things are still going pretty fast for you, and you have to rely on some probabilistic play. For instance, at this stage you probably heavily rely on using one of your slow side bases to lay out a line of missiles after you see the first incoming wave. You aren't really aiming at specific targets--you are basically trying to lay down a fence that will catch much of that incoming wave. Then you use the fast center base to deal with any smart bombs or any missiles that got through the fence that you don't think you can take with a side base. Then, if you haven't panicked yet and thrown away the missiles from the other side base, you do another fence to try to catch the second incoming wave.
When you are down enough on cities that you only have 2 or 3 at the start of a stage, you probably switch to concentrating on saving those cities, so you depend less on making a missile fence, and more on trying to pick off specific targets that are coming at the cities. In particular, you are probably trying to make sure to save a city next to the center base, to make sure you stay alive.
You get better and better, so that you keep six cities on the ground, and even build up some in reserve, for longer and longer, and you start to put more thought into your missile fences--you start being able to recognize as soon as you see an incoming wave where there will be convergence points, and your fence starts to become not a solid line, but a few well-placed obstacles.
Then the day comes when you manage to hold on long enough to reach 810. Your reward is at least 20 minutes in which you get to play 6x and cannot die. You can practice precise targeting, or practice making perfect fences, or practice using your side bases for things you would normally use the center for.
This practice is very fruitful. Next time you play, you will find you are noticeably better. You might not get to 810 on the next game, but you will in the next few games. And then after that practice, you'll find the gap to the next 810 even smaller. Sometime in here, you'll find that you can regularly reach 810.
Now you start to get really good. All those immortal practice sessions let you get to where you can pick off anything with any base, and you never need to use a missile fence because you can quickly see where to place the minimum number of shots to kill everything on the screen.
You will soon reach the point where you regularly get to 810 with 6 cities on screen and 70+ cities in reserve, so when you hit 810 and get the bug's bonus, you have around 250 cities. Congratulations, you are now at the "walk away and the machine plays itself for half an hour" milestone.
I don't recall for sure, but I think when you wrap the machine you hit another bug that gives you a bunch more cities. There's also some point in there, maybe on the second wrap, where the machine gives you two waves labeled 0x, where everything comes down real fast and you don't have any control, and then the machine starts over at 1x, but you still have whatever bonus cities you had left.
When you've reached this stage, the only limit to how long you can play Missile Command is your endurance and any bugs farther along in the game. Many players at this stage would just play to 810 (just to maintain their skill) and then walk away to go play some other game. I had a kid follow me around all day once picking up my Missile Command leftovers [2].
If you were lucky, an arcade in your area had a Super Missile Attack. That was a third party hack that modified Missile Command to go to 10x, and added an orbital laser platform. We had one in my area for a short time, and none of the Missile Command players (who all could 810 with essentially no cities lost on the way to 810) got past 100k on Super Missile Attack. It was wonderful.
Unfortunately, it was also very very rare. The company that made it was sued by Atari and stopped making them. (It worked out for them, though...Atari was impressed enough to hire them to write a couple new games).
[1] I think I only encountered a non-10k machine once. There was an arcade in Pasadena, California on Colorado Blvd named Pak Mann Arcade. It was a pretty good arcade. I went there one night a day or two before the Rose Parade, when Colorado Blvd is already full of people camping out for the parade. Pak Mann cranked all the games up to their hardest settings, so they had the Missile Command at 20k bonus. Even more brutal was Defender. Not only was it 20k bonus, with a couple fewer starting ships than normal, but as soon as the first wave spawned, they went straight for the humans at high speed, and in about 10 seconds your planet was gone and you were in a space wave. Me and my friends were all "can play forever" Defender players at the time, and none of us got past 30k that night.
[2] Senior Ditch Day, Caltech, 1982. As a senior, I had to stay off campus all day. I went to an arcade and started playing Missile Command. Some kid, maybe 11 or 12, was watching. I hit 810 and immediately walked away to go play Defender. The kid happily jumped on the Missile Command to get some 6x practice. A little later, I left to go to another arcade. I noticed the kid was following me, at a respectful distance, trying not to look like he was following me. At the next arcade, I went to Missile Command, and the kid took up a position where he could watch. I 810'ed and walked away to play something else. The kid again took over. Long story short (too late, I know!), that kid followed me all day, to benefit from my Missile Command leftovers.
Any steady stream of thoughts can stop me from falling asleep, even for hours. I have noticed that long hours of programming or programming late in the day causes this state of constant hard thinking that keeps me awake. It can be really exhausting and even end in nightmares.
I think the article is right that you need to stop this cycle before falling asleep. Meditation practice helps to stop your thought cycle by focusing on your breath. As soon as you notice your focus has drifted to a new thought, you try to refocus on your breath.
However, meditation is done while being fully awake. You want to feel the mind stopping, being fully present in the moment.
I found a trick that helps me fall asleep, even when having the hardest thought cycles. Lying on my back in bed, I try to focus on my breath just like in meditation. After a while I slow down my breath as much as I can. It does not take long before I'm fully asleep.
I use this trick every time I have a hard time finding sleep. And it has always worked for me. It's one of the most valuable tricks in the toolbox of my life.
The must-read is Kent Dybvig's thesis, "Three Implementation Models for Scheme". The original lambda papers can wait until you read the Orbit paper, an optimizing compiler for T, by Kranz, Rees, et al.
The Lisp implementation bibliography pretty much runs through PL research like a vein. Some of the stuff you must read for Lisp is typically in "books"; Christian Queinnec's Lisp in Small Pieces is the most important work, but you will need a good foundation in denotational semantics (you can get by with the one chapter in the little book by Nielson and Nielson, "Semantics with Applications: A Formal Introduction").
Somewhere in there you will brush against various compilation methods and IRs for the lambda calculus, most importantly continuation passing style. Most semantics texts introduce lambda calculus and its three rules, but none go in depth into this like the tall green book by Andrew Appel, "Compiling with Continuations", a good chunk of which can be read in Appel's other papers. Appel's work is MLish in nature, but don't let that stop you; most optimizing Lisp compilers are MLish down underneath anyway. CMUCL does very good type inference but stops short of implementing full Hindley-Milner. Felleisen et al.'s "The Essence of Compiling with Continuations" might also come in handy, though it's heavy on the theory. Andrew Kennedy continues the saga with "Compiling with Continuations, Continued", which weighs CPS against A-normal form, another IR, and describes the techniques used by a compiler targeting .NET.
Most compiling "meat" can be found in the bits-and-bytes type papers. Wilson's GC survey "Uniprocessor Garbage Collection Techniques" is a must; it should have been called "What Every Programmer Should Know About Garbage Collection". Not to be confused with Richard Jones's "The Garbage Collection Bibliography". Boehm's "Garbage Collection in an Uncooperative Environment" is sheer hacking bravado, perhaps second only to "Pointer Swizzling at Page Fault Time", which should introduce you to memory management for disk-based heaps (i.e. object stores), among other things.
Your start in hacking runtimes will probably be David Gudeman's "Representing Type Information in Dynamically Typed Languages"; this is where you learn how stuff looks inside the computer when you no longer need to malloc and free. A previous hacking of a Pascal dialect prepared me for this wonderful paper.
Implementations of runtimes are documented by Appel for SML/NJ, and by Robert MacLachlan in "Design of CMU Common Lisp" (also perhaps Scott Fahlman's CMU report on CMUCL's precursor, "Internal Design of Spice Lisp", but that confused the crap out of me as I don't know the machine architecture they're talking about). You will also enjoy the Smalltalk research, starting with L. Peter Deutsch's first optimizing Smalltalk compiler, documented in "Efficient Implementation of the Smalltalk-80 System". Follow the Smalltalk lineage, by the way, all the way up to David Ungar's "The Design and Evaluation of a High Performance Smalltalk System", making sure NOT to ignore Self and its literature, also spearheaded by Ungar. (Start your Smalltalk hacking career with Timothy Budd's "A Little Smalltalk"; it should take you about a weekend and will absolutely prepare you for dynamic languages. A similar system is described by Griswold and Griswold, compiler, intermediate representation and VM, but that one is for Icon.)
For dynamic type inference and type-checking (TYPEP and SUBTYPEP, CLASS-OF, INSTANCE-OF, etc.), you can learn a good chunk of how CLOS should look to the runtime system from Justin Graver's "Type-Checking and Type-Inference for Object-Oriented Programming Languages". He only scratches the surface, and you should supplement this with a selection from the Smalltalk and Self literature, though neither will prepare you for multiple dispatch; for that, peer into Stanley Lippman's "Inside the C++ Object Model".
I have deliberately avoided "classics" on Lisp, compiler construction, optimization, and other stuff. None of the books and papers I have recommended are as popular as SICP, PAIP, or AMOP, or even the popular PL books like EoPL and van Roy and Haridi (both of which you should read, by the way), but they're the stuff you need to read and understand to be able to build a practical Lisp implementation, or at least satisfy your curiosity.
I just finished reading Sonya Keene's Object-Oriented Programming in Common Lisp cover-to-cover. There is a lot of stuff in there that just can't be covered by a brief guide for sure, a lot of pretty useful stuff, knowledge of which would quite possibly cause one to structure one's program in a substantially different way, if one knew about it. I highly recommend reading it if you intend to use CLOS for anything serious, and poring over the extended case study implementing streams is useful too, tedious though it might be. There are, however, some things that got changed between the printing of the book and the finalization of CLOS as we see in CL today. We write a describe-object method instead of a describe method, for example, and there is no longer a with-added-methods special form.
Very nice! A key attraction of pure Prolog code is indeed that it can be evaluated in different ways to obtain better termination and performance characteristics, and to derive more semantic consequences of programs. The equation "Algorithm = Logic + Control" indicates this ability to provide and use different ways of control to evaluate the logical meaning of Prolog code.
These slides seem to focus primarily, and in fact exclusively, on an important control mechanism called SLDNF resolution (SLD resolution with negation as failure):
There are other ways to execute Prolog code too though, notably SLG resolution. SLG resolution is also called tabling, and it is provided by an increasing number of Prolog implementations and also in different variations. SLG resolution is a control strategy that terminates in many cases where SLDNF resolution does not terminate.
For example, the Prolog predicate
a :- a.
terminates (and fails) with SLG resolution, but not with SLDNF resolution.
Note that Prolog is restricted to least Herbrand models. This is different from answer set programming (ASP), where you get all models. For example, the above program has two models in ASP (and in logic) namely one where a is true and one where a is false, but only one least Herbrand model (where a is false).
A common criticism of Prolog is that it is "confined to depth-first search", but that is not the case: you can reason about Prolog code in many different ways, and also execute it with iterative deepening, for example. This is in fact a major attraction of Prolog, not available in most other programming languages.
However, the more impure constructs you use, the more this flexibility shrinks. The ability to use different control mechanisms is a strong argument to use, as far as possible, the pure monotonic core of Prolog.
Here is a trick for learning how to make your hand “hover” that I was taught by the old pros when I joined the animation industry.
Take a pencil. A wooden one, not a mechanical one. Sharpen it. Then hold it so that the entire side of the exposed cone of graphite touches the paper, rather than the tip. Your thumb will be on one side of the pencil, with all four fingers in a row on the opposite side, rather than sort of clustered around the front of the pencil.
Now try to draw some lines. You will get very broad lines and probably have little control, because this grip forces you to keep your wrist still, and gives you very little room for your fingers to move the pencil either. It will feel very weird and awkward at first! You’ll have to make a bunch of big, broad motions because you’ve probably never tried to make fine motions like this with your arm in your entire life. It’s okay, you’ll get better!
A great way to get better: take a piece of paper, and draw a circle in the upper left corner, just barely touching the edges of the paper. Don't worry about making a nice circle, don't go over it multiple times, just make one simple sorta-circular gesture. Now move to the right and draw another circle, just touching the first one and the top of the paper. Repeat for a whole row, then do another row that just touches the bottom of the previous row, until you've filled the whole page.
Your circles will probably look better by the end of the page. I did this every morning as a warm-up for one of my first animation jobs, and the circles got a lot better, and tighter, over the course of not much time.
Once you have learnt this, you can easily transfer this new control of your arm motions to tools held in other grips. I mostly work digitally, and have to address the tablet with the stylus’ tip for it to register, but I still move my arm with the fluidity learnt from this exercise.
As a bonus this is also a lot better for your arm. Keeping the Carpal Tunnel Syndrome Fairy away is very much a thing grizzled old animators wanted to teach the new kids coming in, they’d seen great careers cut short by injuries.
(You could also probably keep holding the pen in a more vertical fashion and use a wrist brace to keep your wrist from moving, if this is all too damn weird for you.)
> You can only imagine how, in command mode, he effectively summons the textual equivalent of the Nine Hells of Ba'ator upon my open documents if I am not quick enough...
No extra tooling, no symlinks, files are tracked in a version control system, you can use different branches for different computers, and you can replicate your configuration easily on a new installation.
The workflow I teach the newbies at my job is to just do a quick `git checkout -b tempbranch` when trying out: complicated rebasing, cherry-picking commits, whatever.
Do your thing on that temporary branch. Did the thing work? Hell yeah. `git checkout - && git reset --hard tempbranch`. Presto magic, your old branch is now a perfect replica of tempbranch. It's as if you did the whole thing on your original branch to start.
Did it fail miserably? Oh well. `git checkout - && git branch --delete tempbranch`. It vanishes and no one has to know that the rebase fell apart. Your working branch is in pristine condition still.
It's really hard to keep moving all the way up and down the stack.
The main advantage, to me, is being able to collapse layers of abstraction. We've done that at one of the startups I'm CTO for.
On the backend, we eliminated the entire multi-tier stack we've come to know and love from the late 90s. The database, application server, caching server, and authentication server were collapsed down into a single executable.
We dropped TCP, and went to UDP + NaCl-based crypto. This in turn changed how we did session management and logins (we only target mobile devices), and allowed us to gain further performance by bypassing the Linux kernel entirely, and talking directly to the Ethernet hardware. Our wire-to-wire latency for a UDP packet (without any processing, just heading through the LuaJIT app framework) is ~26 nanoseconds. We measure the entire request/response time in less than 10,000 nanoseconds for updates, and frequently, less than 3000 nanoseconds for reads.
Nanoseconds! Today's hardware is crazy efficient, but yesterday's software architectures waste it all.
For caching, we employ a database library (LMDB) that uses the file system and memory mapping for all data, a lot like Varnish Cache does. We can service reads without a single malloc() call. No more memcached.
Finally, the entire system is built around an event streaming approach (see Disruptor/LMAX). For that to work, we wrote our own cross-platform Core Data alternative on the client, and for this particular application (a social network), the end result is that the user experiences ZERO network latency for everything but search.
Everything is local by the time the user is made aware of it. For updates like making a post, we make it happen instantly on the client, speculatively, and eventually the update makes its way to the server and back, and then on to everyone else. The user doesn't even need a live network connection to post, and "offline" usage works as expected.
And also (since this is a speciality of mine), our object graph syncing is multi-device from day one. Lots of projects (e.g. Vesper) struggle with that one. We didn't bolt it on, it's a fundamental property of our data model and overall system.
After collapsing the crypto, database, and authentication, we were able to introduce object capabilities that require zero memory lookups, instead of the usual RBAC, which in my experience is difficult to implement efficiently and hard for people to manage.
And because we have our own network protocol, and are targeting native clients we control, we can easily do client-side load balancing. We use consistent hashing to contact read-only servers (that also handle crypto), mostly so we can spread out the bandwidth without having a crazy expensive load balancer. Thanks to our custom database/app server, we can service all writes on a single machine, although like LMAX, it's designed to run three in parallel, discarding the results of two of them at any given time.
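Client-side load balancing via consistent hashing can be sketched in a few lines of Python (the server names, virtual-node count, and hash choice below are invented for illustration, not the poster's actual system):

```python
import hashlib
from bisect import bisect

class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each server gets `vnodes` positions on the ring to smooth the spread.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # First ring position clockwise from the key's hash.
        i = bisect(self.hashes, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["read-1", "read-2", "read-3"])
keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.node_for(k) for k in keys}

# Dropping one read server only remaps the keys that lived on it;
# every other key keeps hitting the same server as before.
smaller = HashRing(["read-1", "read-2"])
moved = sum(1 for k in keys if before[k] != smaller.node_for(k))
print(f"{moved} of {len(keys)} keys moved")  # roughly a third
```

The point of the technique is that last property: adding or removing a server disturbs only its own share of the key space, so clients can pick a read server deterministically without a central load balancer.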
Point is, there are HUGE gains to be made when you (a) understand the whole stack, (b) have the skills and understanding to rewrite any part as you need to, and (c) have the guts to ignore 15 years of received wisdom on how to scale an app and are willing to collapse layers down in pursuit of speed and simplicity.
It may be harder, but the payoff is enormous. I did all of the above myself in just over four months, including building a cross-platform app framework for iOS (porting to Android in March), and the app itself.
Likelihood: given a probabilistic model and its parameters, what is the probability of observing some data under that model?
For example, given a coin with some probability f of getting heads and thus 1-f of getting tails (model parameter), a set of coin flips (data), and the assumption that each coin flip is independent of all others (model), the probability of seeing an arbitrary sequence with H total heads and T total tails after H+T flips is
p(H,T|f) = f^H(1-f)^T [0]
Note that this likelihood is normalized (i.e. sums up to 1) with respect to all possible sequences of flips of length H+T. E.g. for length 2, there are 4 possible sequences of coin flips: HH, HT, TH, and TT, with probabilities f^2, f(1-f), (1-f)f, and (1-f)^2, which sum to 1.
(Also note that the likelihood is only a probability for discrete data; it’s a probability density for continuous data, since the underlying terms would no longer be probabilities but rather densities. In that case, replace the previous sum with an integral ranging over the entire domain of your continuous data.)
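The normalization claim for length-2 sequences can be checked directly in Python (f = 0.3 is an arbitrary choice):

```python
from itertools import product

f = 0.3  # probability of heads (arbitrary illustrative value)

# All 4 sequences of 2 flips: HH, HT, TH, TT.
seqs = list(product("HT", repeat=2))
probs = [f ** s.count("H") * (1 - f) ** s.count("T") for s in seqs]
for s, p in zip(seqs, probs):
    print("".join(s), p)

print("total:", sum(probs))  # the likelihood sums to 1 over all sequences
```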
Partition function (usually called a "marginal likelihood"): what if we instead normalize the likelihood with respect to the model parameters? Then it would no longer be a probability distribution with respect to the data, but rather a probability distribution with respect to the parameters. The marginal likelihood is just this normalizing constant. In the previous example,
p(f|H,T) = p(H,T|f)/<constant that would normalize p(H,T|f) to integrate to 1 over all possible values of f>
You can optionally weight this likelihood by a prior p(f), in which case the numerator would be p(H,T|f)p(f), with the partition function updated accordingly.
>Also the above reveal there is more elaborate structure in bayes than just the idea of 'updating your prior with new information', which is simply a platitude among STEM folk
I could not agree more. The core philosophical tenet of Bayesian inference is that this re-normalization of the likelihood returns something probabilistically meaningful. The debate over the validity of priors pales in comparison to the debate over whether the likelihood ought to be treated as proportional to a probability distribution over the model parameters.
[0] Note that this is distinct from the probability of seeing any sequence with H heads and T tails. That would require multiplying by the number of total sequences with H heads and T tails (H+T choose H), yielding the binomial distribution.
Sadly, this has facebook's new, horrible patent clause.
For those who aren't aware, it says
1. If facebook sues you, and you counterclaim over patents (whether about software or not), you will lose rights under all these patent grants.
So essentially you can't defend yourself.
This is different from the typical Apache-style patent grant, which instead says "if you sue me over patents in Apache-licensed software X, you lose rights to software X" (i.e. it's limited to software, and limited to the thing you sued over).
2. It terminates if you challenge the validity of any facebook patent in any way.
So no shitty software patent busting!
"The BSD license says nothing about patents, so this project is in some sense safer than a "normal" BSD-licensed project (but not an Apache, EPL or GPLv3-licensed project). "
As I've mentioned in another comment, this is both false and a common misunderstanding, because most engineers are not familiar with implied patent licenses.
BSD normally carries one. So you do, in fact, get an implied grant. By doing this you don't; you only get the explicit grant. The implied grant is normally not revocable unless the underlying BSD terms are violated, whereas the explicit grant is revocable for other reasons.
The reason people use explicit patent grants is to avoid getting into some unsettled law. Particularly, the sublicensability of implied patent licenses is not clear. The TL;DR of this is "the answer is clear if you get the software directly from person owning the patents. If you get it through someone else, like say, a linux distribution, it's not clear what happens".
The related doctrine of "patent exhaustion" that plays into this is also not settled, and currently the subject of a Supreme Court cert petition by Cisco.
A little art and plenty of science. The kernel matrices can be broken down logically once you know what the numbers are operating on. Considering a 3x3 kernel, the center of the kernel matrix is the origin pixel, and the kernel elements around the origin are the neighboring pixels in their respective directions and distances.
The identity kernel is [0 0 0; 0 1 0; 0 0 0]. For every input pixel, the output is the original pixel value. Don't change the pixel value based on what is around it.
A simple blur kernel would be 1/9 * [1 1 1; 1 1 1; 1 1 1]. The output for each input pixel is the average of the origin pixel's value and all of its neighboring values, with even weighting. A less dramatic blur can weight the origin pixel higher than the neighbors, as a Gaussian blur does. This results in the output pixel being more similar to the origin pixel than to its neighbors.
Edge detection like [-1 -1 -1; -1 8 -1; -1 -1 -1] can be understood as multiplying the origin pixel value by 8 and subtracting off all 8 of its neighboring values. If the values are all fairly similar, let's say all gray, your output will be black: 8A - 8A = 0. So it "punishes" pixels that are similar to their neighbors. When a pixel differs from some or all of its neighbors, you will be left with some nonzero value at the output, which detects a change from the neighbors: an edge. Horizontal edge detection: [1 0 -1; 2 0 -2; 1 0 -1]. It ignores the pixels above, below, and center, but accentuates the difference between what is on the left and what is on the right.
I think there are lots of things about elisp the language which make it easy to hack on too. In no particular order:
1. Dynamism+late binding+late failure means that, although it is hard to write fully “correct” code (and correct is ill-defined/impossible), it is very easy to write code that does work in the common cases without having to care about the uncommon cases.
2. Dynamism makes it possible to extend existing code with more functionality. You can get at any function and easily add your own code to run before/after/around it, or just redefine it to do what you want.
3. Macros make it much easier to do a lot with a little and mean there are far fewer language limitations than in other PLs because one can usually macro around them. E.g. emacs didn’t support lexical scope natively until recently but for a long time supported it with cl macros.
4. Dynamic scope makes global scope local, which means that having lots of globally scoped variables (e.g. for various options) isn't so bad, so there are lots of these, and they allow one to tweak the behaviour of functions by let-binding these variables. (E.g. if you have a function that determines some arguments for a program foo and then runs it, you can run that program remotely by doing (let ((foo-executable "ssh") (foo-other-args (append (list "some-host" foo-executable) foo-other-args))) ...).)
5. Many internal data structures are simple arrangements of lists, symbols, and strings that are easy to inspect and easy to modify. They are so easy to modify that many come with a spec of their structure that is automatically turned into a gui to modify it.
6. Buffer-local variables are a super convenient way of attaching state to a buffer, and a buffer is a super logical place to attach a lot of state to. There are some caveats, e.g. often one wants groups of buffers of some kind, but those cases are much less common than the simple one.
7. Symbols are much better than strings. They are faster, feel more natural, and are much easier to attach things to. It seems weird to imagine JavaScript having some global object mapping nearly every string to an object listing various different properties of that string but symbol plists tend to feel pretty natural and are a handy way to extend things.
8. It’s reasonably easy to not have independent things stepping on each other’s toes. Typically if you can write down a unique symbol for your thing then it won’t really conflict with other things. You can use it for global/buffer-local scope. You can use it for a property in symbol plists without stepping on other properties.
9. There are hooks for extension built in all over the place.
10. Your code gets to do things instantly. With many languages you can’t do the useful thing you wanted to do until you can get the data in/out of it which is often a lot of work. It’s even more work if you want something interactive, and even more work if you want something better than curses. Even for an “easy to get going” language like JavaScript you can’t really build on what’s come before. You typically have to start by building up most of the environment you want to work in by yourself first. In emacs it’s already there and you can use everything that’s there for free. There is no building up your environment because your environment is the one and only environment.
11. Essentially anything you can do with keys is just a regular elisp function called interactively. It is easy to make a function a command by telling it what things you care about which means that most commands correspond to functions that take extra arguments which are useful when calling non-interactively.
12. Nearly everything is native. The things that aren’t in elisp are typically only very fundamental or boring things. And all this code is on your machine, emacs knows where it is and you can read it. If you want to understand something you can jump to definition and have a look. You don’t need to wait for the package maintainer to fix things, just write the fix yourself and stick it in your .emacs. In other editors/systems the bug might be much deeper. You don’t want to have to build everything from source to be able to fix bugs yourself. In emacs you are effectively building everything interesting from source (except without really doing any work most of the time thanks to the abomination of unexec) so it is easy to hack on.
13. Emacs has lots more experimental things than other editors/languages because the users write them (to scratch itches) and accept them. One can wonder why an emacs user might be more likely to accept rough edges than a user of (say) vscode. One could stereotype that the typical emacs user is more used to code that is rough or buggy, or that the typical vs code user is more used to code that doesn’t do anything useful. Or maybe it is just that emacs being rough around the edges is the price one must pay for the smug superiority of using what is (objectively) the best system for hacking for hackers by hackers. One could suggest that some of these arguments also apply to vim users accepting an editor rough around the edges and this is partly true but despite the larger number of people who use vim, there seems to be quite different work done there (typically it’s focused only on editing and much less on big complicated useful things like magit/org/calc). Then again I’m pretty sure writing vimscript is a fate worse than death so that is probably why.
How to Measure Anything is a fantastic book. Here are the most significant insights you learn from the book:
- how to measure anything; Hubbard actually comes through on the promise of the title: after finishing the book you will truly feel that the scope of what you can measure is massive. He does this by changing the definition of what it means to measure something, but you come to realize his definition is more correct than the everyday intuitive one.
- value of information; Hubbard gives a good introduction to the VOI concept in economics, which basically lets you put a price on any measurement or information and prioritize what to measure
- motivation for 'back of the napkin' calcs; through his broad experience he has seen how a lot of the most important things that affect a business go unmeasured, and how his approach to 'measuring anything' can empower people to really measure what matters.
Reading this book provided one half of what I have been searching for for a long time: a framework for thinking about data science activities which is not based on hype, is fundamentally correct, and is still intuitive and practical.
A fun and fairly simple project, with a surprisingly high ratio of usefulness to effort, is to write an interpreter for a concatenative language. Concatenative languages, like FORTH, can do a lot with very limited resources, making them good candidates for embedded systems.
If you want to play around with making your own concatenative language, it is actually surprisingly simple. Here is an overview of a step-by-step approach that can take you from a simple calculator to a full language with some optimization that would actually be quite reasonable to use in an embedded system.
So let's start with the calculator. We are going to have a data stack, and all operations will operate on the stack. We make a "dictionary" whose entries are "words" (basically names of functions). For each word in the dictionary, the dictionary contains a pointer to the function implementing that word.
We'll need six functions for the calculator: add, sub, mul, div, clr, and print. The words for these will be "+", "-", "x", "/", "clr", and "print". So our dictionary looks like this in C:
We need a main loop, which will be something like this (pseudocode):
while true
    token = NextToken()
    if token is in dictionary
        call function from that dict entry
    else if token is a number
        push that number onto the data stack
Write NextToken, making it read from your terminal and parse into whitespace separated strings, implement add, sub, mul, div, clr, and print, with print printing the top item on the data stack on your terminal, and you've got yourself an RPN calculator. Type "2 3 + 4 5 + x print" and you'll get 45.
OK, that's fine, but we want something we can program. To get to that, we first extend the dictionary a bit. We add a flag to each entry allowing us to mark the entry as either a C code entry or an interpreted entry, and we add a pointer to an array of integers, and we add a count telling the length of that array of integers. When an entry is marked as C code, it means that the function implementing it is written in C, and the "func" field in the dictionary points to the implementing function. When an entry is marked as interpreted, it means that the pointer to an array of integers points to a list of dictionary offsets, and the function is implemented by invoking the functions of the referenced dictionary entries, in order.
A dictionary entry now looks something like this:
struct DictEntry {
    char *word;            /* the name of the word                  */
    bool c_flag;           /* true if implemented in C              */
    void (*func)(void);    /* the C function, when c_flag is true   */
    int *def;              /* array of dictionary offsets otherwise */
    int deflen;            /* length of def                         */
};
It’s also not absolutely necessary to use Bayesian reasoning to come up with a concrete final answer; it can instead allow for scientific-style “we have a significant discovery, but it may not change our view from the status quo just yet” publications: significant changes in probability which nevertheless do not make unlikely things likely or vice versa.
So there is a very convenient way to view Bayes’ theorem by looking at odds, Pr(A)/Pr(A'), where A' is the complement of A (the event which obtains precisely when A does not); these are equivalent to probabilities but span the range 0-∞ rather than the range 0-1:
Pr(A|E)/Pr(A'|E) = [Pr(E|A)/Pr(E|A')] × [Pr(A)/Pr(A')]
In other words the odds after obtaining some evidence E are a multiplicative factor times the prior odds.
Taking the logarithm, one essentially can measure the impact of any new evidence in decibels as a linear result; the core of Bayes’ theorem can be stated as saying that evidence is measured in decibels.
For example, in the breast cancer example in the original article, you have +13 dB of evidence (Pr(E|A)/Pr(E|A') = 20), but the prior log-odds are given as -20 dB, and thus the shocking conclusion is that after the evidence your log-odds are still negative, -7 dB, and you likely do not have cancer. For a life-or-death situation, resolving the relative nature of evidence to a specific probability is clearly essential.
But in terms of p-values, if you have an experiment which gives +20dB evidence to something, I would say that this should still be meaningfully publishable as a relative increment even if it hits something whose prior is -30dB and therefore it still is only a ~1/10 chance that the thing is actually true.
Like, I think the key is having a vocabulary for relative results versus absolute results, and I think the Bayesian language can be used to take one step towards that. But perhaps it would be nice if we had a standardized way in our language to indicate that numbers are relative rather than absolute, the way we habitually do with decibels and pressures and such...