Our success rate for extracting the reference information from papers in some categories, notably math and cs, is still pretty low. This means we can't place these papers very well, and also that their radius is not a good representation of their true number of citations. Furthermore, from a distance these categories appear to lack structure. If we don't have references for a paper, we use its keywords to place it, in which case it can certainly be misplaced. Improving the reference extraction means building a better database of the journals used in each field, coupled with more robust regular expressions.
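To give a rough idea of what "a journal database coupled with regex" can look like, here is a minimal sketch. It is not our actual extraction code: the `JOURNALS` table, the function names and the assumed citation layout (journal, volume, optional year in parentheses, page) are all made up for illustration.

```python
import re

# Hypothetical journal table: canonical name -> spellings seen in reference lists.
JOURNALS = {
    "Phys. Rev. D": ["Phys. Rev. D", "PRD"],
    "J. Math. Phys.": ["J. Math. Phys.", "JMP"],
}

def _squash(text):
    """Normalise a journal string by dropping whitespace and case."""
    return re.sub(r"\s+", "", text).lower()

# Map every normalised spelling back to its canonical journal name.
LOOKUP = {_squash(v): canon for canon, vs in JOURNALS.items() for v in vs}

def _variant_regex(variant):
    """Escape a spelling and make the spaces between its tokens optional."""
    return r"\s*".join(re.escape(tok) for tok in variant.split())

# Longest spellings first, so a longer name wins over any shorter alternative.
_ALTS = sorted((v for vs in JOURNALS.values() for v in vs), key=len, reverse=True)

REF_RE = re.compile(
    r"\b(?P<journal>" + "|".join(_variant_regex(v) for v in _ALTS) + r")"
    r"\s*(?P<volume>\d+)\s*[,:]?\s*"
    r"(?:\((?P<year>\d{4})\)\s*)?"
    r"(?P<page>\d+)",
    re.IGNORECASE,
)

def extract_refs(text):
    """Return a list of dicts, one per journal reference found in text."""
    refs = []
    for m in REF_RE.finditer(text):
        refs.append({
            "journal": LOOKUP[_squash(m.group("journal"))],
            "volume": m.group("volume"),
            "year": m.group("year"),  # None when the reference gives no year
            "page": m.group("page"),
        })
    return refs
```

Every journal missing from the table is a reference that silently goes unmatched, which is why fields with many journals we haven't catalogued yet do badly.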
Thanks, good point, the colour-key box should change when the "age" colour scheme is selected. For now: the newest papers are red and the oldest papers are gray, and, as you may have noticed, the gradient from gray to red is one-to-one with age but not linear. For a bit more of an explanation see this post: http://blog.paperscape.org/?p=60
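As a toy illustration of a mapping that is one-to-one but not linear, here is a small sketch. The power-law curve, the `gamma` parameter and the exact gray and red values are assumptions chosen for the example; the linked post describes what the map really does.

```python
def age_colour(age_fraction, gamma=0.5):
    """Map age_fraction in [0, 1] (0 = newest, 1 = oldest) to an (r, g, b) tuple.

    The power-law curve and the gamma value are illustrative assumptions.
    """
    if not 0.0 <= age_fraction <= 1.0:
        raise ValueError("age_fraction must lie in [0, 1]")
    # t is 1 for the newest paper and 0 for the oldest; with gamma < 1 the
    # colour stays close to red for longer than a straight line would.
    t = (1.0 - age_fraction) ** gamma
    gray = (128, 128, 128)
    red = (255, 0, 0)
    # Interpolate each channel; rounding to integers can merge very close ages.
    return tuple(round(g + t * (r - g)) for g, r in zip(gray, red))
```

With `gamma=0.5` a paper of middling age is still noticeably redder than the linear midpoint between gray and red, while the ordering of colours still follows the ordering of ages.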
We use HTML5 canvas elements to draw pregenerated tiles (bitmaps) together with interactive overlays and underlays, such as the outlines and halos you see when you click on or search for papers. The tiles are redrawn each morning (using cairo) after the map has been updated with the new arXiv papers for that day.
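For the curious, the bookkeeping behind a tiled map is simple. The sketch below works out which pregenerated tiles overlap the current view; it is written in Python only for illustration (the browser side is JavaScript), and the function name, the 256-pixel tile size and the coordinate convention are assumptions, not our actual code.

```python
import math

def visible_tiles(view_x, view_y, view_w, view_h, tile_size=256):
    """Return (col, row) indices of the tiles that overlap the viewport.

    All arguments are in map pixels at the current zoom level; the
    tile_size of 256 is an assumed value.
    """
    first_col = math.floor(view_x / tile_size)
    first_row = math.floor(view_y / tile_size)
    # Subtract 1 so that a viewport ending exactly on a tile edge does not
    # pull in the neighbouring tile.
    last_col = math.ceil((view_x + view_w) / tile_size) - 1
    last_row = math.ceil((view_y + view_h) / tile_size) - 1
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Only the tiles in that list need to be fetched and drawn; the overlays and underlays are then drawn separately, so clicking on a paper does not require redrawing any tiles.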
Is there any long-range order? For example, am I allowed to deduce from the layout that mathematical physics is "the" bridge between condensed matter and high energy physics? Would I also be able to use this graph to plan my career (e.g. working on topics located at strategic points)?
The interface between hep-th and cond-mat is quite a mixed bag, and math-ph is quite diffuse (the label just represents its centre of mass so to speak). So I'm not sure if I would call math-ph the bridge between the two, but the map is based on simple principles/forces so it's very open to interpretation.
I would say that hep-th is the bridge between a lot of the other categories, namely hep-ph, astro-ph, gr-qc, quant-ph, cond-mat and math-ph.
Also, the various interfaces between different categories are examples of long-range structure. For example, the field of dark matter is located where hep-ph meets astro-ph, and the field of cosmological inflation occurs at the interface of hep-th, astro-ph and gr-qc.