Hacker News | bcleary's comments

I think JavaScript and jQuery suffer from their own popularity and utility in this analysis. There is also the issue of people using the jQuery and JavaScript tags as synonyms. jQuery in particular is a 500 pound gorilla over the whole dataset; it shows up everywhere. As I mentioned above, there was a joke on SO at one point that the answer to nearly every question was jQuery, with a link to the site. Apparently it wasn't a joke.


No, just the post bodies (questions and answers) at the moment, but we are working on parsing the comments and the post history. Those datasets are about 4 times the size of the posts! So there are probably a lot more URLs in there, although we will have to decide whether to treat all URLs the same or to differentiate between URLs in post bodies contained in the dump, URLs in the post history (which may have been removed from the post), and URLs in the comments. Maybe not a concern for the website, but more so for research.
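
To make the "treat all URLs the same or differentiate" question concrete, here is a minimal sketch of tagging each extracted URL with where it came from. This is not the site's actual pipeline (which is C#); the regex, the function names, and the source labels `post_body`, `post_history`, and `comment` are all assumptions for illustration.

```python
import re
from collections import Counter

# Deliberately simple URL pattern (an assumption, not a full URL grammar):
# it stops at whitespace, quotes, angle brackets, and closing brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+")


def extract_urls(text, source):
    """Return (url, source) pairs for every URL found in text."""
    return [(url.rstrip(".,;:"), source) for url in URL_RE.findall(text)]


def count_by_source(records):
    """Count citations per URL, kept separate for each source type.

    records is an iterable of (text, source) pairs, where source is a
    hypothetical label such as "post_body", "post_history", or "comment".
    """
    counts = {}
    for text, source in records:
        for url, src in extract_urls(text, source):
            counts.setdefault(url, Counter())[src] += 1
    return counts


# Made-up sample records, one per source type.
records = [
    ('<p>See <a href="http://jquery.com/">jQuery</a>.</p>', "post_body"),
    ("Try http://jquery.com/ first.", "comment"),
    ("Old revision linked http://example.com/removed", "post_history"),
]
counts = count_by_source(records)
```

Keeping the per-source counts separate means the website could still sum them into one number, while research could look at each source on its own.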


You usually see notes or clarifications posted to questions as comments first, where the links tend to be more introductory and general-purpose than the specific ones you might find in answers.

Can't wait to see the updated stats.

If you could make "more" load more than just a few more records, though, that'd make it a lot easier to dig deeper.


Thanks. Yeah, I agree. Our initial use case was to provide an interface to the dataset and to let us explore what kinds of things developers were sharing on SO. For the next version we are working on new ranking metrics that will improve the discovery aspect; vote:cite and view:cite are two we are looking at.
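
The exact definitions of vote:cite and view:cite are not spelled out here, so the following is only a sketch under one plausible reading: for each URL, the total score (or views) of the posts citing it, divided by the number of citations. The `Citation` type, its field names, and `cite_ratios` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """One occurrence of a URL in a post (hypothetical record shape)."""
    url: str
    post_score: int  # net votes on the citing post
    post_views: int  # view count of the citing post


def cite_ratios(citations):
    """Compute assumed vote:cite and view:cite ratios per URL."""
    totals = {}
    for c in citations:
        t = totals.setdefault(c.url, {"cites": 0, "votes": 0, "views": 0})
        t["cites"] += 1
        t["votes"] += c.post_score
        t["views"] += c.post_views
    return {
        url: {
            "vote_cite": t["votes"] / t["cites"],
            "view_cite": t["views"] / t["cites"],
        }
        for url, t in totals.items()
    }


# Made-up sample data for illustration only.
sample = [
    Citation("http://example.com/a", post_score=10, post_views=100),
    Citation("http://example.com/a", post_score=20, post_views=300),
    Citation("http://example.com/b", post_score=3, post_views=50),
]
ratios = cite_ratios(sample)
```

Under this reading, a link cited a few times in highly voted posts can outrank one that is cited often in low-scoring posts, which is the kind of discovery a raw citation count misses.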


Looking forward to v2!


Wow, OK, looking into that now. What version of iOS?


The latest, iOS 6.1.3


Thanks, yeah, it's the second most requested feature after the domain filter. Will hopefully add it soon.


Thanks, yes we are actually working on a paper to that effect at the moment.


Thanks. Yes, the domain filter is a great idea; it will go to the top of the feature request list.

The site is C#, ASP.NET MVC, with a JavaScript front end, backed by SQL Server 2012, and running on AWS.


Thanks for the support. Yes to everything above :) We are currently mining the post history data to be able to do those kinds of time range queries. Can't wait to get that out there; it should be very cool indeed. We also want to allow users to search by their SO id and to filter their links by tag. (As an aside, when we do mine the history we will be able to get more accuracy on which users actually posted which links, rather than just the post owner.)


So these are taken from the March 2013 data dump, which includes questions going right back to the start of SO, so some of these links have been collecting citations for a few years. We only mined the actual post content as of the date the dump was created; we did not mine the post history. But we are working on that right now. It's a lot of data to process :)


I actually had that in an earlier version and took it out just to simplify the design, but I am looking at it again to produce a better sorting experience. The number of views a post receives may also be a good metric.

