Hacker News | andikleen3's comments

An eternity ago a particular Linux kernel developer gave a series of "Why user space sucks" talks at Linux conferences. It was just going through strace logs and making fun of particularly bad examples.

Here's one instance: https://www.kernel.org/doc/ols/2006/ols2006v1-pages-441-450....

At some point I stared a lot at instruction-level traces from Processor Trace or LBRs, and in many programs I usually found some bad sequences where things were implemented very inefficiently.


That talk is brilliant, and unfortunately it matches my experience.


Do you know if there are videos of those talks posted anywhere?



You just have to be very careful with the algorithms in that paper; they had some serious problems (apart from their basic inability to deal with faster links). I like this old but fairly damning analysis from an early Linux TCP developer:

https://ftp.gwdg.de/pub/linux/tux/net/ip-routing/README.rto


I can confirm the issues with formal methods.

I was working on a new type of locking mechanism and thought I would be smart by modelling it in spin [http://spinroot.com], which has been used for this kind of thing before.

I ended up with a model that was proven in spin, but still failed in real code.

Granted, that's anecdata with a sample size of 1, but it was still a valuable experience for me.


I maintain a tool that reports errors on Linux systems with ECC memory, and also a website describing the tool (http://mcelog.org). Some time ago I wrote on my blog about time-correlated patterns in the website's access logs:

http://halobates.de/blog/p/181


> I wonder if it’s possible to detect solar flares in these logs. Need to find a good data source for them. Are there any other events generating lots of radiation that may affect servers?

Cosmic ray showers [1], potentially. My prediction is that there should be "clusters" of bit errors every few minutes or so (given a sufficient number of servers).

I was talking to a sysadmin at a local company running a few thousand servers about getting their ECC error logs in order to look for these, but scraping them apparently wasn't trivially manageable for them.

[1] - https://en.wikipedia.org/wiki/Air_shower_(physics)
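
If someone did get hold of such logs, looking for these "clusters" could be sketched roughly like this. It is only an illustration: the input format (pairs of timestamp in seconds and server id), the function name find_error_clusters, the one-second window, and the three-server threshold are all assumptions made up for the example, not anything taken from a real log scraper.

```python
from collections import defaultdict


def find_error_clusters(events, window_s=1.0, min_servers=3):
    """Group ECC error events into fixed time windows and report the
    windows in which several distinct servers saw an error.

    events: iterable of (timestamp_seconds, server_id) pairs
            (hypothetical format, assumed for this sketch).
    Returns a list of (window_start_seconds, sorted_server_ids).
    """
    buckets = defaultdict(set)
    for ts, server in events:
        # Fixed-size buckets are crude: a burst that straddles a window
        # boundary gets split in two. Good enough for a first look.
        buckets[int(ts // window_s)].add(server)

    return [
        (idx * window_s, sorted(servers))
        for idx, servers in sorted(buckets.items())
        if len(servers) >= min_servers
    ]


# Made-up sample data: three servers hit within the same second,
# plus two unrelated single errors much later.
sample = [(10.1, "a"), (10.4, "b"), (10.9, "c"), (500.0, "a"), (900.2, "d")]
clusters = find_error_clusters(sample)  # [(10.0, ['a', 'b', 'c'])]
```

Counting distinct servers rather than raw events matters here, because a single failing DIMM can log many errors in a row and would otherwise look like a shower.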

