Hacker News | TomLimoncelli's comments

I'm not sure how email would help. I guess you mean the system would email you once a day saying that the monitoring system is working and if you don't see the email you know to check into it. The monitoring system I use has a higher SLA than 24 hours.

Usually folks split the monitoring work between two servers and have each server monitor the other. Or you "meta monitor": run a monitoring system that just monitors the monitoring system. Then you get a third party to monitor that. Then it is turtles all the way down.
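To make the "each server monitors the other" idea concrete, here is a minimal sketch in Python. It assumes each monitor records a heartbeat timestamp somewhere its peer can read; the names (monitor-a, monitor-b, stale_monitors) and the 5-minute threshold are made up for illustration and don't come from any particular monitoring product.

```python
from datetime import datetime, timedelta


def stale_monitors(last_heartbeat, now, max_age):
    """Return the names of monitors whose last heartbeat is older than max_age."""
    return sorted(
        name for name, seen in last_heartbeat.items() if now - seen > max_age
    )


# Hypothetical data: monitor-a checked in 30 seconds ago, monitor-b 10 minutes ago.
now = datetime(2011, 1, 1, 12, 0, 0)
heartbeats = {
    "monitor-a": now - timedelta(seconds=30),
    "monitor-b": now - timedelta(minutes=10),
}

# Each monitor would run this check against its peer and page a human on a hit.
overdue = stale_monitors(heartbeats, now, timedelta(minutes=5))
print(overdue)
```

The point of checking for a missing heartbeat, rather than waiting for an error message, is that a dead monitoring server can't report its own death.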


It sounds like both of you are talking about cloud stuff when thinking about datacenters. I can see how that many layers and cross-checks would be necessary when all you really control is running memory and some pieces of storage, but a lot of that is due to the platform. When you control the actual metal, third-party monitoring services are much less necessary.

For a real DC, when it goes down I get a phone call from a human. I don't have to reinvent that process. If it's my own server room, I use a landline and a modem for OOB "dude you gotta come down here" notifications, a WAV of Woody Woodpecker or something. If the phone lines are down, I look at the newspaper headlines to see what happened.

There's no reason not to set up a standalone monitoring regime. Whether or not you use heartbeat notifications to tell you all is well is a matter of taste, but there is definitely more to maintaining your nines on a daily basis than simply adding more layers of monitoring.


"Sysadmins and devs shouldn't really care if they have five generations of hardware."

I wish it were true. Sadly, there are some services where scale and latency are so carefully measured that individual software releases are rejected if performance gets worse (or unacceptably worse, etc). In these situations you need to test on all hardware platforms. It is much better to have fewer platforms. Optimally, two: the one you are migrating off of and the one you are moving to.
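A rough sketch of what such a release gate could look like, in Python. The platform names, latency numbers, and the 5% tolerance are all invented for illustration; a real gate would pull measurements from whatever benchmark harness the service uses.

```python
def regressions(baseline_ms, candidate_ms, tolerance=0.05):
    """Return platforms where the candidate's latency exceeds baseline by more than tolerance."""
    failed = {}
    for platform, base in baseline_ms.items():
        new = candidate_ms[platform]
        if new > base * (1 + tolerance):
            failed[platform] = (base, new)
    return failed


# Hypothetical latency measurements (milliseconds) per hardware generation.
baseline = {"gen1": 100.0, "gen2": 80.0}
candidate = {"gen1": 103.0, "gen2": 90.0}

failed = regressions(baseline, candidate)
release_ok = not failed  # the release is rejected if any platform regressed
print(failed, release_ok)
```

Every extra hardware generation adds another entry to both dictionaries and another test run per release, which is why fewer platforms is better.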

For desktops... have you ever tried to maintain a Windows or Linux desktop environment with more than 4 "standard desktop configurations"? It becomes a nightmare. If you have a single "gold image" you blast to all machines it makes the task harder; if you stay with the vendor's OS and try to maintain it "forever" it is even worse.

One thing that makes virtualization a "win" is that the virtual box looks like a single hardware platform. It reduces testing, etc. However, then you still need to test the virtualization software on all hardware platforms... so you've made things easier for everyone but that team.


Yes, very constructive.

A more detailed reply: 3) Really? The core of devops is to be data-driven in your decisions. How can you decide if you are maintaining the right uptime if you don't measure it?

8) I agree with you, as does the text. I think you may have reversed what I wrote.

9) The good ones write so that they "think before they do". On a larger team it is important to communicate what you are about to do, or what you have done. I prefer to write mini design docs. The team I'm on does this and I like it so much I want to spread the word.

11: Again, the team I'm on does this and it works so well I want to spread the word. I see I need to expand this out to explain why, not just how.

14: I'll clarify that the point is not to make big changes on your production system. It can be qa+live or a zillion steps including dev, qa, UAT, pre-prod, canary and prod, as long as it isn't zero steps.

22: refresh policy: This is for PCs (non-servers). I'll clarify.

23: The last part makes your point. I'll rewrite to make it more evident.

28: "Anti-malware? Really? In 2011? I'm sure you have a whole blog post". Yes, I do: http://everythingsysadmin.com/2011/04/apt.html Thanks for the reminder to add a link! (and if you are blown away that I had to list this, you can imagine my surprise about finding sites that violated this one!)
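Going back to point 3, here is a small Python sketch of what "measure it" can mean in practice: turn a list of outage durations into an availability number and compare it to a target. The 30-day window, the outage list, and the 99.9% target are assumptions chosen only for illustration.

```python
def availability(total_minutes, outage_minutes):
    """Fraction of the period the service was up, given outage durations in minutes."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 1 - sum(outage_minutes) / total_minutes


# Hypothetical month: 30 days, with two outages of 12 and 30 minutes.
month = 30 * 24 * 60
outages = [12, 30]

measured = availability(month, outages)
target = 0.999  # "three nines", an example target
meets_target = measured >= target
print(f"{measured:.5f}", meets_target)
```

Without the outage log there is no number to compare, and no way to tell whether you are hitting the target or just hoping.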


I dunno, that felt like bragging.

By the way... I'll be teaching a class based on The Test at Usenix LISA in December.

Also... I'm working on an edited/corrected update to The Test and I appreciate all the feedback I'm getting here on HN!


I read it this January after hearing it mentioned so many times. I was very impressed. My impressions were: 1. Ah, that's what it would have been like to attend a GOOD university! 2. Much of the systems knowledge I learned the hard way is clearly explained here. This would have jumpstarted my career by 10 years. 3. No current first-year student would sit through this. Though, it should be required reading for seniors.


Yes, someone may have joined the network at your old IP address but that's ok. That first ARP is going to determine if that has happened already.

Am I right?

The author should try the same ethernet sniffing experiment but put a machine at the old address and see how the algorithm adapts.


OH MY GOD! THIS GUY WANTS ME TO CHANGE THE WAY I WORK!

I'VE BEEN WORKING THE SAME WAY FOR 20 YEARS

YOU CAN'T MAKE ME CHANGE!

CHANGE SCARES ME!

20 YEARS AGO MY TERMINAL DIDN'T HAVE LOWERCASE, COLOR, OR WINDOWS. WHY HE WANT ME TO CHANGE!!?!?

(Damn, I've never seen a group of people so change-averse and closed-minded about trying new things as the software developers that are in the business of making new stuff! Geez! Download it and give it a try before you post a comment!)


As the co-author of TPoSaNA I am flattered but for the question at hand, I'd recommend a different book. Time Management for System Administrators from O'Reilly is more in line with what he's looking for. http://www.tomontime.com/ and http://www.amazon.com/o/ASIN/0596007833/tomontime-20 for more info. (Of course, since I wrote this other book, I don't mind recommending it :-)

--Tom


I worked at AT&T / Bell Labs when those commercials were flooding every show on TV.

Bell Labs was upset because AT&T made those commercials without consulting us. A PR company thought up all the ideas. We had zero projects internally working on such products.

That's why those products came from EZPass, not AT&T; Skype, not AT&T; Apple iPad, not AT&T.

Even though it wasn't AT&T that brought those things to market, nearly all of them do exist today. It is a beautiful thing. I feel lucky to be living in what my co-workers call 'The freakin Buck-Rogers-would-be-jealous future'.

Tom

PS. Oh, and the one where the guy has a Dick Tracy-style video-phone on his watch? Well, soon after AT&T bought McCaw Cellular my friends in the cellular phone communications research area were asked to work up an explanation of why such a thing can't exist and isn't likely to exist any time soon. It turns out that after AT&T bought McCaw they (McCaw) were very unhappy to learn that the wrist-phone was a figment of a marketing person's imagination and not something actually being developed at Bell Labs. Ooops. I hope they didn't let themselves get bought by AT&T just because they thought we had that product in the wings.


Tom: I was also at the Labs (Research @ MH) when these spots were made.

We were consulted by the PR people before they were shot.

As far as I know, most "predictions" in those spots were based on real technologies and demos that were running in the Labs at that time. My dept was directly responsible for two of them.

I do not recall the Dick Tracy watch but the projects behind the books on-line, video-on-demand and "EZ Pass" were being done at HO and MH.

Others were products or concepts in the pipeline (e.g. fax from a tablet -- remember GO/EO?, tickets from a cash machine -- NCR ATMs).

Some things shown were straight line extrapolations from core technologies that had existed in the Labs for some time (e.g. driving across the country without needing a map, video telephony, packet voice/video, etc.)

Much of the really interesting work from Research at the Labs never made it into real products from AT&T due to various political, business and regulatory issues.

Many things later got "reinvented" by other firms that were better able to capitalize on the innovation.

It has been said that any company that can afford an organization like Bell Labs Research will ignore it.

That was definitely the case for Bell Labs, Xerox PARC, etc.


> That's why those products came from EZPass, not AT&T; Skype, not AT&T; Apple iPad, not AT&T.

Fascinating explanation. I never understood the relationships between most of the ideas in those commercials and AT&T. It's actually awesome to see these commercials again and realize how many far-fetched ideas have come true. I remember thinking that most of the things in the commercial were nonsense!

Looking back, it's more interesting how little AT&T has had to do with any of those things. I wonder how influential these commercials were on the engineers who actually built those things (EZPass, Skype, iPad). Or is the presence of these technologies an inevitable extension of the direction technology was going to go anyway?


I was at a distance learning software company that had AT&T as a client. I always wondered if there was a connection to the commercial. We were connecting classrooms with video and shared multimedia content. I even made a "Jazz" course but I think it was later.

Around 1998, "The Internet" changed our market so we rushed to rewrite the software to work on it. Up until that point, it wasn't really on our radar.


The zillions of little companies with this kind of problem disagree.


It's truly common sense for anyone that has racked a server to consider heating conditions in a room. If not, they shouldn't be allowed to rack anything!

