
I work in the entertainment / ticketing industry and we've been burned badly before by relying on AWS' Elastic Load Balancer due to sudden & unexpected traffic spikes.

From the article: "Elastic Load Balancer (ELB): [...] It scales without your doing anything. If it sees additional traffic it scales behind the scenes both horizontally and vertically. You don’t have to manage it. As your applications scales so is the ELB."

From Amazon's ELB documentation: "Pre-Warming the Load Balancer: [...] In certain scenarios, such as when flash traffic is expected [...] we recommend that you contact us to have your load balancer "pre-warmed". We will then configure the load balancer to have the appropriate level of capacity based on the traffic that you expect. We will need to know the start and end dates of your tests or expected flash traffic, the expected request rate per second and the total size of the typical request/response that you will be testing."


You'd be surprised how many people don't know this. I expected to scale past 1B users. I was trialling AWS when I realised through testing that this was the case: it could not deal with sudden spikes of traffic.

Suffice it to say, I went elsewhere.


A billion users? Are you Facebook or the Olympics?


Neither. But once you start doing something like serving ads, the paradigm shifts. Of course, what I do is a lot more intensive/complex, but I'll say this to get the basics across.


It doesn't take Facebook. I'm in a small adtech company. Tens of billions of requests a month is not unexpected.


> Tens of billions of requests a month is not unexpected.

10,000,000,000 / (60 * 60 * 24 * 30) ≈ 3,858 req/sec. That's a pretty good clip.
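A quick back-of-the-envelope check of these rates (the totals are the ones quoted in this thread; "tens of billions" is taken as a flat 10B/month):

```python
# Back-of-the-envelope request rates from the totals mentioned in this thread.

SECONDS_PER_DAY = 60 * 60 * 24

def req_per_sec(total_requests, days):
    """Average requests per second over the given number of days."""
    return total_requests / (days * SECONDS_PER_DAY)

monthly = req_per_sec(10_000_000_000, 30)  # "tens of billions a month"
daily = req_per_sec(50_000_000_000, 1)     # "over 50B/daily"

print(round(monthly))  # ≈ 3,858 req/sec average
print(round(daily))    # ≈ 578,704 req/sec average
```

These are averages, though; flash-traffic peaks sit far above them, which is exactly why pre-warming comes up.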


That's a small adtech company. The larger ones do that per day, with some over 50B daily.


Yep. I spent some time working for one of the largest.


We see 10,000 req/sec on a regular basis.


It's not always _users_, but requests. As companies embrace microservices, I think you'll see moderately sized applications pushing tons of requests over HTTP that would normally have used a different protocol.


Where did you go, if you don't mind expanding?


I don't mind. I went with dedicated hosting. I found a supplier which had their own scalable infrastructure. They already had clients with ad-server-type applications that scaled into the billions and could handle traffic spikes. With that type of setup, it was a no-brainer.

I'm a sysadmin with over 10 years with Linux, so setting up and supporting servers is pretty trivial for me.

Under the agreement I had with the supplier, they managed the network and hardware 24/7, and I managed the setup and support of the servers from the OS up. This arrangement worked well and I had zero downtime.


> I went with dedicated hosting

This doesn't get mentioned as much as it should but there are VPS/dedicated providers who are very close to AWS DCs.

Enough so that for many use cases you can keep your database in AWS and your app servers on dedicated hardware. Best of both worlds.


Can you share a list of providers that are close to AWS DCs?


Pretty much any data center in Virginia will be close to US-EAST. If you contact them about setting up Direct Connect pipes, they'll also provide you with a list of locations to check out.


You'll have to compare regions depending on providers. Softlayer has pretty good coverage with matching regions and low latency.


  I don't mind. I went with self hosting. I found a supplier 
  which had their own scalable infrastructure.
That's a little vague. By "self-hosting" you mean Linux VMs, like EC2, right, or something more abstracted than that? What supplier?


Sorry, I just updated the post. I meant dedicated hosting, i.e. bare-metal machines.

If you want to know the supplier: they are called Mojohost.

http://www.mojohost.com/


When you need performance, bare metal is always the way to go.


This saying holds little value for many engineers. They want uptime, ease of management, and security.

Most people aren't worried about squeezing another 3% performance out of their servers. In fact I would say the slice-and-dice nature of VMs allows for better overall capacity usage because of overprovisioning of resources. How many apps do you know that hover at 0.07 load all day long?


Okay, how's this:

"If you're willing to pay up to a 40% premium for the features cloud providers provide, pay them. If not, go bare metal."


Fair enough.


All they say is that it costs $125. $125 for what? They don't mention the hardware specs on their website.


If you hadn't been a sysadmin, would you still have chosen dedicated hosting? (Given that you have serious scaling requirements, of course.) In other words: would it be realistic to say that a service like Elastic Beanstalk saves on hiring a sysadmin?


Sysadmins / operations people should be able to handle anything below the OS better than the usual devops folks who could build you a variation of Elastic Beanstalk. Their value further depends on whether your software has special needs that are not suitable for cloud / virtualized infrastructure.

I've heard of many start-up companies saving plenty of money using dedicated hosting, even without any operations / sysadmin pros around, while scaling to millions of users, where the equivalent AWS setup with relatively anemic nodes would have fared much worse. In fact, WhatsApp only had a handful of physical servers handling billions of real users and associated internal messaging, and they had developers as the on-call operations engineers.

I'm an ops engineer / developer and I'd use dedicated hosting if success depends a lot upon infrastructure costs. For example, if I started a competitor to Heroku at the same time they did, I'd definitely be having a very careful debate between dedicated / colo hosting and using a cloud provider tied intimately with my growth plans. Many companies have shockingly bad operations practices but achieve decent availability (and more importantly for most situations, profitability) just fine, so even the often-cited expectations of better networks and availability zones may be worth the risks of not caring that much.


We went to Softlayer with their smallest instances running Nginx to load balance everything. Much faster and cheaper.


Why in the world would you assume any off-the-shelf solution would serve a billion users?

Unlike many cloud providers, AWS can be set up to serve a billion requests, but you need to think that mess out from start to end. You can't set up an ELB, turn on autoscaling, and then go out to lunch.


Why not? That's exactly the use case, if you don't need to pre-warm for bursty loads. It'll just be extremely expensive.

Also, as another comment here says, I believe a billion "users" is more like "requests", since "users" is vague and undefined. A single person could launch 1 or 100 requests depending on the app.


What other vendor did you go with and now looking back was it worth it from a cost & operational perspective?

Why not work with AWS to mitigate such risks now that you know more about ELBs?


This might be of interest, Netflix pre-scales based on anticipated demand: http://techblog.netflix.com/2013/11/scryer-netflixs-predicti...


After testing ELB and seeing the scaling issues, we ended up going to a pool of HAProxies + weighted Route53 entries. Route53 does a moderately good job of balancing between the HAProxies, and the health checks will remove an HAProxy if it goes bad. HAProxy itself is rock solid. The first bottleneck we came across was HAProxy bandwidth, so make sure the instance type you select has enough for how much bandwidth you expect to use.
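A sketch of how that setup behaves, with hypothetical weights and IPs (Route53 does the weighting and health-checking server-side; this just simulates the resulting answer distribution):

```python
import random
from collections import Counter

# Hypothetical pool of HAProxy instances with Route53-style weights.
# An instance failing its health check is dropped from the answer set.
pool = {
    "10.0.0.1": {"weight": 10, "healthy": True},
    "10.0.0.2": {"weight": 10, "healthy": True},
    "10.0.0.3": {"weight": 10, "healthy": False},  # failed health check
}

def resolve(pool):
    """Pick one healthy HAProxy, weighted like a Route53 weighted record set."""
    healthy = {ip: cfg for ip, cfg in pool.items() if cfg["healthy"]}
    ips = list(healthy)
    weights = [healthy[ip]["weight"] for ip in ips]
    return random.choices(ips, weights=weights, k=1)[0]

# Simulate 10,000 lookups: the unhealthy instance never answers,
# the healthy ones split the traffic roughly per their weights.
hits = Counter(resolve(pool) for _ in range(10_000))
print(hits)
```

Unequal weights are handy for draining one HAProxy gradually before maintenance rather than yanking it out of DNS all at once.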


Do health checks work within a VPC? My understanding was they don't, so this only works for externally facing services.

I agree Haproxy is solid, but ELBs are wonderful for internal microservices.

If you do decide to use Haproxy for microservices internally, I highly recommend Synapse from AirBnB: https://github.com/airbnb/synapse


Ruby, High Availability, and High Scalability? Despite idempotency, I'm not sure how comfortable I am with that.


Synapse is a service discovery framework. Essentially, it just writes HAProxy config files based on discovered upstreams - it does not receive any requests itself. The scalability is handled by HAProxy.
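The core of what Synapse does can be sketched in a few lines: discovered upstreams in, an HAProxy backend stanza out (the service name, hosts, and ports here are made up for illustration):

```python
def render_backend(service, upstreams):
    """Render a minimal HAProxy backend stanza from discovered upstreams."""
    lines = [f"backend {service}", "    balance roundrobin"]
    for i, (host, port) in enumerate(upstreams):
        lines.append(f"    server {service}-{i} {host}:{port} check")
    return "\n".join(lines)

# Pretend these came from a service registry (ZooKeeper, in Synapse's case).
cfg = render_backend("users-api", [("10.0.1.5", 8080), ("10.0.1.6", 8080)])
print(cfg)
```

The real Synapse watches the registry for changes and reloads HAProxy when the rendered config differs, but the request path itself is all HAProxy.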


I was under the impression that HAProxy is what powers Amazon's ELB service.


I wish Amazon would switch to a 'provisioned throughput' model for ELB like they have for DynamoDB, where you say what level of throughput you want to support and you're billed on that rather than actual traffic. Then they keep sufficient capacity available to support that service level.

So if you expect flash traffic, you just bump up your provisioned throughput. Simple and transparent.


You can contact AWS support if needed, and they'll warm up the ELB ahead of time.

http://serverfault.com/a/321371

https://forums.aws.amazon.com/thread.jspa?threadID=76834

It's not perfect, but works in a pinch.


That would be a very cool offering.


Another gotcha is that ELB appears to load balance based on the IP addresses of the requests... We had private VPC/IP workers sending hundreds of requests per second to a non-sticky-session, public ELB-fronted service (... don't ask why ...) and experienced really strange performance problems. Latency. Errors. What? Deployed a second, private ELB fronting the same service and pointed the workers at it. No more latency. No more errors.

The issue appeared to be that the private-IP workers would all transit the NAT box to reach the public service, and the ELB seemed to act strangely when 99.99% of the traffic was coming from one IP address. The private ELB saw requests from each worker's individual IP address and behaved much better. Or something.
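One plausible model of what was happening (an assumption, not ELB's documented algorithm): if a balancer keys its choice on source IP, every request funneled through one NAT address lands on the same backend, while distinct private IPs spread out:

```python
import hashlib

# Illustration only: a balancer that picks a backend by hashing source IP.
backends = ["backend-a", "backend-b", "backend-c"]

def pick_backend(source_ip):
    """Stable backend choice keyed on the client's source IP."""
    digest = hashlib.md5(source_ip.encode()).digest()
    return backends[digest[0] % len(backends)]

# All workers behind one NAT share a source IP -> they all hit one backend.
nat_choices = {pick_backend("203.0.113.7") for _ in range(1000)}
assert len(nat_choices) == 1

# Workers hitting an internal ELB present their own IPs -> load spreads out.
spread = {pick_backend(f"10.0.0.{i}") for i in range(100)}
print(sorted(spread))
```

Under this model the second, private ELB helps precisely because it sees many distinct source IPs instead of one.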


ELBs are one of the biggest known weaknesses of AWS...

Their whole position on them is super opaque, and pre-warming is still an issue.

I'll write more about this later, but so many people have had outages due to AWS' inability to properly size these things.


I went to a meetup about 2 years ago and one of the engineers from CloudMine gave a talk about load balancing on AWS. CloudMine ended up dumping ELB for HAProxy to handle their scaling needs.


How does HAProxy compare to OpsWorks? The HAProxy Wikipedia page mentions OpsWorks is based on it.


Nginx running on a tiny instance can load balance 100k connections at thousands of requests per second. The network bandwidth for the instance will probably be saturated way before the CPU/RAM becomes a problem.

ELB (and most other managed-service load balancers) is overpriced and not great at what it does. The advantage is easier setup and lack of maintenance.

If you're running a service with hundreds of millions or billions of requests, it's just far more effective in every way to use some small load balancing instances instead. Their Route53 service makes the DNS part easy enough with health checks.


Why do you say they're overpriced? I would say for most apps they're downright cheap, especially since you spend so little time tinkering/monitoring/worrying about them. Most people just want to work on their app, not manage Nginx configs.


There is absolutely a tradeoff (as with everything in life) but in the context of this thread talking about scale with 100s of millions of requests, gigabytes of bandwidth and large spikes - it's far better to just host your own load balancers.

Most people (and apps) likely won't hit this scale so ELB is just fine. If you do though, ELB is just pricey and not really that great.


Link to the documentation? I thought this was changed over a year ago to not require pre-warming.


I had that same problem when I was proofing out FreeBSD to replace Linux at $work, but found an erratum that fixed everything by disabling TCP segmentation offload:

    ifconfig xn0 -tso4


As a data nerd, I got very excited once I read all the features, but I'm just assuming this will wither and die since it's closed-source =(


Does anyone know the background of the decision to restrict commercial usage for the 64bit version? Seems strange to have the 32bit vs. 64bit versions licensed differently.


I'd imagine it's because nobody will want to use the 32-bit version for anything commercial, so there's no worry about it. If some company wants to use the 64-bit version commercially, then the team is going to want to have some say there in case the project picks up steam and really goes somewhere.


sigh

I just wish PSD files weren't so horrendously, horrendously, horrendously terrible with space.

Our design people routinely have to deal with 300MB+ files for client work. I'm not convinced yet it's worth the benefits if the only way they can work w/git(hub) is with git-bigfiles or git-annex, or some convoluted custom workflow =(


Image data is big by nature. If you are working with print-resolution files you can multiply that figure by ten. Each PS layer can contain a large amount of image data. It has nothing to do with the file format, if that is what you are implying.


Crashes Chrome 33.0.1750.146 on OSX 10.9.1


Hilarious! So now all it takes to shut down an airport is to sprinkle a bunch of blackpowder outside in the parking lot for people to track in on their shoes?

The government is making the "terrorists"' job way too easy.


As I'm currently building a Serious Application in Backbone (dozens of Models, dozens of Views, a dozen Controllers), the differences between Spine & Backbone to me seem to come down to:

Spine does class/inheritance closer to the "JavaScript" way where properties are resolved correctly at runtime, Backbone seems a little more hackish in this area IIRC (+1 Spine).

Spine seems to have given up the separation between Controller & View. Backbone has a great separation here that has been a great aid to me in refactoring a large application. (+1 Backbone)

Backbone treats collections of Models (Backbone.Collection) as proper first-class citizens. Collections can receive & emit their own events, handle their own serialization/fetching, and make use of all Underscore.js methods (+1 Backbone)

Other than that they seem pretty similar to me. So my verdict: Spine +1, Backbone +2 :)


    > Spine does class/inheritance closer to the "JavaScript" way [...]
Nope. See later on in this thread. Backbone uses constructor functions and prototypes, Spine uses Object.create. Both approaches have properties resolved correctly at runtime.

    > Spine seems to have given up the separation between Controller & View.
Nope. Spine Controllers == Backbone Views. Spine doesn't have an equivalent notion to Backbone Controllers (although if one were added, I'd imagine it would be called a Router).


I actually think that Spine seems to implement more sane controllers than Backbone (but I never really liked Backbone's way of doing it in the first place).

Controllers are supposed to coordinate the interactions between views and models by propagating changes and events between the two.

For example, if you had a todosController responsible for controlling a collection of views, there isn't really a clean way (in my opinion) to utilize it from two independent Backbone views without passing the controller through to the view (which makes the view depend on the controller). By handling events in the controller instead of in the view, you remove this dependency (which is also similar to my perception of Cocoa's delegates/SproutCore's bindings, but I may be wrong).

Both Spine's and Backbone's view layer consists of rendering templates to the DOM. To me, a "proper" view layer would include things like positioning and both frameworks seem to utilize CSS to do this, for the most part.

I think of Backbone's controllers as a poor man's state chart, without any enter or exit callbacks (e.g. going from "#/" to "#/todos" should trigger a callback stating that I will exit the state "#/", so I can do necessary cleanup).


This is true, but because Backbone made a poor naming choice. Backbone Views are what one normally thinks of as "controllers" (coordinating presentation and models), and the view/presentation layer is not handled by Backbone per se but by something like Underscore templating. Backbone Controllers really are just some helpers to deal with URL hashes.

In my experience Backbone Views become the "top level" of a client-side app, but that is appropriate because they encapsulate controller logic.


Yeah, I'm aware of that, but from my experience the code can end up uglier than it needs to be, either because you need to pass through proxying views further down the chain, or need to duplicate functionality.

I guess I just dislike the convention of interacting directly with collections from places where it shouldn't be done, e.g. a todoView removing its own Todo from the collection (hence the risk of duplicating functionality throughout your code), instead of asking a proxying controller to do it. I think this is the reason why I've seen a lot of Backbone code where a reference to the view is stored in the model itself (e.g. http://documentcloud.github.com/backbone/docs/todos.html), which disgusts me since it completely misses the point of MVC.

Also, events passing feels more complicated than it would need to be (even if you can always include Backbone.Event).


Says the ad company! I love Google as much as the next guy, but I think entrepreneurs and creatives would create more wealth, and the world would be a better place, if there were more focus on real products and services rather than trying to grab more of people's eyeball time.


Don't take this the wrong way but... It's a free market: go ahead and actually do what you advocate. If you succeed, you'll be a role model and people will be inspired by your approach.

As I like to say, entrepreneurs do, everyone else talks about what should be done.

