2012-11-19

AWS: why the bias towards US-east?

As MastodonC will point out, Amazon's US-East sites are the most polluting, not just because they have a high CO2 footprint, but because the coal they (and other east coast industries) burn is polluting in other ways, such as sulphur emissions. It's not as bad as, say, a steelworks (having had relatives living near Ravenscraig Steelworks, I can vouch for this), but as datacentres can be placed near other electricity sources, it's needless.

I intermittently use US-West-2, up in Oregon, where the melting snow creates electricity.

Crater Lake Tour 2012

Unfortunately there's an implicit bias in the AWS APIs towards US-East. Where's the default site for S3 Buckets? US-East. Where's the default site for EC2 instances? US-East. What is the default location for EMR jobs? The same, to the extent that the command line client treats requesting a different site as "uncommon":

Uncommon Options
 --debug               Print stack traces when exceptions occur
 --endpoint ENDPOINT   EMR web service host to connect to
 --region REGION       The region to use for the endpoint
 --apps-path APPS_PATH Specify s3:// path to the base of the emr public bucket to use. e.g s3://us-east-1.elasticmapreduce
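
One way round the default is to pass those "uncommon" options on every call. Here is a minimal sketch of a helper that builds them. The `--region` and `--apps-path` flags come from the help text above. The function name and the `elastic-mapreduce` command name in the usage line are illustrative assumptions, and so is the idea that every region's public bucket follows the `s3://<region>.elasticmapreduce` pattern of the us-east-1 example.

```python
# Sketch: build the region-pinning flags for an EMR command line.
# Assumption: regional public buckets mirror the us-east-1 naming shown
# in the help text, i.e. s3://<region>.elasticmapreduce


def emr_region_args(region):
    """Return CLI arguments that pin an EMR job to the given region."""
    if not region:
        raise ValueError("a region name is required, e.g. 'us-west-2'")
    return [
        "--region", region,
        "--apps-path", "s3://{0}.elasticmapreduce".format(region),
    ]


if __name__ == "__main__":
    # Hypothetical command name; substitute whatever your client is called.
    print(" ".join(["elastic-mapreduce"] + emr_region_args("us-west-2")))
```

Wrapping the client this way means the greener region becomes your own default, rather than something you have to remember each time.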


All this implicit "us-east" bias is self-reinforcing. Once you've got a bucket on S3 east, that's where you want to run your webapps, otherwise you get billed for the remote bandwidth. Once you've got the webapps, that's where your logs go, hence even more reason to run your MR jobs on the same site: it's where your data lives.

Because it's the default location for stuff, it's also the default location for people serving up data on the site: RPM and Maven repositories, public datasets. This pushes you towards that location, both to avoid the costs of downloading that data from other sites and for the speed gain.

Why the bias? Either it's where the majority of servers lie, or through a combination of cost of electricity, site PUE and bandwidth, it's got the lowest operating costs, hence the most profit per CPU-hour, MB stored or MB downloaded.

That's a shame, because Amazon themselves have better options. They're being crucified by Parliament over their tax avoidance strategies, so it'd be tactically wise to have something positive to talk about.

[Photo: Crater Lake & Mt Thielsen. Smoke is a forest fire blowing up from CA]
