One thing I almost never see mentioned in articles about moving to AWS is network latency and how much that latency varies. It seems like a fantastic platform for bulk/batch processing, but much less so if your service requires low and/or predictable network latency and response times.
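For anyone who wants to put numbers on this for their own setup, here is a rough sketch of how one might measure it. It times TCP handshakes and reports the median alongside the tail and spread, since the variability is what hurts latency-sensitive services. The function names, the default port, and the choice of a nearest-rank 99th percentile are my own assumptions for illustration, not anything from the article.

```python
import socket
import statistics
import time


def connect_latency_ms(host, port=443, timeout=2.0):
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed on exit; only the handshake is timed
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Report typical latency (median) and its variability (p99, stdev)."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank 99th percentile, computed with integer math: ceil(0.99 * n).
    rank = max(1, (99 * n + 99) // 100)
    return {
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "stdev": statistics.pstdev(ordered),
    }
```

Something like `summarize([connect_latency_ms("example.com") for _ in range(100)])`, run from an instance against whatever host you care about (the hostname here is a placeholder), gives a first impression. A large gap between median and p99 is the unpredictability I mean.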
I notice that they still have data gathering systems in-house. They are using the cloud servers for content distribution. I suspect even the processing is light: authenticate the user (redirection referrer), determine device parameters, start pumping bytes.
It sounded to me more like they're using CDNs for content distribution and AWS for bulk/batch processing jobs. From the original article:
RLB: Could you describe at a high level what Netflix is doing on AWS?
AC: Encoding movies for streaming, log analysis, production web site and API, most everything that scales with customers and streaming usage. Easier to say what we don’t have there: most internal IT that scales with employee count, legacy stuff, DVD shipping systems, account sign-up and billing systems. We use Akamai, Limelight and Level3 CDN’s for streaming the movies, which is a cloud based service. There is an AWS CDN service, but they aren’t a big enough player in this space at this point.
Anyone know how many AWS instances (and what types) are being run? Are these the same servers that account for up to 20% of downstream bandwidth in the U.S.?
I doubt these servers account for that bandwidth: the article mentions they use other CDNs to serve the streams, because CloudFront isn't yet up to the same level.
http://cloudscaling.com/blog/cloud-computing/cloud-innovator...