Hacker News | emcooke's comments

Twilio has a service-oriented architecture internally. We leverage a variety of application stacks for different services.


Several people have asked for additional details. We just posted a quick follow-on:

[UPDATE] A central theme of the recent AWS issues has been the Amazon Elastic Block Store (EBS) service. We use EBS at Twilio but only for non-critical and non-latency-sensitive tasks. We've been a slow adopter of EBS for core parts of our persistence infrastructure because it doesn't satisfy the "unit-of-failure is a single host" principle. If EBS were to experience a problem, all dependent services could also experience failures. Instead, we've focused on utilizing the ephemeral disks present on each EC2 host for persistence. If an ephemeral disk fails, that failure is scoped to that host. We are planning a follow-on post describing how we are doing RAID0 striping across ephemeral disks to improve I/O performance.
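
For readers unfamiliar with RAID0, here is a minimal sketch of how striping spreads a logical address space across several disks. It is only an illustration of the general technique: the function name, the disk count, and the chunk size are assumptions, and nothing here describes Twilio's actual configuration.

```python
from typing import Tuple


def raid0_locate(logical_offset: int, num_disks: int, chunk_size: int) -> Tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Hypothetical helper for illustration only. Consecutive chunks are
    placed round-robin across the disks, so large sequential reads and
    writes hit all disks in parallel.
    """
    if logical_offset < 0 or num_disks < 1 or chunk_size < 1:
        raise ValueError("offset must be >= 0; num_disks and chunk_size must be >= 1")
    chunk_index = logical_offset // chunk_size   # which chunk of the logical volume
    disk = chunk_index % num_disks               # round-robin disk selection
    stripe = chunk_index // num_disks            # how many full stripes precede this chunk
    offset_on_disk = stripe * chunk_size + logical_offset % chunk_size
    return disk, offset_on_disk


if __name__ == "__main__":
    # Assumed layout: 4 disks, 64 KiB chunks.
    for offset in (0, 65536, 4 * 65536 + 10):
        print(offset, raid0_locate(offset, num_disks=4, chunk_size=65536))
```

Note that RAID0 has no redundancy: losing any one disk loses the whole array, which is consistent with treating the host as the unit of failure.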


Ironically, this highlights one of the main issues we discuss in the post!

The Twilio Engineering blog is hosted on an external WordPress site with a single IP that's forwarded from our nginx load balancer pool. Since the load balancers assume that the external service can fail, they won't tie up resources and block access to other parts of the site.

Hope you enjoy the post :)

-Evan Twilio.com


Why not stick an angel-mode Varnish in between? Serving a stale blog is usually better than no blog!


Yup, good idea. We set up an nginx proxy to cache the page while the blog hosting provider fixes their server.
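
The "stale is better than nothing" idea can be sketched in a few lines. This is a rough illustration, not the proxy configuration used here: the class name, the `fetch` callable, and the TTL are all assumptions. In nginx itself, the `proxy_cache_use_stale` directive covers the same case.

```python
import time
from typing import Callable, Dict, Tuple


class StaleOnErrorCache:
    """Hypothetical cache that serves a stale copy when the origin fails."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # assumed callable that contacts the origin
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the behavior is easy to test
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # fresh hit, origin not contacted
        try:
            body = self._fetch(url)
        except Exception:
            if entry is not None:
                return entry[1]        # origin is down: fall back to the stale copy
            raise                      # nothing cached, so the error propagates
        self._entries[url] = (now, body)
        return body
```

A real proxy would also bound the origin request with a timeout so that a hung backend cannot tie up workers.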


Evan, I just noticed that your service seems to be running on Slicehost, not the AWS colo in Virginia. Is that correct? I got the opposite impression from your post, which seems to imply that Twilio is hosted on AWS, yet managed to weather the storm because of your design decisions.


Our main infrastructure is deployed on AWS but we have capacity at several cloud providers for load-balancing, redundancy, etc.


Ah, I see it now. I just got a POST from one of your servers in the AWS US-West region. Is Twilio also hosted in US-East (the region affected by today's outage), and, if so, would Twilio have stayed up if it hadn't been spread across multiple regions?


Eggs and baskets.


Speaking of highlighting, something about Disqus' markup/styles causes your blog text to be un-highlightable with the mouse (Firefox 3.6.16, Debian 5.0.8).


Thanks for the heads up. I've disabled Disqus comments for now... it was also causing some issues for iPhone/iPad readers. Regular commenting is enabled.


You were born to work at Twilio.


We just enabled caching on the nginx proxy to the external site hosting our WordPress install for the engineering blog. Hopefully that should help performance.

-Evan Twilio.com


For those wondering about the technical aspects of this decision, take a look at this thread: http://getsatisfaction.com/twilio/topics/international_sms-1...

When we launched the Twilio SMS Beta we tried hard to support sending SMS messages to both US and International destinations. When there were problems, we worked with our customers to collect forensic data on hundreds of carriers worldwide and pass it to our carrier partners to debug.

At Twilio we are dedicated to working with top quality carriers and technology. After months of working to fix problems, we were not able to deliver the reliable International SMS service our customers have come to expect.

We apologize for any problems this has caused for our customers, and we'll work to bring back International SMS service once we're able to deliver the same quality as the rest of Twilio's services.

Cheers, -Evan

CTO and Co-Founder


That's fine. In fact it's a good move. But why email me after the fact? It's like you believe it's OK to communicate with paying customers like that.


Hi Anthony, Twilio service status is available on our public status page (http://status.twilio.com/). We communicated degraded international SMS service to customers on August 16.

You bring up a good point that the information might not be readily discoverable. We'll work to make the status page more findable and to extend the API (http://status.twilio.com/documentation/rest) with features such as RSS to let customers subscribe to up-to-the-minute status information.


Or. Just email me before you cut a feature next time.


unless automated, not a scalable response pattern.


Presumably they have a list of all their customers with contact info. Not saying you should spam them often, but for the message "no more international support"? I think it's worth telling everyone.


In a 100% ideal scenario an organization would only need developers as every possible failure/problem would be covered and handled by automation.

Obviously the real world is different. New code is deployed that has bugs, there are unpredicted events, there are complex failure scenarios that are difficult to automate, etc.

A DevOps engineer may write automation software in conjunction with developers to automate the operational aspects of business logic. Thus, it's a partnership between the DevOps engineer whose metrics are driven by availability/reliability/scalability/security and the developer who is trying to attain some business objective.


You say "DevOps engineer" when you really mean SysAdmin, but I upvoted you anyway because you get it.


Can't someone whose role embodies a certain culture and philosophy have a title that fits that goal? How would you define DevOps?


Thanks for the pointer to the OpSec term. DevOpsSec... I sense a t-shirt in the making?


I first heard about OpSec from Roland Dobbins with Arbor. Really smart fellow who has many suggestions on network design / defence.

I just think people need to specialize in their field (+ a general knowledge of how the overall system works).

Much like how a carpenter, plumber, and electrician each specialize but probably have a general idea how everything works. Do you really want a plumber building the foundation of your house? Just seems a little silly to me. Each domain is so large that I would rather have deep expert knowledge in an area than a general handyman who might cut corners or waste time with solutions he doesn't know about.


Correct. Stashboard is simply a lightweight frontend display for your API/service status. It has a GUI and REST API that allow you to update status information. Using the API, one could wire Stashboard into Nagios or any other alerting system.
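
As a rough sketch of that wiring, the glue code mostly comes down to translating a Nagios check result into a status update. The Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are standard. Everything else below is an assumption for illustration: the status names, the payload field names, and the function itself. The HTTP call is left out because the endpoint and authentication depend on how your Stashboard instance is set up.

```python
from typing import Dict

# Standard Nagios plugin return codes mapped to assumed status names.
# Adjust the names to match the statuses defined in your own deployment.
NAGIOS_TO_STATUS: Dict[int, str] = {
    0: "up",        # OK
    1: "warning",   # WARNING
    2: "down",      # CRITICAL
    3: "warning",   # UNKNOWN: flagged rather than reported as an outage
}


def build_status_update(service: str, nagios_code: int, plugin_output: str) -> Dict[str, str]:
    """Build a status-update payload (hypothetical field names)."""
    status = NAGIOS_TO_STATUS.get(nagios_code, "warning")  # unexpected codes become warnings
    return {
        "service": service,
        "status": status,
        "message": plugin_output.strip(),
    }


if __name__ == "__main__":
    # A Nagios event handler would pass these values on the command line.
    print(build_status_update("sms", 2, "CRITICAL - connection refused\n"))
```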


You might check out http://venmo.com/. They have a dead-simple SMS payments platform.

-evan (@twilio)


I checked it out, and it looks handy (requested an invite, but we'll see how it goes) although perhaps I worded it wrong.

I meant SMS billing, services like zaypay/daopay. It's just that now the fees are so horribly high that it's almost pointless.


I wonder how they send payments without a Venmo account. According to their info page it can be done with just a cellphone...

