
emcooke · on Oct 3, 2011

Twilio has a service-oriented architecture internally. We leverage a variety of application stacks for different services.

emcooke · on April 22, 2011

Several people have asked for additional details. We just posted a quick follow-on:

[UPDATE] A central theme of the recent AWS issues has been the Amazon Elastic Block Store (EBS) service. We use EBS at Twilio, but only for non-critical and non-latency-sensitive tasks. We've been a slow adopter of EBS for core parts of our persistence infrastructure because it doesn't satisfy the "unit-of-failure is a single host" principle. If EBS were to experience a problem, all dependent services could also experience failures. Instead, we've focused on utilizing the ephemeral disks present on each EC2 host for persistence. If an ephemeral disk fails, that failure is scoped to that host. We are planning a follow-on post describing how we are doing RAID0 striping across ephemeral disks to improve I/O performance.
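
As a rough illustration of the RAID0 striping mentioned in this comment, the sketch below shows how a striped array maps a logical byte offset to a member disk. The function name, the 64 KiB chunk size, and the disk count are assumptions chosen for the example; they are not details of Twilio's setup.

```python
def raid0_locate(logical_offset, num_disks, chunk_size=64 * 1024):
    """Map a logical byte offset on a RAID0 array to (disk_index, disk_offset).

    chunk_size is an assumed value for illustration only.
    """
    if logical_offset < 0 or num_disks < 1 or chunk_size < 1:
        raise ValueError("offset must be >= 0; num_disks and chunk_size must be >= 1")
    # Which chunk of the logical volume the offset falls in, and where inside it.
    chunk_index, within_chunk = divmod(logical_offset, chunk_size)
    # Chunks are dealt round-robin across the disks, one stripe at a time.
    stripe_index, disk_index = divmod(chunk_index, num_disks)
    disk_offset = stripe_index * chunk_size + within_chunk
    return disk_index, disk_offset
```

Consecutive chunks land on different disks, so a large sequential read or write is spread across all of them. RAID0 has no redundancy, so one failed disk loses the array, but that loss is still confined to a single host.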

emcooke · on April 22, 2011

Ironically, this highlights one of the main issues we discuss in the post!

The Twilio Engineering blog is hosted on an external WordPress site with a single IP that's forwarded from our nginx load balancer pool. Since the load balancers assume that the external service can fail, they won't tie up resources and block access to other parts of the site.
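
A minimal sketch of the "assume the external service can fail" idea: every upstream call gets a hard timeout and a quick error response, so a slow backend cannot hold the caller's resources indefinitely. This is not Twilio's load balancer configuration; the function name, the 2-second timeout, and the injectable `opener` parameter are assumptions made for the example.

```python
import socket
import urllib.error
import urllib.request


def proxy_fetch(url, timeout_s=2.0, opener=urllib.request.urlopen):
    """Fetch url from an upstream that is assumed to be unreliable.

    Returns (status, body). A slow or failing upstream produces a quick
    504 or 502 instead of blocking the caller.
    """
    try:
        with opener(url, timeout=timeout_s) as resp:
            return resp.status, resp.read()
    except (socket.timeout, TimeoutError):
        # Read timed out: give up rather than keep waiting.
        return 504, b"upstream timed out"
    except urllib.error.URLError as exc:
        # Connect timeouts arrive wrapped in URLError.reason.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return 504, b"upstream timed out"
        return 502, b"upstream unavailable"
    except OSError:
        return 502, b"upstream unavailable"
```

The `opener` argument exists only so the failure paths can be exercised without a network connection.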

Hope you enjoy the post :)

-Evan Twilio.com

aaronblohowiak · on April 22, 2011

Why not stick an angel-mode Varnish in between? Serving stale blog is usually better than no blog!

emcooke · on April 22, 2011

Yup, good idea. We set up an nginx proxy to cache the page while the blog hosting provider fixes their server.
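
The "stale blog is better than no blog" behavior can be sketched as a small cache that falls back to an expired entry when the origin fails. This illustrates the idea only and is not the nginx or Varnish configuration discussed in the thread; the class name, the TTL, and the `fetch` callable are assumptions.

```python
import time


class StaleOnErrorCache:
    """Cache that serves an expired entry if refreshing from the origin fails."""

    def __init__(self, fetch, ttl_s=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable(key) -> value; may raise
        self._ttl_s = ttl_s
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl_s:
            return entry[1]      # fresh hit, origin not contacted
        try:
            value = self._fetch(key)
        except Exception:
            if entry is not None:
                return entry[1]  # origin failed: serve the stale copy
            raise                # nothing cached, so the error propagates
        self._entries[key] = (now, value)
        return value
```

A page that has never been fetched still fails when the origin is down, so this only protects content that was requested at least once while the origin was healthy.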

markerdmann · on April 22, 2011

Evan, I just noticed that your service seems to be running on Slicehost, not the AWS colo in Virginia. Is that correct? I got the opposite impression from your post, which seems to imply that Twilio is hosted on AWS, yet managed to weather the storm because of your design decisions.

emcooke · on April 22, 2011

Our main infrastructure is deployed on AWS but we have capacity at several cloud providers for load-balancing, redundancy, etc.

markerdmann · on April 22, 2011

Ah, I see it now. I just got a POST from one of your servers in the AWS US-West region. Is Twilio also hosted in US-East (the region affected by today's outage), and, if so, would Twilio have stayed up if it hadn't been spread across multiple regions?

chopsueyar · on April 22, 2011

Eggs and baskets.

oinksoft · on April 22, 2011

Speaking of highlighting, something about Disqus' markup/styles causes your blog text to be un-highlightable with mouse (Firefox 3.6.16 Debian 5.0.8).

dmor · on April 22, 2011

Thanks for the heads up, I've disabled Disqus comments for now... was also causing some issues for iPhone/iPad readers. Regular commenting is enabled

nopal · on April 22, 2011

You were born to work at Twilio.

emcooke · on April 22, 2011

We just enabled caching on the nginx proxy to the external site hosting our WordPress install for the engineering blog. Hopefully that should help performance.

-Evan Twilio.com

emcooke · on Sept 25, 2010

For those wondering about the technical aspects of this decision, take a look at this thread: http://getsatisfaction.com/twilio/topics/international_sms-1...

When we launched the Twilio SMS Beta we tried hard to support sending SMS messages to both US and international destinations. When there were problems, we worked with our customers to collect forensic data on hundreds of carriers worldwide and pass it to our carrier partners to debug.

At Twilio we are dedicated to working with top-quality carriers and technology. After months of working to fix problems, we were not able to deliver the reliable international SMS service our customers have come to expect.

We apologize for any problems this has caused for our customers, and we'll work to bring back international SMS service once we're able to deliver the same quality we do for the rest of Twilio's services.

Cheers, -Evan

CTO and Co-Founder

feint · on Sept 25, 2010

That's fine. In fact it's a good move. But why email me after the fact? It's like you believe it's OK to communicate with paying customers like that.

emcooke · on Sept 25, 2010

Hi Anthony, Twilio service status is available via our public status page: http://status.twilio.com/ We communicated degraded international SMS service to customers on August 16.

You bring up a good point that the information might not be readily discoverable. We'll work to make the status page more findable and to extend the API (http://status.twilio.com/documentation/rest) with features such as RSS to let customers subscribe to up-to-the-minute status information.

feint · on Sept 25, 2010

Or. Just email me before you cut a feature next time.

davemc500hats · on Sept 25, 2010

unless automated, not a scalable response pattern.

ajdecon · on Sept 25, 2010

Presumably they have a list of all their customers with contact info. Not saying you should spam them often, but for the message "no more international support"? I think it's worth telling everyone.

emcooke · on July 27, 2010

In a 100% ideal scenario an organization would only need developers as every possible failure/problem would be covered and handled by automation.

Obviously the real world is different. New code is deployed that has bugs, there are unpredicted events, there are complex failure scenarios that are difficult to automate etc.

A DevOps engineer may write automation software in conjunction with developers to automate the operational aspects of business logic. Thus, it's a partnership between the DevOps engineer whose metrics are driven by availability/reliability/scalability/security and the developer who is trying to attain some business objective.

blueben · on July 28, 2010

You say "DevOps engineer" when you really mean SysAdmin, but I upvoted you anyway because you get it.

emcooke · on July 27, 2010

Can't someone whose role embodies a certain culture and philosophy have a title that fits that goal? How would you define DevOps?

emcooke · on July 27, 2010

Thanks for the pointer to the OpSec term. DevOpsSec... I sense a T-shirt in the making?

WestCoastJustin · on July 27, 2010

I first heard about OpSec from Roland Dobbins with Arbor. Really smart fellow who has many suggestions on network design / defence.

I just think people need to specialize in their field (+ a general knowledge of how the overall system works).

Almost like how a carpenter, plumber, and electrician each specialize but probably have a general idea of how everything works. Do you really want a plumber building the foundation of your house? Just seems a little silly to me. Each domain is so large that I would rather have deep expert knowledge in an area than a general handyman who might cut corners or waste time with solutions he doesn't know about.

emcooke · on July 20, 2010

Correct. Stashboard is simply a lightweight frontend display for your API/service status. It has a GUI and REST API that allow you to update status information. Using the API, one could wire Stashboard into Nagios or any other alerting system.
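
One way the Nagios wiring could look, as a sketch only: translate a Nagios plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) into a status update for a status dashboard. The status names, the request path, and the field names below are hypothetical placeholders, not Stashboard's documented API, and the code only builds the update without sending it.

```python
# Nagios plugin exit codes -> dashboard status names.
# The status names are assumed; UNKNOWN is treated as a warning by choice.
NAGIOS_TO_STATUS = {0: "up", 1: "warning", 2: "down", 3: "warning"}


def build_status_event(service_id, nagios_exit_code, plugin_output):
    """Build (but do not send) a status update from a Nagios check result."""
    status = NAGIOS_TO_STATUS.get(nagios_exit_code, "warning")
    return {
        # Hypothetical path; check the dashboard's own REST docs for the real one.
        "path": "/services/{}/events".format(service_id),
        "fields": {"status": status, "message": plugin_output.strip()},
    }
```

A Nagios event handler could call this with the check's exit code and output, then POST the resulting fields using whatever authentication the dashboard requires.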

emcooke · on Feb 9, 2010

You might check out http://venmo.com/ They have a dead simple SMS payments platform.

-evan (@twilio)

decadentcactus · on Feb 9, 2010

I checked it out, and it looks handy (requested an invite, but we'll see how it goes) although perhaps I worded it wrong.

I meant sms billing, services like zaypay/daopay. It's just that now the fees are so horribly high that it's almost pointless.

djb_hackernews · on Feb 9, 2010

I wonder how they send a payment sans Venmo account. According to their info page it can be done with just a cellphone...