Sendgrid is down (sendgrid.com)
103 points by sunils34 on March 21, 2013 | 97 comments



I'm sad to see that HN is being used as a platform for this sort of demagogy, and sad that stories about this tawdry little tale are indiscriminately voted up above other stories which are far more interesting.

A DDoS of a mailing service that lots of websites rely on, because a completely unrelated company decided to fire someone, is not an occasion for lol and schadenfreude as some posters here would have it. As a method of justice it has more in common with a lynch mob than a court of law - this isn't going to get the guy's job back, and it certainly isn't going to teach anyone a lesson, apart from that the internet is fickle and monumentally stupid. But I very much doubt the people behind this attack are interested in justice or truly care about the man who lost his job; they're just doing it for the lolz and are punishing the internet at large over a silly little dispute at a tech conference.

Congratulations to the mob, I guess; it has shown its power, if not any sense of discrimination or proportionality.


Now you can argue as much as you want - but the (sad) truth is that mob mentality exists and we will have to deal with it for at least the next few centuries.

This should be a lesson for companies that hire highly confrontational people as their official community representatives.

Yes, in a perfect world there shouldn't be DDoS and other attacks because of a tweet about an immature joke, but it's not a perfect world and we should be very wary of what personalities we hire to represent us.

As much as I don't like it - the "right" thing for SendGrid would be to replace their 'developer evangelist' with someone who's less confrontational. Yes, it sucks. But as businesses we have to deal with reality.


If she was confrontational she would have confronted them. Replace 'confrontational' with 'passive-aggressive' and I totally agree.


This smacks of a "serves them right" attitude. This same attitude is what drives doxxing efforts against individuals that directly threaten their safety.


>As a method of justice it has more in common with a lynch mob than a court of law

You mean like when someone gets personally offended by a comment and instead of resolving it one-on-one turns to her Twitter account to shame and blame the "violators"?

Yeah. Except the person who instigated this set up the lynching in her capacity as a professional, i.e. on her "evangelist" Twitter account. The kids DDoSing aren't representing companies or doing business.


Agreed. As an evangelist, her Twitter account represents SendGrid; as such, SendGrid approves of this message, which is baffling to me.

I'm really sad these guys lost their jobs because of these remarks, I don't think the way she handled that is the way we need to move forward.


Sometimes something happens where posts on Hacker News are part of the debacle[1] and it results in an influx of new users. I don't want to sound like an old coot, but I'm a bit concerned about the type of user this thing has been attracting.

[1] In this case, the guy who was called out for the jokes posted in a comment on here that he was fired.


OK, is there any actual evidence that this is a DDOS of SendGrid? Because their status page indicates nothing of the sort, and all I'm seeing is HN jumping to conclusions.


Adria's blog was already under a DDOS, so it's not a huge stretch to think this is too.


It is a huge stretch, at least partly because there is no third-party verification that Adria's blog is actually the victim of a DDoS attack, as far as I know. I'm not accusing her of lying, just saying that "a huge, huge number of people reading your blog post" can look a lot like a DDoS attack if you're running an unoptimised version of WordPress, or similar.


It's uncharitable of you to assume she can't tell the difference between a botnet and a spike in traffic. The "third-party verification" is that her site came back online after implementing CloudFlare's DDoS mitigation.


> It's uncharitable of you to assume she can't tell the difference between a botnet and a spike in traffic.

I don't think so. It's incredibly difficult to tell the difference, given that a DDOS is a huge spike in traffic.


Real users load CSS and images, run javascript, stay on the page more than a second, etc.


Not when they can't load the page, they don't. Once a traffic spike overwhelms the server no-one sees any HTML, downloads CSS or stays on the page.
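
To make the two positions above concrete, here is a rough sketch of the "real users also fetch assets" heuristic applied to an access log. It only says anything about requests the server managed to log, so it does not answer the objection that an overwhelmed server serves nobody. The function name, the thresholds, and the `(ip, path)` log shape are all assumptions made for illustration, not anyone's actual tooling.

```python
from collections import defaultdict

# Hypothetical list of suffixes treated as "static assets".
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def suspicious_ips(requests, min_requests=20, max_asset_ratio=0.05):
    """Return IPs that made many requests but fetched almost no assets.

    `requests` is an iterable of (ip, path) pairs taken from an access log.
    Both thresholds are illustrative guesses, not tuned values.
    """
    totals = defaultdict(int)
    assets = defaultdict(int)
    for ip, path in requests:
        totals[ip] += 1
        # Ignore any query string before checking the file suffix.
        if path.lower().split("?", 1)[0].endswith(ASSET_SUFFIXES):
            assets[ip] += 1
    return sorted(
        ip
        for ip, count in totals.items()
        if count >= min_requests and assets[ip] / count <= max_asset_ratio
    )
```

A client behind a caching proxy or with assets already cached would also score low here, so a result from this is a hint at best, not proof either way.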


That is hardly any proof?!


I couldn't agree with you more on the subject of what happened with PyCon. I don't think it should be mixed with what is happening now.

On the flip side, I'm happy that this story has been up-voted and that it attracted some attention, because that's how I found out that SendGrid is down (I never received any mail notification and I don't follow them on Twitter).


> ... this isn't going to get the guy's job back, and it certainly isn't going to teach anyone a lesson

As much as I don't support the methods, it seems like this actually sends a message (right or wrong). So it works perfectly from the attacker's perspective.

I'm pretty sure by now every critical person at SendGrid knows the details of what happened, why it happened, and the people involved. Not because of the initial Twitter discussions but because this actually costs them money directly.


This was a Heroku win for us. I just visited their add-ons page, identified a comparable provider, and restored mail in our application with the following:

    # Provision the replacement mail add-on and open its docs
    heroku addons:add mailgun:basic
    heroku addons:docs mailgun:basic
    # Point the app's mail settings at the new provider
    vim config/initializers/mail.rb
    # Commit the change and deploy it
    git add config/initializers/mail.rb
    git commit -m "Switching to back-up mail provider."
    git push heroku master

Since the account creation is all automatic and billing is all through Heroku, I never even had to visit Mailgun's website.


How do you handle SPF/DKIM and domain keys records? Do you send out a lot of emails? Did you have a Mailgun account ready?


I don't know anything about this. If someone could provide some details/resources around this I'm sure it would be valuable to folks right now who need to evaluate the risks of switching providers on a short-term basis.

(For us in particular, SendGrid only represents a small amount of the email we send and it's mostly internal emails. The problem for us was that the ActionMailer emails aren't sent in a background process, so this caused a couple request timeouts.)
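
The timeout problem described above (mail sent inline with the web request) is usually avoided by handing the message to a background worker. In Rails that would be a job queue; the sketch below shows the same idea in plain Python so it stays self-contained. `start_mail_worker` and `send_func` are hypothetical names, and a real setup would add retries and persistence.

```python
import queue
import threading


def start_mail_worker(send_func, max_queue=1000):
    """Start a daemon thread that delivers queued messages via send_func.

    Returns (outbox, failures). The request handler only calls
    outbox.put(message), so a slow or dead mail provider cannot stall it.
    Putting None on the queue stops the worker.
    """
    outbox = queue.Queue(maxsize=max_queue)
    failures = []  # (message, exception) pairs, kept for later inspection

    def worker():
        while True:
            message = outbox.get()
            try:
                if message is None:
                    return
                send_func(message)
            except Exception as exc:
                # Record the failure instead of crashing the worker.
                failures.append((message, exc))
            finally:
                outbox.task_done()

    threading.Thread(target=worker, daemon=True).start()
    return outbox, failures
```

With this shape, an outage at the provider shows up as entries in `failures` rather than as request timeouts for users.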


If memory serves, if you can log into Mailgun, it will have some text on how to set up all those things on your dashboard. Congrats on switching - Mailgun has vastly superior deliverability, at least in my observation.


I'd love to hear why you think Mailgun is superior to SendGrid. I'm currently using SendGrid.


We started off using Sendgrid too but found that our emails were often ending up flagged as spam, even though we had the SPF records and all that jazz. I should note that our emails were not spammy at all. Switched to Mailgun and haven't had any issues since.

I have also noticed that false positives from other startups in my own spam folder are often sent through Sendgrid (though I suppose that's not dispositive since it could just be that more startups use Sendgrid).


Even though I'm a SG customer, I can't help but feel a little schadenfreude.

edit: when I made this comment I thought this was a random service failure that would last a couple of minutes. More than 1 hour later, I don't think it's that funny anymore as I'm being affected as well.


Schadenfreude is, however, not an RFC compliant message delivery protocol.


SG customer as well. What's all the negativity towards them? Is there something I should know about them? I wasn't aware of any incidents.


There's not much negativity toward them, just a bit of criticism of how they're handling this outage. As I mentioned in another comment of mine in this thread - they should have reached out to customers and let them know there's an outage (especially if the outage lasts more than 10 minutes).

P.S: I still think Sendgrid are awesome and fortunately they'll listen to what we have to say and next outage will be handled differently.


A female Sendgrid employee overheard a joke mentioning a dongle looking like a phallus, took a picture of the guy, posted it to Twitter, and he got fired from his job without anyone hearing his side of the story. His apologies were ignored.

Now the mob is mad at Sendgrid.


and by mob we mean possibly one person acting like a dongle and knocking out a website.


Can I have a picture of you please for using the word dongle in a fashion that I may or may not be upset about? You know, for internet justice..


This story was all over 4chan today. This probably means there's more than one person behind this.



Same here. I just flipped the switch to send things from an internal server, bypassing SG for now.


Really? All it says to me is that people in the tech field are so against a woman standing up for herself that they're actually willing to bring down the company she works for. It's disgusting.

Will you also feel schadenfreude when they hack her bank account?


First, nobody cares whether or not she was a she or a he or an it. Her actions were objectively ridiculous and harmful to both general civility and gender relations in the community. Largely because SHE made it a gender issue (saving future women programmers? Joan of Arc?).

Second, she wasn't "standing up" for herself because nobody wronged her. Overhearing a third-party say something about a dongle that you construe in a sexual light isn't being wronged, and the world doesn't require you to "stand up" to it.


> people in the tech field are so against a woman

This is not because she's a woman. Even if a man had done such a "stupid" thing (according to DDOS-ers), the result would have been the same.


Is this actually a revenge DDOS attack, or is everyone jumping to conclusions?

I don't dispute for a second that the tech community (and indeed HN) has a huge problem with sexism that manifests itself in very ugly ways, but let's stick with the facts here.


Woman standing up for herself, aka a woman randomly attacking a man who made a joke about dongles? Really?

I will definitely feel schadenfreude if her bank account gets hacked. After all, she did rid a person of their income by her actions.


Let me pose two theories for you.

1. The guy who got fired had an overzealous manager who fired with insufficient cause.

2. The guy who got fired had a history that we don't know, but the manager does, and this was the final straw.

Both theories are possible. I personally have known a higher proportion of guys fitting #2 than managers who would enable #1. Therefore conditional probability suggests to me that he was fired for more than just this incident. If so, then his firing would not be her fault.

(Even in #1 the firing was not her fault - it was the manager's.)


"(Even in #1 the firing was not her fault - it was the manager's.)"

But in this case, it's not a simple binary case of was/was not "her fault". It's at least partly her fault, regardless of the accuracy of scenario #1 or #2. There is a chain of causality here, and it all starts with Adria's tweet/blog post.

I'd also argue that your two theories represent a false dichotomy, and sussing out your conclusion based on the idea that there really are only two possible explanations (throw in some personal anecdotes for good measure!) is, well... lazy.


Fault is a legal concept. There is always a chain of events with many events that were necessary. But who had the power to make the decision? Who made it? That is who is at fault. In this case that wasn't Adria. (Not that her behaviour is anything to be proud about.)

> I'd also argue that your two theories represent a false dichotomy...

Actually they don't. Are we agreed that this event is insufficient cause for a firing? If so, then if this event was the real reason for the firing, then the manager fired for insufficient cause, which is my #1. If not, then there is more to the story, which is my #2. Those two possibilities are therefore logically complete.

However they are not mutually exclusive. There might have been more to the story, and yet the manager still fired for insufficient cause.

That said, what's going to happen now? The guy who got fired has just become a cause. If he's got any skills at all, he'll get another job. I'm confident of it.

Adria has become radioactive. She hasn't been fired, but if her employer keeps getting attacked because of outrage over this, that's a possibility. If she does get fired, she's going to be radioactive for a while. Her line of work requires her to be public about where she is. And no sane company wants to be included in the outrage aimed in her general direction.


She's now been fired.


What the fuck is HN becoming...Jesus.


User created: 2 hours ago

2 posts.

Happens all the time, always has.


Sendgrid customer here.

Since many services rely on SMTP providers like SendGrid, they should have a way to notify customers when their servers go down and transactional notifications may be disrupted.

I shouldn't be notified by someone who knows we're using SendGrid and happens to read HN.


Yeah -- outage notifications should be pushed out to a service's users, rather than users having to pull them.

Also seems like best practice would be to have SPF/etc. entries in place for multiple ESPs, even if you routinely just use one, and be able to switch, for just this reason.
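
As a rough illustration of "SPF entries in place for multiple ESPs", the sketch below assembles a single SPF TXT value that authorizes more than one provider. `build_spf_record` is a hypothetical helper, and the include hostnames in the usage note are assumptions that should be checked against each provider's current documentation.

```python
def build_spf_record(include_domains, policy="~all"):
    """Build one SPF TXT value that authorizes several sending providers.

    A domain should publish a single SPF record, so every provider goes
    into the same string as its own `include:` mechanism.
    """
    if policy not in ("~all", "-all", "?all", "+all"):
        raise ValueError(f"unexpected SPF policy: {policy!r}")
    # RFC 7208 caps DNS-querying mechanisms at 10 per evaluation, and
    # nested includes count too, so the practical limit is lower than this.
    if len(include_domains) > 10:
        raise ValueError("too many include mechanisms for one SPF record")
    parts = ["v=spf1"] + [f"include:{d}" for d in include_domains] + [policy]
    return " ".join(parts)
```

For example, `build_spf_record(["sendgrid.net", "mailgun.org"])` yields `v=spf1 include:sendgrid.net include:mailgun.org ~all`. DKIM keys are a separate matter and usually need one record per provider.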


We do have SPF entries set up for another provider. The main problem is that some services may need to warm up an IP prior to sending a lot of email through a backup provider.

What it means is that basically you should be using 2 providers at the same time and send 50%/50% on each provider in order to keep your IPs warm.
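
A minimal sketch of that 50%/50% idea, assuming each provider is wrapped in a plain send function: pick a provider at random by weight, and fall through to the other one if the first raises. `send_with_failover` and the `(name, weight, send_func)` tuple shape are made up for this example.

```python
import random


def send_with_failover(message, providers, rng=random):
    """Send via a weighted-random provider, trying the rest on failure.

    `providers` is a list of (name, weight, send_func) tuples. Returns the
    name of the provider that accepted the message.
    """
    remaining = list(providers)
    errors = {}
    while remaining:
        weights = [weight for _, weight, _ in remaining]
        choice = rng.choices(remaining, weights=weights, k=1)[0]
        name, _, send = choice
        try:
            send(message)
            return name
        except Exception as exc:
            # Remember the error and retry with the providers that are left.
            errors[name] = exc
            remaining.remove(choice)
    raise RuntimeError(f"all providers failed: {errors}")
```

Equal weights give roughly an even split in normal operation, which keeps both sets of IPs in use; during an outage everything drains to the surviving provider.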

While I think big services should account for such a scenario, I think 95% of SendGrid customers are using them to avoid setting up that kind of redundancy.

The main question is whether I should create the same redundancy for my hosting provider, DNS services, CDN, etc.


Most of the really heavy CDN users I know do have multiple CDNs constantly in operation, for contract negotiation reasons if nothing else.

DNS is interesting, especially if you do anything location-based. I currently use just CloudFlare, but am not convinced they have enough internal redundancy on DNS, so I'm investigating using Dyn, or self-hosting again, or Route53, or some combination. It gets more complex if you want to use multiple providers doing anything beyond simple DNS, though (the DNS protocol itself is totally fine for this, but most of the config management is provider-specific, and using normal DNS zone transfers/notifies doesn't really work in this model).


What do you mean by "to warm up an IP"?


If an IP address does not have a history of sending [large amounts of] email, it is considered "cold". If that IP address then suddenly begins sending huge numbers of emails, many providers will assume the worst (that the IP is being used for spam) and either block the IP or mark messages coming from it as spam. "Warming up" the IP means sending a smaller number of emails (per time period) and slowly increasing the volume so that email providers can properly "score" the IP as legit and okay in their systems.

Update: For some reason I couldn't reply to your response, kstrauser, so here is my reply: That works if and when there is trust. First you have to trust that the buyer isn't going to abuse the system and wreck the IP [reputation], and you have to trust that the buyer isn't giving you stolen info. Obviously these are things one can work through, but again, it involves trust. Also, the provider has to have warmed-up IPs to give out. Having warmed-up IPs would actually be a great value-added upsell for those who need them!

Update 2: Thank you FfejL and symfoniq for explaining it better.


That's when you physically call your sales rep, explain why you're using their services, and ask them to whitelist you. I say this from experience. If you called my employer (and any of our competitors, I'm certain) and said "I need to host a mail relay that's ready to go from no traffic to tens of thousands of emails per second, here's my company website, and here's our credit card", we'd be happy to make it work for you.

I'm not saying this to advertise but to offer a suggestion: call someone and ask for help. This isn't an unusual need at all and any reputable provider will be quick to help.

Short version: you don't have to "warm up an IP" if you do a little advance homework.


That's not what 'warming up an IP' means.

If Google (or any other ISP) suddenly sees a big spike in email from @example.com on an IP that @example.com hasn't used previously, Google is much more likely to mark those emails as spam.

So senders need to 'warm up' an IP by sending a small amount of email first, usually for a few days at least.

So if you're a SendGrid customer expecting to send 2,000,000 emails today, you can't just switch to a new ESP and send those same emails. Spam rates will go through the roof.
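
To put numbers on the warm-up described above, here is a toy schedule generator: start small and multiply the daily volume until the target is reached. The starting volume and growth factor are illustrative assumptions, not figures from any mailbox provider, and `warmup_schedule` is a hypothetical name.

```python
def warmup_schedule(target_daily_volume, start_volume=500, growth=2.0):
    """Return a list of per-day send volumes ramping up to the target.

    The defaults are made-up numbers for illustration; real limits depend
    on the receiving providers and on bounce and complaint rates.
    """
    if target_daily_volume <= 0 or start_volume <= 0 or growth <= 1.0:
        raise ValueError("volumes must be positive and growth must exceed 1")
    schedule = []
    volume = start_volume
    while volume < target_daily_volume:
        schedule.append(int(volume))
        volume *= growth
    # The final day sends the full target volume.
    schedule.append(target_daily_volume)
    return schedule
```

Under these assumed numbers, reaching 2,000,000 emails a day takes 13 days of doubling from 500, which is why switching providers in the middle of an outage does not help a high-volume sender much.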


jrs235 is referring to IP reputation as measured on the receiving end of the email. In other words, Yahoo or Hotmail might reject your emails if they see unusual activity from an IP address. Until an IP is "established" as having sent a lot of email with low bounce and spam rates, getting bulk email delivered from that IP address will be a frustrating experience.


Thanks for the clarification! I'll pass that along to our products department. :-)


Who do you work for?


Mail me at kirk@strauser.com . I'm not trying to be evasive but I like to keep work and personal separate, especially since this story has become more political than I was expecting. I just came in to read about a service outage.


They've been tweeting about it for an hour or so: https://twitter.com/SendGrid

I would expect them to send emails to customers in the event of serious issues, though; expecting your customers to learn about problems through Twitter isn't that much better than learning about them from HN.


Critical notifications should be pushed to me, not pulled by me. A nice addition would be a New Relic add-on that polls services/APIs like SendGrid and lets you set up notifications for failure or degradation.
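
A bare-bones version of that polling idea, independent of any monitoring product: call a probe on an interval, alert after several consecutive failures, and alert again on recovery. `watch`, `probe`, and `notify` are hypothetical names; the probe might be an HTTP request or an SMTP handshake against the provider, which is left out here.

```python
import time


def watch(probe, notify, failures_before_alert=3, checks=10,
          interval_seconds=60, sleep=time.sleep):
    """Poll `probe` and call `notify` on outage and on recovery.

    `probe` returns True when the dependency looks healthy; an exception
    counts as a failure. `notify` receives "down" or "recovered".
    """
    consecutive_failures = 0
    alerted = False
    for i in range(checks):
        try:
            healthy = bool(probe())
        except Exception:
            healthy = False
        if healthy:
            if alerted:
                notify("recovered")
            consecutive_failures = 0
            alerted = False
        else:
            consecutive_failures += 1
            # Alert once per outage, and only after repeated failures,
            # so a single dropped request does not page anyone.
            if consecutive_failures >= failures_before_alert and not alerted:
                notify("down")
                alerted = True
        if i < checks - 1:
            sleep(interval_seconds)
```

The alert itself has to go out over a channel that does not depend on the service being watched.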


> I would expect them to send emails


Preferably not with their own SMTP service ;)


Aaand the PR slide for SendGrid continues. Perfectly valid point though.


This is a “PR slide” in the same sense that torching someone's store is PR.


"I shouldn't be notified by someone who knows we're using SendGrid and happens to read HN."

That was what I was replying to; I wasn't insinuating that SendGrid's website going down is a "PR slide".


how does one send outage notices if the mail is down?


Have an offsite cloud server with a mail daemon that's ready to send a few thousand messages to customers when your main server is down.


It might actually be worthwhile for a third-party status site/monitoring service to have a way to send this mail to users. It would be thousands of users per developer tool, but not millions like consumer services, so technically a much simpler service.

Letting users set "I must ack notifications of outages or keep pinging me" would make sense, too.

I'd probably consider mobile push notifications with confirmation and/or voice or SMS as well.

Basically like Nagios, but for third-party components.
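
The "must ack or keep pinging me" behaviour could look roughly like this: walk an ordered list of channels and stop as soon as someone acknowledges. All names here are hypothetical, and a real service would wait between attempts rather than firing them back to back.

```python
def notify_until_acked(channels, is_acked, max_rounds=3):
    """Escalate through `channels` until `is_acked()` returns True.

    `channels` is an ordered list of (name, send_func) pairs, for example
    email first, then SMS, then a voice call. Returns the names of the
    channels that were actually used, in order.
    """
    tried = []
    for _ in range(max_rounds):
        for name, send in channels:
            if is_acked():
                return tried
            send()
            tried.append(name)
    return tried
```

Capping the number of rounds keeps an unattended outage from generating notifications forever.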


OK, I'll bite - is this a coincidence - or are they being DDOS'd over the PyCon incident?


The dreaded FDDOS: Forking Dongle Denial of Service.


They could just use bigger Dongles to connect their servers to the internet.


Still unfunny and tasteless. Imagine that.


Well I laughed at his joke. Maybe it's time to grow up and understand different people like different jokes?


It's totally unfunny and tasteless, I agree. That doesn't make it worth the hullabaloo that it caused. And I think that's the point.


I'm just tuning in now, as I happened to see this here on HN and my company is a SendGrid customer (used for transactional emails).

Am I understanding this right - transactional emails for my company may be interrupted because of some random personal argument between two people?



I'm surprised PlayHaven hasn't shared the blame for their role in this whole thing. Adria and, by extension, SendGrid shoulder some of the responsibility for what happened, but PlayHaven chose not to back their own employee.


It seems that Adria's personal blog is having similar problems..

http://butyoureagirl.com/


Yes, she mentioned that her website was getting DDoSed and that she was going to put it behind Cloudflare. I'm not sure it's a DDoS though; it could just be the amount of traffic she's getting from HN/Reddit/Twitter/etc.

https://news.ycombinator.com/item?id=5400134

EDIT: It seems that her personal website (adriarichards.com) is down too.


How is putting it behind Cloudflare going to help now that the (alleged) DDoSers have the IP address of the actual server?


mirror the content to a different host. point cloudflare at it. something like that i guess. could be a pain in the ass if you can't get to the original server and you have no other copies of the content, though.


As a SendGrid customer, I'm certainly glad I visit HN frequently and was able to find out about this. I'm surprised we weren't notified by them directly, though.


We've never been directly notified by them despite their SMTP breaking a number of times.


I'll just start following them on Twitter then. I've recently started working somewhere new where we use SendGrid, and this is the first outage I've been aware of.


They tend not to post about interruptions on Twitter. In a lot of ways, they fail as an API-type service provider:

* require you to use user:pass instead of keys

* no historical service status page

* poor notifications about downtime

* unclear explanation of uptime


I did the same thing. Followed them on twitter just to be aware of the outage status without having to go look for it.


Since the service is timing out, I think we'd all be best served in this case by having robust internal monitoring.


First we replaced SMTP that was distributed and designed to be resilient to outages (i.e. not lose messages) with proprietary HTTP APIs and now we complain that they don't notify us via email when they are down. Nice.

I am guilty of using SES myself, but it's sad to see email becoming increasingly centralized.


We use SendGrid for transactional email, but handle (solicited) bulk email in-house. The centralization of email is at least partly due to the fact that ensuring email gets delivered is difficult and time-consuming.

While I'm sure a lot of SendGrid's customers don't want (or don't know how) to configure a mail server, there are other customers who know that delivering email isn't as simple as installing Postfix. The rise of centralized email services is the inevitable byproduct.


Yeah, I know. Not blaming SendGrid in the slightest. Spammers have spoiled it for us all. Freedom disappearing because of a minority of abusers seems to be a common pattern...


Actually SendGrid recommends using SMTP. And their API endpoint is very SMTP-like.

As services go, its lock-in seems very minimal.



That doesn't belong in this discussion. This isn't reddit.


Gah! I totally forgot to ask for your permission and editorial insight before I posted. Please accept my apology, and thanks for staying on top of things.


FWIW, we use Amazon SES and enjoy it immensely.


Looks like she got the boot. http://www.facebook.com/SendGrid


Great for Adria: more blog hits. Not so great for SendGrid: a PR disaster.


all thnx to adria and her cyber bullying


All thanks to a dickhead with a botnet



