Different outage. The Gmail postmortem is linked in another thread, but the gist was that "gmail.com" is a configuration value that can be changed at runtime, and someone changed the configuration. Thus, *@gmail.com stopped being a valid address, and they returned "that mailbox is unavailable".
In that document they seem to think they have solved the issue as of the 15th. But that is far from true. As of yesterday I was still getting unsubscribed from email lists due to bounces on my @gmail.com account. But that's not the worst.
As of yesterday, some Google email customers such as NOAA.gov cannot receive email from external mailservers (like the one I run for my personal domain). Their mail is now proxied through some "security consultant service" à la mx.us.email.fireeyegov.com, which causes SPF validation to fail because the IP delivering the message is no longer the external mailserver's.
Received-SPF: fail (google.com: domain of superkuh@superkuh.com does not designate 209.85.219.72 as permitted sender) client-ip=209.85.219.72;
Note that IP, 209.85.219.72: it's not my mailserver's IP. It's an IP that Google owns and uses with their new setup to forward email for (some) government accounts.
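For anyone who wants to see why a forwarding hop breaks this, here is a minimal sketch of the IP check behind that Received-SPF line. It only handles `ip4:` mechanisms and the trailing `all`, which is far less than a real SPF evaluator does, and both the record and the 203.0.113.25 address are hypothetical placeholders (a documentation-range IP), not my actual SPF record.

```python
import ipaddress

# Map SPF qualifiers to result names.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

# Hypothetical record: one permitted sender IP, hard-fail everything else.
EXAMPLE_RECORD = "v=spf1 ip4:203.0.113.25 -all"


def spf_ip4_check(record: str, client_ip: str) -> str:
    """Evaluate only the ip4: mechanisms and 'all' of an SPF record.

    Simplified sketch: ignores a, mx, include, redirect, macros, etc.
    """
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return RESULTS[qualifier]
        elif term == "all":
            return RESULTS[qualifier]
    return "neutral"  # no mechanism matched


if __name__ == "__main__":
    # Delivered directly from the permitted server.
    print(spf_ip4_check(EXAMPLE_RECORD, "203.0.113.25"))
    # Delivered via a forwarder, so the receiver sees the forwarder's IP.
    print(spf_ip4_check(EXAMPLE_RECORD, "209.85.219.72"))
```

The receiver only ever sees the IP of whichever host made the final connection, so once a forwarder sits in the middle, the check runs against the forwarder's address instead of the original sender's.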
I've re-subscribed, using my personal domain/mailserver, to the email lists that Gmail's behavior got me dropped from. It's incredible that a random $5/mo VPS has given me better uptime over the last decade than all of Google's infrastructure.
The odds that they would happen simultaneously if they’re completely unrelated seem astronomically small, surely?
Both are noted as being related to “ongoing migrations,” though AFAICT not related ones. I would bet there’s a human-factor connection, e.g., the day before there was a big meeting where higher-level management gave multiple ops teams the go-ahead on their respective plans, resulting in multiple potentially breaking changes occurring at the same time.
I think it's as simple as a case of the Mondays; you wouldn't roll out a migration like that on a Friday or the weekend, and rolling it out on a Monday gives you the least chance of problems occurring on those days.
I think the likelihood of two incidents happening in the same period is not astronomically small, and is a variant of the birthday problem. It's a bit counterintuitive, but if you have a few incidents during a year, the probability of having two incidents in the same week is a lot higher than you would expect.
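To put rough numbers on that, here is a minimal sketch of the birthday-style calculation. It assumes incidents are independent and equally likely to land in any of 52 calendar weeks, which is a simplification and not a claim about how outages are actually distributed; the function name is made up for illustration.

```python
def p_shared_week(incidents: int, weeks: int = 52) -> float:
    """Probability that at least two of `incidents` fall in the same week.

    Assumes each incident lands in a uniformly random week, independently
    of the others (the classic birthday-problem setup).
    """
    if incidents > weeks:
        return 1.0  # pigeonhole: more incidents than weeks forces a collision
    p_all_distinct = 1.0
    for i in range(incidents):
        # The i-th incident must avoid the i weeks already taken.
        p_all_distinct *= (weeks - i) / weeks
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (2, 3, 5, 8, 10):
        print(f"{n} incidents/year: {p_shared_week(n):.1%}")
```

With two incidents the chance is 1 in 52, but it climbs quickly as the count grows, since every pair of incidents is a chance for a collision.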
The birthday problem involves random people with unrelated birthdays. This is like two not-random siblings both calling in sick in the same week. The case of two large Google outages where the whole service gets conked has much more potential to have a shared or related cause than two birthdays happening at once. You're right that it's not astronomically small, but the odds of them being related seem healthier than the odds of them being unrelated.
If you don't want to scroll to the other thread, here's the postmortem: https://static.googleusercontent.com/media/www.google.com/en...