Facebook Explains Largest-Ever Crash: 10-Point Guide to What Went Down
Facebook suffered its largest-ever outage on Monday night, which took down WhatsApp, Instagram, Messenger and even Facebook’s internal tools for over six hours. The social giant blamed the outage across its platforms on configuration changes made to routers that coordinate network traffic. “This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt,” Facebook vice president of infrastructure Santosh Janardhan said in a blog post.
In the blog post, published late on Monday, the company did not specify who executed the configuration change or whether it was planned. However, there are reports that the outage was caused by an internal mistake.
Here’s what we know about Facebook’s largest-ever crash:
- On Monday, Facebook and all its platforms (WhatsApp, Instagram, Messenger and Oculus), as well as the company’s internal tools, went offline at 9pm in India and remained offline for over six hours.
- The company’s own internal tools and even offices that required security badges were made inaccessible owing to this issue, which caused additional disruption and delays in finding a fix.
- A poster on Reddit claiming to be an insider also said that the lower staffing in data centres due to pandemic measures made it harder to get the fix in place.
- Facebook blamed the issue on a faulty configuration change but did not give out more details.
- Put simply, routers coordinate data traffic inside Facebook, and a change to their settings meant that Facebook’s computers could not talk to each other. According to an insider’s post on Reddit, DNS, the “address book” used to find Facebook services on the Internet, stopped working because of a faulty update to BGP (Border Gateway Protocol), which is used to send data to the right place.
- “Facebook basically locked its keys in its car,” tweeted Jonathan Zittrain, director of Harvard’s Berkman Klein Center for Internet and Society.
- Shares of Facebook fell 4.9 percent, their biggest daily drop since last November.
- “To every small and large business, family, and individual who depends on us, I’m sorry,” Facebook Chief Technology Officer Mike Schroepfer tweeted.
- With Facebook down, Twitter reported higher-than-normal usage, which caused some problems for people trying to access posts and direct messages.
- Signal also benefitted from the WhatsApp outage, tweeting that millions of new users signed up on Monday.
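
For readers who want to see how a routing change can take an “address book” offline, the DNS and BGP point in the list above can be illustrated with a toy model. This is only a simplified sketch, not Facebook’s actual setup: the `ToyInternet` class, the `example.test` domain and the documentation-range addresses are all hypothetical. It assumes that a DNS server can answer queries only while a route to its address is still being announced.

```python
import ipaddress


class ToyInternet:
    """Toy model (hypothetical): announced routes plus DNS servers."""

    def __init__(self):
        self.announced = set()   # prefixes currently advertised, as BGP would do
        self.nameservers = {}    # domain -> (nameserver IP, {host: IP})

    def announce(self, prefix):
        # Advertise a route so the rest of the network can reach the prefix.
        self.announced.add(ipaddress.ip_network(prefix))

    def withdraw(self, prefix):
        # Remove the route, as a faulty configuration change might.
        self.announced.discard(ipaddress.ip_network(prefix))

    def reachable(self, ip):
        # An address is reachable only if some announced prefix covers it.
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.announced)

    def add_nameserver(self, domain, ns_ip, records):
        self.nameservers[domain] = (ns_ip, records)

    def resolve(self, domain, host):
        # The lookup fails if the nameserver itself cannot be reached.
        ns_ip, records = self.nameservers[domain]
        if not self.reachable(ns_ip):
            return None
        return records.get(host)


net = ToyInternet()
net.announce("192.0.2.0/24")  # documentation-only address range
net.add_nameserver("example.test", "192.0.2.53", {"www": "192.0.2.10"})

before = net.resolve("example.test", "www")  # route present, lookup works
net.withdraw("192.0.2.0/24")                 # the faulty update
after = net.resolve("example.test", "www")   # nameserver now unreachable

print(before, after)
```

In this sketch the servers themselves never stop running. Once the route is withdrawn, nothing can find them, which is the sense in which the “address book” stopped working.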