Last Monday, October 4th, 2021, millions of netizens around the world were in shock, and so was I. All of a sudden, we couldn’t reach WhatsApp, Instagram, or Facebook! Some said it was because of the Domain Name System (DNS), but was it really?
What I experienced last week, for sure, was that my WhatsApp app kept loading with no success. At first, I thought something was wrong with my internet service provider, but then I found out that the hashtag #facebookdown had started trending on Twitter that day (yes, Twitter worked just fine).
Facebook and all of its affiliated companies and services--like Instagram, WhatsApp, and Oculus--went down for about six hours. Starting around 15:40 UTC, Facebook suddenly went off the grid, as if someone in there had pulled out all the cables and disconnected it from the Internet.
Well, it turned out to be more complicated than that, and explaining the cause of the Facebook outage takes a few technical terms. But don’t worry, we’re going to get through it step by step, as simply as possible. Ready?
It All Started With...
I don’t know if you are familiar with the American horror franchise movie “Final Destination”. Well, I know not everyone has watched this movie, but please bear with me just a little.
In the movie, bad things are mostly caused by something that looks “insignificant”. But, this small event then leads to catastrophe. In one scene, for example, a train accident is actually caused by a single rat, chewing on some snack near open electrical cables on a subway line. And then, BANG!
Apparently, a similarly “insignificant” act also caused the Facebook outage. Not a rat, surely, but what Facebook described as “a wrong command during routine maintenance work”. And this “wrong command” took Facebook completely off the internet.
Facebook periodically runs maintenance jobs to make sure all systems are running smoothly. They said that during one of these routine maintenance jobs, a command was issued to assess the availability of global backbone capacity. Unintentionally, it took down all the connections in Facebook’s backbone network.
Let’s take a step back. So, what is a backbone network? According to Wikipedia, a backbone network is a part of a computer network that interconnects networks, providing a path for the exchange of information between different LANs or subnetworks.
For Facebook, this so-called “path” is made of fiber-optic cables that connect all of their computing facilities together. A company as big as Facebook builds its own backbone network, which consists of “tens of thousands of miles of fiber-optic cables” crossing the globe and linking all their data centers.
You may wonder how a giant high-tech company such as Facebook could make such a seemingly minor mistake. Well, nobody is perfect. Outages have happened before at Google, Slack, and other internet-based companies. Still, that’s no excuse.
In this case, Facebook admitted that “a bug” had let the command go “mad”. Facebook said their systems are designed to audit commands like this to prevent mistakes, but a bug in the audit tool prevented it from properly stopping the command. The result was a complete loss of connections between their data centers and the internet.
If we had typed “Facebook.com” in the browser last Monday, it would have shown “This site can’t be reached” because the server’s IP address could not be found. In short, Facebook’s Domain Name System (DNS) records became unreachable. According to ZDNet, both DNS and Border Gateway Protocol (BGP) were down. (Don’t panic, we’ll explain these two alien terms shortly.)
To make things worse, Facebook had lost its connection to the outside world, which made remote access impossible. So, engineers had to physically go onsite to the data centers to debug the issue and restart the systems themselves.
DNS and BGP explained
Now that the problem is solved and Facebook is fully up and running again, it’s a good time to build an overview of how the internet works. And as promised, let’s decode DNS and BGP!
Think of DNS as a phone book and BGP as the navigation system. People don’t usually remember strings of numbers; they remember names. We probably remember our parents’ names, but not always their phone numbers, do we?
For instance, our website potatopirates.game has an IP address (a unique set of numbers) listed on a DNS server. But our customers don’t remember it; they just know our brand’s name. So, they type “www.potatopirates.game” in their browser instead of typing the long IP address.
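If you like seeing ideas in code, here is a tiny sketch of the phone book idea in Python. It is only a toy: real DNS involves resolvers, caches, and many servers talking to each other. The `PHONE_BOOK` table and the `resolve` function are made-up names for illustration, and the IP address comes from a range reserved for documentation, so it is not the real address of our site.

```python
# Toy "phone book": hostnames mapped to IP addresses.
# 203.0.113.10 is from a documentation-only range (an assumption for
# illustration), not the real address of any website.
PHONE_BOOK = {
    "www.potatopirates.game": "203.0.113.10",
}


def resolve(hostname, records=PHONE_BOOK):
    """Return the IP address listed for a hostname, or None if there is no entry."""
    # Hostnames are case-insensitive, so normalize before looking up.
    return records.get(hostname.lower())


print(resolve("www.potatopirates.game"))  # the listed address
print(resolve("not-in-the-book.example"))  # None: no entry, no address
```

When the name is in the book, you get a number back. When it isn't, you get nothing, and that is roughly what a browser is complaining about when it says the server’s IP address could not be found.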
Now, this is where BGP comes in. Once your computer knows the IP address, BGP works out the best route across the internet’s many networks so that the www.potatopirates.game page appears on your screen in no time. Trust me, there are so many routes and options on the internet that sometimes it gets messy!
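To make the navigation idea a little more concrete, here is a rough Python sketch. It assumes the simplest possible rule: pick the route that passes through the fewest networks. Real BGP weighs several other things first (such as each network’s own preferences), so treat this as a cartoon version. The `best_route` function is a hypothetical helper, and the network numbers are placeholders from a range reserved for documentation.

```python
def best_route(routes):
    """Pick the route that crosses the fewest networks; None if there are no routes."""
    if not routes:
        return None
    # Each route is a list of network numbers the traffic would pass through.
    return min(routes, key=len)


# Three made-up ways to reach the same website (placeholder network numbers).
routes_to_site = [
    [64496, 64497, 64500],
    [64496, 64500],
    [64496, 64498, 64499, 64500],
]

print(best_route(routes_to_site))  # the two-hop route
print(best_route([]))              # None: nobody has told us a way to get there
```

Notice the last line: if no route has been announced at all, there is simply nowhere to send the traffic.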
In Facebook’s case, their entire backbone network was taken out of operation. Facebook’s DNS servers could no longer reach the data centers, so they declared themselves “unhealthy”. These DNS servers then withdrew their BGP route announcements, which left Facebook unable to tell the rest of the internet where to find it.
As a result, the Facebook page didn’t appear even though we were connected to the internet and had typed the web address correctly. It was as if Facebook didn’t exist!
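Here is a small Python sketch of that disappearing act. It is a simplified model, not how Facebook’s or anyone’s routers are actually built: the `RoutingTable` class is invented for illustration, and the addresses and network numbers are placeholders from ranges reserved for documentation.

```python
import ipaddress


class RoutingTable:
    """A toy table of announced address blocks and the route to each."""

    def __init__(self):
        self.routes = {}  # address block -> list of network numbers to cross

    def announce(self, prefix, path):
        # "Hello internet, this block of addresses can be reached this way."
        self.routes[ipaddress.ip_network(prefix)] = path

    def withdraw(self, prefix):
        # "Forget what I said, this block is no longer reachable through me."
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the route to an address, or None if no block covers it."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        # Prefer the most specific block that contains the address.
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


table = RoutingTable()
table.announce("203.0.113.0/24", [64496, 64500])
before = table.lookup("203.0.113.53")  # a server inside the announced block

table.withdraw("203.0.113.0/24")
after = table.lookup("203.0.113.53")   # same server, but the route is gone

print(before, after)
```

The server itself never moved. Once the announcement was withdrawn, though, nobody on the outside had directions to it anymore.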
That’s perhaps the simplest explanation. But in reality, it’s pretty much impossible to explain how the internet works in just one sitting.
Admittedly, how the web actually works is pretty complicated. Just remember how allergic some people are when they hear the words “coding” or “computer science”. So, yes it takes time and a lot of effort.
If you’re still interested in understanding the internet, though, there is a rather fun way of doing it: board games! There are many games out there that can help you build the right understanding of the internet and how it works.
One that is worth considering is Potato Pirates: Enter the Spudnet. This board game teaches networking and cybersecurity concepts, interestingly, without a computer! You can also learn about possible dangers online and how to stay safe in cyberspace.
Moreover, this board game is designed for ages 10 and up, so it’s easy to understand, and most importantly, it’s a fun game for Saturday game night with all your family members. Perfect for gamers, parents, and educators alike, or for people who want to build an interest in computer science.
So, who’s up for a game night?