NEW YORK — Internet disruptions tied to Amazon's cloud computing service affected people around the world Monday trying to connect to online services used for work, social media and video games.
About three hours after the outage began, Amazon Web Services said it was starting to recover from the problem. But the company later said it was continuing to respond to “significant” errors and connectivity issues across multiple services.
What is Amazon Web Services?
Amazon Web Services is a cloud computing provider that hosts many of the world’s most-used online services. AWS provides behind-the-scenes cloud computing infrastructure to many government departments, universities and businesses.
Seattle-based Amazon said the problems were centered in its Virginia-based US-EAST-1 data center region, one of its most important cloud hubs around the world. The region is a backbone “for so many services that when things go screwy, domino effects around the internet-as-we-know-it are enormous,” wrote John Scott-Railton, a cybersecurity researcher at Citizen Lab, in a social media post.
What happened?
AWS traced the source of the problem to something called the “DynamoDB endpoint in the US-East-1 Region,” in a pair of jargon-laden updates.
“DynamoDB isn’t a term that most consumers know, but it underpins the apps and services that all of us use every single day,” said cybersecurity expert Mike Chapple.
DynamoDB is a centralized database service that many internet-based services use to track user information, store key data and manage their operations, Chapple said by email.
It’s “one of the record-keepers of the modern internet,” said Chapple, an IT professor at the University of Notre Dame’s Mendoza College of Business. “It’s fast, it’s cheap, and it’s reliable. But today it stopped working and we saw the effects of that outage ripple across the internet.”
Amazon’s updates suggest the problem isn’t with the database itself, but rather that something went wrong with the records that tell other systems where to find their data, he said.
“Amazon had the data safely stored, but nobody else could find it for several hours, leaving apps temporarily separated from their data. It’s as if large portions of the internet suffered temporary amnesia,” Chapple said.
Amazon has attributed the outage to a domain name system issue. DNS is the service that translates human-readable internet addresses into the machine-readable IP addresses that connect browsers and apps with websites and underlying web services. DNS errors disrupt the translation process, interrupting the connection.
Because so many sites and services use AWS, a DNS error can have widespread results.
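The lookup step described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not a picture of how AWS or any affected app works internally: the function names `lookup` and `broken_resolver` and the hostname `database.example.com` are hypothetical, and the only real library call is Python's standard `socket.getaddrinfo`.

```python
import socket


def lookup(hostname, resolver=socket.getaddrinfo):
    """Return the IP addresses found for hostname, or an empty list if the lookup fails.

    `resolver` defaults to the operating system's DNS lookup; it is a
    parameter here only so that a failure can be simulated (an assumption
    made for illustration).
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # The name could not be translated. The server behind it may be
        # perfectly healthy, but the app has no address to connect to.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({info[4][0] for info in results})


def broken_resolver(hostname, port):
    """Hypothetical stand-in for a DNS system that has lost its records."""
    raise socket.gaierror("simulated DNS failure")
```

Calling `lookup("localhost")` would normally return the machine's own loopback address, while `lookup("database.example.com", resolver=broken_resolver)` returns an empty list: the data may still exist, but the app cannot find where it lives.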
Who was affected?
Internet users around the world faced widespread disruption because Amazon's problem took down dozens of major online services, including social media site Snapchat, the Roblox and Fortnite video games and chat app Signal.
On DownDetector, a website that tracks online outages, users reported issues with Snapchat, Roblox, Fortnite, online broker Robinhood, the McDonald’s app and many other services.
The risks of centralized cloud services
Some cybersecurity experts have warned for years about the potentially ugly consequences of allowing a handful of big tech companies to dominate key internet operations.
“So much of the world now relies on these three or four big (cloud) compute companies who provide the underlying infrastructure that when there’s an issue like this, it can be really impactful across a broad range, a broad spectrum” of online services, said Patrick Burgess, a cybersecurity expert at U.K.-based BCS, The Chartered Institute for IT.
“The world now runs on the cloud,” and the internet is seen as a utility like water or electricity, as we spend so much of our lives on our smartphones, Burgess said.
And because so much of the online world’s plumbing is underpinned by a handful of companies, when something goes wrong, “it’s very difficult for users to pinpoint what is happening because we don’t see Amazon, we just see Snapchat or Roblox,” Burgess said.
“The good news is that this kind of issue is usually relatively fast (to resolve)” and there’s no indication that it was caused by a cyber incident like a cyberattack, Burgess said.
Has this happened in the past?
This is not the first time issues with Amazon’s key services have caused widespread disruptions.
Many popular internet services and publishers went down during a brief outage in 2023. AWS’s longest outage in recent history occurred in late 2021, when companies in sectors ranging from airline reservations and auto dealerships to payment apps and video streaming services were affected for more than five hours. Other major outages happened in 2020 and 2017.
Unrelated to Amazon, a faulty software update by cybersecurity company CrowdStrike also rippled across the world to cause massive disruptions in 2024.