An Amazon employee inadvertently took down part of the internet
March 2, 2017 01:40 PM
(Bloomberg)—Amazon.com Inc. said efforts to fix a bug in its cloud-computing service caused prolonged disruptions Tuesday that affected thousands of websites and apps, from retailers to project-management and expense-reporting tools to commuter alerts.
An Amazon Web Services employee working on the issue accidentally switched off more computer servers than intended at 9:37 a.m. Seattle time, resulting in errors that cascaded through the company’s S3 service, Amazon said in a statement Thursday. Nearly 150,000 sites, including ESPN.com and aol.com, use S3 to house data, manage apps and deliver software downloads, according to SimilarTech.com.
"We are making several changes as a result of this operational event," Amazon, No. 1 in the Internet Retailer 2016 Top 500 Guide, said in a statement Thursday. "While removal of capacity is a key operational practice, in this instance, the tool used allowed too much capacity to be removed too quickly. We have modified this tool to remove capacity more slowly and added safeguards to prevent capacity from being removed when it will take any subsystem below its minimum required capacity level."
The websites for Express Inc. (No. 99 in the Top 500), Lululemon Athletica Inc. (No. 96) and One Kings Lane (No. 104) went down completely when AWS S3 experienced problems at a northern Virginia data center for about four hours, according to an analysis by Apica, a website monitoring and optimization firm. The outage did not affect Amazon or Amazon-owned Zappos.com, Apica says. Data from web analytics companies SimilarWeb and SimilarTech show that Amazon is the leading shopping site that uses AWS S3, with about 2.47 billion global monthly visits.
A major failure from what appears to be a minor maintenance procedure highlights that AWS, and the cloud computing industry in general, still have some maturing to do, says Ed Anderson, an analyst at Gartner Inc. "The fact that an incorrect keyboard entry could bring down an entire region shows they have some operational issues," Anderson says. "Even though they are the world’s biggest cloud provider, they still have some work to do to refine their processes."
AWS is Amazon’s fastest-growing and most profitable division, generating $3.5 billion in revenue in the fourth quarter. It’s the biggest public cloud-services provider, with data centers around the world that handle the computing power for many large companies, such as Netflix Inc. and Capital One Financial Corp. Amazon and competitors like Microsoft Corp. and Alphabet Inc.’s Google are growing their cloud businesses as customers find it more efficient to shift their data storage and computing processes to the cloud rather than maintain those functions on their own. Widespread adoption also increases the likelihood that problems with one service can have sweeping ramifications online.