Microsoft Azure Outage Blamed on Bad Code
Microsoft has corrected a software bug in its Azure cloud storage service that triggered a massive outage when it was deployed widely on Nov. 18. In some cases the bug sent storage servers into infinite loops, tying them up and dragging much of the Azure cloud into a drastic slowdown.
As a result, connection rates to Azure in Northern Virginia dropped from 97% to 7%-8% after 7 p.m. Eastern on Nov. 18. The Azure data center in Dallas briefly suffered a complete outage. Data centers in Europe didn’t recover until well into the following day.
Microsoft had tested the storage update code before deployment, but contrary to its own best practices, Azure administrators rolled it out to Azure storage services as a whole instead of “flighting” it, that is, limiting the rollout to small sections at a time.
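To make the idea of flighting concrete, here is a minimal sketch of a staged rollout loop in Python. It is an illustration only, not Microsoft's tooling: the function `flighted_rollout`, its parameters, and the cluster names are all hypothetical, and the health check is assumed to be something the operator supplies.

```python
from typing import Callable, List, Sequence


def flighted_rollout(
    clusters: Sequence[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    batch_size: int = 1,
) -> List[str]:
    """Deploy to clusters in small batches and stop at the first unhealthy batch.

    Returns the clusters that received the update, in deployment order.
    All names here are hypothetical; this is a sketch, not a real deployment API.
    """
    updated: List[str] = []
    for start in range(0, len(clusters), batch_size):
        batch = clusters[start:start + batch_size]
        for cluster in batch:
            deploy(cluster)
            updated.append(cluster)
        # Check the batch before touching anything else, so a bad update
        # stays confined to the clusters already deployed.
        if not all(is_healthy(cluster) for cluster in batch):
            break
    return updated


if __name__ == "__main__":
    # Hypothetical scenario: the update misbehaves on "region-a".
    touched = flighted_rollout(
        ["canary", "region-a", "region-b", "region-c"],
        deploy=lambda cluster: None,
        is_healthy=lambda cluster: cluster != "region-a",
    )
    print(touched)
```

In this sketch the rollout halts after "region-a" fails its check, so "region-b" and "region-c" never receive the update. Deploying to every cluster at once, by contrast, would expose all of them before any health check could run.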
More: Microsoft Azure Outage Blamed on Bad Code - InformationWeek