Microsoft’s Outlook.com email service suffered a massive 16-hour outage yesterday, leaving users unable to access parts of SkyDrive, Hotmail and Outlook. Microsoft has revealed that the cause was a faulty firmware upgrade, which triggered “a rapid and substantial temperature spike” in the data centre.
It’s quite an unusual situation for a large-volume, high-profile web service. Indeed, fixing it took more than just a few code tweaks:
“Based on the failure scenario, there was a mix of infrastructure software and human intervention that was needed to bring the core infrastructure back online. Requiring this kind of human intervention is not the norm for our services and added significant time to the restoration.”
Microsoft doesn’t reveal exactly what that human intervention was. Whatever it was, with Microsoft pushing Outlook.com hard and sinking millions into moving Hotmail users across to it, the company can’t really afford incidents like this. [Microsoft via The Verge]