Facebook Blames Outage On Database Failure
For the techies:
Facebook officials said the lengthy outage that hit the site Thursday was the result of a glitch in the social media network’s database software.
“The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition,” said Robert Johnson, Facebook’s director of software engineering, in a blog post.
Johnson said software that’s designed to detect and fix such errors backfired, compounding the original problem. “An automated system for verifying configuration values ended up causing much more damage than it fixed,” said Johnson.
“The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store,” Johnson continued. “This works well for a transient problem with the cache, but it doesn’t work when the persistent store is invalid,” he said.
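For readers who want to see the shape of the problem, here is a minimal Python sketch of the check-and-replace pattern Johnson describes. It is not Facebook's code: the function names, the validity rule (a positive integer), and the in-memory dictionaries standing in for the cache and the persistent store are all assumptions made for illustration. Under those assumptions, it shows why the repair step settles quickly when only the cache is bad, but never settles when the persistent store itself holds an invalid value.

```python
from typing import Dict


def is_valid(value: int) -> bool:
    # Hypothetical validity rule: a config value must be a positive integer.
    return value > 0


def read_config(key: str, cache: Dict[str, int], store: Dict[str, int],
                stats: Dict[str, int]) -> int:
    """Return a config value, repairing the cache from the store if needed."""
    cached = cache.get(key)
    if cached is not None and is_valid(cached):
        return cached  # Fast path: the cache holds a good value.

    # Cached value is missing or invalid, so fall back to the persistent store.
    stats["store_queries"] += 1
    fresh = store[key]
    if is_valid(fresh):
        cache[key] = fresh      # Repair succeeds; later reads hit the cache.
    else:
        cache.pop(key, None)    # Nothing valid to cache; the next read retries.
    return fresh


def simulate(store_value: int, reads: int) -> int:
    """Count store queries for `reads` lookups, starting from a bad cache entry."""
    cache = {"timeout": -1}             # Invalid cached value.
    store = {"timeout": store_value}    # Value held by the persistent store.
    stats = {"store_queries": 0}
    for _ in range(reads):
        read_config("timeout", cache, store, stats)
    return stats["store_queries"]


if __name__ == "__main__":
    print("transient cache problem:", simulate(30, 1000), "store queries")
    print("invalid persistent store:", simulate(-1, 1000), "store queries")
```

In the first run the store holds a valid value, so one query repairs the cache and the remaining reads never touch the store. In the second run the store's value is itself invalid, so in this sketch every single read falls through to the store.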
[…]