Portions of the Internet panicked yesterday when Gmail was hit by an outage that lasted for an agonizing 18 minutes. The outage coincided with reports of Google’s Chrome browser crashing. It turns out that the culprit was Chrome’s sync server, which allows users to sync bookmarks and other browser settings across multiple computers and mobile devices.

Ultimately, it was human error. Google engineer Tim Steele explained the problem’s origins in a developer forum:

  • Chrome Sync Server relies on a backend infrastructure component to enforce quotas on per-datatype sync traffic.
  • That quota service experienced traffic problems today due to a faulty load balancing configuration change.
  • That change was to a core piece of infrastructure that many services at Google depend on. This means other services may have been affected at the same time, leading to the confounding original title of this bug [which referred to Gmail].
  • Because of the quota service failure, Chrome Sync Servers reacted too conservatively by telling clients to throttle “all” data types, without accounting for the fact that not all client versions support all data types.

The crash is due to faulty logic responsible for handling “throttled” data types on the client when the data types are unrecognized.
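To make the failure mode concrete, here is a minimal sketch of how a client could handle a throttle list that contains types it does not recognize. This is not Chrome's actual code: the `DataType` enum, the type names, and `parse_throttled_types` are all hypothetical, chosen only to show unknown names being set aside rather than treated as fatal.

```python
from enum import Enum


class DataType(Enum):
    # Hypothetical set of data types this client version understands.
    BOOKMARKS = "bookmarks"
    PREFERENCES = "preferences"
    PASSWORDS = "passwords"


def parse_throttled_types(raw_names):
    """Split server-sent type names into recognized types and unknown names.

    Unknown names are collected instead of raising, so a newer server
    cannot take down an older client by naming a type it has never seen.
    """
    known = set()
    unknown = []
    for name in raw_names:
        try:
            known.add(DataType(name))
        except ValueError:
            # Enum lookup by value raises ValueError for unrecognized names.
            unknown.append(name)
    return known, unknown


# "future_type" stands in for a data type added after this client shipped.
known, unknown = parse_throttled_types(["bookmarks", "passwords", "future_type"])
```

In this sketch the client would throttle only the types in `known` and could log or ignore whatever ends up in `unknown`.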

Had the Chrome sync service gone down entirely, the browser crashes would not have occurred. “In fact this crash would *not* happen if the sync server itself was unreachable,” Steele wrote. “It’s due to a backend service that sync servers depend on becoming overwhelmed, and sync servers responding to that by telling all clients to throttle all data types (including data types that the client may not understand yet).”
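The server-side half of the problem can be sketched the same way. The function below is an illustration under stated assumptions, not Google's implementation: `types_to_throttle` and its parameters are hypothetical, and it assumes the server knows which data types each client version supports. The idea is that even the conservative fallback is limited to types the client understands.

```python
def types_to_throttle(quota_service_healthy, over_quota_types, client_supported_types):
    """Return the set of data types a given client should be told to throttle.

    All names here are hypothetical. The result is always a subset of
    client_supported_types, so the client never sees an unfamiliar type.
    """
    supported = set(client_supported_types)
    if not quota_service_healthy:
        # Conservative fallback: throttle everything, but only the
        # types this client version is known to understand.
        return supported
    # Normal case: throttle only types that are actually over quota.
    return set(over_quota_types) & supported


# An older client that supports two types, while the quota backend is failing.
fallback = types_to_throttle(False, [], ["bookmarks", "preferences"])

# The same client when the backend is healthy and reports two types over quota,
# one of which ("future_type") the client does not know about.
normal = types_to_throttle(True, ["bookmarks", "future_type"], ["bookmarks", "preferences"])
```

Either safeguard alone would have avoided the crash Steele describes; applying both means a mistake on one side does not depend on the other side being correct.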


via Ars Technica » Technology Lab http://feeds.arstechnica.com/~r/arstechnica/technology-lab/~3/mhb_IMDDJq0/
