I'm not a backend person (servers, databases, etc.), other than setting up the basics and letting the uber nerds handle the rest :), but it seems that an app that important should have had a fallback for any kind of server failure. My web host has several data centers, for example, as do all of the big hosting companies. This article may help you; it seems to have some good info about the ORCA failure...
ORCA Woes