Delta Down and Cloud Outages...What Happened?




The Bottom Line

Large enterprises tend to have large, complex system environments with thousands of components and a mix of networking, server, and storage hardware.

Moreover, the ecosystem usually contains multiple versions of the same application, database, middleware, and operating system software, along with several generations of hardware. Keeping all of these elements patched, current, and in sync is a challenge at most companies.

Add to that the communications networks, uninterruptible power supplies (UPSs), automatic transfer switches (ATSs), and other switching gear needed to complete the picture, and then the backup equipment needed to provide redundancy. No matter who you are or how big or small your IT environment is, maintaining availability at 99.999 percent (or even over 99.5 percent) is a challenge, and it cannot happen without a good operating plan and a business continuity/disaster recovery (BC/DR) plan that are continually executed and tested.
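To make those availability targets concrete, the short sketch below converts a percentage into a yearly downtime budget. It is plain arithmetic, not a measurement of any real system, and the function name and the 365-day year are assumptions made for illustration. Under those assumptions, 99.999 percent allows a little over five minutes of downtime per year, while 99.5 percent allows close to 44 hours.

```python
# Minutes in a non-leap year (assumption: 365 days).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the budget for a few commonly quoted targets.
    for target in (99.5, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.1f} minutes/year ({minutes / 60:.2f} hours)")
```

The gap between the two ends of that range is roughly a factor of 500, which is why the target chosen drives so much of the redundancy and testing cost.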

Failure is inevitable, but it is IT's job to reduce the impact on the business to an acceptable level given the funds available.

While some computer systems and software stacks are designed to provide higher availability than others, achieving high availability goes beyond technology choices: it requires sound operating, redundancy, and BC/DR planning and processes that are assiduously followed. IT executives should work with corporate and line-of-business executives to determine the optimum service level requirements (including recovery point objectives and recovery time objectives, RPOs and RTOs) and the associated capital and operating costs.
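One way to turn agreed RPOs and RTOs into something testable is to compare them with how often data is actually backed up and how long a rehearsed restore actually takes. The sketch below is a minimal illustration of that comparison; the class, the function, and all of the numbers are hypothetical and do not come from any particular product or from the incident discussed here.

```python
from dataclasses import dataclass


@dataclass
class ServiceLevel:
    """Hypothetical container for the objectives agreed with the business."""
    rpo_minutes: float  # maximum tolerable data loss, expressed as time
    rto_minutes: float  # maximum tolerable time to restore service


def check_objectives(level: ServiceLevel,
                     backup_interval_minutes: float,
                     tested_restore_minutes: float) -> dict:
    """Compare current practice with the agreed objectives.

    Worst-case data loss is assumed to be one full backup interval, and the
    restore time is assumed to come from an actual recovery test.
    """
    return {
        "rpo_met": backup_interval_minutes <= level.rpo_minutes,
        "rto_met": tested_restore_minutes <= level.rto_minutes,
    }


if __name__ == "__main__":
    # Illustrative values only: 15-minute RPO, 4-hour RTO.
    level = ServiceLevel(rpo_minutes=15, rto_minutes=240)
    print(check_objectives(level, backup_interval_minutes=60,
                           tested_restore_minutes=180))
```

In this illustrative case the hourly backup misses the 15-minute RPO even though the tested restore fits within the RTO, so the choice is either to fund more frequent replication or to renegotiate the objective.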

This will not eliminate downtime, but it should set the right level of expectations, which can then serve as the basis for backup procedures to be executed when systems are unavailable.


About the author

Cal Braunstein

Mr. Braunstein serves as Chairman/CEO and Executive Director of Research at the Robert Frances Group (RFG). In addition to his corporate role, he helps his clients wrestle with a range of business, management, regulatory, and technology issues. 
He has deep and broad experience in business strategy management, business process management, enterprise systems architecture, financing, mission-critical systems, project and portfolio management, procurement, risk management, sustainability, and vendor management. Cal also chaired a Business Operational Risk Council whose membership consisted of a number of top global financial institutions.