DRaaS is the answer to all those questions you hope to never ask: What if my critical applications are compromised by the next cyber attack? What if my infrastructure gets knocked out in a disaster? What if my business can’t protect itself?
Disaster Recovery as a Service (DRaaS) gives you a plan for dealing with any outage, no matter how unexpected or sinister the cause. Because a good DRaaS solution is tailored to the business and its infrastructure needs, the same plan works whatever triggers the outage. With catastrophes both natural and human-induced to plan for, cloud-based disaster recovery services are now a necessary part of IT security planning.
When Forbes Insights and IBM surveyed 184 senior IT execs, 80 percent fully believed their Disaster Recovery plans could keep their business running. Of that same group though, only 22 percent had strategies that covered every critical application. That means a full 78 percent of surveyed businesses simply aren’t prepared to survive a disaster without significant downtime.
That number may even be higher. Self-reporting can’t tell us what shape those backup and disaster recovery plans are in. With cyber threats getting consistently more sophisticated and natural disasters becoming more common, it takes the 24/7 focus of DRaaS providers to stay ahead of potential problems. Here’s how we do it.
What is Disaster Recovery as a Service?
To understand the definition of DRaaS you have to start by understanding all the pieces that go into keeping a business online after disaster recovery events. We break our business continuity services into five parts:
- Backup
- Archive
- Replication
- Disaster Recovery (DR)
- Disaster Recovery as a Service
Backup is a simple way to copy and later retrieve applications, files, and data from a location outside of your production environments. The same way you would back your computer up onto a hard drive, your business needs a super-charged, usually cloud-based, storage solution for every application and set of data it needs to run. That way, if there is a disaster, you can fail over to your backup and continue operations as normal.
Archive is the cold storage of backups and data that don’t need to be recovered to maintain production. Archival tools are typically used for data with long-term security, legal, or compliance retention requirements — think the digital version of the filing cabinet with tax forms from five years ago. You need to keep it, but you don’t need to look at it often.
Replication ensures your Backup and Archive have the most-recent versions of all the information you need them to store. It relies on a frequent, even real-time, process for copying applications and data from one location to another, so information matches across systems.
Disaster Recovery combines backup, archival, and replication to protect your data. The objective of DR is a formal IT disaster recovery plan (DRP) that can be followed in an emergency to restore applications and services. This plan is a collection of DR software, policies, procedures, and actions that are clearly understood throughout an organization, tested regularly, and quickly executed in the event of a disaster.
Disaster Recovery as a Service is, effectively, managed Disaster Recovery services. That’s the definition of the “as a Service” part of DRaaS. SCTG, for example, takes on the work of keeping you operational, and adds things like failover testing to ensure the DRP will always work. It also turns the Disaster Recovery process into a cloud-based solution, moving backups and archives to our data centers. The entire DRaaS service is wrapped in a Service Level Agreement (SLA) that we write together to meet your operating objectives.
The right business continuity plan for your business could be as affordable as Managed Backup or as reliable as fully managed Disaster Recovery as a Service. Understanding each component allows you to pick and choose, so you get all the benefits of disaster recovery you need without paying for the things you don’t.
How a DRaaS solution is built
We follow a multi-step process to build cloud-based disaster recovery solutions tailored to your business needs:
- Discovery
- Planning
- Validation
- Development
- Testing and Deployment
- Management
The first step, Discovery, is a process in its own right. We start by working with the business to answer a series of questions:
- What’s the current status of your DR strategy?
- What are the recovery time objectives (RTOs) and recovery point objectives (RPOs) for each application?
- Can you set recovery prioritizations for each application, so we know in what order they need to be brought back online?
- What are the compute, memory, storage, and network footprints for each application?
- Can you define the minimum viable infrastructure for recovery of each application, so we know which need full coverage in a disaster and which don’t?
- How exactly do end users connect to these applications?
- What are the security and compliance requirements should a disaster event occur?
- What specifically needs to happen in order for an event to be declared and a failover action to be executed?
We ask such granular questions so we can define exactly what DRaaS must accomplish to keep the business operational. Applications and processes are likely to have significantly different requirements from one another, so we need to know which are critical and then quantify how critical each actually is. Knowing which applications need instantaneous failover and which can wait lets us define the requirements and create a plan that ensures business continuity.
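To make that concrete, here is a minimal Python sketch of how the Discovery answers might be recorded and turned into a recovery order. The `AppProfile` fields, the sample applications and their numbers, and the 15-minute cutoff for instantaneous failover are all illustrative assumptions, not SCTG's actual tooling or thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Discovery answers for one application (all fields are illustrative)."""
    name: str
    rto_minutes: int  # recovery time objective: longest tolerable downtime
    rpo_minutes: int  # recovery point objective: largest tolerable window of lost data
    priority: int     # 1 = bring back online first


def recovery_order(apps):
    """Sort by stated priority, breaking ties in favor of the tighter RTO."""
    return sorted(apps, key=lambda app: (app.priority, app.rto_minutes))


def needs_instant_failover(app, cutoff_minutes=15):
    """Flag applications whose RTO is at or under an assumed cutoff."""
    return app.rto_minutes <= cutoff_minutes


# Hypothetical inventory; real values come out of Discovery.
inventory = [
    AppProfile("reporting", rto_minutes=1440, rpo_minutes=720, priority=3),
    AppProfile("billing", rto_minutes=60, rpo_minutes=15, priority=2),
    AppProfile("order-entry", rto_minutes=10, rpo_minutes=5, priority=1),
]

ordered_names = [app.name for app in recovery_order(inventory)]
instant = [app.name for app in inventory if needs_instant_failover(app)]
```

In this made-up inventory, order entry comes back first and is the only application flagged for instantaneous failover, while reporting can wait.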
In the Planning phase, SCTG takes all of the answers to the above questions and turns them into a plan. Our solution architects, network engineers, virtualization experts, storage experts, data center operations managers, provisioning team, and professional services managers work together to develop a cloud-based Disaster Recovery as a Service solution that meets the requirements we identified during the Discovery phase.
Validation is the step where the most give-and-take occurs. This is when we meet with you to present our plan and get your feedback. Then we discuss and tweak the project timelines, budget, and scope. The goal is to create a solution that’s just right, so everyone is happy and comfortable, and all needs are accounted for.
Development is when the infrastructure for this solution starts to take shape. In this phase, SCTG builds out the event mitigation processes, network elements, security/compliance requirements, compute/memory/storage resources, and everything else required to give you a functioning solution. The Development phase includes a lot of tactical back-and-forth while we connect to your systems and build your DRaaS infrastructure.
Testing and Deployment
Testing and Deployment is when we put all that work through its paces, making sure the infrastructure performs as required. When we get to the Deployment phase, we run our first round of failover testing to simulate a disaster state and check to see that the infrastructure is set and all applications perform as intended in a failover state. Our manual failover testing allows us to validate:
- VM/VPG failover
- VM/VPG point-in-time failover
- Complete protected site disaster
- VPG failback & move
- IP address reconfiguration
- Automatic testing
- Interoperability of vMotion
- Interoperability of protected VM Storage vMotion
- Recovering a shutdown VM
- Alarm system functionality
We continue failover testing in the Management phase. At this point, the solution is deployed and we’re providing ongoing maintenance and administration of all core business continuity elements, including network, security, compute, storage, failover testing, and event mitigation.
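One way to picture how failover test results feed back into the plan is the short Python sketch below. The scenario names come from the checklist above; the pass/fail values, the timings, and the 30-minute RTO are made-up examples, and `summarize` is a hypothetical helper rather than part of any DR product.

```python
from dataclasses import dataclass


@dataclass
class FailoverTestResult:
    scenario: str            # which failover test was run
    passed: bool             # did the application come up as intended?
    recovery_minutes: float  # measured time to recover in the test


def summarize(results, rto_minutes):
    """Split results into outright failures and passes that missed the RTO."""
    failed = [r.scenario for r in results if not r.passed]
    over_rto = [
        r.scenario
        for r in results
        if r.passed and r.recovery_minutes > rto_minutes
    ]
    return {
        "failed": failed,
        "over_rto": over_rto,
        "ready": not failed and not over_rto,
    }


# Hypothetical results from one round of testing.
results = [
    FailoverTestResult("VM/VPG failover", passed=True, recovery_minutes=12.0),
    FailoverTestResult("VPG failback & move", passed=True, recovery_minutes=41.5),
    FailoverTestResult("IP address reconfiguration", passed=False, recovery_minutes=0.0),
]

summary = summarize(results, rto_minutes=30)
```

A test that passes but takes longer than the RTO is tracked separately from an outright failure, since the fix is usually different: tuning the environment rather than repairing it.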
The results of a Disaster Recovery planning process
There are many benefits of cloud-based Disaster Recovery, peace of mind chief among them. All depend on the tangible results of this process: a backup environment (or environments) and a runbook.
Most often we seek to keep DRaaS online with disaster recovery as a cloud service. It allows for easier maintenance, more frequent backups and shorter RTO and RPO windows. We do have clients who require non-virtualized DR environments, however, whether for regulatory or security purposes.
Disaster Recovery hardware solutions depend on the same information virtualized environments use, but they manage replication using snapshot technologies. Whether that means independent apps or in-SAN replication capabilities, the RTO and RPO windows are completely different.
There is, for example, no possibility of real-time backups with a non-virtualized environment.
To address that difference and others, we ask a handful of unique questions in the Discovery phase for clients looking for a server-based DR environment separate from the cloud.
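The gap between snapshot-based and real-time replication is easiest to see with a little arithmetic. As a rough rule of thumb (an assumption for illustration, not a vendor formula), the most data a snapshot-based setup can lose is about one snapshot interval plus the time it takes to copy that snapshot to the recovery site. The function names and numbers below are hypothetical.

```python
def worst_case_data_loss_minutes(snapshot_interval_minutes, transfer_minutes=0):
    """Rough upper bound on lost data: one full interval plus copy time."""
    return snapshot_interval_minutes + transfer_minutes


def meets_rpo(rpo_minutes, snapshot_interval_minutes, transfer_minutes=0):
    """Check whether a snapshot schedule can satisfy a given RPO."""
    worst_case = worst_case_data_loss_minutes(
        snapshot_interval_minutes, transfer_minutes
    )
    return worst_case <= rpo_minutes


# Hourly snapshots that take 10 minutes to copy off-site.
hourly_worst_case = worst_case_data_loss_minutes(60, transfer_minutes=10)

# A 4-hour RPO is achievable on that schedule; a 15-minute RPO is not.
fits_relaxed_rpo = meets_rpo(240, 60, transfer_minutes=10)
fits_tight_rpo = meets_rpo(15, 60, transfer_minutes=10)
```

Under these assumptions, an application with a 15-minute RPO would need either much more frequent snapshots or a replication method that copies changes continuously.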
The benefit of having a disaster recovery runbook
The No. 1 benefit of DRaaS, though, is a unique runbook that outlines the business continuity and disaster recovery steps a business needs to take. We build a unique runbook for every SCTG customer based on their infrastructure architecture and RTO/RPO needs. This runbook is a living, breathing document that we all follow in the event of a disaster. Because of its importance, we schedule periodic reviews as part of our Maintenance offering to make sure the runbook meets all needs and accounts for every circumstance.
What our managed disaster recovery service looks like in action
This runbook is the first thing we turn to when a business declares a disaster event. It gives a precise definition of the disaster recovery services that need to be put into motion. Before we do that, though, the leadership team needs to define what a disaster event means to their business. There are two ways to do this: automatically or manually.
Automatic disaster declarations
Automatic disaster declarations can be triggered by any disruption in the system that you deem critical enough for applications to fail over. We work with you to define these alerts in Discovery, build them in Development, and note them in your runbook.
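As a simple illustration, an automatic trigger could be a rule such as "declare a disaster after several consecutive failed health checks." The Python sketch below assumes exactly that rule; the function name, the threshold of three, and the idea of a boolean health-check history are hypothetical stand-ins for whatever alerts are actually defined in Discovery.

```python
def should_declare_disaster(check_results, failure_threshold=3):
    """Return True when the most recent `failure_threshold` checks all failed.

    `check_results` is a list of booleans, oldest first, where True means
    the health check succeeded.
    """
    if len(check_results) < failure_threshold:
        return False  # not enough history to decide
    recent = check_results[-failure_threshold:]
    return not any(recent)


# Three failures in a row trips the trigger.
sustained_outage = should_declare_disaster([True, False, False, False])

# A recovery in the middle resets the streak.
brief_blip = should_declare_disaster([False, False, True, False])
```

Requiring a streak of failures rather than a single one is a common way to avoid failing over on a momentary blip; how long a streak is appropriate depends on each application's RTO.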
Manual disaster declarations
A manual declaration requires your on-the-spot judgement of what constitutes a disaster for your business. The process is set in motion with a phone call or trouble ticket, and we immediately begin to execute the failover for your applications and data.
Once the failover is complete and compute resources are spun up and verified to be operational, we’ll redirect your IP space to the DR infrastructure. Your users will be able to continue using their applications, usually without even noticing a disruption.
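The ordering matters here: traffic is only redirected after the recovered resources are verified. A minimal Python sketch of that sequence, assuming each runbook step can be modeled as a function that returns True on success, might look like the following. The step names paraphrase the text above, and `run_failover` and the stand-in lambdas are hypothetical, not a real runbook engine.

```python
def run_failover(steps):
    """Run (name, action) steps in order and stop at the first failure.

    Returns the names of completed steps and the name of the step that
    failed, or None if every step succeeded.
    """
    completed = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


# Stand-in actions; real ones would call into the DR platform.
happy_path = [
    ("fail over applications and data", lambda: True),
    ("verify compute resources", lambda: True),
    ("redirect IP space", lambda: True),
]
completed, failed_step = run_failover(happy_path)

# If verification fails, the IP redirect is never attempted.
broken_path = [
    ("fail over applications and data", lambda: True),
    ("verify compute resources", lambda: False),
    ("redirect IP space", lambda: True),
]
partial, stopped_at = run_failover(broken_path)
```

Stopping at the first failed step keeps users pointed at their existing environment until the DR infrastructure has actually been confirmed as operational.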
Returning your environment to production
That’s not the last step of the runbook, however. Once the root cause of the outage has been determined — with SCTG’s help or without it, depending on your organization’s needs — we need to return your environment to production. This is, without a doubt, the most complex part of the entire process. Your runbook will define the unique processes needed to meet your requirements. This may include using the original production environment for DRaaS, or it may include an application-by-application reverse migration to the old production equipment.
Just like in the rest of DRaaS, there isn’t one answer here. We’ll work with you to discuss the benefits, challenges, and risks associated with each potential disaster recovery decision. Then we can determine which one best fits the infrastructure needs and budget of your business.
And by the end of it, you’ll know that your disaster recovery plan actually works.