For businesses that have been attacked by ransomware, a fully automated, high-speed disaster recovery is the way to avoid paying the ransom. Recovery is the second step in a two-step process; the first is getting rid of the malware.
There are three ways to effect a disaster recovery after a ransomware attack: a traditional recovery, an image-based recovery, or a cloud-based recovery. But for most environments, the only affordable way to automate a large-scale recovery is to recover in the cloud.
Traditional disaster recovery
A traditional disaster recovery is one where you begin a traditional restore after you have suffered a loss -- in this case, after you receive a ransom demand. It is still a traditional restore if you are restoring virtual machine images to a hypervisor platform such as VMware, Hyper-V, or KVM, or a hyperscaler such as Amazon Web Services, Azure, or GCP.
What makes it traditional is that you are waiting until the event happens to begin the restore. As you will see later in this article, there are ways to restore the data before you need it.
This type of recovery is considered traditional because it’s the way everyone did it not so long ago. Most companies did not have the budget to maintain a recovery data centre, so they paid a service to have one available in case they needed it.
Since they had to pay every time they actually used that data centre, they never thought of restoring the data in advance. They waited until a disaster happened, then contacted the recovery data centre and began the recovery. This method is slow and very difficult to automate, as you don’t own the hardware on which you will be performing your restore.
Please do not use this method. Performing a traditional disaster recovery of an entire data centre after it has been infected or destroyed simply takes too long, and in the case of ransomware, you will inevitably be pressured to pay the ransom. Since paying the ransom is the outcome you are trying to avoid, a DR plan that takes too long is a DR plan that you should reject. It is time to move to a much faster, much more automatable process.
Image-based recovery
The idea here is to separate the recovery of the operating system and applications from the recovery of the data itself. If you can do that, you can significantly simplify the recovery of the operating system and application by using an imaging solution. Create an image for each operating system/application pair so you can quickly and easily re-image as many servers as you want to. This can also be automated.
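As a minimal sketch of that idea, assume a catalogue that maps each operating system/application pair to a prebuilt image. The image IDs, server names, and the `plan_reimage` helper below are hypothetical and not tied to any particular imaging product; the point is only that once the catalogue exists, deciding what to re-image becomes a lookup that is easy to automate.

```python
from typing import NamedTuple


class Server(NamedTuple):
    name: str
    os: str
    app: str


# Hypothetical catalogue: one prebuilt image per OS/application pair.
IMAGE_CATALOGUE = {
    ("ubuntu-22.04", "postgres"): "img-ubuntu-postgres",
    ("ubuntu-22.04", "nginx"): "img-ubuntu-nginx",
    ("windows-2022", "iis"): "img-windows-iis",
}


def plan_reimage(servers):
    """Return (plan, unmatched).

    plan maps each server name to the image it should be re-imaged from;
    unmatched lists servers whose OS/application pair has no image yet.
    """
    plan, unmatched = {}, []
    for server in servers:
        image = IMAGE_CATALOGUE.get((server.os, server.app))
        if image is None:
            unmatched.append(server.name)
        else:
            plan[server.name] = image
    return plan, unmatched
```

Any server that ends up in `unmatched` is one that would still need a traditional restore, which makes the list a useful to-do for building out the catalogue before an attack.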
Restoring 50 servers from a back-up might take a long time, but re-imaging 50 servers or VMs or containers might actually go much faster. This is because some imaging solutions can do a thin-provisioned restore, where the VM is allowed to begin booting while the re-image process is still in progress. This is also possible in some back-up systems, but only for a limited number of VMs. Therefore, imaging systems are likely to bring your systems back online more quickly than a traditional restore.
Image-based recovery can also be done in a predictable amount of time if you are used to using such a system. If you upgrade your operating system by re-imaging it from an already-upgraded image, rather than by patching the existing operating system, you will know exactly how long this type of recovery takes.
There are a variety of imaging solutions on the market that handle both Linux and Windows servers and VMs. Kubernetes and Docker also lend themselves to this method of recovery.
Image-based recovery serves as a defence against ransomware that attacks the operating system or application of a server by encrypting critical system files. It’s not very useful against ransomware that actually encrypts documents or database files, since those will still need to be restored in the traditional fashion. Therefore, this method is really only slightly better than a traditional disaster recovery, but it’s still better.
Cloud-based recovery
Recovery in the cloud can happen before you ever need it. It starts with automatically and periodically performing an incremental restore of your computing environment to an IaaS vendor. This means your entire environment -- including back-ups of both structured and unstructured data -- is already restored before it’s needed.
Yes, you will lose some amount of data depending on the window between the last restore and the ransomware attack, so you will need to decide up front how often you execute the pre-restore process to minimise the loss. You also need to agree on what amount of data loss is acceptable, which is officially referred to as your recovery point objective (RPO).
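The trade-off between pre-restore frequency and data loss can be checked with simple arithmetic. The sketch below assumes, as a simplification, that each pre-restore captures data as of the moment it starts and only becomes usable once it finishes. The function names are illustrative.

```python
from datetime import timedelta


def worst_case_data_loss(interval: timedelta, run_duration: timedelta) -> timedelta:
    """Oldest that the newest usable pre-restored copy can be when an attack hits.

    Assumption: a run captures data as of its start time and is usable only
    after it completes. Just before a run completes, the newest usable copy
    is the previous run's, which is one interval plus one run duration old.
    """
    return interval + run_duration


def meets_rpo(interval: timedelta, run_duration: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the agreed RPO."""
    return worst_case_data_loss(interval, run_duration) <= rpo
```

Under these assumptions, an hourly pre-restore that takes 10 minutes to complete can lose up to 70 minutes of data, so it would not satisfy a one-hour RPO. Running it every 30 minutes brings the worst case down to 40 minutes.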
Technically, this type of recovery doesn’t require the cloud, but using the cloud makes it financially feasible for most environments. Doing it with a physical data centre requires the costly route of paying for the data centre before you need it. With the cloud you pay only for the storage associated with your pre-restored images.
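The cost model described here can be written down as a rough sketch. All prices are inputs you would take from your own vendor's price list, since none are given here. The function names are hypothetical, and the model deliberately ignores extras such as network egress or per-request charges.

```python
def monthly_standby_cost(storage_gb: float, price_per_gb_month: float) -> float:
    """Cost of keeping pre-restored images in the cloud with no compute running."""
    return storage_gb * price_per_gb_month


def failover_compute_cost(instances: int, price_per_instance_hour: float,
                          hours: float) -> float:
    """Compute cost, incurred only while the recovered environment is running."""
    return instances * price_per_instance_hour * hours
```

The standby figure is the only recurring charge in this model. The compute figure applies only when you actually fail over (or run a DR test), which is the key difference from paying for a recovery data centre in advance.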
Cloud-friendly back-up and DR products and services can proactively restore your entire environment to the cloud of your choice -- once a day, once an hour, or continuously. Obviously, the more often you update, the more it costs.
The beauty of pre-restoring is that you can bring up your entire computing environment in a matter of minutes, and there are products and services that can meet recovery time objectives (RTOs) of anywhere from a few seconds to a few hours.
Imagine receiving a ransomware message and simply laughing because all you need to do is press one button and your entire environment is running from the cloud in a matter of minutes. The process is completely automated.
Note: Even if you restore your environment in minutes, you still need to identify the ransomware and get rid of it. Some recovery solutions can help automate this process. Its ability to pre-restore data without charging for compute until it’s needed, and the speed with which it carries out an actual restore, make cloud-based recovery worth considering.