There is no sadder moment in the back-up world than finding out the file or database you need to restore has never been backed up. Understanding how systems, directories, and databases are included in the back-up system is the key to making sure this never happens to you.
The first step toward this goal is making sure that servers and services you want backed up are registered with your back-up-and-recovery system.
For example, if you start using a new software-as-a-service offering such as Salesforce, no back-up system will automatically notice that addition and start backing it up for you. If you are fully virtualised on VMware, your back-up system will automatically notice when you add a new node to the configuration.
But if you start using Hyper-V or KVM, no back-up system will automatically notice there is a new hypervisor in the data centre and start backing it up. And of course your back-up system will not notice you installed a new physical server.
Selective inclusion vs selective exclusion
There are two very broad categories of how items can be included in a back-up system: selective inclusion and selective exclusion.
Selective inclusion is where the administrator individually specifies which filesystems, databases, or objects the back-up system will back up. For example, if an administrator says they want to back up just the D:\ drive, or just the Apollo database, they are practising selective inclusion.
Selective exclusion, or automatic inclusion, is when an administrator specifies to back up everything on the server, except for what is specifically excluded. For example, an administrator might select to back up all filesystems except for /tmp on a Linux system or a user's iTunes or Movies directories on a Windows laptop.
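The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration, not how any particular back-up product works: the function names, the mount points, and the prefix-matching rule are all assumptions made for the example.

```python
from pathlib import PurePosixPath


def is_under(path: str, root: str) -> bool:
    """Return True if path equals root or lies beneath it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def select_by_inclusion(filesystems, include):
    """Selective inclusion: back up only what is explicitly listed."""
    return [fs for fs in filesystems if any(is_under(fs, inc) for inc in include)]


def select_by_exclusion(filesystems, exclude):
    """Selective exclusion: back up everything except what is listed."""
    return [fs for fs in filesystems if not any(is_under(fs, exc) for exc in exclude)]


# Hypothetical mount points on a Linux server.
filesystems = ["/", "/home", "/data", "/tmp", "/var"]

included = select_by_inclusion(filesystems, include=["/home", "/data"])
excluded = select_by_exclusion(filesystems, exclude=["/tmp"])

print(included)  # ['/home', '/data']
print(excluded)  # ['/', '/home', '/data', '/var']
```

With inclusion, anything not named is silently left out. With exclusion, anything not named is backed up.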
It's very common for administrators to think they administer systems in such a way that there is no point in backing up the operating system. They know they want to back up C:\Users on a Windows laptop, /Users on a MacBook, or something like /data or /home on a Linux system.
They see no point in backing up the operating system or applications, and so they manually select just the filesystems they want to back up. The same is true of databases. They might not want to back up test databases, so they selectively include which databases to back up.
Selective inclusion downside
The problem with selective inclusion is configuration changes. Every time a new database or filesystem with data is added to a system, someone needs to change the back-up configuration; otherwise, the new resource will never get backed up.
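If selective inclusion cannot be avoided, one way to reduce the risk is to compare what exists on a server with what the include list covers. The sketch below assumes you can obtain both lists from somewhere; the function name and the sample filesystems are hypothetical.

```python
def unprotected(discovered, included):
    """Return discovered items that the include list does not cover, sorted."""
    return sorted(set(discovered) - set(included))


# What the administrator listed when the server was first set up (hypothetical).
include_list = ["/home", "/data"]

# What exists on the server today, after someone added /data2.
current_filesystems = ["/home", "/data", "/data2"]

missing = unprotected(current_filesystems, include_list)
if missing:
    print("Not in any back-up:", ", ".join(missing))
```

A check like this only reports the gap. Someone still has to notice the report and update the configuration, which is exactly the manual step that selective inclusion depends on.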
This is why selective exclusion is the safest back-up-selection method. The worst possible side effect is that you might back up some worthless data. Compare this to the worst possible side effect of selective inclusion, which is that important data gets completely excluded from the back-up system.
There is simply no comparison between the two. Selective inclusion may appear to save money because less data is stored, but it’s not worth the risk.
It is easy to exclude data that you know to be worthless, such as /tmp or /temp on a Linux system. If you see no reason to back up the operating system, you might also exclude /, /user, /usr, /var, and /opt. On a Windows system, you could exclude C:\Windows and C:\Program Files if you really don't want to back up that data.
One thing to consider, though, is the effect that deduplication might have on this decision. It's one thing to know you are backing up hundreds or thousands of filesystems that have no value and waste valuable storage space on your disk array or tape library.
But what if the operating system that you are spending so much time excluding is actually only stored once? Deduplication would ensure that only one copy of a Windows or Linux operating system is actually stored in your back-up system.
Considering this, perhaps you could just leave the back-up system at its default configuration and not worry about excluding the operating system, as the cost to your back-up system will be very small.
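To see why the cost can be small, consider a toy model of deduplication in Python. It is a deliberate simplification: it deduplicates whole files by SHA-256 hash, whereas real products typically work on smaller chunks, and the class, host names, and file contents are invented for the example.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.chunks = {}   # SHA-256 hex digest -> stored bytes
        self.catalog = {}  # (host, path) -> digest of that file's content

    def backup(self, host: str, path: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)  # store only if not seen before
        self.catalog[(host, path)] = digest

    def stored_bytes(self) -> int:
        """Bytes actually held by the store after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())

    def logical_bytes(self) -> int:
        """Bytes that would be held if every file were stored separately."""
        return sum(len(self.chunks[d]) for d in self.catalog.values())


store = DedupStore()
os_file = b"x" * 1000  # stand-in for an operating system file identical on every host

for n in range(100):
    host = f"server{n:03d}"
    store.backup(host, "/usr/bin/example", os_file)
    store.backup(host, "/data/app.db", f"unique data for {host}".encode())

print(store.logical_bytes())  # 102500
print(store.stored_bytes())   # 3500
```

In this model the operating system file is backed up from 100 servers but stored once, so nearly all of the stored bytes are the unique data.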
Another way to automatically add back-up data to the back-up system is via tag-based inclusion. This has become popular in the virtualisation world, where each new VM or database that is created can be given one or more tags that can be used to classify the type of VM or database.
For example, all new database servers might be given the database tag, indicating to multiple other processes that it is a database-related VM. This might tell certain monitoring systems to monitor whether the database is available. It might also automatically apply certain security rules and firewalls to that VM. And in most back-up systems it can also automatically apply a database-centric back-up policy to that VM.
One important thing to note when using tag-based inclusion: You need a default back-up policy. You should create a back-up policy that your back-up software automatically uses if no appropriate tags are found.
Then make sure to monitor that default policy for any new systems that show up, because it means that the data on those systems might not be getting properly backed up.
If your back-up software product does not support a default back-up policy when used with tag-based inclusion, it might be best not to use this functionality, because it comes with the risk of new VMs or databases not being backed up.
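The logic of tag-based inclusion with a default policy can be sketched as follows. The tag names, policy names, and VM names are hypothetical, and a real back-up product would expose this through its own configuration rather than through code like this.

```python
DEFAULT_POLICY = "default-full-backup"

# Hypothetical mapping from tag to back-up policy name.
TAG_POLICIES = {
    "database": "database-backup",
    "fileserver": "fileserver-backup",
}


def policy_for(tags):
    """Return the policy for the first recognised tag, else the default."""
    for tag in tags:
        if tag in TAG_POLICIES:
            return TAG_POLICIES[tag]
    return DEFAULT_POLICY


def assign_policies(vms):
    """Map each VM name to a policy and list the VMs that fell through."""
    assignments = {name: policy_for(tags) for name, tags in vms.items()}
    fallthrough = sorted(n for n, p in assignments.items() if p == DEFAULT_POLICY)
    return assignments, fallthrough


vms = {
    "db01": ["database", "production"],
    "files01": ["fileserver"],
    "new-vm-17": [],  # created without any tags
}

assignments, needs_review = assign_policies(vms)
print(needs_review)  # ['new-vm-17']
```

The untagged VM still gets a back-up through the default policy, and the fall-through list gives the administrator something to monitor.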
Just be safe
Default to the safest method of back-up inclusion, which is automatic inclusion, also known as selective exclusion. Spend your valuable time on other administrative activities, without having to worry whether your new database is being backed up.
Back-up priorities should always be safety and protection first, cost second. No one ever got fired because their back-up system backed up some worthless data.