VMware vSphere Data Protection (VDP) 6.x is included with VMware vSphere 6.x Essentials Plus Kit and higher editions of vSphere. Considering the number of organizations running vSphere, it is no surprise that VDP is being used more and more every day. VDP is engineered by EMC and is based on EMC's Avamar solution. That means customers get Avamar features in VDP 6.x such as variable-length deduplication and backup data replication.
As with just about any software solution, it has a few nuances and best practices that should be followed. I spoke with a few of our VMware Technical Support Engineers to find out what issues generate the most calls. This article briefly discusses these top issues and provides some general guidance on getting started with VDP.
1. Storage is Too Slow
This is probably the most common cause of a variety of VDP issues. As with nearly any backup product, VDP generates a fair amount of I/O when performing backups and maintenance activities. If the storage on which the VDP appliance is running cannot support the I/O generated by VDP, issues will follow. VDP includes a performance analysis test that should be used to verify the storage can meet or exceed the levels of I/O typically required. This test can be run when the VDP appliance is deployed (recommended) or after the appliance has been deployed. To run the test after deployment, log in to the VDP Configure UI and select the Storage tab. The screenshot below shows the test results from a VMware Virtual SAN datastore (four hosts, hybrid configuration). Needless to say, Virtual SAN easily exceeded the requirements.
If you suspect slow storage is causing issues, you can reduce the number of concurrent backup jobs a VDP proxy runs. This is done from the Configuration tab of the VDP Configure UI. The default is eight concurrent backups. This can be configured to any number from one to eight. A lower number will lessen the workload on the underlying storage.
Another option that might help alleviate several issues is increasing the amount of memory configured for the VDP appliance. For example, an appliance with 1TB of backup data capacity is configured with 4GB of memory. Changing this to 6GB or 8GB might help to resolve an issue.
2. Add more memory
VDP is deployed with 4GB of memory by default, and larger amounts of memory are highly recommended with larger VDP capacities, as seen in the VDP admin guide (see next tip). In many cases, stability and performance are improved by adding more memory. For example, bump that 4GB default up to 8GB. With larger VDP capacities, consider 16GB, 20GB, and maybe even 24GB.
3. Read the VDP Administration Guide
I know - this seems like a "no-brainer", but you would be surprised how many people ask questions that are answered in the admin guide. It is important to review the prerequisites and limitations of VDP - especially regarding the application agents. Please read the VDP documentation before deploying VDP.
4. VDP Capacities with Data Domain
Dell EMC Data Domain can be used by VDP as a backup data target in larger environments. Even though the backup data is stored on the Data Domain system, capacity is still needed for the VDP appliance to store metadata and perform other operations. Use the following chart when selecting a VDP capacity size for environments where Data Domain will contain the VDP backup data.
5. Don't Exceed the Backup Data Capacity of the VDP Appliance
This is one of those items covered in the VDP admin guide. The consumed capacity of a VDP appliance should not exceed 80%. This allows for fluctuations of capacity consumption as backup jobs are added, changed, or removed and backup data changes. Beyond 80%, VDP will issue warnings about consumed capacity. Things get considerably worse if free capacity gets below 5%. The recommendation is to round up when considering how much backup data capacity to configure for a VDP appliance. Example: If considering 2TB or 4TB, deploy 4TB. It is possible to expand the capacity of an existing VDP appliance up to a total of 8TB. However, this process does require a fair amount of time. Details on this are also in the admin guide.
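As a rough illustration of the thresholds above, the following sketch classifies consumed capacity. The function name and the "ok", "warning", and "critical" labels are hypothetical and are not part of VDP; only the 80% and 5% figures come from the text.

```python
def capacity_status(used_tb: float, total_tb: float) -> str:
    """Classify consumed capacity using the thresholds described above.

    The labels are illustrative only; VDP does not expose this function.
    """
    if total_tb <= 0 or used_tb < 0:
        raise ValueError("used_tb must be >= 0 and total_tb must be > 0")
    used_pct = 100.0 * used_tb / total_tb
    if used_pct > 95.0:   # less than 5% free: things get considerably worse
        return "critical"
    if used_pct > 80.0:   # beyond 80%: VDP issues capacity warnings
        return "warning"
    return "ok"


if __name__ == "__main__":
    # A 4TB appliance at three different consumption levels.
    for used in (2.0, 3.4, 3.9):
        print(f"{used}TB of 4TB used: {capacity_status(used, 4.0)}")
```

A check like this could run against whatever capacity figures you already collect, so that a trend toward 80% is noticed before VDP starts warning.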
6. Number of Protected VMs and Load Balancing
In general, it is best to keep the number of protected VMs to 100 or fewer per VDP appliance. In larger environments, deploy a VDP appliance for every 100 protected VMs for optimal load balancing and performance.
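The guideline above amounts to a ceiling division. A minimal sketch, with a hypothetical helper name and the 100-VM figure taken from the text as the default:

```python
def appliances_needed(protected_vms: int, vms_per_appliance: int = 100) -> int:
    """Return how many VDP appliances the 100-VMs-per-appliance guideline implies."""
    if protected_vms < 0 or vms_per_appliance <= 0:
        raise ValueError("protected_vms must be >= 0 and vms_per_appliance > 0")
    # Integer ceiling division: 101 VMs need 2 appliances, not 1.
    return -(-protected_vms // vms_per_appliance)


if __name__ == "__main__":
    for vms in (80, 100, 101, 250):
        print(f"{vms} VMs -> {appliances_needed(vms)} appliance(s)")
```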
7. Protected VM Sizing
Limit the size of each protected VM to a maximum of 2 TB, if possible, for better performance of backups and restores.
8. Backing up SQL Server, Exchange, and SharePoint VMs without the VDP Agents
Nearly any virtual machine can be added to an image-level (entire VM) backup job. Some organizations choose to back up workloads such as SQL Server and Exchange using the image-level backup job type in VDP. While this is the easiest method, it is often better to utilize the application agents that are included with VDP. The agents provide application-consistent backup and recovery and include options such as selecting individual databases and performing log truncation. However, it is important to understand these agents back up only the application databases, not the guest OS, application executables, configuration files, and so on.
9. Run the Latest Version of VDP
VMware and EMC release new versions of VDP on a regular basis. In nearly every case, these new releases contain bug fixes and improvements to VDP to provide a better user experience. VMware Support often gets support requests for issues that are fixed simply by upgrading to the latest version of VDP. I realize it is not always feasible to upgrade a software solution every time a new release comes out, but the recommendation is of course to perform these upgrades when possible. Be sure to read the documentation on the upgrade process.
10. DNS Configuration
As with most VMware solutions, there is a dependency on DNS. Be sure to create a DNS record for the VDP appliance before you deploy the appliance, not during or after deployment. Also be sure time (NTP) is configured properly and consistently across the infrastructure.
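One way to confirm the DNS record is in place before deployment is a quick forward and reverse lookup. The sketch below uses Python's standard `socket` module. Checking that both directions agree is an assumption based on common practice rather than something stated above, and the hostname in the usage example is hypothetical.

```python
import socket


def dns_consistent(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return True if fqdn resolves to an IP and that IP resolves back to fqdn.

    The resolver functions are parameters so the check can be exercised
    without a live DNS server.
    """
    try:
        ip = forward(fqdn)
        name, _aliases, _addresses = reverse(ip)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return False
    return name.rstrip(".").lower() == fqdn.rstrip(".").lower()


if __name__ == "__main__":
    # Hypothetical appliance name; replace with the FQDN you plan to use.
    print(dns_consistent("vdp01.example.com"))
```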
11. Backing up vCenter Server with VDP
I'll start by saying this blog article is not intended to provide comprehensive guidance on backing up vCenter Server, SSO, and other VMware solutions. When it comes to VDP, keep in mind VDP requires a connection to vCenter Server while backup jobs are running. If vCenter Server is being backed up by VDP, it is possible the connection will be temporarily interrupted as the VM snapshot process attempts to quiesce the vCenter Server VM for the snapshot. If backup jobs are running when the connection is lost, the backup jobs could fail. Recommendation: Place vCenter Server in its own backup job and schedule this backup job after all other backup jobs are typically completed. If you would like more information on general availability recommendations for vCenter Server, please see the VMware vCenter Server 6.0 Availability Guide.
12. Consider VDP External Proxies for Remote Datastores
I'll start with the definition of a "remote datastore". A remote datastore is basically a datastore that is not accessible by the host on which the VDP appliance is running. For example, consider a vCenter Server environment with two vSphere host clusters. A single VDP appliance could run in one cluster and back up VMs in the other cluster. The other cluster is considered a remote cluster from a VDP perspective.
Typically, all hosts in a cluster have access to the same datastores whereas hosts in another cluster do not have access to those same datastores. VDP utilizes the "HotAdd" transport method to back up VMs running in the same cluster as the VDP appliance. If VDP is not able to utilize the HotAdd transport, the backup occurs over the network using NBDSSL or NBD transports. More details on transport methods can be found in the VMware VDDK documentation.
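The transport choice described above can be sketched as a simple decision. This is an illustration of the logic only, not how VDP or the VDDK implements it. The function and parameter names are hypothetical, and preferring NBDSSL over NBD is controlled by a flag because the text does not say which one VDP picks first.

```python
def choose_transport(vm_datastores, proxy_host_datastores, encrypt=True):
    """Pick a transport for one VM backup.

    HotAdd is possible only when the host running the VDP appliance (or proxy)
    can reach every datastore holding the VM's disks; otherwise the backup
    goes over the network.
    """
    vm_ds = set(vm_datastores)
    if not vm_ds:
        raise ValueError("vm_datastores must not be empty")
    if vm_ds <= set(proxy_host_datastores):
        return "hotadd"
    return "nbdssl" if encrypt else "nbd"


if __name__ == "__main__":
    cluster_a = {"ds-a1", "ds-a2"}  # datastores visible to the VDP host
    print(choose_transport({"ds-a1"}, cluster_a))  # local datastore
    print(choose_transport({"ds-b1"}, cluster_a))  # remote datastore
```

Any VM that falls into the network branch is a candidate for an external proxy deployed in its own cluster.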
To improve performance and minimize network utilization for VDP backups, VDP features external proxy virtual appliances that can be deployed to remote datastores. When performing backup jobs, these proxies verify whether the same backup data segment already exists in the VDP appliance. If it does exist, the proxy instructs VDP to create a pointer to that existing segment and does not send the backup data segment over the network again, which reduces network bandwidth consumption. External proxies have a few other benefits as well. Please see the VDP admin guide for those details.
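The idea of checking for an existing segment before sending it can be illustrated with a toy example. This is a conceptual sketch, not VDP's protocol. It uses fixed-size segments and SHA-256 for simplicity, whereas VDP uses variable-length deduplication, and all class and function names are hypothetical.

```python
import hashlib


class SegmentStore:
    """Stands in for the VDP appliance: stores segments keyed by their hash."""

    def __init__(self):
        self._segments = {}

    def has(self, digest: str) -> bool:
        return digest in self._segments

    def put(self, digest: str, data: bytes) -> None:
        self._segments[digest] = data


def send_backup(data: bytes, store: SegmentStore, segment_size: int = 4):
    """Play the proxy's role: send only segments the store does not already hold.

    Returns (list of segment hashes describing the backup, bytes actually sent).
    """
    refs, bytes_sent = [], 0
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]
        digest = hashlib.sha256(segment).hexdigest()
        if not store.has(digest):
            store.put(digest, segment)  # new segment: crosses the network
            bytes_sent += len(segment)
        refs.append(digest)             # known segment: only a pointer is kept
    return refs, bytes_sent


if __name__ == "__main__":
    store = SegmentStore()
    _, first = send_backup(b"AAAABBBBAAAA", store)
    _, second = send_backup(b"AAAABBBBCCCC", store)
    print(first, second)  # 8 bytes sent for the first backup, 4 for the second
```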
13. Avoiding VDP Cleanup Steps after an Infrastructure Outage
VDP should always be shut down gracefully (i.e. guest OS shutdown). It does not take kindly to sudden changes in power state such as powering off or resetting the VDP virtual appliance. In a few cases, this is unavoidable such as when power fails. Should this occur, the recommendation is to triage VDP by checking for VMDKs attached to the appliance from a backup job that was running and verifying the VDP appliance does not need to perform a "checkpoint rollback". Information on checkpoints and performing a rollback is found in the VDP admin guide.
Checkpoints are created when integrity checks are successfully completed during a VDP maintenance window. It is also good to make sure these integrity checks run and checkpoints are created on a regular basis.
14. Check the Logs when there are Issues
I will be the first to admit that logging is not the best in VDP, but it has improved considerably in versions 6.x. The first recommendation is to upgrade to the latest version of VDP to take advantage of these improvements. VDP 6.0.x and 6.1.x are compatible and supported with vSphere and vCenter Server versions 5.5 and higher. If you are running vSphere 5.5, you can access the VDP 6.x OVA files for download by temporarily upgrading one of your vSphere licenses to 6.0 in your VMware licensing portal. This assumes you have current Support and Subscription (SnS) for the vSphere license you wish to temporarily upgrade.
You can also view high-level logging using the VDP UI in the vSphere Web Client.
Another good logging tip is to log in to the VDP appliance using a console connection or SSH and run this command:
tail -f /vdr/server_logs/vdr-server.log
This shows log entries in real time as backup and replication jobs are running, restores are taking place, and so on. Enabling SSH root login in a VDP appliance is discussed in this blog article.
15. Estimating the Number of VMs per VDP Appliance
This is an educated guess at best simply because there are so many variables that affect backup data capacity usage - the amount of data, the types of data, and the data change rate, to name a few. With that in mind, here are some general guidelines to at least provide you with a starting point:
- Average VM is 60GB of actual data (not configured VMDK size)
- 5% daily data change rate
- 30-day backup data retention
Sizing guideline, assuming the averages above: 20 to 25 VMs per 1TB of backup data capacity.
Do not consistently consume more than 80% of VDP appliance capacity. Ideal steady-state capacity consumption is 60-80%, which provides good deduplication benefit while not going over the 80% threshold.
An 8TB VDP appliance should support up to approximately 160 "average" VMs (8TB x 80% x 25 VMs). Assume fewer (100-120 VMs) for a more realistic/conservative estimate.
Each appliance is managed separately. Most will not want to manage more than 2 or 3 appliances, which means VDP is not recommended for larger environments. Longer/shorter retention, larger data sets, types of data, change rates, etc. will affect actual results.
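The arithmetic behind the sizing guideline can be sketched as follows. The function name is hypothetical; the 80% fill ceiling and the 20 to 25 VMs per TB range come from the guidelines above, and the result is only a starting point for the same reasons given there.

```python
import math


def estimated_vm_range(capacity_tb: float, fill_pct: int = 80,
                       vms_per_tb_low: int = 20, vms_per_tb_high: int = 25):
    """Return (low, high) estimates of "average" VMs an appliance can protect.

    Applies the guideline: capacity x fill ceiling x VMs per TB.
    """
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    low = math.floor(capacity_tb * fill_pct * vms_per_tb_low / 100)
    high = math.floor(capacity_tb * fill_pct * vms_per_tb_high / 100)
    return low, high


if __name__ == "__main__":
    # Capacities mentioned in this article.
    for tb in (1, 2, 4, 8):
        low, high = estimated_vm_range(tb)
        print(f"{tb}TB appliance: roughly {low} to {high} average VMs")
```

For an 8TB appliance this gives 128 to 160 VMs, which matches the upper figure quoted above. The more conservative 100-120 estimate reflects real environments deviating from the stated averages.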
16. Snapshot Alarms
In some cases, VDP might fail to clean up a snapshot after a VM is backed up. Set vCenter Server alarms to notify administrators when a VM is running on a snapshot:
Configuring vCenter Server to send alarms when virtual machines are running from snapshots (1018029)
Virtual machine consolidation needed alarm (2061896)
Hopefully, this article has provided some useful information to help get started, avoid issues, and/or solve a few common problems with VDP. However, this article is far from an all-encompassing guide to VDP. I will again strongly recommend that VDP users start with the admin guide. If an issue persists, open a support request with VMware Support.
@jhuntervmware on Twitter