Thursday, June 12, 2008

Troubleshooting VMware (ESX) snapshots

Virtualization administrators can use snapshots on VMware ESX to travel back in time and figure out what went wrong with their virtual machines (VMs). But what do you do when your snapshots start acting funny? In this tip, we’ll troubleshoot potential problems that may come up when using snapshots on ESX.

Locating VMs that have snapshots
Trying to find out which VMs have snapshots can be challenging. There is no centralized way to do this built into the VMware Infrastructure Client or VirtualCenter, so you should periodically check your ESX servers for old snapshots that need to be deleted. There are a few methods you can use to accomplish this.

Method 1 – Use the find command on the Service Console

  1. Log in to the Service Console.
  2. Change to your /vmfs/volumes/ directory.
  3. Type find -iname "*-delta.vmdk" -mtime +7 -ls to find snapshot files that have not been modified in more than 7 days, or simply find -iname "*-delta.vmdk" to find all snapshot files.
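For readers who would rather script this check, the search above can be sketched in Python. This is a minimal sketch, not part of the original tip: it assumes a Python 3 interpreter on a machine that can see the datastore paths (which may not be the Service Console itself), and the function name find_delta_files is made up for illustration. The age filter is a plain cutoff in days, so it only approximates what find's -mtime +7 selects.

```python
import os
import time
from fnmatch import fnmatch


def find_delta_files(root, min_age_days=0, now=None):
    """Return sorted paths under root whose names match *-delta.vmdk.

    The name match is case-insensitive, like find's -iname.
    With min_age_days > 0, only files last modified more than that many
    days ago are kept (an approximation of find's -mtime +N).
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch(name.lower(), "*-delta.vmdk"):
                continue
            path = os.path.join(dirpath, name)
            if min_age_days == 0 or os.path.getmtime(path) < cutoff:
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # /vmfs/volumes is the directory named in step 2 above.
    for path in find_delta_files("/vmfs/volumes", min_age_days=7):
        print(path)
```

Calling find_delta_files("/vmfs/volumes") with no age limit lists every delta file, which mirrors the second form of the find command.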

Method 2 – Use a free Perl script from Dominic Rivera called Snapalert. This script uses the VI Perl toolkit to talk directly to VirtualCenter, so no components need to be installed on each host (it also works with ESXi Installable). Optionally, the script can also generate an email report.

Method 3 – Use a free utility from Xtravirt called Snaphunter, which can report back on the snapshot status of VMs from multiple ESX Servers and also send email reports.

Method 4 – Query the VirtualCenter SQL database. VirtualCenter keeps track of all the snapshots on every host in its VPX_SNAPSHOT table. I’ve written a Visual Basic Script (VBS) that queries this table to display a list of VMs with running snapshots. This method works, but it relies on database tables that could change in future versions of VirtualCenter.

Dealing with snapshots that do not delete properly
Occasionally, a snapshot will not delete properly, leaving an active snapshot for a VM. This can happen when using VMware Consolidated Backup or when deleting snapshots through Snapshot Manager. In most cases, the snapshot will not appear in Snapshot Manager for you to delete. The only indication that a snapshot may still exist is the presence of delta files in the VM’s directory.

If you do have a snapshot running that is not in Snapshot Manager, you can attempt to delete it in one of two ways. First, create a new snapshot using the VI Client and, once it has been created, delete all snapshots from Snapshot Manager. Alternatively, log in to the ESX Service Console, switch to the VM’s home directory and create a new snapshot by typing vmware-cmd <vmx file> createsnapshot <name> <description> <quiesce> <memory>. Wait for the snapshot to be created and type vmware-cmd <vmx file> removesnapshots. When it completes, check whether the delta files have been deleted. If they have, the operation completed successfully.

If the delta files weren’t deleted, check the vmx file for the VM and locate the lines starting with scsi. If the VM is configured with only one virtual disk, it is usually scsi0:0 (if .present is false, it is a non-existent drive that you can ignore). The .fileName should be using the original disk file that was created with the VM and is usually the same name as your VM. If this is the case, then your VM is not using the snapshot files. If it has a -00000# in the filename, it is currently using a snapshot file. The following makes this a little clearer:

VM with no snapshots:
  scsi0:0.present = "true"
  scsi0:0.fileName = "myvmname.vmdk"

VM with snapshots:
  scsi0:0.present = "true"
  scsi0:0.fileName = "myvmname-000001.vmdk"
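The same check can be automated. The following is a rough sketch in Python, assuming the vmx file uses the simple key = "value" layout shown above and that snapshot disks follow the -00000# naming pattern (six digits before .vmdk). The function names parse_vmx and disks_on_snapshots are illustrative, not part of any VMware tool.

```python
import re

# Snapshot disks follow the "-00000#" pattern described above,
# e.g. myvmname-000001.vmdk (assumed here to be six digits).
SNAPSHOT_DISK = re.compile(r"-\d{6}\.vmdk$", re.IGNORECASE)
# A vmx setting of the form: key = "value"
VMX_LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Parse key = "value" lines of a vmx file into a dict."""
    settings = {}
    for line in text.splitlines():
        match = VMX_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def disks_on_snapshots(vmx_text):
    """Return {device: fileName} for present SCSI disks that use a snapshot file."""
    settings = parse_vmx(vmx_text)
    result = {}
    for key, value in settings.items():
        if not (key.startswith("scsi") and key.endswith(".fileName")):
            continue
        device = key[: -len(".fileName")]  # e.g. "scsi0:0"
        if settings.get(device + ".present", "").lower() != "true":
            continue  # non-existent drive, ignore it
        if SNAPSHOT_DISK.search(value):
            result[device] = value
    return result
```

An empty result means every present disk points at its original disk file. Any entry returned is a disk that is still running on a snapshot.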

If the VM is still using a snapshot file and the above operation failed, your remaining options are to clone the VM or to clone the VM’s disk file. To clone the VM, you can use VMware Converter to create a new clone of your existing VM and, when it completes, shut down and delete the old VM.

Another method is to shut down the VM, log in to the Service Console, switch to the VM’s directory and clone the VM’s disk file by using vmkfstools and specifying the snapshot file as the source disk, e.g. vmkfstools -i myvmname-000001.vmdk myvmnamenew.vmdk. Once it completes, go into the settings for the VM, remove (don’t delete) the hard disk, add a new hard disk and browse to the newly created disk file. Power on the VM and verify everything is working before you delete the old disk and delta files.
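If you script the clone step, it is easy to pass the wrong file as the source. The sketch below builds the argument list for the vmkfstools -i command quoted above and rejects the -delta.vmdk and -flat.vmdk data files, on the assumption that the source should be the snapshot's disk file as named in the example (myvmname-000001.vmdk). The helper name build_clone_command is hypothetical, and the sketch only builds the command; it does not run it.

```python
import os


def build_clone_command(source_vmdk, dest_vmdk):
    """Return the argument list for: vmkfstools -i <source> <destination>."""
    name = os.path.basename(source_vmdk).lower()
    # Assumption: the source is the snapshot disk file named in the text,
    # not the -delta.vmdk or -flat.vmdk data file that sits next to it.
    if name.endswith("-delta.vmdk") or name.endswith("-flat.vmdk"):
        raise ValueError("pass the disk file (e.g. myvmname-000001.vmdk), not its data file")
    if os.path.abspath(source_vmdk) == os.path.abspath(dest_vmdk):
        raise ValueError("destination must differ from the source")
    return ["vmkfstools", "-i", source_vmdk, dest_vmdk]


if __name__ == "__main__":
    # Prints the command so it can be reviewed before being run by hand.
    print(" ".join(build_clone_command("myvmname-000001.vmdk", "myvmnamenew.vmdk")))
```

Returning a list rather than a single string keeps the paths intact if they contain spaces, should you later hand the list to subprocess.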

Changing snapshot file locations
By default, snapshots are written to the home directory of each virtual machine. Sometimes you may want to change this so that snapshots do not take up space on the volume on which your VM resides. It is possible to specify a new working directory for snapshots individually on each VM. Both snapshots and vswp files are written to this directory when you do this.

Be warned, though. If your VM is on shared storage and you specify local storage as the location, you will not be able to use features like VMotion/HA/DRS. To change the working directory, follow these steps:

  1. Power off your VM and log in to the Service Console.
  2. Edit the VMX file of your VM with nano or vi.
  3. Add a new line using the following syntax: workingDir = "/vmfs/volumes/SnapVolume/Snapshots/"
  4. If you want your vswp file to stay in the VM’s directory, add the following line to the VMX file: sched.swap.dir = "/vmfs/volumes/VM-Volume1/MyVM/". This step is optional. Furthermore, you do not need to worry about updating the existing sched.swap.derivedName parameter because it is generated by the VM and written to the config file each time the VM powers on.
  5. Power on your VM. Your vmsn and snapshot (-delta.vmdk) files, and your vswp file unless you set sched.swap.dir in step 4, will now be located in this directory.
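The edit in steps 3 and 4 can also be scripted. Here is a minimal sketch, assuming the VM is powered off and the vmx file uses one key = "value" setting per line. The helper set_vmx_option is an illustrative name, and it works on the file's text rather than touching the file itself.

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with key = "value" set exactly once.

    An existing line for the key is replaced in place; duplicates are
    dropped; if the key is absent, the line is appended at the end.
    """
    new_line = '{} = "{}"'.format(key, value)
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*=")
    out = []
    done = False
    for line in vmx_text.splitlines():
        if pattern.match(line):
            if not done:
                out.append(new_line)
                done = True
            continue  # skip any duplicate definitions
        out.append(line)
    if not done:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

For example, set_vmx_option(text, "workingDir", "/vmfs/volumes/SnapVolume/Snapshots/") produces the line from step 3, and a second call with the key sched.swap.dir covers step 4. Keep a copy of the original vmx file before writing the result back.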

Using VMotion with snapshots
If you try to VMotion a VM with running snapshots from one host to another, you will receive the following warning: “Reverting to snapshot would generate error (warnings) on the destination host.” This warns that if you have changed the default locations for any of the VM’s files (such as the snapshot or vswp files, as detailed above) and the destination host cannot access the storage that holds them, the VM will crash when the migration completes.

So if your VM was on shared storage and configured so that the snapshot files were on local storage, you would have a problem if you VMotioned the VM to another host. If your VM has all its files on shared storage, and that storage is accessible to all ESX hosts, then you’re in good shape. VMware recommends that you commit all snapshots before VMotioning VMs, but if you do not, the migration will still work as long as the destination host can reach all of the VM’s files.
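One way to spot the risky case before a migration is to check whether the relocation settings in the vmx file point outside your shared datastores. The sketch below is only an illustration: the list of shared datastore paths is something you supply yourself, the function name relocated_off_shared is made up, and it looks only at the workingDir and sched.swap.dir settings discussed earlier.

```python
import re

# A vmx setting of the form: key = "value"
VMX_LINE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')
# The two settings described earlier that move files out of the VM's directory.
RELOCATION_KEYS = ("workingDir", "sched.swap.dir")


def _is_under(path, prefix):
    """True if path is the prefix directory itself or lies inside it."""
    prefix = prefix.rstrip("/") + "/"
    return (path.rstrip("/") + "/").startswith(prefix)


def relocated_off_shared(vmx_text, shared_prefixes):
    """Return {setting: path} for relocation settings outside shared storage."""
    problems = {}
    for line in vmx_text.splitlines():
        match = VMX_LINE.match(line)
        if not match or match.group(1) not in RELOCATION_KEYS:
            continue
        path = match.group(2)
        if not any(_is_under(path, prefix) for prefix in shared_prefixes):
            problems[match.group(1)] = path
    return problems
```

An empty result means neither setting points outside the datastores you listed as shared. It does not prove that every host can actually reach those datastores, so treat it as a first check only.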

