hi there!

I see you've stumbled on to my humble home on the net, Drive:Activated. My name's Sam, I'm an ambitious and driven uni student, residing in Melbourne, Australia, wanting to make my mark on our world. This is my site, which is mainly just my blog and some other bits. There's no definite theme to my blog, just anything that interests me, and currently that's web trends, startups, ideas and cool stuff. Check it out, leave me a comment, click on 'Who is this?' to find out more about me, or drop me a line by clicking on 'Let's Talk'. Hope you enjoy it!


Recovering VMware snapshot after parent changed


Scroll down to the problem or solution section below if you want to cut to the chase. 

I upgraded my Kubuntu installation to Gutsy today - of course, it wasn't as smooth as it should've been. First I had to work out how to do it - the instructions were brief, screenshots confusing, and the process just didn't feel natural. The 'version upgrade' button only appears after you have satisfied certain conditions, conditions that you don't know. It just magically appears when it wants to, after pressing a special sequence of buttons.

Then the 'distribution upgrade' process crashed and packages wouldn't install. It ended up working after a few tries.

For some stupid reason, they still haven't fixed the 'failed to set xfermode' bug that heaps of people have encountered, and which really cripples the system because it doesn't boot at all. In fact, the upgrade removes the fix for it too - adding irqpoll to the end of the kernel line for the appropriate entry in /boot/grub/menu.lst.

Plus they introduced a new bug by adding tablet settings into /etc/X11/xorg.conf by default, even if no tablet exists, tripping up the system. And did I mention that the network connection is flaky and standby/hibernate still doesn't work? Linux is still Linux it seems.

Anyway, it all worked out in the end after some googling, so I went to install VMware Server on it to run my virtual machines there as well as in Windows. There is no package available for it, so follow the instructions here; however, use this patch instead.

Once all that was working, I ran the VMware Console, about to run my Windows Server 2003 Standard Edition virtual machine, when I thought, hmm..., I don't want this VMware instance fudging with the Windows VMware instance, so I'll create a new virtual machine, and link it to the existing virtual hard disk.

Problem

All sounded cool, until I accidentally linked to the base parent hard disk, and not the latest snapshot. So once I booted it, not only did I not have the latest changes, but when I re-linked it to the latest snapshot, it wouldn't boot anymore. Instead I got the error message, "Cannot open the disk ... Reason: The parent virtual disk has been modified since the child was created."

Did I mention that the virtual machine housed the test instance for this website, including the changes I had been working on all weekend, and I had no other backup? :P

After a few minutes of cursing and swearing, banging on tables, wondering wtf I had done, and pondering redoing all those changes again, I did what every self-respecting nerd does when they're stuck - I turned to Google.

Solution

I found these links:

Here is my solution, which is basically a rewrite of the process in the last link above, with a few more details. I used Linux to do the recovery, mainly because it had commands that I needed. I assume you have some Linux command line knowledge, as all this will be performed in the terminal.

  1. Make a copy of the virtual machine folder in case you screw up.
  2. Look at the size of the snapshot virtual hard disk. If it is more than 2GB and you're running a 32-bit OS, or it is more than the amount of memory that you have available, the following method will probably not work. You're welcome to try though.

    The virtual hard disk files all end in .vmdk. The snapshot one has -xxxxxx on the end of the file name, indicating the snapshot number. For example, if my virtual machine was called Windows Server 2003 Standard Edition, my base parent virtual disk will be named Windows Server 2003 Standard Edition.vmdk, and my snapshot may be named Windows Server 2003 Standard Edition-000002.vmdk.
  3. Find out the CID of the base parent virtual hard disk. Because this virtual hard disk will most likely be larger than 2GB, you won't be able to open it in nano, vi etc. As we only need to read from it, we can use a Linux command to print out only the first 20 or so lines.
    head --lines=20 {base parent vmdk path}

    Replace {base parent vmdk path} with the path to the base parent virtual hard disk file, e.g.
    head --lines=20 /media/sda1/"Virtual Machines"/"Windows Server 2003 Standard Edition"/"Windows Server 2003 Standard Edition.vmdk"
    The CID is the 8-character hexadecimal string on the line starting with CID=. Write this down somewhere.
  4. Now open up the snapshot virtual hard disk in a text editor, and change the parentCID (not CID) to the CID you recorded in the previous step. Then save. You can use nano, vi or some other Linux editor, e.g.
    sudo nano {snapshot vmdk path}
    Make sure to sudo the command, and also be patient - it could take a few minutes, during which the console may remain black; it is loading.

    I chose to do this in Windows instead, using Editpad Lite which is amazingly fast.
  5. That's it, your virtual machine should now start up again.
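
If you'd rather not load a multi-gigabyte file into an editor, steps 3 and 4 can be sketched as a small shell script. Treat this as a rough sketch rather than a tested recovery tool: the function names vmdk_field and fix_parent_cid are made up for illustration, it assumes GNU grep and dd, and it assumes both IDs are 8 hex digits as shown in the descriptor. It overwrites only the 8 bytes of the parentCID value in place, so the rest of the file is never rewritten. Run it only after making the backup from step 1.

```shell
# Hypothetical helper: print one descriptor field (CID, parentCID, ...) from a .vmdk.
vmdk_field() {
    LC_ALL=C grep --text -m1 "^$1=" "$2" | cut -d= -f2 | tr -d '\r'
}

# Hypothetical helper: copy the parent's CID into the snapshot's parentCID, in place.
fix_parent_cid() {
    local parent="$1" snapshot="$2" cid offset
    cid=$(vmdk_field CID "$parent")
    # Refuse to continue unless the CID looks like 8 hex digits.
    printf '%s\n' "$cid" | grep -Eqx '[0-9a-fA-F]{8}' || {
        echo "unexpected CID in $parent: '$cid'" >&2
        return 1
    }
    # Byte offset of the snapshot's parentCID line
    # (GNU grep: -b together with -o reports the offset of the match itself).
    offset=$(LC_ALL=C grep --text -b -o -m1 '^parentCID=[0-9a-fA-F]\{8\}' "$snapshot" \
        | head -n1 | cut -d: -f1)
    [ -n "$offset" ] || {
        echo "no parentCID line found in $snapshot" >&2
        return 1
    }
    # Overwrite only the 8 ID bytes after "parentCID=" (10 bytes);
    # conv=notrunc leaves everything else in the file untouched.
    printf '%s' "$cid" | dd of="$snapshot" bs=1 seek=$((offset + 10)) conv=notrunc 2>/dev/null
}
```

Usage would look something like: fix_parent_cid {base parent vmdk path} {snapshot vmdk path}. Because the replacement is exactly the same length as the old value, the file size and every byte offset after the descriptor stay the same.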

Further explanation

If you're interested, here's a deeper look into what you just did. At the beginning of each vmdk file is a disk descriptor section, which contains the properties of that virtual hard disk in text. The CID is a random unique identifier that identifies a particular state of the virtual disk - each time a change is made to the virtual hard disk, the CID changes.

In normal operation, the CID property of the base parent virtual hard disk is synced with the parentCID property of the snapshot virtual hard disk to show that the two files work together. The snapshot has to work with the base parent to be useful, as it only contains the differences from the base parent virtual hard disk. It is important to note that it is the snapshot's parentCID property that is synced with the base parent's CID property, not just the two CID properties in the virtual hard disks - the two virtual hard disks are in a parent-child relationship.

When you start up the base parent virtual hard disk on its own, however, changes are made to that virtual hard disk without being in sync with the snapshot, so the CIDs no longer match.

And when the CIDs no longer match, VMware complains because the snapshot is out of sync and the changes in the snapshot may not apply properly to the base parent anymore, possibly resulting in data corruption.

By forcing the CIDs to match again, you effectively trick VMware into thinking it was never out of sync.

Depending on how complex your virtual machine is though, it may be worth recreating your virtual machine after recovering your data because it won't be known where the corruption is, if any. If you did anything to the base parent virtual hard disk before realising and shutting down, e.g. copied files around, the risk of corruption is higher.
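
The parent-child check described above is easy to script. Here's a minimal sketch, assuming GNU grep; vmdk_field and cids_in_sync are hypothetical names for illustration, not VMware tools. It compares the parent's CID with the snapshot's parentCID and reports whether they match, which is handy both before and after a repair.

```shell
# Hypothetical helper: print one descriptor field (e.g. CID or parentCID) from a .vmdk.
vmdk_field() {
    LC_ALL=C grep --text -m1 "^$1=" "$2" | cut -d= -f2 | tr -d '\r'
}

# Hypothetical helper: succeed only if the snapshot's parentCID equals the parent's CID.
cids_in_sync() {
    local parent="$1" snapshot="$2" parent_cid expected
    parent_cid=$(vmdk_field CID "$parent")
    expected=$(vmdk_field parentCID "$snapshot")
    if [ -n "$parent_cid" ] && [ "$parent_cid" = "$expected" ]; then
        echo "in sync: parentCID=$expected"
    else
        echo "OUT OF SYNC: parent CID=$parent_cid, snapshot parentCID=$expected"
        return 1
    fi
}
```

Call it as cids_in_sync {base parent vmdk path} {snapshot vmdk path}. A match only tells you the IDs line up; it says nothing about whether the data underneath is still consistent.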

23 Comments

Ian said:

Hi Samuel --

One suggestion: instead of opening the snapshot file to replace the parentCID number (which, as you point out, doesn't work if the snapshot is >2GB), use command line utilities to make the change.

I found my parent CID from the base vmdk with:

grep --text -m2 CID= {base vmdk}

and the "wrong" parent CID in the snapshot vmdk:

grep --text -m2 CID= {snapshot vmdk}

Then replaced the child CID using a sed command:

sed -i -e 's/parentCID={wrong CID}/parentCID={right CID}/' {snapshot vmdk}

That should get it done!

Sam said:

Good idea Ian.

Gotta admit that thought never really crossed my mind as my snapshots were small enough. My Linux command/regexp skills aren't that awesome, so I had no idea about the sed command, but I'm kicking myself for not using grep to find the parentCID and CID lines - so obvious now.

Thanks for the tip!

Oliver said:

Thank you!

That certainly saved me from my own stupidity. Even before I had a chance to lose any sleep.

From now on my snapshots are going to experience very short lives.

Test and commit shall be the new motto.

Francis said:

fantastic advice

Justin said:

You friggen rock!  You saved my 6 hours of a night shift and 2 secs of stupidity!

Lorenz said:

Thank you! Great manual!

fallermax said:

I would like to say thank you very much! This manual was very helpful. Now i will live longer.

if you have windows 32bit system you can open and save big files with the program "winhex". It is very fast - i tried it out because i had not linux on my notebook.

Lucas Violini said:

What a day.. This really really saved me. Now I'll have to re-do our backup policy, keep everybody out of our vmware, but most of all CONGRATULATE you for your skills and knowledge. This saved me and now I have a much better understanding of those freaking snapshots. You are the MEN!

Mike Slass said:

The outline of the fix is this:

1) BACK EVERYTHING UP

2) lookup the CID of the parent disk image

3) lookup the (incorrect) parentCID of the curdled snapshot

  (you'll need both to make the sed command as restrictive as possible)

4) KEEPING THE BACKUP, remove the original of the curdled snapshot file

5) pipe just the beginning of the curdled snapshot through sed to change the parentCID

     and save that as the beginning of the reconstructed snapshot

6) append the rest of the curdled snapshot to the end of the reconstructed snapshot

dd is the tool for snipping pieces of a HUGE file

And here's how it looks in practice:

[root@build12 virtual_machines]# cp -R sea-cm-winvm01 /backup
[root@build12 virtual_machines]# cd sea-cm-winvm01

[root@build12 sea-cm-winvm01]# head -10 /backup/sea-cm-winvm01/sea-cm-winvm01-000001.vmdk
KDMV
# Disk DescriptorFile
version=1
CID=0d55cd6c
parentCID=b1ce363c                           <-- INCORRECT PARENT CID
createType="monolithicSparse"
parentFileNameHint="sea-cm-winvm01.vmdk"
# Extent description
RW 83886080 SPARSE "sea-cm-winvm01-000001.vmdk"

[root@build12 sea-cm-winvm01]# head -10 /backup/sea-cm-winvm01/sea-cm-winvm01.vmdk
KDM
Disk DescriptorFile
version=1
CID=d68511e8                                 <-- CORRECT PARENT CID
parentCID=ffffffff
createType="monolithicSparse"
# Extent description
RW 83886080 SPARSE "sea-cm-winvm01.vmdk"

[root@build12 sea-cm-winvm01]# rm sea-cm-winvm01-000001.vmdk

[root@build12 sea-cm-winvm01]# dd if=/backup/sea-cm-winvm01/sea-cm-winvm01-000001.vmdk count=10 | sed 's/parentCID=b1ce363c/parentCID=d68511e8/' >sea-cm-winvm01-000001.vmdk
10+0 records in
10+0 records out
5120 bytes (5.1 kB) copied, 0.00722415 seconds, 709 kB/s

[root@build12 sea-cm-winvm01]# dd if=/backup/sea-cm-winvm01/sea-cm-winvm01-000001.vmdk skip=10 seek=10 of=sea-cm-winvm01-000001.vmdk oflag=append
75301238+0 records in
75301238+0 records out
38554233856 bytes (39 GB) copied, 716.488 seconds, 53.8 MB/s

Sam said:

Thanks Mike for that - the solutions for this problem are getting more and more streamlined :) The one thing I'd probably add is to pipe the head commands through grep to pick out only the CID and parentCID lines. A bash script anyone? (Although I'd rather do it line by line just to be sure; it's worth understanding how VMWare works underneath anyway.)

You gotta wonder why VMWare hasn't automated a solution for this yet given how common it seems to happen. Then again, I'm not sure if I want to use their solution, given their track record with VMWare Converter - it's extremely slow, and often randomly fails for no obvious reason.

WeSam said:

THANK YOU .... YOU JUST SAVED ME WITH YOUR BLOG...

I used "010 Editor" to edit the 30G file, which was very fast.. no loading time even.

OMG said:

Thank you so much, you saved my life!

redfive said:

A MILLION THANKS!

I messed up the VMDKs of our main production server after attaching the main VMDK to another virtual machine to add some Windows files. When I attached the HD to the original virtual machine, it didn't boot any more and came up with the dreaded "parent modified..." message.

Fixed it on ESX Server 3.5 from the console, with "head --lines=20" and "nano", following your instructions. Worked perfectly! The main file was 137GB and there were 3 snapshot files, about 10GB each. The snapshots were linked from last to first and then to the main file (3->2->1->original).

After fixing the CIDs, the machine worked fine, even after having written and then deleted some files inside the VMDK.

You are a Star!

Angel, Santiago de Compostela, Spain.

Tang said:

I solved the issue by following your steps.

But I didn't work in Linux.

I made a simple tool for Windows.

Main Code:

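// Requires: using System; using System.IO; using System.Windows.Forms;
// Shows the first N lines of the selected .vmdk file in the result box.
try
{
    txtResult.Clear();
    int maxLines = (int)nudLines.Value;

    // "using" closes the file even if reading throws.
    using (StreamReader sr = new StreamReader(txtPath.Text))
    {
        string line;
        // Stop after maxLines or at end of file, whichever comes first.
        for (int i = 0; i < maxLines && (line = sr.ReadLine()) != null; i++)
        {
            txtResult.AppendText(line + "\r\n");
        }
    }
}
catch (Exception ex)
{
    MessageBox.Show(ex.ToString());
}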

Martin said:

Great solution!!

It saved me a lot of time, because I don't have to reinstall the whole system.

Thank you very much.

untill said:

I googled, found you, and you just saved my day. Quick, comprehensive, and easy.

Thanks a lot!

Mark Fitzwater said:

Thank you. You saved my life. I moved our primary domain controller only to find it wouldn't start up. AHH.

Your fix did the trick. In ESX 3.5 the files you mention are much smaller now and the main disk is called ***flat.vmdk

Guys... to sum it up : THANK YOU!!!

I too had the bad luck of a non-booting VM.

This page contains more relevant info than the rest of the web...

Again... THANKS, you guys saved me weeks of work!!

Bert

Matt said:

Perfect, this saved my bacon.  We had the issue described but the problem occurred during a VCB backup.

Ruediger said:

You saved my day. 2 weeks of work were in that snapshot, then I just clicked an old "Copy of ".vmx file.

I had more adrenaline than blood in me. If you are ever looking for someone to marry you... ;-)

Thanx Ruediger

T. Lucas said:

Thanks for your post. It got us through quite a pickle last night when ESXi blew up a VM during a snapshot deletion. Great stuff!

WT said:

Thanks for many hours saved
