
Re: [Xen-devel] Re: Xen-unstable save error



On 06/21/2010 03:45 PM, Keir Fraser wrote:
On 21/06/2010 14:37, "Michal Novotny"<minovotn@xxxxxxxxxx>  wrote:

My guest is a RHEL-5 i386 guest, but it seems that the suspend port is
missing. AFAIK, you started using SUSPEND_CANCEL some time ago, which
requires a modified kernel.

Could that be the issue, or how does the SUSPEND_CANCEL functionality
work?
SUSPEND_CANCEL is a different thing. The suspend port is simply a quicker
way for suspend notifications to be passed back and forth between the guest
and the dom0 toolstack. We fall back okay if the guest kernel does not
support the new faster method.

I'm not sure why the domain restore operation fails. Unfortunately some
error messages are now expected in the logs, since Remus functionality went
into the tree. So it's hard to work out what the first error is.

  -- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
OK Keir, but what I don't understand is why there's nothing in `/local/domain/%d/device/suspend/event-channel`. Is that expected?
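
For reference, the `%d` in that path is the domain ID. A minimal sketch of filling it in, assuming a numeric domain ID such as 5; `suspend_evtchn_path` is a hypothetical helper written for illustration, not a libxc or xend function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the xenstore path of the suspend event
 * channel for a given domain ID. Returns 0 on success, -1 if the
 * buffer is too small or formatting fails. */
static int suspend_evtchn_path(char *buf, size_t len, unsigned int domid)
{
    int n;

    assert(buf != NULL && len > 0);

    n = snprintf(buf, len,
                 "/local/domain/%u/device/suspend/event-channel", domid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting path (for example `/local/domain/5/device/suspend/event-channel`) could then be checked from dom0 with the `xenstore-read` tool.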

For the restore functionality:

# ls -ahl rhel5-32fv.sav
-rwxr-xr-x 1 root root 53M Jun 21  2010 rhel5-32fv.sav

As you can see, the save file is only 53M, but the guest had 1G of memory, and I think this is why the restore is failing.
The restore log also shows that the guest should have 1G of memory:
...
[2010-06-21 17:29:20 4305] DEBUG (XendDomainInfo:237) XendDomainInfo.restore(['domain', ['domid', '1'], ['cpu_weight', '256'], ['cpu_cap', '0'], ['on_crash', 'restart'], ['uuid', 'c91ec802-2015-cb49-80e5-810c808bf725'], ['bootloader_args'], ['pool_name', 'Pool-0'], ['vcpus', '1'], ['name', 'rhel5-32fv-stubdom'], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['cpus', [[]]], ['description'], ['bootloader'], ['maxmem', '1024'],* ['memory', '1024'],* ['shadow_memory', '9'], ['vcpu_avail', '1'], ['features'], ['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'], ['start_time', '1277134046.11'], ['cpu_time', '1.550284835'], ['online_vcpus', '1'], ['image', ['hvm', ['kernel'], ['superpages', '0'], ['tsc_mode', '0'], ['videoram', '4'], ['hpet', '0'], ['boot', 'c'], ['loader', '/usr/lib/xen/boot/hvmloader'], ['serial', 'pty'], ['vpt_align', '1'], ['xen_platform_pci', '1'], ['opengl', '1'], ['vncunused', '1'], ['rtc_timeoffset', '0'], ['pci', []], ['pae', '1'], ['stdvga', '0'], ['hap', '1'], ['viridian', '0'], ['acpi', '1'], ['localtime', '0'], ['timer_mode', '1'], ['vnc', '1'], ['nographic', '0'], ['guest_os_type', 'default'], ['vncdisplay', '1'], ['pci_msitranslate', '1'], ['oos', '1'], ['apic', '1'], ['sdl', '0'], ['nomigrate', '0'], ['device_model', '/usr/lib/xen/bin/qemu-dm'], ['pci_power_mgmt', '0'], ['usb', '0'], ['xauthority', '/root/.Xauthority'], ['isa', '0'], ['display', 'localhost:10.0'], ['notes', ['SUSPEND_CANCEL', '1']]]], ['status', '2'], ['state', 'r-----'], ['store_mfn', '1044476'], ['device', ['vif', ['bridge', 'virbr0'], ['uuid', 'dcd99a20-2e8f-2692-8e56-dc4051579923'], ['script', '/etc/xen/scripts/vif-bridge'], ['mac', '00:16:3e:5b:bd:9c'], ['type', 'ioemu'], ['backend', '0']]], ['device', ['vbd', ['uuid', 'e7e07da9-c104-800d-ee3f-5fe9757167fd'], ['bootable', '1'], ['dev', 'hda:disk'], ['uname', 'file:/var/lib/xen/images/colossus/rhel5-32fv.img'], ['mode', 'w'], ['backend', '0'], ['VDI']]], ['device', ['vbd', ['uuid', 
'0180089b-8394-cbfa-0da4-b8c1fc688617'], ['bootable', '0'], ['dev', 'sda:disk'], ['uname', 'file:/home2/test.img'], ['mode', 'w'], ['backend', '0'], ['VDI']]], ['device', ['vfb', ['vncunused', '1'], ['location', '127.0.0.1:5901'], ['vnc', '1'], ['vncdisplay', '1'], ['uuid', '7fa1bcc0-797d-66ac-eb88-6ef15f1209f0']]], ['device', ['console', ['protocol', 'vt100'], ['location', '3'], ['uuid', 'd77b182b-4152-a4d2-f577-8b610b5cd6ff']]]])
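
To put numbers on the mismatch, here is a back-of-the-envelope sketch. It assumes 4 KiB pages and that the save image stores page contents uncompressed; both are assumptions, and the helper names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB x86 pages */

/* Number of guest pages backing mem_mib MiB of memory. */
static uint64_t guest_pages(uint64_t mem_mib)
{
    return mem_mib * 1024ULL * 1024ULL / SKETCH_PAGE_SIZE;
}

/* Upper bound, in whole percent, on how much of the guest memory a
 * save file of file_mib MiB could hold, ignoring headers and device
 * state. */
static unsigned int coverage_percent(uint64_t file_mib, uint64_t mem_mib)
{
    assert(mem_mib > 0);
    return (unsigned int)(file_mib * 100ULL / mem_mib);
}
```

With `memory` set to 1024, `guest_pages(1024)` gives 262144 pages, and `coverage_percent(53, 1024)` shows that a 53M file can hold at most about 5% of them.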

The first error (Error when reading batch size (0 = Success): Internal error) comes from the pagebuf_get_one() function in libxc/xc_domain_restore.c:
...
    if ( RDEXACT(fd, &count, sizeof(count)) )
    {
        PERROR("Error when reading batch size");
        return -1;
    }
...
So I guess the data was not written out completely for this guest (the file is smaller than the original guest memory), and that's why the error occurs. As you can see, there's nothing in xend.log for the save except the "failed to get the suspend evtchn port" message:

[2010-06-21 15:59:55 4305] DEBUG (XendCheckpoint:126) [xc_save]: /usr/lib64/xen/bin/xc_save 56 5 0 0 4
[2010-06-21 15:59:55 4305] INFO (XendCheckpoint:410) xc_save: failed to get the suspend evtchn port
[2010-06-21 15:59:55 4305] INFO (XendCheckpoint:410)
[2010-06-21 15:59:55 4305] DEBUG (XendCheckpoint:381) suspend
[2010-06-21 15:59:55 4305] DEBUG (XendCheckpoint:129) In saveInputHandler suspend
[2010-06-21 15:59:55 4305] DEBUG (XendCheckpoint:131) Suspending 5 ...
[2010-06-21 15:59:55 4305] DEBUG (XendDomainInfo:521) XendDomainInfo.shutdown(suspend)
[2010-06-21 15:59:55 4305] DEBUG (XendDomainInfo:1877) XendDomainInfo.handleShutdownWatch
[2010-06-21 15:59:55 4305] INFO (XendDomainInfo:538) HVM save:remote shutdown dom 5!
[2010-06-21 15:59:55 4305] INFO (XendDomainInfo:2074) Domain has shutdown: name=migrating-rhel5-32fv-stubdom id=5 reason=suspend.
[2010-06-21 15:59:55 4305] INFO (XendCheckpoint:137) Domain 5 suspended.
[2010-06-21 15:59:56 4305] INFO (image:538) signalDeviceModel:restore dm state to running
[2010-06-21 15:59:56 4305] DEBUG (XendCheckpoint:146) Written done
[2010-06-21 16:00:02 4305] DEBUG (XendDomainInfo:3067) XendDomainInfo.destroy: domid=5
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2397) Destroying device model
[2010-06-21 16:00:03 4305] INFO (image:615) migrating-rhel5-32fv-stubdom device model terminated
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2404) Releasing devices
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2410) Removing vif/0
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2410) Removing vbd/768
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2410) Removing vbd/2048
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/2048
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2410) Removing vfb/0
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:2410) Removing console/0
[2010-06-21 16:00:03 4305] DEBUG (XendDomainInfo:1272) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
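
For what it's worth, the "(0 = Success)" part of the restore error would be consistent with the read hitting end-of-file rather than an OS-level error, which fits a truncated save file. A minimal sketch of that behaviour, assuming a read-exact helper that clears errno at end-of-file; `read_exact_sketch` is a hypothetical name and this is not the actual libxc code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly size bytes from fd.
 * Returns 0 on success and -1 on error or premature end-of-file. */
static int read_exact_sketch(int fd, void *data, size_t size)
{
    size_t offset = 0;

    assert(data != NULL || size == 0);

    while (offset < size) {
        ssize_t len = read(fd, (char *)data + offset, size - offset);

        if (len == -1 && errno == EINTR)
            continue;   /* interrupted by a signal: retry */
        if (len == 0)
            errno = 0;  /* EOF is not an OS error, so clear errno */
        if (len <= 0)
            return -1;  /* EOF or real error: report failure */
        offset += (size_t)len;
    }
    return 0;
}
```

If the stream ends before `sizeof(count)` bytes arrive, such a helper returns -1 with errno left at 0, and a perror-style message built from errno then reports "Success" even though the read failed.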

Any idea why the save file is that small? It should be at least 1024M, right?

Thanks,
Michal

--
Michal Novotny <minovotn@xxxxxxxxxx>, RHCE
Virtualization Team (xen userspace), Red Hat

