
Re: [Xen-devel] [Xen-users] DomU fails to reboot with storage driver domain



Adding xen-devel to the Cc, since this is AFAICT a real bug.

On Thu, 24 Mar 2016, Alex Velazquez wrote:

> On Wed, Mar 23, 2016 at 6:56 AM, Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> > Hello,
> >
> > On Mon, 21 Mar 2016, Alex Velazquez wrote:
> >> Hello,
> >>
> >> I am running Xen 4.6.0, with Ubuntu 14.04 as my Domain-0.
> >>
> >> I have a storage driver domain (PV guest running Ubuntu 14.04) that
> >> serves a disk backend to a PV DomU (also running Ubuntu 14.04).
> >>
> >> Here is the XL config file of StorageDom:
> >>
> >> > name = "StorageDom"
> >> > memory = 1024
> >> > maxmem = 1024
> >> > vcpus = 2
> >> > maxvcpus = 2
> >> > driver_domain = 1
> >> > pci = [ "84:00.0" ]
> >> > builder = "generic"
> >> > kernel = "/var/lib/xen/images/vmlinuz-3.19.0-56-generic"
> >> > ramdisk = "/var/lib/xen/images/initrd.img-3.19.0-56-generic"
> >> > cmdline = "root=/dev/sda1 ro"
> >>
> >>
> >> Here is the XL config file of ClientDom:
> >>
> >> > name = "ClientDom"
> >> > memory = 1024
> >> > maxmem = 1024
> >> > vcpus = 2
> >> > maxvcpus = 2
> >> > builder = "generic"
> >> > kernel = "/usr/local/lib/xen/boot/pv-grub-x86_64.gz"
> >> > cmdline = "(hd0,0)/boot/grub/menu.lst"
> >> > disk = [ "format=raw,vdev=xvda,access=rw,backend=StorageDom,target=/dev/sdb" ]
> >>
> >>
> >> When I start ClientDom, everything looks good. Here is the backend
> >> entry in xenstore:
> >>
> >> > user@ubuntu ~> $ sudo xenstore-ls /local/domain/1/backend/vbd
> >> > 2 = ""
> >> >  51712 = ""
> >> >   frontend = "/local/domain/2/device/vbd/51712"
> >> >   params = "/dev/sdb"
> >> >   script = "/etc/xen/scripts/block"
> >> >   frontend-id = "2"
> >> >   online = "1"
> >> >   removable = "0"
> >> >   bootable = "1"
> >> >   state = "4"
> >> >   dev = "xvda"
> >> >   type = "phy"
> >> >   mode = "w"
> >> >   device-type = "disk"
> >> >   discard-enable = "1"
> >> >   physical-device = "8:10"
> >> >   feature-flush-cache = "1"
> >> >   feature-discard = "0"
> >> >   feature-barrier = "1"
> >> >   feature-persistent = "1"
> >> >   feature-max-indirect-segments = "256"
> >> >   sectors = "1562824368"
> >> >   info = "2"
> >> >   sector-size = "512"
> >> >   physical-sector-size = "512"
> >> >   hotplug-status = "connected"
> >>
> >>
> >> And here is the corresponding frontend entry:
> >>
> >> > user@ubuntu ~> $ sudo xenstore-ls /local/domain/2/device/vbd
> >> > 51712 = ""
> >> >  backend = "/local/domain/1/backend/vbd/2/51712"
> >> >  backend-id = "1"
> >> >  state = "4"
> >> >  virtual-device = "51712"
> >> >  device-type = "disk"
> >> >  protocol = "x86_64-abi"
> >> >  ring-ref = "8"
> >> >  event-channel = "17"
> >> >  feature-persistent = "1"
> >>
> >>
> >> I run into problems if I try to reboot ClientDom (either from within
> >> the VM, or by calling "xl reboot ClientDom" from Domain-0). As
> >> ClientDom goes down, the backend entry is cleared out:
> >>
> >> > user@ubuntu ~> $ sudo xenstore-ls /local/domain/1/backend/vbd
> >> > 2 = ""
> >>
> >>
> >> Then ClientDom comes back up with ID 3, but the new backend/frontend
> >> are not created:
> >>
> >> > user@ubuntu ~> $ sudo xenstore-ls /local/domain/1/backend/vbd
> >> > 2 = ""
> >>
> >>
> >> > user@ubuntu ~> $ sudo xenstore-ls /local/domain/3/device/vbd
> >> > xenstore-ls: xs_directory (/local/domain/3/device/vbd): No such file or directory
> >>
> >>
> >> > user@ubuntu ~> $ sudo xenstore-ls /local/domain/3/device
> >> > suspend = ""
> >> >  event-channel = ""
> >>
> >>
> >> Connecting to ClientDom's console shows the PvGrub prompt, because it
> >> can't find its boot disk:
> >>
> >> >
> >> >     GNU GRUB  version 0.97  (1048576K lower / 0K upper memory)
> >> >
> >> >        [ Minimal BASH-like line editing is supported.   For
> >> >          the   first   word,  TAB  lists  possible  command
> >> >          completions.  Anywhere else TAB lists the possible
> >> >          completions of a device/filename. ]
> >> >
> >> > grubdom> root (hd0,0)
> >> >
> >> > Error 21: Selected disk does not exist
> >> >
> >> > grubdom>

Hello,

I've been able to reproduce this locally using the latest Xen unstable,
and it is indeed a bug. The issue happens because libxl compares the
devices in its JSON status with the list of backends it fetches from
xenstore, but it only looks under Domain 0. If the domain is using
backends from a driver domain, libxl is not able to find those entries
under Domain 0 and discards the devices. As an example,
libxl__append_disk_list_of_type hardcodes the backend path of Domain 0 as
the only search path.
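
To make the failure mode concrete, here is a rough sketch in Python rather
than libxl's C. It models xenstore as a plain dict using the paths from the
xenstore-ls output quoted above. The helper names (backend_devids,
surviving_disks) and the shape of the JSON disk entry are made up for
illustration and are not libxl's actual code or data layout.

```python
# Toy model of xenstore: a flat dict mapping paths to values.
# Paths mirror the quoted xenstore-ls output; helper names are hypothetical.
XENSTORE = {
    "/local/domain/1/backend/vbd/2/51712/frontend": "/local/domain/2/device/vbd/51712",
    "/local/domain/1/backend/vbd/2/51712/params": "/dev/sdb",
    "/local/domain/1/backend/vbd/2/51712/state": "4",
}


def backend_devids(store, backend_domid, guest_domid, kind="vbd"):
    """Return the device ids found under one backend domain's path."""
    prefix = f"/local/domain/{backend_domid}/backend/{kind}/{guest_domid}/"
    return sorted({path[len(prefix):].split("/")[0]
                   for path in store if path.startswith(prefix)})


def surviving_disks(json_disks, store, guest_domid, search_domids):
    """Keep only the JSON disks whose devid shows up in a searched backend."""
    found = set()
    for domid in search_domids:
        found.update(backend_devids(store, domid, guest_domid))
    return [disk for disk in json_disks if disk["devid"] in found]


# Assumed, simplified stand-in for the disk entry kept in the JSON status.
json_disks = [{"vdev": "xvda", "devid": "51712", "backend": "StorageDom"}]

# Searching Domain 0 only: the disk served by domain 1 is not found.
dom0_only = surviving_disks(json_disks, XENSTORE, 2, search_domids=[0])
# Searching the driver domain as well: the disk is found.
with_driver = surviving_disks(json_disks, XENSTORE, 2, search_domids=[0, 1])
```

With only Domain 0 in the search list the disk is dropped, which matches
the empty vbd directory the rebooted guest ends up with.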

TBH, I don't see an easy way to solve this. I've thought about fetching
the "backend" node from the xenstore frontend path of each device, but
that's not safe since the guest can modify those entries.
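
A small sketch of why trusting that node is a problem, again with xenstore
modelled as a Python dict and the paths taken from the report. The
back-pointer check (backend_points_back) is only a hypothetical
illustration of the trust issue, not something libxl does and not a
proposed fix. The spoofed path is an invented example value.

```python
# Toy xenstore as a dict; paths follow the xenstore-ls output in the report.
store = {
    # Lives under the guest's own subtree, so the guest can rewrite it.
    "/local/domain/2/device/vbd/51712/backend": "/local/domain/1/backend/vbd/2/51712",
    # Lives under the backend domain's subtree.
    "/local/domain/1/backend/vbd/2/51712/frontend": "/local/domain/2/device/vbd/51712",
}


def claimed_backend(store, guest_domid, devid):
    """Naive lookup: trust whatever the frontend's "backend" node says."""
    return store.get(f"/local/domain/{guest_domid}/device/vbd/{devid}/backend")


def backend_points_back(store, guest_domid, devid):
    """Hypothetical cross-check: the claimed backend must name this frontend."""
    frontend = f"/local/domain/{guest_domid}/device/vbd/{devid}"
    backend = claimed_backend(store, guest_domid, devid)
    return backend is not None and store.get(f"{backend}/frontend") == frontend


honest = backend_points_back(store, 2, "51712")

# A misbehaving guest rewrites its own node to point somewhere else.
store["/local/domain/2/device/vbd/51712/backend"] = "/local/domain/0/backend/vbd/7/51712"
spoofed_path = claimed_backend(store, 2, "51712")
spoofed_ok = backend_points_back(store, 2, "51712")
```

The naive lookup follows whatever the guest wrote, so any code that uses
the frontend's "backend" node as its search path inherits that trust.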

It's not clear to me why we need to check the JSON internal data against
the devices in xenstore. Is there a possibility that they get out of
sync?

Roger.