
Re: [Xen-users] PV DomU can't access disk from storage driver domain



On Tue, Sep 22, 2015 at 9:48 AM, Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> On 21/09/15 at 22.31, Alex Velazquez wrote:
>> On Mon, Sep 21, 2015 at 12:32 PM, Alex Velazquez
>> <alex.j.velazquez@xxxxxxxxx> wrote:
>>> On Mon, Sep 21, 2015 at 8:22 AM, Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> wrote:
>>>> On 18/09/15 at 20.34, Alex Velazquez wrote:
>>>>> The last few lines in storagedd's kernel log are:
>>>>>
>>>>>  admin@storagedd:~$ sudo dmesg
>>>>>  [....]
>>>>>  [    4.012464] init: plymouth-upstart-bridge main process (163)
>>>>> killed by TERM signal
>>>>>  [    5.561811] init: plymouth-splash main process (1078) terminated
>>>>> with status 1
>>>>>  [   48.847611] xen-blkback:ring-ref 2047, event-channel 4, protocol 1
>>>>> (x86_64-abi)
>>>>>  [   52.758780] xen-blkback:backend/vbd/9/51712: prepare for reconnect
>>>>>  [   52.927883] xen-blkback:ring-ref 8, event-channel 10, protocol 1
>>>>> (x86_64-abi) persistent grants
>>>>>
>>>>> While the client is booting, the backend entry appears in xenstore as
>>>>> follows:
>>>>>
>>>>>  xenuser@xenhost:~$ sudo xenstore-ls /local/domain/2/backend
>>>>>  vbd = ""
>>>>>   3 = ""
>>>>>    51712 = ""
>>>>>     frontend = "/local/domain/3/device/vbd/51712"
>>>>>     params = "/dev/loop0"
>>>>>     script = "/etc/xen/scripts/block"
>>>>>     frontend-id = "3"
>>>>>     online = "1"
>>>>>     removable = "0"
>>>>>     bootable = "1"
>>>>>     state = "4"
>>>>>     dev = "xvda"
>>>>>     type = "phy"
>>>>>     mode = "w"
>>>>>     device-type = "disk"
>>>>>     discard-enable = "1"
>>>>>     physical-device = "7:0"
>>>>>     hotplug-status = "connected"
>>>>>     feature-flush-cache = "1"
>>>>>     discard-granularity = "4096"
>>>>>     discard-alignment = "0"
>>>>>     discard-secure = "0"
>>>>>     feature-discard = "1"
>>>>>     feature-barrier = "1"
>>>>>     feature-persistent = "1"
>>>>>     feature-max-indirect-segments = "256"
>>>>>     sectors = "25165824"
>>>>>     info = "0"
>>>>>     sector-size = "512"
>>>>>     physical-sector-size = "512"
>>>>>
>>>>> However, interestingly, it is cleared out after a few seconds:
>>>>>
>>>>>  xenuser@xenhost:~$ sudo xenstore-ls /local/domain/2/backend
>>>>>  backend = ""
>>>>>   vbd = ""
>>>>>    3 = ""
>>>>
>>>> That's not expected. Can you enable the xenstored trace in order to
>>>> find out who is clearing this directory?
>>>>
>>>> On Debian systems you can enable xenstored tracing in the
>>>> /etc/default/xencommons file.
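>>>>
>>>> As a quick cross-check, you could also watch the node directly to see
>>>> when it disappears (note this only shows when it changes, not which
>>>> domain changed it):
>>>>
>>>>  xenstore-watch /local/domain/2/backend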
>>>>
>>>> Roger.
>>>>
>>>
>>> I tried to enable the xenstored trace, but I'm not sure it's working (is
>>> the trace log different from the access log?). I added the following
>>> lines to /etc/default/xencommons:
>>>
>>>  XENSTORED_ARGS="--trace-file /var/log/xen/xenstored-trace.log"
>>>  XENSTORED_TRACE=1
>>>
>>> And after rebooting, the argument seems to be added correctly:
>>>
>>>  xenuser@xenhost:~$ ps -ef | grep xenstored
>>>  root      1414     1  0 11:02 ?        00:00:00
>>> /usr/local/sbin/oxenstored --pid-file /var/run/xenstored.pid -T
>>> /var/log/xen/xenstored-trace.log
>>>
>>> But the file /var/log/xen/xenstored-trace.log never gets created. I
>>> created it manually, but nothing gets written to it. There is,
>>> however, a file called /var/log/xenstored-access.log, so maybe that
>>> will help? It's a large file, so I've attached it to this email. The
>>> first mention of "/local/domain/2/backend" is at line 707.
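>>>
>>> (For anyone following along, a grep like
>>>
>>>  grep -n '/local/domain/2/backend' /var/log/xenstored-access.log
>>>
>>> lists every entry touching that path, with line numbers.)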
>>>
>>> To recap the sequence of events... This is the initial backend entry
>>> in xenstore:
>>>
>>>  admin@storagedd:~$ sudo xenstore-ls /local/domain/2/backend
>>>  vbd = ""
>>>   3 = ""
>>>    51712 = ""
>>>     frontend = "/local/domain/3/device/vbd/51712"
>>>     params = "/dev/loop0"
>>>     script = "/etc/xen/scripts/block"
>>>     frontend-id = "3"
>>>     online = "0"
>>>     removable = "0"
>>>     bootable = "1"
>>>     state = "5"
>>>     dev = "xvda"
>>>     type = "phy"
>>>     mode = "w"
>>>     device-type = "disk"
>>>     discard-enable = "1"
>>>     physical-device = "7:0"
>>>     feature-flush-cache = "1"
>>>     feature-discard = "0"
>>>     feature-barrier = "1"
>>>     feature-persistent = "1"
>>>     feature-max-indirect-segments = "256"
>>>     sectors = "25165824"
>>>     info = "0"
>>>     sector-size = "512"
>>>     physical-sector-size = "512"
>>>     hotplug-status = "connected"
>>>
>>> The client boots and seems to have no trouble reading from the disk.
>>> However, after a few seconds, this message prints and the client locks
>>> up:
>>>
>>>  Ubuntu 14.04.3 LTS client hvc0
>>>
>>>  client login: [    9.948994] vbd vbd-51712: 16 Device in use; refusing to 
>>> close
>>>  [    9.949099] vbd vbd-51712: failed to write error node for
>>> device/vbd/51712 (16 Device in use; refusing to close)
>>>
>>> Around this time, the backend entry in xenstore is cleared out:
>>>
>>>  admin@storagedd:~$ sudo xenstore-ls /local/domain/2/backend
>>>  vbd = ""
>>>   3 = ""
>>>
>>> The kernel log of storagedd:
>>>
>>>  admin@storagedd:~$ sudo dmesg | less
>>>  [....]
>>>  [   71.106439] xen-blkback:ring-ref 2047, event-channel 4, protocol 1
>>> (x86_64-abi)
>>>  [   75.981704] xen-blkback:backend/vbd/3/51712: prepare for reconnect
>>>  [   76.151381] xen-blkback:ring-ref 8, event-channel 10, protocol 1
>>> (x86_64-abi) persistent grants
>>>
>>> The xl devd log of storagedd:
>>>
>>>  admin@storagedd:~$ cat /var/log/xen/xldevd.log
>>>  libxl: error: libxl_device.c:953:device_backend_callback: unable to
>>> remove device with path /local/domain/2/backend/vbd/3/51712
>>>
>>> The message in Dom0 when the client is destroyed:
>>>
>>>  xenuser@xenhost:~$ sudo xl destroy client
>>>  libxl: error: libxl_device.c:953:device_backend_callback: unable to
>>> remove device with path /local/domain/2/backend/vbd/3/51712
>>>  libxl: error: libxl.c:1654:devices_destroy_cb: libxl__devices_destroy
>>> failed for 3
>>>
>>> After the client is destroyed, the backend entry in xenstore is gone:
>>>
>>>  admin@storagedd:~$ sudo xenstore-ls /local/domain/2/backend
>>>  xenstore-ls: xs_directory (/local/domain/2/backend): No such file or 
>>> directory
>>>
>>> Finally, the xl dmesg output in Dom0 corresponding to the client booting:
>>>
>>>  xenuser@xenhost:~$ sudo xl dmesg | less
>>>  (XEN) memory.c:161:d0v6 Could not allocate order=9 extent: id=3
>>> memflags=0 (476 of 512)
>>>  (d3) Bootstrapping...
>>>  (d3) Xen Minimal OS!
>>>  (d3)   start_info: 0xba4000(VA)
>>>  (d3)     nr_pages: 0x40000
>>>  (d3)   shared_inf: 0xa1d1c000(MA)
>>>  (d3)      pt_base: 0xba7000(VA)
>>>  (d3) nr_pt_frames: 0xb
>>>  (d3)     mfn_list: 0x9a4000(VA)
>>>  (d3)    mod_start: 0x0(VA)
>>>  (d3)      mod_len: 0
>>>  (d3)        flags: 0x0
>>>  (d3)     cmd_line: (hd0,0)/boot/grub/menu.lst
>>>  (d3)        stack: 0x9630e0-0x9830e0
>>>  (d3) MM: Init
>>>  (d3)       _text: 0x0(VA)
>>>  (d3)      _etext: 0x75374(VA)
>>>  (d3)    _erodata: 0x90000(VA)
>>>  (d3)      _edata: 0x95d20(VA)
>>>  (d3) stack start: 0x9630e0(VA)
>>>  (d3)        _end: 0x9a36e0(VA)
>>>  (d3)   start_pfn: bb5
>>>  (d3)     max_pfn: 40000
>>>  (d3) Mapping memory range 0x1000000 - 0x40000000
>>>  (d3) setting 0x0-0x90000 readonly
>>>  (d3) skipped 1000
>>>  (d3) MM: Initialise page allocator for dad000(dad000)-40000000(40000000)
>>>  (d3) MM: done
>>>  (d3) Demand map pfns at 40001000-0x2040001000.
>>>  (d3) Heap resides at 2040002000-4040002000.
>>>  (d3) Initialising timer interface
>>>  (d3) Initialising console ... done.
>>>  (d3) gnttab_table mapped at 0x40001000.
>>>  (d3) Initialising scheduler
>>>  (d3) Thread "Idle": pointer: 0x0x2040002050, stack: 0x0xfc0000
>>>  (d3) Thread "xenstore": pointer: 0x0x2040002800, stack: 0x0xfd0000
>>>  (d3) xenbus initialised on irq 1 mfn 0x14aba5
>>>  (d3) Thread "shutdown": pointer: 0x0x2040002fb0, stack: 0x0xfe0000
>>>  (d3) main.c: dummy main: start_info=0x9831e0
>>>  (d3) Thread "main": pointer: 0x0x2040003760, stack: 0x0xff0000
>>>  (d3) vbd 51712 is hd0
>>>  (d3) ******************* BLKFRONT for device/vbd/51712 **********
>>>  (d3)
>>>  (d3)
>>>  (d3) backend at /local/domain/2/backend/vbd/3/51712
>>>  (d3) 25165824 sectors of 512 bytes
>>>  (d3) **************************
>>>  (d3) ============= Init TPM Front ================
>>>  (d3) Tpmfront:Error Unable to read device/vtpm/0/backend-id during
>>> tpmfront initialization! error = ENOENT
>>>  (d3) Tpmfront:Info Shutting down tpmfront
>>>  (d3) close blk: backend=/local/domain/2/backend/vbd/3/51712
>>> node=device/vbd/51712
>>>  (d3) go!
>>>  (d3) mapping kernel into physical memory
>>>  (d3) about to get started...
>>>  (XEN) grant_table.c:1491:d3v0 Expanding dom (3) grant table from (4)
>>> to (5) frames.
>>>  (XEN) grant_table.c:1491:d3v0 Expanding dom (3) grant table from (5)
>>> to (6) frames.
>>>  (XEN) grant_table.c:1491:d3v0 Expanding dom (3) grant table from (6)
>>> to (7) frames.
>>>
>>> Thanks for your help troubleshooting this. Let me know if there's any
>>> other info that might be helpful.
>>>
>>> Thanks,
>>> Alex
>>
>> Looks like storagedd is the one that clears the backend directory (D2
>> is storagedd, D3 is the client):
>
> Yes, this is a bug in the current devd implementation. I'm preparing a
> patch to solve it, but in the meantime you can work around it by using
> pygrub instead of pvgrub.
>
> Roger.
>

Great, I can confirm that using pygrub instead of pvgrub to boot the
client works. Looking forward to the patch.
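
In case it's useful to anyone else hitting this, the switch amounts to
replacing the pvgrub kernel with the pygrub bootloader in the client's
config, roughly like so (the pvgrub path is illustrative and varies by
install):

 # before (pvgrub):
 #kernel = "/usr/lib/xen/boot/pv-grub-x86_64.gz"
 #extra  = "(hd0,0)/boot/grub/menu.lst"
 # after (pygrub extracts the kernel from the guest disk at boot):
 bootloader = "pygrub"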

Thanks,
Alex

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users

 

