[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] block-iscsi with Xen 4.5 / 4.6



On Wed, May 4, 2016 at 11:18 AM, Steven Haigh <netwiz@xxxxxxxxx> wrote:
> On 4/05/2016 7:37 PM, Roger Pau Monné wrote:
>> On Wed, May 04, 2016 at 06:41:26PM +1000, Steven Haigh wrote:
>>> On 4/05/2016 5:34 PM, Roger Pau Monné wrote:
>>>> On Wed, May 04, 2016 at 03:06:23PM +1000, Steven Haigh wrote:
>>>> It is important for us to use the '-e' in order to make sure all the 
>>>> failure
>>>> points are correctly handled, without the '-e' some command might fail and
>>>> the script wouldn't realize.
>>>
>>> I honestly think this is pretty nasty. While it may not be true of all
>>> scripts, the block-iscsi script can only really fail in a couple of
>>> places - yet we have this set of procedures called:
>>>
>>> parse_target -> check_tools -> prepare -> add -> attach -> find_device
>>> -> write_dev.
>>>
>>> At least check_tools, prepare, add, attach, find_device could all be
>>> rolled into a single function - as the majority of the rest is 1-4 lines
>>> of code.
>>
>> No, check_tools is used by both the attach and the detach path, so it cannot
>> be rolled into a single function together with the other ones, and the same
>> applies to mostly all other functions (find_device is also shared between
>> the add and remove functions).
>>
>> IMHO, I think the current code is fine because each function has a small
>> logical task to accomplish, so it's easy to make sure each function does
>> what it's supposed to do, nothing more and nothing less. Batching everything
>> into one big function would make this harder.
>>
>> That doesn't mean that I'm not open to improving it, so if you think it
>> would be better/easier using some other logical organization patches are
>> welcome :).
>
> Right now, my changes are here:
>         http://paste.fedoraproject.org/362462/62356799/
>
> It works perfectly well if you're the ONLY device connecting to the
> specified iSCSI target, but falls apart when something else has the lock
> and doesn't clean up after itself.
>
> From using set -x in there, I can see this when it runs:
> ++ dirname /etc/xen/scripts/block-iscsi-lock
> + dir=/etc/xen/scripts
> + . /etc/xen/scripts/block-common.sh
> +++ dirname /etc/xen/scripts/block-iscsi-lock
> ++ dir=/etc/xen/scripts
> ++ . /etc/xen/scripts/xen-hotplug-common.sh
> ++++ dirname /etc/xen/scripts/block-iscsi-lock
> +++ dir=/etc/xen/scripts
> +++ . /etc/xen/scripts/hotplugpath.sh
> ++++ sbindir=/usr/sbin
> ++++ bindir=/usr/bin
> ++++ LIBEXEC=/usr/lib/xen
> ++++ LIBEXEC_BIN=/usr/lib/xen/bin
> ++++ libdir=/usr/lib64
> ++++ SHAREDIR=/usr/share
> ++++ XENFIRMWAREDIR=/usr/lib/xen/boot
> ++++ XEN_CONFIG_DIR=/etc/xen
> ++++ XEN_SCRIPT_DIR=/etc/xen/scripts
> ++++ XEN_LOCK_DIR=/var/lock
> ++++ XEN_RUN_DIR=/var/run/xen
> ++++ XEN_PAGING_DIR=/var/lib/xen/xenpaging
> ++++ XEN_DUMP_DIR=/var/lib/xen/dump
> +++ . /etc/xen/scripts/logging.sh
> +++ . /etc/xen/scripts/xen-script-common.sh
> ++++ set -e

Ah, there's the culprit. :-)

> +++ . /etc/xen/scripts/locking.sh
> ++++ LOCK_BASEDIR=/var/run/xen-hotplug
> +++ exec
> Entered lock_device
>   SUN       COMSTAR           1.0
>   Peripheral device type: disk
> libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus:
> /etc/xen/scripts/block-iscsi-lock add [22016] exited with error status 99
> libxl: error: libxl.c:3127:local_device_attach_cb: unable to add vbd
> with id 268446976: No such file or directory
> libxl: error: libxl_bootloader.c:408:bootloader_disk_attached_cb: failed
> to attach local disk for bootloader execution
> libxl: error: libxl_bootloader.c:279:bootloader_local_detached_cb:
> unable to detach locally attached disk
> libxl: error: libxl_create.c:1142:domcreate_rebuild_done: cannot
> (re-)build domain: -3
> libxl: error: libxl.c:1591:libxl__destroy_domid: non-existant domain 57
> libxl: error: libxl.c:1549:domain_destroy_callback: unable to destroy
> guest with domid 57
> libxl: error: libxl.c:1476:domain_destroy_cb: destruction of domain 57
> failed
>
> On a side note, it looks like the -e is not very uniformly used:
> block:#!/bin/bash
> block-enbd:#!/bin/bash
> block-iscsi:#!/bin/bash -e
> block-iscsi-lock:#!/bin/bash -x
> block-nbd:#!/bin/bash
> block-tap:#!/bin/bash -e
> external-device-migrate:#!/bin/bash
> vif2:#!/bin/bash
> vif-bridge:#!/bin/bash
> vif-nat:#!/bin/bash
> vif-openvswitch:#!/bin/bash
> vif-route:#!/bin/bash
> vif-setup:#!/bin/bash
>
> but probably gets set everywhere via xen-script-common.sh - hence things
> dying easily.

Yeah, set -e should probably certainly be set only in the top-level
script, not something included 3 levels down.

But as Roger said, the reason you're having problems is that you're
not calling sg_persist in a way that allows bash to know that failure
is being handled.  Putting the stg_persist in an if statement, rather
than just running it and checking the error code afterwards, should
mitigate your problem with -e.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.