Re: [Xen-devel] block-iscsi with Xen 4.5 / 4.6



On 4/05/2016 5:34 PM, Roger Pau Monné wrote:
> Hello,
> 
> I'm re-adding xen-devel in case someone else also wants to provide feedback.
> 
> On Wed, May 04, 2016 at 03:06:23PM +1000, Steven Haigh wrote:
>> Hi Roger,
>>
>> I've been getting some good progress with iSCSI thanks to your insights.
>>
>> I'm now trying to add support for locking via Persistent Reservations to
>> ensure that only one Dom0 can attach / use a single iSCSI target at once.
> 
> This might be problematic with migrations. IIRC there's a point during the 
> migration where both the sending and the receiving side have the disk open 
> at the same time. However Xen always makes sure that only one guest is 
> actually accessing the disk, either the one on the receiving side (if 
> everything has gone OK) or the one on the sender's side (if migration has 
> failed).

True. However, I'd eventually like to submit these changes to the
project, with locking as an opt-in option - just like iqn / portal /
multipath.

In my specific use case, it's to stop someone accidentally starting the
same VM on multiple Dom0s at the same time - which from what I've seen
causes disk corruption and all kinds of other issues. It leads to people
not having a good time.

The iSCSI system has a limit on the maximum number of connections -
however, it seems that this only applies *per host*, meaning max
connections = 1 will still allow one connection from each Dom0.

>> In a nutshell, my thoughts are to use the following to 'lock' a device:
>>      ## Create a hex key for the lock from the system's IP.
>>      key=$(gethostip -x $(uname -n))
>>      sg_persist -d ${dev} -o -G -S ${key}
>>      sg_persist -d ${dev} -o -R -K ${key} -T 6
>>
>> This registers the key with the device, and takes an Exclusive Access -
>> Registrants Only (-T 6) reservation on the iSCSI device, which means
>> nothing else will be able to open the device until the lock is removed.
>>
>> To unlock the device, on remove, we should do something like:
>>      key=$(gethostip -x $(uname -n))
>>      sg_persist -d ${dev} -o -L -K ${key} -T 6
>>      sg_persist -d ${dev} -o -G -K ${key} -S 0
>>
>> This releases the device for other things to use.
>>
>> I've tried putting these in block-iscsi - by using a lock_device and
>> unlock_device function and calling it after find_device in both attach()
>> and remove().
>>
>> My problems:
>> 1) -e is set on the script - and maybe elsewhere - so any time something
>> returns non-zero, you can't clean up. For example, if you can't get a
>> lock, you should make sure all locks are removed from the host in
>> question and then detach the iSCSI target.
> 
> You can avoid this by adding something like:
> 
> sg_persist ... || true
> 
> Of course you can replace the "true" command with something else, like a 
> fatal message or some cleanup code. You can also place the command inside of 
> a conditional if you know it might fail:
> 
> if ! sg_persist ...; then
>       fatal ...
> fi
> 
> It is important for us to use the '-e' in order to make sure all the failure 
> points are correctly handled; without the '-e' some command might fail and 
> the script wouldn't realize.

I honestly think this is pretty nasty. While it may not be true of all
scripts, the block-iscsi script can only really fail in a couple of
places - yet we have this set of procedures called:

parse_target -> check_tools -> prepare -> add -> attach -> find_device
-> write_dev.

At least check_tools, prepare, add, attach, find_device could all be
rolled into a single function - as the majority of the rest is 1-4 lines
of code.

There are situations where you may want to evaluate the result of
sg_persist beyond a simple "worked or failed" - and that seems to be the
idea of fatal "The reason that I died is X".
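
A minimal sketch of that idea, assuming bash: capture the exit status in
a variable so that 'set -e' does not kill the script, then decide what
to do with it. The names run_rc and try_reserve are made up for
illustration and are not part of block-iscsi; try_reserve would wrap the
real sg_persist call.

```shell
#!/bin/bash
set -e

# Run a command and store its exit status in $rc.
# The '|| rc=$?' keeps a failure from triggering errexit.
run_rc() {
    rc=0
    "$@" || rc=$?
}

# Hypothetical wrapper: runs the given command and reports the outcome
# together with the actual exit status, not just "worked or failed".
try_reserve() {
    run_rc "$@"
    if [ "$rc" -eq 0 ]; then
        echo "reserved"
    else
        echo "reserve failed (exit $rc)"
    fi
}
```

Called as try_reserve sg_persist -d ${dev} -o -R -K ${key} -T 6, the
else branch could map specific exit statuses (the sg3_utils man page
lists what they mean) to different fatal messages.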

>> 2) I can't find an easy way to clean up by doing an iscsiadm --logout if
>> the locking fails.
> 
> I'm not really following here, maybe because I don't know that much about 
> iSCSI. Can you just put whatever code is needed in order to unlock before 
> doing the logout? Or that's not how it works?

Yes, but if one of the two unlocks fails, the script terminates. This
makes finer-grained error checking *VERY* difficult. Even if I remove
the -e from line #1, the script still behaves as if -e were set - so
something else must be enforcing it.
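
A rough sketch of one way around both points, assuming bash. Both
helpers are hypothetical and not existing block-iscsi functions:
errexit_active looks at $- to confirm whether -e is really in effect at
a given point in the script, and best_effort runs a single cleanup step,
logging a failure instead of letting it abort the steps that follow.

```shell
#!/bin/bash
set -e

# True if errexit is active in the current shell
# ($- holds the currently enabled option flags).
errexit_active() {
    case $- in
        *e*) return 0 ;;
    esac
    return 1
}

# Run one cleanup step. A failure is logged and counted in $failed,
# but the function itself always succeeds so later steps still run.
failed=0
best_effort() {
    if ! "$@"; then
        echo "cleanup step failed: $*" >&2
        failed=$((failed + 1))
    fi
}
```

With this, each of the two sg_persist unlock commands and the iscsiadm
logout could be prefixed with best_effort, and the script could check
$failed once at the end to decide whether to call fatal.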

>> I'm wondering if there is a reason that the script is currently in the
>> structure that it is - or if it just evolved like this? It may be a good
>> candidate for a complete re-write :\
> 
> TBH, I thought this was one of the most clean and well structured block 
> scripts that Xen has ;).

Please don't scare me ;)

-- 
Steven Haigh

Email: netwiz@xxxxxxxxx
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
