Re: [Xen-devel] Elaboration of "Question about sharing spinlock_t among VMs in Xen"
>> >> *** The question is as follows ***
>> >> Suppose I have two Linux VMs sharing the same spinlock_t lock (through
>> >> the sharing memory) on the same host. Suppose we have one process in
>> >> each VM. Each process uses the linux function spin_lock(&lock) [1] to
>> >> grab & release the lock.
>> >> Will these two processes in the two VMs have race on the shared lock?
>> >
>> > You can't do this: depending on which Linux version you use you will
>> > find that kernel uses ticketlocks or qlocks locks which keep track of
>> > who is holding the lock (obviously this information is internal to VM).
>> > On top of this on Xen we use pvlocks which add another (internal)
>> > control layer.
>>
>> I wanted to see if this can be done with the correct combination of
>> versions and parameters. We are using 4.1.0 for all domains, which
>> still has the CONFIG_PARAVIRT_SPINLOCKS option. I've recompiled the
>> guests with this option set to n, and have also added the boot
>> parameter xen_nopvspin to both domains and dom0 for good measure. A
>> basic ticketlock holds all the information inside the struct itself to
>> order the requests, and I believe this is the version I'm using.
>
> Hm, weird. B/c from arch/x86/include/asm/spinlock_types.h:
>
> #ifdef CONFIG_PARAVIRT_SPINLOCKS
> #define __TICKET_LOCK_INC 2
> #define TICKET_SLOWPATH_FLAG ((__ticket_t)1)
> #else
> #define __TICKET_LOCK_INC 1
> #define TICKET_SLOWPATH_FLAG ((__ticket_t)0)
> #endif
>
> Which means that one of your guests is adding '2' while another is
> adding '1'. Or one of them is putting the 'slowpath' flag
> which means that the paravirt spinlock is enabled.

Interesting. I went back to check one of my guests: the .config from the
source tree I used and the one in /boot/ for the current build both have
the option "not set", which matches make menuconfig, where it shows as
unchecked. So this domain appears to be correctly configured.
The thing is, the other domain is literally a copy of this domain. Either
both are wrong or neither is.

>>
>> Do you think this *should* work? I am still getting a deadlock issue
>> but I do not believe it's due to blocking vcpus, especially after the
>> above changes. Instead, I believe the spinlock struct is getting
>> corrupted. To be more precise, I only have two competing domains as a
>> test, both domUs. I print the raw spinlock struct out when I create it
>> and after a lock/unlock test. I get the following:
>>
>> Init:   [ 00 00 00 00 ]
>> Lock:   [ 00 00 02 00 ]
>> Unlock: [ 02 00 02 00 ]
>> Lock:   [ 02 00 04 00 ]
>> Unlock: [ 04 00 04 00 ]
>>
>> It seems clear from the output and reading I've done that the first 2
>> bytes are the "currently servicing" number and the next two are the
>> "next number to draw" value. With only two guests, one should always
>> be getting serviced while another waits, so I would expect these two
>> halves to stay nearly the same (within one grab actually) and end with
>> both values equal when both are done with their locking/unlocking.
>> Instead, after what seems to be deadlock I destroy the VMs and print
>> the spinlock values and it's this: [ 11 1e 14 1e ]. Note the 11 and 14,
>> should these be an odd number apart? The accesses I see keep them
>> even. Please correct me if I am wrong! Seems practically every time
>> there is this issue, the first pair of bytes are 3 off and the last
>> pair match. Could this have something to do with the issue?
>
> The odd number would suggest that the TICKET_SLOWPATH_FLAG has been set.

It would seem so. Together with the default behavior, where each lock
increments the ticket by two, this suggests paravirt spinlocks are still
in use. Any idea how to turn them off? I would try disabling every
paravirtual option in the configuration, but I still need access to
XenStore and grant pages, which I feel I would lose by doing so.
It's odd that my boot config shows this option as not set, yet the
behavior is as if it were...

>> >>
>> >> My speculation is that it should have the race on the shared lock when
>> >> the spin_lock() function in *two VMs* operate on the same lock.
>> >>
>> >> We did some quick experiment on this and we found one VM sometimes see
>> >> the soft lockup on the lock. But we want to make sure our
>> >> understanding is correct.
>> >>
>> >> We are exploring if we can use the spin_lock to protect the shared
>> >> resources among VMs, instead of using the PV drivers. If the
>> >> spin_lock() in linux can provide the host-wide atomicity (which will
>> >> surprise me, though), that will be great. Otherwise, we probably have
>> >> to expose the spin_lock in Xen to the Linux?
>> >
>> > I'd think this has to be via the hypervisor (or some other third party).
>> > Otherwise what happens if one of the guests dies while holding the lock?
>> > -boris
>>
>> This is a valid point against locking in the guests, but itself won't
>> prevent a spinlock implementation from working! We may move this
>> direction for several reasons but I am interested in why the above is
>> not working when I've disabled the PV part that sleeps vcpus.

Regards,
Dagaen Golomb

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel