[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 11/11] vpci/msix: add MSI-X handlers



>>> On 16.03.18 at 12:03, <roger.pau@xxxxxxxxxx> wrote:
> On Fri, Mar 16, 2018 at 01:36:53AM -0600, Jan Beulich wrote:
>> >>> On 15.03.18 at 17:33, <roger.pau@xxxxxxxxxx> wrote:
>> > On Thu, Mar 15, 2018 at 06:45:58AM -0600, Jan Beulich wrote:
>> >> >>> On 15.03.18 at 13:01, <roger.pau@xxxxxxxxxx> wrote:
>> >> > On Wed, Mar 14, 2018 at 11:04:00AM -0600, Jan Beulich wrote:
>> >> >> >>> On 14.03.18 at 15:04, <roger.pau@xxxxxxxxxx> wrote:
>> >> >> > +            process_pending_softirqs();
>> >> >> 
>> >> >> Careful - is this valid with a spin lock held? Note how e.g.
>> >> >> dump_domains() holds an RCU lock only.
>> >> > 
>> >> > It works ATM, but I guess there could be issues if at some point the
>> >> > softirqs need to use the vpci lock. I will add a pair of unlock/lock
>> >> > around it.
>> >> 
>> >> Provided that is safe.
>> > 
>> > Hm, msix could be freed under our feet, but I don't see any other
>> > obvious solution to this issue ATM. I think as a follow up I should
>> > move the vpci lock outside of the vpci struct.
>> 
>> Well, looking at patch 9 again, it has a similar issue - pdev could
>> disappear under your feet. Resolving that may be even uglier
>> than resolving the issue here, where you could simply record
>> where inside the MSI-X array you've interrupted the dumping,
>> validating - after re-acquiring of the lock - that pdev->vpci still
>> points at the same structure instance (and note that it would be
>> only confusing, but not otherwise harmful if pdev->vpci had
>> changed twice in between, ending up with the same value as
>> before). But the fundamental idea would be the same -
>> remember what pdev you were at, and restart scanning after
>> having dropped whatever lock is necessary, skipping
>> everything up to the point where you find the right pdev.
> 
> Given the current state of the pdev related locking I'm not sure it
> makes much sense to re-start scanning for the device. For example
> pci_get_pdev_by_domain doesn't take any lock while iterating over the
> list of devices. I have plans to improve this, but at the moment the
> pdev locking is all quite ad-hoc, and AFAICT several places have races
> if devices are removed. IIRC we already spoke about this lack of
> locking before.
> 
>> If
>> you don't find it anymore, you should simply indicate so in a
>> log message, making it clear to the one having sent the debug
>> key that they should re-issue it.
> 
> I think a compromise solution ATM (until all the pdev locking is
> improved) is to use:
> 
> if ( !(i % 64) && )
> {
>     spin_unlock(&msix->pdev->vpci->lock);
>     process_pending_softirqs();
>     if ( !msix->pdev->vpci || !spin_lock(&msix->pdev->vpci->lock) )
>         return -EBUSY;
> }
> 
> And change vpci_msix_arch_print from returning void to int.

Along those lines, but not exactly like you say: msix gets set from
pdev->vpci->msix in the caller, hence with vpci gone, msix will be
gone too. Hence you need to latch pdev before dropping the lock,
and start from there after re-acquiring. Perhaps add a comment
clarifying that right now we assume pdev not to go away for an
alive domain.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.