
Re: [PATCH] vpci: introduce per-domain lock to protect vpci structure


  • To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 14 Feb 2022 12:25:39 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "jbeulich@xxxxxxxx" <jbeulich@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>
  • Delivery-date: Mon, 14 Feb 2022 11:26:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Feb 14, 2022 at 11:15:27AM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 14.02.22 13:11, Roger Pau Monné wrote:
> > On Mon, Feb 14, 2022 at 10:53:43AM +0000, Oleksandr Andrushchenko wrote:
> >>
> >> On 14.02.22 12:34, Roger Pau Monné wrote:
> >>> On Mon, Feb 14, 2022 at 09:36:39AM +0000, Oleksandr Andrushchenko wrote:
> >>>> On 11.02.22 13:40, Roger Pau Monné wrote:
> >>>>> +
> >>>>>>>>          for ( i = 0; i < msix->max_entries; i++ )
> >>>>>>>>          {
> >>>>>>>>              const struct vpci_msix_entry *entry = &msix->entries[i];
> >>>>>>> Since this function is now called with the per-domain rwlock read
> >>>>>>> locked it's likely not appropriate to call process_pending_softirqs
> >>>>>>> while holding such lock (check below).
> >>>>>> You are right, as it is possible that:
> >>>>>>
> >>>>>> process_pending_softirqs -> vpci_process_pending -> read_lock
> >>>>>>
> >>>>>> Even more, vpci_process_pending may also
> >>>>>>
> >>>>>> read_unlock -> vpci_remove_device -> write_lock
> >>>>>>
> >>>>>> in its error path. So, at the very least, any invocation of
> >>>>>> process_pending_softirqs must not hold d->vpci_rwlock.
> >>>>>>
> >>>>>> We also need to check that pdev->vpci was not removed in between,
> >>>>>> or *re-created*.
> >>>>>>> We will likely need to re-iterate over the list of pdevs assigned to
> >>>>>>> the domain and assert that the pdev is still assigned to the same
> >>>>>>> domain.
> >>>>>> So, do you mean a pattern like the below should be used at all
> >>>>>> places where we need to call process_pending_softirqs?
> >>>>>>
> >>>>>> read_unlock
> >>>>>> process_pending_softirqs
> >>>>>> read_lock
> >>>>>> pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
> >>>>>> if ( pdev && pdev->vpci && is_the_same_vpci(pdev->vpci) )
> >>>>>> <continue processing>
> >>>>> Something along those lines. You likely need to continue iterate using
> >>>>> for_each_pdev.
> >>>> How do we tell if pdev->vpci is the same? Jan has already brought
> >>>> this question up before [1], and I was about to use some ID for that
> >>>> purpose: pdev->vpci->id = d->vpci_id++, and then we use pdev->vpci->id
> >>>> for the checks.
> >>> Given this is a debug message, I would be OK with just doing the
> >>> minimal checks to prevent Xen from crashing (ie: that pdev->vpci still
> >>> exists and that the MSI entry at which we resume is not past the
> >>> current limit). Otherwise just print a message and move on to the
> >>> next device.
> >> Agreed, I see no big issue (probably) if we are not able to print.
> >>
> >> How about this one:
> >>
> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >> index 809a6b4773e1..50373f04da82 100644
> >> --- a/xen/drivers/vpci/header.c
> >> +++ b/xen/drivers/vpci/header.c
> >> @@ -171,10 +171,31 @@ static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
> >>                                struct rangeset *mem, uint16_t cmd)
> >>    {
> >>        struct map_data data = { .d = d, .map = true };
> >> +    pci_sbdf_t sbdf = pdev->sbdf;
> >>        int rc;
> >>
> >> +    ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
> >> +
> >>        while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == -ERESTART )
> >> +    {
> >> +        /*
> >> +         * process_pending_softirqs may trigger vpci_process_pending which
> >> +         * may need to acquire pdev->domain->vpci_rwlock in read mode.
> >> +         */
> >> +        write_unlock(&pdev->domain->vpci_rwlock);
> >>            process_pending_softirqs();
> >> +        write_lock(&pdev->domain->vpci_rwlock);
> >> +
> >> +        /*
> >> +         * Check if pdev still exists and vPCI was not removed or
> >> +         * re-created.
> >> +         */
> >> +        if ( pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn) != pdev ||
> >> +             vpci is NOT the same )
> >> +        {
> >> +            rc = 0;
> >> +            break;
> >> +        }
> >> +    }
> >> +
> >>        rangeset_destroy(mem);
> >>        if ( !rc )
> >>            modify_decoding(pdev, cmd, false);
> >>
> >> This one also wants process_pending_softirqs to run, so it *might*
> >> want the pdev and vpci checks as well. But at the same time apply_map
> >> runs at ( system_state < SYS_STATE_active ), so defer_map won't be
> >> running yet, and thus vpci_process_pending cannot yet have anything
> >> to do. So, I think we just need:
> >>
> >>           write_unlock(&pdev->domain->vpci_rwlock);
> >>           process_pending_softirqs();
> >>           write_lock(&pdev->domain->vpci_rwlock);
> >>
> >> and this should be enough
> > Given the context apply_map is called from (dom0 specific init code),
> > there's no need to check whether the pdev still exists, or whether vpci
> > has been recreated, as it's not possible. Just add a comment to
> > explicitly note that the context of the function is special, and thus
> > there's no possibility of either the device or vpci going away.
> Does it really need write_unlock/write_lock given the context?...

I think it's bad practice to call process_pending_softirqs while
holding any locks. This is a very specific context, so it's likely fine
not to drop the lock, but it would still seem incorrect to me.
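
Something along these lines would be my preference, with the comment
explaining why dropping the lock is safe in this context (only a rough
sketch against the posted patch, so the lock name d->vpci_rwlock and the
exact placement are assumptions):

static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
                            struct rangeset *mem, uint16_t cmd)
{
    struct map_data data = { .d = d, .map = true };
    int rc;

    ASSERT(rw_is_write_locked(&d->vpci_rwlock));

    while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == -ERESTART )
    {
        /*
         * apply_map is only used while building dom0
         * (system_state < SYS_STATE_active), so the device cannot be
         * removed and vpci cannot be re-created while the lock is
         * dropped.  The lock is only released so that softirq
         * processing doesn't run with the per-domain lock held.
         */
        write_unlock(&d->vpci_rwlock);
        process_pending_softirqs();
        write_lock(&d->vpci_rwlock);
    }

    rangeset_destroy(mem);
    if ( !rc )
        modify_decoding(pdev, cmd, false);

    return rc;
}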

> I think it doesn't, as there is no chance defer_map is called, and thus
> no process_pending_softirqs -> vpci_process_pending -> read_lock

Indeed, there's no chance of that because process_pending_softirqs
will never try to do a scheduling operation that would result in our
context being scheduled out.
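
For the paths where the device can actually go away (like the MSI dump
discussed above) the minimal pattern I had in mind is sketched below.
Again this is just a sketch: the lock name and the pci_get_pdev_by_domain
usage follow the patch and the earlier snippet, and it doesn't try to
detect a re-created vpci (that would need something like the ID scheme
mentioned above).

    /*
     * Somewhere inside the loop over the domain's devices, with the
     * per-domain lock held in read mode and sbdf previously saved from
     * pdev->sbdf.
     */
    read_unlock(&d->vpci_rwlock);
    process_pending_softirqs();
    read_lock(&d->vpci_rwlock);

    /*
     * pdev (and pdev->vpci) may have been removed while the lock was
     * dropped.  Only dereference pdev again if the re-lookup returns
     * the very same device; otherwise bail out, as the stale pointer
     * cannot even be used to safely continue the list iteration.
     */
    if ( pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn) != pdev ||
         !pdev->vpci )
        break;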

Thanks, Roger.



 

