
Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal


  • To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 18 Nov 2021 15:04:03 +0100
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "george.dunlap@xxxxxxxxxx" <george.dunlap@xxxxxxxxxx>, "paul@xxxxxxx" <paul@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 18 Nov 2021 14:04:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Sorry, I've been quite busy with other stuff.

On Thu, Nov 18, 2021 at 01:48:06PM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 18.11.21 15:25, Jan Beulich wrote:
> > On 18.11.2021 10:32, Oleksandr Andrushchenko wrote:
> >>
> >> On 18.11.21 11:15, Jan Beulich wrote:
> >>> On 18.11.2021 09:54, Oleksandr Andrushchenko wrote:
> >>>> On 18.11.21 10:36, Jan Beulich wrote:
> >>>>> On 18.11.2021 08:49, Oleksandr Andrushchenko wrote:
> >>>>>> On 17.11.21 10:28, Jan Beulich wrote:
> >>>>>>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
> >>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> >>>>>>>>
> >>>>>>>> When a vPCI is removed for a PCI device it is possible that we have
> >>>>>>>> scheduled a delayed work for map/unmap operations for that device.
> >>>>>>>> For example, the following scenario can illustrate the problem:
> >>>>>>>>
> >>>>>>>> pci_physdev_op
> >>>>>>>>        pci_add_device
> >>>>>>>>            init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
> >>>>>>>>        iommu_add_device <- FAILS
> >>>>>>>>        vpci_remove_device -> xfree(pdev->vpci)
> >>>>>>>>
> >>>>>>>> leave_hypervisor_to_guest
> >>>>>>>>        vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
> >>>>>>>>
> >>>>>>>> For the hardware domain we continue execution, as the worst that
> >>>>>>>> could happen is that MMIO mappings are left in place when the
> >>>>>>>> device has been deassigned.
> >>>>>>>>
> >>>>>>>> For unprivileged domains that get a failure in the middle of a vPCI
> >>>>>>>> {un}map operation we need to destroy them, as we don't know in which
> >>>>>>>> state the p2m is. This can only happen in vpci_process_pending for
> >>>>>>>> DomUs as they won't be allowed to call pci_add_device.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> >>>>>>> Thinking about it some more, I'm not convinced any of this is really
> >>>>>>> needed in the presented form.
> >>>>>> The intention of this patch was to handle error conditions which are
> >>>>>> abnormal, e.g. when iommu_add_device fails and we are in the middle
> >>>>>> of initialization. So, I am trying to cancel any pending work that
> >>>>>> might already be there instead of crashing.
> >>>>> Only Dom0 may be able to prematurely access the device during "add".
> >>>>> Yet unlike for DomU-s we generally expect Dom0 to be well-behaved.
> >>>>> Hence I'm not sure I see the need for dealing with these.
> >>>> Probably I don't follow you here. The issue I am facing is Dom0
> >>>> related, e.g. Xen was not able to initialize the device during "add"
> >>>> and thus wanted to clean up the leftovers. As a result the already
> >>>> scheduled work crashes, as it was neither canceled nor interrupted
> >>>> in some safe manner. So, this sounds like something we need to take
> >>>> care of, thus this patch.
> >>> But my point was the question of why there would be any pending work
> >>> in the first place in this case, when we expect Dom0 to be well-behaved.
> >> I am not saying Dom0 misbehaves here. This is my real use-case
> >> (as in the commit message):
> >>
> >> pci_physdev_op
> >>        pci_add_device
> >>            init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
> >>        iommu_add_device <- FAILS
> >>        vpci_remove_device -> xfree(pdev->vpci)
> >>
> >> leave_hypervisor_to_guest
> >>        vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
> > First of all I'm sorry for having lost track of that particular case in
> > the course of the discussion.
> No problem, I see on the ML how much you review every day...
> >
> > I wonder though whether that's something we really need to take care of.
> > At boot (on x86) modify_bars() wouldn't call defer_map() anyway, but
> > use apply_map() instead. I wonder whether this wouldn't be appropriate
> > generally in the context of init_bars() when used for Dom0 (not sure
> > whether init_bars() would find some form of use for DomU-s as well).
> > This is even more so as it would better be the exception that devices
> > discovered post-boot start out with memory decoding enabled (which is a
> > prereq for modify_bars() to be called in the first place).
> Well, first of all init_bars is going to be used for DomUs as well:
> that was discussed previously and is reflected in this series.
> 
> But the real use-case for the deferred mapping would be the one
> from PCI_COMMAND register write emulation:
> 
> void vpci_cmd_write(const struct pci_dev *pdev, unsigned int reg,
>                      uint32_t cmd, void *data)
> {
> [snip]
>          modify_bars(pdev, cmd, false);
> 
> which in turn calls defer_map.
> 
> I believe Roger deferred that for a good reason: not to stall Xen while
> doing the map/unmap (which may take quite some time), so the work was
> moved to vpci_process_pending and SOFTIRQ context.

Indeed. In the physdevop failure case this comes from a hypercall
context, so maybe you could do the mapping in place using hypercall
continuations if required. Not sure how complex that would be,
compared to just deferring to the guest entry point and then dealing
with the possible cleanup on failure.
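As a rough illustration of the continuation option (a sketch, not code
from this series): hypercall_preempt_check(), hypercall_create_continuation(),
__HYPERVISOR_physdev_op and PHYSDEVOP_pci_device_add are existing Xen
primitives, while map_next_chunk() and its -ERESTART protocol are
hypothetical stand-ins for mapping one bounded slice of the deferred BAR
ranges per pass:

/*
 * Hypothetical sketch only: map the deferred BAR ranges chunk by chunk
 * from hypercall context, creating a continuation when preempted so the
 * hypercall is re-executed from here instead of deferring the work to
 * the guest entry path.
 */
static long pci_device_add_and_map(struct pci_dev *pdev,
                                   XEN_GUEST_HANDLE_PARAM(void) arg)
{
    long rc;

    do {
        /* Map one bounded slice of the pending ranges (hypothetical). */
        rc = map_next_chunk(pdev);

        if ( rc == -ERESTART && hypercall_preempt_check() )
            /* More work remains and this pass has run long enough. */
            return hypercall_create_continuation(__HYPERVISOR_physdev_op,
                                                 "ih",
                                                 PHYSDEVOP_pci_device_add,
                                                 arg);
    } while ( rc == -ERESTART );

    return rc;
}

The attraction is that a failure is then reported synchronously to the
caller of PHYSDEVOP_pci_device_add; the cost is threading the per-chunk
state through the physdevop path.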

Thanks, Roger.
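
For comparison, a sketch of the other side of that trade-off, i.e. the
guard on the deferred path that the discussed patch implies. The fields
v->vpci.mem and v->vpci.pdev are the ones named in the thread;
rangeset_destroy(), is_hardware_domain() and domain_crash() are existing
Xen helpers; the body is illustrative only, not the actual patch:

/*
 * Illustrative sketch only: if vpci_remove_device() freed pdev->vpci
 * while a map/unmap was still deferred, cancel the pending work rather
 * than dereference freed state.
 */
bool vpci_process_pending(struct vcpu *v)
{
    if ( v->vpci.mem )
    {
        if ( !v->vpci.pdev->vpci )
        {
            /* vPCI was torn down mid-operation: drop the deferred work. */
            rangeset_destroy(v->vpci.mem);
            v->vpci.mem = NULL;

            /*
             * A DomU's p2m is in an unknown state at this point, so per
             * the commit message such a domain is destroyed; Dom0 is
             * allowed to continue.
             */
            if ( !is_hardware_domain(v->domain) )
                domain_crash(v->domain);

            return false;
        }

        /* ... normal deferred map/unmap processing would follow ... */
    }

    return false;
}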



 

