
Re: vpci: Need for vpci_cancel_pending


  • To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 28 Oct 2021 12:17:06 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 28 Oct 2021 10:17:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Oct 28, 2021 at 10:04:20AM +0000, Oleksandr Andrushchenko wrote:
> Hi, all!
> 
> While working on PCI passthrough on Arm I ran into a crash
> with the following call chain:
> 
> pci_physdev_op
>    pci_add_device
>        init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
>    iommu_add_device <- FAILS
>    vpci_remove_device -> xfree(pdev->vpci)
> 
> Then:
> leave_hypervisor_to_guest
>    vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
> 
> Which results in the crash below:
> 
> (XEN) Data Abort Trap. Syndrome=0x6
> (XEN) Walking Hypervisor VA 0x10 on CPU0 via TTBR 0x00000000481dd000
> (XEN) 0TH[0x0] = 0x00000000481dcf7f
> (XEN) 1ST[0x0] = 0x00000000481d9f7f
> (XEN) 2ND[0x0] = 0x0000000000000000
> (XEN) CPU0: Unexpected Trap: Data Abort
> ...
> (XEN) Xen call trace:
> (XEN)    [<00000000002246d8>] _spin_lock+0x40/0xa4 (PC)
> (XEN)    [<00000000002246c0>] _spin_lock+0x28/0xa4 (LR)
> (XEN)    [<000000000024f6d0>] vpci_process_pending+0x78/0x128
> (XEN)    [<000000000027f7e8>] leave_hypervisor_to_guest+0x50/0xcc
> (XEN)    [<0000000000269c5c>] entry.o#guest_sync_slowpath+0xa8/0xd4
> 
> So, it seems that if pci_add_device fails and calls vpci_remove_device,
> the latter needs to cancel any pending work.

Indeed. You will need to check that v->vpci.pdev == pdev before
canceling the pending work though, or else you could cancel pending
work that belongs to a different device.

> If this is a map operation it seems to be straightforward: destroy
> the range set and do not map anything.
> 
> If vpci_remove_device is called and an unmap operation was scheduled,
> then either:
> - the guest is being destroyed for some reason, and skipping the unmap
>    is ok as all the mappings for the whole domain will be destroyed
>    anyway
> - the guest is going to stay alive, and then the unmap must be done
> 
> I would like to hear your thoughts on what the right approach
> would be to solve this issue.

For the hardware domain it's likely better to do nothing and just try
to continue execution. The worst that could happen is that MMIO mappings
are left in place when the device has been deassigned.

For unprivileged domains that get a failure in the middle of a vPCI
{un}map operation we need to destroy them, as we don't know what
state the p2m is in. This can only happen in vpci_process_pending for
domUs I think, as they won't be allowed to call pci_add_device. Please
see the FIXME in vpci_process_pending related to this topic.
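
FWIW, a rough sketch of how the failure path could tell the two cases
apart; this is just my reading of vpci_process_pending, and the exact
surrounding context is assumed:

/* Sketch of the error handling at the end of vpci_process_pending. */
if ( rc )
{
    /* The pending rangeset is of no further use in either case. */
    rangeset_destroy(v->vpci.mem);
    v->vpci.mem = NULL;

    if ( is_hardware_domain(v->domain) )
        /*
         * Leftover mappings are tolerable for the hardware domain:
         * warn and continue execution.
         */
        printk(XENLOG_WARNING "%pd: leftover vPCI BAR mappings\n",
               v->domain);
    else
        /*
         * For a domU the p2m is in an unknown state, so the only
         * safe option is to kill the domain.
         */
        domain_crash(v->domain);
}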

Regards, Roger.



 

