
Re: Purpose of translate MSI interrupt into INTx for guest passthrough

  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 13 Jan 2021 21:00:47 +0000
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 13 Jan 2021 21:02:00 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 12/01/2021 15:51, Roger Pau Monné wrote:
> On Tue, Jan 12, 2021 at 09:48:17AM -0500, Jason Andryuk wrote:
>> On Tue, Jan 12, 2021 at 9:25 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>>> Dropping Qing He as this address bounces.
>>> On Tue, Jan 12, 2021 at 03:10:57PM +0100, Roger Pau Monné wrote:
>>>> Hello,
>>>> While trying to do some cleanup of the Xen interrupt support for PCI
>>>> passthrough I came across the MSI to INTx translation that Xen is in
>>>> theory capable of performing (ie: use a physical MSI interrupt source
>>>> and inject that as an INTx to a guest).
>>>> AFAICT such functionality is not wired up to the toolstack, so it's
>>>> hard to tell what's the intended purpose, or whether it has seen any
>>>> usage.
>>> So apparently it is wired up to the toolstack for qemu-traditional,
>>> albeit it's disabled by default. There's some documentation in
>>> xl-pci-configuration:
>>> "When enabled, MSI-INTx translation will always enable MSI on the PCI
>>> device regardless of whether the guest uses INTx or MSI."
>>> So the main purpose seems to be to always use the MSI interrupt source
>>> regardless of whether the guest is using INTx or MSI. Maybe the
>>> purpose was to work around some bugs when using INTx? Or buggy devices
>>> with INTx interrupts?
>>> qemu-upstream doesn't seem to support it anymore, so I would still
>>> like to remove it if we get consensus.
>> The cover letter from
>> http://old-list-archives.xenproject.org/archives/html/xen-devel/2009-01/msg00228.html
>> """
>> This patchset enables MSI-INTx interrupt translation for HVM
>> domains. The intention of the patch is to use MSI as the physical
>> interrupt mechanism for passthrough device as much as possible,
>> thus reducing the pirq sharing among domains.
>> When MSI is globally enabled, if the device has the MSI capability
>> but it isn't used by the guest, the hypervisor will try to use MSI as
>> the underlying pirq and inject translated INTx irq to guest
>> vioapic. When guest itself enabled MSI or MSI-X, the translation
>> is automatically turned off.
>> Add a config file option to disable/enable this feature. Also, in
>> order to allow the user to override the option per device, a
>> per-device option mechanism is implemented and an MSI-INTx option
>> is added
>> """
>> It seems like it could be a good idea, but I don't know if it presents
>> compatibility issues when actually used.
> Hm, MSI interrupts are edge triggered, while INTx is (usually) level.
> Also devices capable of multiple MSI vectors will be limited to a
> single one, and I'm not sure whether the transition from translated
> MSI to INTx into multiple MSIs would work correctly, as it seems tricky.
>> As you say, it's not supported by qemu-upstream, so maybe it should
>> just be dropped.
> I don't really see much value in forcing Xen to always use MSI
> regardless of whether the guest is using INTx or MSI, and it's likely
> to cause more issues than benefits.
> IMO we should get rid of this, as the only real value is
> having Xen use MSI instead of INTx, but it's not introducing any kind
> of functionality from a guest PoV.

I find this feature very dubious.

While I agree that reducing INTx sharing between domains is obviously a
good thing, I don't see how the device can possibly be expected to work
if the in-guest driver doesn't have an accurate idea of what's going on.

There are up to 4 INTx lines, but absolutely nothing to suggest that
these would logically map to the first 4 MSIs enabled on the device. 
Even in the simplified case of only INTA and a single MSI, there's
nothing to suggest that the device will behave in the same way when it
comes to generating interrupts.

The difference between edge and level interrupts forces the guest driver
to explicitly de-assert what it thinks is a level interrupt in a
device-specific manner, which in turn risks confusing the device, which
is configured in MSI mode.

Also, it means Xen's emulation of edge=>level semantics has no clue when
to correctly deassert the line, without having device-specific knowledge
in the hypervisor, and appropriate trap&emulate intercepts on whatever
the device's deassert mechanism is.
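
To make the deassert problem concrete, here is a toy model (a sketch only,
not Xen code). ToyDevice, the two policy names and run() are all made up for
illustration. The assumption is that the hypervisor only sees MSI edges and
the guest's EOI, never the device-specific acknowledge.

```python
from dataclasses import dataclass


@dataclass
class ToyDevice:
    """Hypothetical device; 'pending' counts events the driver has not acknowledged."""
    pending: int = 0

    def raise_event(self):
        # In MSI mode each event would also send one edge message.
        self.pending += 1

    def driver_ack(self):
        # Device-specific acknowledge, invisible to the hypervisor.
        self.pending = max(0, self.pending - 1)

    def true_level(self):
        # What a real level-triggered INTx line would show.
        return self.pending > 0


def emulated_level_after_eoi(policy):
    """State of the emulated line once the guest has EOI'd, for a given guess."""
    if policy == "deassert_on_eoi":
        return False  # guess: EOI means the device is done
    if policy == "hold":
        return True   # guess: keep the line up until told otherwise
    raise ValueError(policy)


def run(policy, events, acks):
    """Return (emulated line, true line) after one handler pass and EOI."""
    dev = ToyDevice()
    for _ in range(events):
        dev.raise_event()
    for _ in range(acks):
        dev.driver_ack()
    return emulated_level_after_eoi(policy), dev.true_level()


# Two events, one acknowledged: a real line stays high, the emulated one drops.
print(run("deassert_on_eoi", events=2, acks=1))
# One event, acknowledged: a real line drops, the emulated one stays high.
print(run("hold", events=1, acks=1))
```

Either guess diverges from what a real line would do in at least one case,
which is the point: without knowing the device's acknowledge mechanism there
is no generally correct moment to lower the line.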

I don't see how a feature like this can ever have worked, other than by
sheer luck.



