
Re: Design session "MSI-X support with Linux stubdomain" notes


  • To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 29 Sep 2022 13:44:28 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, George Dunlap <george.dunlap@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>
  • Delivery-date: Thu, 29 Sep 2022 11:44:41 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.09.2022 12:57, Marek Marczykowski-Górecki wrote:
> On Mon, Sep 26, 2022 at 02:47:55PM +0200, Jan Beulich wrote:
>> On 26.09.2022 14:43, Marek Marczykowski-Górecki wrote:
>>> On Thu, Sep 22, 2022 at 08:00:00PM +0200, Jan Beulich wrote:
>>>> On 22.09.2022 18:05, Anthony PERARD wrote:
>>>>> WARNING: Notes missing at the beginning of the meeting.
>>>>>
>>>>> session description:
>>>>>> Currently an HVM with PCI passthrough and a Qemu Linux stubdomain doesn't
>>>>>> support MSI-X. For the device to (partially) work, Qemu needs a patch
>>>>>> masking MSI-X from the PCI config space. Some drivers are not happy about
>>>>>> that, which is understandable (the device natively supports MSI-X, so the
>>>>>> fallback paths are rarely tested).
>>>>>>
>>>>>> This is mostly (?) about qemu accessing /dev/mem directly (here:
>>>>>> https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_msi.c#L579) - let's
>>>>>> discuss an alternative interface that the stubdomain could use.
>>>>>
>>>>> when qemu forwards an interrupt,
>>>>>     to get the correct mask bit, it reads the physical mask bit.
>>>>>     a hypercall would make sense.
>>>>>     -> benefit: the mask bit in hardware will be what both the
>>>>> hypervisor and the device model desire.
>>>>>     from the guest's point of view, the interrupt should be unmasked.
>>>>>
>>>>> interrupt requests are first forwarded to qemu, so xen has to do some
>>>>> post-processing once the request comes back from qemu.
>>>>>     it's weird..
>>>>>
>>>>> someone should have a look and rationalize this weird path.
>>>>>
>>>>> Xen tries to not forward everything to qemu.
>>>>>
>>>>> why don't we do that in xen?
>>>>>     there's already code in xen for that.
>>>>
>>>> So what I didn't pay enough attention to when talking was that the
>>>> completion logic in Xen is for writes only. Maybe something similar
>>>> can be had for reads as well, but that's to be checked ...
>>>
>>> I spent some time trying to follow that part of qemu, and I think it
>>> reads vector control only on the write path, to keep some bits
>>> unchanged and also to detect whether Xen masked it behind qemu's back.
>>> My understanding is that since 484d7c852e "x86/MSI-X: track host and guest
>>> mask-all requests separately" this is unnecessary, because Xen will
>>> remember the guest's intention, so qemu can simply use its own internal
>>> state and act on that (guest writes go through qemu, so it should
>>> have an up-to-date view from the guest's point of view).
>>>
>>> As for PBA access, it is read by qemu only to pass it on to the guest. I'm
>>> not sure whether qemu should use a hypercall to retrieve it, or whether
>>> Xen should fix up the value itself on the read path.
>>
>> Forwarding the access to qemu just for qemu to use a hypercall to obtain
>> the value needed seems backwards to me. If we need new code in Xen, we
>> can just as well handle the read directly, I think, without involving qemu.
> 
> I'm not sure if I fully follow what qemu does here, but I think the
> reason for such handling is that the PBA can (and often does) live on the
> same page as the actual MSI-X table. I'm trying to adjust qemu to not
> intercept this read, but at this point I'm not yet sure if that's even
> possible at sub-page granularity.
> 
> But, to go forward with PoC/debugging, I hardwired the PBA read to
> 0xFFFFFFFF, and it seems it doesn't work. My observation is that the
> handler in the Linux driver isn't called. There are several moving
> parts (it could very well be a bug in the driver, or some other part of
> the VM). Is there some place in Xen where I can see whether an interrupt
> gets delivered to the guest (some function I can add a debug print to),
> or is it delivered directly to the guest?

I guess "iommu=no-intpost" would suppress "direct" delivery (if the hardware
is capable of that in the first place). And wait - this option actually
defaults to off.

As to software delivery - I guess you would want to start from
do_IRQ_guest() and then see where things get lost. (Adding logging to
such a path of course carries a fair risk of ending up overly chatty.)

Jan



 

