
Re: Design session "MSI-X support with Linux stubdomain" notes


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 29 Sep 2022 13:52:14 +0200
  • Cc: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, George Dunlap <george.dunlap@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>
  • Delivery-date: Thu, 29 Sep 2022 11:52:26 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Sep 29, 2022 at 01:44:28PM +0200, Jan Beulich wrote:
> On 29.09.2022 12:57, Marek Marczykowski-Górecki wrote:
> > On Mon, Sep 26, 2022 at 02:47:55PM +0200, Jan Beulich wrote:
> >> On 26.09.2022 14:43, Marek Marczykowski-Górecki wrote:
> >>> On Thu, Sep 22, 2022 at 08:00:00PM +0200, Jan Beulich wrote:
> >>>> On 22.09.2022 18:05, Anthony PERARD wrote:
> >>>>> WARNING: Notes missing at the beginning of the meeting.
> >>>>>
> >>>>> session description:
> >>>>>> Currently an HVM with PCI passthrough and a Qemu Linux stubdomain
> >>>>>> doesn't support MSI-X. For the device to (partially) work, Qemu
> >>>>>> needs a patch masking MSI-X from the PCI config space. Some drivers
> >>>>>> are not happy about that, which is understandable (the device
> >>>>>> natively supports MSI-X, so fallback paths are rarely tested).
> >>>>>>
> >>>>>> This is mostly (?) about qemu accessing /dev/mem directly (here:
> >>>>>> https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_msi.c#L579) -
> >>>>>> let's discuss an alternative interface that the stubdomain could use.
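For context, the /dev/mem access being referred to boils down to the
pattern below. This is a simplified illustration, not qemu's actual
code; read_pba_dword() and pba_phys_addr are placeholder names:

/* Simplified illustration of the /dev/mem access pattern, not qemu's
 * actual code: map the page containing the PBA and read a dword from
 * it.  pba_phys_addr is a placeholder for the address the device
 * model computes from the device's MSI-X capability. */
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static uint32_t read_pba_dword(uint64_t pba_phys_addr)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    uint64_t page_base = pba_phys_addr & ~((uint64_t)pagesize - 1);
    int fd = open("/dev/mem", O_RDONLY);
    void *map;
    uint32_t val;

    if (fd < 0)
        return ~0u;

    map = mmap(NULL, pagesize, PROT_READ, MAP_SHARED, fd,
               (off_t)page_base);
    close(fd);
    if (map == MAP_FAILED)
        return ~0u;

    val = *(volatile uint32_t *)((char *)map +
                                 (pba_phys_addr - page_base));
    munmap(map, pagesize);
    return val;
}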
> >>>>>
> >>>>>
> >>>>>
> >>>>> when qemu forwards an interrupt,
> >>>>>     to get the correct mask bit, it reads the physical mask bit.
> >>>>>     a hypercall would make sense (see the sketch after these notes).
> >>>>>     -> benefit: the mask bit in hardware will be what both the
> >>>>>        hypervisor and the device model desire.
> >>>>>     from the guest's point of view, the interrupt should be unmasked.
> >>>>>
> >>>>> interrupt requests are first forwarded to qemu, so Xen has to do some
> >>>>> post-processing once the request comes back from qemu.
> >>>>>     it's weird...
> >>>>>
> >>>>> someone should have a look and rationalize this weird path.
> >>>>>
> >>>>> Xen tries not to forward everything to qemu.
> >>>>>
> >>>>> why don't we do that in Xen?
> >>>>>     there's already code in Xen for that.
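To make the hypercall idea from the notes concrete, here is the rough
shape such an interface could take. This is purely hypothetical - no
such hypercall or structure exists today - but it shows what the
stubdomain would ask Xen instead of reading hardware state through
/dev/mem:

/* HYPOTHETICAL sketch only -- no such hypercall or structure exists.
 * The device model identifies the MSI-X entry and Xen returns the
 * mask bit as actually programmed in hardware. */
#include <stdint.h>

struct physdev_msix_entry_status {
    /* IN */
    uint16_t seg;      /* PCI segment of the passed-through device */
    uint8_t  bus;
    uint8_t  devfn;
    uint16_t entry;    /* index into the device's MSI-X table */
    /* OUT */
    uint16_t masked;   /* physical mask bit, as Xen programmed it */
};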
> >>>>
> >>>> So what I didn't pay enough attention to when talking was that the
> >>>> completion logic in Xen is for writes only. Maybe something similar
> >>>> can be had for reads as well, but that's to be checked ...
> >>>
> >>> I spent some time trying to follow that part of qemu, and I think it
> >>> reads vector control only on the write path, to keep some bits
> >>> unchanged and to detect whether Xen masked it behind qemu's back.
> >>> My understanding is that since 484d7c852e ("x86/MSI-X: track host and
> >>> guest mask-all requests separately") this is unnecessary, because Xen
> >>> will remember the guest's intention, so qemu can simply use its own
> >>> internal state and act on that (guest writes go through qemu, so it
> >>> should have an up-to-date view of the guest's state).
> >>>
> >>> As for PBA access, it is read by qemu only to pass it to the guest. I'm
> >>> not sure whether qemu should use a hypercall to retrieve it, or whether
> >>> Xen should fix up the value itself on the read path.
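In other words, the device model could serve vector control reads from
its own cache. A minimal sketch of that idea, with assumed structure
and function names rather than qemu's actual ones:

/* Sketch of serving vector control from cached state; the structure
 * and function names are assumptions, not qemu's actual ones. */
#include <stdint.h>

struct msix_entry_state {
    uint32_t guest_vec_ctrl;  /* last vector control the guest wrote */
};

/* Guest write path: remember the guest's intention, then let the
 * usual Xen machinery apply the mask bit to hardware. */
static void guest_write_vec_ctrl(struct msix_entry_state *e, uint32_t val)
{
    e->guest_vec_ctrl = val;
    /* ... forward the change through the existing Xen path ... */
}

/* Guest read path: no need to read the physical vector control,
 * since Xen tracks host and guest mask-all requests separately. */
static uint32_t guest_read_vec_ctrl(const struct msix_entry_state *e)
{
    return e->guest_vec_ctrl;
}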
> >>
> >> Forwarding the access to qemu just for qemu to use a hypercall to obtain
> >> the value needed seems backwards to me. If we need new code in Xen, we
> >> might as well handle the read directly, I think, without involving qemu.
> > 
> > I'm not sure if I fully follow what qemu does here, but I think the
> > reason for such handling is that the PBA can (and often does) live on the
> > same page as the actual MSI-X table. I'm trying to adjust qemu to not
> > intercept this read, but at this point I'm not yet sure whether that's
> > even possible at sub-page granularity.
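As background on that layout issue: whether the PBA shares a page with
the table follows from the MSI-X capability's Table Offset and PBA
Offset fields. A small sketch, assuming both regions live in the same
BAR and the BIR bits are already masked off:

/* Sketch: does the PBA share a page with the MSI-X table?  Computed
 * from the capability's Table Offset and PBA Offset (BIR bits masked
 * off), assuming both regions live in the same BAR. */
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define MSIX_ENTRY_SIZE 16u   /* each MSI-X table entry is 16 bytes */

static bool pba_shares_page(uint32_t table_off, uint16_t nr_entries,
                            uint32_t pba_off)
{
    uint32_t table_end = table_off + nr_entries * MSIX_ENTRY_SIZE;

    return ((table_end - 1) / PAGE_SIZE) == (pba_off / PAGE_SIZE);
}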
> > 
> > But, to move forward with the PoC/debugging, I hardwired the PBA read to
> > 0xFFFFFFFF, and it seems it doesn't work. My observation is that the
> > handler in the Linux driver isn't called. There are several moving parts
> > (it could very well be a bug in the driver, or in some other part of the
> > VM). Is there some place in Xen where I can see whether an interrupt gets
> > delivered to the guest (some function I can add a debug print to), or is
> > it delivered directly to the guest?
> 
> I guess "iommu=no-intpost" would suppress "direct" delivery (if hardware
> is capable of that in the first place). And wait - this option actually
> defaults to off.
> 
> As to software delivery - I guess you would want to start from
> do_IRQ_guest() and then see where things get lost. (Adding logging to
> such a path of course has a fair risk of ending up overly chatty.)

Having dealt with interrupt issues before, I'd suggest limiting logging
to only the IRQ you are interested in - something like the sketch below.
Using xentrace might be a better option depending on what you need to
debug, albeit it's kind of a pain to add new trace points, as you also
need to modify xenalyze to print them.
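
A guard like the following keeps a debug printk quiet for everything
except the vector under investigation (sketch only: DEBUG_IRQ and the
helper name are placeholders, and printk is Xen's, so this fragment
only builds inside the hypervisor tree):

/* Sketch only: a helper to drop next to do_IRQ_guest() (or wherever
 * in the delivery path), logging a single IRQ.  DEBUG_IRQ and the
 * helper name are placeholders. */
#define DEBUG_IRQ 42   /* the IRQ you are chasing */

static inline void debug_trace_irq(int irq, const char *where)
{
    if ( irq == DEBUG_IRQ )
        printk("%s: irq %d\n", where, irq);
}

/* Usage: debug_trace_irq(irq, __func__); at each point of interest. */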

Roger.
