Re: [Xen-devel] [PATCH RFC] x86/vMSI-X: avoid missing first unmask of vectors
>>> On 01.04.16 at 12:56, <Paul.Durrant@xxxxxxxxxx> wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: 01 April 2016 10:59
>> To: xen-devel
>> Cc: Andrew Cooper; Anthony Perard; Paul Durrant; Stefano Stabellini; Keir
>> (Xen.org)
>> Subject: Re: [PATCH RFC] x86/vMSI-X: avoid missing first unmask of vectors
>>
>> >>> On 01.04.16 at 11:15, <JBeulich@xxxxxxxx> wrote:
>> > Recent changes to Linux result in there just being a single unmask
>> > operation prior to expecting the first interrupts to arrive. However,
>> > we've had a chicken-and-egg problem here: Qemu invokes
>> > xc_domain_update_msi_irq(), ultimately leading to
>> > msixtbl_pt_register(), upon seeing that first unmask operation. Yet
>> > for msixtbl_range() to return true (in order for msixtbl_write() to get
>> > invoked at all) msixtbl_pt_register() must have completed.
>> >
>> > Deal with this by snooping suitable writes in msixtbl_range() and
>> > triggering the invocation of msix_write_completion() from
>> > msixtbl_pt_register() when that happens in the context of a still in
>> > progress vector control field write.
>> >
>> > Note that the seemingly unrelated deletion of the redundant
>> > irq_desc->msi_desc checks in msixtbl_pt_register() is to make clear to
>> > any compiler version used that the "msi_desc" local variable isn't
>> > being used uninitialized. (Doing the same in msixtbl_pt_unregister() is
>> > just for consistency reasons.)
>> >
>> > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> > ---
>> > TODO: How to deal with REP MOVS to the MSI-X table (in msixtbl_range())?
>>
>> Some more detail on the thoughts I so far had for this aspect:
>> It has always been puzzling me that the hypervisor doesn't
>> see _all_ the MSI-X table accesses (which is a result of the
>> addresses only getting registered via XEN_DOMCTL_bind_pt_irq);
>> it's quite natural that this is an at least latent issue possibly
>> causing guest misbehavior. I cannot, however, currently see any
>> way to address this without altering both Xen and qemu, since for
>> Xen to see all accesses it would need to become aware of the
>> GPA of the MSI-X table much earlier (read: before the domain
>> actually starts, or at the latest when the domain first enables
>> memory decoding on the device).
>>
>> The mapping of the MMIO BARs of the device into guest memory,
>> however, intentionally excludes the page(s) covering the MSI-X
>> table, so the hypervisor can't become aware of them by just
>> looking at data it gets presented today. Hence either we need to
>> add some new hypercall for qemu to invoke, or we need to make
>> qemu map the full BAR ranges, filtering out MSI-X table pages
>> in the hypervisor (using those mapping requests just to learn the
>> GPA of the MSI-X table, without entering them into P2M).
>>
>> Unless someone can think of a way which doesn't require altering
>> both qemu and Xen (creating the well known compatibility issues
>> between unmatched pairs), I think the patch as presented should
>> be okay without handling this case, i.e. best possible effort, and a
>> subsequent change then ought to be to deal with this by changing
>> both components. In which case I'd suggest that the change here
>> go into 4.7, but the full fix would then likely need deferring until
>> 4.8.
>
> I guess it could be handled entirely in Xen if we are willing to snoop on
> PCI configuration. It would not be too hard to snoop guest writes to the BARs
> in config space so that Xen can keep track of where they are. Snooping on the
> MSI-X capability could then tell Xen when to start interposing on the table,
> and allow it to discover the GPA at that point (via the BIR and offset
> values).

Well, that's a possibility, but won't - afaict - work without qemu's
help at another point: So far we don't know the guest's PCI bus
topology, hence we can't correlate vBAR writes we might snoop with
the physical devices they correspond to.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
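
For readers following the "BIR and offset" remark in the thread, here is a minimal sketch of how the MSI-X table GPA could be derived from snooped config space values, and how a write that unmasks a vector could be recognised. It is not taken from the patch or from Xen: struct snooped_msix and all three function names are hypothetical, and only the register layout (BIR in the low 3 bits of the Table Offset/BIR register, table size in Message Control bits 10:0, 16-byte entries with the vector control dword at offset 12 and the mask in bit 0) comes from the PCI MSI-X capability definition. It also assumes the snooped BAR values have already been matched to the right device, which is exactly the part the reply above says cannot be done without qemu's help.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout constants from the PCI MSI-X capability definition. */
#define MSIX_TABLE_BIR_MASK     0x7u    /* Table Offset/BIR bits 2:0: BAR index */
#define MSIX_CTRL_TABLE_SIZE    0x07ffu /* Message Control bits 10:0: entries - 1 */
#define MSIX_ENTRY_SIZE         16u     /* bytes per table entry */
#define MSIX_VECTOR_CTRL_OFFSET 12u     /* vector control dword within an entry */
#define MSIX_VECTOR_CTRL_MASK   0x1u    /* vector control bit 0: per-vector mask */

/* Hypothetical record of what config space snooping would have collected. */
struct snooped_msix {
    uint64_t bar_gpa[6];  /* guest-physical BAR bases, flag bits masked off */
    uint32_t table_reg;   /* Table Offset/BIR register (capability offset 4) */
    uint16_t msg_ctrl;    /* Message Control register (capability offset 2) */
};

/* Compute the GPA of the MSI-X table; fails for the reserved BIR values 6 and 7. */
static bool msix_table_gpa(const struct snooped_msix *s, uint64_t *gpa)
{
    uint32_t bir = s->table_reg & MSIX_TABLE_BIR_MASK;

    if (bir >= 6)
        return false;

    /* The table offset is the register value with the BIR bits cleared. */
    *gpa = s->bar_gpa[bir] + (s->table_reg & ~MSIX_TABLE_BIR_MASK);
    return true;
}

/* Number of table entries encoded in Message Control (stored as N - 1). */
static uint32_t msix_table_entries(const struct snooped_msix *s)
{
    return (s->msg_ctrl & MSIX_CTRL_TABLE_SIZE) + 1u;
}

/*
 * True if a write of 'len' bytes of 'val' to guest address 'addr' is a
 * dword write to some entry's vector control field with the mask bit clear.
 */
static bool is_vector_unmask(const struct snooped_msix *s, uint64_t addr,
                             unsigned int len, uint32_t val)
{
    uint64_t base, off;

    if (len != 4 || !msix_table_gpa(s, &base) || addr < base)
        return false;

    off = addr - base;
    if (off >= (uint64_t)msix_table_entries(s) * MSIX_ENTRY_SIZE)
        return false;

    return (off % MSIX_ENTRY_SIZE) == MSIX_VECTOR_CTRL_OFFSET &&
           !(val & MSIX_VECTOR_CTRL_MASK);
}
```

The sketch only accepts single dword writes, so the REP MOVS case raised in the TODO is left unhandled here as well.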