[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PVH]: Help: msi.c



On Wed, 19 Dec 2012, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 19, 2012 at 11:19:22AM +0000, Stefano Stabellini wrote:
> > On Tue, 18 Dec 2012, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Dec 13, 2012 at 02:25:16PM +0000, Stefano Stabellini wrote:
> > > > On Thu, 13 Dec 2012, Jan Beulich wrote:
> > > > > >>> On 13.12.12 at 13:19, Stefano Stabellini 
> > > > > >>> <stefano.stabellini@xxxxxxxxxxxxx>
> > > > > wrote:
> > > > > > On Thu, 13 Dec 2012, Jan Beulich wrote:
> > > > > >> >>> On 13.12.12 at 02:43, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> 
> > > > > >> >>> wrote:
> > > > > >> > On Wed, 12 Dec 2012 17:15:23 -0800
> > > > > >> > Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:
> > > > > >> > 
> > > > > >> >> On Tue, 11 Dec 2012 12:10:19 +0000
> > > > > >> >> Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > > > > >> >> 
> > > > > >> >> > On Tue, 11 Dec 2012, Mukesh Rathor wrote:
> > > > > >> >> > > On Mon, 10 Dec 2012 09:43:34 +0000
> > > > > >> >> > > "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
> > > > > >> >> > 
> > > > > >> >> > That's strange because AFAIK Linux is never editing the MSI-X
> > > > > >> >> > entries directly: give a look at
> > > > > >> >> > arch/x86/pci/xen.c:xen_initdom_setup_msi_irqs, Linux only 
> > > > > >> >> > remaps
> > > > > >> >> > MSIs into pirqs using hypercalls. Xen should be the only one 
> > > > > >> >> > to
> > > > > >> >> > touch the real MSI-X table.
> > > > > >> >> 
> > > > > >> >> So, this is what's happening. The side effect of :
> > > > > >> >> 
> > > > > >> >>         if ( rangeset_add_range(mmio_ro_ranges, 
> > > > > >> >> dev->msix_table.first,
> > > > > >> >>                                 dev->msix_table.last) )
> > > > > >> >>             WARN();
> > > > > >> >>         if ( rangeset_add_range(mmio_ro_ranges, 
> > > > > >> >> dev->msix_pba.first,
> > > > > >> >>                                 dev->msix_pba.last) )
> > > > > >> >>             WARN();
> > > > > >> >> 
> > > > > >> >> in msix_capability_init() in xen is that the dom0 EPT entries 
> > > > > >> >> that
> > > > > >> >> I've mapped are going from RW to read only. Then when dom0 
> > > > > >> >> accesses
> > > > > >> >> it, I get EPT violation. In case of pure PV, the PTE entry to 
> > > > > >> >> access
> > > > > >> >> the iomem is RW, and the above rangeset adding doesn't affect 
> > > > > >> >> it. I
> > > > > >> >> don't understand why? Looking into that now...
> > > > > >> 
> > > > > >> As far as I was able to tell back at the time when I implemented
> > > > > >> this, existing code shouldn't have mappings for these tables in
> > > > > >> place at the time these ranges get added here. But I noted in
> > > > > >> the patch description that this is a potential issue (and may need
> > > > > >> fixing if deemed severe enough - back then, apparently nobody
> > > > > >> really cared, perhaps largely because passthrough to PV guests
> > > > > >> isn't considered fully secure anyway).
> > > > > >> 
> > > > > >> Now - did that change? I.e. can you describe where the mappings
> > > > > >> come from that cause the problem here?
> > > > > > 
> > > > > > The generic Linux MSI-X handling code does that, before calling the
> > > > > > arch specific msi setup function, for which we have a xen version
> > > > > > (xen_initdom_setup_msi_irqs):
> > > > > > 
> > > > > > pci_enable_msix -> msix_capability_init -> msix_map_region
> > > > > 
> > > > > Ah, okay, (of course?) I had looked only at the forward ported
> > > > > version of this. Is all that fiddling with the mask bits really
> > > > > being suppressed properly when running under Xen? Otherwise
> > > > > pv-ops is quite broken in this regard at present... And if it is,
> > > > > I don't see what the respective ioremap() is good for here in
> > > > > the Xen case.
> > > > 
> > > > Actually I think that you might be right: just looking at the code it
> > > > seems that the mask bits get written to the table once as part of the
> > > > initialization process:
> > > > 
> > > > pci_enable_msix -> msix_capability_init -> msix_program_entries
> > > > 
> > > > Unfortunately msix_program_entries is called few lines after
> > > > arch_setup_msi_irqs, where we call PHYSDEVOP_map_pirq to map the MSI as
> > > > a pirq.
> > > > However after that is done, all the masking/unmask is done via irq_mask
> > > > that we handle properly masking/unmasking the corresponding event
> > > > channels.
> > > > 
> > > > 
> > > > Possible solutions on top of my head:
> > > 
> > > There is also the potential to piggyback on Joerg's patches
> > > that introduce a new x86_msi_ops: compose_msi_msg.
> > > 
> > > See here: https://lkml.org/lkml/2012/8/20/432
> > > (I think there was also a more recent one posted at some point).
> > 
> > Given that dom0 should never write to the MSI-X table, introducing a new
> 
> How does this work with QEMU setting up MSI and MSI-X on behalf of
> guests? Or is that actually handled by Xen hypervisor?

In the case of HVM guests, QEMU emulates the PCI config space and the
table, so it is OK for the guest to write to it.


> > msi_ops that replaces msix_program_entries (or at least the part of
> > msix_program_entries that masks all the entried) is the only solution
> > left.
> 
> so this one (__msix_mask_irq):
> 
>          mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
>  198         if (flag)
>  199                 mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
>  200         writel(mask_bits, desc->mask_base + offset);
> 

Yes, that's the one. Once could argue that __msix_mask_irq should call
mask_irq rather than writing to the table directly.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.