[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PVH]: Help: msi.c



On Wed, Dec 19, 2012 at 11:19:22AM +0000, Stefano Stabellini wrote:
> On Tue, 18 Dec 2012, Konrad Rzeszutek Wilk wrote:
> > On Thu, Dec 13, 2012 at 02:25:16PM +0000, Stefano Stabellini wrote:
> > > On Thu, 13 Dec 2012, Jan Beulich wrote:
> > > > >>> On 13.12.12 at 13:19, Stefano Stabellini 
> > > > >>> <stefano.stabellini@xxxxxxxxxxxxx>
> > > > wrote:
> > > > > On Thu, 13 Dec 2012, Jan Beulich wrote:
> > > > >> >>> On 13.12.12 at 02:43, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> 
> > > > >> >>> wrote:
> > > > >> > On Wed, 12 Dec 2012 17:15:23 -0800
> > > > >> > Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:
> > > > >> > 
> > > > >> >> On Tue, 11 Dec 2012 12:10:19 +0000
> > > > >> >> Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > > > >> >> 
> > > > >> >> > On Tue, 11 Dec 2012, Mukesh Rathor wrote:
> > > > >> >> > > On Mon, 10 Dec 2012 09:43:34 +0000
> > > > >> >> > > "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
> > > > >> >> > 
> > > > >> >> > That's strange because AFAIK Linux is never editing the MSI-X
> > > > >> >> > entries directly: give a look at
> > > > >> >> > arch/x86/pci/xen.c:xen_initdom_setup_msi_irqs, Linux only remaps
> > > > >> >> > MSIs into pirqs using hypercalls. Xen should be the only one to
> > > > >> >> > touch the real MSI-X table.
> > > > >> >> 
> > > > >> >> So, this is what's happening. The side effect of :
> > > > >> >> 
> > > > >> >>         if ( rangeset_add_range(mmio_ro_ranges, 
> > > > >> >> dev->msix_table.first,
> > > > >> >>                                 dev->msix_table.last) )
> > > > >> >>             WARN();
> > > > >> >>         if ( rangeset_add_range(mmio_ro_ranges, 
> > > > >> >> dev->msix_pba.first,
> > > > >> >>                                 dev->msix_pba.last) )
> > > > >> >>             WARN();
> > > > >> >> 
> > > > >> >> in msix_capability_init() in xen is that the dom0 EPT entries that
> > > > >> >> I've mapped are going from RW to read only. Then when dom0 
> > > > >> >> accesses
> > > > >> >> it, I get EPT violation. In case of pure PV, the PTE entry to 
> > > > >> >> access
> > > > >> >> the iomem is RW, and the above rangeset adding doesn't affect it. 
> > > > >> >> I
> > > > >> >> don't understand why? Looking into that now...
> > > > >> 
> > > > >> As far as I was able to tell back at the time when I implemented
> > > > >> this, existing code shouldn't have mappings for these tables in
> > > > >> place at the time these ranges get added here. But I noted in
> > > > >> the patch description that this is a potential issue (and may need
> > > > >> fixing if deemed severe enough - back then, apparently nobody
> > > > >> really cared, perhaps largely because passthrough to PV guests
> > > > >> isn't considered fully secure anyway).
> > > > >> 
> > > > >> Now - did that change? I.e. can you describe where the mappings
> > > > >> come from that cause the problem here?
> > > > > 
> > > > > The generic Linux MSI-X handling code does that, before calling the
> > > > > arch specific msi setup function, for which we have a xen version
> > > > > (xen_initdom_setup_msi_irqs):
> > > > > 
> > > > > pci_enable_msix -> msix_capability_init -> msix_map_region
> > > > 
> > > > Ah, okay, (of course?) I had looked only at the forward ported
> > > > version of this. Is all that fiddling with the mask bits really
> > > > being suppressed properly when running under Xen? Otherwise
> > > > pv-ops is quite broken in this regard at present... And if it is,
> > > > I don't see what the respective ioremap() is good for here in
> > > > the Xen case.
> > > 
> > > Actually I think that you might be right: just looking at the code it
> > > seems that the mask bits get written to the table once as part of the
> > > initialization process:
> > > 
> > > pci_enable_msix -> msix_capability_init -> msix_program_entries
> > > 
> > > Unfortunately msix_program_entries is called few lines after
> > > arch_setup_msi_irqs, where we call PHYSDEVOP_map_pirq to map the MSI as
> > > a pirq.
> > > However after that is done, all the masking/unmask is done via irq_mask
> > > that we handle properly masking/unmasking the corresponding event
> > > channels.
> > > 
> > > 
> > > Possible solutions on top of my head:
> > 
> > There is also the potential to piggyback on Joerg's patches
> > that introduce a new x86_msi_ops: compose_msi_msg.
> > 
> > See here: https://lkml.org/lkml/2012/8/20/432
> > (I think there was also a more recent one posted at some point).
> 
> Given that dom0 should never write to the MSI-X table, introducing a new

How does this work with QEMU setting up MSI and MSI-X on behalf of
guests? Or is that actually handled by Xen hypervisor?

> msi_ops that replaces msix_program_entries (or at least the part of
> msix_program_entries that masks all the entried) is the only solution
> left.

so this one (__msix_mask_irq):

         mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 198         if (flag)
 199                 mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
 200         writel(mask_bits, desc->mask_base + offset);





_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.