
[Xen-devel] Need some advices on how to workaround a hardware bug


  • To: xen-devel@xxxxxxxxxxxxx
  • From: Chao Gao <chao.gao@xxxxxxxxx>
  • Date: Fri, 30 Mar 2018 13:14:29 +0800
  • Delivery-date: Fri, 30 Mar 2018 05:19:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

I hit an EPT violation, and Xen then destroyed the guest, after assigning
a device to the guest. After some investigation, I found the cause: the
device isn't a standard PCI device. Its MSI-X PBA is located in the same
4-KB page as other CSRs. When the driver in the guest writes to registers
in that page, an EPT violation happens because the PBA page is marked
read-only by the following lines in msix_capability_init():
        if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first,
                                        msix->pba.last) )
I think Xen marks this page read-only because the PCI spec says:
        Software should never write, and should only read Pending Bits.
        If software writes to Pending Bits, the result is undefined
After the EPT violation, Xen goes through all registered MMIO ranges,
finds that this address hasn't been registered, and destroys the guest.
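
To illustrate why a write to an unrelated CSR faults: the read-only
marking applies to whole 4-KB pages, so any register sharing a page with
the PBA is caught too. Below is a minimal standalone sketch, not Xen
code. The helper names (pfn_of, write_faults) are made up for
illustration, and it assumes the protected range is tracked as page
frame numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* 4-KB pages */

/* Hypothetical helper: page frame number containing a physical address. */
static inline uint64_t pfn_of(uint64_t paddr)
{
    return paddr >> PAGE_SHIFT;
}

/*
 * Hypothetical check: does a write to 'addr' land in a page frame that
 * holds any part of the PBA? Protection is per page, so a CSR that merely
 * shares a page with the PBA is affected as well.
 */
static bool write_faults(uint64_t pba_start, uint64_t pba_len, uint64_t addr)
{
    assert(pba_len > 0);

    uint64_t first = pfn_of(pba_start);
    uint64_t last = pfn_of(pba_start + pba_len - 1);
    uint64_t pfn = pfn_of(addr);

    return pfn >= first && pfn <= last;
}
```

With an 8-byte PBA at offset 0x800 of a page, a write to a CSR at offset
0x10 of the same page faults, while a write to the next page does not.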

I plan to work out a workaround for this issue so that Xen guests (and
also dom0, if dom0 uses EPT; I'm not sure) can efficiently use devices
that violate the PCI spec in this way. Currently, there are two options
(EPT SPP might provide a perfect solution):

One is to trap the page where the PBA is located, ignore writes to the
PBA, and apply writes to other fields in the same page. This would incur
significant performance degradation if the page is accessed frequently.
To mitigate the performance drop, a patch to trap the PBA lazily, like
what QEMU does [1], is needed.
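
The filtering part of this first option could look roughly like the
sketch below. It is standalone and not Xen code: struct pba_window and
emulate_write are hypothetical names, a plain byte array stands in for
the device's MMIO page, and a real handler would issue properly sized
MMIO accesses instead of copying byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical description of where the PBA sits inside its page. */
struct pba_window {
    size_t off; /* byte offset of the PBA within the page */
    size_t len; /* PBA length in bytes */
};

/*
 * Emulate a trapped guest write of 'size' bytes at page offset 'off'.
 * Bytes that fall inside the PBA window are dropped; all other bytes
 * are written through to 'regs'. Returns the number of bytes written.
 */
static size_t emulate_write(uint8_t *regs, const struct pba_window *pba,
                            size_t off, const uint8_t *data, size_t size)
{
    size_t written = 0;

    /* The access must stay within the trapped page. */
    assert(off <= PAGE_SIZE && size <= PAGE_SIZE - off);

    for (size_t i = 0; i < size; i++) {
        size_t pos = off + i;

        /* Ignore the part of the write that targets Pending Bits. */
        if (pos >= pba->off && pos - pba->off < pba->len)
            continue;

        regs[pos] = data[i];
        written++;
    }

    return written;
}
```

Every such write still costs a VM exit, which is where the performance
concern comes from.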

The other is to not trap accesses to the page where the PBA is located.
With this option, all accesses to the page go to the hardware device
without Xen's interception. One concern is whether this option would
open security holes compared with trapping these accesses. In my mind,
the answer is no, because Xen doesn't even read the PBA. A corner case
for this option is that the PBA might reside in the same page as the
MSI-X table, which is allowed according to the following description in
the PCI spec:
        The MSI-X Table and MSI-X PBA are permitted to co-reside within a
        naturally aligned 4-KB address range, though they must not overlap with
        each other.
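
This corner case can be detected from the table and PBA addresses. A
minimal standalone sketch follows, assuming byte-granular start/length
pairs for both structures; ranges_share_page is a hypothetical helper,
not an existing Xen function. When it returns true, leaving the PBA page
untrapped would presumably also leave the MSI-X table untrapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* 4-KB pages */

/*
 * Hypothetical check: do two non-empty physical ranges touch at least
 * one common 4-KB page frame? Useful for spotting an MSI-X table and a
 * PBA that co-reside in one page.
 */
static bool ranges_share_page(uint64_t a_start, uint64_t a_len,
                              uint64_t b_start, uint64_t b_len)
{
    assert(a_len > 0 && b_len > 0);

    uint64_t a_first = a_start >> PAGE_SHIFT;
    uint64_t a_last = (a_start + a_len - 1) >> PAGE_SHIFT;
    uint64_t b_first = b_start >> PAGE_SHIFT;
    uint64_t b_last = (b_start + b_len - 1) >> PAGE_SHIFT;

    /* Two closed intervals of page frames overlap iff neither ends first. */
    return a_first <= b_last && b_first <= a_last;
}
```

For example, a 256-byte table at the start of a page and an 8-byte PBA
at offset 0x800 of the same page share a page; moving the PBA to the
next page separates them.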

Which one do you think is better? Or are there any other thoughts about
how to work around this case?

[1]:https://git.qemu.org/?p=qemu.git;a=commit;h=95239e162518dc6577164be3d9a789aba7f591a3

Thanks
Chao


 

