
Re: [Xen-devel] Need some advices on how to workaround a hardware bug

On Fri, Mar 30, 2018 at 02:23:13AM -0600, Jan Beulich wrote:
>>>> Chao Gao <chao.gao@xxxxxxxxx> 03/30/18 7:19 AM >>>
>>I hit an EPT violation, after which the guest was destroyed by Xen,
>>after assigning a device to the guest. After some investigation, I found
>>the cause: the device isn't a standard PCI device -- its MSI-X PBA
>>is located in the same 4-KByte page as other CSRs. When the driver in the
>>guest writes the registers in that page, an EPT violation happens because
>>the PBA page is marked as read-only by the below line in
>>if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first,
>>msix->pba.last) )
>>I think the reason Xen marks this page read-only is that the PCI spec says:
>>Software should never write, and should only read Pending Bits.
>>If software writes to Pending Bits, the result is undefined.
>>Xen then goes through all registered MMIO ranges and finds that this address
>>hasn't been registered. Thus it destroys the guest.
>>I plan to work out a workaround for this issue so that Xen guests (also
>>dom0 if dom0 uses EPT? not sure) can efficiently use devices which
>>violate the PCI spec in this way. Currently, there are two options (EPT SPP
>>might provide a perfect solution):
>>One is to trap the page where the PBA is located, ignore writes to the PBA,
>>and apply writes to other fields in the same page. It would incur significant
>>performance degradation if this page is accessed frequently. In order
>>to mitigate the performance drop, a patch to trap the PBA lazily, like what
>>qemu does [1], is needed.
>>The other is to not trap accesses to the page where the PBA is located. With
>>this option, all accesses to the page will go to the hardware device without
>>Xen's interception. I think one concern would be whether this option
>>would introduce security holes, compared with trapping these 
>>In my mind, the answer is no, because Xen doesn't even read the PBA. A corner
>>case for this option might be the PBA residing in the same page as the MSI-X
>>table, which is allowed according to the following description in the PCI spec:
>>The MSI-X Table and MSI-X PBA are permitted to co-reside within a
>>naturally aligned 4-KB address range, though they must not overlap with
>>each other.
>>Which one do you think is better? Or do you have any other thoughts on how
>>to work around this case?
>First of all, I don't think the qemu change you point out is an equivalent for
>the situation here: We don't emulate PBA, we only control access.
>Not trapping write accesses to PBA is okay only under one of two conditions:
>For Dom0 (which we trust) or if the host admin gave their consent. This extends
>to both variants you suggest - any non-spec compliant behavior bears the risk
>of undermining security of the entire system beyond the well known issues with
>pass-through. Hence apart from EPT SPP (which we don't have yet), only a
>command line or guest config controlled approach of making exceptions from
>the base policy is viable imo.
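
For reference, the first option quoted above (trap the page, ignore writes to
the PBA, apply writes to the other fields) could look roughly like the sketch
below. This is not Xen code: `struct trapped_page`, `overlaps_pba()` and
`handle_trapped_write()` are hypothetical names, and a plain byte array stands
in for the device MMIO page. An access that touches the PBA at all is dropped
as a whole, on the assumption that MMIO accesses are naturally aligned and
therefore never straddle the PBA boundary.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Hypothetical model of one trapped 4-KByte page that contains the PBA. */
struct trapped_page {
    uint8_t  backing[PAGE_SIZE]; /* stands in for the device MMIO page */
    uint32_t pba_first;          /* first byte offset of the PBA in the page */
    uint32_t pba_last;           /* last byte offset of the PBA (inclusive) */
};

/* True if the byte range [off, off + len) touches the PBA. */
static bool overlaps_pba(const struct trapped_page *p,
                         uint32_t off, uint32_t len)
{
    return len != 0 && off <= p->pba_last && off + len - 1 >= p->pba_first;
}

/*
 * Handle a trapped guest write.
 * Returns 0 if the write was applied, 1 if it was ignored because it
 * touches the PBA, and -1 if the access does not fit inside the page.
 */
static int handle_trapped_write(struct trapped_page *p, uint32_t off,
                                const void *data, uint32_t len)
{
    if ( len == 0 || off >= PAGE_SIZE || len > PAGE_SIZE - off )
        return -1;

    if ( overlaps_pba(p, off, len) )
        return 1;                       /* silently drop writes to the PBA */

    memcpy(&p->backing[off], data, len); /* forward everything else */
    return 0;
}
```

Every write to the page takes this path, which is where the performance cost
mentioned for this option comes from.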

Got it. Thanks for your kind suggestion.

I will add a command line option, for example "pba_quirk", which takes a
list of device SBDFs. When a device in this list is assigned to a guest,
reads and writes to the page where its MSI-X PBA resides are allowed.
This option provides a workaround for nonstandard PCI devices whose
MSI-X PBA shares the same 4-KByte page with other registers. Note that
adding an untrusted device to this option would undermine the security of
the entire system.
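
A minimal sketch of how such a list could be parsed and consulted, assuming a
comma-separated "ssss:bb:dd.f" syntax for the option value. The syntax, the
`MAX_QUIRK_DEVS` limit and the helper names `parse_pba_quirk()` and
`pba_quirk_applies()` are all assumptions for illustration, not an existing
Xen interface:

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_QUIRK_DEVS 16 /* arbitrary limit chosen for this sketch */

struct sbdf {
    unsigned int seg, bus, dev, fn;
};

static struct sbdf quirk_devs[MAX_QUIRK_DEVS];
static unsigned int nr_quirk_devs;

/*
 * Parse "ssss:bb:dd.f[,ssss:bb:dd.f...]" (hex fields).
 * Returns 0 on success; on malformed input or too many entries it
 * returns -1 and leaves the list empty.
 */
static int parse_pba_quirk(const char *s)
{
    nr_quirk_devs = 0;

    while ( *s )
    {
        struct sbdf d;
        int used = 0;

        if ( nr_quirk_devs == MAX_QUIRK_DEVS ||
             sscanf(s, "%x:%x:%x.%x%n",
                    &d.seg, &d.bus, &d.dev, &d.fn, &used) != 4 ||
             d.seg > 0xffff || d.bus > 0xff || d.dev > 0x1f || d.fn > 7 )
            goto fail;

        s += used;
        if ( *s == ',' )
            s++;            /* move on to the next entry */
        else if ( *s )
            goto fail;      /* trailing garbage after an entry */

        quirk_devs[nr_quirk_devs++] = d;
    }

    return 0;

 fail:
    nr_quirk_devs = 0;
    return -1;
}

/* True if the given device was listed in the option. */
static bool pba_quirk_applies(unsigned int seg, unsigned int bus,
                              unsigned int dev, unsigned int fn)
{
    for ( unsigned int i = 0; i < nr_quirk_devs; i++ )
        if ( quirk_devs[i].seg == seg && quirk_devs[i].bus == bus &&
             quirk_devs[i].dev == dev && quirk_devs[i].fn == fn )
            return true;

    return false;
}
```

The device assignment path would then skip adding the PBA range to
mmio_ro_ranges only when pba_quirk_applies() returns true for the device, so
the default policy stays unchanged for everything not explicitly listed.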

