
Re: [Xen-devel] Need some advices on how to workaround a hardware bug



>>> Chao Gao <chao.gao@xxxxxxxxx> 03/30/18 7:19 AM >>>
>I hit an EPT violation, after which Xen destroyed the guest, once I had
>assigned a device to that guest. After some investigation, I found that
>the cause is that the device is not a standard PCI device: its MSI-X PBA
>is located in the same 4-KB page as other CSRs. When the driver in the
>guest writes to the registers in that page, an EPT violation occurs because
>the PBA page is marked read-only by the following line in
>msix_capability_init():
>    if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first,
>                            msix->pba.last) )
>I think the reason Xen marks this page read-only is that the PCI spec says:
>    Software should never write, and should only read Pending Bits.
>    If software writes to Pending Bits, the result is undefined.
>Xen then goes through all registered MMIO ranges and finds that this address
>has not been registered, so it destroys the guest.
>
>I plan to work out a workaround for this issue so that Xen guests (and
>also dom0, if dom0 uses EPT? I am not sure) can efficiently use devices that
>violate the PCI spec in this way. Currently, there are two options (EPT SPP
>might provide a perfect solution):
>
>One is to trap the page where the PBA is located, ignoring writes to the PBA
>and applying writes to other fields in the same page. This would incur
>significant performance degradation if the page is accessed frequently. To
>mitigate the performance drop, a patch to trap the PBA lazily, like what
>qemu does [1], is needed.
>
>The other is to not trap accesses to the page where the PBA is located. With
>this option, all accesses to the page go to the hardware device without
>Xen's interception. One concern is whether this option would introduce
>security holes compared with trapping these accesses.
>In my mind, the answer is no, because Xen doesn't even read the PBA. A corner
>case for this option is the PBA residing in the same page as the MSI-X table,
>which is allowed according to the following description in the PCI spec:
>    The MSI-X Table and MSI-X PBA are permitted to co-reside within a
>    naturally aligned 4-KB address range, though they must not overlap with
>    each other.
>
>Which one do you think is better? Or do you have any other thoughts about
>how to work around this case?

First of all, I don't think the qemu change you point out is an equivalent for
the situation here: We don't emulate PBA, we only control access.

Not trapping write accesses to PBA is okay only under one of two conditions:
For Dom0 (which we trust) or if the host admin gave their consent. This extends
to both variants you suggest: any non-spec-compliant behavior bears the risk
of undermining the security of the entire system beyond the well-known issues with
pass-through. Hence apart from EPT SPP (which we don't have yet), only a
command line or guest config controlled approach of making exceptions from
the base policy is viable imo.
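
To illustrate the kind of policy-gated exception meant here, below is a
minimal sketch in C. It is not Xen code: struct pba_policy, the opt-in flag
admin_allows_shared_pba_page, and classify_write() are hypothetical names
made up for illustration. It assumes 4-KB pages and inclusive byte ranges,
in the spirit of msix->pba.first / msix->pba.last. It models the first
variant (trap the page, drop writes that touch the PBA, forward the rest),
and only once Dom0 trust or admin consent applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4-KB pages */

/* Hypothetical: inclusive byte range occupied by the MSI-X PBA. */
struct pba_range {
    uint64_t first;
    uint64_t last;
};

/* Hypothetical policy inputs; neither field is an existing Xen option. */
struct pba_policy {
    bool is_hardware_domain;            /* Dom0, which is trusted */
    bool admin_allows_shared_pba_page;  /* explicit host admin opt-in */
};

enum write_action {
    WRITE_FORWARD,  /* pass the write through to the device */
    WRITE_DROP,     /* write touches the PBA: ignore it */
    WRITE_FAULT     /* base policy: page is read-only, treat as violation */
};

/* Decide what to do with a write of 'size' bytes at address 'addr'. */
static enum write_action classify_write(const struct pba_policy *pol,
                                        const struct pba_range *pba,
                                        uint64_t addr, unsigned int size)
{
    uint64_t last;

    assert(size > 0);
    last = addr + size - 1;

    /* Writes outside any page holding the PBA are not affected at all. */
    if ( (last >> PAGE_SHIFT) < (pba->first >> PAGE_SHIFT) ||
         (addr >> PAGE_SHIFT) > (pba->last >> PAGE_SHIFT) )
        return WRITE_FORWARD;

    /* Base policy: no exception without Dom0 trust or admin consent. */
    if ( !pol->is_hardware_domain && !pol->admin_allows_shared_pba_page )
        return WRITE_FAULT;

    /* Exception granted: still never let a write reach the Pending Bits. */
    if ( addr <= pba->last && pba->first <= last )
        return WRITE_DROP;

    /* Other registers that merely share the page with the PBA. */
    return WRITE_FORWARD;
}
```

With a PBA at 0x1800-0x180f, a guest without consent faults on any write to
page 0x1000, while a guest with consent has writes to 0x1000 forwarded and
writes overlapping 0x1800-0x180f dropped.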

Jan



 

