[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Is: qemu-xen mishandling upper 64-bit BAR compared to qemu-tradWas:Re: Dom0 linux 4.0 + devel/for-linus-4.1 branch: p2m.c:884:d0v0 gfn_to_mfn failed! gfn=ffffffff001ed type:4



On Wed, Apr 15, 2015 at 11:14:04PM +0200, Sander Eikelenboom wrote:
> 
> Wednesday, April 15, 2015, 10:58:50 PM, you wrote:
> 
> >> Hi Konrad,
> >> 
> >> xen version is at last changeset 3a28f760508fb35c430edac17a9efde5aff6d1d5 
> >> (previous unstable master before the force push which includes 
> >> 1aeb1156fa43fe2cd2b5003995b20466cd19a622 which causes the trouble reported 
> >> here:
> >> http://lists.xen.org/archives/html/xen-devel/2015-04/msg01336.html )
> >> 
> >> qemu-xen is at last changeset 727b998448e852a5e8eb570ac3a259ef62fbdacb 
> >> plus the revert of 7665d6ba98e20fb05c420de947c1750fd47e5c07 
> >> (due to other problem reported here: 
> >> http://lists.xen.org/archives/html/xen-devel/2015-04/msg01121.html )
> 

I was able to reproduce this with an older NIC that exposes 64-bit BARs.

The problem is that the XSA120-addendm patch (which does not clear
the PCI_COMMAND register anymore), causes an missing functionality in 
qemu-xen to be exposed. This missing functionality is implemented in
qemu-traditional which is why it works there.

The problem is that qemu-xen for any write to the BAR regions
updates them to the hypervisor - but only if the real hardware has
them enabled (see pci_update_mappings in pci_default_write_config which
is called by xen_pt_pci_write_config). Specifically pci_bar_address
checks PCI_COMMAND register for PCI_COMMAND_MEMORY. If it is disabled
(so no XSA-120 addendum patch), it returns -1 (default value resulting
in no changes in the internal structures). If it is enabled, then
it updates the d->config space with the value written by the guest.
Once xen_pt_pci_write_config is done it syncs up the changes (if there
are any) which results in the QEMU calling hypervisor to update the P2M mapping.
However 

That is, the hvmloader performs for 64-bit BARs two writes
of ~0 - to the BARx and BARx+1 to get the full size of PCI device.
That means d->config[BARx] is updated to 0xffffff, and since the
PCI_COMMAND_MEMORY is set, it actually merges that with the real
host value read (so the size of the device), which then
later goes through an gauntlet of masking (see xen_pt_bar_reg_write)
which picks the right value to return to the guest. 
The d->config[BARx] remains unchanged (I think, need to verify).

The next write by hvmloader for the upper BAR (BARx+1) results
in d->config[BARx+1] to be updated with 0xffffffff as well.
This means we have d->config[BARx] to be 0xfffffff and
d->config[BARx+1] to be 0xfffffff (see pci_default_write_config).

pci_update_mappings does not deal in d->config but instead
iterates over the PCIIORegion (r->addr and such). It asks
pci_bar_address(d, BARx) for the new value, and since this is
a 64-bit value, pci_bar_address hands over
 d->config[BARx]| d->config[BARx+1] & (Some filtering).

The end result is that we have now 0xffffff000000e or such which
r->addr is updated with (the default value was -1).

And then once xen_pt_pci_write_config is done the hypercall
is done which tries to update the MFN for the BAR with a new
r->addr (0xffffff0000..) which fails.

Prior to the XSA120-addendum patch, none of this would happen
because pci_bar_address would always return -1 (uint32_t) for
any region and since the default r->addr is -1, we would never
do the hypercall.

If you have gotten to this point and are not exactly sure
what I am saying - here is a function char to help:

xen_pt_pci_write_config (BARx+1, ~0)
        pci_default_write_config(BARx+1, ~0)
                pci_update_mappings()
                        pci_bar_address [for each BAR]
                                if (!(cmd & PCI_COMMAND_MEMORY))
                                        don't update r->addr
                                else
                                        update r->addr with d->config+bar 
offset (quad read)
        xen_pt_bar_reg_write (BARx+1, ~0)
        if any of the r->addr changed, make the hypercall.


Note that these reads are four bytes read. 
        

The qemu-trad had some extra code to deal with the
upper bar by always making r->addr = -1 if anybody ever
wrote -1 to it.  That is - ignore whatever the host had and
always make it the default value - so that we would never
make the hypercall.

Stefano, thoughts on the fix? Just try doing what qemu-trad did?

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.