
Re: [Xen-devel] Re: [PATCH 0 of 3] Patches for PCI passthrough with modified E820 (v3) - resent.



On Mon, May 09, 2011 at 10:00:30AM +0100, Ian Campbell wrote:
> On Wed, 2011-05-04 at 15:17 +0100, Konrad Rzeszutek Wilk wrote:
> > Hello,
> > 
> > This set of v3 patches allows a PV domain to see the machine's
> > E820 and figure out where the "PCI I/O" gap is and match it with the 
> > reality.
> > 
> > Changelog since v2 posting:
> >  - Moved 'libxl__e820_alloc' to be called from do_domain_create and if
> >    machine_e820 == true.
> >  - Made no_machine_e820 be set to true if the guest has no PCI devices
> >    (and is PV).
> >  - Used Keir's re-worked code for E820 creation.
> > Changelog since v1 posting:
> >  - Squashed the "x86: make the pv-only e820 array be dynamic" and 
> >    "x86: adjust the size of the e820 for pv guest to be dynamic" together.
> >  - Made xc_domain_set_memmap_limit use the 'xc_domain_set_memory_map'
> >  - Moved 'libxl_e820_alloc' and 'libxl_e820_sanitize' to be an internal
> >    operation and called from 'libxl_device_pci_parse_bdf'.
> >  - Expanded 'libxl_device_pci_parse_bdf' API call to have an extra argument
> >    (optional).
> > 
> > The short end is that with these patches a PV domain can:
> > 
> >  - Use the correct PCI I/O gap. Before these patches, a Linux guest would
> >    boot up and report:
> >    [    0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:c0000000)
> >    while in actuality the PCI I/O gap should have been:
> >    [    0.000000] Allocating PCI resources starting at b0000000 (gap: b0000000:4c000000)
> 
> The reason it needs to be a particular gap is that we can't (easily? at
> all?) rewrite the device BARs to match the guest's idea of the hole, is
> that right? So it needs to be consistent with the underlying host hole.

Correct.
> 
> I wonder if it is time to enable IOMMU for PV guests by default.

Would be nice. I thought that if the IOMMU was present it would automatically do that?

> Presumably in that case we can manufacture any hole we like in the e820,
> which is useful e.g. when migrating to not-completely-homogeneous hosts.


Hmm. I want to say yes, but I am not entirely sure what all the pieces are
that this would entail.

> 
> >  - The PV domain with PCI devices was limited to 3GB. It can now be booted
> >    with 4GB, 8GB, or whatever amount you want. The PCI devices will now
> >    _not_ conflict with System RAM, meaning the drivers can load.
> > 
> >  - With 2.6.39 kernels (which have the 1-1 mapping code), the VM_IO flag
> >    will now be automatically applied to regions that are considered PCI I/O
> >    regions. You can find out which those are by looking for '1-1' in the
> >    kernel bootup messages.
> > 
> > To use this patchset, the guest config file has to have the
> > 'pci=['<BDF>',...]' parameter set.
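
[A minimal guest config sketch for the above; the BDF value and memory size
here are only examples, not taken from the patches:

    # xl/xm guest config fragment: pass through one PCI device by BDF.
    # With these patches applied, the guest sees the host's E820 PCI I/O gap.
    pci    = [ '0000:04:00.0' ]
    memory = 4096

The BDF ('0000:04:00.0') is whatever `lspci` reports for the device you want
to pass through.]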
> > 
> > This has been tested with 2.6.18 (RHEL5), 2.6.27 (SLES11), 2.6.36, 2.6.37,
> > 2.6.38, and 2.6.39 kernels. Also tested with PV NetBSD 5.1.
> > 
> > Tested this with PCI devices (NIC, MSI), and with 2GB, 4GB, and 6GB guests
> > with success.
> > 
> >  libxc/xc_domain.c      |   77 +++++++++++-----
> >  libxc/xc_e820.h        |    3 
> >  libxc/xenctrl.h        |   11 ++
> >  libxl/libxl.idl        |    1 
> >  libxl/libxl_create.c   |    8 +
> >  libxl/libxl_internal.h |    1 
> >  libxl/libxl_pci.c      |  230 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  libxl/xl_cmdimpl.c     |    3 
> >  8 files changed, 309 insertions(+), 25 deletions(-)
> > 
> > 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
