Re: [Xen-devel] [RFC PATCH 1/1] Add pci_hole_min_size

On Fri, Mar 7, 2014 at 7:28 PM, Konrad Rzeszutek Wilk <konrad@xxxxxxxxxx> wrote:
> On Tue, Mar 04, 2014 at 01:57:26PM -0500, Don Slutz wrote:
>> On 03/04/14 08:25, George Dunlap wrote:
>> >On Fri, Feb 28, 2014 at 8:15 PM, Don Slutz <dslutz@xxxxxxxxxxx> wrote:
>> >>This allows growing the pci_hole to the size needed.
>> >You mean, it allows the pci hole size to be specified at boot
>> Yes.
>> >  -- the
>> >pci hole still cannot be enlarged dynamically in hvmloader, correct?
>> If I am correctly understanding you, this is in reference to:
>> /*
>>      * At the moment qemu-xen can't deal with relocated memory regions.
>>      * It's too close to the release to make a proper fix; for now,
>>      * only allow the MMIO hole to grow large enough to move guest memory
>>      * if we're running qemu-traditional.  Items that don't fit will be
>>      * relocated into the 64-bit address space.   */
>> so the answer is no, however using pci_hole_min_size can mean that
>> allow_memory_relocate is not needed for upstream QEMU.
>> >What's your intended use case for this?
>> >
>> >  -George
>> If you add enough PCI devices then all mmio may not fit below 4G which may
>> not be the layout the user wanted. This allows you to increase the below 4G
>> address space that PCI devices can use and therefore in more cases not have
>> any mmio that is above 4G.
>> There are real PCI cards that do not support mmio over 4G, so if you want
>> to emulate them precisely, you may also need to increase the space below 4G
>> for them.  There are drivers for these cards that also do not work if they
>> have their mmio space mapped above 4G.
> Would it be better if the HVM guests had something similar to what we
> manufacture for PV guests with PCI passthrough: a filtered version of
> the host's E820?
> That way you don't have to worry about resizing just right and instead
> the E820 looks like the host's one. Granted you can't migrate, but I
> don't think that is a problem in your use-case?

Having the guest PCI hole the same size as the host PCI hole also gets
rid of a whole class of (unfortunately very common) bugs in PCI
hardware: if guest paddrs overlap with device IO ranges, the PCI
hardware sends the DMA requests to the wrong place.
(In other words, VT-d as implemented in a very large number of
motherboards is utterly broken -- total fail on someone's part.)

The main disadvantage of this is that it unnecessarily reduces the
amount of lowmem available -- and for 32-bit non-PAE guests, which
cannot address memory above 4G, it reduces the total amount of memory
available.
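
To make the trade-off concrete, here is a rough sketch of the
arithmetic, assuming the hole sits directly below 4G and is no larger
than 4G. The function names and the hole sizes in the usage note are
made up for illustration; none of this is taken from hvmloader.

```c
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)
#define FOUR_GIB (4 * GiB)

/*
 * RAM that stays below 4G: everything beneath the hole, capped by the
 * guest's total RAM.  Assumes pci_hole_size <= 4G.
 */
static uint64_t lowmem_bytes(uint64_t total_ram, uint64_t pci_hole_size)
{
    uint64_t below_hole = FOUR_GIB - pci_hole_size;

    return total_ram < below_hole ? total_ram : below_hole;
}

/*
 * RAM that has to live above 4G because the hole displaced it.
 * A 32-bit non-PAE guest cannot reach this part at all.
 */
static uint64_t highmem_bytes(uint64_t total_ram, uint64_t pci_hole_size)
{
    return total_ram - lowmem_bytes(total_ram, pci_hole_size);
}
```

With 4G of guest RAM, an (illustrative) 256M hole leaves 3840M of
lowmem, while a 1G hole leaves only 3G; the remainder ends up above 4G.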

I think long-term, it would be best to:
* Have the pci hole be small for VMs without devices passed through
* Have the pci hole default to the host pci hole for VMs with devices
passed through
* Have the pci hole size able to be specified, either as a size, or as "host".

As long as the size specification can easily be extended to cover
this later, I think just having a size to begin with is OK.
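
Something along these lines is what I have in mind for the policy
above. This is only a sketch: resolve_pci_hole(), the "size in MiB"
string format, and the 256M small default are all hypothetical, not
existing libxl or hvmloader interfaces.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MiB (1ULL << 20)

/* Hypothetical default for guests without passthrough; not a Xen constant. */
#define SMALL_HOLE_DEFAULT (256 * MiB)

/*
 * Resolve the guest pci hole size in bytes.
 *   spec == NULL   : nothing specified; use the host's size if devices
 *                    are passed through, else the small default
 *   spec == "host" : copy the host's hole size
 *   otherwise      : a decimal size in MiB, e.g. "512"
 * Returns 0 if spec cannot be parsed.  Overflow is not checked.
 */
static uint64_t resolve_pci_hole(const char *spec, int has_passthrough,
                                 uint64_t host_hole_size)
{
    char *end;
    unsigned long long mib;

    if (spec == NULL)
        return has_passthrough ? host_hole_size : SMALL_HOLE_DEFAULT;
    if (strcmp(spec, "host") == 0)
        return host_hole_size;

    /* Reject empty strings, signs and leading whitespace up front. */
    if (!isdigit((unsigned char)spec[0]))
        return 0;
    mib = strtoull(spec, &end, 10);
    if (*end != '\0')
        return 0;
    return mib * MiB;
}
```

The point is that a plain size and "host" go through the same option,
so starting with size-only does not paint us into a corner.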

I think the qemu guys didn't like the term "pci_hole" and wanted
something like "lowmem" instead -- that will need to be sorted out.


