
Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory




> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@xxxxxxxxxxxxx]
> Sent: 30 May 2013 18:28
> To: Hanweidong
> Cc: Stefano Stabellini; George Dunlap; xudong.hao@xxxxxxxxx;
> Yanqiangjun; Luonengjun; Wangzhenguo; Yangxiaowei; Gonglei (Arei);
> Anthony Perard; xen-devel@xxxxxxxxxxxxx; xiantao.zhang@xxxxxxxxx
> Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> with 4G memory
> 
> On Thu, 30 May 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@xxxxxxxxxxxxx]
> > > Sent: 30 May 2013 0:18
> > > To: Hanweidong
> > > Cc: Stefano Stabellini; George Dunlap; xudong.hao@xxxxxxxxx;
> > > Yanqiangjun; Luonengjun; Wangzhenguo; Yangxiaowei; Gonglei (Arei);
> > > Anthony Perard; xen-devel@xxxxxxxxxxxxx; xiantao.zhang@xxxxxxxxx
> > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> > > with 4G memory
> > >
> > > On Thu, 25 Apr 2013, Hanweidong wrote:
> > > > > -----Original Message-----
> > > > > From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> > > > > bounces@xxxxxxxxxxxxx] On Behalf Of Hanweidong
> > > > > > Sent: 26 March 2013 17:38
> > > > > > To: Stefano Stabellini
> > > > > > Cc: George Dunlap; xudong.hao@xxxxxxxxx; Yanqiangjun; Luonengjun;
> > > > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard;
> > > > > > xen-devel@xxxxxxxxxxxxx; xiantao.zhang@xxxxxxxxx
> > > > > > Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured
> > > > > > with 4G memory
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > > From: Stefano Stabellini [mailto:stefano.stabellini@xxxxxxxxxxxxx]
> > > > > > > Sent: 18 March 2013 20:02
> > > > > > > To: Hanweidong
> > > > > > > Cc: George Dunlap; Stefano Stabellini; Yanqiangjun; Luonengjun;
> > > > > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard;
> > > > > > > xen-devel@xxxxxxxxxxxxx; xudong.hao@xxxxxxxxx; xiantao.zhang@xxxxxxxxx
> > > > > > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> > > > > > > with 4G memory
> > > > > >
> > > > > > On Wed, 13 Mar 2013, Hanweidong wrote:
> > > > > > > > The MMIO hole was adjusted to e0000000 - fc000000, but QEMU uses
> > > > > > > > the code below to initialize RAM in xen_ram_init:
> > > > > > >
> > > > > > >     ...
> > > > > > >     block_len = ram_size;
> > > > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > > > >         /* Xen does not allocate the memory continuously,
> and
> > > keep
> > > > > a
> > > > > > hole at
> > > > > > >          * HVM_BELOW_4G_MMIO_START of
> HVM_BELOW_4G_MMIO_LENGTH
> > > > > > >          */
> > > > > > >         block_len += HVM_BELOW_4G_MMIO_LENGTH;
> > > > > > >     }
> > > > > > >     memory_region_init_ram(&ram_memory, "xen.ram",
> block_len);
> > > > > > >     vmstate_register_ram_global(&ram_memory);
> > > > > > >
> > > > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > > > >         above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> > > > > > >         below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> > > > > > >     } else {
> > > > > > >         below_4g_mem_size = ram_size;
> > > > > > >     }
> > > > > > >     ...
> > > > > > >
> > > > > > > > HVM_BELOW_4G_RAM_END is f0000000. If we change
> > > > > > > > HVM_BELOW_4G_RAM_END to e0000000, which is consistent with
> > > > > > > > hvmloader when assigning a GPU, the guest works for us. So we
> > > > > > > > are wondering whether xen_ram_init in QEMU should be consistent
> > > > > > > > with hvmloader.
> > > > > > > >
> > > > > > > > In addition, we found that QEMU uses a hardcoded 0xe0000000 in
> > > > > > > > pc_init1(), as shown below. Should these places handle the MMIO
> > > > > > > > hole consistently or not?
> > > > > > >
> > > > > > >     if (ram_size >= 0xe0000000 ) {
> > > > > > >         above_4g_mem_size = ram_size - 0xe0000000;
> > > > > > >         below_4g_mem_size = 0xe0000000;
> > > > > > >     } else {
> > > > > > >         above_4g_mem_size = 0;
> > > > > > >         below_4g_mem_size = ram_size;
> > > > > > >     }
> > > > > >
> > > > > > > The guys at Intel sent a couple of patches recently to fix this issue:
> > > > > >
> > > > > > http://marc.info/?l=xen-devel&m=136150317011027
> > > > > > http://marc.info/?l=qemu-devel&m=136177475215360&w=2
> > > > > >
> > > > > > Do they solve your problem?
> > > > >
> > > > > These two patches didn't solve our problem.
> > > > >
> > > >
> > > > > I debugged this issue with the above two patches applied, and I
> > > > > want to share some information and discuss a solution here. The
> > > > > issue is actually caused by the VM having a large PCI hole (MMIO
> > > > > size), which results in QEMU setting up memory regions
> > > > > inconsistently with hvmloader (QEMU uses a hardcoded 0xe0000000 in
> > > > > pc_init1 and xen_ram_init). I created a virtual device with a 1GB
> > > > > MMIO size to debug this. First, QEMU set up the memory regions,
> > > > > except for the PCI hole region, in pc_init1() and xen_ram_init().
> > > > > Then hvmloader calculated pci_mem_start as 0x80000000 and wrote it
> > > > > to the TOM register, which triggered QEMU to update the PCI hole
> > > > > region to start at 0x80000000 using i440fx_update_pci_mem_hole().
> > > > > Finally, the Windows 7 VM (configured with 8G) crashed with BSOD
> > > > > code 0x00000024. If I hardcode the value in QEMU's pc_init1 and
> > > > > xen_ram_init to match hvmloader's, the problem goes away.
> > > > >
> > > > > Although the above two patches pass the actual PCI hole start
> > > > > address to QEMU, it is too late: QEMU's pc_init1() and
> > > > > xen_ram_init() have already set up the other memory regions, and
> > > > > the PCI hole might overlap with RAM regions in this case. So I
> > > > > think hvmloader should set up the PCI devices and calculate the PCI
> > > > > hole first, so that QEMU can map memory regions correctly from the
> > > > > beginning.
> > > >
> > >
> > > Thank you very much for your detailed analysis of the problem.
> > >
> > > > After reading this, I wonder how it is possible that
> > > > qemu-xen-traditional does not have this issue, considering that AFAIK
> > > > there is no way for hvmloader to tell qemu-xen-traditional where the
> > > > PCI hole starts.
> > > >
> > > > The only difference between upstream QEMU and qemu-xen-traditional is
> > > > that the former would start the PCI hole at 0xf0000000 while the
> > > > latter would start the PCI hole at 0xe0000000.
> > > >
> > > > So I would expect that your test, where hvmloader is updating the PCI
> > > > hole region to start at 0x80000000, would fail on
> > > > qemu-xen-traditional too.
> >
> > Yes, I think so.
> >
> > >
> > > > Of course having the PCI hole starting unconditionally at 0xf0000000
> > > > makes it much easier to run into problems than starting it at
> > > > 0xe0000000.
> > >
> > >
> > > Assuming that everything above is correct, this is what I would do:
> > >
> > > > 1) modify upstream QEMU to start the PCI hole at 0xe0000000, to match
> > > > qemu-xen-unstable in terms of configuration and not to introduce any
> > > > regressions. Do this for the Xen 4.3 release.
> >
> > It's a quick improvement before implementing a thorough solution.
> 
> Cool.
> Can you confirm that the following patch solves your original problem?
> 

Actually, I had already modified the code in the same way as your patch below.
It worked for me when I passed through one GPU whose MMIO size is about 200MB.

There is a hardcoded 0xe0000000 in pc_init1() in pc_piix.c. I suggest replacing
it with QEMU_XEN_BELOW_4G_RAM_END, since the memory layout calculation should be
consistent between xen_ram_init() and pc_init1().
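
As a rough illustration of that suggestion, both functions could derive the
split from the same constant. This is only a sketch and not the actual QEMU
code: the names `ram_split` and `split_ram_at_hole` are hypothetical, and only
the constant value comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the patch under discussion. */
#define QEMU_XEN_BELOW_4G_RAM_END 0xe0000000ULL

/* Hypothetical type: how guest RAM is divided around the PCI hole. */
struct ram_split {
    uint64_t below_4g;
    uint64_t above_4g;
};

/* Hypothetical helper: split ram_size at hole_start (must be below 4G). */
static struct ram_split split_ram_at_hole(uint64_t ram_size,
                                          uint64_t hole_start)
{
    struct ram_split s;

    assert(hole_start > 0 && hole_start < (1ULL << 32));
    if (ram_size >= hole_start) {
        /* RAM beyond the hole start is relocated above 4G. */
        s.below_4g = hole_start;
        s.above_4g = ram_size - hole_start;
    } else {
        s.below_4g = ram_size;
        s.above_4g = 0;
    }
    return s;
}
```

If both xen_ram_init() and pc_init1() used one such calculation with
QEMU_XEN_BELOW_4G_RAM_END, a 4G guest would get 0xe0000000 below 4G and
0x20000000 above 4G in both places.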

weidong
        
> 
> diff --git a/xen-all.c b/xen-all.c
> index daf43b9..259f862 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -35,6 +35,9 @@
>      do { } while (0)
>  #endif
> 
> +#define QEMU_XEN_BELOW_4G_RAM_END       0xe0000000
> +#define QEMU_XEN_BELOW_4G_MMIO_LENGTH   ((1ULL << 32) - QEMU_XEN_BELOW_4G_RAM_END)
> +
>  static MemoryRegion ram_memory, ram_640k, ram_lo, ram_hi;
>  static MemoryRegion *framebuffer;
>  static bool xen_in_migration;
> @@ -160,18 +163,18 @@ static void xen_ram_init(ram_addr_t ram_size)
>      ram_addr_t block_len;
> 
>      block_len = ram_size;
> -    if (ram_size >= HVM_BELOW_4G_RAM_END) {
> +    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
>          /* Xen does not allocate the memory continuously, and keep a hole at
> -         * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
> +         * QEMU_XEN_BELOW_4G_RAM_END of QEMU_XEN_BELOW_4G_MMIO_LENGTH
>           */
> -        block_len += HVM_BELOW_4G_MMIO_LENGTH;
> +        block_len += QEMU_XEN_BELOW_4G_MMIO_LENGTH;
>      }
>      memory_region_init_ram(&ram_memory, "xen.ram", block_len);
>      vmstate_register_ram_global(&ram_memory);
> 
> -    if (ram_size >= HVM_BELOW_4G_RAM_END) {
> -        above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> -        below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> +    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
> +        above_4g_mem_size = ram_size - QEMU_XEN_BELOW_4G_RAM_END;
> +        below_4g_mem_size = QEMU_XEN_BELOW_4G_RAM_END;
>      } else {
>          below_4g_mem_size = ram_size;
>      }
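
For reference, the arithmetic behind the patch can be sketched as below, along
with why a hole start chosen later by hvmloader can still collide with RAM that
QEMU has already mapped. This is an illustration only: `hole_ram_overlap` is a
hypothetical helper, and 0x80000000 is the pci_mem_start value reported for the
1GB test device earlier in this thread.

```c
#include <stdint.h>

/* Values taken from the patch above. */
#define QEMU_XEN_BELOW_4G_RAM_END     0xe0000000ULL
#define QEMU_XEN_BELOW_4G_MMIO_LENGTH ((1ULL << 32) - QEMU_XEN_BELOW_4G_RAM_END)

/*
 * Hypothetical helper: number of bytes of already-mapped low RAM that fall
 * inside a PCI hole starting at hole_start. Assumes the guest has at least
 * low_ram_end bytes of RAM, so low RAM really extends up to low_ram_end.
 */
static uint64_t hole_ram_overlap(uint64_t low_ram_end, uint64_t hole_start)
{
    return hole_start < low_ram_end ? low_ram_end - hole_start : 0;
}
```

With these values the hole is 0x20000000 bytes (512MB) long, which is enough
for a GPU with about 200MB of MMIO. A hole starting at 0x80000000 would overlap
0x60000000 bytes of RAM mapped below 0xe0000000.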
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel