
RE: [Xen-devel] [PATCH] fix memory allocation from NUMA node for VT-d.



>We can get lower latency by using the proximity domain info in the
>DMAR, but that needs more changes than my patch, so I will not work
>on it myself. If anyone develops this function, I will not be against it.

Proximity domain info support is on our to-do list.  We are planning to 
implement it in Q1 of next year.  It is a little more change than your patch, 
but I don't think it is that much more: we just need to parse the proximity 
domain info in ACPI and then use the node id when allocating memory for the 
root, context and page tables.
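
For reference, here is a rough sketch of what that could look like (the
helper names, struct fields and RHSA list below are illustrative
assumptions, not the actual Xen code): parse the DMAR RHSA entries to map
each DRHD unit to a proximity domain, convert that to a node id, and pass
the node to the heap allocator when the VT-d root, context and page-table
pages are allocated.

    /* Illustrative sketch only -- acpi_rhsa_units, rhsa_to_node() and
     * alloc_vtd_table_page() are placeholder names. */
    static nodeid_t rhsa_to_node(const struct acpi_drhd_unit *drhd)
    {
        struct acpi_rhsa_unit *rhsa;

        /* An RHSA entry ties a DRHD register base address to an ACPI
         * proximity domain; translate that to a NUMA node id. */
        list_for_each_entry ( rhsa, &acpi_rhsa_units, list )
            if ( rhsa->address == drhd->address )
                return pxm_to_node(rhsa->proximity_domain);

        return NUMA_NO_NODE;   /* no affinity info for this IOMMU */
    }

    static void *alloc_vtd_table_page(const struct acpi_drhd_unit *drhd)
    {
        nodeid_t node = rhsa_to_node(drhd);
        struct page_info *pg;

        /* Prefer the IOMMU's node; the node hint is only a preference,
         * so allocation can still fall back to other nodes. */
        pg = alloc_domheap_pages(NULL, 0,
                                 (node == NUMA_NO_NODE) ? 0
                                                        : MEMF_node(node));
        if ( pg == NULL )
            return NULL;

        clear_page(page_to_virt(pg));
        return page_to_virt(pg);
    }

On machines whose DMAR has no RHSA entries the fallback keeps the
behaviour the same as today, so the change should be safe to apply
unconditionally.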

Allen

>-----Original Message-----
>From: Yuji Shimada [mailto:shimada-yxb@xxxxxxxxxxxxxxx] 
>Sent: Wednesday, November 26, 2008 12:32 AM
>To: Espen Skoglund; Kay, Allen M
>Cc: 'Keir Fraser'; xen-devel@xxxxxxxxxxxxxxxxxxx
>Subject: Re: [Xen-devel] [PATCH] fix memory allocation from 
>NUMA node for VT-d.
>
>Hi, Espen & Kay,
>
>> Are you assuming the guest will be pinned to a physical CPU?
>
>Yes.
>On Xen, memory is automatically allocated from the same NUMA node as
>the physical CPU the guest is created on, and it is not moved after
>the initial allocation. But if we don't pin a guest to physical CPUs,
>the guest can run on any physical CPU. So on a NUMA machine it is
>better to pin the guest to physical CPUs, because that keeps the
>memory latency lower.
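
For example (node and CPU numbers here are only illustrative), if the
guest's memory lives on node 0 and node 0 owns physical CPUs 0-3, the
guest can be kept on that node either in its config file or at run time:

    # in the domain config file
    cpus = "0-3"

    # or for a running guest
    xm vcpu-pin <domain> all 0-3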
>
>> How does the user figure out which devices are closer to which
>> physical CPU in the platform in a QPI system without using proximity
>> domain info?
>
>The user can read the machine's specification to find out which NUMA
>node each I/O device is attached to, and then assign a device from the
>same node as the guest's CPUs.
>
>> By at least keeping the IOMMU page tables local to the node you'll
>> get lower latencies for the page table walker.
>
>We can get lower latency by using the proximity domain info in the
>DMAR, but that needs more changes than my patch, so I will not work
>on it myself. If anyone develops this function, I will not be against it.
>
>Thanks,
>--
>Yuji Shimada
>
>
>On Thu, 20 Nov 2008 20:00:08 +0000
>Espen Skoglund <espen.skoglund@xxxxxxxxxxxxx> wrote:
>
>> You only require more memory if you duplicate the structures per
>> IOMMU.  While this is indeed possible (and may even be the desired
>> solution) it is not what I suggested.
>>
>> And you're making the assumption here that the guest is assigned to
>> the node of the IOMMU.  As Allen points out, how does a user make
>> this decision?  And in many cases I would expect that you would not
>> want to assign many guests to the same node anyway.  By at least
>> keeping the IOMMU page tables local to the node you'll get lower
>> latencies for the page table walker.
>>
>>       eSk
>>
>
>On Wed, 19 Nov 2008 10:57:10 -0800
>"Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
>
>>
>> >Xen's user will assign a device to a guest that is close to it, so
>> >the node of the guest and the node connected to the IOMMU will be
>> >the same. As a result, memory performance will be improved with my
>> >patch.
>>
>> Are you assuming the guest will be pinned to a physical CPU?  How
>> does the user figure out which devices are closer to which physical
>> CPU in the platform in a QPI system without using proximity domain
>> info?
>>
>> Allen
>>
>> >> -----Original Message-----
>> >> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> >> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
>> >> Yuji Shimada
>> >> Sent: Wednesday, November 19, 2008 12:26 AM
>> >> To: Espen Skoglund
>> >> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; 'Keir Fraser'
>> >> Subject: Re: [Xen-devel] [PATCH] fix memory allocation from
>> >> NUMA node for VT-d.
>> >>
>> >> Hi Espen,
>> >>
>> >> Your suggestion of allocating memory from one of the nodes where
>> >> the IOMMU is attached improves performance more, but it needs more
>> >> memory, because the structures are needed per IOMMU.
>> >>
>> >> My patch keeps the current implementation, one Device Assignment
>> >> Structure and Address Translation Structure per guest.
>> >>
>> >> Xen's user will assign a device to a guest that is close to it, so
>> >> the node of the guest and the node connected to the IOMMU will be
>> >> the same. As a result, memory performance will be improved with
>> >> my patch.
>> >>
>> >> Thanks,
>> >> --
>> >> Yuji Shimada
>> >>
>> >> On Tue, 18 Nov 2008 12:00:37 +0000
>> >> Espen Skoglund <espen.skoglund@xxxxxxxxxxxxx> wrote:
>> >>
>> >>> Given an FSB based system the IOMMUs sit in the north-bridge.
>> >>> How does this work with QPI?  Where in the system do the
>> >>> different IOMMUs sit?  Wouldn't it make more sense to allocate
>> >>> memory from one of the nodes where the IOMMU is attached?
>> >>> Having the memory allocated from the node of the guest only
>> >>> helps when the guest needs to update its page tables.  I'd
>> >>> rather optimize for page table walks in the IOMMU.
>> >>>
>> >>>   eSk
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

