
RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR



Dan Magenheimer wrote:
> Well, although it might be nice to be able to use
> rdtscp and TSC_AUX to determine pcpu/vcpu/pnode/vnode
> information, I think Jeremy and Jan convinced me in
> another thread a couple of months ago that in userland:
> 
> x = vgetcpu()
> do_other_stuff();
> y = vgetcpu()
> 
> if x==1 and y==2, there's no way to determine that
> do_other_stuff() was executed on cpu 1 vs cpu 2,
> or (though unlikely) even on cpu 3.  And if
> x==y==4, there's  no guarantee that do_other_stuff()
> is executed on cpu 4.
> 
> If this is true the only safe use of TSC_AUX is for
> its originally designed intent: To determine if two
> successive rdtscp instructions were or were not
> executed on the same processor.  Since this cannot
> be guaranteed in a VM, that's a reasonable argument
> that TSC_AUX shouldn't be exposed at all (meaning the
> rdtscp bit in cpuid should be turned off by Xen).

Why do you think this is the design intent of this instruction?

For guest NUMA support, each vcpu of a VM must be pinned to the logical
processors that belong to one specific node (that is, vcpu migration between
nodes is disabled); otherwise, I think virtual NUMA may suffer a performance
loss. For example, take a NUMA system with two nodes, each with 4 GB of memory
and 8 logical processors. If we create a VM with 2 GB of memory and 4 vcpus on
this system, Xen may allocate 1 GB of memory from physical node 0 and another
1 GB from physical node 1. If we virtualize NUMA for this VM, vcpu0 and vcpu1
can be assigned to virtual node 0, and vcpu2 and vcpu3 to virtual node 1. We
can then safely pin vcpu0 and vcpu1 to physical node 0's 8 logical processors,
and accordingly pin vcpu2 and vcpu3 to physical node 1's 8 logical processors.
Since TSC_AUX is virtualized per vcpu, and its value is saved and restored
whenever the vcpu migrates, an application that always runs on one virtual
processor should get a fixed value when it calls vgetcpu, even if that vcpu
often migrates among the logical processors of one node.
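
To make the layout above concrete, here is a minimal sketch in C. It is not
Xen code: all names are hypothetical, it assumes virtual node N is backed by
physical node N, that physical node N owns logical processors 8N..8N+7, and
that the guest encodes TSC_AUX the way Linux does, as (node << 12) | cpu.

```c
#include <stdint.h>

#define VCPUS_PER_VNODE 2  /* vcpu0,vcpu1 -> vnode 0; vcpu2,vcpu3 -> vnode 1 */
#define PCPUS_PER_PNODE 8  /* 8 logical processors per physical node */

/* Virtual node that a vcpu belongs to. */
static unsigned int vcpu_to_vnode(unsigned int vcpu)
{
    return vcpu / VCPUS_PER_VNODE;
}

/*
 * Bitmask of logical processors the vcpu is pinned to.
 * Assumption: vnode N maps to pnode N, which owns pcpus 8N..8N+7.
 */
static uint32_t vcpu_affinity_mask(unsigned int vcpu)
{
    unsigned int pnode = vcpu_to_vnode(vcpu);
    return (uint32_t)0xffu << (pnode * PCPUS_PER_PNODE);
}

/* Nonzero if the scheduler may place this vcpu on this logical processor. */
static int may_run_on(unsigned int vcpu, unsigned int pcpu)
{
    return (vcpu_affinity_mask(vcpu) >> pcpu) & 1u;
}

/*
 * Per-vcpu virtual TSC_AUX value (assumed Linux-style encoding).
 * It depends only on the vcpu, never on the pcpu it currently runs on.
 */
static uint32_t virtual_tsc_aux(unsigned int vcpu)
{
    return ((uint32_t)vcpu_to_vnode(vcpu) << 12) | vcpu;
}
```

Because virtual_tsc_aux() takes no pcpu argument, moving vcpu2 between logical
processors 8 and 15 leaves the value the guest reads unchanged, which is the
point of the example.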

Back to this topic: in short, we can't mix the guest's virtual TSC_AUX with
the host's TSC_AUX. When we switch to an HVM vcpu's context, we load that
vcpu's virtual TSC_AUX value into the physical TSC_AUX MSR, and when the vcpu
is scheduled out, the host's TSC_AUX value (which may be used for PV guests)
is loaded back.
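
The switching rule can be sketched as a small user-space model in C. This is
only an illustration, not the actual Xen patch: the struct and function names
are hypothetical, and the physical MSR is modelled as a plain struct field
instead of being accessed with rdmsr/wrmsr.

```c
#include <stdint.h>

/* Model of one logical processor. */
struct pcpu {
    uint32_t msr_tsc_aux;   /* stands in for the physical TSC_AUX MSR */
    uint32_t host_tsc_aux;  /* value the host (and PV guests) expect */
};

/* Model of the per-vcpu state kept for an HVM guest. */
struct hvm_vcpu {
    uint32_t guest_tsc_aux; /* the vcpu's virtual TSC_AUX */
};

/* Scheduling the vcpu in: expose the guest's value on this pcpu. */
static void hvm_switch_to(struct pcpu *p, const struct hvm_vcpu *v)
{
    p->msr_tsc_aux = v->guest_tsc_aux;
}

/*
 * Scheduling the vcpu out: save whatever the guest left in the MSR
 * (it may have written a new value), then restore the host's value.
 */
static void hvm_switch_from(struct pcpu *p, struct hvm_vcpu *v)
{
    v->guest_tsc_aux = p->msr_tsc_aux;
    p->msr_tsc_aux = p->host_tsc_aux;
}

/* What rdtscp would return in its aux register on this pcpu right now. */
static uint32_t read_tsc_aux(const struct pcpu *p)
{
    return p->msr_tsc_aux;
}
```

In this model the guest's value follows the vcpu from one logical processor
to another, and the host's value is visible again as soon as the vcpu is
descheduled, so the two never mix.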


> True, as long as the information is ONLY used
> heuristically to obtain pcpu/vcpu/pnode/vnode info,
> and no guarantee of correctness is implied or expected,
> it might be useful some of the time.
> 
> But frankly, if "performance sucks" when the heuristic
> fails due to the fact that the app is running on
> a VM instead of native OS, I'd see that as a problem
> and suggest the proper way to fix that is to define
> more App-to-Xen ABIs so that the app can get the
> real information, not a heuristic.  Which also argues
> for Xen leaving the rdtscp bit in cpuid turned off.
> 
> Dan
> 
>> -----Original Message-----
>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
>> Sent: Friday, December 11, 2009 12:30 PM
>> To: Jeremy Fitzhardinge; Dan Magenheimer
>> Cc: Keir Fraser; Zhang, Xiantao; Xu, Dongxiao;
>> xen-devel@xxxxxxxxxxxxxxxxxxx; Dugger, Donald D
>> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
>> 
>> 
>> Jeremy Fitzhardinge wrote on Fri, 11 Dec 2009 at 10:50:29:
>> 
>>> On 12/11/09 10:35, Dan Magenheimer wrote:
>>>>> However, the vcpu number is definitely useful to usermode apps,
>>>>> so they can get some idea how they're moved between (v)cpus.  I
>>>>> don't think it will matter to them that it isn't pcpu.
>>>>> 
>>>> My point is that an app running on native Linux can
>>>> safely assume that, if TSC_AUX==3 at time T1 and
>>>> TSC_AUX is still 3 at time T2, it is running
>>>> on the same processor and the same node at both T1
>>>> and T2.  In a virtual environment it cannot even
>>>> assume it is running on the same machine.
>>>> Further if the app sees that TSC_AUX==2 at time T3
>>>> and TSC_AUX==3 at time T4, on native Linux it
>>>> can safely assume that it is running on a different
>>>> processor.  While rarer, in a virtual environment,
>>>> this may also be a false assumption.
>>>> 
>>>> That's why I say the information is misleading.
>>>> 
>>>  Sure, but that info is, at best, of heuristic value, and won't
>>> cause any correctness problems if it is wrong.  The performance may
>>> suck, but that's part of the larger problem of running NUMA-aware
>>> code in a virtual environment. 
>>> 
>> 
>> And to utilize various NUMA optimizations in the kernel/apps
>> in the guest, we need "the virtual numa info bears some vague
>> resemblance to the real topology" (from Jeremy's email) with
>> the vcpus bound to the CPU/node.
>> 
>> I understand that enabling RDTSCP in HVM will disable the
>> pvrdtscp algorithm if used by the kernel. One way is to mask
>> off the feature in CPUID (by default). Then kernel won't use it.
>> 
>> Jun
>> ___
>> Intel Open Source Technology Center




 

