
RE: [Xen-devel] [RFC] Physical hot-add cpus and TSC




>-----Original Message-----
>From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>Sent: Friday, May 28, 2010 10:35 PM
>To: Keir Fraser; Jiang, Yunhong; Xen-Devel (xen-devel@xxxxxxxxxxxxxxxxxxx); Ian
>Pratt
>Subject: RE: [Xen-devel] [RFC] Physical hot-add cpus and TSC
>
>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>> Sent: Friday, May 28, 2010 1:04 AM
>> To: Jiang, Yunhong; Dan Magenheimer; Xen-Devel (xen-
>> devel@xxxxxxxxxxxxxxxxxxx); Ian Pratt
>> Subject: Re: [Xen-devel] [RFC] Physical hot-add cpus and TSC
>>
>> On 28/05/2010 07:29, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>>
>> >> It is impossible to meet that level of TSC consistency when doing
>> CPU
>> >> physical-add, without emulating all guest TSCs. We may need to add
>> that as
>> >> an option, at least, to keep a small class of apps that care (like
>> Oracle's
>> >> DB, we assume) happy.
>> >
>> > So a option to make TSC_MODE_DEFAULT as d->arch.vtsc=0 ?.
>> > When CPU_hotadd, we should at least warning if that option is not
>> set, am I
>> > right?
>>
>> Xen-unstable:21469.
>
>Well, although it's better than nothing, it seems pretty
>lame to only put an advisory warning in xen's log about a
>condition that may possibly affect many guest OS's and
>applications with hard to identify symptoms/failures, and
>possibly randomly at some point in time that may be
>days/weeks/months after the event occurs.  Consider a cloud
>service provider for example.
>
>The advantage of turning hot-add-cpu off by default
>is that, if it is turned on at boot-time, TSC emulation
>can always be enabled for all guests at guest boot
>and the condition never arises.

Hi, Dan, considering that hot-add-cpu is not a frequent scenario, IMO it
may happen only in special situations that can't be decided in advance.
That is, the user has a system with CPU hot-add capability, but is not sure
when/whether a CPU hot-add will really happen. This means:
1) If enabling this feature always causes TSC emulation, it may not be worth
it considering the low probability.
2) If hot-add-cpu is disabled by default, the user has to reboot the system to
enable the feature, which makes CPU hot-add meaningless. If users need to
reboot the system, they don't need hot-plug at all; they can just power off the
system and add the CPU :)

One key point is that currently a CPU hot-add will not happen automatically.
The steps of a CPU hot-add are:
a) A CPU is hot-added to the system, and the OS kernel is notified by the ACPI
driver.
b) The OS kernel creates the sysfs file for this new CPU under /sys/, but marks
this CPU as offline, since the CPU has not been added to Xen; in fact, Xen has
no idea of this CPU at all.
c) A uevent is sent to user space for the newly added device.
d) The uevent script needs to "echo 1 >
/sys/device/system/xen_pcpu/xen_pcpuXXX/online"; this store operation
triggers a hypercall, and the CPU is finally brought up.

So my suggestion is that, between steps c and d, the user space script can do
more work before really bringing up the CPU. For example, it can check whether
any special guest/application exists that requires a strict TSC sequence, or
whether Xen was booted with the tsc_skew option. Or, at worst, it can simply
not notify Xen to bring up the CPU at all. I think this is more flexible, and
is also reasonable. And this can easily be done by an OSV release (like OVM).
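
To make the idea concrete, here is a rough sketch of the check such a uevent
script could run between steps c and d. It is only an illustration under
assumptions: the Guest record, its needs_strict_tsc and tsc_emulated flags, and
the allow_tsc_skew switch are all hypothetical names, not existing Xen or dom0
interfaces, and the sysfs path is passed in by the caller rather than
hard-coded.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Guest:
    """Hypothetical per-guest record maintained by the admin tools."""
    name: str
    needs_strict_tsc: bool  # assumption: admin marks TSC-sensitive guests
    tsc_emulated: bool      # assumption: True if the guest's TSC is emulated


def may_online(guests, allow_tsc_skew):
    """Decide whether bringing up the hot-added CPU is acceptable.

    allow_tsc_skew is a hypothetical switch meaning the administrator has
    explicitly accepted possible TSC skew.
    """
    if allow_tsc_skew:
        return True
    # Refuse if any guest needs a strict TSC but is not running emulated.
    return not any(g.needs_strict_tsc and not g.tsc_emulated for g in guests)


def handle_cpu_add(online_path, guests, allow_tsc_skew):
    """Step d, guarded by the policy: write "1" to the CPU's online file.

    Returns True if the CPU was onlined, False if the policy refused and
    Xen was therefore never notified.
    """
    if not may_online(guests, allow_tsc_skew):
        return False
    Path(online_path).write_text("1\n")
    return True
```

With this shape, the worst case described above (never notifying Xen) is just
the path where handle_cpu_add returns False without touching the sysfs file.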

>
>Are there any other questionable conditions that might
>arise from hot-adding physical CPUs?  For example (my
>favorite), are any order>0 allocations required?  Or

I don't remember any order>0 allocation; I will check when I'm back in the office.

>what if the hot-added cpu results in mixed generations
>(e.g. a Nehalem is added to an all-Westmere system,
>where the apps are using AES instructions)?  Anything
>else?

What happens if the system boots with mixed generations? For example, when an
AP is found not to support AES, will the BSP disable AES?

>
>In other words, maybe it would be nice to be able
>to rule out other special dynamic checks for hot-add
>cpus that aren't done for simultaneously-reset cpus?
>Requiring a boot option to allow hot-add physical CPUs
>might make a future nasty support problem a lot easier.

I think a good uevent script will resolve the issue.
>
>> "Undetectable" by Dan's definition means undetectable by
>> a multi-threaded app on a multi-vcpu guest. Any detected
>> warp would therefore be a problem.
>
>This is actually Linux's definition, a requirement
>for selecting tsc as Linux's default clocksource,
>and measured by the same algorithm in Xen and Linux.
>
>Linux is a bit more flexible than apps in that, if
>Linux detects a problem, it can fallback from using
>tsc as the clocksource to some other clocksource.
>But it remains to be seen how well this will work
>in a virtual environment, where there are a number
>of conditions that a bare-metal OS can detect
>that a virtualized guest OS (or an app running
>on a physical or virtualized OS) cannot.
>
>But to summarize, IMHO, correctness comes first,
>performance second, and functionality that might
>be needed on only a small fraction of systems
>comes third.  I think enterprise customers dependent
>on Xen would agree.

Agreed that correctness is most important. What I suggest is to let
dom0/administrator tools guard correctness, rather than the hypervisor, to
keep the flexibility. Any ideas?

Thanks
--jyh


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

