
Re: [Xen-devel] HVMlite gains



On 16/03/16 03:18, Andy Lutomirski wrote:
> On Mar 15, 2016 3:05 PM, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 15/03/2016 21:52, Andy Lutomirski wrote:
>>> On Tue, Mar 15, 2016 at 2:50 PM, Andrew Cooper
>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> On 15/03/2016 21:36, Andy Lutomirski wrote:
>>>>>>>   e) Can timing use RDTSC?
>>>>>> I don't understand this question in the context of the others.  RDTSC
>>>>>> has (as far as I can tell) always been advertised and available for
>>>>>> guest use.  RDTSCP is a different matter, and I have half-fixed that
>>>>>> brokenness; it should now work correctly in HVM guests.
>>>>>>
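For concreteness, here is a minimal userspace sketch of the two
instructions in question, using the GCC/Clang intrinsics from
x86intrin.h and assuming the guest is advertised both RDTSC and RDTSCP:

    /* Sketch: read the TSC, and (where RDTSCP is advertised) the TSC
     * together with IA32_TSC_AUX.  Assumes a GCC/Clang x86 toolchain. */
    #include <stdio.h>
    #include <stdint.h>
    #include <x86intrin.h>

    int main(void)
    {
        uint64_t t1 = __rdtsc();        /* RDTSC */

        unsigned int aux;               /* receives IA32_TSC_AUX */
        uint64_t t2 = __rdtscp(&aux);   /* RDTSCP: #UDs if not advertised */

        printf("rdtsc=%llu rdtscp=%llu aux=%u\n",
               (unsigned long long)t1, (unsigned long long)t2, aux);
        return 0;
    }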
>>>>> These questions mostly came from me, and they weren't necessarily
>>>>> intended to make sense as a coherent whole :)  They were more of a
>>>>> random collection of things I was wondering about to varying extents.
>>>>>
>>>>> What I mean is:  if we point sched_clock at RDTSC and try to use the
>>>>> regular TSC timesource in a guest, will it work reasonably well,
>>>>> assuming that the underlying hardware supports it?  And, if the
>>>>> underlying hardware doesn't support it (e.g. not constant / invariant
>>>>> or no TSC offsetting available or similar), will the hypervisor tell
>>>>> the guest this fact via CPUID so that the standard guest clocksource
>>>>> code doesn't try to use a non-working TSC?
>>>> In principle yes, but it is rather more complicated than that.
>>>>
>>>> By default, if you want a guest to be migrateable and you can't
>>>> guarantee that you will have hardware TSC scaling support on every
>>>> future destination, you cannot advertise the TSC as stable to the
>>>> guest.  We err on the side of caution and don't advertise invariance by
>>>> default.
>>>>
>>>> In practice, if you are running on anything vaguely modern, the TSC will
>>>> be reliable between migrates.
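For reference, the bit being advertised (or withheld) here is the
architectural invariant-TSC flag, CPUID leaf 0x80000007, EDX bit 8.  A
guest-side check is only a few lines; this is a sketch using the
GCC/Clang cpuid.h helper:

    /* Sketch: does the (virtual) CPU advertise an invariant TSC?
     * CPUID leaf 0x80000007, EDX bit 8 is the "Invariant TSC" flag. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <cpuid.h>

    static bool tsc_invariant(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* __get_cpuid() returns 0 if the extended leaf is unimplemented. */
        if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
            return false;

        return edx & (1u << 8);
    }

    int main(void)
    {
        printf("invariant TSC %sadvertised\n",
               tsc_invariant() ? "" : "not ");
        return 0;
    }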
>>> By "reliable" do you mean monotonic and not horribly jumpy?  I thought
>>> there was no shipping hardware with TSC scaling.
>> AMD have had TSC scaling for a long time (code added to Xen in 2011).
>> Intel are the ones late to the party in this case.
>>
>> There was a patch series from Joao around Christmas, "x86/time:
>> PVCLOCK_TSC_STABLE_BIT support", which identified several bugs with
>> Xen's TSC handling as visible through pvclock.  It would be nice to
>> get those bugs fixed.
>>
>>>> What the migration protocol currently lacks is a mechanism to identify
>>>> "This VM was advertised invariant TSC at frequency $X when it was
>>>> booted".  There is nominally a "no migrate" flag which can be set, at
>>>> which point invariance will be advertised if the host is capable.
>>>> However, there is no way for the toolstack to query this, so nothing in
>>>> the migrate code checks or acts upon it.
>>>>
>>>> Windows have worked around this limitation with the Viridian spec,
>>>> whereby the hypervisor can provide the current TSC frequency, and
>>>> promises that it won't change until the next suspend/resume, at which
>>>> point the frequency will be resampled.
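For reference, the Viridian enlightenment in question is a synthetic
MSR, HV_X64_MSR_TSC_FREQUENCY (0x40000022), which reports the current
TSC frequency.  A kernel-side read is roughly the following sketch
(Linux style; it assumes the guest has already confirmed the MSR's
availability from the Viridian feature CPUID leaves):

    /* Sketch (Linux kernel style): read the TSC frequency advertised by
     * a Viridian-enlightened hypervisor.  The MSR number is defined here
     * for self-containment; the kernel has its own definition. */
    #include <linux/types.h>
    #include <asm/msr.h>

    #define HV_X64_MSR_TSC_FREQUENCY 0x40000022

    static unsigned long viridian_tsc_khz(void)
    {
        u64 tsc_hz;

        rdmsrl(HV_X64_MSR_TSC_FREQUENCY, tsc_hz);   /* reported in Hz */
        return tsc_hz / 1000;                       /* kHz, as tsc_khz uses */
    }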
>>>>
>>> That's simpler and maybe even better than the pvclock design, at least
>>> as implemented by KVM.  Sigh.
>> Updates to that also need fixing.  PVCLOCK is a Xen ABI which was
>> borrowed by KVM then locally modified.
>>
>> I believe the two are still compatible.
>>
>> But yes - the Viridian way does appear substantially more sane.
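For anyone following along, the shared ABI here is a per-vCPU time
record plus a version-based retry loop.  Below is a sketch of the layout
and the guest-side read; field names are roughly as in the Xen/KVM
headers, the flags byte is where PVCLOCK_TSC_STABLE_BIT lives, and real
implementations add read barriers around the version checks:

    /* Sketch of the pvclock per-vCPU time record shared by Xen and KVM.
     * system_time is the host's nanosecond count sampled at tsc_timestamp;
     * the guest extrapolates with a 32.32 fixed-point multiplier. */
    #include <stdint.h>

    struct pvclock_vcpu_time_info {
        uint32_t version;           /* odd while the hypervisor is updating */
        uint32_t pad0;
        uint64_t tsc_timestamp;     /* TSC value at the time of the update */
        uint64_t system_time;       /* nanoseconds at tsc_timestamp */
        uint32_t tsc_to_system_mul; /* 32.32 fixed-point TSC->ns multiplier */
        int8_t   tsc_shift;         /* applied to the TSC delta first */
        uint8_t  flags;             /* e.g. PVCLOCK_TSC_STABLE_BIT */
        uint8_t  pad[2];
    };

    /* Lockless read: retry if the record changed (or was odd) mid-read. */
    static uint64_t pvclock_read_ns(const volatile struct pvclock_vcpu_time_info *t,
                                    uint64_t tsc)
    {
        uint32_t ver;
        uint64_t delta, ns;

        do {
            ver   = t->version;
            delta = tsc - t->tsc_timestamp;
            if (t->tsc_shift >= 0)
                delta <<= t->tsc_shift;
            else
                delta >>= -t->tsc_shift;
            ns = t->system_time +
                 (uint64_t)(((unsigned __int128)delta * t->tsc_to_system_mul) >> 32);
        } while ((ver & 1) || ver != t->version);

        return ns;
    }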
>>
> Hmm. Is migration synchronous enough for that approach to be reliable?
>  That is, when the TSC frequency changes, is there some way that the
> guest is guaranteed to be notified before it starts screwing up its
> timing calculations?

For VMs which do not have any Xen PV drivers, it is possible to migrate
them without any cooperation at all.  In this case, there is no
practical way to indicate that the TSC frequency has changed.

For VMs which do have PV drivers, migration requires guest cooperation,
or memory corruption will occur (pre-existing shared mappings can't have
writes tracked on them, so the guest driver is required to replay its
control ring on the destination side).  In this case, the guest always
passes through xen_suspend().  HYPERVISOR_suspend() is called on the
source side and returns on the destination.
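
In pseudo-C, that guest-side flow has roughly the following shape.
HYPERVISOR_suspend() is the real hypercall wrapper; the other helpers
below are illustrative placeholders, not Linux function names:

    /*
     * Pseudo-C sketch of the cooperative suspend path described above.
     * Only HYPERVISOR_suspend() is a real name; the remaining helpers
     * stand in for what the PV drivers and time code must do.
     */
    static void xen_suspend_sketch(unsigned long start_info_gfn)
    {
        quiesce_pv_frontends();         /* stop queueing new ring requests */

        /*
         * Returns non-zero if the suspend was cancelled on the source;
         * otherwise it returns 0 once the guest is running on the
         * destination host.
         */
        if (HYPERVISOR_suspend(start_info_gfn) == 0) {
            /* Running on the destination: old shared state is stale. */
            remap_shared_info_and_grant_frames();
            replay_pv_frontend_rings(); /* re-issue in-flight requests */
            resample_time_sources();    /* e.g. re-read pvclock / TSC freq */
        }

        resume_pv_frontends();
    }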

~Andrew
