
Re: [Xen-devel] [for-4.9] Re: HVM guest performance regression



On 26/05/17 21:01, Stefano Stabellini wrote:
> On Fri, 26 May 2017, Juergen Gross wrote:
>> On 26/05/17 18:19, Ian Jackson wrote:
>>> Juergen Gross writes ("HVM guest performance regression"):
>>>> Looking for the reason of a performance regression of HVM guests under
>>>> Xen 4.7 against 4.5 I found the reason to be commit
>>>> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove freemem_slack")
>>>> in Xen 4.6.
>>>>
>>>> The problem occurred when dom0 had to be ballooned down in order to
>>>> start the guest. The performance of some micro benchmarks dropped by
>>>> about a factor of 2 with the above commit.
>>>>
>>>> An interesting point is that the performance of the guest depends on
>>>> the amount of free memory available at guest creation time. When
>>>> there is barely enough memory available to start the guest, the
>>>> performance remains low even if memory is freed later.
>>>>
>>>> I'd like to suggest we either revert the commit or add some other
>>>> mechanism to keep some free memory in reserve when starting a
>>>> domain.
>>>
>>> Oh, dear.  The memory accounting swamp again.  Clearly we are not
>>> going to drain that swamp now, but I don't like regressions.
>>>
>>> I am not opposed to reverting that commit.  I was a bit iffy about it
>>> at the time; and according to the removal commit message, it was
>>> basically removed because it was a piece of cargo cult for which we
>>> had no justification in any of our records.
>>>
>>> Indeed I think fixing this is a candidate for 4.9.
>>>
>>> Do you know the mechanism by which the freemem slack helps?  I think
>>> that would be a prerequisite for reverting this.  That way we can have
>>> an understanding of why we are doing things, rather than just
>>> flailing at random...
>>
>> I wish I understood it.
>>
>> One candidate would be 2M/1G pages being possible with enough free
>> memory, but I haven't proved this yet. I can give it a try by
>> disabling big pages in the hypervisor.
> 
> Right, if I had to bet, I would put my money on superpage shattering
> being the cause of the problem.

Seems you would have lost your money...
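
For reference, this is what the removed slack actually did (why it
helps is exactly the open question): as far as I understand it,
xl/libxl ballooned dom0 down a bit further than strictly necessary, so
that some extra host memory stayed free beyond the new guest's
requirement. A minimal sketch of the idea only -- invented names and an
arbitrary 5% slack, not the libxl code removed by c26f92b8fce3:

#include <stdint.h>
#include <stdio.h>

/* Sketch of the "slack" idea only; invented names and an arbitrary
 * 5% fraction -- not the code removed by c26f92b8fce3. */
static uint64_t dom0_target_kb(uint64_t host_total_kb,
                               uint64_t dom0_cur_kb,
                               uint64_t free_kb,
                               uint64_t guest_need_kb)
{
    uint64_t slack_kb = host_total_kb / 20;        /* keep ~5% free */
    uint64_t want_kb  = guest_need_kb + slack_kb;  /* guest + reserve */
    uint64_t short_kb = want_kb > free_kb ? want_kb - free_kb : 0;

    /* Balloon dom0 down just far enough to cover the shortfall. */
    return short_kb < dom0_cur_kb ? dom0_cur_kb - short_kb : 0;
}

int main(void)
{
    /* Example: 32G host, dom0 currently at 28G, 2G free, guest
     * wants 4G (all values in kB). */
    printf("new dom0 target: %llu kB\n",
           (unsigned long long)dom0_target_kb(32u << 20, 28u << 20,
                                              2u << 20, 4u << 20));
    return 0;
}

Without the slack, dom0 is only ballooned down far enough to leave
exactly the guest's requirement free, i.e. the "barely enough memory"
case described above where the bad performance shows up.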

Meanwhile I've found a way to get the "good" performance in the micro
benchmark. Unfortunately this requires switching off the PV interfaces
in the HVM guest via the "xen_nopv" kernel boot parameter.
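
In case someone wants to reproduce this: the parameter goes on the
guest kernel's command line. A trivial check that it actually made it
into the booted kernel (nothing Xen-specific, just a convenience sketch
reading /proc/cmdline):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[4096] = "";
    FILE *f = fopen("/proc/cmdline", "r");

    if (!f) {
        perror("/proc/cmdline");
        return 1;
    }
    if (fgets(buf, sizeof(buf), f))
        /* Note: this also matches xen_nopvspin; good enough for a
         * quick check. */
        printf("xen_nopv %s\n",
               strstr(buf, "xen_nopv") ? "present" : "absent");
    fclose(f);
    return 0;
}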

I have verified that PV spinlocks are not to blame (via the
"xen_nopvspin" kernel boot parameter). Switching to the TSC clocksource
in the running system doesn't help either.
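
For completeness, switching the clocksource at runtime is typically
done via the usual sysfs node (run as root inside the guest;
available_clocksource in the same directory lists what the guest
offers), roughly like this:

#include <stdio.h>

int main(void)
{
    /* Equivalent to: echo tsc > .../current_clocksource */
    const char *path = "/sys/devices/system/clocksource/"
                       "clocksource0/current_clocksource";
    FILE *f = fopen(path, "w");

    if (!f) {
        perror(path);
        return 1;
    }
    fputs("tsc\n", f);
    fclose(f);
    return 0;
}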

Unfortunately the kernel no longer seems to be functional when I try to
tweak it not to use the PVHVM enhancements. I'm now wondering whether
there have ever been any benchmarks proving that PVHVM really is faster
than non-PVHVM. My findings seem to suggest there might be a huge
performance gap with PVHVM. OTOH this might depend on hardware and
other factors.

Stefano, didn't you do the PVHVM stuff back in 2010? Do you have any
performance figures from back then?


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

