
Re: [Xen-devel] [PATCH] Always save/restore performance counters when HVM guest switching VCPU



On Monday, 11 March 2013, 10:53:49, Konrad Rzeszutek Wilk wrote:
> On Mon, Mar 11, 2013 at 11:11:02AM +0000, George Dunlap wrote:
> > On 08/03/13 15:11, Boris Ostrovsky wrote:
> > >----- george.dunlap@xxxxxxxxxxxxx wrote:
> > >
> > >>On 08/03/13 14:50, Boris Ostrovsky wrote:
> > >>>----- JBeulich@xxxxxxxx wrote:
> > >>>
> > >>>>>>>On 04.03.13 at 13:42, George Dunlap
> > >><George.Dunlap@xxxxxxxxxxxxx>
> > >>>>wrote:
> > >>>>>On Fri, Mar 1, 2013 at 8:49 PM,  <suravee.suthikulpanit@xxxxxxx>
> > >>>>wrote:
> > >>>>>>From: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
> > >>>>>>
> > >>>>>>Currently, the performance counter registers are saved/restored
> > >>>>>>when the HVM guest switches VCPUs only if they are running.
> > >>>>>>However, PERF has one check where it writes the MSR and reads
> > >>>>>>back the value to check whether the MSR is working.  This check
> > >>>>>>has been seen to fail if the VCPU is moved in between the rdmsr
> > >>>>>>and the wrmsr, resulting in the two values being different.
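
For reference, the perf check described above is roughly of the following
shape; this is a minimal sketch, not the actual Linux code, and the counter
MSR, test value and error codes used here are only illustrative:

#include <linux/types.h>
#include <linux/errno.h>
#include <asm/msr.h>

/* Write a value into a counter MSR, read it back, and treat a mismatch
 * as "no working PMU".  wrmsrl_safe()/rdmsrl_safe() tolerate faulting
 * MSR accesses. */
static int pmu_counter_selftest(void)
{
    u64 val = 0xabcdUL, readback = 0;

    if (wrmsrl_safe(MSR_ARCH_PERFMON_PERFCTR0, val))
        return -EIO;            /* write faulted */
    if (rdmsrl_safe(MSR_ARCH_PERFMON_PERFCTR0, &readback))
        return -EIO;            /* read faulted */

    /* If the vCPU migrates between the write and the read, and the
     * counter is not saved/restored on the context switch, the value
     * read back will not match what was written. */
    return readback == val ? 0 : -ENODEV;
}

If the vCPU stays put, or if the counter registers are part of the
save/restore state, the two values match and the check passes.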
> > >>>>>Many moons ago (circa 2005) when I used performance counters, I
> > >>>>>found that adding them to the save/restore path added a
> > >>>>>non-negligible overhead -- something like a 5% slow-down.  Do you
> > >>>>>have any reason to believe this is no longer the case?  Have you
> > >>>>>done any benchmarks before and after?
> > >>>I was doing some VPMU tracing a couple of weeks ago, and looking at
> > >>>the trace timestamps I think I saw about 4000 cycles on VPMU save and
> > >>>~9000 cycles on restore. I don't remember what that was
> > >>>percentage-wise of a whole context switch.
> > >>>
> > >>>This was on Intel.
> > >>That's a really hefty expense to make all users pay on every context
> > >>switch, on behalf of a random check in a piece of software that only a
> > >>handful of people are actually going to be using.
> > >I believe Linux uses the perf infrastructure to implement the watchdog.
> 
> And by default it won't work as for Intel you need these flags:
> 
> cpuid=['0xa:eax=0x07300403,ebx=0x00000004,ecx=0x00000000,edx=0x00000603' ]

This cpuid config variable should not be needed if your CPU is supported in
vmx_vpmu_initialise(), where you added a lot of processors with your patch.
If it is not supported, you should see a message in the Xen logs.
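
The shape of that check is roughly the following (a simplified sketch, not
the exact Xen code; the model numbers and the message text are examples
only):

/* VPMU initialisation gates support on the CPU family/model, the way
 * vmx_vpmu_initialise() does; unknown models fall through to a log
 * message and VPMU stays disabled. */
static int vpmu_model_supported(unsigned int family, unsigned int model)
{
    if ( family != 6 )          /* Intel family 6 only in this sketch */
        return 0;

    switch ( model )
    {
    case 42:                    /* e.g. Sandy Bridge */
    case 45:                    /* e.g. Sandy Bridge-EP */
        return 1;
    default:
        printk("VPMU: unsupported CPU model %u, VPMU disabled\n", model);
        return 0;
    }
}

That fall-through message is what you would see in the Xen log for an
unsupported model.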

> 
> What we get right now when booting PVHVM under Intel is:
> 
> [    0.160989] Performance Events: unsupported p6 CPU model 45 no PMU driver, software events only.
> [    0.168098] NMI watchdog disabled (cpu0): hardware events not enabled

Did you add vpmu to the xen boot parameter list?
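
In case it helps: vpmu goes on the hypervisor command line, i.e. on the line
that boots xen.gz in the grub entry (the path and the other options below are
placeholders for whatever your installation already uses):

    multiboot /boot/xen.gz <existing Xen options> vpmu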

I installed openSUSE 12.2 as an HVM guest with xen-unstable running, and the
kernel log says:

Mar  7 15:06:18 linux kernel: [    0.183217] CPU0: Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz stepping 0a
Mar  7 15:06:18 linux kernel: [    0.183980] Performance Events: 4-deep LBR, Core2 events, Intel PMU driver.
Mar  7 15:06:18 linux kernel: [    0.189994] ... version:                2
Mar  7 15:06:18 linux kernel: [    0.189997] ... bit width:              40
Mar  7 15:06:18 linux kernel: [    0.190000] ... generic registers:      2
Mar  7 15:06:18 linux kernel: [    0.190002] ... value mask:             000000ffffffffff
Mar  7 15:06:18 linux kernel: [    0.190005] ... max period:             000000007fffffff
Mar  7 15:06:18 linux kernel: [    0.190008] ... fixed-purpose events:   3
Mar  7 15:06:18 linux kernel: [    0.190011] ... event mask:             0000000700000003
Mar  7 15:06:18 linux kernel: [    0.198203] NMI watchdog: enabled, takes one hw-pmu counter.

When I call perf:

# perf stat ls
acpid             cups      kdm.log        mail.err        news              wtmp            zypper.log
alternatives.log  faillog   krb5           mail.info       ntp               Xorg.0.log
boot.log          firewall  lastlog        mail.warn       pm-powersave.log  Xorg.0.log.old
btmp              hp        localmessages  messages        samba             YaST2
ConsoleKit        journal   mail           NetworkManager  warn              zypp

 Performance counter stats for 'ls':

          7.840869 task-clock                #    0.590 CPUs utilized
                59 context-switches          #    0.008 M/sec
                 0 CPU-migrations            #    0.000 K/sec
               304 page-faults               #    0.039 M/sec
         6,583,834 cycles                    #    0.840 GHz                     [40.38%]
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
         2,168,931 instructions              #    0.33  insns per cycle         [73.20%]
           525,628 branches                  #   67.037 M/sec                   [79.06%]
            27,138 branch-misses             #    5.16% of all branches         [83.55%]

       0.013283672 seconds time elapsed

As you can see, the performance counters are working for instructions,
branches and branch-misses.

When I call this command in dom0, it's a bit different:

# perf stat ls
acpid             journal        messages           wpa_supplicant.log
alternatives.log  kdm.log        NetworkManager     wtmp
boot.log          krb5           news               xen
btmp              lastlog        ntp                Xorg.0.log
ConsoleKit        localmessages  pk_backend_zypp    Xorg.0.log.old
cups              mail           pk_backend_zypp-1  YaST2
faillog           mail.err       pm-powersave.log   zypp
firewall          mail.info      samba              zypper.log
hp                mail.warn      warn               zypper.log-20130307.xz

 Performance counter stats for 'ls':

          6.959326 task-clock                #    0.714 CPUs utilized          
                11 context-switches          #    0.002 M/sec                  
                 0 CPU-migrations            #    0.000 K/sec                  
               304 page-faults               #    0.044 M/sec                  
   <not supported> cycles                  
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
   <not supported> instructions            
   <not supported> branches                
   <not supported> branch-misses           

       0.009746152 seconds time elapsed

This is because the hardware events are not supported in PV.

Dietmar.


> Unless the above CPUID flag is provided.
> > 
> > Hmm -- well if it is the case that adding performance counters to
> > the vcpu context switch path will add a measurable overhead, then we
> > probably don't want them enabled for typical guests anyway.  If
> > people are actually using the performance counters to measure
> > performance, that makes sense; but for watchdogs it seems like Xen
> > should be able to provide something that is useful for a watchdog
> > without the extra overhead of saving and restoring performance
> > counters.
> > 
> > Konrad, any thoughts?
> 
> The other thing is that there is a Xen watchdog, the one that Jan Beulich
> wrote, which should also work under PVHVM:
> 
> drivers/watchdog/xen_wdt.c
> 
> 
> > 
> >  -George

-- 
Company details: http://ts.fujitsu.com/imprint.html


 

