
Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization


  • To: <boris.ostrovsky@xxxxxxxxxx>
  • From: <Kevin.Mayer@xxxxxxxx>
  • Date: Thu, 26 Feb 2015 13:44:05 +0000
  • Accept-language: de-DE, en-US
  • Cc: xen-devel@xxxxxxxxxxxxx
  • Delivery-date: Thu, 26 Feb 2015 13:44:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHQUUlemT4gs/yjU0+olYu4KenVXZ0CpLog
  • Thread-topic: AW: AW: [Xen-devel] Branch Trace Storage for guests and VPMU initialization


> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> Sent: Wednesday, 25 February 2015 23:20
> To: Mayer, Kevin
> Subject: Re: AW: AW: [Xen-devel] Branch Trace Storage for guests
> and VPMU initialization
> 
> On 02/25/2015 01:23 PM, Kevin.Mayer@xxxxxxxx wrote:
> >
> >> -----Original Message-----
> >> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> >> Sent: Wednesday, 25 February 2015 17:32
> >> To: Mayer, Kevin
> >> Cc: xen-devel@xxxxxxxxxxxxx
> >> Subject: Re: AW: [Xen-devel] Branch Trace Storage for guests and
> >> VPMU initialization
> >>
> >> On 02/25/2015 10:12 AM, Kevin.Mayer@xxxxxxxx wrote:
> >>>> -----Original Message-----
> >>>> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> >>>> Sent: Tuesday, 24 February 2015 18:13
> >>>> To: Mayer, Kevin; xen-devel@xxxxxxxxxxxxx
> >>>> Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU
> >>>> initialization
> >>>>
> >>>> On 02/24/2015 10:27 AM, Kevin.Mayer@xxxxxxxx wrote:
> >>>>> Hi guys
> >>>>>
> >>>>> I'm trying to set up the BTS so that I can log the branches taken
> >>>>> in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7
> >>>>> Sandy Bridge.
> >>>>>
> >>>>> I added the vpmu=bts boot parameter to my grub2 configuration and
> >>>>> extended libxl, libxc, domctl, ... with a command of my own so that I
> >>>>> can trigger the activation of the BTS whenever I want.
> >>>>>
> >>>> I am not sure why you are doing all these changes to Xen code. BTS
> >>>> is supposed to be managed from the guest. For example, a Fedora HVM
> >>>> guest will produce this:
> >>>>
> >>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
> >>>> [ perf record: Woken up 3838 times to write data ]
> >>>> [ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
> >>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
> >>>>     ffffffff8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] (/proc/kcore)
> >>>>     ffffffff8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] ([unknown])
> >>>>           328c001593 [unknown] ([unknown]) =>       328c004b70 [unknown] ([unknown])
> >>>> ...
> >>>>
> >>> I want to be able to log the taken branches (of the guest) without
> >>> the need to modify the guest at all.
> >>> This means I have to do all the logic in the hypervisor, or am I wrong?
> >> In that case, yes. But then you have to make sure that at least
> >>    * you don't load guest's VPMU (or, at least, BTS-related
> >> registers) on context switch
> >>    * You don't send the interrupt to the guest (meaning that you will
> >> need to somehow inform dom0 of the BTS interrupt)
> >>
> >> and probably more.
> >>
> >> Essentially, you want dom0 to profile the guest. I have been working
> >> on patches that would allow that but they are still under review.
> >>
> > Yes, this is exactly what I want to do.
> > Too bad that your patches are under review. They would have been pretty
> > helpful, I think.
> 
> To be honest, I never tested them for BTS so they may not work in that
> mode. In fact, as you will realize by reading what I said below, they probably
> don't ;-(
> 
> > Maybe I should point out that I'm a total noob with Xen and I definitely
> > don't understand all parts yet.
> > So there may be some dumb mistakes in my assumptions.
> >
> >>>>> In this command I do the following:
> >>>>>
> >>>>> I set up the memory region for the BTS Buffer and the DS Buffer
> >>>>> Management Area using xzalloc_bytes
> >>>>>
> >>>> I don't think you should be allocating BTS buffers in the
> >>>> hypervisor, they are in guest's memory.
> >>> I agree. As I said I think this is where my main problem is at the moment.
> >>> Is there any way I can allocate memory in the hypervisor in a way
> >>> the guest can access it?
> >>
> >> I am not sure this is what you want since you seem to *not* want the
> >> guest to process the samples, right?
> >>
> >> But yes, you can. E.g. something like what map_vcpu_info() does. (I
> >> have no idea how you'd do this from Windows.)
> > Right again. As you said, my goal is to profile the guest from dom0. So
> > whenever the CPU is in guest mode and a branch is taken, it should be stored
> > in the BTS, but not when the CPU is running dom0. My idea was basically to
> > set up the memory for the BTS and the GUEST_IA32_DEBUGCTL so that logging
> > stops on a vmexit and starts again on a vmenter.
> > As far as I understand, IA32_DEBUGCTL gets switched between the
> > dom0 value and the guest value (stored in the VMCS) on every
> > vmexit/vmenter, right?
> 
> Right. And now I am no longer sure whether your buffer should be in
> hypervisor or guest's space: after VMENTER the hardware will load guest's
> versions of IA32_DEBUGCTLMSR and MSR_IA32_DS_AREA. I don't know
> whether you can prevent this from happening (need to look in the spec).
> And if that's the case then you might be able to:
> 
> 1. Map DS area and BTS buffer in both guest and hypervisor. I believe your
> guest will have to have this mapped since these areas will be accessed via
> guest's EPT. As I said, I don't know how you'd do this in Windows --- I know
> nothing about programming there. I assume it can be done since there are
> Windows PV drivers for Xen.
> 2. Have dom0 set appropriate bits in IA32_DEBUGCTLMSR to start tracing.
> You will need to first pause your guest's VCPUs, then update appropriate
> register in VMCS (bracketed with vmx_vmcs_enter/exit) and then unpause
> it.
> 3. If you program BTS to generate interrupts you may need to do something
> about it in vpmu_interrupt() to prevent those interrupts from going into the
> guest as this will likely confuse it and it will die (the interrupt I think
> will be an NMI, making things real bad for the guest).
> 4. Now you should be able to read buffers from hypervisor.

Why should I prevent the guest's IA32_DEBUGCTLMSR and MSR_IA32_DS_AREA from
being loaded?
The idea was to set up the guest's IA32_DEBUGCTLMSR and MSR_IA32_DS_AREA
from dom0.
Then, on a VMENTER, the guest registers get loaded and the BTS starts to log,
and it stops again on a VMEXIT.
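
For reference, a minimal sketch of the value that would have to end up in the guest's IA32_DEBUGCTL for this to work. The bit positions (TR = bit 6, BTS = bit 7, BTINT = bit 8) are the ones documented in the Intel SDM; the DBGCTL_* names and both helper functions are made up for this sketch and are not Xen identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* IA32_DEBUGCTL bits as documented in the Intel SDM.
 * The DBGCTL_* names are local to this sketch. */
#define DBGCTL_TR    (UINT64_C(1) << 6)  /* generate branch trace messages      */
#define DBGCTL_BTS   (UINT64_C(1) << 7)  /* store them in the BTS buffer        */
#define DBGCTL_BTINT (UINT64_C(1) << 8)  /* interrupt at threshold, do not wrap */

/* Hypothetical helper: return old_val with BTS tracing switched on.
 * All unrelated bits of old_val are preserved. */
static uint64_t debugctl_enable_bts(uint64_t old_val, int want_interrupt)
{
    uint64_t val = old_val | DBGCTL_TR | DBGCTL_BTS;

    if (want_interrupt)
        val |= DBGCTL_BTINT;
    else
        val &= ~DBGCTL_BTINT;   /* buffer is then used as a ring */

    assert(val & DBGCTL_BTS);
    return val;
}

/* Hypothetical helper: return old_val with BTS tracing switched off. */
static uint64_t debugctl_disable_bts(uint64_t old_val)
{
    return old_val & ~(DBGCTL_TR | DBGCTL_BTS | DBGCTL_BTINT);
}
```

With BTINT clear the hardware treats the buffer as circular, so no interrupt has to be kept away from the guest, at the price of old records being overwritten.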

Regarding 1.
I'm not sure how I would know at which address the BTS is located in this case.
Let's say I set up the BTS in the guest at address x. To get this address x I
need to read the guest's MSR_IA32_DS_AREA, right?
For this I would need to access the vcpu->arch_vcpu->hvm_vcpu->vpmu
used by the guest, since MSR_IA32_DS_AREA isn't part of the VMCS
(and therefore cannot be read with the handy __vmread()).
Is there a good way to get this information during a vpmu_interrupt() (since I
believe the BTINT will have to be handled there), or maybe during a VMEXIT?
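
To make clear what MSR_IA32_DS_AREA would be pointing at, here is a sketch of the BTS part of the DS buffer management area, assuming the 64-bit DS format from the Intel SDM (the +0x0 and +0x8 offsets match the ones I mentioned in my first mail). The struct, field and function names are invented for illustration, and the addresses are plain numbers here, not real guest mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BTS fields at the start of the DS save area, 64-bit format.
 * Names are made up for this sketch. */
struct ds_area_bts {
    uint64_t bts_buffer_base;          /* +0x00: first byte of the buffer    */
    uint64_t bts_index;                /* +0x08: where the next record goes  */
    uint64_t bts_absolute_maximum;     /* +0x10: first byte past the buffer  */
    uint64_t bts_interrupt_threshold;  /* +0x18: BTINT fires at this index   */
};

/* One BTS record in the 64-bit format: 3 x 8 bytes. */
struct bts_record {
    uint64_t from;   /* branch source address */
    uint64_t to;     /* branch target address */
    uint64_t flags;
};

static_assert(offsetof(struct ds_area_bts, bts_index) == 0x08, "BTS index offset");
static_assert(offsetof(struct ds_area_bts, bts_interrupt_threshold) == 0x18,
              "threshold offset");
static_assert(sizeof(struct bts_record) == 24, "BTS record size");

/* Fill in the BTS fields for a buffer of nr_records records at address
 * base, with the interrupt threshold after thresh_records records. */
static void ds_area_init_bts(struct ds_area_bts *ds, uint64_t base,
                             uint64_t nr_records, uint64_t thresh_records)
{
    ds->bts_buffer_base = base;
    ds->bts_index = base;   /* empty buffer: index == base */
    ds->bts_absolute_maximum = base + nr_records * sizeof(struct bts_record);
    ds->bts_interrupt_threshold = base + thresh_records * sizeof(struct bts_record);
}

/* Number of complete records currently in the buffer. */
static uint64_t ds_area_bts_count(const struct ds_area_bts *ds)
{
    return (ds->bts_index - ds->bts_buffer_base) / sizeof(struct bts_record);
}
```

So once the DS area address is known, the buffer base and the current fill level can be read from its first two fields.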

2. I already use vmx_vmcs_enter/exit but didn't think about pausing the vcpu.
I will add that.

3. I didn't look at the BTINT yet, but this sounds reasonable.
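
As I understand point 2, the order of operations would look roughly like the sketch below. It is a standalone mock so that it compiles outside of Xen: everything prefixed mock_ is a stand-in I made up, loosely modelled on vcpu_pause()/vcpu_unpause(), vmx_vmcs_enter()/vmx_vmcs_exit() and a write to GUEST_IA32_DEBUGCTL, and none of it is the real hypervisor code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mock vCPU: just enough state to check the ordering. */
struct mock_vcpu {
    int paused;               /* pause count                               */
    int vmcs_loaded;          /* nonzero between mock_vmcs_enter and _exit */
    uint64_t guest_debugctl;  /* stands in for the VMCS field              */
    char trace[16];           /* one letter per step, in call order        */
};

static void mock_log(struct mock_vcpu *v, const char *step)
{
    strncat(v->trace, step, sizeof(v->trace) - strlen(v->trace) - 1);
}

static void mock_vcpu_pause(struct mock_vcpu *v)   { v->paused++;        mock_log(v, "P"); }
static void mock_vcpu_unpause(struct mock_vcpu *v) { v->paused--;        mock_log(v, "U"); }
static void mock_vmcs_enter(struct mock_vcpu *v)   { v->vmcs_loaded = 1; mock_log(v, "E"); }
static void mock_vmcs_exit(struct mock_vcpu *v)    { v->vmcs_loaded = 0; mock_log(v, "X"); }

static void mock_vmwrite_debugctl(struct mock_vcpu *v, uint64_t val)
{
    /* The write only makes sense for a paused vCPU whose VMCS is loaded. */
    assert(v->paused > 0 && v->vmcs_loaded);
    v->guest_debugctl = val;
    mock_log(v, "W");
}

/* Pause, enter, write, exit, unpause: the bracketing from point 2. */
static void set_guest_debugctl(struct mock_vcpu *v, uint64_t val)
{
    mock_vcpu_pause(v);
    mock_vmcs_enter(v);
    mock_vmwrite_debugctl(v, val);
    mock_vmcs_exit(v);
    mock_vcpu_unpause(v);
}
```

The assert in mock_vmwrite_debugctl is the point of the exercise: it fails as soon as the write is issued without the pause and the enter/exit bracket around it.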

> 
> > This would be "the guest is logging the branch traces", but it is set up and
> > controlled from dom0. So more or less a hybrid, I think.
> >
> >>> Of course the guest must not be able to use this memory in its
> >>> normal operations but just for BTS.
> >>> Is this even possible? I am rather confused at the moment. :-D
> >>>
> >>>>> Then I write the pointer to the BTS Buffer into the DS Buffer
> >>>>> Management Area at +0x0 and +0x8 (BTS Buffer Base and BTS Index)
> >>>>>
> >>>>> When I use vmx_msr_write_intercept to store the value in
> >>>>> MSR_IA32_DS_AREA the host reboots (my guess is that it tries to access a
> >>>>> vpmu struct that isn't there in the current vcpu and panics).
> >>
> >> Who is trying to write to MSR_IA32_DS_AREA? The guest or dom0? I
> >> thought you said that you want dom0 to do sampling. Or are you trying
> >> to setup DS area from your guest and control it from dom0? I am
> >> somewhat confused.
> >>
> > The dom0 writes to MSR_IA32_DS_AREA. I want to do all the setup and
> > controlling from dom0 in a way that enables the guest to store branch
> > traces in the BTS (that was setup by the dom0)
> 
> I think I understand why you crash hypervisor now. I mentioned above that
> writing into vmcs requires bracketing by vmx_vmcs_enter/exit. So, in
> addition to having new vcpu parameter to vmx_msr_write_intercept(), you
> need to add those two. See vmx_vlapic_msr_changed(), right above
> vmx_msr_write_intercept(). And don't forget to pause guest's vcpu (I am
> pretty sure you need that since your guest may be running somewhere else
> at this time).
> 
> 
> > Sorry if my explanations are a bit confusing. I myself am confused about
> > this part of the Xen code.
> >
> >>>> Can you post hypervisor log? (hard to say how helpful it will be
> >>>> without seeing your code changes though)
> >>>>
> >>> Right after enabling the BTS I get a triple fault.
> >>> hvm.c:1357:d2 Triple fault on VCPU0 - invoking HVM shutdown action 1.
> >>
> >> That's not host reboot, this is your guest dying.
> >
> > Yes
> > When I use my own vmx_msr_write_intercept (which explicitly uses the
> > vcpu of my guest domain instead of "current") and my own
> > core2_vpmu_do_wrmsr and core2_vpmu_msr_common_check, I don't get a
> > host reboot, but a dying guest when I try to enable BTS. As you said, most
> > likely because MSR_IA32_DS_AREA points to dom0 memory and the
> > hypervisor is not amused when a guest tries to write stuff there.
> > When I use the built-in ones (which all use struct vcpu *v = current;) I
> > get a host reboot.
> > Maybe because of a missing vpmu struct, as I notice that only one vcpu_id
> > gets initialized in vpmu_initialise during boot.
> > So when using the built-in vmx_msr_write_intercept, the write ends up in
> > vpmu_do_wrmsr at
> >     if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->do_wrmsr )
> >         return vpmu->arch_vpmu_ops->do_wrmsr(msr, msr_content);
> > and the host reboots.
> > Maybe I need some special kind of initialization before I call
> > vmx_msr_write_intercept?
> > Even with
> >     struct vcpu *current_v = current;
> >     vpmu_initialise(current_v);
> >     return_value = vmx_msr_write_intercept(MSR_IA32_DS_AREA,
> >                                            ds_buffer_management_area);
> > I get an instant host reboot at the above-mentioned
> >     return vpmu->arch_vpmu_ops->do_wrmsr(msr, msr_content);
> 
> Right. Because you are trying to access VMCS from dom0 context. dom0
> doesn't have VMCS as it is a PV guest.
> 

I thought so, but isn't the if clause
if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->do_wrmsr )
supposed to catch that?

Kevin

> 
> -boris
> 
> >
> >>>>> When I use a modified version of vmx_msr_write_intercept I don't get
> >>>>> any crashes as long as I don't enable BTS and TR in the
> >>>>> GUEST_IA32_DEBUGCTL (BTR works). When I enable the BTS (and TR) the
> >>>>> guest crashes. I suppose it gets killed by the hypervisor for
> >>>>> accessing forbidden memory.
> >>>>>
> >>>> Possibly because DS area points to hypervisor memory.
> >>>>
> >>>>
> >>>> Having said all this, I am not sure how well BTS works. You did notice
> >>>> this in the hypervisor log:
> >>>>
> >>>> (XEN) ******************************************************
> >>>> (XEN) ** WARNING: Emulation of BTS Feature is switched on **
> >>>> (XEN) ** Using this processor feature in a virtualized **
> >>>> (XEN) ** environment is not 100% safe. **
> >>>> (XEN) ** Setting the DS buffer address with wrong values **
> >>>> (XEN) ** may lead to hypervisor hangs or crashes. **
> >>>> (XEN) ** It is NOT recommended for production use! **
> >>>> (XEN) ******************************************************
> >>> Yes, I saw that. It doesn't state that BTS is not working at all, just
> >>> that it is not that safe to use.
> >>> As I understand it, as long as I set the DS buffer address correctly I
> >>> should be fine, right?
> >>
> >> Right. Except that I am not convinced you did set this buffer correctly,
> >> which is possibly why your hypervisor crashed (I am not sure I
> >> understood under what circumstances though).
> >>
> >> -boris
> > We are thinking very much alike. I also am not convinced I set the buffer
> > correctly. ^^
> > But since I get a reboot as soon as
> >     return vpmu->arch_vpmu_ops->do_wrmsr(msr, msr_content);
> > gets called, I don't think that the setup of the buffer is the problem (when
> > using the original vmx_msr_write_intercept), but rather something with the
> > setup of the vpmu.
> > When I use my own vmx_msr_write_intercept with d->vcpu[0] instead
> > of current, the write succeeds but the guest crashes/gets killed when the
> > BTS is enabled.
> > So in this second case the setup of the buffer seems to be the problem.
> >
> > Kevin
> >
> >>> Since I don't want to use it for production, that is fine with me. At least
> >>> for now.
> >>>
> >>> Kevin
> >>>> -boris
> >>>>
> >>>>
> >>>>> The modified version of vmx_msr_write_intercept takes a vcpu struct as
> >>>>> a parameter and uses this instead of the current vcpu.
> >>>>>
> >>>>> Instead of
> >>>>>
> >>>>> static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
> >>>>> {
> >>>>>
> >>>>>       struct vcpu *v = current;
> >>>>>
> >>>>> I just have
> >>>>>
> >>>>> static int own_vmx_msr_write_intercept(unsigned int msr, uint64_t
> >>>>> msr_content, struct vcpu *v)
> >>>>>
> >>>>> I get this vcpu by d->vcpu[0] as I have limited my guest domain to one
> >>>>> vcpu atm.
> >>>>>
> >>>>> Of course I also use similarly modified versions of the called
> >>>>> functions (vpmu_do_wrmsr, ...).
> >>>>>
> >>>>> I'm pretty sure that my problem is with a wrong scope/usage of the
> >>>>> vcpus/memory, but I have no idea how to fix this.
> >>>>>
> >>>>> I can see a potential problem with the memory allocation (in the host)
> >>>>> into which the cpu in guest-mode is supposed to write.
> >>>>>
> >>>>> Or maybe I got the principle of a vcpu/vpmu all wrong.
> >>>>>
> >>>>> Since I couldn't find any project that uses the BTS for the guest, I
> >>>>> am wondering if anyone has ever done this and if it is possible at all.
> >>>>>
> >>>>> Any input is welcome as I am pretty much stuck atm...
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Kevin
> >>>>>
> >>>>>
> >>>>> ____________
> >>>>> Virus checked by G Data MailSecurity
> >>>>> Version: AVA 25.404 dated 24.02.2015
> >>>>> Virus news: www.antiviruslab.com <http://www.antiviruslab.com>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Xen-devel mailing list
> >>>>> Xen-devel@xxxxxxxxxxxxx
> >>>>> http://lists.xen.org/xen-devel

