Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> Sent: Wednesday, 25 February 2015 23:20
> To: Mayer, Kevin
> Subject: Re: AW: AW: [Xen-devel] Branch Trace Storage for guests
> and VPMU initialization
>
> On 02/25/2015 01:23 PM, Kevin.Mayer@xxxxxxxx wrote:
> >
> >> -----Original Message-----
> >> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> >> Sent: Wednesday, 25 February 2015 17:32
> >> To: Mayer, Kevin
> >> Cc: xen-devel@xxxxxxxxxxxxx
> >> Subject: Re: AW: [Xen-devel] Branch Trace Storage for guests and
> >> VPMU initialization
> >>
> >> On 02/25/2015 10:12 AM, Kevin.Mayer@xxxxxxxx wrote:
> >>>> -----Original Message-----
> >>>> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> >>>> Sent: Tuesday, 24 February 2015 18:13
> >>>> To: Mayer, Kevin; xen-devel@xxxxxxxxxxxxx
> >>>> Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU
> >>>> initialization
> >>>>
> >>>> On 02/24/2015 10:27 AM, Kevin.Mayer@xxxxxxxx wrote:
> >>>>> Hi guys
> >>>>>
> >>>>> I'm trying to set up the BTS so that I can log the branches taken
> >>>>> in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7
> >>>>> Sandy Bridge.
> >>>>>
> >>>>> I added the vpmu=bts boot parameter to my grub2 configuration and
> >>>>> extended the libxl, libxc, domctl, … with an own command so that I
> >>>>> can trigger the activation of the BTS whenever I want.
> >>>>>
> >>>> I am not sure why you are doing all these changes to Xen code. BTS
> >>>> is supposed to be managed from the guest.
> >>>> For example, a Fedora HVM
> >>>> guest will produce this:
> >>>>
> >>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e
> >>>> branches:u -c 1 -d sleep 1
> >>>> [ perf record: Woken up 3838 times to write data ]
> >>>> [ perf record: Captured and wrote 0.704 MB perf.data
> >>>> (~30756 samples) ]
> >>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f
> >>>> ip,addr,sym,dso,symoff --show-kernel-path
> >>>> ffffffff8167c347 native_irq_return_iret+0x0 (/proc/kcore) =>
> >>>> 328c001590 [unknown] (/proc/kcore)
> >>>> ffffffff8167c347 native_irq_return_iret+0x0 (/proc/kcore) =>
> >>>> 328c001590 [unknown] ([unknown])
> >>>> 328c001593 [unknown] ([unknown]) => 328c004b70 [unknown]
> >>>> ([unknown])
> >>>> ...
> >>>>
> >>> I want to be able to log the taken branches (of the guest) without
> >>> the need to modify the guest at all.
> >>> This means I have to do all the logic in the hypervisor, or am I wrong?
> >> In that case, yes. But then you have to make sure that at least
> >> * you don't load guest's VPMU (or, at least, BTS-related
> >> registers) on context switch
> >> * You don't send the interrupt to the guest (meaning that you will
> >> need to somehow inform dom0 of the BTS interrupt)
> >>
> >> and probably more.
> >>
> >> Essentially, you want dom0 to profile the guest. I have been working
> >> on patches that would allow that but they are still under review.
> >>
> > Yes, this is exactly what I want to do.
> > Too bad that your patches are under review. Would have been pretty
> > helpful I think.
>
> To be honest, I never tested them for BTS so they may not work in that
> mode. In fact, as you will realize by reading what I said below, they probably
> don't ;-(
>
> > Maybe I should point out that I'm a total noob with xen and I definitely
> > don't understand all parts yet.
> > So there may be some dumb mistakes in my assumptions.
> >
> >>>>> In this command I do the following:
> >>>>>
> >>>>> I set up the memory region for the BTS Buffer and the DS Buffer
> >>>>> Management Area using xzalloc_bytes
> >>>>>
> >>>> I don't think you should be allocating BTS buffers in the
> >>>> hypervisor, they are in guest's memory.
> >>> I agree. As I said I think this is where my main problem is at the moment.
> >>> Is there any way I can allocate memory in the hypervisor in a way
> >>> the guest can access it?
> >>
> >> I am not sure this is what you want since you seem to *not* want the
> >> guest to process the samples, right?
> >>
> >> But yes, you can. E.g. something like what map_vcpu_info() does. (I
> >> have no idea how you'd do this from Windows.)
> > Right again. As you said my goal is to profile the guest from dom0. So
> > whenever the CPU is in guestmode and a branch is taken it should be stored
> > in the BTS, but not when the CPU is running dom0. My idea was basically to
> > set up the memory for the BTS and the GUEST_IA32_DEBUGCTL so when
> > there is a vmexit the logging stops and starts again when there is a vmenter.
> > As far as I understand the IA32_DEBUGCTL gets switched between the
> > dom0-value and the guest-value (stored in vmcs) when there is a
> > vmexit/vmenter, right?
>
> Right. And now I am no longer sure whether your buffer should be in
> hypervisor or guest's space: after VMENTER the hardware will load guest's
> versions of IA32_DEBUGCTLMSR and MSR_IA32_DS_AREA. I don't know
> whether you can prevent this from happening (need to look in the spec).
> And if that's the case then you might be able to:
>
> 1. Map DS area and BTS buffer in both guest and hypervisor. I believe your
> guest will have to have this mapped since these areas will be accessed via
> guest's EPT. As I said, I don't know how you'd do this in Windows --- I know
> nothing about programming there. I assume it can be done since there are
> Windows PV drivers for Xen.
> 2. Have dom0 set appropriate bits in IA32_DEBUGCTLMSR to start tracing.
> You will need to first pause your guest's VCPUs, then update appropriate
> register in VMCS (bracketed with vmx_vmcs_enter/exit) and then unpause
> it.
> 3. If you program BTS to generate interrupts you may need to do something
> about it in vpmu_interrupt() to prevent those interrupts from going into the
> guest as this will likely confuse it and it will die (the interrupt I think
> will be an NMI, making things real bad for the guest).
> 4. Now you should be able to read buffers from hypervisor.

Why should I prevent the loading of guest IA32_DEBUGCTLMSR and
MSR_IA32_DS_AREA?
The idea was to access/set up the guest IA32_DEBUGCTLMSR and
MSR_IA32_DS_AREA when in dom0.
So when there is a VMENTER the guest registers get loaded and the BTS
starts to log. And stops of course when there is a VMEXIT.

Regarding 1. I'm not sure how I know at which address the BTS is located
in this case. Let's say I set up the BTS in the guest at address x. To get
this address x I need to read the guest MSR_IA32_DS_AREA, right? For this
I would need to access the vcpu->arch_vcpu->hvm_vcpu->vpmu used by the
guest since the MSR_IA32_DS_AREA isn't part of the vmcs (and therefore
cannot be accessed by the handy __vmread()). Is there a good way to get
this information during a vpmu_interrupt() (since I believe the BTINT will
have to be handled there), or maybe a VMEXIT?

2. I already use the vmx_vmcs_enter/exit but didn't think about pausing
the vcpu. I will add that.

3. I didn't look at the BTINT yet, but this sounds reasonable.

>
> > This would be "the guest is logging the branch traces", but it is setup and
> > controlled from the dom0. So more or less a hybrid I think.
> >
> >>> Of course the guest must not be able to use this memory in its
> >>> normal operations but just for BTS.
> >>> Is this even possible? I am rather confused at the moment.
> >>> :-D
> >>>
> >>>>> Then I write the pointer to the BTS Buffer into the DS Buffer
> >>>>> Management Area at +0x0 and +0x8 (BTS Buffer Base and BTS Index)
> >>>>>
> >>>>> When I use vmx_msr_write_intercept to store the value in
> >>>>> MSR_IA32_DS_AREA the host reboots (my idea is he tries to access a
> >>>>> vpmu-struct that isn't there in the current vcpu and panics).
> >>
> >> Who is trying to write to MSR_IA32_DS_AREA? The guest or dom0? I
> >> thought you said that you want dom0 to do sampling. Or are you trying
> >> to setup DS area from your guest and control it from dom0? I am
> >> somewhat confused.
> >>
> > The dom0 writes to MSR_IA32_DS_AREA. I want to do all the setup and
> > controlling from dom0 in a way that enables the guest to store branch
> > traces in the BTS (that was setup by the dom0)
>
> I think I understand why you crash hypervisor now. I mentioned above that
> writing into vmcs requires bracketing by vmx_vmcs_enter/exit. So, in
> addition to having new vcpu parameter to vmx_msr_write_intercept(), you
> need to add those two. See vmx_vlapic_msr_changed(), right above
> vmx_msr_write_intercept(). And don't forget to pause guest's vcpu (I am
> pretty sure you need that since your guest may be running somewhere else
> at this time).
>
> > Sorry if my explanations are a bit confusing. I myself am confused about
> > this part of the Xen-code.
> >
> >>>> Can you post hypervisor log? (hard to say how helpful it will be
> >>>> without seeing your code changes though)
> >>>>
> >>> Right after enabling the BTS I get a triple fault.
> >>> hvm.c:1357:d2 Triple fault on VCPU0 - invoking HVM shutdown action 1.
> >>
> >> That's not host reboot, this is your guest dying.
> >
> > Yes
> > When I use my own vmx_msr_write_intercept (which explicitly uses the
> > vcpu of my guest domain instead of the "current") and my own
> > core2_vpmu_do_wrmsr, core2_vpmu_msr_common_check I don't get a
> > host reboot, but a dying guest when I try to enable BTS.
> > As you said most
> > likely because the MSR_IA32_DS_AREA points to dom0-memory and the
> > hypervisor is not amused when a guest tries to write stuff there.
> > When I use the built-in ones (which all use struct vcpu *v = current;) I
> > get a host reboot.
> > Maybe because of a missing vpmu-struct as I notice that only one vcpu_id
> > gets initialized in vpmu_initialise during boot.
> > So when using the built-in vmx_msr_write_intercept the writing ends in
> > vpmu_do_wrmsr at
> >     if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->do_wrmsr )
> >         return vpmu->arch_vpmu_ops->do_wrmsr(msr, msr_content);
> > and the host reboots.
> > Maybe I need some special kind of initialization before I call
> > vmx_msr_write_intercept?
> > Even with
> >     struct vcpu *current_v=current;
> >     vpmu_initialise(current_v);
> >     return_value= vmx_msr_write_intercept(MSR_IA32_DS_AREA,
> >     ds_buffer_management_area);
> > I get an instant host reboot at the above
> > mentioned return vpmu->arch_vpmu_ops->do_wrmsr(msr, msr_content);
>
> Right. Because you are trying to access VMCS from dom0 context. dom0
> doesn't have VMCS as it is a PV guest.
>

I thought so, but isn't the if clause
    if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->do_wrmsr )
supposed to catch that?

Kevin

> -boris
>
> >
> >>>>> When I use a modified version of vmx_msr_write_intercept I don't get
> >>>>> any crashes as long as I don't enable BTS and TR in the
> >>>>> GUEST_IA32_DEBUGCTL (BTR works). When I enable the BTS (and TR)
> >>>>> the guest crashes. I suppose he gets killed by the hypervisor for
> >>>>> accessing forbidden memory.
> >>>>>
> >>>> Possibly because DS area points to hypervisor memory.
> >>>>
> >>>>
> >>>> Having said all this, I am not sure how well BTS works.
> >>>> You did notice
> >>>> this in the hypervisor log:
> >>>>
> >>>> (XEN) ******************************************************
> >>>> (XEN) ** WARNING: Emulation of BTS Feature is switched on **
> >>>> (XEN) ** Using this processor feature in a virtualized **
> >>>> (XEN) ** environment is not 100% safe. **
> >>>> (XEN) ** Setting the DS buffer address with wrong values **
> >>>> (XEN) ** may lead to hypervisor hangs or crashes. **
> >>>> (XEN) ** It is NOT recommended for production use! **
> >>>> (XEN) ******************************************************
> >>> Yes, I saw that. It doesn't state that BTS is not working at all, just
> >>> that it is not that safe to use.
> >>> As I understand it as long as I set the DS buffer address correctly I
> >>> should be fine, right?
> >>
> >> Right. Except that I am not convinced you did set this buffer correctly,
> >> which is possibly why your hypervisor crashed (I am not sure I
> >> understood under what circumstances though).
> >>
> >> -boris
> > We are thinking very much alike. I also am not convinced I set the buffer
> > correctly. ^^
> > But since I get a reboot as soon as
> > return vpmu->arch_vpmu_ops->do_wrmsr(msr, msr_content); gets called
> > I don't think that the setup of the buffer is the problem (when using the
> > original vmx_msr_write_intercept), but rather something with the setup of
> > the vpmu.
> > When I use my own vmx_msr_write_intercept with the d->vcpu[0] instead
> > of current the writing succeeds but the guest crashes/gets killed when the
> > BTS is enabled.
> > So in this second case the setup of the buffer seems to be the problem.
> >
> > Kevin
> >
> >>> Since I don't want to use for production that is fine with me. At least
> >>> for now.
> >>>
> >>> Kevin
> >>>> -boris
> >>>>
> >>>>
> >>>>> The modified version of vmx_msr_write_intercept takes a vcpu-struct as
> >>>>> a parameter and uses this instead of the current vcpu.
> >>>>>
> >>>>> Instead of
> >>>>>
> >>>>> static int vmx_msr_write_intercept(unsigned int msr, uint64_t
> >>>>> msr_content)
> >>>>> {
> >>>>>
> >>>>> struct vcpu *v = current;
> >>>>>
> >>>>> I just have
> >>>>>
> >>>>> static int own_vmx_msr_write_intercept(unsigned int msr, uint64_t
> >>>>> msr_content, struct vcpu *v)
> >>>>>
> >>>>> I get this vcpu by d->vcpu[0] as I have limited my guest domain to one
> >>>>> vcpu atm.
> >>>>>
> >>>>> Of course I also use similarly modified version of the called
> >>>>> functions (vpmu_do_wrmsr, …).
> >>>>>
> >>>>> I'm pretty sure that my problem is with a wrong scope/usage of the
> >>>>> vcpus/memory, but I have no idea how to fix this.
> >>>>>
> >>>>> I can see a potential problem with the memory allocation (in the host)
> >>>>> into which the cpu in guest-mode is supposed to write.
> >>>>>
> >>>>> Or maybe I got the principle of a vcpu/vpmu all wrong.
> >>>>>
> >>>>> Since I couldn't find any project that uses the BTS for the guest, I
> >>>>> am wondering if anyone has ever done this and if it is possible at all.
> >>>>>
> >>>>> Any input is welcome as I am pretty much stuck atm…
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
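
For reference, the layout mentioned in the thread (BTS Buffer Base at +0x0 and BTS Index at +0x8 of the DS Buffer Management Area) can be sketched in plain C. This is only a minimal sketch, assuming the 64-bit DS save area format from the Intel SDM. The names struct bts_record, struct ds_area_bts, ds_bts_init() and ds_bts_count() are hypothetical and are not Xen functions. Host pointers stand in for linear addresses purely so that the example compiles on its own; in the setup discussed above the stored values would have to be addresses that are valid in the context in which the CPU writes the records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One BTS record in the 64-bit DS format: three 8-byte fields. */
struct bts_record {
    uint64_t from;   /* address of the branch instruction */
    uint64_t to;     /* address of the branch target */
    uint64_t flags;  /* status bits */
};

/* BTS part of the DS buffer management area (hypothetical struct name). */
struct ds_area_bts {
    uint64_t bts_buffer_base;         /* +0x00: first record */
    uint64_t bts_index;               /* +0x08: next record to be written */
    uint64_t bts_absolute_maximum;    /* +0x10: first byte past the buffer */
    uint64_t bts_interrupt_threshold; /* +0x18: record that raises the interrupt */
};

/* Compile-time checks of the offsets quoted in the thread. */
_Static_assert(sizeof(struct bts_record) == 24, "BTS record is 24 bytes");
_Static_assert(offsetof(struct ds_area_bts, bts_buffer_base) == 0x0, "base at +0x0");
_Static_assert(offsetof(struct ds_area_bts, bts_index) == 0x8, "index at +0x8");

/*
 * Fill in the BTS fields for a buffer of nrec records. The threshold is
 * placed irq_slack records before the end of the buffer.
 * Assumption: buf is addressable by whoever will write the records.
 */
static void ds_bts_init(struct ds_area_bts *ds, struct bts_record *buf,
                        size_t nrec, size_t irq_slack)
{
    uint64_t base = (uint64_t)(uintptr_t)buf;

    assert(irq_slack <= nrec);
    ds->bts_buffer_base = base;
    ds->bts_index = base;             /* empty buffer: index equals base */
    ds->bts_absolute_maximum = base + nrec * sizeof(*buf);
    ds->bts_interrupt_threshold =
        ds->bts_absolute_maximum - irq_slack * sizeof(*buf);
}

/* Number of complete records between the base and the current index. */
static size_t ds_bts_count(const struct ds_area_bts *ds)
{
    return (size_t)((ds->bts_index - ds->bts_buffer_base) /
                    sizeof(struct bts_record));
}
```

A reader of the buffer would call ds_bts_count() and then walk that many struct bts_record entries starting at bts_buffer_base. Whether the base is a hypervisor or a guest address is exactly the open question in this thread, and the sketch does not answer it.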