
Re: [Xen-devel] Need help in debugging partially blocked hypervisor


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Dietmar Hahn <dietmar.hahn@xxxxxxxxxxxxxx>
  • Date: Fri, 30 Oct 2009 13:20:39 +0100
  • Delivery-date: Fri, 30 Oct 2009 05:21:09 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi,
 
> I need some help in debugging a strange hypervisor behavior together
> with using fully virtualized performance counters.
> 

I added some tracers of my own to xentrace to find out what the CPU is doing.
Now I can see that in the failing case the CPU does nothing but handle
performance counter NMIs within the hypervisor, endlessly.

pmu_apic_interrupt
  smp_pmu_apic_interrupt
    vmx_do_pmu_interrupt
      vpmu_do_interrupt

In the normal case, core2_vpmu_do_interrupt does the following:
    1. Read the cause of the NMI
        rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content);
        ...
    2. Save the value for the domU
        ...
    3. Reset the cause
        wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, msr_content);
    4. Inject the NMI into the domU

This works very well for a short time.
Then the hypervisor falls into an endless NMI loop. The cause seems to be that
step 3 ("Reset the cause") no longer works: writing to
MSR_CORE_PERF_GLOBAL_OVF_CTRL doesn't clear MSR_CORE_PERF_GLOBAL_STATUS,
which immediately triggers the next NMI.
I found this by adding another tracer that reads MSR_CORE_PERF_GLOBAL_STATUS
once again after writing MSR_CORE_PERF_GLOBAL_OVF_CTRL.
In the normal case it now contains 0; in the failing case the value is
unchanged!
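
To make the read-back check concrete, here is a minimal sketch in plain C.
It is not the Xen code: it runs against a simulated PMU, and struct fake_pmu,
fake_rdmsr, fake_wrmsr, ack_overflow_and_check and the ovf_ctrl_broken flag
are hypothetical names made up for illustration. The model assumes that
writing a 1 bit to the OVF_CTRL register clears the matching bit in the
STATUS register.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR numbers for the two registers discussed above. */
#define MSR_CORE_PERF_GLOBAL_STATUS   0x38e
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x390

/* Simulated PMU state (hypothetical, stands in for the real hardware). */
struct fake_pmu {
    uint64_t global_status;   /* pending overflow bits */
    bool     ovf_ctrl_broken; /* true models the failing case */
};

/* Simulated rdmsrl: only the status register is modelled. */
static uint64_t fake_rdmsr(const struct fake_pmu *pmu, uint32_t msr)
{
    return msr == MSR_CORE_PERF_GLOBAL_STATUS ? pmu->global_status : 0;
}

/* Simulated wrmsrl: a write to OVF_CTRL clears the matching status bits,
 * unless the failing case is being modelled. */
static void fake_wrmsr(struct fake_pmu *pmu, uint32_t msr, uint64_t val)
{
    if (msr == MSR_CORE_PERF_GLOBAL_OVF_CTRL && !pmu->ovf_ctrl_broken)
        pmu->global_status &= ~val;
}

/* Steps 1 and 3 plus the extra read-back.
 * Returns the overflow bits that are still set after the clear attempt;
 * 0 means the reset worked. */
static uint64_t ack_overflow_and_check(struct fake_pmu *pmu)
{
    uint64_t status = fake_rdmsr(pmu, MSR_CORE_PERF_GLOBAL_STATUS);

    fake_wrmsr(pmu, MSR_CORE_PERF_GLOBAL_OVF_CTRL, status);

    return fake_rdmsr(pmu, MSR_CORE_PERF_GLOBAL_STATUS) & status;
}
```

In the hypervisor, the same check would use the real rdmsrl/wrmsrl and a
nonzero result could be written to the trace buffer, so the first failing
interrupt becomes visible.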

I searched the Intel processor spec but couldn't find anything helpful.
So my question is: what is wrong here?
Can anybody with more knowledge point me in the right direction? What else
can I do to find the real cause of this?

Many thanks in advance!
Dietmar.

-- 
Company details: http://ts.fujitsu.com/imprint.html
