Re: [Xen-devel] State of current Xen debugger
Yeah, but the performance counters are driven by the same LAPIC timesource
that drives the main LAPIC timer.

 -- Keir

On 28/09/2010 16:40, "Roger Cruz" <roger.cruz@xxxxxxxxxxxxxxxxxxx> wrote:

>
> By the APIC timer?  When I traced this code I was under the impression that is
> driven by the performance counters counting cycles and generating an interrupt
> when the counter overflows.  I found this was the routine being called to
> setup the watchdog
>
> static void __pminit setup_p6_watchdog(unsigned counter)
> {
>     unsigned int evntsel;
>
>     nmi_perfctr_msr = MSR_P6_PERFCTR0;   <--- register
>
>     clear_msr_range(MSR_P6_EVNTSEL0, 2);
>     clear_msr_range(MSR_P6_PERFCTR0, 2);
>
>     evntsel = P6_EVNTSEL_INT
>         | P6_EVNTSEL_OS
>         | P6_EVNTSEL_USR
>         | counter;
>
>     wrmsr(MSR_P6_EVNTSEL0, evntsel, 0);
>     write_watchdog_counter("P6_PERFCTR0");
>     apic_write(APIC_LVTPC, APIC_DM_NMI);
>     evntsel |= P6_EVNTSEL0_ENABLE;
>     wrmsr(MSR_P6_EVNTSEL0, evntsel, 0);
> }
>
> and then during the NMI tick handler this path was executed
>
>     else if ( nmi_perfctr_msr == MSR_P6_PERFCTR0 )
>     {
>         /*
>          * Only P6 based Pentium M need to re-unmask the apic vector but
>          * it doesn't hurt other P6 variants.
>          */
>         apic_write(APIC_LVTPC, APIC_DM_NMI);
>     }
>     write_watchdog_counter(NULL);
>
>
> static inline void write_watchdog_counter(const char *descr)
> {
>     u64 count = (u64)cpu_khz * 1000;
>
>     do_div(count, nmi_hz);
>     if(descr)
>         Dprintk("setting %s to -0x%08Lx\n", descr, count);
>     wrmsrl(nmi_perfctr_msr, 0 - count);
> }
>
>
> It is also my understanding that during the CPU c3 state change in cpu_idle.c,
> the APIC timer is turned off.  See comments below.
>
>     /*
>      * Before invoking C3, be aware that TSC/APIC timer may be
>      * stopped by H/W. Without carefully handling of TSC/APIC stop issues,
>      * deep C state can't work correctly.
>      */
>     /* preparing APIC stop */
>     lapic_timer_off();   <------------- APIC timer appears to be turned off here.
>
>     /* Get start time (ticks) */
>     t1 = inl(pmtmr_ioport);
>     /* Trace cpu idle entry */
>     TRACE_2D(TRC_PM_IDLE_ENTRY, cx->idx, t1);
>     /* Invoke C3 */
>     acpi_idle_do_entry(cx);
>     /* Get end time (ticks) */
>     t2 = inl(pmtmr_ioport);
>
>     /* recovering TSC */
>     cstate_restore_tsc();   <----- this is our backport of an unstable patch to keep TSCs synchronized
>     /* Trace cpu idle exit */
>
>
> Thanks Keir!
>
> Roger
>
> -----Original Message-----
> From: Keir Fraser on behalf of Keir Fraser
> Sent: Tue 9/28/2010 11:30 AM
> To: Roger Cruz; Dan Magenheimer; Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] State of current Xen debugger
>
> On 28/09/2010 16:21, "Roger Cruz" <roger.cruz@xxxxxxxxxxxxxxxxxxx> wrote:
>
>> I am still chasing this hard hang in our system with a modified 3.4.2 xen.  I
>> have upgraded the BIOS and the problem still exists.  The only thing that so
>> far had appeared to work was adding max_cstate=0 but now I have a report where
>> it still hung in one customer who had that flag enabled.  The rest of them
>> have been successfully running for more than a week with this "work-around".
>> I have isolated the problem to Lenovo with the Centrino processors.  These
>> guys will stop the TSC when in C3.
>>
>> What I need to really understand is why the NMI/watchdog in Xen is not working
>> and causing a crash when the CPU hangs.  I was under the impression that NMIs
>> couldn't be masked at all.  Is there any way that Xen could be disabling or
>> changing that behavior?  I know the NMI is being driven by a timer set in the
>> NMI handler.  Could there be a case where this timer is disabled?  Any ideas
>> are welcome!
>
> The NMI counter gets driven by the APIC timer. Perhaps it needs poking
> somehow on wakeup from C3? My suggestion for debugging this would be to take
> a look at what native Linux does. The NMI perfctr poking logic was all taken
> from (rather old now) upstream Linux.
>
>  -- Keir
>
>> Thanks
>> Roger R. Cruz
>>
>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Roger Cruz
>> Sent: Tuesday, September 14, 2010 11:55 AM
>> To: Dan Magenheimer; Tim Deegan
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [Xen-devel] State of current Xen debugger
>>
>> Hi Dan,
>>
>> I am using 3.4.2 where we have made very minor modifications (some backports,
>> for example).
>>
>> I have not tried your suggestions.. so I will do that next.. thanks!
>>
>> R.
>>
>> -----Original Message-----
>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>> Sent: Tue 9/14/2010 11:19 AM
>> To: Roger Cruz; Tim Deegan
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [Xen-devel] State of current Xen debugger
>>
>> A couple of thoughts:
>>
>> Have you tried max_cstate=0 (as a Xen boot option)?
>>
>> Also, you didn't say what version of Xen you are using but playing around with
>> hpet_broadcast (enabling it or force-disabling it as below) might be worth a
>> try.
>>
>> http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00556.html
>>
>> From: Roger Cruz [mailto:roger.cruz@xxxxxxxxxxxxxxxxxxx]
>> Sent: Tuesday, September 14, 2010 8:56 AM
>> To: Tim Deegan
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [Xen-devel] State of current Xen debugger
>>
>> Hi Tim, good to hear from you again
>>
>> I had a pretty good inkling that one of you hardcore developers would say that
>> :-)  Yes, it is pretty well wedged.  I can cause the problem more rapidly by
>> dropping to a single CPU.  When the hang happens, the Xen console is
>> completely dead.  None of the special keys work.
>>
>> I do have hopes a BIOS upgrade could fix this as a last resort but I want to
>> see if at least I can understand the problem.
>> We have a few different machines that are exhibiting similar symptoms so I
>> have to see if I can find a work-around without requiring every user to
>> upgrade their BIOS :-(
>>
>> Just in case, what debugger have you been using?  Are there recent
>> instructions on how to set it up that you can point me to?
>>
>> Thanks
>> Roger
>>
>>
>> -----Original Message-----
>> From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxx]
>> Sent: Tue 9/14/2010 10:30 AM
>> To: Roger Cruz
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: Re: [Xen-devel] State of current Xen debugger
>>
>> Hi,
>>
>> At 15:22 +0100 on 14 Sep (1284477779), Roger Cruz wrote:
>>> I am trying to debug a problem where the hypervisor is hanging hard.
>>> Not even the NMI watchdog is triggering a reboot.  So I wanted to hook
>>> up a debugger.
>>
>> Sorry to bring a counsel of despair but if the NMI watchdog isn't
>> working then your chances of getting a working debugger are slim.  It's
>> likely that at least one CPU is very very stuck.  Does the 'd' debug key
>> work on the serial line when the machine is wedged?
>>
>> On a more cheerful note, I've twice seen hard hangs like this that
>> turned out to be hardware issues, fixable with BIOS upgrades.
>>
>> Cheers,
>>
>> Tim.
>>
>>> What is the state of the current debuggers out there?
>>> Any input on how I should set it up (kdb, gdb, etc) and pointers to a
>>> good wiki page are much appreciated.  I did perform a Google search
>>> and found some links but I want to hear from the current developers as
>>> to what is most stable and useful for debugging this type of hard
>>> hang.  I only have a serial port PCI-express card to use as the laptop
>>> has no built in port.
>>
>> --
>> Tim Deegan <Tim.Deegan@xxxxxxxxxx>
>> Principal Software Engineer, XenServer Engineering
>> Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel