[Xen-devel] RE: latest xen-unstable fails to boot on Dell D630 (likely hpet/Cstate problem)
Dan,

Don't test Cset #20073 on its own: it needs two minor fixes that were checked in as Cset #20093 and #20149. Apart from that, Keir's Cset #20076 has a typo that is fixed by Cset #20092. In addition, Cset #20084 introduced a serious issue that is fixed in Cset #20140. I also remember that PoD had issues that could crash the hypervisor before Cset #20100. So it is very hard to pin this problem down by bisecting before Cset #20149, because these bugs were introduced and fixed in overlapping ranges. Of course, if you do want to test Cset #20073, you have to apply at least Cset #20093 and #20149 on top of it. :)

Xiantao

Dan Magenheimer wrote:
>> But I'll give bisecting a try.
>
> Looks like the problem has been around for awhile. It appears
> the problem starts at c/s 20073. Xiantao cc'ed since 20073
> was his patch.
>
> 20070 boots OK without max_cstate=2
>
> 20072 boots most of the way without max_cstate=2 but crashes
> before a login prompt (when xend is starting I think)
>
> 20073 FAILS to boot without max_cstate=2 but crashes
> before a login prompt
>
> 20082 FAILS to boot without max_cstate=2 but crashes
> before a login prompt with max_cstate=2
>
> 20143 FAILS to boot without max_cstate=2 but boots OK
> with max_cstate=2
>
> Note that I have NOT bisected tools, just the hypervisor
> so the crashes are likely due to a newer xend failing on
> an older hypervisor (which is irrelevant to this problem).
>
>> -----Original Message-----
>> From: Dan Magenheimer
>> Sent: Tuesday, December 08, 2009 10:42 AM
>> To: Yu, Ke; Xen-Devel (E-mail)
>> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
>> hpet/Cstate problem)
>>
>>
>>> case, if convenient, could you help to do some bisect to see
>>> which cset cause this bug?
>>
>> I can do this, but because it is often no longer easy to
>> bisect Xen because of interdependencies with other
>> components, I was hoping that Keir or you or someone might
>> have some idea of what changeset might have caused the regression.
>> But I'll give bisecting a try.
>>
>>> max_cstate=2), when dom0 hangs, is xen still alive, E.g. can
>>> Xen response to three Ctrl-'A' in serial?
>>
>> Unfortunately, I can't seem to get a Xen console working on
>> the Merom machine, and the problem can't be reproduced on
>> my other machine where the Xen console is working (because
>> Conroe doesn't support deep C).
>>
>>> -----Original Message-----
>>> From: Yu, Ke [mailto:ke.yu@xxxxxxxxx]
>>> Sent: Tuesday, December 08, 2009 12:08 AM
>>> To: Dan Magenheimer; Xen-Devel (E-mail)
>>> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
>>> hpet/Cstate problem)
>>>
>>>
>>>> -----Original Message-----
>>>> In this thread, I observed that I was unable to
>>>> provoke deep C state (C3) on my Dell D630, which has
>>>> a Intel Merom (dual-core laptop) processor. At that
>>>> time, when I tried enabling hpetbroadcast, dom0 boot failed.
>>>>
>>>> http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
>>>>
>>>> As it turned out, all RHEL5-based (maybe RHEL4- also) dom0
>>>> default installation run /sbin/hwclock, which IIRC takes
>>>> the RTC away from Xen and gives it to dom0. Since the
>>>> Xen hpet emulation does not do RTC emulation, bad things
>>>> then happen when a deep Cstate is entered (dom0 apparently
>>>> never wakes up). I think Ke Yu has also reproduced this problem.
>>>>
>>>> Sometime in the last few weeks, some patch in xen-unstable
>>>> apparently changed some defaults and xen-unstable will
>>>> no longer boot with this processor/dom0, with or without
>>>> hpetbroadcast on the Xen command line.
>>>> However, specifying
>>>> max_cstate=2 on the Xen command line allows a successful
>>>> dom0 boot, so I suspect the problem is the same (or at
>>>> least very similar).
>>>>
>>>> I did a quick scan for hpet changes and found c/s 20497,
>>>> but backing it out made no difference.
>>>>
>>>> I have a workaround for now, but since it is likely that
>>>> many customers (including all of Oracle's OVS customers)
>>>> use a RHEL5-based dom0 boot sequence, and Merom processors
>>>> work fine otherwise, it would be nice to get this identified
>>>> and fixed before 4.0.
>>>
>>> Let's firstly figure out which component the issue resides.
>>>
>>> Firstly, in the default boot (i.e. without specifying
>>> max_cstate=2), when dom0 hangs, is xen still alive, E.g. can
>>> Xen response to three Ctrl-'A' in serial?
>>>
>>> If only dom0 hangs, it is probably that RTC malfunction make
>>> incorrect dom0 time and lead dom0 fail to boot. Then RTC
>>> emulation in hypervisor should fix this issue.
>>>
>>> If Xen also hangs, it should be another bug, i.e. hpet
>>> broadcast does not wake up CPU in deep C states. in this
>>> case, if convenient, could you help to do some bisect to see
>>> which cset cause this bug?
>>>
>>> Best Regards
>>> Ke

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
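
Editor's illustration, not part of the original message: the reply at the top of this thread lists changesets that introduced a problem together with the later changesets that fix them. A minimal Python sketch of that bookkeeping follows. The names `KNOWN_FIXES` and `fixes_to_apply` are hypothetical, the changeset numbers are the ones given in the reply, and the sketch assumes changeset numbers are linearly ordered, which is a simplification. The PoD issue said to exist before c/s 20100 is left out because the thread does not name the changeset that introduced it.

```python
# Breaking changeset -> changesets that fix it, as listed in the reply.
# This table and the function name below are illustrative assumptions.
KNOWN_FIXES = {
    20073: [20093, 20149],  # two minor fixes
    20076: [20092],         # typo fix
    20084: [20140],         # fix for the serious issue
}


def fixes_to_apply(tested_cset, known_fixes=KNOWN_FIXES):
    """Return the fix changesets missing from a tree at tested_cset.

    A fix is missing when the breaking changeset is already in the tree
    (breaker <= tested_cset) but the fix is not (fix > tested_cset).
    Assumes a linear history, which real repositories may not have.
    """
    missing = []
    for breaker, fixes in sorted(known_fixes.items()):
        if breaker <= tested_cset:
            missing.extend(f for f in fixes if f > tested_cset)
    return sorted(missing)


if __name__ == "__main__":
    # The changesets Dan tested, plus 20149 for comparison.
    for cset in (20070, 20073, 20082, 20143, 20149):
        print(cset, fixes_to_apply(cset))
```

Under these assumptions, a tree at c/s 20073 is reported as needing 20093 and 20149, which matches the advice in the reply, and every tested point before 20149 is missing at least one listed fix.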