
Re: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up



There's a race in rombios/Makefile which could cause the rombios to be
re-built before the 32bit parts have been re-generated. If you re-made your
rombios only once, and then started debugging it without ever re-making it
again, you could then see this issue. If you had at any point re-made your
rombios a second time, the issue would most likely have gone away. :-)

 -- Keir

On 31/7/08 11:25, "Ke, Liping" <liping.ke@xxxxxxxxx> wrote:

> Hi, Keir and jiajun
> 
> We found that the triple fault happens because eax = 0 at the point where
> "call eax" is executed.
> 
> Strangely, after pulling the latest tree (xen/18178, kernel/622) and doing a
> clean build, virtual S3 suspend/resume works. I can't reproduce the error
> anymore. (I use the default remote-qemu.) I tested with VT-d both on and off;
> both are fine.
> 
> I also dumped the guest area from 0xfcb00 (the upcall jump table area), and it
> looks fine. Since I can't reproduce it now, I will keep an eye on the problem
> to see whether it happens again -:(.
> 
> Regards,
> Criping
> 
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: 30 July 2008 20:05
> To: Ke, Liping; Tian, Kevin; Xu, Jiajun; Yu, Ke; Jiang, Yunhong
> Cc: xen-devel; Ian Jackson
> Subject: Re: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up
> 
> No, not necessarily. Obviously some other exception or interrupt has
> occurred here and lack of a valid IDT has turned it into a triple fault. It
> oughtn't to be an interrupt since your RFLAGS.IF = 0. The fact that
> RFLAGS.RF = 1 is perhaps interesting (could very well not be related to your
> problem though).
> 
>  -- Keir
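
The RFLAGS observation above can be checked against the value in the triple-fault dump quoted further down (RFLAGS=0x10082). A minimal sketch, assuming only the architectural bit positions (IF is bit 9, RF is bit 16); the helper names are made up for illustration and are not Xen code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural RFLAGS bit positions. */
#define RFLAGS_SF (1ULL << 7)   /* sign flag */
#define RFLAGS_IF (1ULL << 9)   /* interrupt enable flag */
#define RFLAGS_RF (1ULL << 16)  /* resume flag */

/* Hypothetical helpers: true if the given flag is set in rflags. */
static bool rflags_if(uint64_t rflags) { return (rflags & RFLAGS_IF) != 0; }
static bool rflags_rf(uint64_t rflags) { return (rflags & RFLAGS_RF) != 0; }
static bool rflags_sf(uint64_t rflags) { return (rflags & RFLAGS_SF) != 0; }
```

For 0x10082 this gives IF clear and RF set, which matches the reading above that a maskable interrupt is an unlikely trigger.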
> 
> On 30/7/08 12:50, "Ke, Liping" <liping.ke@xxxxxxxxx> wrote:
> 
>> Btw: does the TR register need to be set when switching to protected mode?
>> -----Original Message-----
>> From: Ke, Liping
>> Sent: 30 July 2008 19:44
>> To: Tian, Kevin; Keir Fraser; Xu, Jiajun; Yu, Ke; Jiang, Yunhong
>> Cc: xen-devel; Ian Jackson
>> Subject: RE: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up
>> 
>> Hi, all
>> Just found:
>> When doing get_s3_waking_vector (rombios.c) -> upcall (32bitgateway.c) ->
>> switch_to_protmode -> call eax (I added a dead loop at the beginning of
>> get_s3_waking_vector, so I think the first instruction to fail is the
>> call eax), the guest hits a triple fault with the information below:
>> 
>> 
>> (XEN) VMEntry: intr_info=00000031 errcode=00000006 ilen=00000000
>> (XEN) VMExit: intr_info=00000000 errcode=00000043 ilen=00000000
>> (XEN)         reason=00000002(triple fault) qualification=00000000
>> 
>> This seems most likely caused by the protected mode execution environment
>> after switching to protected mode. I dumped both the normal and the abnormal
>> VMCS. Could anybody help with the questions below?
>> 1. Can the gdtr/ldtr selector/base/attr be 0/0/0x93 in protected mode?
>> 2. Should the gs/fs limit be 0xffffffff instead of 0xffff in protected mode?
>> 3. Do cr0 and cr4 bits such as PAE/PSE make a difference? I guess not?
>> 4. Guest EFER should now be 0. Is that correct when switching to protected
>>    mode?
>> 
>> Any input is warmly welcome; I am really not familiar with the asm code in
>> 32bitgateway -:)
>> Thanks a lot!
>> Criping
>> 
>> Triple fault point vmcs:
>> (XEN) >>> Domain 1 <<<
>> (XEN)  VCPU 0
>> (XEN) *** Guest State ***
>> (XEN) CR0: actual=0x0000000080010031, shadow=0x0000000000000011, gh_mask=ffffffffffffffff
>> (XEN) CR4: actual=0x0000000000002020, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
>> (XEN) CR3: actual=0x000000007d3eba20, target_count=0
>> (XEN)      target0=0000000000000000, target1=0000000000000000
>> (XEN)      target2=0000000000000000, target3=0000000000000000
>> (XEN) RSP = 0x000000000000ffe8 (0x000000000000ffe8)  RIP = 0x0000000000000003 (0x0000000000000003)
>> (XEN) RFLAGS=0x0000000000010082 (0x0000000000010082)  DR7 = 0x0000000000000400
>> (XEN) Sysenter RSP=00000000c1809300 CS:RIP=0060:00000000c0403f4c
>> (XEN) CS: sel=0x0008, attr=0x00c9b, limit=0xffffffff, base=0x0000000000000000
>> (XEN) DS: sel=0x0018, attr=0x00c93, limit=0xffffffff, base=0x0000000000000000
>> (XEN) SS: sel=0x0018, attr=0x00c93, limit=0xffffffff, base=0x0000000000000000
>> (XEN) ES: sel=0x0018, attr=0x00c93, limit=0xffffffff, base=0x0000000000000000
>> (XEN) FS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000
>> (XEN) GS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000
>> (XEN) GDTR: sel=0x0000, attr=0x00000, limit=0x0000001f, base=0x00000000000fa5f0
>> (XEN) LDTR: sel=0x0000, attr=0x00082, limit=0x0000ffff, base=0x0000000000000000
>> (XEN) IDTR: sel=0x0000, attr=0x00000, limit=0x0000ffff, base=0x0000000000000000
>> (XEN) TR: sel=0x0000, attr=0x0008b, limit=0x0000ffff, base=0x0000000000000000
>> (XEN) TSC Offset = ffffffc32074372c
>> (XEN) DebugCtl=0000000000000000 DebugExceptions=0000000000000000
>> (XEN) Interruptibility=0000 ActivityState=0000
>> (XEN) *** Host State ***
>> (XEN) RSP = 0xffff83007d3f7fa0  RIP = 0xffff828c80181200
>> (XEN) CS=e008 DS=0000 ES=0000 FS=0000 GS=0000 SS=0000 TR=e060
>> (XEN) FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff828c80274c80
>> (XEN) GDTBase=ffff820000000000 IDTBase=ffff83007c69c080
>> (XEN) CR0=0000000080050033 CR3=000000007b59c000 CR4=00000000000026b0
>> (XEN) Sysenter RSP=ffff83007d3f7fd0 CS:RIP=e008:ffff828c801a5290
>> (XEN) *** Control State ***
>> (XEN) PinBased=0000003f CPUBased=b6a1e7fa SecondaryExec=00000041
>> (XEN) EntryControls=000011ff ExitControls=0003efff
>> (XEN) ExceptionBitmap=00044000
>> (XEN) VMEntry: intr_info=00000031 errcode=00000006 ilen=00000000
>> (XEN) VMExit: intr_info=00000000 errcode=00000043 ilen=00000000
>> (XEN)         reason=00000002 qualification=00000000
>> (XEN) IDTVectoring: info=00000000 errcode=00000000
>> (XEN) TPR Threshold = 0x00
>> (XEN) EPT pointer = 0x0000000000000000
>> (XEN) Virtual processor ID = 0x0000
>> (XEN) **************************************
>> 
>> 
>> Normal vmcs
>> (XEN)  VCPU 0
>> (XEN) *** Guest State ***
>> (XEN) CR0: actual=0x000000008005003b, shadow=0x0000000080050033, gh_mask=ffffffffffffffff
>> (XEN) CR4: actual=0x00000000000026f0, shadow=0x00000000000006f0, gh_mask=ffffffffffffffff
>> (XEN) CR3: actual=0x000000007d3eba20, target_count=0
>> (XEN)      target0=0000000000000000, target1=0000000000000000
>> (XEN)      target2=0000000000000000, target3=0000000000000000
>> (XEN) RSP = 0x00000000f3a7de9c (0x00000000f3a7de9c)  RIP = 0x00000000c0506769 (0x00000000c050676b)
>> (XEN) RFLAGS=0x0000000000000046 (0x0000000000000046)  DR7 = 0x0000000000000400
>> (XEN) Sysenter RSP=00000000c1809300 CS:RIP=0060:00000000c0403f4c
>> (XEN) CS: sel=0x0060, attr=0x00c9b, limit=0xffffffff, base=0x0000000000000000
>> (XEN) DS: sel=0x007b, attr=0x00cf3, limit=0xffffffff, base=0x0000000000000000
>> (XEN) SS: sel=0x0068, attr=0x00c93, limit=0xffffffff, base=0x0000000000000000
>> (XEN) ES: sel=0x007b, attr=0x00cf3, limit=0xffffffff, base=0x0000000000000000
>> (XEN) FS: sel=0x0000, attr=0x00c00, limit=0xffffffff, base=0x0000000000000000
>> (XEN) GS: sel=0x0033, attr=0x00df3, limit=0xffffffff, base=0x00000000b7f058d0
>> (XEN) GDTR: sel=0x0033, attr=0x00000, limit=0x000000ff, base=0x00000000c1810000
>> (XEN) LDTR: sel=0x0088, attr=0x00082, limit=0x00000027, base=0x00000000c078b020
>> (XEN) IDTR: sel=0x0088, attr=0x00000, limit=0x000007ff, base=0x00000000c06e0000
>> (XEN) TR: sel=0x0080, attr=0x0008b, limit=0x00002073, base=0x00000000c1807100
>> (XEN) TSC Offset = ffffffc32074372c
>> (XEN) DebugCtl=0000000000000000 DebugExceptions=0000000000000000
>> (XEN) Interruptibility=0000 ActivityState=0000
>> (XEN) *** Host State ***
>> (XEN) RSP = 0xffff828c8023ffa0  RIP = 0xffff828c80181200
>> (XEN) CS=e008 DS=0000 ES=0000 FS=0000 GS=0000 SS=0000 TR=e040
>> (XEN) FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff828c80274c00
>> (XEN) GDTBase=ffff820000000000 IDTBase=ffff828c80278500
>> (XEN) CR0=000000008005003b CR3=000000007b59c000 CR4=00000000000026b0
>> (XEN) Sysenter RSP=ffff828c8023ffd0 CS:RIP=e008:ffff828c801a5290
>> (XEN) *** Control State ***
>> (XEN) PinBased=0000003f CPUBased=b6a1e7fa SecondaryExec=00000041
>> (XEN) EntryControls=000011ff ExitControls=0003efff
>> (XEN) ExceptionBitmap=00044080
>> (XEN) VMEntry: intr_info=00000031 errcode=00000006 ilen=00000000
>> (XEN) VMExit: intr_info=00000000 errcode=00000000 ilen=00000000
>> (XEN)         reason=0000001e qualification=1f440001
>> (XEN) IDTVectoring: info=00000000 errcode=00000000
>> (XEN) TPR Threshold = 0x00
>> (XEN) EPT pointer = 0x0000000000000000
>> (XEN) Virtual processor ID = 0x0000
>> (XEN) **************************************
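
The two dumps differ in their exit reason: 0x02 in the failing case and 0x1e in the normal one. A minimal sketch of decoding them, assuming the basic exit reason numbers and the I/O exit qualification layout from the Intel SDM; the constant, struct, and function names are hypothetical and not taken from Xen:

```c
#include <stdbool.h>
#include <stdint.h>

/* Basic VMX exit reasons (subset). */
#define EXIT_REASON_TRIPLE_FAULT   2
#define EXIT_REASON_IO_INSTRUCTION 30

/* Hypothetical decoded form of an I/O-instruction exit qualification. */
struct io_exit {
    unsigned size;      /* access size in bytes: 1, 2 or 4 */
    bool     is_in;     /* true for IN, false for OUT */
    bool     is_string; /* true for INS/OUTS */
    uint16_t port;      /* I/O port number */
};

static struct io_exit decode_io_qualification(uint64_t q)
{
    struct io_exit e;
    e.size      = (unsigned)(q & 7) + 1;  /* bits 2:0 hold size minus one */
    e.is_in     = ((q >> 3) & 1) != 0;    /* bit 3: direction */
    e.is_string = ((q >> 4) & 1) != 0;    /* bit 4: string instruction */
    e.port      = (uint16_t)(q >> 16);    /* bits 31:16: port number */
    return e;
}
```

With qualification=1f440001 from the normal dump, this decodes to a 2-byte OUT to port 0x1f44, so that snapshot was taken on an ordinary I/O exit and not at a fault.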
>> 
>> -----Original Message-----
>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Ke, Liping
>> Sent: 30 July 2008 12:49
>> To: Tian, Kevin; Keir Fraser; Xu, Jiajun; Yu, Ke
>> Cc: xen-devel; Ian Jackson
>> Subject: RE: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up
>> 
>> Hi, Kevin
>> Thanks for the detailed explanation. Now I understand it :)
>> So most guests with a correct initial value should use the legacy wakeup
>> vector. That should be the case for the guests I have tested.
>> 
>> The strange thing is that with the top of tree, the guest resets on resume,
>> most likely because it fails to get the wakeup vector. I will dig into the
>> reason.
>> 
>> Thanks & Regards,
>> Criping
>> 
>> -----Original Message-----
>> From: Tian, Kevin
>> Sent: 30 July 2008 11:09
>> To: Ke, Liping; Keir Fraser; Xu, Jiajun; Yu, Ke
>> Cc: xen-devel; Ian Jackson
>> Subject: RE: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up
>> 
>> Liping, it's not up to the guest BIOS to choose; it should simply
>> follow the ACPI spec, i.e., if OSPM fills in a value in the x_firmware field,
>> the guest BIOS picks that value as the wakeup vector and jumps to it in flat
>> protected mode. Otherwise, if OSPM fills in a value in the legacy firmware
>> field, the guest BIOS resumes to the given address in real mode.
>> 
>> It's OSPM that decides which field is used, according to whether
>> its wakeup vector is written as real mode code. So it's not for 'us'
>> to decide. :-)
>> 
>> Commodity OSes have all used a real mode wakeup vector so far. But
>> there's a known bug in the Linux kernel where the decision whether to use
>> the x_firmware field is incorrectly based on its initial value. Normally the
>> BIOS fills zero in that field, which keeps Linux from using the x_firmware
>> field. If the guest BIOS incorrectly puts some value in that field, guest
>> Linux will choose the x_firmware field although it only has a real mode
>> wakeup vector. But this is a guest bug.
>> 
>> Thanks,
>> Kevin
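
The selection rule Kevin describes can be written down as a small sketch. This is an illustration under assumptions, not the rombios code: the struct below is a simplified, hypothetical view holding only the two FACS fields discussed, and the enum and function names are invented for the example.

```c
#include <stdint.h>

/* Simplified, hypothetical view of the two FACS fields under discussion. */
struct facs_vectors {
    uint32_t firmware_waking_vector;    /* legacy field, real-mode entry */
    uint64_t x_firmware_waking_vector;  /* extended field, protected-mode entry */
};

enum wake_mode { WAKE_NONE, WAKE_REAL_MODE, WAKE_PROTECTED_MODE };

struct wake_target {
    enum wake_mode mode;
    uint64_t address;
};

/* Pick the wakeup vector the way the thread describes: a non-zero
 * x_firmware field wins and implies flat protected mode; otherwise the
 * legacy field is used and the resume happens in real mode. */
static struct wake_target choose_wake_target(const struct facs_vectors *f)
{
    struct wake_target t = { WAKE_NONE, 0 };

    if (f->x_firmware_waking_vector != 0) {
        t.mode = WAKE_PROTECTED_MODE;
        t.address = f->x_firmware_waking_vector;
    } else if (f->firmware_waking_vector != 0) {
        t.mode = WAKE_REAL_MODE;
        t.address = f->firmware_waking_vector;
    }
    return t;
}
```

This also shows why a stale non-zero value left in the extended field by the BIOS matters: it would steer the resume into the protected-mode path even though the OS only provided a real-mode vector.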
>> 
>> 
>>> From: Ke, Liping
>>> Sent: 30 July 2008 10:51
>>> 
>>> Hi, Keir
>>> 
>>> Sure. I am looking into it :)
>>> I just found some info: according to the ACPI spec, when we are
>>> using x_firmware_waking_vector, we should wake up in protected
>>> mode. Since we currently resume in real mode, we'd better
>>> use firmware_waking_vector.
>>> 
>>> Thanks a lot!
>>> Criping
>>> 
>>> 
>>> -----Original Message-----
>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>>> Sent: 29 July 2008 23:12
>>> To: Ke, Liping; Xu, Jiajun
>>> Cc: xen-devel; Ian Jackson
>>> Subject: Re: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up
>>> 
>>> I fixed these issues as of changeset 18166. However S3 resume is still not
>>> working for me. Perhaps it's something to do with the new ioemu-remote
>>> repository? Anyway, I'll hand it back to you to dig into further. ;-)
>>> 
>>> Oh, also our handling of x_firmware_waking_vector appears not good. If the
>>> OSPM specifies that vector, are we not supposed to wake it in flat protected
>>> mode?
>>> 
>>> -- Keir
>>> 
>>> On 29/7/08 11:26, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
>>> 
>>>> I can reproduce the issue. It's two things: firstly certain ACPI tables do
>>>> need to be writable (e.g., firmware_waking_vector). Secondly, when the BIOS
>>>> re-POSTs it is writing to itself, which we allow on initial boot but not on
>>>> warm reset. That needs fixing. I'll take a look at doing so.
>>>> 
>>>>  -- Keir
>>>> 
>>>> On 29/7/08 10:53, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I didn't actually test cs18120, so I'm not certain that I removed all writes
>>>>> to write-protected ROM regions. If such writes are happening then the logging
>>>>> at line 1510 in xen/arch/x86/hvm/hvm.c should be printed to the Xen console.
>>>>> You may need a debug build of Xen to see them, or add guest_loglvl=all as a
>>>>> Xen boot parameter.
>>>>> 
>>>>> The EBDA is simply a RAM area for the BIOS to stash important private (and in
>>>>> some cases public) data. Usually it is located just below the VGA framebuffer,
>>>>> at around 0x9fc00. Certain parts of it have a well-defined format; other parts
>>>>> are completely private to the BIOS. For our purposes all we care about is that
>>>>> we do not write-protect it, and we just stash an extra 8-bit variable within
>>>>> it to indicate if this is a warm return from S3.
>>>>> 
>>>>>  -- Keir
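
To make the EBDA description concrete, here is a minimal sketch of locating the EBDA and stashing a one-byte flag in it. It relies on the conventional BIOS Data Area word at physical address 0x40E holding the EBDA segment; the flag offset (S3_FLAG_OFFSET) and the function names are assumptions for illustration and are not the values used by rombios. Guest memory is modelled as a plain byte array.

```c
#include <stdint.h>

/* BIOS Data Area word that conventionally holds the EBDA segment. */
#define BDA_EBDA_SEGMENT 0x40Eu

/* Hypothetical offset of the S3 resume flag inside the EBDA. */
#define S3_FLAG_OFFSET   0x100u

/* Physical base address of the EBDA: real-mode segment shifted left by 4. */
static uint32_t ebda_base(const uint8_t *mem)
{
    uint16_t seg = (uint16_t)(mem[BDA_EBDA_SEGMENT] |
                              (mem[BDA_EBDA_SEGMENT + 1] << 8));
    return (uint32_t)seg << 4;
}

/* Store the 8-bit "warm return from S3" marker in the EBDA. */
static void set_s3_flag(uint8_t *mem, uint8_t value)
{
    mem[ebda_base(mem) + S3_FLAG_OFFSET] = value;
}

/* Read the marker back, e.g. early in POST. */
static uint8_t get_s3_flag(const uint8_t *mem)
{
    return mem[ebda_base(mem) + S3_FLAG_OFFSET];
}
```

With an EBDA segment of 0x9fc0 the base works out to 0x9fc00, the address mentioned above. Because the flag lives in ordinary RAM, it does not require any part of the ROM image to stay writable.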
>>>>> 
>>>>> On 29/7/08 10:47, "Ke, Liping" <liping.ke@xxxxxxxxx> wrote:
>>>>> 
>>>>>> Hi, Selander and Jean
>>>>>> 
>>>>>> Jiajun is reporting a similar error (on cs18132) in the latest cs.
>>>>>> I found that when keeping cs18120 and reverting 18027, everything is OK.
>>>>>> So cs18120 itself works fine, yet if cs18027 sets the ro-attributes, the
>>>>>> problem still exists.
>>>>>> 
>>>>>> I just did some debugging. From ITP, one cpu is in the default_idle loop
>>>>>> and the other one is running forever in x86_emulate/memcpy/__hvm_copy,
>>>>>> etc. So I think this might be the same problem Guyader met before?
>>>>>> 
>>>>>> I am not familiar with the EBDA; could somebody help me have a look?
>>>>>> 
>>>>>> Thanks & Regards,
>>>>>> Criping
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>>>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
>>>>>> Sent: 24 July 2008 20:45
>>>>>> To: Jean Guyader; Trolle Selander
>>>>>> Cc: xen-devel; Ian Jackson
>>>>>> Subject: Re: [Xen-devel] Re: [PATCH] ioemu-remote: ACPI S3 state wake up
>>>>>> 
>>>>>> On 24/7/08 13:12, "Jean Guyader" <jean.guyader@xxxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>>> Jean Guyader wrote:
>>>>>>>> I already tried to reduce the rw area, and just keep 0xe0 -> 0xef. But
>>>>>>>> obviously it doesn't work: the device model needs to write on this frame
>>>>>>>> 0xf1. I still can't figure out why.
>>>>>>> 
>>>>>>> The rombios writes to this page because of the s3_resume_flag flag
>>>>>>> (rombios.c:98883). I don't know if it's a good reason to set the
>>>>>>> rombios as rw. However it's bad to set the first 2 pages of the rombios
>>>>>>> as rw just because of that.
>>>>>>> Any suggestions ?
>>>>>> 
>>>>>> In that case the changes to ioemu-remote should be reverted. The correct fix
>>>>>> is to move the S3 resume flag into the EBDA. I have committed this fix as
>>>>>> xen-unstable.hg:18120.
>>>>>> 
>>>>>>  -- Keir
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Xen-devel mailing list
>>>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>>>> http://lists.xensource.com/xen-devel


