[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2] x86emul: consolidate loop counter handling



On 07/12/16 09:56, Jan Beulich wrote:
>>>> On 06.12.16 at 17:20, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 06/12/16 13:38, Jan Beulich wrote:
>>> @@ -3977,33 +3981,21 @@ x86_emulate(
>>>          break;
>>>  
>>>      case 0xe0 ... 0xe2: /* loop{,z,nz} */ {
>>> +        unsigned long count = get_loop_count(&_regs, ad_bytes) - 1;
>>>          int do_jmp = !(_regs.eflags & EFLG_ZF); /* loopnz */
>>>  
>>>          if ( b == 0xe1 )
>>>              do_jmp = !do_jmp; /* loopz */
>>>          else if ( b == 0xe2 )
>>>              do_jmp = 1; /* loop */
>>> -        switch ( ad_bytes )
>>> -        {
>>> -        case 2:
>>> -            do_jmp &= --(*(uint16_t *)&_regs.ecx) != 0;
>>> -            break;
>>> -        case 4:
>>> -            do_jmp &= --(*(uint32_t *)&_regs.ecx) != 0;
>>> -            _regs.ecx = (uint32_t)_regs.ecx; /* zero extend in x86/64 mode 
>>> */
>>> -            break;
>>> -        default: /* case 8: */
>>> -            do_jmp &= --_regs.ecx != 0;
>>> -            break;
>>> -        }
>>> -        if ( do_jmp )
>>> +        if ( count && do_jmp )
>>>              jmp_rel((int32_t)src.val);
>>> +        put_loop_count(&_regs, ad_bytes, count);
>> I think this would also be clearer to follow if it had the form:
>>
>> unsigned long count = get_loop_count(&_regs, ad_bytes);
>> ...
>> put_loop_count(&_regs, ad_bytes, count - 1);
>> if ( count != 0 && do_jmp )
>>     jmp_rel((int32_t)src.val);
> Well, first of all it would need to be "count != 1" then. And I'm not
> convinced it is any less clear the way it was written. But I'll change
> it nevertheless, to avoid further discussion. I'd like to keep the
> put_loop_count() after jmp_rel() though to limit the lifetime of
> do_jmp.

jmp_rel() is the entity which performs the 0-length instruction fetch at
the branch target, and hides a goto done.  Your patch does change the
behaviour in this regard.

If we believe the ordering of the pseudocode in the manual, %ecx should
be updated before the jump occurs.  Let me see if I can work out what
real hardware does in this case.

>
>> Having said that, src.val is unconditionally a signed 8 byte immediate,
>> so I would have expected this to be an int8_t cast, rather than int32_t.
> We've had this discussion on another branch not so long ago: The
> reading into src.val does sign extension. And jmp_rel() casts to
> int anyway. Furthermore the patch doesn't even touch this line,
> and doing so would not fit its subject.

Ah yes.  I recall now.  (This is a very subtle case.)

>
>> Finally however, the emulated behaviour is wrong.  The manual states
>> "Note that the LOOP instruction ignores REX.W; but 64-bit address size
>> can be over-ridden using a 67H prefix."
>>
>> I think we need some extra early operand decoding to clobber REX.W, then
>> feed 67 conditionally back into ad_bytes.
> Where do you see REX.W being looked at here. Are you mixing up
> ad_bytes and op_bytes?

I am, yes.  Sorry.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.