[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] Sanity check xsave area when migrating or restoring from older Xen verions



On 22/10/14 14:19, Don Koch wrote:
> On Wed, 22 Oct 2014 11:00:52 +0100
> Jan Beulich <JBeulich@xxxxxxxx> wrote:
>
>>>>> On 21.10.14 at 21:25, <dkoch@xxxxxxxxxxx> wrote:
>>> On Tue, 21 Oct 2014 20:00:53 +0100
>>> Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>>>
>>>> On 21/10/14 19:40, Don Koch wrote:
>>>>> Xen 4.3 and older transferred a maximum sized xsave area (as if all
>>>>> the available XCR0 bits were set); the new version only transfers
>>>>> based on the actual XCR0 bits. This may result in a smaller area if
>>>>> the last sections were missing (e.g., the LWP area from an AMD
>>>>> machine). If the size doesn't match the XCR0 derived size, the part of
>>>>> the xsave area between the XCR0 specified and transferred size is
>>>>> checked for zero data. If any part of the overflow area is non-zero,
>>>>> we return with an error.
>>>>>
>>>>> Signed-off-by: Don Koch <dkoch@xxxxxxxxxxx>
>>>>> ---
>>>>> Changes in V4:
>>>>> - Removed check of size base on xfeature_mask.
>>>>> - Unsign some ints.
>>>>> - Change %d to %u for unsigned ints.
>>>>> - Move printk to only print if non-zero data found.
>>>>>
>>>>> Changes in V3:
>>>>> - use h->data for zero check
>>>>> - remove max size check (use size that was sent)
>>>>> - fix error message (drop first byte value)
>>>>> - fix "for" issues
>>>>>
>>>>> Changes in V2:
>>>>> - Add check for size.
>>>>> - Add check for non-zero data in unused part of block.
>>>>>
>>>>>  xen/arch/x86/hvm/hvm.c | 28 ++++++++++++++++------------
>>>>>  1 file changed, 16 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>>>> index f0e1edc..c2780d2 100644
>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>> @@ -1971,6 +1971,7 @@ static int hvm_load_cpu_xsave_states(struct domain 
>>>>> *d, 
>>> hvm_domain_context_t *h)
>>>>>      struct vcpu *v;
>>>>>      struct hvm_hw_cpu_xsave *ctxt;
>>>>>      struct hvm_save_descriptor *desc;
>>>>> +    unsigned int i, overflow_start;
>>>>>  
>>>>>      /* Which vcpu is this? */
>>>>>      vcpuid = hvm_load_instance(h);
>>>>> @@ -2011,15 +2012,8 @@ static int hvm_load_cpu_xsave_states(struct domain 
>>> *d, hvm_domain_context_t *h)
>>>>>                          save_area) + XSTATE_AREA_MIN_SIZE);
>>>>>          return -EINVAL;
>>>>>      }
>>>>> -    size = HVM_CPU_XSAVE_SIZE(xfeature_mask);
>>>>> -    if ( desc->length > size )
>>>>> -    {
>>>>> -        printk(XENLOG_G_WARNING
>>>>> -               "HVM%d.%d restore mismatch: xsave length %u > %u\n",
>>>>> -               d->domain_id, vcpuid, desc->length, size);
>>>>> -        return -EOPNOTSUPP;
>>>>> -    }
>>>>>      h->cur += sizeof (*desc);
>>>>> +    overflow_start = h->cur;
>>>>>  
>>>>>      ctxt = (struct hvm_hw_cpu_xsave *)&h->data[h->cur];
>>>>>      h->cur += desc->length;
>>>>> @@ -2038,10 +2032,20 @@ static int hvm_load_cpu_xsave_states(struct 
>>>>> domain 
>>> *d, hvm_domain_context_t *h)
>>>>>      size = HVM_CPU_XSAVE_SIZE(ctxt->xcr0_accum);
>>>>>      if ( desc->length > size )
>>>>>      {
>>>>> -        printk(XENLOG_G_WARNING
>>>>> -               "HVM%d.%d restore mismatch: xsave length %u > %u\n",
>>>>> -               d->domain_id, vcpuid, desc->length, size);
>>>>> -        return -EOPNOTSUPP;
>>>>> +        /* Make sure missing bytes are all zero. */
>>>> Please make a reference to the bug in this comment, so the reasons for
>>>> the strange check is a little more obvious given a glance at the code.
>>>>
>>>> Perhaps
>>>>
>>>> /*
>>>>  * Xen-4.3 and older used to send longer-than-needed xsave regions. 
>>>> Permit loading the record if the extra data is all zero
>>>>  */
>>>>
>>>> (suitably wrapped, given its natural indentation)
>>> OK, will do.
>>>
>>>>> +        for ( i = size; i < desc->length; i++ )
>>>>> +        {
>>>>> +            if ( h->data[overflow_start + i] )
>>>>> +            {
>>>>> +                printk(XENLOG_G_WARNING
>>>>> +                       "HVM%u.%u restore mismatch: xsave length %u > 
>>>>> %u\n",
>>>>> +                       d->domain_id, vcpuid, desc->length, size);
>>>>> +                printk(XENLOG_G_WARNING
>>>>> +                       "HVM%u.%u restore mismatch: xsave has non-zero 
>>>>> data starting at %#x\n",
>>>>> +                       d->domain_id, vcpuid, i);
>>>> This should be one message.  Also note that, while a lot of code gets it
>>>> wrong, domain_id is signed while vcpuid is unsigned.
>>> I had suggested one message. Jan said it should be two.
>> Right, and I still think it should be two. Just not the way you did it.
>> I specifically said in the reply to the previous version " just add your
>> new check ahead of the existing printk()". In case this was ambiguous
>> to you - I think the pre-existing printk() should continue to get
>> issued (even if not being on an error path anymore) so that we have
>> some kind of indication that the truncating path was taken. After all
>> this shouldn't happen frequently, considering that the most recent
>> stable releases of the older branches already don't do this anymore.
> I thought that's where I had it. If the block size mismatch is detected,
> issue the first message then go into the loop to check for non-zero data
> and, if any is found, then issue the second and exit.
>
> Andrew, IIUC, didn't want the first one issued unless the non-zero data
> case was found, i.e. issue no message unless both conditions were met.
>
> So, which should I do?

On consideration, having the unconditional overly-long error will be
useful for debugging purposes, so best to keep it, independently of the
"and non-zero data" error.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.