[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2] tools/migration: Fix potential overflow in send_checkpoint_dirty_pfn_list()


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 6 Jul 2021 16:00:24 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=St0EwMrId+cZUv2Y991CKhOohQUFO9iLd9sMz+Pae4c=; b=CCdYFQlhZSt46YKWsx5a+8H0aIKyhEHug/5CoaLFaiVfu+fk8fhu7CBH7AkpwPZ+E1x4SA0UlfWTOYmKBAPfM7aI+xQzjSKJwHpJO9zblZp1dA6BqMkDZvDyV1MoFQqT6nd1gGqRP98mrumgJR1f7GT8d06bGG3bI7dqHWN+sj1TlIb8wGjzPMOHmwmLFPzO3w0Gtx2v52Lgw2LfCUDy34nJj1fy9V7mZSv45kp408B/MR3gLaR4lFJD5XKi8F5+qLDdbB1CZzWHmc7zE2wtxeL5WPrkRRapU4X9/+MGwGY3l0ZqeslrMTOWjX7MzM2MUBVgO1KfkH5HPavgOAjPlA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Zn7Q6lT/vWGN9B9824GRQ6ZMziOu7kFA5rfziB1qqPNl+AB+sM7gG25Vp0z+Sr8zUtLFh2GjKZ/ezuNk7sG/qBh5wDlYhSRaD3f1WUoShkZ9msV6K9M3IirnvElnQH/augm5+UNcjJ+WoGXrkGB6hAh1S/BBlgk2cLlvSyUCDHjxDxTL24Ou5cq7cqI7ZlEDl2gIncpxVELzr7tx+CUNtlgtgXEtJf1JlltFnzfqrk1hBpH8p4rCjl4+fqjg//DF9YWEHMUHzETzbqYwRzGXzDfuKpmsdP3tWBzbQjylHjv/HZuitZjoSSTaUb0Y0aT2SaIdO2rQbeoXl0EnJT0bDw==
  • Authentication-results: lists.xenproject.org; dkim=none (message not signed) header.d=none;lists.xenproject.org; dmarc=none action=none header.from=suse.com;
  • Cc: Ian Jackson <iwj@xxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Olaf Hering <olaf@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 06 Jul 2021 14:00:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 06.07.2021 15:34, Andrew Cooper wrote:
> On 06/07/2021 13:03, Jan Beulich wrote:
>> On 06.07.2021 13:23, Andrew Cooper wrote:
>>> 'count * sizeof(*pfns)' can in principle overflow, but is implausible in
>>> practice as the time between checkpoints is typically sub-second.
>>> Nevertheless, simplify the code and remove the risk.
>>>
>>> There is no need to loop over the bitmap to calculate count.  The number of
>>> set bits is returned in xc_shadow_op_stats_t which is already collected (and
>>> ignored).
>>>
>>> Bounds check the count against what will fit in REC_LENGTH_MAX.  At the time
>>> of writing, this allows up to 0xffffff pfns.
>> Well, okay, this then means that an overflow in the reporting of
>> dirty_count doesn't matter for now, because the limit is lower
>> anyway.
>>
>>>  Rearrange the pfns loop to check
>>> for errors both ways, not simply that there were more pfns than expected.
>> Hmm, "both ways" to me would mean ...
>>
>>> @@ -459,24 +462,20 @@ static int send_checkpoint_dirty_pfn_list(struct 
>>> xc_sr_context *ctx)
>>>          goto err;
>>>      }
>>>  
>>> -    for ( i = 0, written = 0; i < ctx->restore.p2m_size; ++i )
>>> +    for ( i = 0, written = 0; count && i < ctx->restore.p2m_size; ++i, 
>>> --count )
>>>      {
>>>          if ( !test_bit(i, dirty_bitmap) )
>>>              continue;
>>>  
>>> -        if ( written > count )
>>> -        {
>>> -            ERROR("Dirty pfn list exceed");
>>> -            goto err;
>>> -        }
>>> -
>>>          pfns[written++] = i;
>>>      }
>>>  
>>> -    rec.length = count * sizeof(*pfns);
>>> -
>>> -    iov[1].iov_base = pfns;
>>> -    iov[1].iov_len = rec.length;
>>> +    if ( written != stats.dirty_count )
>>> +    {
>>> +        ERROR("Mismatch between dirty bitmap bits (%u), and dirty_count 
>>> (%u)",
>>> +              written, stats.dirty_count);
>>> +        goto err;
>>> +    }
>> ... you then also check that there are no further bit set in the
>> bitmap. As said elsewhere, I'm not convinced using statistics as
>> a basis for actual operation (rather than just reporting) is
>> appropriate.
> 
> I'm not interested in inference based on the name of the structure.

I'm afraid that's the problem: Because you started using it for
something it wasn't meant to be used for, you now think it's the
name that's misleading, and the use is okay. I remain of the
opinion that it's the other way around (but see below for there
not being an real dependency at least in this particular case).

>>  I'm unaware of there being any spelled out guarantee
>> that the numbers reported back from the hypercall are accurate.
> 
> The live loop uses this information already for this purpose.  If it is
> wrong, we've got bigger problems that this.

send_memory_live() passes the value to send_dirty_pages(), which
in turn passes it only to xc_report_progress_step() and uses it
in

    if ( written > entries )
        DPRINTF("Bitmap contained more entries than expected...");

There's no relying on this number at all afaics.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.