
Re: [PATCH 06/12] libxenguest: guard against overflow from too large p2m when checkpointing

  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 25 Jun 2021 20:00:23 +0100
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Fri, 25 Jun 2021 19:00:45 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25/06/2021 14:20, Jan Beulich wrote:
> struct xc_sr_record's length field has just 32 bits.

The stream's maximum record length is

/* Somewhat arbitrary - 128MB */
#define REC_LENGTH_MAX                (128U << 20)

and is checked in the low level helpers, making the upper bound on the
number of PFNs 0xFFFFFF once the record header is taken into account.

There doesn't appear to have been any consideration of what happens if
this number gets too large.  That said, the replication will fall apart
completely long before the count reaches even a fraction of this limit,
because this list is the set of pages the source side needs to send
again, in addition to whatever *it* dirtied: it is the state we've lost
on the destination side by permitting the VM to run live.

The common case is that, when execution diverges, the dirtied pages on
source and destination will be almost the same, so merging this on the
source side shouldn't lead to many superfluous pages needing to be sent.

>  Fill it early and
> check that the calculated value hasn't overflowed. Additionally check
> for counter overflow early - there's no point even trying to allocate
> any memory in such an event.
> While there also limit an induction variable's type to unsigned long:
> There's no gain from it being uint64_t.
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> Of course looping over test_bit() is pretty inefficient, but given that
> I have no idea how to test this code I wanted to restrict changes to
> what can sensibly be seen as no worse than before from just looking at
> the changes.

At this point, I'm not sure it can be tested.  IIRC, COLO depends on
some functionality which didn't make its way upstream into Qemu.

> --- a/tools/libs/guest/xg_sr_restore.c
> +++ b/tools/libs/guest/xg_sr_restore.c
> @@ -450,7 +450,8 @@ static int send_checkpoint_dirty_pfn_lis
>      xc_interface *xch = ctx->xch;
>      int rc = -1;
>      unsigned int count, written;
> -    uint64_t i, *pfns = NULL;
> +    unsigned long i;
> +    uint64_t *pfns = NULL;
>      struct iovec *iov = NULL;
>      struct xc_sr_record rec = {
> @@ -469,16 +470,28 @@ static int send_checkpoint_dirty_pfn_lis
>      for ( i = 0, count = 0; i < ctx->restore.p2m_size; i++ )
>      {
> -        if ( test_bit(i, dirty_bitmap) )
> -            count++;
> +        if ( test_bit(i, dirty_bitmap) && !++count )

This logic is far too opaque.

It's also entirely unnecessary...  All this loop is doing is calculating
the size for the memory allocation below, and that can be done by using
the stats output from xc_logdirty_control(), which means it doesn't want
deleting in the earlier patch.



