[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17 v3 15/15] tools/ocaml/libs/xc: fix use of uninitialized memory in shadow_allocation_get


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Edwin Torok <edvin.torok@xxxxxxxxxx>
  • Date: Wed, 9 Nov 2022 13:52:41 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=AshlGvsUEL8E+f1ma1eeySPM/TeTkUHP3JiBRTHltU0=; b=h3WiiyS7F8+MskQIRgrV4E2Fq3IcZh3daUQW2Jvh3tEmCmByqk6q2+Vsg+l1RmWZOypO50ypqRZrJylFndpeY6AfR4Nh7spUq8MtpyNjbZRz5dE6Fk4BCXJ50rCo2tXfFrc6P9J0IMdZCFyghyjTNmP7BcETaoQO4/W+Zf0qjkCStt3iukdOAjTZvcY5uPqOHs2vTaC3GuhTYyG4HX2IgyYd1EHO4uJdTgDeGbxWuZiwZGnEo5GIX7j2lYHXWxbJrZaydDU0rZWpvXs/KncfWYhbvtchn+phycleX8WGBjPnxL4n0ZsU3qsxGFAew1zkXfCVBDdCrYkVQX+9qzLDOw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=cUo8lHUqKRmbMIXAUJA5fnNDhyTUh1ISxdhbcYaeeGQCBlmYbSMOs8lNKQ/m+CDZBI8a1/YlAJDfQsZdSr3fYDn8IBq2cxW51cCpx1NhU9OTw/+hLOVIWOj0DmdEyZxgc7ih38pfdSoq6cBCZb9/nSTHn6swYe6JwK7fCyqWR8l5vmYEGlUlV10mB3ci93NLQohFDDQH06F74q+deAjwJU3o4Kx4/4EEwriDeCDwfp6vNVy7qZwLeih/hFog112ag19sdTD6n9uzWueB89EtO0HgKEgFQUW2WHNRDEBbB5ZMY4rSOGdwqSevml204tLhujjp0zcLQZxw0R9/o8r7LQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Henry Wang <Henry.Wang@xxxxxxx>, Christian Lindig <christian.lindig@xxxxxxxxxx>, David Scott <dave@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Anthony Perard <anthony.perard@xxxxxxxxxx>
  • Delivery-date: Wed, 09 Nov 2022 13:52:56 +0000
  • Ironport-data: A9a23:Wk40cao4cvR9cH7aI3PXAkDcM/VeBmIrZBIvgKrLsJaIsI4StFCzt garIBmAa6mMYGGmco0jYIvioE4GvMSByd5lSwturik0En8R85uZCYyVIHmrMnLJJKUvbq7FA +Y2MYCccZ9uHhcwgj/3b9ANeFEljfngqoLUUbKCYWYpAFc+E0/NsDo788YhmIlknNOlNA2Ev NL2sqX3NUSsnjV5KQr40YrawP9UlKm06W1wUmAWP6gR5gaHzyhNVvrzGInqR5fGatgMdgKFb 76rIIGRpgvx4xorA9W5pbf3GmVirmn6ZFXmZtJ+AsBOszAazsAA+v9T2Mk0MC+7vw6hjdFpo OihgLTrIesf0g8gr8xGO/VQO3kW0aSrY9YrK1Dn2SCY5xWun3cBX5yCpaz5VGEV0r8fPI1Ay RAXAGFcUk2xre6T+pWQU/ZsnZQmCM/UI6pK7xmMzRmBZRonabbqZvySoPV+g3I3jM0IGuvCb c0EbzYpdA7HfxBEJlYQDtQ5gfusgX78NTZfrTp5p4JuuzSVkFM3j+CraYKJEjCJbZw9ckKwq 27Y/mK/HhYAM9+3wjuZ6HO8wOTImEsXXapCSOXmp6Iy2TV/wEQJDyIRaWmyqMWIh1eHQuwYL mY7uS0x+P1aGEuDC4OVsweDiG6JuFsQVsRdF8U+6RqR0ezE7gCBHG8GQzVdLts8u6ceZTEsz E7PoNrvChRmqrjTQnWYnp+YpC2/ETIYJmgDYWkDVwRty8bniJE+iFTIVNkLLUKuptj8GDW1y TbVqiE73u0XlZRSj/n9+k3biTWxoJSPVhQy+gjcQmOi6EV+eZKhYIurr1Pc6J6sMbqkc7VIh 1Bc8+D20QzEJcvlePClKAnVIIyU2g==
  • Ironport-hdrordr: A9a23:CVO186Gz8hHJ8/q8pLqFS5HXdLJyesId70hD6qkvc3Fom52j/f xGws5x6fatskdrZJkh8erwW5Vp2RvnhNNICPoqTM2ftW7dySeVxeBZnMHfKljbdxEWmdQtsp uIH5IeNDS0NykDsS+Y2nj2Lz9D+qjgzEnAv463oBlQpENRGthdBmxCe2Sm+zhNNW177O0CZf +hD6R8xwaISDAyVICWF3MFV+/Mq5ngj5T9eyMLABYh9U2nkS6owKSSKWnY4j4uFxd0hZsy+2 nMlAL0oo+5teug9xPa32jPq7xLhdrazMdZDsDksLlUFtyssHfqWG1SYczGgNkHmpDq1L/sqq iKn/4UBbUw15oWRBDynfKi4Xi47N9k0Q6e9bbRuwqenSW+fkN1NyMJv/MmTjLJr0Unp91yy6 RNwiaQsIdWFwrJmGDn68HPTAwCrDv8nZKz+dRj8EC3fLFuH4O5l7Zvin99AdMFBmb3+YonGO 5hAIXV4+tXa0qTazTcsnN0yNKhU3wvFlPeK3Jy8fC9wnxThjR03kEYzMsQkjMJ8488UYBN46 DBPr5znL9DQ8cKZeZ2BfsHQ8GwFmvRKCi8eF66MBDiDuUKKnjNo5n47PE84/yrYoUByN8olJ HIQDpjxBoPkoLVeLizNbFwg2PwqT+GLEXQI+llluhEk6y5Qqb3OiueT11rm9e8opwkc7/mZ8 o=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHY84hP/JHefKuHJEOBKR0Y52MvLq42nfQA
  • Thread-topic: [PATCH for-4.17 v3 15/15] tools/ocaml/libs/xc: fix use of uninitialized memory in shadow_allocation_get

On 8 Nov 2022, at 15:34, Edwin Török <edvin.torok@xxxxxxxxxx> wrote:
> 
> It has been noticed in 2013 that shadow allocation sometimes returns the
> wrong value, which got worked around by adding a limit to the shadow
> multiplier of 1000 and ignoring the value from Xen in that case
> to avoid a shadow multiplier causing a VM to request 6PB of memory for
> example:
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fxapi-project%2Fxen-api%2Fpull%2F1215%2Fcommits%2Fbe55a8c30b41d1cd7596fc100ab1cfd3539f74eb&amp;data=05%7C01%7Cedvin.torok%40citrix.com%7C54fa199055674737536f08dac19f7026%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C638035187781870066%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=l3%2BSDQinoqZ9CZvWsAcXgFl5vEbJf7hjVzBPLKoVYp4%3D&amp;reserved=0
> 
> However that is just a workaround, and I've just reproduced this by
> killing a VM mid migration, which resulted in a shadow multiplier of
> 629.42, rendering the VM unbootable even after a host reboot.


After some more discussion it looks like this is getting fixed in the 
hypervisor, so this workaround wouldn't be needed,
might want to hold off on this patch until the domctl discussion is settled at:
https://lore.kernel.org/xen-devel/20221108113850.61619-1-roger.pau@xxxxxxxxxx/


Best regards,
--Edwin

> 
> The real bug is in Xen: when a VM is dying it will return '0' for paging
> op domctls and log a message at info level
> 'Ignoring paging op on dying domain', which leaves the 'mb' parameter
> uninitialized upon return from the domctl.
> 
> The binding also doesn't initialize the 'c->mb' parameter (it is meant
> to be used only when setting, not when querying the allocation),
> which results in the VM getting a shadow allocation (and thus multiplier)
> set based on what value happened to be currently on the stack.
> 
> Explicitly initialize the value passed to the domctl, and detect the 
> uninitialized
> case (shadow allocation of 0), and raise an exception in that case.
> The exception will cause xenopsd to skip setting the shadow multiplier.
> 
> Note that the behaviour of Xen here is inconsistent between x86 and ARM:
> ARM would return EINVAL when it gets a paging op on a dying domain,
> and X86-64 would return 0 with possibly uninitialized data.
> 
> It might be desirable to change the x86 path in the hypervisor to return
> EINVAL, although that would require more testing in case it breaks
> somethig.
> But the bindings should be defensive anyway against bugs like this.
> 
> Signed-off-by: Edwin Török <edvin.torok@xxxxxxxxxx>
> ---
> Reason for inclusion in 4.17:
> - fixes a long-standing (>9y old) bug that is still happening today
> 
> Changes since v2:
> - new in v3
> ---
> tools/ocaml/libs/xc/xenctrl_stubs.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
> b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index e2d897581f..9681a74e40 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -1019,7 +1019,7 @@ CAMLprim value stub_shadow_allocation_get(value xch, 
> value domid)
> {
>     CAMLparam2(xch, domid);
>     CAMLlocal1(mb);
> -    unsigned int c_mb;
> +    unsigned int c_mb = 0;
>     int ret;
> 
>     caml_enter_blocking_section();
> @@ -1029,6 +1029,9 @@ CAMLprim value stub_shadow_allocation_get(value xch, 
> value domid)
>     caml_leave_blocking_section();
>     if (ret != 0)
>         failwith_xc(_H(xch));
> +    if ( !c_mb )
> +        caml_failwith("domctl returned uninitialized data for shadow "
> +                      "allocation, dying domain?");
> 
>     mb = Val_int(c_mb);
>     CAMLreturn(mb);
> -- 
> 2.34.1
> 


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.