[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] tools/ocaml/xc: Drop the GC lock for all hypercalls



[gmail is a bit terrible and defaults to reply to single person not
reply all, resent]

There is one bug here that would cause a crash, and several instances
of undefined behaviour.

On Mon, Sep 2, 2024 at 9:10 AM Edwin Torok <edwin.torok@xxxxxxxxx> wrote:
>
> On Fri, Aug 30, 2024 at 6:53 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> 
> wrote:
> >
> > We should be doing this unilaterally.
>
> Agreed, but we should do it safely, since last time I did this I
> learned about a few more instances of behaviours I previously thought
> to be safe, but that are undefined behaviour.
> Which probably means we have a bunch of other code to fixup (I should
> really finish my static analyzer project, and update it with the newly
> learned rules to catch all these...).
> See below for comments.
>
> Although there is one bug here we've previously known to avoid:
> String_val cannot be dereferenced with the lock released, that one is
> an OCaml value and will cause actual problems,
> so we need to caml_copy_string that one.
>
> >
> > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > ---
> > CC: Christian Lindig <christian.lindig@xxxxxxxxxx>
> > CC: David Scott <dave@xxxxxxxxxx>
> > CC: Edwin Török <edwin.torok@xxxxxxxxx>
> > CC: Rob Hoes <Rob.Hoes@xxxxxxxxxx>
> > CC: Andrii Sultanov <andrii.sultanov@xxxxxxxxx>
> >
> > Also pulled out of a larger cleanup series.
> > ---
> >  tools/ocaml/libs/xc/xenctrl_stubs.c | 63 +++++++++++++++++++++++++++--
> >  1 file changed, 60 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
> > b/tools/ocaml/libs/xc/xenctrl_stubs.c
> > index c78191f95abc..20487b21008f 100644
> > --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> > +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> > @@ -312,7 +312,10 @@ CAMLprim value stub_xc_domain_max_vcpus(value xch_val, 
> > value domid,
> >         xc_interface *xch = xch_of_val(xch_val);
> >         int r;
> >
> > +       caml_enter_blocking_section();
> >         r = xc_domain_max_vcpus(xch, Int_val(domid), Int_val(max_vcpus));
>
> We need to move the Int_val macros out of here, domid is registered as
> a GC root, so the GC *will* write to it (it'll write the same value).
> So in practice it probably won't cause any observable corruption, but
> is still undefined behaviour and may not play nicely with compiler
> optimizations.
>
> This would probably be easier to review in a git tree, because there
> isn't enough context in the patch to see which values are registered
> as GC roots or not.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (r)
> >                 failwith_xc(xch);
> >
> > @@ -329,7 +332,10 @@ value stub_xc_domain_sethandle(value xch_val, value 
> > domid, value handle)
> >
> >         domain_handle_of_uuid_string(h, String_val(handle));
> >
> > +       caml_enter_blocking_section();
> >         i = xc_domain_sethandle(xch, Int_val(domid), h);
>
> GC root, need to move the macro out and assign to a local value.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (i)
> >                 failwith_xc(xch);
> >
> > @@ -391,7 +397,10 @@ CAMLprim value stub_xc_domain_shutdown(value xch_val, 
> > value domid, value reason)
> >         xc_interface *xch = xch_of_val(xch_val);
> >         int ret;
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_domain_shutdown(xch, Int_val(domid), Int_val(reason));
>
> GC roots again.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (ret < 0)
> >                 failwith_xc(xch);
> >
> > @@ -503,7 +512,10 @@ CAMLprim value stub_xc_domain_getinfo(value xch_val, 
> > value domid)
> >         xc_domaininfo_t info;
> >         int ret;
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_domain_getinfo_single(xch, Int_val(domid), &info);
>
> GC root.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (ret < 0)
> >                 failwith_xc(xch);
> >
> > @@ -546,7 +558,10 @@ CAMLprim value stub_xc_vcpu_context_get(value xch_val, 
> > value domid,
> >         int ret;
> >         vcpu_guest_context_any_t ctxt;
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_vcpu_getcontext(xch, Int_val(domid), Int_val(cpu), &ctxt);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if ( ret < 0 )
> >                 failwith_xc(xch);
> >
> > @@ -584,10 +599,14 @@ CAMLprim value stub_xc_vcpu_setaffinity(value 
> > xch_val, value domid,
> >                 if (Bool_val(Field(cpumap, i)))
> >                         c_cpumap[i/8] |= 1 << (i&7);
> >         }
> > +
> > +       caml_enter_blocking_section();
> >         retval = xc_vcpu_setaffinity(xch, Int_val(domid),
> >                                      Int_val(vcpu),
> >                                      c_cpumap, NULL,
> >                                      XEN_VCPUAFFINITY_HARD);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         free(c_cpumap);
> >
> >         if (retval < 0)
> > @@ -612,10 +631,13 @@ CAMLprim value stub_xc_vcpu_getaffinity(value 
> > xch_val, value domid,
> >         if (c_cpumap == NULL)
> >                 failwith_xc(xch);
> >
> > +       caml_enter_blocking_section();
> >         retval = xc_vcpu_getaffinity(xch, Int_val(domid),
> >                                      Int_val(vcpu),
> >                                      c_cpumap, NULL,
> >                                      XEN_VCPUAFFINITY_HARD);
>
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (retval < 0) {
> >                 free(c_cpumap);
> >                 failwith_xc(xch);
> > @@ -639,9 +661,13 @@ CAMLprim value stub_xc_sched_id(value xch_val)
> >  {
> >         CAMLparam1(xch_val);
> >         xc_interface *xch = xch_of_val(xch_val);
> > -       int sched_id;
> > +       int ret, sched_id;
> > +
> > +       caml_enter_blocking_section();
> > +       ret = xc_sched_id(xch, &sched_id);
> > +       caml_leave_blocking_section();
> >
> > -       if (xc_sched_id(xch, &sched_id))
> > +       if (ret)
> >                 failwith_xc(xch);
> >
> >         CAMLreturn(Val_int(sched_id));
> > @@ -674,7 +700,10 @@ CAMLprim value stub_xc_evtchn_reset(value xch_val, 
> > value domid)
> >         xc_interface *xch = xch_of_val(xch_val);
> >         int r;
> >
> > +       caml_enter_blocking_section();
> >         r = xc_evtchn_reset(xch, Int_val(domid));
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (r < 0)
> >                 failwith_xc(xch);
> >         CAMLreturn(Val_unit);
> > @@ -811,7 +840,10 @@ CAMLprim value stub_xc_send_debug_keys(value xch_val, 
> > value keys)
> >         xc_interface *xch = xch_of_val(xch_val);
> >         int r;
> >
> > +       caml_enter_blocking_section();
> >         r = xc_send_debug_keys(xch, String_val(keys));
>
> This is clearly unsafe because String_val dereferences an OCaml value
> with the lock released, which is not allowed.
> You need to copy the string to a C string, and free it afterwards.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (r)
> >                 failwith_xc(xch);
> >         CAMLreturn(Val_unit);
> > @@ -952,7 +984,11 @@ CAMLprim value stub_xc_domain_set_memmap_limit(value 
> > xch_val, value domid,
> >         int retval;
> >
> >         v = Int64_val(map_limitkb);
> > +
> > +       caml_enter_blocking_section();
> >         retval = xc_domain_set_memmap_limit(xch, Int_val(domid), v);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (retval)
> >                 failwith_xc(xch);
> >
> > @@ -1203,8 +1239,11 @@ CAMLprim value 
> > stub_xc_domain_ioport_permission(value xch_val, value domid,
> >         c_nr_ports = Int_val(nr_ports);
> >         c_allow = Bool_val(allow);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_domain_ioport_permission(xch, Int_val(domid),
> >                                          c_start_port, c_nr_ports, c_allow);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (ret < 0)
> >                 failwith_xc(xch);
> >
> > @@ -1225,8 +1264,11 @@ CAMLprim value stub_xc_domain_iomem_permission(value 
> > xch_val, value domid,
> >         c_nr_pfns = Nativeint_val(nr_pfns);
> >         c_allow = Bool_val(allow);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_domain_iomem_permission(xch, Int_val(domid),
> >                                          c_start_pfn, c_nr_pfns, c_allow);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (ret < 0)
> >                 failwith_xc(xch);
> >
> > @@ -1245,8 +1287,11 @@ CAMLprim value stub_xc_domain_irq_permission(value 
> > xch_val, value domid,
> >         c_pirq = Int_val(pirq);
> >         c_allow = Bool_val(allow);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_domain_irq_permission(xch, Int_val(domid),
> >                                        c_pirq, c_allow);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (ret < 0)
> >                 failwith_xc(xch);
> >
> > @@ -1309,7 +1354,9 @@ CAMLprim value 
> > stub_xc_domain_test_assign_device(value xch_val, value domid, val
> >         func = Int_val(Field(desc, 3));
> >         sbdf = encode_sbdf(domain, bus, dev, func);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_test_assign_device(xch, Int_val(domid), sbdf);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> >
> >         CAMLreturn(Val_bool(ret == 0));
> >  }
> > @@ -1328,8 +1375,10 @@ CAMLprim value stub_xc_domain_assign_device(value 
> > xch_val, value domid, value de
> >         func = Int_val(Field(desc, 3));
> >         sbdf = encode_sbdf(domain, bus, dev, func);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_assign_device(xch, Int_val(domid), sbdf,
> >                                XEN_DOMCTL_DEV_RDM_RELAXED);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> >
> >         if (ret < 0)
> >                 failwith_xc(xch);
> > @@ -1350,7 +1399,9 @@ CAMLprim value stub_xc_domain_deassign_device(value 
> > xch_val, value domid, value
> >         func = Int_val(Field(desc, 3));
> >         sbdf = encode_sbdf(domain, bus, dev, func);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_deassign_device(xch, Int_val(domid), sbdf);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> >
> >         if (ret < 0)
> >                 failwith_xc(xch);
> > @@ -1379,8 +1430,11 @@ CAMLprim value stub_xc_get_cpu_featureset(value 
> > xch_val, value idx)
> >                 /* To/from hypervisor to retrieve actual featureset */
> >                 uint32_t fs[fs_len], len = fs_len;
> >                 unsigned int i;
> > +               int ret;
> >
> > -               int ret = xc_get_cpu_featureset(xch, Int_val(idx), &len, 
> > fs);
> > +               caml_enter_blocking_section();
> > +               ret = xc_get_cpu_featureset(xch, Int_val(idx), &len, fs);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +               caml_leave_blocking_section();
> >
> >                 if (ret)
> >                         failwith_xc(xch);
> > @@ -1403,7 +1457,10 @@ CAMLprim value stub_xc_watchdog(value xch_val, value 
> > domid, value timeout)
> >         int ret;
> >         unsigned int c_timeout = Int32_val(timeout);
> >
> > +       caml_enter_blocking_section();
> >         ret = xc_watchdog(xch, Int_val(domid), c_timeout);
>
> Haven't checked these anymore, but I assume they are GC roots.
>
> > +       caml_leave_blocking_section();
> > +
> >         if (ret < 0)
> >                 failwith_xc(xch);
> >
> > --
> > 2.39.2
> >



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.