[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 2/6] x86/HVM: split restore state checking from state loading



On Tue, Dec 05, 2023 at 09:52:31AM +0100, Jan Beulich wrote:
> On 04.12.2023 18:27, Roger Pau Monné wrote:
> > On Tue, Nov 28, 2023 at 11:34:04AM +0100, Jan Beulich wrote:
> >> ..., at least as reasonably feasible without making a check hook
> >> mandatory (in particular strict vs relaxed/zero-extend length checking
> >> can't be done early this way).
> >>
> >> Note that only one of the two uses of hvm_load() is accompanied with
> >> hvm_check(). The other directly consumes hvm_save() output, which ought
> >> to be well-formed. This means that while input data related checks don't
> >> need repeating in the "load" function when already done by the "check"
> >> one (albeit assertions to this effect may be desirable), domain state
> >> related checks (e.g. has_xyz(d)) will be required in both places.
> >>
> >> Suggested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> ---
> >> Do we really need all the copying involved in use of _hvm_read_entry()
> >> (backing hvm_load_entry()? Zero-extending loads are likely easier to
> >> handle that way, but for strict loads all we gain is a reduced risk of
> >> unaligned accesses (compared to simply pointing into h->data[]).
> > 
> > See below, but I wonder whether the checks could be performed as part
> > of hvm_load() without having to introduce a separate handler and loop
> > over the context entries.
> 
> Specifically not. State loading (in the longer run) would better not fail
> once started. (Imo it should have been this way from the beginning.) Only
> then will the vCPU still be in a predictable state even after a possible
> error.

Looking at the callers, does such predictable state after failure
matter?

One caller is an hypercall used by the toolstack at domain create,
failing can just lead to the domain being destroyed.  The other caller
is vm fork, which will also lead to the fork being destroyed if
context loading fails.

Maybe I'm overlooking something.

> >> Would the hvm_sr_handlers[] better use array_access_nospec()?
> > 
> > Maybe?  Given this is a domctl I do wonder whether a domain already
> > having access to such interface won't have easier ways to leak data
> > from Xen.  Maybe for a disaggregated setup.
> 
> Hmm, now we're in the middle - Andrew effectively said "no need to".

I'm certainly not an expert on whether array_access_nospec() should be
used, so if Andrew says no need, that's likely better advice.

Maybe the xsm check used in such desegregated setups would already
stop speculation?

> >> @@ -275,6 +281,78 @@ int hvm_save(struct domain *d, hvm_domai
> >>      return 0;
> >>  }
> >>  
> >> +int hvm_check(const struct domain *d, hvm_domain_context_t *h)
> >> +{
> >> +    const struct hvm_save_header *hdr;
> >> +    int rc;
> >> +
> >> +    if ( d->is_dying )
> >> +        return -EINVAL;
> >> +
> >> +    /* Get at the save header, which must be first. */
> >> +    hdr = hvm_get_entry(HEADER, h);
> >> +    if ( !hdr )
> >> +        return -ENODATA;
> >> +
> >> +    rc = arch_hvm_check(d, hdr);
> >> +    if ( rc )
> >> +        return rc;
> >> +
> >> +    for ( ; ; )
> >> +    {
> >> +        const struct hvm_save_descriptor *desc;
> >> +        hvm_check_handler handler;
> >> +
> >> +        if ( h->size - h->cur < sizeof(*desc) )
> >> +        {
> >> +            /* Run out of data */
> >> +            printk(XENLOG_G_ERR
> >> +                   "HVM restore %pd: save did not end with a null 
> >> entry\n",
> >> +                   d);
> >> +            return -ENODATA;
> >> +        }
> >> +
> >> +        /* Read the typecode of the next entry and check for the 
> >> end-marker. */
> >> +        desc = (const void *)&h->data[h->cur];
> >> +        if ( desc->typecode == HVM_SAVE_CODE(END) )
> >> +        {
> >> +            /* Reset cursor for hvm_load(). */
> >> +            h->cur = 0;
> >> +            return 0;
> >> +        }
> >> +
> >> +        /* Find the handler for this entry. */
> >> +        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
> >> +             !hvm_sr_handlers[desc->typecode].name ||
> >> +             !hvm_sr_handlers[desc->typecode].load )
> >> +        {
> >> +            printk(XENLOG_G_ERR "HVM restore %pd: unknown entry typecode 
> >> %u\n",
> >> +                   d, desc->typecode);
> >> +            return -EINVAL;
> >> +        }
> >> +
> >> +        /* Check the entry. */
> >> +        handler = hvm_sr_handlers[desc->typecode].check;
> >> +        if ( !handler )
> >> +        {
> >> +            if ( desc->length > h->size - h->cur - sizeof(*desc) )
> >> +                return -ENODATA;
> >> +            h->cur += sizeof(*desc) + desc->length;
> >> +        }
> >> +        else if ( (rc = handler(d, h)) )
> >> +        {
> >> +            printk(XENLOG_G_ERR
> >> +                   "HVM restore %pd: failed to check %s:%u rc %d\n",
> >> +                   d, hvm_sr_handlers[desc->typecode].name, 
> >> desc->instance, rc);
> >> +            return rc;
> >> +        }
> >> +
> >> +        process_pending_softirqs();
> > 
> > Looking at this, won't it be better to call the check() hooks inside
> > the hvm_load() function instead of duplicating the loop?
> > 
> > I realize that you only perform the checks when the state is loaded
> > from a domctl, but still seems quite a lot of code duplication for
> > little benefit.
> > 
> > hvm_load() could gain an extra parameter to select whether the input
> > must be checked or not, and that would avoid having to iterate twice
> > over the context.
> 
> Well, see above.
> 
> >> +    }
> >> +
> >> +    /* Not reached */
> > 
> > ASSERT_UNREACHABLE() maybe?
> 
> Hmm, I'd find it kind of odd to have such here. While hvm_load() doesn't
> have such either, perhaps that's not a meaningful reference. Adding this
> would make me fear introducing a Misra violation (adding dead code).

But isn't this the purpose of ASSERT_UNREACHABLE() exactly?  IOW:
Misra will need an exception for all usage of ASSERT_UNREACHABLE()
already.

I think ASSERT_UNREACHABLE() is much better than a Not reached
comment: conveys the same information to readers of the code and has
a run-time consequence on debug builds.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.