Xen project Mailing List

Re: [PATCH v2 2/3] xen/domain: fix UBSAN null pointer dereference of d->shared_info

To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>

Date: Wed, 3 Jun 2026 10:18:38 +0200

Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=google header.d=suse.com header.i="@suse.com" header.h="Content-Transfer-Encoding:In-Reply-To:Autocrypt:From:Content-Language:References:Cc:To:Subject:User-Agent:MIME-Version:Date:Message-ID"

Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Wed, 03 Jun 2026 08:18:50 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 03.06.2026 10:07, Oleksii Kurochko wrote: > On 6/3/26 7:54 AM, Jan Beulich wrote: >> On 02.06.2026 18:11, Oleksii Kurochko wrote: >>> On 6/2/26 1:19 PM, Jan Beulich wrote: >>>> On 25.05.2026 15:20, Oleksii Kurochko wrote: >>>>> It is legal to have d->shared_info equal to NULL for architectures which >>>>> support only the FIFO ABI for event channel management. >>>>> >>>>> Having d->shared_info == NULL leads to a UBSAN issue on such >>>>> architectures: >>>>> UBSAN: Undefined behaviour in common/domain.c:325:10 >>>>> member access within null pointer of type 'struct >>>>> shared_info_t' >>>>> >>>>> vcpu_info_reset() maps v->vcpu_info_area.map to the per-vcpu slot inside >>>>> the domain's shared_info page for vcpus with id < XEN_LEGACY_MAX_VCPUS, >>>>> and falls back to dummy_vcpu_info for vcpus beyond that limit. >>>>> Extend the existing fallback condition to also cover the case where no >>>>> shared_info page has been allocated, mapping the vcpu to dummy_vcpu_info >>>>> instead. This is the correct behaviour: dummy_vcpu_info already serves >>>>> as the safe stand-in for vcpus that have no usable shared_info slot. >>>>> >>>>> Additionally, if an architecture supports only the FIFO ABI, setup_ports() >>>>> should be updated to avoid a NULL pointer dereference of d->shared_info, >>>>> since in that case there will be no pending events in >>>>> shared_info->evtchn_pending and the pending flag of the FIFO event channel >>>>> does not need to be set to true. >>>>> update_domain_wallclock_time() accesses d->shared_info via shared_info() >>>>> macro. On architectures that do not allocate a shared_info page (currently >>>>> RISC-V, which runs guests in dom0less mode without the PV ABI), this >>>>> causes >>>>> a NULL dereference. The early return is safe: if there is no shared_info >>>>> page, there is nothing to update. For all existing architectures (x86, >>>>> ARM) >>>>> that do allocate it, the guard is never taken and behavior is unchanged. >>>>> >>>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx> >>>>> --- >>>>> Changes in v2: >>>>> - Update commit message + subject. >>>>> - Drop Fixes tag. >>>>> - Handle migration of pending events from 2L and FIFO ABIs when arch >>>>> support only FIFO ABI. >>>> >>>> What does this item describe? On an arch supporting only FIFO, how could >>>> evtchn need migrating from 2L? >>> >>> Agree this item is inaccurate. >>> >>> evtchn_init() always calls evtchn_2l_init(d) first >>> (event_channel.c:1627), so every domain starts with 2L regardless of >>> arch (of course, it is just initialization of evtchn_port_ops which >>> aren't really used when only FIFO is supported). >>> >>> setup_ports() is called during the guest-initiated 2L→FIFO transition >>> (event_fifo.c:637), not at arch init time. There is no arch that >>> supports "only FIFO" as a starting state and that is why it is needed to >>> guard setup_ports() against NULL d->shared_info when migrating 2L >>> pending state to FIFO even 2L wasn't really used by an arch with only >>> FIFO support. >> >> Imo on arch-es not supporting 2L, domains shouldn't start in 2L mode. > > Agree but will it be easy to achieve now with the current code base? > > The best what could be done it is avoid calling evtchn_2l_init() now in > event_channel.c and: > 1. Add a new Kconfig symbol, CONFIG_HAS_EVTCHN_2L (or re-use > HAS_SHARED_INFO suggested before), selected by x86 and ARM. I'd stick to just HAS_SHARED_INFO as long as a separate control for 2- level evtchn isn't strictly needed. > 2. In evtchn_init() (event_channel.c:1627), guard the call: > #ifdef CONFIG_HAS_EVTCHN_2L > evtchn_2l_init(d); > #else > evtchn_none_init(d); > #endif > 3. Add a small stub ops table (probably in event_fifo.c or a new > event_none.c) with no-op set_pending/clear_pending/unmask, is_pending > returning false, is_masked returning true (valid until > evtchn_fifo_init_control() replaces them). That's one of the options (the stubs could then as well live in event_channel.c). Another might be to put the FIFO ops in place right away, making sure they can cope with evtchn_fifo_init_control() not having been called yet. >>>>> --- a/xen/common/domain.c >>>>> +++ b/xen/common/domain.c >>>>> @@ -320,7 +320,7 @@ void vcpu_info_reset(struct vcpu *v) >>>>> struct domain *d = v->domain; >>>>> >>>>> v->vcpu_info_area.map = >>>>> - ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS) >>>>> + ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS && d->shared_info) >>>>> ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id]) >>>>> : &dummy_vcpu_info); >>>>> } >>>> >>>> At the example of this: The extra conditionals are dead code on x86 and >>>> Arm. While the status of the respective Misra rule is still uncertain >>>> for Xen, imo we'd be better off avoiding the introduction of new dead >>>> code. Which in turn means we may need some kind of abstraction to have >>>> these extra conditionals in place only for arch-es not supporting >>>> shared-info at all. >>> >>> What about then add config HAS_SHARED_INFO to xen/common/Kconfig and then: >> >> We're getting closer. Imo we want to go farther, though: shared_info() as a >> construct should be unavailable when !HAS_SHARED_INFO. _That_ then will >> make obvious (by causing build failures) that all respective use sites were >> properly dealt with. > > I will add then: > > +#ifdef CONFIG_HAS_SHARED_INFO > #define shared_info(d, field) __shared_info(d, (d)->shared_info, > field) > +#endif > > But with doing that we have only option of using #ifdef HAS_SHARED_INFO > in the place where shared_info() is used. If it is fine then I will be > happy to do in this way. Well, I gave a suggestion to avoid such #ifdef-ary, ... >>> --- a/xen/common/domain.c >>> +++ b/xen/common/domain.c >>> @@ -319,10 +319,14 @@ void vcpu_info_reset(struct vcpu *v) >>> { >>> struct domain *d = v->domain; >>> >>> +#ifdef CONFIG_HAS_SHARED_INFO >>> v->vcpu_info_area.map = >>> - ((v->vcpu_id < XEN_LEGACY_MAX_VCPUS && d->shared_info) >>> - ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id]) >>> - : &dummy_vcpu_info); >>> + (v->vcpu_id < XEN_LEGACY_MAX_VCPUS) >>> + ? (vcpu_info_t *)&shared_info(d, vcpu_info[v->vcpu_id]) >>> + : &dummy_vcpu_info; >>> +#else >>> + v->vcpu_info_area.map = &dummy_vcpu_info; >>> +#endif >>> } >> >> I agree with #ifdef here. >> >>> --- a/xen/common/event_fifo.c >>> +++ b/xen/common/event_fifo.c >>> @@ -562,9 +562,10 @@ static void setup_ports(struct domain *d, unsigned >>> int prev_evtchns) >>> >>> evtchn = evtchn_from_port(d, port); >>> >>> - if ( d->shared_info && >>> - guest_test_bit(d, port, &shared_info(d, evtchn_pending)) ) >>> +#ifdef CONFIG_HAS_SHARED_INFO >>> + if ( guest_test_bit(d, port, &shared_info(d, evtchn_pending)) ) >>> evtchn->pending = true; >>> +#endif >> >> While as per above shared_info() would best not exist when !HAS_SHARED_INFO >> (in which case #ifdef may be unavoidable here), an alternative where >> IS_ENABLED() could be used here may want at least considering. E.g. >> causing a link-time failure when shared_info() is used (and not compiled >> out). ... here. There are downsides to this, so which route to go needs settling on. >>> --- a/xen/common/time.c >>> +++ b/xen/common/time.c >>> @@ -94,8 +94,9 @@ void update_domain_wallclock_time(struct domain *d) >>> uint32_t *wc_version; >>> uint64_t sec; >>> >>> - if ( !d->shared_info ) >>> - return; >>> +#ifndef CONFIG_HAS_SHARED_INFO >>> + return; >>> +#endif >>> >>> spin_lock(&wc_lock); >> >> Constructs like this are imo somewhat ugly. Using IS_ENABLED() instead >> would make things at least a little better (again imo). > > Considering mentioned above it would be better to #ifdef the whole buddy > of the function: > > void update_domain_wallclock_time(struct domain *d) > { > +#ifdef CONFIG_HAS_SHARED_INFO > + > uint32_t *wc_version; > uint64_t sec; > > -#ifndef CONFIG_HAS_SHARED_INFO > - return; > -#endif > - > spin_lock(&wc_lock); > > wc_version = &shared_info(d, wc_version); > @@ -120,6 +118,8 @@ void update_domain_wallclock_time(struct domain *d) > *wc_version = version_update_end(*wc_version); > > spin_unlock(&wc_lock); > + > +#endif /* CONFIG_HAS_SHARED_INFO */ > } In which case there's no point having use sites actually call here. I.e. we then may want to have an inline stub when !HAS_SHARED_INFO. > Considering also that shared_info is expected to be used only for 2L > then it would be better to introduce CONFIG_HAS_EVTCHN_2L instead. How does the event channel model in use matter for update_domain_wallclock_time()? Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.