Xen project Mailing List

Re: [PATCH v1 2/3] xen/domain: fix UBSAN null pointer dereference in vcpu_info_reset()

To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>

Date: Wed, 20 May 2026 16:21:59 +0200


Cc: Baptiste Le Duc <baptiste.le-duc@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Wed, 20 May 2026 14:22:09 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 20.05.2026 15:40, Oleksii Kurochko wrote:
>
>
> On 5/20/26 2:03 PM, Jan Beulich wrote:
>> On 20.05.2026 13:33, Oleksii Kurochko wrote:
>>>
>>>
>>> On 5/19/26 1:53 PM, Jan Beulich wrote:
>>>> On 19.05.2026 13:22, Oleksii Kurochko wrote:
>>>>> On 5/19/26 12:55 PM, Oleksii Kurochko wrote:
>>>>>> On 5/19/26 11:37 AM, Jan Beulich wrote:
>>>>>>> On 19.05.2026 10:39, Oleksii Kurochko wrote:
>>>>>>>> vcpu_info_reset() maps v->vcpu_info_area.map to the per-vcpu slot inside
>>>>>>>> the domain's shared_info page for vcpus with id < XEN_LEGACY_MAX_VCPUS,
>>>>>>>> and falls back to dummy_vcpu_info for vcpus beyond that limit.
>>>>>>>>
>>>>>>>> However, it does not guard against d->shared_info being NULL. The
>>>>>>>> shared_info() macro expands to a member access through d->shared_info,
>>>>>>>> so when an architecture does not allocate a shared_info page the
>>>>>>>> dereference triggers UBSAN:
>>>>>>>>   UBSAN: Undefined behaviour in common/domain.c:325:10
>>>>>>>>   member access within null pointer of type 'struct shared_info_t'
>>>>>>>>
>>>>>>>> Extend the existing fallback condition to also cover the case where no
>>>>>>>> shared_info page has been allocated, mapping the vcpu to dummy_vcpu_info
>>>>>>>> instead. This is the correct behaviour: dummy_vcpu_info already serves
>>>>>>>> as the safe stand-in for vcpus that have no usable shared_info slot.
>>>>>>>>
>>>>>>>> Fixes: 295514ff75506 ("common: convert vCPU info area registration")
>>>>>>>
>>>>>>> I question this, largely (but not only) because I also ...
>>>>>>>
>>>>>>>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
>>>>>>>> Reviewed-by: Baptiste Le Duc <baptiste.le-duc@xxxxxxxxxx>
>>>>>>>> ---
>>>>>>>> RISC-V does not allocate a shared_info page at the moment because its
>>>>>>>> guests run in dom0less mode and do not use the Xen PV ABI, so
>>>>>>>> d->shared_info remains NULL throughout domain lifetime.
>>>>>>>
>>>>>>> ... question this mode of operation. Yes, you may (for now) be able to get
>>>>>>> away without, but e.g. event channels will want supporting at some point.
>>>>>>> Which will require a shared info page. Better put that in place right away,
>>>>>>> even if the guests you test with don't use it (yet). Certain other common
>>>>>>> code also assumes d->shared_info to never be NULL for an alive domain.
>>>>>>>
>>>>>>
>>>>>> Would it be fine then to allocate it in arch_domain_create() ... :
>>>>>>
>>>>>>     if ( (d->shared_info = alloc_xenheap_pages(0, 0)) == NULL )
>>>>>>         goto fail;
>>>>>>
>>>>>>     clear_page(d->shared_info);
>>>>>>
>>>>>> ... but without calling share_xen_page_with_guest() after that
>>>>>> allocation as share_xen_page_with_guest() isn't implemented at the
>>>>>> moment?
>>>>
>>>> I would have said "yes" here, but ...
>>>>
>>>>> Or could it be an option for all arch-s move allocation of
>>>>> d->shared_info to domain_create() in common just after
>>>>> arch_domain_create()?
>>>>
>>>> ... Andrew's reply pretty much rules out not only this option, but the
>>>> shared-info-page concept as a whole (for RISC-V). See my reply there. In
>>>> the meantime, the change as suggested may then indeed be what we want to
>>>> go with, albeit (a) with a better description and (b) perhaps covering
>>>> all d->shared_info uses.
>>>
>>> Looking at guest kernel code (Linux), FIFO is tried first, so if RISC-V
>>> is going to support only FIFO, d->shared_info could legally be NULL.
>>>
>>> Looking at the Xen side, if an architecture decides to support only
>>> FIFO, d->shared_info is touched only in vcpu_info_reset(), which is
>>> called from vcpu_create().
>>>
>>> All other places where d->shared_info is accessed should not be
>>> reachable except for one case in event_fifo.c: when a guest issues the
>>> EVTCHNOP_init_control hypercall, setup_ports() reads from shared_info(d,
>>> evtchn_pending):
>>> static void setup_ports(struct domain *d, unsigned int prev_evtchns)
>>> {
>>>     ...
>>>         if ( guest_test_bit(d, port, &shared_info(d, evtchn_pending)) )
>>>             evtchn->pending = true;
>>>     ...
>>>     }
>>> }
>>>
>>> This looks like it handles the transition from the 2L ABI to the FIFO
>>> ABI: if a guest started with 2L and then switched to FIFO, any events
>>> already pending in shared_info(d, evtchn_pending) need to be migrated to
>>> FIFO's per-channel evtchn->pending flag. But it looks like I am missing
>>> something here as I mentioned at the start that Linux uses either FIFO or 2L.
>>>
>>> Am I missing something?
>>
>> Quite likely you aren't, but I didn't check. My earlier "covering all" may
>> well resolve to merely stating things accordingly in the patch description.
>
> If either FIFO or 2L can be used, shouldn't guest_test_bit(d, port,
> &shared_info(d, evtchn_pending)) in setup_ports() be dropped? If FIFO
> was chosen by Linux, there won't be any events in &shared_info(d,
> evtchn_pending), so it is essentially dead code that could just be
> dropped.

Why would it be dead code? Who said that a guest couldn't do 2L for a while,
then switch to FIFO? Think of boot loaders, for example.

Jan

> Or would it be better to leave it and skip only if
> d->shared_info is allocated: if ( d->shared_info && guest_test_bit(...)
> ) to cover the case when a guest wants to switch from 2L to FIFO (if
> that is even a possible case at all, since as I mentioned above, the
> guest (Linux) chooses the event ABI once and it stays for its lifetime)?
>
> ~ Oleksii
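For readers following the thread, the two NULL guards under discussion can be modelled in a small standalone sketch. This is not the Xen code: the struct layouts are simplified stand-ins, LEGACY_MAX_VCPUS is an illustrative placeholder for XEN_LEGACY_MAX_VCPUS, and vcpu_info_reset_sketch() and port_pending_in_2l() are hypothetical names. It only shows the shape of the fallback to dummy_vcpu_info and of a guarded read of the 2L pending bitmap, assuming a domain may legitimately have no shared_info page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative placeholder for XEN_LEGACY_MAX_VCPUS; not the real value source. */
#define LEGACY_MAX_VCPUS 32

/* Simplified stand-ins for the Xen structures (assumed layouts). */
struct vcpu_info { unsigned char evtchn_upcall_pending; };

struct shared_info {
    struct vcpu_info vcpu_info[LEGACY_MAX_VCPUS];
    unsigned long evtchn_pending[1];     /* 2L pending bitmap, shortened */
};

struct domain { struct shared_info *shared_info; };

struct vcpu {
    unsigned int vcpu_id;
    struct domain *domain;
    struct vcpu_info *info;              /* stands in for v->vcpu_info_area.map */
};

static struct vcpu_info dummy_vcpu_info;

/*
 * Hypothetical model of the proposed change: fall back to dummy_vcpu_info
 * both for vCPU ids beyond the legacy limit and when no shared_info page
 * was ever allocated for the domain.
 */
static void vcpu_info_reset_sketch(struct vcpu *v)
{
    struct domain *d;

    assert(v != NULL && v->domain != NULL);
    d = v->domain;

    if ( d->shared_info == NULL || v->vcpu_id >= LEGACY_MAX_VCPUS )
        v->info = &dummy_vcpu_info;
    else
        v->info = &d->shared_info->vcpu_info[v->vcpu_id];
}

/*
 * Hypothetical model of the guarded check floated for setup_ports(): only
 * consult the 2L bitmap if the page exists, so an event left pending by an
 * earlier 2L user (e.g. a boot loader) is still carried over to FIFO.
 */
static bool port_pending_in_2l(const struct domain *d, unsigned int port)
{
    const unsigned int bits = sizeof(unsigned long) * 8;

    if ( d->shared_info == NULL )
        return false;                    /* no 2L state can exist */

    if ( port >= bits * (sizeof(d->shared_info->evtchn_pending) /
                         sizeof(unsigned long)) )
        return false;                    /* outside the shortened bitmap */

    return (d->shared_info->evtchn_pending[port / bits] >> (port % bits)) & 1UL;
}
```

With the guard in place, a domain without a shared_info page simply reports no pending 2L events, while a domain that did use 2L before switching keeps the migration behaviour described above.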
