Xen project Mailing List

Re: [PATCH] xen: Fix event channel callback via INTX/GSI

On Fri, 2020-12-18 at 17:20 -0500, boris.ostrovsky@xxxxxxxxxx wrote: > Are there other cases where this would be useful? If it's just to > test a hypervisor under development I would think that people who are > doing that kind of work are capable of building their own kernel. My > concern is mostly about having yet another boot option that is of > interest to very few people who can easily work around not having it. For hypervisor testing we can just set the Xen major version number in CPUID to 3, and that stops xs_reset_watches() from doing anything. cf. https://lkml.org/lkml/2017/4/10/266 Karim ripped out all this INTX code in 2017 because it was broken, and subsequently put it back because it *was* working for older versions of Xen, due to that "coincidence". The conclusion back then was that if it was put back it should at least *work* consistently, and he was going to send a patch "shortly". This is a that patch :) > > Add a 'no_vector_callback' command line argument to test it. > > This last one should be a separate patch I think. Could do. > > + /* > > + * It doesn't strictly *have* to run on CPU0 but it sure > > + * as hell better process the event channel ports delivered > > + * to CPU0. > > + */ > > + irq_set_affinity(pdev->irq, cpumask_of(0)); > > + > > > Is the concern here that it won't be handled at all? Indeed, the events don't get handled at all if the PCI interrupt lands on a CPU other than zero. When the handler calls xen_hvm_evtchn_do_upcall() that processes pending events for whichever CPU it happens to be running on, and *not* the events pending for CPU0. And the boot stops in xs_reset_watches() waiting (without a timeout) for an interrupt that never gets processed, as before. > And is this related to the issue this patch is addressing? It is required to fix the event channel callback via INTX/GSI, yes. Although it could reasonably be lifted out into a separate patch too. > > static int __init xenbus_probe_initcall(void) > > { > > - if (!xen_domain()) > > - return -ENODEV; > > - > > - if (xen_initial_domain() || xen_hvm_domain()) > > - return 0; > > + /* > > + * Probe XenBus here in the XS_PV case, and also XS_HVM unless we > > + * need to wait for the platform PCI device to come up, which is > > + * the (XEN_PVPVM && !xen_have_vector_callback) case. > > + */ > > + if (xen_store_domain_type == XS_PV || > > + (xen_store_domain_type == XS_HVM && > > + (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback))) > > + xenbus_probe(); > > > > - xenbus_probe(NULL); > > return 0; > > } > > - > > device_initcall(xenbus_probe_initcall); > > > > +int xen_set_callback_via(uint64_t via) > > +{ > > + struct xen_hvm_param a; > > + int ret; > > + > > + a.domid = DOMID_SELF; > > + a.index = HVM_PARAM_CALLBACK_IRQ; > > + a.value = via; > > + > > + ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a); > > + if (ret) > > + return ret; > > + > > + /* > > + * If xenbus_probe_initcall() deferred the xenbus_probe() > > + * due to the callback not functioning yet, we can do it now. > > + */ > > + if (!xenstored_ready && xen_store_domain_type == XS_HVM && > > + IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback) > > > I'd create an is_callback_ready() (or something with a better name) > helper. I pondered that, and indeed dropping the XVM/vector conditions and doing it literally based on whether xen_set_callback_via() had been called at all (and not too early). But it looks like there are cases where Arm doesn't call xen_set_callback_via() at all, and it made more sense to me to live xen_set_callback_via() to sit right here and have those two conditions within a page of each other in juxtaposition, with suitable comments. I think that's probably easier to understand and work with than a "helper". > > > + xenbus_probe(); > > + > > + return ret; > > +} > > +EXPORT_SYMBOL_GPL(xen_set_callback_via); > > + > > /* Set up event channel for xenstored which is run as a local process > > * (this is normally used only in dom0) > > */ > > @@ -817,11 +851,17 @@ static int __init xenbus_init(void) > > break; > > } > > > > - /* Initialize the interface to xenstore. */ > > - err = xs_init(); > > - if (err) { > > - pr_warn("Error initializing xenstore comms: %i\n", err); > > - goto out_error; > > + /* > > + * HVM domains may not have a functional callback yet. In that > > + * case let xs_init() be called from xenbus_probe(), which will > > + * get invoked at an appropriate time. > > + */ > > + if (xen_store_domain_type != XS_HVM) { > > > Can we delay xs_init() for !XS_HVM as well? In other words wait until > after PCI platform device has been probed (on HVM) and then call > xs_init() for everyone. We're half-way there already, because xenbus_probe() *does* happen later as a device_initcall, and I've just made it call xs_init(). We could make it avoid calling xs_init() from xenbus_init() in the XS_HVM *and* XS_PV cases fairly easily, and let xenbus_probe() do it. But right now xenbus_probe() doesn't run for the other cases, so there'd have to be a mode where it *only* calls xs_init() and doesn't do the notifier chain. That seems like more churn that was needed so I didn't do it.

Attachment: smime.p7s
Description: S/MIME cryptographic signature

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.