[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen: privcmd: Add support for irqfd
On 13.07.23 10:44, Juergen Gross wrote: Hello all. > On 12.07.23 10:48, Viresh Kumar wrote: >> Xen provides support for injecting interrupts to the guests via the >> HYPERVISOR_dm_op() hypercall. The same is used by the Virtio based >> device backend implementations, in an inefficient manner currently. >> >> Generally, the Virtio backends are implemented to work with the Eventfd >> based mechanism. In order to make such backends work with Xen, another >> software layer needs to poll the Eventfds and raise an interrupt to the >> guest using the Xen based mechanism. This results in an extra context >> switch. >> >> This is not a new problem in Linux though. It is present with other >> hypervisors like KVM, etc. as well. The generic solution implemented in >> the kernel for them is to provide an IOCTL call to pass the interrupt >> details and eventfd, which lets the kernel take care of polling the >> eventfd and raising of the interrupt, instead of handling this in user >> space (which involves an extra context switch). >> >> This patch adds support to inject a specific interrupt to guest using >> the eventfd mechanism, by preventing the extra context switch. >> >> Inspired by existing implementations for KVM, etc.. Viresh, great work! Do you perhaps have corresponding users-space (virtio backend) example adopted for that feature (I would like to take a look at it if possible)? >> >> Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx> >> --- >> drivers/xen/privcmd.c | 285 ++++++++++++++++++++++++++++++++++++- >> include/uapi/xen/privcmd.h | 14 ++ >> 2 files changed, 297 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c >> index e2f580e30a86..e8096b09c113 100644 >> --- a/drivers/xen/privcmd.c >> +++ b/drivers/xen/privcmd.c >> @@ -9,11 +9,16 @@ >> #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt >> +#include <linux/eventfd.h> >> +#include <linux/file.h> >> #include <linux/kernel.h> >> #include <linux/module.h> >> +#include <linux/mutex.h> >> +#include <linux/poll.h> >> #include <linux/sched.h> >> #include <linux/slab.h> >> #include <linux/string.h> >> +#include <linux/workqueue.h> >> #include <linux/errno.h> >> #include <linux/mm.h> >> #include <linux/mman.h> >> @@ -833,6 +838,266 @@ static long privcmd_ioctl_mmap_resource(struct >> file *file, >> return rc; >> } >> +/* Irqfd support */ >> +static struct workqueue_struct *irqfd_cleanup_wq; >> +static DEFINE_MUTEX(irqfds_lock); >> +static LIST_HEAD(irqfds_list); >> + >> +struct privcmd_kernel_irqfd { >> + domid_t dom; >> + u8 level; >> + u32 irq; >> + struct eventfd_ctx *eventfd; >> + struct work_struct shutdown; >> + wait_queue_entry_t wait; >> + struct list_head list; >> + poll_table pt; >> +}; >> + >> +/* From xen/include/public/hvm/dm_op.h */ >> +#define XEN_DMOP_set_irq_level 19 >> + >> +struct xen_dm_op_set_irq_level { >> + u32 irq; >> + /* IN - Level: 0 -> deasserted, 1 -> asserted */ >> + u8 level; >> + u8 pad[3]; >> +}; >> + >> +struct xen_dm_op { >> + u32 op; >> + u32 pad; >> + union { >> + /* >> + * There are more structures here, we won't be using them, so >> + * can skip adding them here. >> + */ >> + struct xen_dm_op_set_irq_level set_irq_level; >> + } u; >> +}; > > Instead of copying definitions over from Xen into privcmd.c, please just > update > the related linux header include/xen/interface/dm_op.h from the Xen public > header. > >> + >> +static void irqfd_deactivate(struct privcmd_kernel_irqfd *kirqfd) >> +{ >> + lockdep_assert_held(&irqfds_lock); >> + >> + list_del_init(&kirqfd->list); >> + queue_work(irqfd_cleanup_wq, &kirqfd->shutdown); >> +} >> + >> +static void irqfd_shutdown(struct work_struct *work) >> +{ >> + struct privcmd_kernel_irqfd *kirqfd = >> + container_of(work, struct privcmd_kernel_irqfd, shutdown); >> + u64 cnt; >> + >> + eventfd_ctx_remove_wait_queue(kirqfd->eventfd, &kirqfd->wait, &cnt); >> + eventfd_ctx_put(kirqfd->eventfd); >> + kfree(kirqfd); >> +} >> + >> +static void irqfd_inject(struct privcmd_kernel_irqfd *kirqfd) >> +{ >> + struct xen_dm_op dm_op = { >> + .op = XEN_DMOP_set_irq_level, >> + .u.set_irq_level.irq = kirqfd->irq, >> + .u.set_irq_level.level = kirqfd->level, >> + }; >> + struct xen_dm_op_buf xbufs = { >> + .size = sizeof(dm_op), >> + }; >> + u64 cnt; >> + >> + eventfd_ctx_do_read(kirqfd->eventfd, &cnt); >> + set_xen_guest_handle(xbufs.h, &dm_op); >> + >> + xen_preemptible_hcall_begin(); >> + HYPERVISOR_dm_op(kirqfd->dom, 1, &xbufs); > > Please add some error handling, e.g. by issuing a message in case this > hypercall > was failing. Adding a bool "error" to struct privcmd_kernel_irqfd in > order to > avoid multiple error messages for the same device might be a good idea. In addition to provided comments, I would like to mention that this particular dm_op has Arm implementation only in vanilla hypervisor. So this feature cannot be immediately reused on x86 because of XEN_DMOP_set_irq_level at least. As I understand, the x86's variant is XEN_DMOP_set_isa_irq_level (there is also XEN_DMOP_set_pci_intx_level for legacy PCI interrupts). Please note, I am not asking to wire on x86, but maybe it is worth considering to put this new feature under something like Kconfig's XEN_PRIVCMD_IRQFD which depends on Arm for now? Or maybe to put dm_op specific part of irqfd_inject() under CONFIG_ARCH_XXX here or introduce per-ARCH helpers to form a dm_op (Arm's variant will use XEN_DMOP_set_irq_level like in current commit). If not, then at least a sentence in the description mentioning that this works on Arm only needs to be added, I think. Also, I am wondering whether we should foresee the IOCTL_PRIVCMD_IRQFD interface to be suitable for all existing irq related dm_ops (not only XEN_DMOP_set_irq_level) right now or this can be extended/updated later on (when there is a real need)? [snip]
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |