[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen: privcmd: Add support for irqfd


  • To: Juergen Gross <jgross@xxxxxxxx>, Viresh Kumar <viresh.kumar@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>
  • Date: Thu, 13 Jul 2023 14:40:08 +0000
  • Accept-language: en-US, ru-RU
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=7AWXD6xifU4U0eaRhlp3L/5/ivssBJWP3bjtBZPlTns=; b=UeSY0DWNHVnX5y0xI3/BvMmqWSJ1Ck2WieoHj6dVvhx+IiX/XvZ0MYT5dOC67DqZEobyRAEDY9r7zeyJ8/iFeE3KiVpcWzBC7VxgeW8zVvRjp58MtTLWyQmI6r7nMpOHVUw78ti+7BzAtGRax3LNe+lzwnPFQ1J1GrSDj9wx5LCgHOxCk3f9A3xruDmUcGfaWxT97AotYTaUkJfc06DoRfp3NPAV2y0cganSEe9EgfIXuFahorQ5u89ojNWJrmL0btjBhmswNV1Id4V9oCHcIjBwjcUbNx7uihdNilGy9GZaZH+Unmi4UfTjPJwZDxE0dRvswQjhF0qPj9+Fgjk3zQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=PsqAdu2R4EXtx0frvVcKB2jlRCB48n/Gkij4e4PTZmJ/XVgmaUuTGzs36GzFsQS0T9wnNaD9duJasHjB6WAMoM01QVIp7oMENm1MN/G5BdouJJMoV3rf1FfO4grMqY22g+hzf+5xd2gp4vL4Qx6N8yHQ3iGecV7QQf0Sf3IYo1MLOQp6/3g0O6BlznRfD5Zj2J3g6gXr3yxk1ZwZLsLmibsMs06CBpvP9EHMFXL9wjtRGPRFnU7jg5NBa3GRAYusbjaGRheJ3Q2cBwKJ1rB2wsB+LaGbjXcKfcySfKRkhHZ6RiYMGDK1t5ky1k5YTn/RCfiGNwxr9Wj0tNs2TjK/jw==
  • Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>, Alex Bennée <alex.bennee@xxxxxxxxxx>, "stratos-dev@xxxxxxxxxxxxxxxxxxx" <stratos-dev@xxxxxxxxxxxxxxxxxxx>, Erik Schilling <erik.schilling@xxxxxxxxxx>, Manos Pitsidianakis <manos.pitsidianakis@xxxxxxxxxx>, Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 13 Jul 2023 14:40:39 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHZtJ2tw+2Ln+ZUhUa/nIHLslV4fq+3Uo+AgAB0BQA=
  • Thread-topic: [PATCH] xen: privcmd: Add support for irqfd


On 13.07.23 10:44, Juergen Gross wrote:

Hello all.


> On 12.07.23 10:48, Viresh Kumar wrote:
>> Xen provides support for injecting interrupts to the guests via the
>> HYPERVISOR_dm_op() hypercall. The same is used by the Virtio based
>> device backend implementations, in an inefficient manner currently.
>>
>> Generally, the Virtio backends are implemented to work with the Eventfd
>> based mechanism. In order to make such backends work with Xen, another
>> software layer needs to poll the Eventfds and raise an interrupt to the
>> guest using the Xen based mechanism. This results in an extra context
>> switch.
>>
>> This is not a new problem in Linux though. It is present with other
>> hypervisors like KVM, etc. as well. The generic solution implemented in
>> the kernel for them is to provide an IOCTL call to pass the interrupt
>> details and eventfd, which lets the kernel take care of polling the
>> eventfd and raising of the interrupt, instead of handling this in user
>> space (which involves an extra context switch).
>>
>> This patch adds support to inject a specific interrupt to guest using
>> the eventfd mechanism, by preventing the extra context switch.
>>
>> Inspired by existing implementations for KVM, etc..


Viresh, great work!

Do you perhaps have corresponding users-space (virtio backend) example 
adopted for that feature (I would like to take a look at it if possible)?



>>
>> Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>> ---
>>   drivers/xen/privcmd.c      | 285 ++++++++++++++++++++++++++++++++++++-
>>   include/uapi/xen/privcmd.h |  14 ++
>>   2 files changed, 297 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>> index e2f580e30a86..e8096b09c113 100644
>> --- a/drivers/xen/privcmd.c
>> +++ b/drivers/xen/privcmd.c
>> @@ -9,11 +9,16 @@
>>   #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt
>> +#include <linux/eventfd.h>
>> +#include <linux/file.h>
>>   #include <linux/kernel.h>
>>   #include <linux/module.h>
>> +#include <linux/mutex.h>
>> +#include <linux/poll.h>
>>   #include <linux/sched.h>
>>   #include <linux/slab.h>
>>   #include <linux/string.h>
>> +#include <linux/workqueue.h>
>>   #include <linux/errno.h>
>>   #include <linux/mm.h>
>>   #include <linux/mman.h>
>> @@ -833,6 +838,266 @@ static long privcmd_ioctl_mmap_resource(struct 
>> file *file,
>>       return rc;
>>   }
>> +/* Irqfd support */
>> +static struct workqueue_struct *irqfd_cleanup_wq;
>> +static DEFINE_MUTEX(irqfds_lock);
>> +static LIST_HEAD(irqfds_list);
>> +
>> +struct privcmd_kernel_irqfd {
>> +    domid_t dom;
>> +    u8 level;
>> +    u32 irq;
>> +    struct eventfd_ctx *eventfd;
>> +    struct work_struct shutdown;
>> +    wait_queue_entry_t wait;
>> +    struct list_head list;
>> +    poll_table pt;
>> +};
>> +
>> +/* From xen/include/public/hvm/dm_op.h */
>> +#define XEN_DMOP_set_irq_level 19
>> +
>> +struct xen_dm_op_set_irq_level {
>> +    u32 irq;
>> +    /* IN - Level: 0 -> deasserted, 1 -> asserted */
>> +    u8 level;
>> +    u8 pad[3];
>> +};
>> +
>> +struct xen_dm_op {
>> +    u32 op;
>> +    u32 pad;
>> +    union {
>> +        /*
>> +         * There are more structures here, we won't be using them, so
>> +         * can skip adding them here.
>> +         */
>> +        struct xen_dm_op_set_irq_level set_irq_level;
>> +    } u;
>> +};
> 
> Instead of copying definitions over from Xen into privcmd.c, please just 
> update
> the related linux header include/xen/interface/dm_op.h from the Xen public
> header.
> 
>> +
>> +static void irqfd_deactivate(struct privcmd_kernel_irqfd *kirqfd)
>> +{
>> +    lockdep_assert_held(&irqfds_lock);
>> +
>> +    list_del_init(&kirqfd->list);
>> +    queue_work(irqfd_cleanup_wq, &kirqfd->shutdown);
>> +}
>> +
>> +static void irqfd_shutdown(struct work_struct *work)
>> +{
>> +    struct privcmd_kernel_irqfd *kirqfd =
>> +        container_of(work, struct privcmd_kernel_irqfd, shutdown);
>> +    u64 cnt;
>> +
>> +    eventfd_ctx_remove_wait_queue(kirqfd->eventfd, &kirqfd->wait, &cnt);
>> +    eventfd_ctx_put(kirqfd->eventfd);
>> +    kfree(kirqfd);
>> +}
>> +
>> +static void irqfd_inject(struct privcmd_kernel_irqfd *kirqfd)
>> +{
>> +    struct xen_dm_op dm_op = {
>> +        .op = XEN_DMOP_set_irq_level,
>> +        .u.set_irq_level.irq = kirqfd->irq,
>> +        .u.set_irq_level.level = kirqfd->level,
>> +    };
>> +    struct xen_dm_op_buf xbufs = {
>> +        .size = sizeof(dm_op),
>> +    };
>> +    u64 cnt;
>> +
>> +    eventfd_ctx_do_read(kirqfd->eventfd, &cnt);
>> +    set_xen_guest_handle(xbufs.h, &dm_op);
>> +
>> +    xen_preemptible_hcall_begin();
>> +    HYPERVISOR_dm_op(kirqfd->dom, 1, &xbufs);
> 
> Please add some error handling, e.g. by issuing a message in case this 
> hypercall
> was failing. Adding a bool "error" to struct privcmd_kernel_irqfd in 
> order to
> avoid multiple error messages for the same device might be a good idea.


In addition to provided comments, I would like to mention that this 
particular dm_op has Arm implementation only in vanilla hypervisor.

So this feature cannot be immediately reused on x86 because of 
XEN_DMOP_set_irq_level at least. As I understand, the x86's variant is 
XEN_DMOP_set_isa_irq_level (there is also XEN_DMOP_set_pci_intx_level 
for legacy PCI interrupts).

Please note, I am not asking to wire on x86, but maybe it is worth 
considering to put this new feature under something like Kconfig's 
XEN_PRIVCMD_IRQFD which depends on Arm for now?  Or maybe to put dm_op 
specific part of irqfd_inject() under CONFIG_ARCH_XXX here or introduce 
per-ARCH helpers to form a dm_op (Arm's variant will use 
XEN_DMOP_set_irq_level like in current commit). If not, then at least a 
sentence in the description mentioning that this works on Arm only needs 
to be added, I think.

Also, I am wondering whether we should foresee the IOCTL_PRIVCMD_IRQFD 
interface to be suitable for all existing irq related dm_ops (not only 
XEN_DMOP_set_irq_level) right now or this can be extended/updated later 
on (when there is a real need)?



[snip]

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.