Xen project Mailing List

Re: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>

Date: Fri, 5 Mar 2021 09:31:03 +0000

Cc: Julien Grall <julien.grall.oss@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Meng Xu <mengxu@xxxxxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Fri, 05 Mar 2021 09:35:11 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

Volodymyr Babchuk writes:

> Hi Andrew,
>
> Andrew Cooper writes:
>
>> On 24/02/2021 23:58, Volodymyr Babchuk wrote:
>>> And I am not mentioning x86 support there...
>>
>> x86 uses per-pCPU stacks, not per-vCPU stacks.
>>
>> Transcribing from an old thread which happened in private as part of an
>> XSA discussion, concerning the implications of trying to change this.
>>
>> ~Andrew
>>
>> -----8<-----
>>
>> Here is a partial list off the top of my head of the practical problems
>> you're going to have to solve.
>>
>> Introduction of new SpectreRSB vulnerable gadgets. I'm really close to
>> being able to drop RSB stuffing and recover some performance in Xen.
>>
>> CPL0 entrypoints need updating across schedule. SYSCALL entry would
>> need to become a stub per vcpu, rather than the current stub per pcpu.
>> This requires reintroducing a writeable mapping to the TSS (doable) and
>> a shadow stack switch of active stacks (This corner case is so broken it
>> looks to be a blocker for CET-SS support in Linux, and is resulting in
>> some conversation about tweaking Shstk's in future processors).
>>
>> All per-cpu variables stop working. You'd need to rewrite Xen to use
>> %gs for TLS which will have churn in the PV logic, and introduce the x86
>> architectural corner cases of running with an invalid %gs. Xen has been
>> saved from a large number of privilege escalation vulnerabilities in
>> common with Linux and Windows by the fact that we don't use %gs, so
>> anyone trying to do this is going to have to come up with some concrete
>> way of proving that the corner cases are covered.
>
> Thank you. This is exactly what I needed. I am not a big specialist in
> x86, but from what I said, I can see that there is no easy way to switch
> contexts while in hypervisor mode.
>
> Then I want to return to a task domain idea, which you mentioned in the
> other thread. If I got it right, it would allow to
>
> 1. Implement asynchronous hypercalls for cases when there is no reason
> to hold calling vCPU in hypervisor for the whole call duration
>

Okay, I was too overexcited there. I mean, it is surely possible to
implement async hypercalls, but there is no immediate profit in this:
such a hypercall can't be preempted anyway. On an SMP system you can
offload the hypercall to another core, but that's basically all.

> I skimmed through ML archives, but didn't found any discussion about
> it. Maybe you can give some hint how to find it?
> As I see it, its implementation would be close to idle domain
> implementation, but a little different.

--
Volodymyr Babchuk at EPAM
