[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Fri, 5 Mar 2021 09:31:03 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QMBqWEWWYNRCX7em1yTs78D2mUBVX5lWHaH9h73/aug=; b=f3014upCBT8W6B5a4KcYDPJgC8ErQOOnEd/KbZ6mBC1wbD2n0CQ458ZU5ehnWLeeJblNPfHLvpZvnl3VNDDgqX4Jl7p7V7kC8HwD0EwwZbZpHwDj9xMybhm+3SOnnn+E2RjxVXgPMicXiKO38WcHLJUSXOiMOow77aN/QziG/rb5YIZ6pVcBtxK4qrsm806Yv1FvTLExJDJB5VXsCZdfhLCWDCILtP6BcYoArdMP0cxFZoweDnildMPq4PM4wQp2oVdxarK3EfDJr1QWjegvsZ9cd3NlkjIXxidY6qQAda6ckbemL1foU9AB4mSsaPppFgNywmIvalknjalhLL3K/w==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=AX0qhKxyJzwEUPHHUfdELEo+hrjyalDHSmpbASzjfl9wm1LbX3WCrFTQYC5fYucacStthwqklAWvsepxd3A5BK/Wp/EPV5CxrKtUBP9yz6rnGUAYARud13OPptAS88f59HJcsNqEz0icakfShsNldhwrmFp1MpOOdyG3CxCO0QO3b2hUDIqhiOCn5QHjDhxMOfuAqBbKJgnTkUun1Et1p6igAW3qInIxBDTLAA3aa8fQKk3Ngf3hqN0C0NTxhBfi2eSLi2BWliqxsupNCfI7/CUWjURwWiCH3kkJh/P5kCRkYKkOjqWTHcJH7Av4ZlZY9M4ZARS3sHdt16wu0i2jCw==
  • Authentication-results: citrix.com; dkim=none (message not signed) header.d=none;citrix.com; dmarc=none action=none header.from=epam.com;
  • Cc: Julien Grall <julien.grall.oss@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Meng Xu <mengxu@xxxxxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 05 Mar 2021 09:35:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHXCYx4A6OUUHr1gkqxWv1TEOLkuqplchSAgAAzdYCAAXE/AIAAtX0AgAAaNQCAABhvgIAAC3gAgADMQgCADFrIAA==
  • Thread-topic: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)

Hi,

Volodymyr Babchuk writes:

> Hi Andrew,
>
> Andrew Cooper writes:
>
>> On 24/02/2021 23:58, Volodymyr Babchuk wrote:
>>> And I am not mentioning x86 support there...
>>
>> x86 uses per-pCPU stacks, not per-vCPU stacks.
>>
>> Transcribing from an old thread which happened in private as part of an
>> XSA discussion, concerning the implications of trying to change this.
>>
>> ~Andrew
>>
>> -----8<-----
>>
>> Here is a partial list off the top of my head of the practical problems
>> you're going to have to solve.
>>
>> Introduction of new SpectreRSB vulnerable gadgets.  I'm really close to
>> being able to drop RSB stuffing and recover some performance in Xen.
>>
>> CPL0 entrypoints need updating across schedule.  SYSCALL entry would
>> need to become a stub per vcpu, rather than the current stub per pcpu.
>> This requires reintroducing a writeable mapping to the TSS (doable) and
>> a shadow stack switch of active stacks (This corner case is so broken it
>> looks to be a blocker for CET-SS support in Linux, and is resulting in
>> some conversation about tweaking Shstk's in future processors).
>>
>> All per-cpu variables stop working.  You'd need to rewrite Xen to use
>> %gs for TLS which will have churn in the PV logic, and introduce the x86
>> architectural corner cases of running with an invalid %gs.  Xen has been
>> saved from a large number of privilege escalation vulnerabilities in
>> common with Linux and Windows by the fact that we don't use %gs, so
>> anyone trying to do this is going to have to come up with some concrete
>> way of proving that the corner cases are covered.
>
> Thank you. This is exactly what I needed. I am not a big specialist in
> x86, but from what I said, I can see that there is no easy way to switch
> contexts while in hypervisor mode.
>
> Then I want to return to a task domain idea, which you mentioned in the
> other thread. If I got it right, it would allow to
>
> 1. Implement asynchronous hypercalls for cases when there is no reason
> to hold calling vCPU in hypervisor for the whole call duration
>

Okay, I was too overexcited there. I mean - surely it is possible to
implement async hypercalls, but there is no immediate profit in this:
such hypercall can't be preempted anyways. On a SMP system you can
offload hypercall to another core, but that's basically all.

> I skimmed through ML archives, but didn't found any discussion about it.

Maybe you can give some hint how to find it?

> As I see it, its implementation would be close to idle domain
> implementation, but a little different.


-- 
Volodymyr Babchuk at EPAM


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.