
Re: Proposal for physical address based hypercalls



Hi Jan,

On 28/09/2022 11:38, Jan Beulich wrote:
> For quite some time we've been talking about replacing the present virtual
> address based hypercall interface with one using physical addresses.  This is in
> particular a prerequisite to being able to support guests with encrypted
> memory, as for such guests we cannot perform the page table walks necessary to
> translate virtual to (guest-)physical addresses.  But using (guest) physical
> addresses is also expected to help performance of non-PV guests (i.e. all Arm
> ones plus HVM/PVH on x86), because of the no longer necessary address
> translation.

I am not sure this is going to be a performance gain on Arm. In most cases we use the hardware to translate a guest virtual address to a host physical address. But there is no instruction to translate a guest physical address to a host physical address, so we would have to do that translation in software.
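For illustration, the software translation boils down to a table walk like the toy, single-level lookup below (everything here is made up for illustration; a real stage-2/p2m structure is multi-level):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/*
 * Toy, single-level "p2m": one mfn per gfn, 0 meaning "not mapped".
 * Purely illustrative -- real stage-2 tables are multi-level trees.
 */
#define P2M_ENTRIES 16
static uint64_t toy_p2m[P2M_ENTRIES];

/* Translate a guest physical address to a host physical address. */
static int toy_p2m_lookup(uint64_t gpa, uint64_t *hpa)
{
    uint64_t gfn = gpa >> PAGE_SHIFT;

    if (gfn >= P2M_ENTRIES || !toy_p2m[gfn])
        return -1;                      /* not mapped */

    /* Combine the translated frame with the in-page offset. */
    *hpa = (toy_p2m[gfn] << PAGE_SHIFT) | (gpa & (PAGE_SIZE - 1));
    return 0;
}
```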

That said, there are other reasons on Arm (and possibly x86) to get rid of virtual addresses. At the moment, we require the VA to always be valid. This is quite fragile, as we can't fully control how the kernel touches its page-tables (remember that on Arm we need to use break-before-make for any shattering).

I have actually seen some translation failures on Arm32 in the past, but I never fully investigated them because they were hard to reproduce, as they rarely happen.


> Clearly to be able to run existing guests, we need to continue to support the
> present virtual address based interface.  Previously it was suggested to change
> the model on a per-domain basis, perhaps by a domain creation control.  This
> has two major shortcomings:
>   - Entire guest OSes would need to switch over to the new model all in one go.
>     This could be particularly problematic for in-guest interfaces like Linux'es
>     privcmd driver, which is passed hypercall arguments from user space.  Such
>     necessarily use virtual addresses, and hence the kernel would need to learn
>     of all hypercalls legitimately coming in, in order to translate the buffer
>     addresses.  Reaching sufficient coverage there might take some time.
>   - All base components within an individual guest instance which might run in
>     succession (firmware, boot loader, kernel, kexec) would need to agree on the
>     hypercall ABI to use.

> As an alternative I'd like to propose the introduction of a bit (or multiple
> ones, see below) augmenting the hypercall number, to control the flavor of the
> buffers used for every individual hypercall.  This would likely involve the
> introduction of a new hypercall page (or multiple ones if more than one bit is
> to be used), to retain the present abstraction where it is the hypervisor which
> actually fills these pages.  For multicalls the wrapping multicall itself would
> be controlled independently of the constituent hypercalls.
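For concreteness, such a flag bit could look like the sketch below. The bit position and the names are purely illustrative, not part of any actual or proposed Xen ABI:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical encoding: one high bit of the hypercall number selects
 * physical-address ("flat") buffers.  Bit position and names are
 * illustrative only.
 */
#define HYPERCALL_PHYSADDR_BIT  (1u << 30)

/* Request the physical-address flavor of a hypercall. */
static inline uint32_t hypercall_phys(uint32_t nr)
{
    return nr | HYPERCALL_PHYSADDR_BIT;
}

/* Hypervisor side: which buffer flavor did the guest ask for? */
static inline int hypercall_uses_phys(uint32_t nr)
{
    return (nr & HYPERCALL_PHYSADDR_BIT) != 0;
}

/* Recover the underlying hypercall number for dispatch. */
static inline uint32_t hypercall_base_nr(uint32_t nr)
{
    return nr & ~HYPERCALL_PHYSADDR_BIT;
}
```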

> A model involving just a single bit to indicate "flat" buffers has limitations
> when it comes to large buffers passed to a hypercall.  Since in many cases
> hypercalls (currently) allowing for rather large buffers wouldn't normally be
> used with buffers significantly larger than a single page (several of the
> mem-ops for example), special casing the (presumably) few hypercalls which have
> an actual need for large buffers might be an option.

> Another approach would be to build in a scatter/gather model for buffers right
> away.  Jürgen suggests that the low two address bits could be used as a
> "descriptor" here.
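To make sure I understand the suggestion: since a suitably aligned buffer address has its low two bits clear, they could carry a tag, along these lines (tag values entirely made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical use of the two low bits of a 4-byte-aligned buffer
 * address as a descriptor.  Tag values are illustrative only.
 */
enum buf_desc {
    BUF_DESC_FLAT    = 0,  /* address points at a contiguous buffer   */
    BUF_DESC_SG_LIST = 1,  /* address points at a scatter/gather list */
    /* tags 2 and 3 would remain available for future use */
};

#define BUF_DESC_MASK 0x3u

/* Fold the tag into the address; requires 4-byte alignment. */
static inline uintptr_t buf_encode(uintptr_t addr, enum buf_desc tag)
{
    assert((addr & BUF_DESC_MASK) == 0);
    return addr | (uintptr_t)tag;
}

/* Hypervisor side: extract the descriptor tag... */
static inline enum buf_desc buf_tag(uintptr_t val)
{
    return (enum buf_desc)(val & BUF_DESC_MASK);
}

/* ...and the actual buffer address. */
static inline uintptr_t buf_addr(uintptr_t val)
{
    return val & ~(uintptr_t)BUF_DESC_MASK;
}
```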

IIUC, with this approach we would still need to have a bit in the hypercall number to indicate this is not a virtual address. Is that correct?

Cheers,

--
Julien Grall



 

