
Re: Design session PVH dom0



On 26.09.22 09:53, Jan Beulich wrote:
On 23.09.2022 10:20, Juergen Gross wrote:
On 21.09.22 17:53, Marek Marczykowski-Górecki wrote:
Session description (by Jan):
In the course of working on an XSA I finally had to get PVH Dom0 to work, in a
minimal fashion, on at least one of my systems. This turned up a number of
issues, some of which have remained pending since. Therefore I'd like to gain an
understanding of whether there is any future to this mode of Dom0 operation, and
if so, when it can be expected to be better than tech preview or even just
experimental.

...

Jürgen: PVH dom0 performance?

Roger: it's bad; what's mostly relevant are the qemu interfaces

George: only for safety certifications? performance penalty may be okay

Jürgen: hypercalls can be improved (virtual buffers?)

Some more thoughts on this topic: Having hypercall variants with physically
addressed buffers will help, but there is an additional complexity: what
about hypercalls with really large buffers (e.g. the dirty-page bitmap used
for guest migration)? In order to avoid having to allocate huge physically
contiguous buffers for such purposes, we'd probably need something like
scatter/gather lists for hypercall buffers.
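
As a minimal sketch (names and the exact layout here are hypothetical, not an
existing Xen interface), such an SG-list could consist of a small header plus
one 8-byte GFN per page-sized fragment of the buffer:

#include <stdint.h>

/*
 * Hypothetical scatter/gather list for a physically addressed
 * hypercall buffer: a header plus one GFN per frame holding a
 * fragment of the (non-contiguous) buffer.  Illustrative only.
 */
struct xen_hc_sg_list {
    uint32_t offset;    /* byte offset of the data in the first frame */
    uint32_t length;    /* total length of the buffer in bytes */
    uint64_t gfn[];     /* one 8-byte entry per frame of the buffer */
};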

Not sure. I'd rather see us add new (sub)hypercalls for such non-standard
cases. E.g. the interface for the bitmap example you give would be amended
with a new flavor having the caller pass in an array of GFNs (perhaps, as
you say, with further indirection to deal with that array also growing
large). I'd really like to keep the common case simple.
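
As an illustration of such a flavor (all names invented here; no such sub-op
exists), the dirty-bitmap case could get a variant taking the bitmap as a GFN
array rather than a virtually addressed handle:

#include <stdint.h>

typedef uint16_t domid_t;  /* as in Xen's public headers */

/*
 * Hypothetical new sub-op argument: the caller passes the frames
 * making up the dirty bitmap instead of a virtual-address handle.
 * Purely illustrative; not an existing Xen interface.
 */
struct xen_logdirty_phys {
    domid_t  domid;      /* domain whose dirty bitmap is requested */
    uint16_t pad;
    uint32_t nr_gfns;    /* number of frames making up the bitmap */
    uint64_t gfn[];      /* one entry per bitmap frame */
};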

The question is how many hypercalls would be hit by the uncommon case.

Taking a quick glance I spotted:

- grant_table_op (subops setup_table and get_status_frames)
- memory_op (several sub-ops)
- multicall (main list of calls)
- console_io (console data)
- mmuext_op (some ops allow lists)
- xsm_op (not sure a buffer can span pages, but the interface would allow it)
- physdev_op (subop set_iobitmap)
- hvm_op (altp2m handling)
- sysctl (multiple sub-ops)
- domctl (multiple sub-ops)
- hypfs (node data can exceed page size)

Do we really want to special case all of those?

And those might
want to be supported in a generic way. Additionally: what if such an SG-list
exceeds the size of a page? The dirty bitmap of a guest with 64 GB of
RAM already needs 512 pages, so the SG-list for that bitmap would fill a
complete page even assuming only 8 bytes per SG-entry (which would already
limit the general usability).
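
For reference, the arithmetic behind that example (assuming 4 KiB pages and
one dirty bit per page):

/* 64 GiB of guest RAM at 4 KiB per page: */
#define GUEST_PAGES   ((64ULL << 30) / 4096)         /* 16M pages        */
#define BITMAP_BYTES  (GUEST_PAGES / 8)              /* 2 MiB bitmap     */
#define BITMAP_PAGES  (BITMAP_BYTES / 4096)          /* 512 bitmap pages */
#define SG_LIST_BYTES (BITMAP_PAGES * 8)             /* 4096: one page   */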

My favorite solution would be some kind of buffer address qualifier for each
buffer (e.g. virtual, physical, SG-list, maybe nested SG-list). So the new
hypercalls would not mean "physical buffer addresses", but "qualified buffer
addresses". By requiring a minimum of 4-byte alignment for each buffer (can we
do that, at least for the new hypercalls?) this would leave the 2 lowest bits
of a buffer address free for the new qualifier. If an unaligned buffer is ever
needed, it could still be passed via a single-entry SG-list.
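
A sketch of how that qualifier encoding could look (values and names invented
purely for illustration):

/* Hypothetical qualifier in the two low bits of an aligned address: */
#define XEN_BUF_VIRT      0x0UL  /* virtual address (as today)       */
#define XEN_BUF_PHYS      0x1UL  /* physical (guest frame) address   */
#define XEN_BUF_SG        0x2UL  /* address of an SG-list            */
#define XEN_BUF_SG_NESTED 0x3UL  /* address of a list of SG-lists    */
#define XEN_BUF_QUAL_MASK 0x3UL

static inline unsigned long xen_buf_qualifier(unsigned long addr)
{
    return addr & XEN_BUF_QUAL_MASK;   /* extract the qualifier */
}

static inline unsigned long xen_buf_address(unsigned long addr)
{
    return addr & ~XEN_BUF_QUAL_MASK;  /* strip it to get the address */
}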

While this might be an option, I'm not sure I'd be really happy with such
re-use of the low address bits, nor with the implied further restriction
on buffer alignment (most of the structs we use are at least 4-byte aligned,
but I don't think all of them are, plus we also have guest handles to
e.g. arrays of char).

The unaligned cases could be handled dynamically via the single-entry
SG-list.
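
Using the hypothetical SG-list layout sketched above, wrapping an unaligned
buffer would amount to something like:

/*
 * Hypothetical helper: describe an unaligned buffer living within a
 * single frame as a one-entry SG-list.  The (aligned) address of the
 * list then carries the qualifier bits, not the buffer itself.  The
 * caller must have allocated room for the header plus one entry.
 */
static void sg_wrap_unaligned(struct xen_hc_sg_list *sg, uint64_t gfn,
                              uint32_t offset, uint32_t length)
{
    sg->offset = offset;   /* unaligned start within the frame */
    sg->length = length;
    sg->gfn[0] = gfn;      /* the single fragment */
}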


Juergen
