[Xen-devel] XenProject/XenServer QEMU working group minutes, 30th August 2016
QEMU XenServer/XenProject Working group meeting 30th August 2016
================================================================

Attendance
----------

Andrew Cooper
Ian Jackson
Paul Durrant
David Vrabel
Jennifer Herbert

Introduction
------------

Paul Durrant was introduced to the working group. We started by recapping our purpose: to make it possible for QEMU to make hypercalls without excessive privilege, in a way which is upstreamable. A dom0 guest must not be able to abuse the interface to compromise the dom0 kernel.

QEMU Hypercalls - DM op
-----------------------

There has been much discussion on xen-devel. One problem identified is operations with references to other user memory objects, such as "track dirty VRAM" (as used with the VGA buffer). At the moment that is apparently the only such operation, but others may emerge.

The most obvious solution would involve the guest kernel validating the virtual addresses passed, but that would rely on the guest kernel knowing where those objects are. This is to be avoided.

Ian recounted the various proposals on xen-devel, which essentially involve informing the hypervisor, by some means, which virtual addresses the hypercall is talking about. Many of these involved transmitting this information via a separate channel.

Ian suggested providing a way for the kernel to tell the hypervisor which user virtual ranges are DM-op-allowed memory. There would then be a flag in the DM op, at a fixed location, telling the hypervisor that this op only refers to the pre-approved memory.

A scheme of pre-nominating an area in QEMU, perhaps using hypercall buffers, was briefly discussed, as well as a few other ideas, but the conclusion was that this does not really address the problem of future DM ops - of which there could easily be many.
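To make the flag idea concrete, here is a minimal sketch in C. All names (the flag, the header layout, the helper) are assumptions made for this write-up, not the interface that was eventually agreed:

```c
/* Sketch of a fixed-location flag in the DM op telling the
 * hypervisor the op only references kernel-pre-approved memory.
 * DM_OP_F_PREAPPROVED_ONLY and struct dm_op_header are hypothetical
 * names, not the real Xen ABI. */
#include <stdint.h>

#define DM_OP_F_PREAPPROVED_ONLY (1u << 0)  /* hypothetical flag */

/* Assumed fixed header shared by every DM op, so the hypervisor can
 * find the flag without understanding the op-specific payload. */
struct dm_op_header {
    uint32_t op;     /* DM op code */
    uint32_t flags;  /* DM_OP_F_* */
};

/* Hypervisor-side check: must the buffers this op references fall
 * inside the ranges the kernel previously nominated? */
static int op_restricted_to_preapproved(const struct dm_op_header *hdr)
{
    return (hdr->flags & DM_OP_F_PREAPPROVED_ONLY) != 0;
}
```

The point of the fixed location is that the hypervisor can apply the policy before decoding the op, keeping the check independent of individual DM ops.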
Even if we can avoid the problem with special cases for our current set-up, we still need a story for how to add future interfaces with handles without having to change the kernel interface. Once we come up with a story, we would not necessarily have to implement it immediately.

The concept of physically-addressed hypercall buffers was discussed: privcmd could allocate you a region and mmap it into user space, and this would be the only memory used with the hypercalls. A hypercall would tell you the buffer range. Each QEMU would need to be associated with the correct set of physical buffers.

A recent AMD proposal was discussed, which would use only physical addresses, no virtual addresses. The upshot is that we should come up with a solution that is not incompatible with this.

Further ideas were discussed: user code could simply put data in mmapped memory, and refer only to offsets within that buffer; the privcmd driver would fill in the physical details. All DM ops would take three arguments: the DM op, a pointer to a struct, and an optional pointer to a restriction array - the last of which is filled in by the privcmd driver. It was discussed how the privcmd driver must not look at the DM op number - in particular, to know how to validate addresses - as it must remain independent of the API.

A scheme where QEMU calls an ioctl before it drops privileges, to set up restrictions ahead of time, was discussed. One version would work by setting up a range for a given domain or VCPU. The assumption is that all device models running in the same domain have the same virtual address layout. There would then be a flag, in the stable part of the API, saying whether to apply that restriction - any in-kernel DM op would not have the restriction applied. The idea can be extended to have more than one address range, or to have the range explicitly provided in the hypercall. The latter suggestion is preferred; however, each platform would have different valid address ranges, and privcmd is platform-independent.
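The "restrict before dropping privileges" scheme can be sketched as follows. This is an illustrative assumption of how such a privcmd ioctl argument and its kernel-side check might look; none of these names or layouts are the actual Linux privcmd ABI:

```c
/* Hypothetical sketch of the pre-privilege-drop restriction ioctl
 * discussed above.  All names and layouts are illustrative. */
#include <stdint.h>

/* One permitted user virtual address window for DM-op buffers. */
struct privcmd_dm_op_range {
    uint64_t start;   /* first permitted virtual address */
    uint64_t length;  /* length of the permitted window in bytes */
};

/* Argument to a hypothetical PRIVCMD_RESTRICT_DM_OP ioctl, issued
 * once by QEMU for its target domain before dropping privilege. */
struct privcmd_dm_op_restrict {
    uint16_t domid;                       /* domain this QEMU serves */
    uint16_t nr_ranges;                   /* entries in ranges[] */
    struct privcmd_dm_op_range ranges[2]; /* permitted window(s) */
};

/* Kernel-side check: every buffer a later DM op references must fall
 * entirely inside one of the pre-approved windows.  Written to avoid
 * address arithmetic overflow. */
static int dm_op_buffer_ok(const struct privcmd_dm_op_restrict *r,
                           uint64_t addr, uint64_t size)
{
    for (unsigned int i = 0; i < r->nr_ranges; i++) {
        const struct privcmd_dm_op_range *rg = &r->ranges[i];
        if (addr >= rg->start && size <= rg->length &&
            addr - rg->start <= rg->length - size)
            return 1;
    }
    return 0;
}
```

Because the check only compares addresses against pre-registered ranges, privcmd never needs to interpret the DM op number, which is exactly the independence property discussed above.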
It was discussed how a function could be created to return the valid ranges for a given platform, but this was not considered an elegant solution. The third parameter of the DM op could be an array of ranges, where the common case for virtual addresses might be 0-3GB, but for physical addresses it might be quite fragmented.

A further idea was proposed: extend the DM op to have a fixed part plus an array of guest handles that the kernel can audit. The arguments would be:

Arg1: Domain ID
Arg2: Guest handle array of tuples (address, size)
Arg3: Number of guest handles

The first element of the array would be the DM op structure itself, containing the DM op code and the other arguments to the particular op. The privcmd driver would only pass through what is provided by the user. Any extra elements would be ignored by the hypercall, and if there were insufficient elements, the hypercall code would see a NULL and be able to fail gracefully.

The initial block (of DM arguments) passed in the array would be copied into pre-zeroed memory of the maximum op size, having checked that the size is not greater than this. There is no need to check a minimum: the buffer is initialised to zero, so a zero-length block would result in op 0 being called. Functions/macros could be created to make retrieving such a block easier. Any further blocks needed would be referred to implicitly, as a given DM op knows it will have put buffer foo in array position bar; it would then use the provided functions/macros to retrieve it.

This last idea was compared with the proposal previously posted to xen-devel by Ian. This scheme is slightly messier in the DM op code, having to refer to numbers instead of fields; however, the pros are that:

* It is more extensible.
* It does not involve providing a new, unusual copy-to-user-memory macro that could be misused, with security implications.
* The restriction is bound to the specific call, and can vary.
* privcmd is slightly simpler: it can just call access_ok().
* It is compatible with a physical-address scheme.
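The handle-array layout and the copy-into-pre-zeroed-buffer step above can be sketched as follows. Struct and function names, and the maximum size, are assumptions for this write-up, not the interface that was eventually upstreamed:

```c
/* Illustrative sketch of the handle-array DM op proposal.  The
 * types, names and DM_OP_MAX_SIZE are hypothetical. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arg2: one (address, size) tuple per buffer the op references. */
struct dm_op_buf {
    void    *data;  /* user virtual address of the buffer */
    uint64_t size;  /* size of the buffer in bytes */
};

/* Element 0 of the array is the DM op structure itself. */
struct dm_op {
    uint32_t op;    /* DM op code */
    uint32_t pad;
    uint64_t arg;   /* op-specific argument, for illustration */
};

#define DM_OP_MAX_SIZE 64  /* assumed maximum fixed-part size */

/* Retrieve the fixed part (buffer 0): copy it into pre-zeroed
 * memory of the maximum op size, rejecting anything larger.  No
 * minimum check is needed - a zero-length block leaves the buffer
 * all zeroes, so op 0 is called. */
static int copy_dm_op(struct dm_op *dst, const struct dm_op_buf *bufs,
                      unsigned int nr_bufs)
{
    unsigned char raw[DM_OP_MAX_SIZE] = { 0 };  /* pre-zeroed */

    if (nr_bufs == 0 || bufs[0].data == NULL)
        return -1;                   /* insufficient handles: fail */
    if (bufs[0].size > sizeof(raw))
        return -1;                   /* fixed part too large */

    memcpy(raw, bufs[0].data, bufs[0].size);
    memcpy(dst, raw, sizeof(*dst));
    return 0;
}
```

Further buffers (array positions 1, 2, ...) would be fetched the same way by op-specific code, which knows which position holds which buffer.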
David agreed to write this idea up in a design document. He will not need to discuss any individual DM ops, but should describe the pros and cons compared with the other ideas on the table.

XenStore
--------

The xs-restrict mechanism was summarised, along with its limitation: it does not work through the kernel XenStore driver, which is needed to talk to a XenStore domain. One way to fix this would be to create a wrapper. Another approach is to try to remove XenStore from all non-privileged parts of QEMU, as it is thought there is not much use remaining; protocols such as QMP would be used instead. PV drivers such as qdisk could be run in a separate QEMU process - for which a patch exists. There were concerns this would take a lot of time to achieve.

Although time ran out, it was vaguely concluded that multiple approaches could be pursued in parallel: initially xs-restrict would be used as-is, and then the XenStore wrapper could be developed alongside efforts to reduce XenStore use in QEMU. Even with the XenStore wrapper, QEMU may benefit from reducing the number of communication protocols in use - i.e. removing XenStore use.

Action items
------------

David: Write up the latest DM op proposal.
Jenny: Write up minutes and arrange the next meeting.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel