
Re: [Xen-devel] [RFC] Overview of work required to implement mem_access for PV guests

>On 25/11/13 19:39, Aravindh Puthiyaparambil (aravindp) wrote:
>>> On 25/11/13 07:49, Aravindh Puthiyaparambil (aravindp) wrote:
>>>> The mem_access APIs only work with HVM guests that run on Intel
>>>> hardware with EPT support. This effort is to enable it for PV guests
>>>> that run with shadow page tables. To facilitate this, the following
>>>> will be done:
>>> Are you sure that this is only Intel with EPT?  It looks to be a HAP 
>>> feature,
>>> which includes AMD with NPT support.
>> Yes, mem_access is gated on EPT being available.
>> However, I think it is possible to implement this for NPT also.
>So it is - I missed that.
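
For reference, the gate in question is in the mem_event ACCESS_ENABLE path:
mem_access is refused unless the domain uses HAP and the hardware is Intel
EPT. Paraphrasing from memory (not a verbatim quote of
xen/arch/x86/mm/mem_event.c; the helper below does not actually exist):

    /* Illustrative only -- captures the current gating: HAP must be
     * enabled, and only Intel EPT (cpu_has_vmx) is accepted, even though
     * AMD NPT could presumably be made to work as well. */
    static bool_t mem_access_currently_supported(const struct domain *d)
    {
        return hap_enabled(d) && cpu_has_vmx;
    }
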
>>>> 1. A magic page will be created for the mem_access (mem_event) ring
>>>> buffer during the PV domain creation.
>>> Where is this magic page being created from? This will likely have to
>>> be at the behest of a domain creation flag, to avoid making it for the
>>> vast majority of domains which won't want the extra overhead.
>> This page will be similar to the console, xenstore and start_info pages.
>> I can definitely make it depend on a domain creation flag. However, on
>> the HVM side, pages for all mem_events, including mem_access, are created
>> by default.
>> So is it ok to have a domain creation flag just for mem_access for PV guests?
>The start_info and xenstore pages are critical for a PV guest to boot,
>and the console is fairly useful (although not essential).  These pages
>belong to the guest and the guest has full read/write access and control
>over the pages.
>For HVM guests, the special pfns are hidden in the MMIO region, and have
>no access by default.  HVM domains need to use add_to_physmap to get
>access to a subset of the magic pages.
>I do not think it is reasonable for a guest to be able to access its own
>mem_access page, and I am not sure how best to prevent PV guests from
>getting at it.

In the mem_access listener for HVM guests, what happens is that the page is
mapped in and then removed from the physmap of the guest.

I was hoping to do the same for PV guests. Will that not work?
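
To be concrete, the HVM flow I am referring to is the one in
tools/tests/xen-access/xen-access.c: map the ring page via its HVM param,
initialise the ring, then drop the page from the guest physmap. Roughly
(a sketch from memory rather than a verbatim quote; error handling omitted,
and an open xc_interface *xch plus the target domain_id are assumed):

    unsigned long ring_pfn;
    xen_pfn_t pfn;
    void *ring_page;

    /* The ring page's pfn is advertised through an HVM param. */
    xc_get_hvm_param(xch, domain_id, HVM_PARAM_ACCESS_RING_PFN, &ring_pfn);

    /* Map it into the listener. */
    pfn = ring_pfn;
    ring_page = xc_map_foreign_batch(xch, domain_id,
                                     PROT_READ | PROT_WRITE, &pfn, 1);

    /* ... SHARED_RING_INIT() / BACK_RING_INIT() on ring_page ... */

    /* Then remove the page from the guest's physmap so the guest itself
     * can no longer reach it. */
    pfn = ring_pfn;
    xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, &pfn);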


>>>> 2. Most of the mem_event / mem_access functions and variable names are
>>>> HVM specific. Given that I am enabling it for PV, I will change the
>>>> names to something more generic. This also holds for the mem_access
>>>> hypercalls, which fall under HVM ops and do_hvm_op(). My plan is to
>>>> make them a memory op or a domctl.
>>> You cannot remove the hvmops.  That would break the hypervisor ABI.
>>> You can certainly introduce new (more generic) hypercalls, implement the
>>> hvmop ones in terms of the new ones and mark the hvmop ones as
>>> deprecated in the documentation.
>> Sorry, I should have been more explicit in the above paragraph. I was
>> planning on doing exactly what you have said. I will be adding a new
>> hypercall interface for the PV guests; we can then use that for HVM also
>> and keep the old hvm_op hypercall interface as an alias.
>> I would do something similar on the tool stack side. Create
>> xc_domain_*_access() or xc_*_access() and make them wrappers that call
>> xc_hvm_*_access() or vice-versa. Then move the functions to xc_domain.c
>> or xc_mem_access.c. This way I am hoping the existing libxc APIs will
>> still work.
>> Thanks,
>> Aravindh
>Ah ok - that looks sensible overall.
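
To make the libxc side concrete, the wrapper I have in mind would look
roughly like the sketch below. The name xc_domain_set_mem_access() is only
illustrative (nothing is decided yet); initially it would simply forward to
the existing HVM call so that current callers keep working:

    #include <xenctrl.h>

    /*
     * Hypothetical generic wrapper -- the name and placement are
     * illustrative only.  For now it forwards to the existing HVM call;
     * later it would dispatch on guest type (PV running in PG_mem_access
     * shadow mode vs. HVM).
     */
    int xc_domain_set_mem_access(xc_interface *xch, domid_t domain_id,
                                 hvmmem_access_t access,
                                 uint64_t first_pfn, uint64_t nr)
    {
        return xc_hvm_set_mem_access(xch, domain_id, access, first_pfn, nr);
    }
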
>>>> 3. A new shadow option will be added called PG_mem_access. This mode is
>>>> basic shadow mode with the addition of a table that will track the
>>>> access permissions of each page in the guest:
>>>>     mem_access_tracker[gmfn] = access_type
>>>> If there is a place where I can stash this in an existing structure,
>>>> please point me at it.
>>>> This will be enabled using xc_shadow_control() before attempting
>>>> mem_access on a PV guest.
>>>> 4. xc_mem_access_enable/disable(): Change the flow to allow
>>>> for PV guests running with PG_mem_access shadow mode.
>>>> 5. xc_domain_set_access_required(): No change required
>>>> 6. xc_(hvm)_set_mem_access(): This API has two modes: if the start
>>>> pfn/gmfn is ~0ull, it is taken as a request to set the default access.
>>>> Here we will call shadow_blow_tables() after recording the default
>>>> access type for the domain. In the mode where it is setting the
>>>> mem_access type for individual gmfns, we will call a function that will
>>>> drop the shadow for that individual gmfn. I am not sure which function
>>>> to call. Will sh_remove_all_mappings(gmfn) do the trick? Please advise.
>>>> The other issue here is that in the HVM case we could use
>>>> xc_hvm_set_mem_access(gfn, nr) and the permissions for the range gfn to
>>>> gfn+nr would be set. This won't be possible in the PV case as we are
>>>> dealing with mfns, and mfn to mfn+nr need not belong to the same guest.
>>>> But given that setting *all* page access permissions is done implicitly
>>>> when setting the default access, I think we can live with setting page
>>>> permissions one at a time as they are faulted in.
>>>> 7. xc_(hvm)_get_mem_access(): This will return the access type for gmfn
>>>> from the mem_access_tracker table.
>>>> 8. In sh_page_fault() perform access checks similar to
>>>> ept_handle_violation() / hvm_hap_nested_page_fault().
>>>> 9. Hook into _sh_propagate() and set up the L1 entries based on access
>>>> permissions. This will be similar to ept_p2m_type_to_flags(). I think I
>>>> might also have to hook into the code that emulates page table writes
>>>> to ensure access permissions are honored there too.
>>>> Please give feedback on the above.
>>>> Thanks,
>>>> Aravindh
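
Regarding points 8 and 9, the way I picture the _sh_propagate() hook is
something along the lines of the sketch below, in the same spirit as
ept_p2m_type_to_flags(). None of this code exists yet; the helper name and
the exact flag handling (execute/NX restrictions are omitted here) are
purely illustrative:

    /* Purely illustrative sketch -- not existing Xen code.  Restrict the
     * shadow L1e flags according to the access type recorded for this
     * gmfn in the mem_access_tracker table. */
    static u32 mem_access_restrict_sh_flags(p2m_access_t a, u32 sflags)
    {
        switch ( a )
        {
        case p2m_access_n:
            /* No access: leave the entry not present so the fault reaches
             * sh_page_fault() and can be forwarded to the listener. */
            return sflags & ~_PAGE_PRESENT;
        case p2m_access_r:
        case p2m_access_rx:
            /* Read-only: writes must fault so they can be reported. */
            return sflags & ~_PAGE_RW;
        case p2m_access_rw:
        case p2m_access_rwx:
        default:
            return sflags;
        }
    }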
