Xen project Mailing List

Re: [Xen-devel] Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail

To: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Tue, 10 Jan 2017 15:11:56 +0000

Cc: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>

Delivery-date: Tue, 10 Jan 2017 15:28:50 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 10/01/17 15:02, Razvan Cojocaru wrote: > On 01/10/2017 04:13 PM, Andrew Cooper wrote: >> On 10/01/17 09:06, Razvan Cojocaru wrote: >>> On 01/09/2017 02:54 PM, Andrew Cooper wrote: >>>> On 09/01/17 11:36, Razvan Cojocaru wrote: >>>>> Hello, >>>>> >>>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest >>>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems >>>>> to eat up all the RAM it can: >>>>> >>>>> (XEN) [ 394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544 >>>>> >>>>> This leads to a problem with xen-access, specifically libxc which does >>>>> this in xc_vm_event_enable() (this is Xen 4.6): >>>>> >>>>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE, >>>>> &mmap_pfn, 1); >>>>> >>>>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB ) >>>>> { >>>>> /* Map failed, populate ring page */ >>>>> rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0, >>>>> &ring_pfn); >>>>> if ( rc1 != 0 ) >>>>> { >>>>> PERROR("Failed to populate ring pfn\n"); >>>>> goto out; >>>>> } >>>>> >>>>> The first time everything works fine, xen-access can map the ring page. >>>>> But most of the time the second time fails in the >>>>> xc_domain_populate_physmap_exact() call, and again this is dumped in the >>>>> Xen log (once for each failed attempt): >>>>> >>>>> (XEN) [ 395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544 >>>> Thinking further about this, what happens if you avoid removing the page >>>> on exit? >>>> >>>> The first populate succeeds, and if you leave the page populated, the >>>> second time you come around the loop, it should not be of type XTAB, and >>>> the map should succeed. >>> Sorry for the late reply, had to put out another fire yesterday. >>> >>> I've taken your recommendation to roughly mean this: >>> >>> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c >>> index ba9690a..805564b 100644 >>> --- a/xen/common/vm_event.c >>> +++ b/xen/common/vm_event.c >>> @@ -100,8 +100,11 @@ static int vm_event_enable( >>> return 0; >>> >>> err: >>> + /* >>> destroy_ring_for_helper(&ved->ring_page, >>> ved->ring_pg_struct); >>> + */ >>> + ved->ring_page = NULL; >>> vm_event_ring_unlock(ved); >>> >>> return rc; >>> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d, >>> struct vm_event_domain *ved) >>> } >>> } >>> >>> + /* >>> destroy_ring_for_helper(&ved->ring_page, >>> ved->ring_pg_struct); >>> + */ >>> >>> + ved->ring_page = NULL; >>> vm_event_cleanup_domain(d); >>> >>> vm_event_ring_unlock(ved); >>> >>> but this unfortunately still fails to map the page the second time. Do >>> you mean to simply no longer munmap() the ring page from libxc / the >>> client application? >> Neither. >> >> First of all, I notice that this is probably buggy: >> >> ring_pfn = pfn; >> mmap_pfn = pfn; >> rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn); >> if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB ) >> { >> /* Page not in the physmap, try to populate it */ >> rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0, >> &ring_pfn); >> if ( rc1 != 0 ) >> { >> PERROR("Failed to populate ring pfn\n"); >> goto out; >> } >> } >> >> A failure of xc_get_pfn_type_batch() is not a suggestion that population >> might work. >> >> >> What I meant was taking out this call: >> >> /* Remove the ring_pfn from the guest's physmap */ >> rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, >> &ring_pfn); >> if ( rc1 != 0 ) >> PERROR("Failed to remove ring page from guest physmap"); >> >> To leave the frame in the guest physmap. The issue is fundamentally >> that after this frame has been taken out, something kicks the VM to >> realise it has an extra frame of balloonable space, which it clearly >> compensates for. >> >> You can work around the added attack surface by marking it RO in EPT; >> neither Xen's nor dom0's mappings are translated via EPT, so they can >> still make updates, but the guest won't be able to write to it. >> >> I should say that this is all a gross hack, and is in desperate need of >> a proper API to make rings entirely outside of the gfn space, but this >> hack should work for now. > Thanks! So far, it seems to work like a charm like this: Great. > > diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c > index 2fef96a..5dd00a6 100644 > --- a/tools/libxc/xc_vm_event.c > +++ b/tools/libxc/xc_vm_event.c > @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch, domid_t > domain_id, int param, > } > > /* Remove the ring_pfn from the guest's physmap */ > + /* > rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, > &ring_pfn); > if ( rc1 != 0 ) > PERROR("Failed to remove ring page from guest physmap"); > + */ > + > + if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) ) > + { > + PERROR("Could not set ring page read-only\n"); > + goto out; > + } > > out: > saved_errno = errno; > > Should I send this as a patch for mainline as well? Probably a good idea, although I would include a code comment explaining what is going on, because this is subtle if you don't know the context. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.