
Re: [Xen-devel] Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail



On 10/01/17 09:06, Razvan Cojocaru wrote:
> On 01/09/2017 02:54 PM, Andrew Cooper wrote:
>> On 09/01/17 11:36, Razvan Cojocaru wrote:
>>> Hello,
>>>
>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>>> to eat up all the RAM it can:
>>>
>>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>>
>>> This leads to a problem with xen-access, specifically libxc which does
>>> this in xc_vm_event_enable() (this is Xen 4.6):
>>>
>>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>>                                  &mmap_pfn, 1);
>>>
>>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>> {
>>>     /* Map failed, populate ring page */
>>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>>                                                &ring_pfn);
>>>     if ( rc1 != 0 )
>>>     {
>>>         PERROR("Failed to populate ring pfn\n");
>>>         goto out;
>>>     }
>>>
>>> The first time everything works fine, xen-access can map the ring page.
>>> But most of the time the second time fails in the
>>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>>> Xen log (once for each failed attempt):
>>>
>>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
>> Thinking further about this, what happens if you avoid removing the page
>> on exit?
>>
>> The first populate succeeds, and if you leave the page populated, the
>> second time you come around the loop, it should not be of type XTAB, and
>> the map should succeed.
> Sorry for the late reply, had to put out another fire yesterday.
>
> I've taken your recommendation to roughly mean this:
>
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index ba9690a..805564b 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -100,8 +100,11 @@ static int vm_event_enable(
>      return 0;
>
>   err:
> +    /*
>      destroy_ring_for_helper(&ved->ring_page,
>                              ved->ring_pg_struct);
> +    */
> +    ved->ring_page = NULL;
>      vm_event_ring_unlock(ved);
>
>      return rc;
> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d, struct vm_event_domain *ved)
>              }
>          }
>
> +        /*
>          destroy_ring_for_helper(&ved->ring_page,
>                                  ved->ring_pg_struct);
> +       */
>
> +        ved->ring_page = NULL;
>          vm_event_cleanup_domain(d);
>
>          vm_event_ring_unlock(ved);
>
> but this unfortunately still fails to map the page the second time. Do
> you mean to simply no longer munmap() the ring page from libxc / the
> client application?

Neither.

First of all, I notice that this is probably buggy:

    ring_pfn = pfn;
    mmap_pfn = pfn;
    rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
    if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
    {
        /* Page not in the physmap, try to populate it */
        rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
                                              &ring_pfn);
        if ( rc1 != 0 )
        {
            PERROR("Failed to populate ring pfn\n");
            goto out;
        }
    }

A failure of xc_get_pfn_type_batch() tells you nothing about the page,
and certainly does not suggest that populating it might work, so it
should not share a path with the XTAB case.
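
One way to separate the two cases is sketched below.  This is only an
illustration of the control flow, not the libxc code: the callbacks in
struct ring_ops, the MODEL_PFINFO_XTAB bit, the enum values and the
stub_* helpers are all hypothetical stand-ins for
xc_get_pfn_type_batch(), xc_domain_populate_physmap_exact() and
XEN_DOMCTL_PFINFO_XTAB, so that the logic can be compiled and exercised
without a hypervisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XEN_DOMCTL_PFINFO_XTAB (arbitrary bit). */
#define MODEL_PFINFO_XTAB (UINT64_C(1) << 63)

/* Hypothetical callbacks modelling the two libxc calls from the thread. */
struct ring_ops {
    int (*get_pfn_type)(uint64_t *pfn_inout); /* like xc_get_pfn_type_batch() */
    int (*populate)(uint64_t pfn);  /* like xc_domain_populate_physmap_exact() */
};

enum ring_prep {
    RING_PRESENT = 0,           /* page already in the physmap */
    RING_POPULATED = 1,         /* page was missing and has been populated */
    RING_QUERY_FAILED = -1,     /* type query failed: state unknown, give up */
    RING_POPULATE_FAILED = -2   /* page was missing and populate failed */
};

/* Populate only when the query succeeded AND reported the page missing. */
static int prepare_ring_pfn(const struct ring_ops *ops, uint64_t pfn)
{
    uint64_t type = pfn;

    assert(ops != NULL && ops->get_pfn_type != NULL && ops->populate != NULL);

    if (ops->get_pfn_type(&type) != 0)
        return RING_QUERY_FAILED;

    if (!(type & MODEL_PFINFO_XTAB))
        return RING_PRESENT;

    if (ops->populate(pfn) != 0)
        return RING_POPULATE_FAILED;

    return RING_POPULATED;
}

/* Demo stubs (hypothetical) so the flow can be tried without Xen. */
static int stub_populate_calls;

static int stub_query_fails(uint64_t *pfn_inout)
{
    (void)pfn_inout;
    return -1;
}

static int stub_query_present(uint64_t *pfn_inout)
{
    *pfn_inout = 0;                 /* no XTAB bit: page is present */
    return 0;
}

static int stub_query_xtab(uint64_t *pfn_inout)
{
    *pfn_inout = MODEL_PFINFO_XTAB; /* page is not in the physmap */
    return 0;
}

static int stub_populate_ok(uint64_t pfn)
{
    (void)pfn;
    stub_populate_calls++;
    return 0;
}

static int stub_populate_fails(uint64_t pfn)
{
    (void)pfn;
    return -1;
}
```

The point of the separate RING_QUERY_FAILED return is that the caller
bails out without ever touching the physmap when the query itself
failed.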


What I meant was taking out this call:

    /* Remove the ring_pfn from the guest's physmap */
    rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
                                               &ring_pfn);
    if ( rc1 != 0 )
        PERROR("Failed to remove ring page from guest physmap");

That leaves the frame in the guest physmap.  Fundamentally, the issue is
that once this frame has been taken out, something kicks the VM into
realising it has an extra frame of balloonable space, which it evidently
fills again, leaving no headroom for the next populate.
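
The accounting can be illustrated with a toy model.  Nothing below is
Xen or libxc code: struct toy_domain and the toy_* functions are
hypothetical, the balloon behaviour is assumed to be "claim any free
headroom", and the domain is assumed to start with exactly one page of
headroom.  The only thing borrowed from the log above is the limit,
where an allocation is refused once tot_pages + 1 > max_pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified view of a domain's page accounting. */
struct toy_domain {
    unsigned long tot_pages;   /* pages currently allocated to the domain */
    unsigned long max_pages;   /* allocation limit */
    bool ring_in_physmap;      /* is the ring frame currently populated? */
};

/* Models the check behind "Over-allocation for domain 1: 524545 > 524544". */
static bool toy_alloc_page(struct toy_domain *d)
{
    assert(d != NULL);
    if (d->tot_pages + 1 > d->max_pages)
        return false;
    d->tot_pages++;
    return true;
}

/* Enable: map the ring if it is already there, otherwise populate it. */
static bool toy_ring_enable(struct toy_domain *d)
{
    if (d->ring_in_physmap)
        return true;           /* not XTAB: the map simply succeeds */
    if (!toy_alloc_page(d))
        return false;          /* populate fails: no headroom left */
    d->ring_in_physmap = true;
    return true;
}

/* Disable: optionally remove the ring frame (the decrease_reservation step). */
static void toy_ring_disable(struct toy_domain *d, bool remove_page)
{
    if (remove_page && d->ring_in_physmap) {
        d->ring_in_physmap = false;
        d->tot_pages--;
    }
}

/* Assumed guest behaviour: the balloon driver claims all free headroom. */
static void toy_guest_balloon_fill(struct toy_domain *d)
{
    while (toy_alloc_page(d))
        ;
}

/* Run enable, disable, guest activity, enable; report the second enable. */
static bool toy_second_enable_works(bool remove_page_on_disable)
{
    struct toy_domain d = {
        .tot_pages = 524543,
        .max_pages = 524544,
        .ring_in_physmap = false,
    };

    if (!toy_ring_enable(&d))
        return false;
    toy_ring_disable(&d, remove_page_on_disable);
    toy_guest_balloon_fill(&d);
    return toy_ring_enable(&d);
}
```

In this model toy_second_enable_works(true) fails, because the guest has
claimed the freed frame before the second populate, whereas
toy_second_enable_works(false) succeeds, because the frame never leaves
the physmap and no new allocation is needed.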

You can work around the added attack surface by marking it RO in EPT;
neither Xen's nor dom0's mappings are translated via EPT, so they can
still make updates, but the guest won't be able to write to it.

I should say that this is all a gross hack, and we are in desperate need
of a proper API which places rings entirely outside of the gfn space,
but this hack should work for now.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

