
Re: [Xen-devel] [PATCH RFCv3 3/8] xen: introduce XEN_DOMCTL_set_recipient



>>> On 25.11.14 at 16:41, <vkuznets@xxxxxxxxxx> wrote:
> "Jan Beulich" <JBeulich@xxxxxxxx> writes:
>>>>> On 07.10.14 at 15:10, <vkuznets@xxxxxxxxxx> wrote:
>>> @@ -1764,11 +1765,28 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>>>              scrub = 1;
>>>          }
>>>  
>>> -        if ( unlikely(scrub) )
>>> -            for ( i = 0; i < (1 << order); i++ )
>>> -                scrub_one_page(&pg[i]);
>>> +        if ( !d || !d->recipient || d->recipient->is_dying )
>>> +        {
>>> +            if ( unlikely(scrub) )
>>> +                for ( i = 0; i < (1 << order); i++ )
>>> +                    scrub_one_page(&pg[i]);
>>>  
>>> -        free_heap_pages(pg, order);
>>> +            free_heap_pages(pg, order);
>>> +        }
>>> +        else
>>> +        {
>>> +            mfn = page_to_mfn(pg);
>>> +            gmfn = mfn_to_gmfn(d, mfn);
>>> +
>>> +            page_set_owner(pg, NULL);
>>> +            assign_pages(d->recipient, pg, order, 0);
>>
>> This can fail.
> 
> ... if the recipient is dying or we're over-allocating. That shouldn't
> happen (if the toolstack does its job right), but I'll add a check.

You must not rely on the tool stack doing things right (see XSA-77).
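For illustration only, the check discussed here might look something like the
sketch below. It reuses the names from the quoted hunk and assumes the fallback
on failure is simply the existing scrub-and-free path; whether that is the
right policy is part of the open question.

    /* Sketch: try to hand the pages over to the recipient, but fall back to
     * the ordinary scrub-and-free path if assign_pages() refuses them
     * (recipient dying, allocation limit exceeded, ...). */
    mfn = page_to_mfn(pg);
    gmfn = mfn_to_gmfn(d, mfn);

    page_set_owner(pg, NULL);
    if ( assign_pages(d->recipient, pg, order, 0) != 0 )
    {
        if ( unlikely(scrub) )
            for ( i = 0; i < (1 << order); i++ )
                scrub_one_page(&pg[i]);

        free_heap_pages(pg, order);
    }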

>> What's worse though is that you don't guard against
>> XEN_DOMCTL_set_recipient zapping d->recipient. If that really is
>> necessary to be supported, some synchronization will be needed.
>>
>> And finally - what's the intended protocol for the tool stack to
>> know that all pages got transferred (and hence the new domain
>> can be launched)?
>>
> 
> (hope this answers both questions above)
> 
> Now the protocol is:
> 1) Toolstack queries all page frames (with xc_get_pfn_type_batch()) for
> the old domain.
> 2) Toolstack issues XEN_DOMCTL_set_recipient with the recipient domain.
> 3) Toolstack kills the source domain.
> 4) Toolstack waits for a while (the loop keeps checking as long as new
> pages keep arriving, plus some timeout).
> 5) Toolstack issues XEN_DOMCTL_set_recipient with DOMID_INVALID,
> resetting the recipient.
> 6) Toolstack checks whether all pages were transferred.
> 7) If some pages are missing (e.g. the source domain became a zombie
> holding something), it requests new empty pages instead.
> 8) Toolstack starts the new domain.
> 
> So we don't actually need XEN_DOMCTL_set_recipient to switch from one
> recipient to the other, we just need a way to reset it.

No, this doesn't address either question. For the first one, you again
assume the tool stack behaves correctly, which is wrong nowadays.
And for the second you give a description, but that's not really a
dependable protocol. Nor do I really follow why resetting the recipient
is necessary. Can't the tools e.g. wait for the final death of the domain
if there's no other notification?
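To make the sequence above a bit more concrete, here is a rough toolstack-side
sketch of steps 2, 3 and 5. It is an illustration only: the wrapper, the
u.set_recipient member name and the single-recipient-domid payload are
assumptions, since that part of the patch isn't quoted in this thread.

    #include <xenctrl.h>

    /* Hypothetical wrapper around the new domctl; the payload layout (a
     * single 'recipient' domid) is assumed here. */
    static int set_recipient(xc_interface *xch, domid_t source, domid_t recipient)
    {
        struct xen_domctl domctl = {
            .cmd = XEN_DOMCTL_set_recipient,
            .interface_version = XEN_DOMCTL_INTERFACE_VERSION,
            .domain = source,
        };

        domctl.u.set_recipient.recipient = recipient;   /* assumed field name */
        return xc_domctl(xch, &domctl);
    }

    /* Steps 2, 3 and 5; error handling and the step-4 polling/timeout loop
     * are omitted. */
    static int transfer_memory(xc_interface *xch, domid_t source, domid_t recipient)
    {
        int rc;

        if ( (rc = set_recipient(xch, source, recipient)) != 0 )   /* step 2 */
            return rc;
        if ( (rc = xc_domain_destroy(xch, source)) != 0 )          /* step 3 */
            return rc;
        /* step 4: poll until no new pages arrive or a timeout expires */
        return set_recipient(xch, source, DOMID_INVALID);          /* step 5 */
    }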

>>> --- a/xen/include/public/domctl.h
>>> +++ b/xen/include/public/domctl.h
>>> @@ -965,6 +965,23 @@ struct xen_domctl_vnuma {
>>>  typedef struct xen_domctl_vnuma xen_domctl_vnuma_t;
>>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vnuma_t);
>>>  
>>> +/*
>>> + * XEN_DOMCTL_set_recipient - sets the 'recipient' domain which will
>>> + * receive all the domain's memory after its death or when any memory
>>> + * page from the domain's heap is being freed.
>>
>> So this specifically allows for this to happen on a non-dying domain.
>> Why, i.e. what's the intended usage scenario? If not really needed,
>> please remove this and verify in the handling that the source domain
>> is dying.
> 
> Sorry if I didn't get this comment right. We need a way to tell which
> domain will receive the memory, and XEN_DOMCTL_set_recipient sets (and
> resets) this target domain. We call it from the toolstack before we
> start killing the old domain, so it is not dying yet. We can't do it
> hypervisor-side when our source domain exits with SHUTDOWN_soft_reset,
> as no recipient domain has been created yet.
> 
> From a general (hypervisor) point of view we don't actually care whether
> the domain is dying or not. We just want all freed pages from this
> domain to be received (so they go to some other domain instead of being
> thrown away).

We do care I think, primarily because we want a correct GMFN <->
MFN mapping. Seeing multiple pages for the same GMFN is for
example going to cause the whole process in its current form to fail.
And considering the need for such a correct mapping - is it always
the case that the mapping gets updated after a page got removed
from a guest? (I can see that the mapping doesn't change anymore
for a dying guest, but you explicitly say that you want/need this to
work before the guest is actually marked dying.)
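Purely as an illustration of the kind of guard that seems to be needed here
(not something the posted patch does, and using the x86-centric M2P macros),
the hand-over path could at least refuse pages whose reverse mapping is no
longer usable and fall back to the normal scrub-and-free path:

    /* Sketch only: skip the transfer when the MFN -> GMFN lookup no longer
     * yields a usable entry, instead of handing the recipient a page with a
     * stale or invalid GMFN. */
    gmfn = mfn_to_gmfn(d, mfn);
    if ( !VALID_M2P(gmfn) || SHARED_M2P(gmfn) )
    {
        /* fall back to scrub_one_page()/free_heap_pages() as in the
         * "no recipient" case */
    }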

Jan
