
Re: [Xen-devel] [PATCH V3] X86/vMCE: handle broken page with regard to migration



>>> On 21.11.12 at 14:26, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> Ian Campbell wrote:
>> On Wed, 2012-11-21 at 11:34 +0000, George Dunlap wrote:
>>> On 20/11/12 18:42, Ian Jackson wrote:
>>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH V3] X86/vMCE: handle
>>>> broken page with regard to migration"): 
>>>>> Ian Jackson wrote:
>>>>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH V3] X86/vMCE: handle
>>>>>> broken page with regard to migration"):
>>>>>>> No, at the last iter, there are 4 points:
>>>>>>> 1. start last iter
>>>>>>> 2. get and transfer pfn_type to target
>>>>>>> 3. copy page to target
>>>>>>> 4. end last iter
>>>> ...
>>>>> It indeed checks mce after point 3 for each page, but what's the
>>>>> advantage of keeping a separate list?
>>>> It avoids yet another loop over all the pages.  Unless I have
>>>> misunderstood.  Which I may have, because: if it checks for mce
>>>> after point 3 then surely that is sufficient?  We don't need to
>>>> worry about mces after that check.
>>> 
>>> It's sufficient, but wouldn't each check require a separate
>>> hypercall? That would surely be slower than just a single hypercall
>>> and a loop (which is what Jinsong's patch does).
>>> 
>>> We don't actually need a list -- I think we just need to know, "Have
>>> any pages broken between reading the p2m table
>>> (xc_get_pfn_type_batch())"; if so, we do another full iteration.
>> 
>> If a page fails between 2. and 3. above then what happens at point 3?
>> I presume we can't map and send the page (since it is broken), do we
>> get some sort of failure to map?
>> 
>> What happens if the failure occurs during stage 3, i.e. while the page
>> is mapped and we are reading from it?
>> 
>> Ian.
> 
> If a broken page is read, it generates a more serious error (say, an SRAR
> error). I don't think the guest has a good chance of surviving in this case
> --> most probably it kills itself, and of course we then don't need to care
> about migration. However, if the guest is lucky enough to survive (say, the
> broken page finishes copying to the target), it's OK to continue --> its
> broken pfn_type will be transferred to the target in the next iter, so the
> guest will kill itself if it accesses the page then.

I think you misread the question - it said "we", as in "the tools/
kernel/hypervisor" (at least that's how I'm reading it). The MCE
would surface in host context in this case, and whether that's
fatal to the host depends on the precise properties of the event.
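
For reference, the "single batched re-check, then another full iteration"
scheme discussed in the quoted text can be sketched as a small simulation.
This is only an illustration under stated assumptions, not the code from the
patch: FakeDomain, get_pfn_types(), copy_page() and last_iteration() are
hypothetical names, and get_pfn_types() merely stands in for one batched
query such as xc_get_pfn_type_batch(). The host-context MCE that a read of
a broken page could raise is deliberately not modelled.

```python
BROKEN = "broken"
NORMAL = "normal"


class FakeDomain:
    """Hypothetical stand-in for a guest; not a real libxc interface."""

    def __init__(self, nr_pages, break_on_copy=None):
        self.nr_pages = nr_pages
        self.broken = set()
        # Maps "pfn being copied" -> "pfn that breaks during that copy".
        self.break_on_copy = dict(break_on_copy or {})
        self.batch_queries = 0

    def get_pfn_types(self):
        # One batched query for all pages (assumed to cost one hypercall).
        self.batch_queries += 1
        return [BROKEN if pfn in self.broken else NORMAL
                for pfn in range(self.nr_pages)]

    def copy_page(self, pfn):
        # Simulate a page breaking while this page is being copied.
        # The effect of actually reading a broken page is NOT modelled.
        victim = self.break_on_copy.pop(pfn, None)
        if victim is not None:
            self.broken.add(victim)
        return pfn


def last_iteration(dom):
    """Repeat the last iteration until no page broke during the copy."""
    passes = 0
    while True:
        passes += 1
        types = dom.get_pfn_types()          # point 2: read pfn types
        sent = [dom.copy_page(pfn)           # point 3: copy non-broken pages
                for pfn in range(dom.nr_pages)
                if types[pfn] != BROKEN]
        recheck = dom.get_pfn_types()        # single batched re-check
        if recheck == types:                 # nothing broke in between
            return passes, types, sent


if __name__ == "__main__":
    dom = FakeDomain(4, break_on_copy={1: 3})
    passes, types, sent = last_iteration(dom)
    print(passes, types, sent, dom.batch_queries)
```

In this sketch the cost per pass is two batched queries regardless of the
number of pages, rather than one query per page; the price is a full extra
pass whenever any page breaks between the two queries.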

Jan

