
Re: [Xen-devel] Xen 4.0.0x allows for data corruption in Dom0



On 03/09/2010 12:41 AM, Jeremy Fitzhardinge wrote:
> On 03/08/2010 03:23 PM, Joanna Rutkowska wrote:
>> But the corruptions always happen in 32-bytes chunks, which might
>> suggest it's not a page-related problem (e.g. wrongly re-used page), as
>> in that case we would be observing (at least sometimes) much bigger
>> chunks of corrupted data, I think.
>>    
> 
> Given that the domU doesn't have any devices or much going on, it could
> easily be corrupting memory in only small amounts.
> 
But see, before I tried this with such a small, do-nothing dummy DomU
(which I did for the purpose of reporting to xen-devel), I experienced
very similar corruption when running regular VMs, i.e. with normal Linux
and all the usual apps inside them. Same pattern of corruption.

>> The reason why I still believe it's a hypervisor-related thing is that
>> I'm currently using the very *same* Dom0 kernel (very recent
>> xen/stable-2.6.31) with Xen 3.4.2 and the system is damn stable. And I
>> really mean extensive use with 5-7 VMs running all the time doing
>> various things from Web browsing to kernel building.
>>    
> 
> OK, it's always good to get some positive feedback.
> 

At least one full-time user of the pvops kernel ;)

>> If I were to make an educated guess, I would say it's something related
>> to interrupt handling, i.e. Xen mishandling it, e.g. the handler is
>> writing out-of-buffer somewhere and it just happens to land in the Dom0
>> fs buffer used by e.g. a dd operation.
>>    
> 
> 
> It would be interesting to see what happens if you write the file with
> the test domain paused (xm pause ...).  If the corruption continues,
> then it is almost certainly Xen.

Right.

> If it stops, then it either means the
> corruption was caused by pages inappropriately shared between dom0 and
> domU, or something like vcpu context switch is corrupting memory (which
> would be very sad).
> 

Unfortunately, I cannot do any more tests. We have downgraded all our
test machines to Xen 3.4.2 and are using them for other things now. Sorry.
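
For anyone who still has a 4.0.0 box and wants to try the paused-domain
test suggested above, here is a minimal sketch of one way to do it. This
is not the dd-based procedure I used; the offset-stamped pattern, the
function names and the block size are all assumptions made for
illustration. It writes a file in which every 8-byte word holds its own
file offset, reads it back, and reports each corrupted run with its
length, so it is easy to see whether the damage again comes in 32-byte
chunks.

```python
import os
import sys

BLOCK = 1 << 20  # hypothetical I/O block size (1 MiB); any multiple of 8 works


def pattern(offset, length):
    """Expected bytes for [offset, offset + length).

    Assumed scheme: each 8-byte word stores its own file offset
    (little-endian), so a corrupted word is easy to locate.
    """
    assert offset % 8 == 0 and length % 8 == 0
    return b"".join(o.to_bytes(8, "little")
                    for o in range(offset, offset + length, 8))


def write_pattern(path, size):
    """Write `size` bytes (a multiple of 8) of the pattern to `path`."""
    with open(path, "wb") as f:
        for off in range(0, size, BLOCK):
            f.write(pattern(off, min(BLOCK, size - off)))
        f.flush()
        os.fsync(f.fileno())


def corrupted_runs(path):
    """Return (offset, length) for each contiguous run of wrong bytes.

    A corrupted byte that happens to equal the expected one splits a run,
    so a single damaged chunk may be reported as two shorter runs.
    """
    runs, start, off = [], None, 0
    with open(path, "rb") as f:
        while True:
            got = f.read(BLOCK)
            if not got:
                break
            # Round up to whole words, then trim to what was actually read.
            want = pattern(off, (len(got) + 7) // 8 * 8)[:len(got)]
            if got == want:
                if start is not None:
                    runs.append((start, off - start))
                    start = None
            else:
                for i, (a, b) in enumerate(zip(got, want)):
                    if a != b:
                        if start is None:
                            start = off + i
                    elif start is not None:
                        runs.append((start, off + i - start))
                        start = None
            off += len(got)
    if start is not None:
        runs.append((start, off - start))
    return runs


if __name__ == "__main__":
    path, size = sys.argv[1], int(sys.argv[2])
    write_pattern(path, size)
    for offset, length in corrupted_runs(path):
        print(f"corrupt run at offset {offset}, {length} bytes")
```

The idea would be to run it once with the test domain running and once
after "xm pause" on that domain, and then compare the reported runs.
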

joanna.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

