
RE: [Xen-devel] A different problem with save/restore on C/S 14823.



 

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx] 
> Sent: 13 April 2007 17:56
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] A different problem with
> save/restore on C/S 14823.
> 
> On 13/4/07 17:47, "Petersson, Mats" <Mats.Petersson@xxxxxxx> wrote:
> 
> > See my other reply, although you may have a point about mapping - my
> > guest is running with HVMloader's map, which probably maps all memory
> > available to the guest linearly, including address zero (as that's where
> > real-mode puts the interrupt vector table, which can be useful to have
> > mapped - just a little bit ;-) ).
> > 
> > So maybe we need an earlier/different test to kill guest? Or do you
> > think this is such a critical error that hypervisor should die?
> 
> The NULL dereference is inside the hypervisor in hvm_do_resume(). At that
> point you are running in Xen's address space, not the guest's. And Xen
> should have no mapping at address zero.

Yes, of course - me not thinking right - sorry [it is late-ish on a
Friday, that's my excuse and I'm sticking to it]. 

> 
> The issue here is that shared_page_va is not initialised, so it contains 0.
> hvm_do_resume() should be getting a pointer derived from this value via
> get_vio(). When it dereferences it, Xen should crash. That didn't happen for
> you and that is scarily inexplicable.

Yes, I follow that. 

However, my guest does A LOT of IOIO exits (it's an IDE test-app), with
some HLT and IRQ exits thrown in for good measure. So if the guest is
doing an IOIO exit, would it end up in platform.c:844 before it gets to
hvm_do_resume()? Or are you saying that we should crash as soon as the
guest restarts, because that's done through hvm_do_resume()?

> 
> I suggest adding some tracing to hvm_do_resume() to find out whether it is
> being called at all and, if it is, what value it sets its local variable 'p'
> to. Also what value is in v->domain->arch.hvm_domain.shared_page_va.

Would a check for zero in get_vio() with domain_crash_synchronous() be a
"good thing" here, or is that too time-consuming in a relatively
time-critical path of HVM?
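
To make the question concrete, here is a rough sketch of the kind of check I
have in mind. It is not the real Xen code: the structures are cut-down,
hypothetical stand-ins, get_vio_checked() and mock_domain_crash() are names
invented for illustration, and the real domain_crash_synchronous() does not
return to its caller, whereas the mock below just sets a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VCPUS 4  /* illustrative limit, not Xen's real value */

/* Hypothetical, simplified stand-ins for the real Xen structures. */
struct vcpu_iodata { int state; };
struct domain {
    uintptr_t shared_page_va;  /* 0 means the shared page was never set up */
    int crashed;               /* set by the mock crash handler below */
};

/* Mock of domain_crash_synchronous(): only records that it was called. */
static void mock_domain_crash(struct domain *d)
{
    d->crashed = 1;
}

/*
 * Sketch of get_vio() with an explicit zero check: if the shared page
 * address is 0, crash the domain instead of handing back a pointer
 * derived from address zero.
 */
static struct vcpu_iodata *get_vio_checked(struct domain *d, unsigned int cpu)
{
    assert(cpu < MAX_VCPUS);

    if (d->shared_page_va == 0) {
        mock_domain_crash(d);
        return NULL;
    }
    return (struct vcpu_iodata *)d->shared_page_va + cpu;
}
```

The cost is one compare-and-branch per call, which is why I am asking
whether that is acceptable on this path.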

I will look at it on Monday (before I update to the new version, just to
make sure I can reproduce it still ;-) ).


> 
> The bugs that cause this condition should all be fixed in xen-unstable
> staging tip, by the way. I just think this situation should be investigated
> before you upgrade in case you've uncovered another latent bug. Because you
> really should be crashing in hvm_do_resume() in this scenario.
> 
>  -- Keir
> 
> 
> 
> 
> 


