
Re: [Xen-devel] anomaly in irq check in fixup_page_fault()



On Thu, 2011-07-21 at 07:35 +0100, Keir Fraser wrote:
> On 21/07/2011 02:30, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
> 
> > Hi,
> > 
> > This is a bit confusing. This is for a PVOPs kernel; I've not yet looked
> > at older PV kernels to see what they do. The VCPU starts with
> > evtchn_upcall_mask set and eflags.IF enabled. However, during kernel
> > boot memory mapping, a lot of faults are getting fixed up by Xen in:
> > 
> > fixup_page_fault():
> >     /* No fixups in interrupt context or when interrupts are disabled. */
> >     if ( in_irq() || !(regs->eflags & X86_EFLAGS_IF) )  <------
> >         return 0;
> 
> A PV guest never has EF.IF=0, so the early exit should never be triggered by
> a guest fault.

When I was playing with PV-in-HVM prototypes way back I noticed that,
for a pvops kernel at least, we seem to accidentally rely on the fact
that trying to clear EFLAGS.IF from RING>0 silently ignores the change
(as it does for any privileged bits in EFLAGS). In the HVM container the
change was evidently not ignored, so on vmexit I would sometimes discover
that IF was cleared. Originally I made this shoot the guest (it must be
misbehaving, right!) but in the end I decided to be pragmatic and always
|=EFLAGS_IF on the vmexit path.
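
For illustration only, the "always |=EFLAGS_IF" workaround could look
roughly like the sketch below. The struct and both function names are
hypothetical stand-ins, not the actual prototype or Xen code; only the
X86_EFLAGS_IF bit value (bit 9) and the shape of the eflags test quoted
from fixup_page_fault() are taken as given.

```c
#include <stdint.h>

#define X86_EFLAGS_IF 0x00000200u /* bit 9: interrupt enable flag */

/* Hypothetical stand-in for the saved guest register frame. */
struct saved_regs {
    uint64_t eflags;
};

/*
 * Force IF on in the saved frame, so the rest of the hypervisor sees
 * what a normal PV guest would always present.
 */
static void force_guest_if(struct saved_regs *regs)
{
    regs->eflags |= X86_EFLAGS_IF;
}

/*
 * Mirrors only the eflags half of the early-exit test quoted from
 * fixup_page_fault(); the in_irq() half is left out of this sketch.
 */
static int fixup_allowed(const struct saved_regs *regs)
{
    return (regs->eflags & X86_EFLAGS_IF) != 0;
}
```

Calling force_guest_if() once on the exit path means the existing check
passes without any container-specific tests being added to it.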

I _think_ this was the original reason I discovered the issue that I
fixed with the short series I reposted at
http://marc.info/?l=linux-kernel&m=130987084009107 (IOW I think
kernel_eflags ended up with IF incorrect under pvops Xen, but it was a
long time ago so perhaps I'm misremembering).

I also vaguely recall suspecting the optimisation used in Xen's
implementation of xen_save_fl or xen_save_fl_direct. Unlike
native_save_fl, which returns a full set of EFLAGS, these hooks
basically only guarantee that the bit at EFLAGS_IF is valid in the value
they return, and I thought that might be implicated in IF getting turned
off. You can pretty easily (at the expense of the optimisation) make
those hooks return the real eflags with ~evtchn_upcall_mask in the
EFLAGS_IF bit.
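
As a rough sketch of the difference (not the kernel's actual code; the
function names and the idea of passing the hardware flags in as a
parameter are assumptions made so the example is self-contained):

```c
#include <stdint.h>

#define X86_EFLAGS_IF 0x00000200UL /* bit 9: interrupt enable flag */

/*
 * Optimised form: the result carries only the IF bit, derived from the
 * inverse of evtchn_upcall_mask. All other bits are zero.
 */
static unsigned long save_fl_if_only(uint8_t evtchn_upcall_mask)
{
    return evtchn_upcall_mask ? 0 : X86_EFLAGS_IF;
}

/*
 * Unoptimised form: start from the real flags (hw_eflags is assumed to
 * have been read by the caller, e.g. via pushf), then overwrite the IF
 * bit with the inverse of evtchn_upcall_mask.
 */
static unsigned long save_fl_full(unsigned long hw_eflags,
                                  uint8_t evtchn_upcall_mask)
{
    unsigned long flags = hw_eflags & ~X86_EFLAGS_IF;

    if (!evtchn_upcall_mask)
        flags |= X86_EFLAGS_IF;
    return flags;
}
```

With the second form a later restore of the saved value would at least
see the other flag bits intact, at the cost of reading the real flags.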

At the time I sprinkled assertions around the guest kernel to help debug
the issue, patch (against 2.6.32, so ancient) attached FWIW.

Ian.

> Your best bet is to fake this out in your HVM container wrapper. Just write
> an EFLAGS into the saved regs that has EF.IF=1, as would always be the case
> for a normal PV guest. Rather that than fragile eis_hvm_pv() checks
> scattered around.
> 
> The setting of EF.IF shouldn't matter much for your guest as you'll be doing
> PV event delivery anyway, but I wonder how it ends up with EF.IF=0 -- is
> that deliberate?
> 
>  -- Keir
> 
> > The guest is running under the assumption of INTs disabled during
> > init_memory_mapping, and the first enable happens much later. So this
> > check seems redundant, at least for a PVOPs kernel.
> > 
> > Now for my hybrid, the guest during initial boot is running with IF
> > disabled, so fixup doesn't like that. Not sure if permanently disabling
> > the (eflags & X86_EFLAGS_IF) check for hybrid would be a good idea for
> > me.
> > 
> > thanks,
> > Mukesh
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
> 
> 
> 

Attachment: instrumentation.patch
Description: Text Data
