
Re: [Xen-devel] Altp2m use with PML can deadlock Xen



On Fri, May 10, 2019 at 9:21 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 10/05/2019 16:09, Tamas K Lengyel wrote:
> > On Fri, May 10, 2019 at 8:59 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> 
> > wrote:
> >> On 10/05/2019 15:53, Razvan Cojocaru wrote:
> >>> On 5/10/19 5:42 PM, Tamas K Lengyel wrote:
> >>>> On Thu, May 9, 2019 at 10:19 AM Andrew Cooper
> >>>> <andrew.cooper3@xxxxxxxxxx> wrote:
> >>>>> On 09/05/2019 14:38, Tamas K Lengyel wrote:
> >>>>>> Hi all,
> >>>>>> I'm investigating an issue with altp2m that can easily be reproduced
> >>>>>> and leads to a hypervisor deadlock when PML is available in hardware.
> >>>>>> I haven't been able to trace down where the actual deadlock occurs.
> >>>>>>
> >>>>>> The problem seems to stem from hvm/vmx/vmcs.c:vmx_vcpu_flush_pml_buffer,
> >>>>>> which calls p2m_change_type_one on all GFNs that were recorded in the
> >>>>>> PML buffer. The problem occurs when the PML-buffer-full vmexit happens
> >>>>>> while the active p2m is an altp2m. Switching p2m_change_type_one to
> >>>>>> work with the altp2m instead of the hostp2m, however, results in EPT
> >>>>>> misconfiguration crashes.
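
[For reference, the path in question looks roughly like this - a simplified
sketch of the flush loop, paraphrased rather than the verbatim Xen source;
map_pml_buffer(), first_logged_index(), mark_gfn_dirty() and reset_pml_index()
are placeholder names standing in for the real helpers:]

/* Simplified sketch only - not the verbatim Xen implementation. */
void vmx_vcpu_flush_pml_buffer(struct vcpu *v)
{
    uint64_t *pml_buf = map_pml_buffer(v);      /* placeholder helper */
    unsigned long idx;

    for ( idx = first_logged_index(v); idx < NR_PML_ENTRIES; idx++ )
    {
        unsigned long gfn = pml_buf[idx] >> PAGE_SHIFT;

        /*
         * Flip the logged GFN from p2m_ram_logdirty back to p2m_ram_rw.
         * This acts on the domain's host p2m, even if the vCPU that
         * filled the buffer was running on an altp2m view when the
         * PML-buffer-full vmexit was taken.
         */
        p2m_change_type_one(v->domain, gfn, p2m_ram_logdirty, p2m_ram_rw);

        mark_gfn_dirty(v->domain, gfn);         /* placeholder: dirty bitmap */
    }

    reset_pml_index(v);                         /* placeholder helper */
}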
> >>>>>>
> >>>>>> Adding to the issue is that it seems to occur only when the altp2m has
> >>>>>> remapped GFNs. Since PML records entries by GFN, this leads me to
> >>>>>> question whether it is safe at all to use PML when altp2m is used with
> >>>>>> GFN remapping. However, AFAICT the GFNs in the PML buffer are not the
> >>>>>> remapped GFNs, and my understanding is that it should be safe as long
> >>>>>> as the GFNs being tracked by PML are never the remapped ones.
> >>>>>>
> >>>>>> Booting Xen with ept=pml=0 resolves the issue.
> >>>>>>
> >>>>>> If anyone has any insight into what might be happening, please let
> >>>>>> me know.
> >>>>>
> >>>>> I could have sworn that George spotted a problem here and fixed it.  I
> >>>>> shouldn't be surprised if we have more.
> >>>>>
> >>>>> The problem that PML introduced (and this is mostly my fault, as I
> >>>>> suggested the buggy solution) is that the vmexit handler from one vcpu
> >>>>> pauses others to drain the PML queue into the dirty bitmap.  Overall I
> >>>>> wasn't happy with the design and I've got some ideas to improve it, but
> >>>>> within the scope of how altp2m was engineered, I proposed
> >>>>> domain_pause_except_self().
> >>>>>
> >>>>> As it turns out, that is vulnerable to deadlocks when you get two vcpus
> >>>>> trying to pause each other and waiting for each other to become
> >>>>> de-scheduled.
> >>>> Makes sense.
> >>>>
> >>>>> I see this has been reused by the altp2m code, but it *should* be safe
> >>>>> from deadlocks now that it takes the hypercall_deadlock_mutex.
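
[For context, the shape of that fix, paraphrased from memory rather than
quoted from xen/common/domain.c - the exact code may differ, but the idea is
that callers from inside the domain serialise on the hypercall_deadlock_mutex
so two vCPUs cannot end up pause-waiting on each other:]

int domain_pause_except_self(struct domain *d)
{
    struct vcpu *v, *curr = current;

    if ( curr->domain == d )
    {
        /*
         * Only one in-domain caller at a time; a second one backs off
         * with -ERESTART instead of pause-waiting on the first, which
         * is what previously allowed two vCPUs to deadlock against
         * each other.
         */
        if ( !spin_trylock(&d->hypercall_deadlock_mutex) )
            return -ERESTART;

        for_each_vcpu ( d, v )
            if ( v != curr )
                vcpu_pause(v);

        spin_unlock(&d->hypercall_deadlock_mutex);
    }
    else
        domain_pause(d);

    return 0;
}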
> >>>> Is that already in staging or your x86-next branch? I would like to
> >>>> verify whether the problem is still present with that change. I
> >>>> tested with the Xen 4.12 release, and that definitely still deadlocks.
> >>> I don't know if Andrew is talking about this patch (probably not, but
> >>> it looks at least related):
> >>>
> >>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=24d5282527f4647907b3572820b5335c15cd0356;hp=29d28b29190ba09d53ae7e475108def84e16e363
> >>>
> >> I was referring to 29d28b2919, which, as it turns out, is also in 4.12.
> >> That said, 24d5282527 might in practice be the cause of the deadlock, so
> >> I'd first experiment with taking that fix out.
> >>
> >> I know for certain that it won't be tested with PML enabled, because the
> >> use of PML is incompatible with write-protecting guest pagetables.
> >>
> > Sounds like the safe bet is to just have PML disabled when
> > introspection is used. I would say it would be even better if the use
> > of PML could be controlled on a per-guest basis instead of the current
> > global on/off switch. That way it could be disabled only for the
> > introspected domains.
> >
> > I'll do some more experimentation when I get some free time, but two
> > observations that speak against the vCPUs trying to pause each other
> > being the culprit are:
> > - the deadlock doesn't happen with xen-access' altp2m use; it only
> > happens when there are remapped GFNs in the altp2m views
> > - I've added a domain_pause/unpause pair to the PML flusher before it
> > enters the flush loop, but I still got a deadlock
>
> Do you have a minimal repro of the deadlock you could share?

The deadlock is easily reproducible, but right now only with a full
DRAKVUF setup (just the standard setup as described on drakvuf.com).
Also, the deadlock triggers only after connecting to the monitored VM
over VNC (that's what causes the VGA logdirty pages to fill up the PML
buffer).

> Even if it is a combo PML+altp2m problem, we should fix the issue,
> because there are VMI use cases which don't care about write-protecting
> guest pagetables, and we don't want to prevent those cases from using PML.

Agreed. For my use case it's not an issue - I have already posted the
updated instructions to boot with ept=pml=0. But it's better to fix it
before it leads to other problems down the road.
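
For anyone following along, the workaround is just the ept=pml=0 option on
Xen's command line; an illustrative GRUB entry is below (paths, the memory
setting and the dom0 kernel lines are placeholders for whatever the setup
already uses):

menuentry 'Xen (PML disabled)' {
    multiboot2 /boot/xen.gz dom0_mem=4096M,max:4096M ept=pml=0
    module2    /boot/vmlinuz-placeholder root=/dev/sda1 ro
    module2    /boot/initrd-placeholder.img
}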

Tamas
