Xen project Mailing List

Re: [Xen-devel] Altp2m use with PML can deadlock Xen

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

From: Tamas K Lengyel <tamas.k.lengyel@xxxxxxxxx>

Date: Fri, 10 May 2019 09:09:06 -0600

Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>

Delivery-date: Fri, 10 May 2019 15:09:51 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, May 10, 2019 at 8:59 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 10/05/2019 15:53, Razvan Cojocaru wrote:
> > On 5/10/19 5:42 PM, Tamas K Lengyel wrote:
> >> On Thu, May 9, 2019 at 10:19 AM Andrew Cooper
> >> <andrew.cooper3@xxxxxxxxxx> wrote:
> >>>
> >>> On 09/05/2019 14:38, Tamas K Lengyel wrote:
> >>>> Hi all,
> >>>> I'm investigating an issue with altp2m that can easily be reproduced
> >>>> and leads to a hypervisor deadlock when PML is available in hardware.
> >>>> I haven't been able to trace down where the actual deadlock occurs.
> >>>>
> >>>> The problem seems to stem from hvm/vmx/vmcs.c:vmx_vcpu_flush_pml_buffer,
> >>>> which calls p2m_change_type_one on all gfns that were recorded in the PML
> >>>> buffer. The problem occurs when the PML buffer full vmexit happens
> >>>> while the active p2m is an altp2m. Switching p2m_change_type_one to
> >>>> work with the altp2m instead of the hostp2m, however, results in EPT
> >>>> misconfiguration crashes.
> >>>>
> >>>> Adding to the issue is that it seems to occur only when the altp2m has
> >>>> remapped GFNs. Since PML records entries based on GFN, this leads me to
> >>>> question whether it is safe at all to use PML when altp2m is used with
> >>>> GFN remapping. However, AFAICT the GFNs in the PML buffer are not the
> >>>> remapped GFNs, and my understanding is that it should be safe as long
> >>>> as the GFNs being tracked by PML are never the remapped GFNs.
> >>>>
> >>>> Booting Xen with ept=pml=0 resolves the issue.
> >>>>
> >>>> If anyone has any insight into what might be happening, please let
> >>>> me know.
> >>>
> >>>
> >>> I could have sworn that George spotted a problem here and fixed it. I
> >>> shouldn't be surprised if we have more.
> >>>
> >>> The problem that PML introduced (and this is mostly my fault, as I
> >>> suggested the buggy solution) is that the vmexit handler from one vcpu
> >>> pauses others to drain the PML queue into the dirty bitmap. Overall I
> >>> wasn't happy with the design and I've got some ideas to improve it, but
> >>> within the scope of how altp2m was engineered, I proposed
> >>> domain_pause_except_self().
> >>>
> >>> As it turns out, that is vulnerable to deadlocks when you get two vcpus
> >>> trying to pause each other and waiting for each other to become
> >>> de-scheduled.
> >>
> >> Makes sense.
> >>
> >>>
> >>> I see this has been reused by the altp2m code, but it *should* be safe
> >>> from deadlocks now that it takes the hypercall_deadlock_mutex.
> >>
> >> Is that already in staging or your x86-next branch? I would like to
> >> verify whether the problem is still present with that change. I
> >> tested with the Xen 4.12 release and that definitely still deadlocks.
> >
> > I don't know if Andrew is talking about this patch (probably not, but
> > it looks at least related):
> >
> > http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=24d5282527f4647907b3572820b5335c15cd0356;hp=29d28b29190ba09d53ae7e475108def84e16e363
> >
>
> I was referring to 29d28b2919 which is also in 4.12 as it turns out.
> That said, 24d5282527 might in practice be the cause of the deadlock, so
> I'd first experiment with taking that fix out.
>
> I know for certain that it won't be tested with PML enabled, because the
> use of PML is incompatible with write-protecting guest pagetables.
>

Sounds like the safe bet is to just have PML disabled when
introspection is used. It would be even better if the use of PML could
be controlled on a per-guest basis instead of through the current global
on/off switch. That way it could be disabled only for the introspected
domains.

I'll do some more experimentation when I get some free time, but two
observations speak against the vCPUs trying to pause each other being
the culprit:

- the deadlock doesn't happen with xen-access' altp2m use; it only
  happens when there are remapped gfns in the altp2m views
- I've added a domain_pause/unpause to the PML flusher before it enters
  the flush loop, but I still got a deadlock

Tamas
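
For readers following the quoted discussion of domain_pause_except_self(), here is a minimal toy sketch in C of the "take a try-lock or back off" idea Andrew describes. This is not Xen code: struct toy_domain, the toy_* functions and TOY_ERESTART are hypothetical names invented for illustration, and the pause operation is split into two steps only so that the contended interleaving can be shown without real threads. As Tamas notes above, the mutual-pause scenario may well not be the cause of the deadlock he is seeing.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_VCPUS 2
#define TOY_ERESTART (-1)  /* hypothetical "back off and retry" result */

/* Toy model of a domain; NOT Xen's struct domain. */
struct toy_domain {
    bool deadlock_mutex_held;        /* models a per-domain try-lock */
    int  pause_count[TOY_NR_VCPUS];  /* > 0 means the vcpu is paused */
};

/* Try to take the guard without ever waiting for it. */
static bool toy_trylock(struct toy_domain *d)
{
    if (d->deadlock_mutex_held)
        return false;
    d->deadlock_mutex_held = true;
    return true;
}

static void toy_unlock(struct toy_domain *d)
{
    assert(d->deadlock_mutex_held);
    d->deadlock_mutex_held = false;
}

/*
 * Step 1: a vcpu that wants to pause its siblings first takes the guard.
 * If another vcpu already holds it, the caller backs off instead of
 * waiting, so two vcpus never sit waiting on each other.
 */
static int toy_pause_begin(struct toy_domain *d)
{
    return toy_trylock(d) ? 0 : TOY_ERESTART;
}

/* Step 2: pause every vcpu except the caller, then drop the guard. */
static void toy_pause_finish(struct toy_domain *d, int self)
{
    for (int i = 0; i < TOY_NR_VCPUS; i++)
        if (i != self)
            d->pause_count[i]++;
    toy_unlock(d);
}
```

In this model, if vcpu 0 has called toy_pause_begin() and vcpu 1 calls it before vcpu 0 reaches toy_pause_finish(), vcpu 1 gets TOY_ERESTART and is expected to retry later rather than block.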
