Re: [Xen-devel] [VMI] Possible race-condition in altp2m APIs
On 06/05/2019 18:41, Tamas K Lengyel wrote:
> Hi Andrew,
> thanks for helping brainstorming on this.
>
>> How exactly does DRAKVUF go about injecting silent breakpoints? It
>> obviously has to allocate a new gfn from somewhere to begin with. Do the
>> bifurcated frames end up in two different altp2ms, or one in the host p2m
>> and one in an alternative? Does #VE ever get used?
> I've posted a blog entry about it a while ago, it's still accurate:
> https://xenproject.org/2016/04/13/stealthy-monitoring-with-xen-altp2m.

Talking of, have we fixed the emulation of `sti`? I don't recall any
changes, but given our aim to get the emulator complete, we should fix it.

> You can't add new frames to only some of the altp2m's - at least not
> with the current interfaces. All the shadow pages are added to the
> hostp2m and then in the altp2m the GFN is remapped to the mfn of the
> shadow page with an execute-only permissions.

Ah - of course. gfns only make sense in the context of the hostp2m.

> This way the breakpoint
> can be written into the shadow-page and any attempt to read it can be
> safely handled on a per-vCPU base by switching it back to the hostp2m
> for the duration of a singlestep (with MTF). Setting up the shadow
> pages is only safe to do during the initial setup while the altp2m
> view is not used and the guest is paused. Once altp2m views are being
> used adding new pages to the hostp2m results in losing all altp2m
> settings. For the most part this limitation is not an issue because
> all supported use-cases add the breakpoints once during the initial
> setup and there are no breakpoints added later during runtime.

What do the host p2m permissions get set to? How do you cope with
future reuse of the gfn for a different purpose later?
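For anyone following along, the shadow-page scheme described in the quoted text can be sketched as a toy model. This is only an illustration under stated assumptions: plain dicts stand in for the p2m tables, `ToyDomain`, `HOST`, `ALT` and every method name are made up for the example (none of them are Xen or DRAKVUF interfaces), and the hostp2m permissions given to the shadow gfn are a placeholder, since the thread leaves that question open.

```python
HOST, ALT = 0, 1  # hypothetical view ids for this toy model


class ToyDomain:
    """Toy model of a hostp2m plus one altp2m view. Not a Xen API."""

    def __init__(self):
        self.frames = {}                 # mfn -> bytearray (machine frames)
        self.p2m = {HOST: {}, ALT: {}}   # view -> {gfn: (mfn, perms)}
        self.next_mfn = 0x1000

    def populate(self, gfn, data):
        """Add a new gfn to the hostp2m, backed by a fresh frame."""
        mfn = self.next_mfn
        self.next_mfn += 1
        self.frames[mfn] = bytearray(data)
        # Placeholder permissions: the thread does not say what the
        # hostp2m permissions of a shadow gfn actually are.
        self.p2m[HOST][gfn] = (mfn, "rwx")
        return mfn

    def lookup(self, view, gfn):
        """altp2m entries override the hostp2m; the rest falls through."""
        return self.p2m[view].get(gfn, self.p2m[HOST][gfn])

    def add_breakpoint(self, gfn, shadow_gfn, offset):
        """Copy the frame to a shadow page, patch in 0xCC, remap in ALT."""
        orig_mfn, _ = self.p2m[HOST][gfn]
        shadow_mfn = self.populate(shadow_gfn, self.frames[orig_mfn])
        self.frames[shadow_mfn][offset] = 0xCC
        # In the altp2m the original gfn now points at the shadow frame,
        # execute-only, so reads and writes will fault.
        self.p2m[ALT][gfn] = (shadow_mfn, "x")

    def access(self, view, gfn, offset, kind):
        """kind is 'r', 'w' or 'x'. Returns the byte the vCPU would see."""
        mfn, perms = self.lookup(view, gfn)
        if kind not in perms:
            # Models the violation handler: the access is retried on the
            # hostp2m for one single-step, then the vCPU goes back to ALT.
            mfn, perms = self.lookup(HOST, gfn)
        return self.frames[mfn][offset]
```

With `d = ToyDomain()`, `d.populate(0x10, b"\x90\x90")` and `d.add_breakpoint(0x10, 0x9000, 1)`, an instruction fetch through `ALT` sees the breakpoint byte while a read through `ALT` sees the original byte. The model says nothing about what happens when new pages are added to the hostp2m after the views are live, which is the limitation discussed above.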
>
> We've noticed that trapping MOV-TO-CR3 with the latest version of
> Windows 10 has a lot of issues in terms of overhead when KPTI is used,
> so as a band-aid solution it can be disabled to improve performance
> (which Mathieu already did).

Meltdown isn't subtle with its perf problems...

What purpose are you trapping %cr3 writes for? Simply auditing the
pagetables in use?

If so, VT-x has (since forever, iirc) had the CR3 target list (of 4
entries) which Xen can use to whitelist "safe" %cr3 values, which bypass
the VMExit. If all you care about is that the vcpu stays on known-good
pagetables, this interface could be plumbed up to include the kernel and
user pagetables, which will avoid all the vmexits from syscalls due to
meltdown.

Alternatively, in some copious free time, once I've got the CPUID/MSR
interface in a better state, we could fake up MSR_ARCH_CAPS.RDCL_NO so
the guest doesn't turn on its meltdown mitigations in the first place.

>> Given how many EPT flushing bugs I've already found in this area, I wouldn't
>> be surprised if there are further ones lurking. If it is an EPT flushing
>> bug, this delta should make it go away, but it will come with a hefty perf
>> hit.
> My understanding is that the VPID implementation in Xen is such that
> effectively all VMEXITs will trigger assignment of a new VPID to the
> vCPU - which is likely a performance issue in itself - so flushing the
> EPT is likely not going to make a difference. But it's worth a shot,
> maybe it does :)

Sadly, things are far more complicated than that. For one, Intel still
owe me a comment/correction to that section of the SDM on INVLPG
emulation for guests.

Xen's use of ASIDs as a common concept started from the AMD side. AMD
strictly only cache linear => host physical mappings, so after any
change to the p2m, an ASID tick will guarantee to get you a fully clean
TLB for future pagewalks to populate.

The same is not true for Intel.
VPID and EPT were introduced together, and have several kinds of
mappings which are cached. The processor may cache:

1) linear => gpa mappings (tagged with current VPID and PCID values, and
   contain no information from EPT)
2) gpa => hpa mappings (tagged with the current EPTP, may contain other
   data such as the SPP vector, doesn't contain any data from the guest
   pagetables)
3) combined mappings which are linear => hpa mappings.

In particular, ticking the VPID after an EPT modification *does not*
invalidate the gpa=>hpa mappings, so the guest can continue to execute
using stale mappings.

This is why we've got the logic in vmx_vmenter_helper() to calculate if
an INVEPT instruction is necessary.

Hence my suggestion for identifying whether it is a real TLB flushing
issue, or a logical error elsewhere. :)

~Andrew
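The point about the VPID tick can be illustrated with a small toy model. This is a sketch under assumptions, not a description of real hardware or of Xen's code: it follows the three categories of cached mapping as listed in the mail, the tags for the combined mappings (VPID, PCID and EPTP) are an assumption of the model, and `ToyTlb` with its methods is a hypothetical name invented for the example.

```python
class ToyTlb:
    """Toy model of the three kinds of cached mapping. Not real hardware."""

    def __init__(self, guest_pt, ept, eptp=0xE000):
        self.guest_pt = guest_pt   # linear -> gpa (guest pagetables)
        self.ept = ept             # gpa -> hpa (EPT)
        self.vpid, self.pcid, self.eptp = 1, 0, eptp
        self.lin_gpa = {}          # (vpid, pcid, linear) -> gpa
        self.gpa_hpa = {}          # (eptp, gpa) -> hpa
        self.combined = {}         # (vpid, pcid, eptp, linear) -> hpa

    def translate(self, linear):
        """Translate linear -> hpa, filling caches on a miss."""
        ckey = (self.vpid, self.pcid, self.eptp, linear)
        if ckey in self.combined:
            return self.combined[ckey]
        lkey = (self.vpid, self.pcid, linear)
        if lkey not in self.lin_gpa:
            self.lin_gpa[lkey] = self.guest_pt[linear]
        gpa = self.lin_gpa[lkey]
        gkey = (self.eptp, gpa)
        if gkey not in self.gpa_hpa:
            self.gpa_hpa[gkey] = self.ept[gpa]
        hpa = self.gpa_hpa[gkey]
        self.combined[ckey] = hpa
        return hpa

    def tick_vpid(self):
        """New VPID: entries tagged with the old VPID stop matching.
        The gpa => hpa cache is tagged by EPTP only, so it survives."""
        self.vpid += 1

    def invept(self):
        """Drop everything derived from the current EPTP."""
        self.gpa_hpa = {k: v for k, v in self.gpa_hpa.items()
                        if k[0] != self.eptp}
        self.combined = {k: v for k, v in self.combined.items()
                         if k[2] != self.eptp}
```

In this model, translating an address, changing the `ept` entry behind it and then calling `tick_vpid()` still returns the old hpa on the next `translate()`, because the gpa => hpa entry is found under the unchanged EPTP tag. Only `invept()` makes the new EPT entry visible.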