[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 07/16] x86/shadow: call sh_update_cr3() directly from sh_page_fault()
On 27.03.2023 17:39, Tim Deegan wrote: > At 10:33 +0100 on 22 Mar (1679481226), Jan Beulich wrote: >> There's no need for an indirect call here, as the mode is invariant >> throughout the entire paging-locked region. All it takes to avoid it is >> to have a forward declaration of sh_update_cr3() in place. >> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> >> --- >> I find this and the respective Win7 related comment suspicious: If we >> really need to "fix up" L3 entries "on demand", wouldn't we better retry >> the shadow_get_and_create_l1e() rather than exit? The spurious page >> fault that the guest observes can, after all, not be known to be non- >> fatal inside the guest. That's purely an OS policy. > > I think it has to be non-fatal because it can happen on real hardware, > even if the hardware *does* fill the TLB here (which it is not > required to). But if hardware filled the TLB, it won't raise #PF. If it didn't fill the TLB (e.g. because of not even doing a walk when a PDPTE is non- present), a #PF would be legitimate, and the handler would then need to reload CR3. But such a #PF is what, according to the comment, Win7 chokes on. So it can only be the former case, yet what we do here is fill the (virtual) TLB _and_ raise #PF. Win7 apparently ignores this as spurious, but that's not required behavior for an OS afaik. > Filling just one sl3e sounds plausible, though we don't want to go > back to the idea of having L3 shadows on PAE! Of course. >> Furthermore the sh_update_cr3() will also invalidate L3 entries which >> were loaded successfully before, but invalidated by the guest >> afterwards. I strongly suspect that the described hardware behavior is >> _only_ to load previously not-present entries from the PDPT, but not >> purge ones already marked present. > > Very likely, but we *are* allowed to forget old entries whenever we > want to, so this is at worst a performance problem. That depends on the model, I think: In the original Intel model PDPTEs cannot be "forgotten". In some AMD variants, where L3 is walked normally, they of course can be. >> IOW I think sh_update_cr3() would >> need calling in an "incremental" mode here. > > This would be the best way of updating just the one entry - but as far > as I can tell the existing code is correct so I wouldn't add any more > complexity unless we know that this path is causing perf problems. If it was/is just performance - sure. >> In any event emitting a TRC_SHADOW_DOMF_DYING trace record in this case >> looks wrong. > > Yep. Will add another patch to the series then. >> Beyond the "on demand" L3 entry creation I also can't see what guest >> actions could lead to the ASSERT() being inapplicable in the PAE case. >> The 3-level code in shadow_get_and_create_l2e() doesn't consult guest >> PDPTEs, and all other logic is similar to that for other modes. > > The assert's not true here because the guest can push us down this > path by doing exactly what Win 7 does here - loading CR3 with a > missing L3E, then filling the L3E and causing a page fault that uses > the now-filled L3E. (Or maybe that's not true any more; my mental > model of the pagetable walker code might be out of date) Well, I specifically started the paragraph with 'Beyond the "on demand" L3 entry creation'. IOW the assertion would look applicable to me if we, as suggested higher up, retried shadow_get_and_create_l1e() and it failed again. As the comment there says, "we know the guest entries are OK", so the missing L3 entry must have appeared. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |