Re: [Xen-devel] [PATCH for-4.12] x86/altp2m: fix HVMOP_altp2m_set_domain_state race
>>> On 08.02.19 at 12:58, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 2/8/19 1:13 PM, Razvan Cojocaru wrote:
>> On 2/8/19 12:51 PM, Jan Beulich wrote:
>>>>>> On 08.02.19 at 10:56, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> HVMOP_altp2m_set_domain_state does not domain_pause(), presumably
>>>> on purpose (as it was originally supposed to cater to an in-guest
>>>> agent, and a domain pausing itself is not a good idea).
>>>>
>>>> This can lead to domain crashes in the vmx_vmexit_handler() code
>>>> that checks if the guest has the ability to switch EPTP without an
>>>> exit. That code can __vmread() the host p2m's EPT_POINTER
>>>> (before HVMOP_altp2m_set_domain_state "for_each_vcpu()" has a
>>>> chance to run altp2m_vcpu_initialise(), but after
>>>> d->arch.altp2m_active is set).
>>>>
>>>> While the in-guest scenario continues to pose problems, this
>>>> patch fixes the "external" case.
>>>
>>> IOW you're papering over the problem rather than fixing it. Why
>>> does altp2m_active get set to true before actually having set up
>>> everything? Shouldn't it get cleared early, but set late?
>>
>> Well, yes, that would have been my second attempt: set the "altp2m
>> enabled" bool after the init, and before the uninit and no longer
>> domain_pause() explicitly; however I thought that was a brittle
>> solution, relying on comments / programmer attention to the code
>> sequence rather than taking a proper lock.
>>
>> I'll test that scenario then and return with the results / possibly
>> another patch.
>
> Actually, your suggestion does not work, because the way the code has
> been designed, altp2m_vcpu_initialise() calls altp2m_vcpu_update_p2m(),
> which does the proper work that's interesting to us here, like this:
>
> 2153 static void vmx_vcpu_update_eptp(struct vcpu *v)
> 2154 {
> 2155     struct domain *d = v->domain;
> 2156     struct p2m_domain *p2m = NULL;
> 2157     struct ept_data *ept;
> 2158
> 2159     if ( altp2m_active(d) )
> 2160         p2m = p2m_get_altp2m(v);
> 2161     if ( !p2m )
> 2162         p2m = p2m_get_hostp2m(d);
> 2163
> 2164     ept = &p2m->ept;
> 2165     ept->mfn = pagetable_get_pfn(p2m_get_pagetable(p2m));
> 2166
> 2167     vmx_vmcs_enter(v);
> 2168
> 2169     __vmwrite(EPT_POINTER, ept->eptp);
> 2170
> 2171     if ( v->arch.hvm.vmx.secondary_exec_control &
> 2172          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
> 2173         __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
> 2174
> 2175     vmx_vmcs_exit(v);
> 2176 }
>
> So please note that on line 2159 it checks if altp2m is active, and only
> then does it do the right thing. So setting the d->arch.altp2m_active
> bool _after_ calling altp2m_vcpu_initialise() will fail to work
> correctly - turning this into a chicken-and-egg problem, or perhaps more
> interestingly, another discussion about whether in-guest-only altp2m
> agents make any sense fundamentally.

Well, to be honest I expected dependencies like this to be there, and
hence I didn't expect it would be a straightforward change. Just like
we do e.g. for the IOMMU enabling, I guess the boolean wants to become
a tristate then (off -> enabling -> enabled), which interested sites
then can use to distinguish what they want/need to do.

Another relatively obvious solution would be to add a boolean parameter
to altp2m_vcpu_update_p2m() such that altp2m_vcpu_initialise() can
guide it properly. But this of course depends to a certain degree on
how widespread the problem is.
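
To illustrate the tristate idea, here is a minimal standalone sketch. It
is not Xen code: the enum, the toy_* structures and both predicates are
hypothetical names made up for this example, and the per-vCPU update is
reduced to recording which EPT pointer would have been chosen. The point
is only that the setup path can treat "enabling" as active, while a site
like the VM exit handler waits for "enabled".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tristate replacing the d->arch.altp2m_active boolean. */
enum altp2m_state {
    ALTP2M_OFF,       /* altp2m not in use */
    ALTP2M_ENABLING,  /* per-vCPU setup is in progress */
    ALTP2M_ENABLED,   /* every vCPU has been set up */
};

/* Toy stand-ins for struct domain / struct vcpu (assumptions, not Xen's). */
struct toy_domain {
    enum altp2m_state altp2m_state;
};

struct toy_vcpu {
    struct toy_domain *domain;
    bool uses_altp2m_eptp;  /* models which EPT_POINTER was written */
};

/* Setup-path predicate: true as soon as enabling has begun. */
static bool altp2m_setup_or_active(const struct toy_domain *d)
{
    return d->altp2m_state != ALTP2M_OFF;
}

/* Runtime predicate (e.g. VM exit path): true only once setup is complete. */
static bool altp2m_fully_active(const struct toy_domain *d)
{
    return d->altp2m_state == ALTP2M_ENABLED;
}

/* Models the p2m choice made at the top of vmx_vcpu_update_eptp(). */
static void toy_vcpu_update_eptp(struct toy_vcpu *v)
{
    v->uses_altp2m_eptp = altp2m_setup_or_active(v->domain);
}

/* Models the enable path: off -> enabling -> (init all vCPUs) -> enabled. */
static void toy_altp2m_enable(struct toy_domain *d,
                              struct toy_vcpu *vcpus, size_t nr)
{
    assert(d->altp2m_state == ALTP2M_OFF);

    d->altp2m_state = ALTP2M_ENABLING;
    for ( size_t i = 0; i < nr; i++ )
        toy_vcpu_update_eptp(&vcpus[i]);
    d->altp2m_state = ALTP2M_ENABLED;
}
```

With this split, a site that runs concurrently with the enable path sees
altp2m_fully_active() return false until every vCPU has been updated. The
sketch ignores memory ordering and the disable path, both of which real
code would have to handle.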

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel