[Xen-devel] x86_32: spurious page faults in guest GDT area
Under prolonged stress I can reproduce this issue back to at least c/s 16084. In older changesets it was apparently so rare that during normal work/testing I either never noticed it or had to ignore it because it could not be reproduced. On recent changesets, however (tested only with our 2.6.25-based kernels so far), it happens much more frequently, occasionally even while the machine boots.

I inserted selector validation code in the context switch path to verify that a vcpu's selectors are okay (or rather, that the guest-provided part of the GDT is accessible). These checks have never indicated a failure so far.

The faults happen in various places (the hypervisor exit path as well as guest code), and always involve loading a selector register with a guest-defined value (i.e. one in the first page of the GDT). A page walk in the (hypervisor) fault handler shows that all levels of the translation exist (and are valid/consistent), and instrumentation of the selector manipulation functions shows that none of them gets called spuriously.

Hence I can only suspect either some asynchronous page table manipulation that lacks proper TLB flushing (though I'm not aware of anything like that), or some very rare issue with the CR3 reloading code.

The same 32-bit kernel used with a 64-bit hypervisor has not shown similar problems so far. I first thought this would help narrow down the problem, but I'm pretty clueless at this point, because none of the candidate areas where the 32-bit code differs from the 64-bit code look troublesome to me (most notably, TLB flushing is identical between the two).

Any ideas on how to narrow down the problem would be appreciated.

Thanks, Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
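
For readers less familiar with selector layout, the kind of check described above ("does this selector reference the first page of the GDT?") can be sketched roughly as follows. This is only an illustration of the standard x86 selector encoding (RPL in bits 0-1, table indicator in bit 2, descriptor index in bits 3-15), not the actual instrumentation used; the helper names `sel_offset`, `sel_is_null` and `sel_in_first_gdt_page` are hypothetical and do not exist in Xen.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE     4096u
#define DESC_SIZE     8u     /* bytes per segment descriptor */
#define SEL_RPL_MASK  0x3u   /* bits 0-1: requested privilege level */
#define SEL_TI_MASK   0x4u   /* bit 2: 0 = GDT, 1 = LDT */

/* Hypothetical helper: byte offset of the descriptor within its table. */
static inline uint32_t sel_offset(uint16_t sel)
{
    return sel & ~7u;        /* index * 8, with RPL and TI bits cleared */
}

/* Hypothetical helper: null selectors (GDT index 0) reference no descriptor. */
static inline bool sel_is_null(uint16_t sel)
{
    return (sel & ~SEL_RPL_MASK) == 0;
}

/*
 * Hypothetical helper: true if the selector names a GDT descriptor that
 * lies entirely within the first 4 KiB page of the table (indices 1..511).
 */
static inline bool sel_in_first_gdt_page(uint16_t sel)
{
    if (sel_is_null(sel) || (sel & SEL_TI_MASK))
        return false;
    return sel_offset(sel) + DESC_SIZE <= PAGE_SIZE;
}
```

With these definitions, a selector such as 0x0008 (GDT index 1) falls in the first page, while 0x1000 (GDT index 512) is the first selector referencing the second page.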