Re: [Xen-devel] Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)

On 26/08/11 15:44, Konrad Rzeszutek Wilk wrote:
> So while I am still looking at the hypervisor code to figure out why
> it would give me [when trying to map a grant page]:
> (XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000

It is failing in guest_map_l1e() because the page for the vmalloc'd
virtual address PTEs is not present.

The test that fails is:

(l2e_get_flags(l2e) & (_PAGE_PRESENT | _PAGE_PSE)) != _PAGE_PRESENT
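
For illustration only, here is that check pulled out into a standalone helper. This is a sketch, not the hypervisor source: the helper name is made up, and the flag values are the architectural x86 bits (present = bit 0, PSE = bit 7) redefined locally so the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-table flag bits, redefined locally for this sketch. */
#define _PAGE_PRESENT 0x001u  /* entry is present */
#define _PAGE_PSE     0x080u  /* entry maps a superpage, not an L1 table */

/*
 * Hypothetical helper: returns true when an L2 entry with these flags
 * points at an L1 page table, i.e. it is present and not a superpage.
 * The lookup in guest_map_l1e() gives up when this is false.
 */
static bool l2e_points_to_l1_table(uint32_t flags)
{
    return (flags & (_PAGE_PRESENT | _PAGE_PSE)) == _PAGE_PRESENT;
}
```

An L2 entry that is not present (flags 0) fails this check, which is the case hit here.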

I think this is because the GNTTABOP_map_grant_ref hypercall is done
when task->active_mm != &init_mm, and alloc_vm_area() only adds PTEs
into init_mm.  So when Xen walks the active page tables it doesn't find
the entries, because they haven't been synced there yet.

Putting a call to vmalloc_sync_all() after alloc_vm_area() and before
the hypercall makes it work for me.  Classic Xen kernels used to have
such a call.
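
To make the ordering concrete, here is a toy user-space model of the problem. It is not kernel or Xen code: every toy_* name is hypothetical, and each page directory is reduced to an array of "L1 table present" flags. It only shows why the lookup fails before the sync and succeeds after it.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_L2_ENTRIES 8

/* Toy page directory: one flag per L2 slot, true if an L1 table is attached. */
struct toy_mm {
    bool l2_present[TOY_L2_ENTRIES];
};

/* Stands in for alloc_vm_area(): only the init_mm directory is populated. */
static void toy_alloc_vm_area(struct toy_mm *init_mm, size_t slot)
{
    init_mm->l2_present[slot] = true;
}

/* Stands in for vmalloc_sync_all(): copy init_mm's entries into another mm. */
static void toy_vmalloc_sync(const struct toy_mm *init_mm, struct toy_mm *mm)
{
    for (size_t i = 0; i < TOY_L2_ENTRIES; i++)
        if (init_mm->l2_present[i])
            mm->l2_present[i] = true;
}

/* Stands in for the hypervisor walking the currently active page tables. */
static bool toy_map_grant(const struct toy_mm *active_mm, size_t slot)
{
    return active_mm->l2_present[slot];
}
```

In this model, calling toy_map_grant() on a task's directory right after toy_alloc_vm_area() fails, and it succeeds once toy_vmalloc_sync() has run in between.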

This presumably works on some systems/configurations and not others
depending on what else is using vmalloc().  That is, if another kernel
thread (?) calls vmalloc() etc., then there will already be a page for
the vmalloc area PTEs and it will work.

I'll try and post a patch tomorrow.

Thanks to Ian Campbell for pointing me in the right direction.


