Re: [Xen-devel] [xen-unstable test] 107791: regressions - FAIL



>>> On 28.04.17 at 03:19, <osstest-admin@xxxxxxxxxxxxxx> wrote:
> flight 107791 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/107791/ 
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 107676

This may be indicative of a bug, I'm afraid:

(XEN) d0v4 Error pfn 804a8c: rd=32755 od=32756 caf=00000000 taf=0000000000000000
(XEN) d0v4 Error pfn 804a8c: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
(XEN) d0v4 Error pfn 804a8b: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
(XEN) d0v4 Error pfn 804a8b: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
(XEN) d0v4 Error pfn 804a8a: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
(XEN) d0v4 Error pfn 804a8a: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
...
(XEN) d0v4 Error pfn 804a31: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
(XEN) d0v4 Error pfn 804a31: rd=32755 od=32756 caf=180000000000000 taf=0000000000000001
(XEN) d0v4 Error pfn 804a30: rd=32755 od=32756 c(XEN) HVM12 save: CPU

rd is DOMID_COW and od is DOMID_INVALID (the latter indicating
an unowned page). I don't suppose there is any memory sharing
being set up for any tests in osstest, so I'm pretty confused. While
there is a respective get_page() in get_page_from_gfn_p2m(),
which surely looks like it shouldn't trigger any log message, from
what I can tell this isn't the path by which we get here, or else an
unowned page would also trigger the same warning for the immediately
preceding get_page(page, d). Or wait, no, the warning is being
suppressed for paging_mode_refcounts() and dying domains, so
quite likely it is that path.
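
For reference, the mapping from the decimal rd/od values in the log to the
reserved domain IDs can be checked with a few lines of C. The constants are
the ones from the public header (xen/include/public/xen.h);
reserved_domid_name() and check_logged_domids() are hypothetical helpers
made up for this illustration, not Xen functions.

```c
#include <assert.h>
#include <stdint.h>

/* Reserved domain IDs, values as in xen/include/public/xen.h. */
#define DOMID_FIRST_RESERVED 0x7FF0U
#define DOMID_SELF           0x7FF0U
#define DOMID_IO             0x7FF1U
#define DOMID_XEN            0x7FF2U
#define DOMID_COW            0x7FF3U
#define DOMID_INVALID        0x7FF4U
#define DOMID_IDLE           0x7FFFU

/* Hypothetical helper (not part of Xen): name a domain ID. */
static const char *reserved_domid_name(uint16_t domid)
{
    switch ( domid )
    {
    case DOMID_SELF:    return "DOMID_SELF";
    case DOMID_IO:      return "DOMID_IO";
    case DOMID_XEN:     return "DOMID_XEN";
    case DOMID_COW:     return "DOMID_COW";
    case DOMID_INVALID: return "DOMID_INVALID";
    case DOMID_IDLE:    return "DOMID_IDLE";
    default:
        return domid < DOMID_FIRST_RESERVED ? "ordinary domain"
                                            : "reserved, unnamed";
    }
}

/* The rd= and od= values from the log, checked against the constants. */
static void check_logged_domids(void)
{
    assert(32755 == DOMID_COW);     /* rd: domain the caller asked for */
    assert(32756 == DOMID_INVALID); /* od: owner found, i.e. none */
}
```

So rd=32755 is 0x7FF3 and od=32756 is 0x7FF4, matching the reading above.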

Therefore one possible code adjustment might be to do this
second get_page() only when p2m_is_shared(*t). We hold the
p2m lock, so the type is stable between when it was retrieved
and its possible use here. George, Tamas?

It also looks as if it should tell me something that the first page
on the first attempt has count_info and type_info zero, yet the
second attempt as well as all succeeding pages are
PGC_state_free (in which case type_info is meaningless and
rather indicates that the need_tlbflush flag is set). But for now
it doesn't; best I can think of is a race of domain cleanup with
something else still accessing the domain.

Does anyone else have any thoughts here?

Jan

