
Re: [Xen-devel] [PATCH 6/7] xen-gntdev: Support mapping in HVM domains



On 01/11/2011 08:15 AM, Daniel De Graaf wrote:
> On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
>>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>>             goto out;
>>>  
>>>     for (i = 0; i < pages; i++) {
>>> +           uint32_t check, *tmp;
>>>             WARN_ON(unmap_ops[i].status);
>>> -           __free_page(map->pages[offset+i]);
>>> +           if (!map->pages[i])
>>> +                   continue;
>>> +           /* XXX When unmapping, Xen will sometimes end up mapping the GFN
>>> +            * to an invalid MFN. In this case, writes will be discarded and
>>> +            * reads will return all 0xFF bytes. Leak these unusable GFNs
>>> +            * until a way to restore them is found.
>>> +            */
>>> +           tmp = kmap(map->pages[i]);
>>> +           tmp[0] = 0xdeaddead;
>>> +           mb();
>>> +           check = tmp[0];
>>> +           kunmap(map->pages[i]);
>>> +           if (check == 0xdeaddead)
>>> +                   __free_page(map->pages[i]);
>>> +           else if (debug)
>>> +                   printk("%s: Discard page %d=%ld\n", __func__,
>>> +                           i, page_to_pfn(map->pages[i]));
>>
>> Whoa. Any leads as to when the "sometimes" happens? Does the status report an
>> error or is it silent?
> 
> Status is silent in this case. I can produce it quite reliably on my
> test system where I am mapping a framebuffer (1280 pages) between two
> HVM guests - in this case, about 2/3 of the released pages will end up
> being invalid. It doesn't seem to be size-related as I have also seen
> it on the small 3-page page index mapping. There is a message in the
> xm dmesg output that may be related:
> 
> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002
> 
> This appears about once per page, with different MFNs but the same c/t.
> One of the two HVM guests (the one doing the mapping) has the PCI
> graphics card forwarded to it.
> 

Just tested on the latest Xen 4.1 (with changeset 22402:7d2fdc083c9c
reverted, as it breaks HVM grants); this produces different output:

...
(XEN) mm.c:889:d1 Error getting mfn b803e (pfn 25a3e) from L1 entry 00000000b803e021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn b8038 (pfn 25a38) from L1 entry 00000000b8038021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn b803d (pfn 25a3d) from L1 entry 00000000b803d021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 10829 (pfn 25a29) from L1 entry 0000000010829021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 1081c (pfn 25a1c) from L1 entry 000000001081c021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 10816 (pfn 25a16) from L1 entry 0000000010816021 for l1e_owner=1, pg_owner=1
(XEN) mm.c:889:d1 Error getting mfn 1081a (pfn 25a1a) from L1 entry 000000001081a021 for l1e_owner=1, pg_owner=1
...

These errors appear at map time; nothing is printed on unmap. If the
unmap happens while the domain is still up, the pages seem to be invalid
more often; most (perhaps all) of the unmaps that leave the destination
page valid happen when the domain is being destroyed. Exactly which
pages end up valid or invalid seems to be mostly random, although
nearby GFNs tend to share the same validity.
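
For clarity, the check the patch performs boils down to the helper
below (an untested sketch; the function name and the factoring into a
helper are mine, not part of the patch):

#include <linux/types.h>        /* bool, uint32_t */
#include <linux/highmem.h>      /* kmap()/kunmap() */
#include <linux/mm.h>           /* struct page */

/* Write a sentinel into the page and read it back.  When Xen has left
 * the GFN pointing at an invalid MFN, writes are discarded and reads
 * return all 0xFF bytes, so the sentinel will not match and the page
 * must be leaked instead of freed.
 */
static bool gntdev_page_is_usable(struct page *page)
{
        uint32_t check, *tmp;

        tmp = kmap(page);
        tmp[0] = 0xdeaddead;
        mb();                   /* order the write before the read back */
        check = tmp[0];
        kunmap(page);

        return check == 0xdeaddead;
}

unmap_grant_pages() then calls __free_page() only when the probe
succeeds, and leaks the page otherwise.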

If you have any thoughts as to the cause, I can test patches or provide
more output as needed; it would be best if this workaround could be
avoided entirely.

-- 
Daniel De Graaf
National Security Agency
