[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Error ignored in xc_map_foreign_pages



On Feb 13, 2014, at 5:00 AM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:

> On Wed, 2014-02-12 at 18:53 -0800, Mukesh Rathor wrote:
>> It appears that xc_map_foreign_pages() handles return incorrectly :
> 
> libxc is a complete mess in this regard (error handling) generally.
> 
> IIRC there is some subtlety on the Linux privcmd side wrt partial
> failure of batches, particularly once paging/sharing gets involved and
> you want to retry the missing subset, which might have something to do
> with this. David and Andres might know if that is relevant here.

Yes, unfortunately the semantics of error communication are rather quirky and 
therefore fairly error prone.

IIRC, the kernel interface returns -ENOENT in the global rc if any individual 
failure is ENOENT. ENOENT is the case for "hit a paged out pfn, you have to 
wait until it is paged back in". Libxc, however, has the retry built in to wait 
so that ENOENTs (individual or global) are never returned to the caller of 
xc_map_foreign_bulk and friends.

The other condition that sets the rc for the whole operation is an EFAULT in 
copy to/from user. In that (unlikely) case, the values in the err array cannot 
be reasonably trusted.

Finally, if you have partial or total success the rc is zero, and individual 
error entries may have non-zero rc, as in the case in which a map of a hole is 
attempted.

So I believe you've identified a bug in the code below. As for the proposed 
solution, I would still check globally for EFAULT and have a big hammer in that 
case.

Caveat: I haven't looked at the actual code when whipping up this email.

Andres 
> 
>>    res = xc_map_foreign_bulk(xch, dom, prot, arr, err, num);
>>    if (res) {
>>        for (i = 0; i < num; i++) {
>>            if (err[i]) {
>>                errno = -err[i];
>>                munmap(res, num * PAGE_SIZE);
>>                res = NULL;
>>                break;
>>            }
>>        }
>>    }
>> 
>> The add to_physmap batch'd interface  actually will store errors
>> in the err array, and return 0 unless EFAULT or something like that.
>> See xenmem_add_to_physmap_batch(). The case I'm looking at, xentrace
>> calls here to map page which fails, but the return is 0 as the error is
>> succesfully copied by xen. But the error is missed above since res is 0.
>> xentrace does something again, and that causes xen crash. 
>> 
>> It appears the fix could be just removing the check for res above...
>> 
>>    res = xc_map_foreign_bulk(xch, dom, prot, arr, err, num);
>>    for (i = 0; i < num; i++) {
>>        if (err[i]) {
>>         .....
>> 
>> What do you guys think?
>> 
>> thanks,
>> Mukesh
> 
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.