[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.12 V4] x86/altp2m: fix HVMOP_altp2m_set_domain_state race




> On Feb 15, 2019, at 1:24 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> 
>>>> On 15.02.19 at 13:52, <George.Dunlap@xxxxxxxxxx> wrote:
>>> On Feb 12, 2019, at 11:42 AM, Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx> 
>>> wrote:
>>> 
>>> HVMOP_altp2m_set_domain_state does not domain_pause(), presumably
>>> on purpose (as it was originally supposed to cater to a in-guest
>>> agent, and a domain pausing itself is not a good idea).
>> 
>> Sorry to come in here on v4 and suggest changing everything, but I don’t 
>> really like the solution you have here.  Not setting altp2m to ‘active’ 
>> until 
>> after the vcpus are set up makes sense; but passing this true/false value in 
>> seems ugly, and still seems a bit racy (i.e., what if p2m_active() is 
>> disabled between the check in HVMOP_altp2m_switch_p2m and the time we 
>> actually call altp2m_vcpu_update_p2m()?)
>> 
>> I certainly don’t think domain_pause() should be our go-to solution for race 
>> avoidance, but in this case it really seems like switching the global p2m 
>> for 
>> every vcpu at once makes the most sense; and trying to safely change this on 
>> an unpaused domain is not only overly complicated, but probably not what we 
>> wanted anyway.
>> 
>> p2m_altp2m_destroy_by_id() and p2m_switch_domain_altp2m_by_id() already use 
>> domain_pause_except_self(); so it seems like not using it for 
>> altp2m_set_domain_state was probably more of an oversight than an 
>> intentional 
>> decision.  Using that here seems like a more robust solution.
> 
> Ah, I didn't even recall there was such a function. As this now
> also allows covering a domain requesting the operation for itself,
> I don't mind the pausing approach anymore.

Yeah, I forgot too until I was grepping for “domain_pause” to figure out what 
everyone else was doing. :-)

>> The one issue is that domain_pause_except_self() currently is actually a 
>> deadlock risk if two different vcpus start it at the same time.  I think the 
>> attached patch (compile-tested only) should fix this issue; after this patch 
>> you should be able to use domain_pause_except_self() in 
>> altp2m_set_domain_state instead.
> 
> There's one thing I don't really like here, which is a result of the
> (necessary) re-use of the hypercall deadlock mutex: This
> certainly poses the risk of getting called from a context where
> the lock was already acquired. Therefore I'd like to suggest to
> use this lock in a recursive way (here and elsewhere).

> 
> And two cosmetic remarks - there's no need to re-specify
> __must_check on the function definition, as the function
> declaration ought to be in scope anyway. And there's a stray
> blank inside the likely() you add.

I don’t see that I added a ‘likely’; there’s one in context, but I don’t see 
any stray blanks there.

The other two points make sense — Razvan, would you be willing to make those 
changes (and test the result, as I haven’t done more than compile-test it)?

 -George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.