
Re: [PATCH 3/3] xen/evtchn: Clean up teardown handling


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 22 Dec 2020 11:28:24 +0000
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 22 Dec 2020 11:28:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 22/12/2020 10:48, Jan Beulich wrote:
> On 21.12.2020 19:14, Andrew Cooper wrote:
>> First of all, rename the evtchn APIs:
>>  * evtchn_destroy       => evtchn_teardown
>>  * evtchn_destroy_final => evtchn_destroy
> I wonder how far this is going to cause confusion with backports
> down the road. May I suggest doing only the first of the two renames,
> at least until a couple of years' time from now? Or make the second
> rename to e.g. evtchn_cleanup() or evtchn_deinit()?

I considered backports, but I don't think it will be an issue.  The
contents of the two functions are very different, and we're not likely
to be moving the callers in backports.

I'm not fussed about the exact naming, so long as we can reach an
agreement and adhere to it strictly.  The current APIs are a total mess.

I used teardown/destroy because that seems to be one common theme in the
APIs, but it will require some of them to be renamed.

>> RFC.  While testing this, I observed this, after faking up an -ENOMEM in
>> dom0's construction:
>>
>>   (XEN) [2020-12-21 16:31:20] NX (Execute Disable) protection active
>>   (XEN) [2020-12-21 16:33:04]
>>   (XEN) [2020-12-21 16:33:04] ****************************************
>>   (XEN) [2020-12-21 16:33:04] Panic on CPU 0:
>>   (XEN) [2020-12-21 16:33:04] Error creating domain 0
>>   (XEN) [2020-12-21 16:33:04] ****************************************
>>
>> XSA-344 appears to have added nearly 2 minutes of wallclock time into the
>> domain_create() error path, which isn't ok.
>>
>> Considering that event channels haven't even been initialised in this
>> particular scenario, it ought to take ~0 time.  Even if event channels have
>> been initialised, none can be active as the domain isn't visible to the
>> system.
> evtchn_init() sets d->valid_evtchns to EVTCHNS_PER_BUCKET. Are you
> suggesting cleaning up one bucket's worth of unused event channels
> takes two minutes? If this is really the case, and considering there
> could at most be unbound Xen channels, perhaps we could avoid
> calling evtchn_teardown() from domain_create()'s error path? We'd
> need to take care of the then missing accounting (->active_evtchns
> and ->xen_evtchns), but this should be doable.

Actually, it's a bug in this patch.  evtchn_init() hasn't been called, so
d->valid_evtchns is 0, so the loop is 2^32 iterations long.  Luckily,
this is easy to fix.
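
To illustrate the wraparound, here is a minimal standalone sketch.  It
assumes a countdown loop of the shape `for ( i = start; --i; )` with an
unsigned counter; that shape, the function names and the 16-bit counter
are all assumptions made for the demo, not code quoted from the patch.
With a 32-bit counter the same wrap gives roughly 2^32 iterations.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the iterations of "for ( i = start; --i; )".
 * A 16-bit counter is used so the wrapped case finishes instantly.
 * Hypothetical helper for illustration only.
 */
uint32_t countdown_iterations(uint16_t start)
{
    uint32_t n = 0;
    uint16_t i;

    /* With start == 0, --i wraps to 65535 before the first test. */
    for ( i = start; --i; )
        n++;

    return n;
}

/* Same loop, but skipped entirely when nothing was ever set up. */
uint32_t guarded_countdown_iterations(uint16_t start)
{
    uint32_t n = 0;
    uint16_t i;

    if ( start == 0 )
        return 0;

    for ( i = start; --i; )
        n++;

    return n;
}

/* Small self-check of the three interesting cases. */
void check_wraparound(void)
{
    assert(countdown_iterations(64) == 63);        /* normal case */
    assert(countdown_iterations(0) == 65535);      /* wrapped case */
    assert(guarded_countdown_iterations(0) == 0);  /* guarded case */
}
```

The guard is just one possible way to handle the uninitialised case; the
eventual fix in the patch may well look different.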

As for avoiding the call: specifically not.  Part of the problem we're
trying to fix is that we've got two different destruction paths, and
making domain_teardown() be fully idempotent is key to fixing that.
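
As a rough sketch of what "fully idempotent" means here: the teardown
step has to be safe on state that was never initialised, and safe to
call more than once.  The struct and the demo_*() names below are
hypothetical stand-ins, not Xen's struct domain or its event channel
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-domain state. */
struct demo_domain {
    unsigned int valid_ports;   /* 0 until demo_init() has run */
    int *ports;                 /* NULL until demo_init() has run */
};

/* Allocate nr ports.  Returns 0 on success, -1 on allocation failure. */
int demo_init(struct demo_domain *d, unsigned int nr)
{
    assert(d != NULL);

    d->ports = calloc(nr, sizeof(*d->ports));
    if ( !d->ports )
        return -1;

    d->valid_ports = nr;
    return 0;
}

/*
 * Idempotent teardown: correct on zeroed state, after a failed or
 * skipped demo_init(), and on repeat calls.
 */
void demo_teardown(struct demo_domain *d)
{
    assert(d != NULL);

    free(d->ports);     /* free(NULL) is a no-op */
    d->ports = NULL;
    d->valid_ports = 0;
}
```

With this shape, an error path can call demo_teardown() unconditionally,
whether or not demo_init() ever ran, which is the property that lets two
destruction paths collapse into one.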

~Andrew



 

