
Re: [PATCH v4] gnttab: defer allocation of status frame tracking array



Hi Andrew,

On 05/05/2021 11:49, Andrew Cooper wrote:
On 04/05/2021 09:42, Jan Beulich wrote:
This array can be large when many grant frames are permitted; avoid
allocating it when it's not going to be used anyway, by doing this only
in gnttab_populate_status_frames(). While delaying this memory
allocation adds possible reasons for failure to the enclosing
operations, there are other memory allocations there already, so
callers can't expect these operations to always succeed anyway.

As to the re-ordering at the end of gnttab_unpopulate_status_frames(),
this is merely to represent intended order of actions (shrink array
bound, then free higher array entries). If there were racing accesses,
suitable barriers would need adding in addition.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Nack.

The argument you make says that the grant status frames array is "large
enough" to care about not allocating.  (I frankly disagree, but that
isn't relevant to my nack).

The change in logic here moves a failure from DOMCTL_createdomain to
GNTTABOP_setversion.  We know, from the last-minute screwups in XSA-226,
that there are versions of Windows and Linux in the field, used by
customers, which will BUG()/BSOD when GNTTABOP_setversion doesn't succeed.

Unfortunately, my reply to this on v2 was left unanswered. So I will restate it here with more details, in the hope that this can lead to a consensus now.

From a discussion during January's community call, the behavior was related to an old version of the Windows PV drivers. Newer versions will not use Grant Table v2.

However, your comment seems to suggest that GNTTABOP_setversion can never fail today. AFAICT, this is not true, because Xen will already return -ENOMEM when it fails to populate the status frames (see the call to gnttab_populate_status_frames()).
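
To make the point more concrete, here is a small self-contained model of that failure path. This is deliberately not the real grant_table.c code: set_version(), populate_status_frames() and the calloc() below only stand in for Xen's internals, and the frame count is arbitrary.

/*
 * Toy model (not Xen code) of the existing behaviour: populating the
 * status frames involves an allocation that can fail, so a guest
 * switching to grant table v2 can already observe -ENOMEM.
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

struct grant_table {
    unsigned int version;
    void **status;              /* stand-in for the status frame pages */
    unsigned int nr_status;
};

/* Stands in for gnttab_populate_status_frames(): may fail with -ENOMEM. */
static int populate_status_frames(struct grant_table *gt, unsigned int nr)
{
    gt->status = calloc(nr, sizeof(*gt->status));
    if ( !gt->status )
        return -ENOMEM;
    gt->nr_status = nr;

    return 0;
}

/* Stands in for the v1 -> v2 switch done for GNTTABOP_setversion. */
static int set_version(struct grant_table *gt, unsigned int version)
{
    int rc = 0;

    if ( version == 2 && gt->version < 2 )
        rc = populate_status_frames(gt, 32 /* arbitrary for the model */);
    if ( !rc )
        gt->version = version;

    return rc;                  /* the guest already has to cope with -ENOMEM */
}

int main(void)
{
    struct grant_table gt = { .version = 1 };

    printf("set_version(2) -> %d\n", set_version(&gt, 2));
    free(gt.status);

    return 0;
}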

Therefore...


You're literally adding even more complexity to the grant table, while
also increasing the chance of regressing VMs in the wild.  This is not ok.

... I am not sure why you are saying this is a regression.

The only form of this patch which is in any way acceptable is to avoid
the allocation when you know *in DOMCTL_createdomain* that it will never
be needed by the VM.  So far, that is from Kconfig and/or the command
line options.

I can see how allocating the memory upfront is easier for accounting purposes. However, it also means we may not be able to reduce the footprint when we don't know in advance whether the guest OS will use it.

This can happen, for instance, if you let your customers run arbitrary OSes and never restrict them to v1.

I believe that in most cases the guest will use v1. But we technically can't restrict it after the fact (e.g. on reboot), as this may regress customers' workloads.

That's where the current approach is quite appealing, because the allocation is done at runtime. So it caters for a lot more use cases than your suggestion.
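
To illustrate the trade-off, here is a rough sketch contrasting the two options: a policy fixed at domain creation time (as you suggest, driven by e.g. Kconfig/command line) versus allocating the status tracking only when a guest actually switches to v2. Again, this is a model rather than the actual patch; the allow_gnttab_v2 flag, the error values and the calloc() are purely illustrative.

/*
 * Rough model (not Xen code) of the two options being compared.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct domain_cfg {
    bool allow_gnttab_v2;          /* fixed when the domain is created */
};

struct grant_table {
    unsigned int version;
    void **status;                 /* NULL until v2 is actually requested */
    unsigned int max_status_frames;
};

static int set_version(const struct domain_cfg *cfg,
                       struct grant_table *gt, unsigned int version)
{
    if ( version == 2 )
    {
        /* Upfront policy: a guest restricted to v1 at creation simply fails. */
        if ( !cfg->allow_gnttab_v2 )
            return -ENOSYS;

        /* Runtime policy: pay for the tracking array only now. */
        if ( !gt->status )
        {
            gt->status = calloc(gt->max_status_frames, sizeof(*gt->status));
            if ( !gt->status )
                return -ENOMEM;
        }
    }

    gt->version = version;

    return 0;
}

int main(void)
{
    struct domain_cfg cfg = { .allow_gnttab_v2 = true };
    struct grant_table gt = { .version = 1, .max_status_frames = 32 };
    int rc = set_version(&cfg, &gt, 2);

    free(gt.status);

    return rc ? 1 : 0;
}

With the upfront policy you must guess at createdomain time; with the runtime policy the cost is only paid by guests that actually ask for v2.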

Cheers,

--
Julien Grall



 

