
Re: [PATCH 5/5] x86/HVM: limit cache writeback overhead


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 19 Apr 2023 22:55:49 +0100
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>
  • Delivery-date: Wed, 19 Apr 2023 21:56:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 19/04/2023 11:46 am, Jan Beulich wrote:
> There's no need to write back caches on all CPUs upon seeing a WBINVD
> exit; ones that a vCPU hasn't run on since the last writeback (or since
> it was started) can't hold data which may need writing back.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
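
(For readers following along: the shape of the change under discussion is
roughly the sketch below.  The field name and hook points are illustrative
rather than lifted from the patch; the cpumask and on_selected_cpus()
interfaces are Xen's.)

    /* Hypothetical per-domain field: CPUs a vCPU has run on since the
     * last cache writeback. */
    cpumask_t dirty_cache_mask;

    /* Scheduler hook: this pCPU may now hold the domain's cache lines. */
    static void note_vcpu_running(struct vcpu *v)
    {
        cpumask_set_cpu(smp_processor_id(),
                        &v->domain->arch.dirty_cache_mask);
    }

    static void wbinvd_ipi(void *info)
    {
        wbinvd();
    }

    /* WBINVD intercept: only IPI the CPUs recorded above, rather than
     * every online CPU. */
    static void limited_cache_writeback(struct domain *d)
    {
        cpumask_t *mask = &d->arch.dirty_cache_mask;

        on_selected_cpus(mask, wbinvd_ipi, NULL, 1);
        cpumask_clear(mask);  /* races with concurrent scheduling ignored */
    }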

I find it unlikely that this is an improvement in any way at all.

You're adding a memory allocation, and making the fastpath slower, for
all HVM domains, even the ~100% of them which in practice are never
given a device in the first place.

Just so you can skip the WBINVD side effect on the L1/L2 caches of the
CPUs this domain happens not to have run on since the last time they
flushed (which is already an underestimate).  Note how this does not
change the behaviour for the L3 caches, which form the overwhelming
majority of the WBINVD overhead in the first place.

So my response was going to be "definitely not without the per-domain
'reduced cacheability permitted' setting we've discussed".  And even
then, not without numbers suggesting it's a problem in the first place,
or at least a better explanation of why it might plausibly be an issue.


But, in writing this, I've realised a real bug.

Cache snoops can occur and pull lines sideways for microarchitectural
reasons.  And even if we want to hand-wave that away as being unlikely
(it is), you can't hand-wave away rogue speculation in the directmap.

By not issuing WBINVD on all cores, you've got a real chance of letting
some lines escape the attempt to evict them.

But it's worse than that: even when IPIing all cores, there's a
speculation pattern which can cause some lines to survive.  Rare, sure,
but not impossible.

Right now, I'm not sure that WBINVD can even be used safely without the
extra careful use of CR0.{CD,NW}, which provides a workaround for
native, but nothing helpful for hypervisors...
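
For reference, the native sequence is roughly the sketch below
(illustrative only; the helper is made up, but with CD set the cache is
in no-fill mode, so WBINVD's eviction can't be undone behind its back):

    /* Sketch of the native CR0.CD recipe.  Not usable as-is by a
     * hypervisor, since guest execution re-populates lines as soon as it
     * runs again.  CR0.CD is bit 30, CR0.NW is bit 29. */
    static void native_flush_with_cd(void)
    {
        unsigned long cr0;

        asm volatile ( "cli" );                   /* no interruptions */

        asm volatile ( "mov %%cr0, %0" : "=r" (cr0) );
        cr0 |=  (1UL << 30);                      /* CD = 1: no-fill mode */
        cr0 &= ~(1UL << 29);                      /* NW = 0 */
        asm volatile ( "mov %0, %%cr0" :: "r" (cr0) : "memory" );

        asm volatile ( "wbinvd" ::: "memory" );   /* write back + evict */

        /* ... do whatever required the caches to stay empty, then
         * restore CR0 and re-enable interrupts ... */
    }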

~Andrew