Re: [Xen-devel] [PATCH v2 03/23] x86: zero BSS using stosl instead of stosb

To: Jan Beulich <JBeulich@xxxxxxxx>
From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Date: Wed, 22 Jul 2015 12:22:18 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, daniel.kiper@xxxxxxxxxx, keir@xxxxxxx
Delivery-date: Wed, 22 Jul 2015 11:22:34 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 22/07/15 11:04, Jan Beulich wrote:
>>>> On 22.07.15 at 10:42, <andrew.cooper3@xxxxxxxxxx> wrote:
>> In the case of having aligned source and destination on a 16-byte
>> boundary (which we can trivially arrange), then ERMSB (to give it its
>> Intel name) and rep stosl differ only in the setup cost; they still
>> scale at the same rate for changes in length.
>>
>> Therefore, assuming we arrange for 16-byte alignment, using rep stosl
>> would appear to be a single 60ish cycle hit over using ERMSB, but would
>> be substantially more efficient than using rep stosb on a non-ERMSB system.
>>
>> Overall, I think 16 byte alignment and rep stosl is the best compromise.
> Or leaving such code alone, with the assumption that over time the
> setup cost (on a growing number of systems) outweighs the benefits
> (on a shrinking set).

The BSS is large - 295k on the last compile I have from staging.  The
setup cost is lost in the noise compared to the elapsed time to write
that many zeroes to memory.

Therefore, on an ERMSB-capable system, the two options will complete in
the same amount of time.

However, on all AMD hardware, and on Intel hardware older than Ivy Bridge,
rep stosl is 4 times faster than rep stosb.
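
For illustration only, here is a rough C sketch of the "aligned buffer plus
rep stosl" approach discussed above.  It is not the code from the patch:
zero_stosl() is a hypothetical helper name, and the sketch assumes a GCC or
Clang compiler, a destination that is at least 4-byte aligned, and a length
that is a multiple of 4.  On other architectures it falls back to memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): zero `len` bytes at `dst`
 * using 4-byte stores.  Assumes `dst` is 4-byte aligned and `len` is a
 * multiple of 4, as a linker-script-aligned BSS would be.
 */
static void zero_stosl(void *dst, size_t len)
{
    assert(((uintptr_t)dst & 3) == 0 && (len & 3) == 0);

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    size_t count = len / 4;   /* one iteration per 32-bit store */

    /*
     * rep stosl stores %eax to (%edi/%rdi), advancing the pointer by 4
     * and decrementing the count register until it reaches zero.  The
     * psABI guarantees the direction flag is clear on function entry.
     */
    __asm__ volatile ("rep stosl"
                      : "+D" (dst), "+c" (count)
                      : "a" (0)
                      : "memory");
#else
    /* Non-x86 fallback so the sketch still builds elsewhere. */
    memset(dst, 0, len);
#endif
}
```

A caller would pass the start of the region and its size in bytes, e.g.
zero_stosl(buf, sizeof(buf)) for a suitably aligned buf.  The byte-wise
equivalent (rep stosb with count = len) performs four times as many
iterations for the same length, which is where the difference on hardware
without ERMSB comes from.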

~Andrew
