
Re: [Xen-devel] [PATCH v3 02/16] x86: zero BSS using stosl instead of stosb



On 15/04/16 13:33, Daniel Kiper wrote:
> Speed up BSS initialization by using stosl instead of stosb.
>
> Some may argue that Intel Ivy Bridge and later provide the ERMSB
> feature, which means that "rep stosb" gives better throughput than
> "rep stosl" on those CPUs. However, this feature is only available on
> newer Intel processors, and AMD, for example, does not provide it at
> all. So stosb gives a real benefit, and beats stosl, only on a limited
> number of machines. On the other hand, stosl speeds up BSS
> initialization on all x86 platforms. Hence, use stosl instead of stosb.
>
> Additionally, align the relevant comment with the coding style.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
> ---
> v3 - suggestions/fixes:
>    - improve comments
>      (suggested by Konrad Rzeszutek Wilk),
>    - improve commit message
>      (suggested by Jan Beulich).
> ---
>  xen/arch/x86/boot/head.S |    5 +++--
>  xen/arch/x86/xen.lds.S   |    3 +++
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
> index f3501fd..32a54a0 100644
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -123,12 +123,13 @@ __start:
>          call    reloc
>          mov     %eax,sym_phys(multiboot_ptr)
>  
> -        /* Initialize BSS (no nasty surprises!) */
> +        /* Initialize BSS (no nasty surprises!). */
>          mov     $sym_phys(__bss_start),%edi
>          mov     $sym_phys(__bss_end),%ecx
>          sub     %edi,%ecx
> +        shr     $2,%ecx
>          xor     %eax,%eax
> -        rep     stosb
> +        rep     stosl
>  
>          /* Interrogate CPU extended features via CPUID. */
>          mov     $0x80000000,%eax
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index 961f48f..6802da1 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -191,6 +191,8 @@ SECTIONS
>         CONSTRUCTORS
>    } :text
>  
> +  /* Align BSS to speedup its initialization. */
> +  . = ALIGN(4);

This is not needed.  There is already appropriate alignment before
__bss_start.

Also, you need to rebase this series onto staging - there are a lot of
changes you are missing.

~Andrew

>    .bss : {                     /* BSS */
>         . = ALIGN(STACK_SIZE);
>         __bss_start = .;
> @@ -205,6 +207,7 @@ SECTIONS
>         *(.bss.percpu.read_mostly)
>         . = ALIGN(SMP_CACHE_BYTES);
>         __per_cpu_data_end = .;
> +       . = ALIGN(4);
>         __bss_end = .;
>    } :text
>    _end = . ;


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel