[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 3/3] x86: introduce and use scratch CPU mask



On 08/12/16 16:02, Jan Beulich wrote:
> __get_page_type(), so far using an on-stack CPU mask variable, is
> involved in the recursion when e.g. pinning page tables. This means

"in recursion".

> there may be up two five instances of the function active at a time,

"up to five".

> implying five instances of the (up to 512 bytes large) CPU mask
> variable. With an IRQ happening at the deepest point of the stack, and
> with send_guest_pirq() being called from there (leading to vcpu_kick()
> -> ... -> csched_vcpu_wake() -> __runq_tickle() ->
> cpumask_raise_softirq(), the last two of which also have CPU mask
> variables on their stacks), this has been observed to cause a stack

"stacks), has been".

> overflow with a 4095-pCPU build.
>
> Introduce a per-CPU variable instead, which can then be used by any
> code never running in IRQ context.
>
> The mask can then also be used by other MMU code as well as by
> msi_compose_msg() (and quite likely we'll find further uses down the
> road).
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2477,6 +2477,7 @@ static int __get_page_type(struct page_i
>      int rc = 0, iommu_ret = 0;
>  
>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> +    ASSERT(!in_irq());
>  
>      for ( ; ; )
>      {
> @@ -2509,20 +2510,21 @@ static int __get_page_type(struct page_i
>                   * may be unnecessary (e.g., page was GDT/LDT) but those 
>                   * circumstances should be very rare.
>                   */
> -                cpumask_t mask;
> +                cpumask_t *mask = this_cpu(scratch_cpumask);

This indirection looks suspicious.  Why do you have a per_cpu pointer to
a cpumask, with a dynamically allocated mask?

It would be smaller and more efficient overall to have a fully cpumask
allocated in the per-cpu area, and use it via

cpumask_t *mask = &this_cpu(scratch_cpumask);

except...

> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -56,6 +56,8 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t
>  /* representing HT and core siblings of each logical CPU */
>  DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
>  
> +DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
> +

.. this does look correct, but..

>  cpumask_t cpu_online_map __read_mostly;
>  EXPORT_SYMBOL(cpu_online_map);
>  
> @@ -646,6 +648,7 @@ static void cpu_smpboot_free(unsigned in
>  
>      free_cpumask_var(per_cpu(cpu_sibling_mask, cpu));
>      free_cpumask_var(per_cpu(cpu_core_mask, cpu));
> +    free_cpumask_var(per_cpu(scratch_cpumask, cpu));
>  
>      if ( per_cpu(stubs.addr, cpu) )
>      {
> @@ -734,7 +737,8 @@ static int cpu_smpboot_alloc(unsigned in
>          goto oom;
>  
>      if ( zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) &&
> -         zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) )
> +         zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) &&
> +         alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu)) )
>          return 0;

This doesn't.  I'm confused.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.