[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] credit: Limit load balancing to once per millisecond





On Fri, Jun 30, 2023 at 6:48 PM Julien Grall <julien@xxxxxxx> wrote:
Hi George,

On 30/06/2023 12:37, George Dunlap wrote:
> The credit scheduler tries as hard as it can to ensure that it always
> runs scheduling units with positive credit (PRI_TS_UNDER) before
> running those with negative credit (PRI_TS_OVER).  If the next
> runnable scheduling unit is of priority OVER, it will always run the
> load balancer, which will scour the system looking for another
> scheduling unit of the UNDER priority.
>
> Unfortunately, as the number of cores on a system has grown, the cost
> of the work-stealing algorithm has dramatically increased; a recent
> trace on a system with 128 cores showed this taking over 50
> microseconds.
>
> Add a parameter, load_balance_ratelimit, to limit the frequency of
> load balance operations on a given pcpu.  Default this to 1
> millisecond.
>
> Invert the load balancing conditional to make it more clear, and line
> up more closely with the comment above it.
>
> Overall it might be cleaner to have the last_load_balance checking
> happen inside csched_load_balance(), but that would require either
> passing both now and spc into the function, or looking them up again;
> both of which seemed to be worse than simply checking and setting the
> values before calling it.
>
> Without this patch, on a system with a vcpu:pcpu ratio of 2:1, running
> Windows guests (which will end up calling YIELD during spinlock
> contention), this patch increased performance significantly.

I don't understand this sentence. Did you intende to write

"Without this patch, ..., the performance are significantly worse"?

Hmm, yes this was bad editing.  The first clause was written when I was expecting to include actual numbers; but when I looked at the internal numbers I had available, they weren't easy to summarize.  The revised sentence should have simply been:

"On a system with a vcpu:pcpu ratio of 2:1, running Windows guests (which will end up calling YIELD during spinlock contention), this patch increased performance significantly."

 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1856,6 +1856,12 @@ By default, Xen will use the INVPCID instruction for TLB management if
>   it is available.  This option can be used to cause Xen to fall back to
>   older mechanisms, which are generally slower.
>   
> +### load-balance-ratelimit
> +> `= <integer>`
> +
> +The minimum interval between load balancing events on a given pcpu.
> +At the moment only credit honors this parameter.

I would suggest to mention the default value. So a reader don't have to
look up in the code to find out.

Also, AFAICT, there is max value. I would mention it here too.

Ack
 
> +/*
> + * Minimum delay, in microseconds, between load balance operations.
> + * This prevents spending too much time doing load balancing, particularly
> + * when the system has a high number of YIELDs due to spinlock priority inversion.
> + */
> +static unsigned int __read_mostly load_balance_ratelimit_us = CSCHED_DEFAULT_LOAD_BALANCE_RATELIMIT_US;

AFAICT, load_balance_ratelimit_us is not updated after boot. So
shouldn't the attribute be __ro_after_init?

Ack


> +#define XEN_SYSCTL_CSCHED_LB_RATE_MAX_US (1000000)
> +    uint32_t load_balance_ratelimit_us;

Shouldn't this change bump the sysctl interface version?

Er, yes.

v2 on its way.

 -George

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.