
Re: xentrace buffer size, maxcpus and online cpus


  • To: Olaf Hering <olaf@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 31 May 2023 11:05:52 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 31 May 2023 09:06:00 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 30.05.2023 22:06, Olaf Hering wrote:
> Tue, 30 May 2023 10:41:07 +0200 Jan Beulich <jbeulich@xxxxxxxx>:
> 
>> Using this N would be correct afaict, but that N isn't num_online_cpus().
>> CPUs may have been offlined by the time trace buffers are initialized, so
>> without looking too closely I think it would be num_present_cpus() that
>> you're after.
> 
> In my testing num_online_cpus returns N, while num_present_cpus returns
> all available pcpus. There is also num_possible_cpus, but this appears to
> be an ARM thing.
> 
> If Xen is booted with maxcpus=, is there a way to use the remaining cpus?

In general no, because then nr_cpu_ids will be too constrained. But
note that CPU parking also comes into play here, leading to nr_cpu_ids
being set to all possible (present + hotplug) CPUs. Iirc parked CPUs
can be brought online even beyond what "maxcpus=" says (albeit I think
that's more a side effect of the parking implementation than an
intended goal).

> In case this is possible, the code needs adjustment to reinitialize the
> trace buffers. This is not an easy change. But if the remaining cpus
> will remain offline, then something like this may work:
> 
> +++ b/xen/common/trace.c
> @@ -110,7 +110,8 @@ static int calculate_tbuf_size(unsigned int pages, uint16_t t_info_first_offset)
>      struct t_info dummy_pages;
>      typeof(dummy_pages.tbuf_size) max_pages;
>      typeof(dummy_pages.mfn_offset[0]) max_mfn_offset;
> -    unsigned int max_cpus = nr_cpu_ids;
> +    unsigned int nr_cpus = num_online_cpus();
> +    unsigned int max_cpus = nr_cpus;

As said before, num_online_cpus() will under-report for the purpose
here, as CPUs may have been brought offline, and may be brought online
again later (independent of the use of "maxcpus=").
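
To illustrate the under-reporting, here is a minimal toy model, not Xen
code. It assumes per-CPU trace buffer slots are indexed by CPU id, and
all names (toy_cpus, toy_num_online, toy_set_online, toy_uncovered,
TOY_NR_CPU_IDS) are hypothetical stand-ins made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for nr_cpu_ids: one more than the highest CPU id. */
#define TOY_NR_CPU_IDS 8u

/* Toy online state per CPU id; this is not Xen's cpumask implementation. */
struct toy_cpus {
    bool online[TOY_NR_CPU_IDS];
};

/* Mark a CPU online or offline. */
static void toy_set_online(struct toy_cpus *c, unsigned int cpu, bool on)
{
    assert(cpu < TOY_NR_CPU_IDS);
    c->online[cpu] = on;
}

/* Count CPUs online right now, i.e. a snapshot like num_online_cpus(). */
static unsigned int toy_num_online(const struct toy_cpus *c)
{
    unsigned int i, n = 0;

    for ( i = 0; i < TOY_NR_CPU_IDS; i++ )
        n += c->online[i];

    return n;
}

/*
 * Count online CPUs whose id does not fit into a table with 'slots'
 * entries, assuming the table is indexed by CPU id.
 */
static unsigned int toy_uncovered(const struct toy_cpus *c, unsigned int slots)
{
    unsigned int i, n = 0;

    for ( i = slots; i < TOY_NR_CPU_IDS; i++ )
        n += c->online[i];

    return n;
}
```

In this model, with CPUs 0-3 brought up and CPU 1 then offlined, a table
sized from the online count has 3 slots, so CPU 3 has none, and bringing
CPU 1 back online does not change that. Sizing by TOY_NR_CPU_IDS covers
every id at the cost of unused slots.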

Re-initializing trace buffers when more CPUs come online might of
course be an option, but it would need doing in a race-free manner.

Jan



 

