Re: x86 NUMA error on OSSTest box


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 4 Oct 2022 11:25:32 +0200
  • Cc: Henry Wang <Henry.Wang@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 04 Oct 2022 09:25:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 03.10.2022 23:21, Andrew Cooper wrote:
> While working on another issue, I spotted this:
> 
> (XEN) ACPI: EINJ 6CB9D638, 0150 (r1 ORACLE     X7-2 41060300 INTL        1)
> (XEN) System RAM: 32429MB (33208204kB)
> (XEN) SRAT: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
> (XEN) SRAT: Node 0 PXM 0 [0000000100000000, 000000047fffffff]
> (XEN) SRAT: Node 1 PXM 1 [0000000480000000, 000000087fffffff]
> (XEN) NUMA: Using 19 for the hash shift.
> (XEN) Your memory is not aligned you need to rebuild your hypervisor with a bigger NODEMAPSIZE shift=19
> (XEN) SRAT: No NUMA node hash function found. Contact maintainer
> (XEN) SRAT: SRAT not used.
> (XEN) No NUMA configuration found
> (XEN) Faking a node at 0000000000000000-0000000880000000
> (XEN) Domain heap initialised
> 
> on sabro0 in OSSTest on current staging.  I do not know if it's a recent
> regression or not.
> 
> The SRAT looks reasonable (in fact, far better than most I've seen). 
> Given no legitimate requirement for aligned memory that I'm aware of, I
> think Xen's behaviour here is buggy and wants resolving.

That's yet another off-by-1 afaics, which didn't matter until the
first off-by-1 was eliminated. I'll make a(nother) patch, but I first
want to figure out why I didn't see this issue myself (or whether I
merely overlooked it).

Jan
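
For readers following the log above, here is a rough sketch of how an off-by-1 on a range end could produce this kind of "not aligned" failure. It is not Xen's code and makes no claim about where the actual bug sits. The names (struct range, hash_shift, populate, NO_NODE), the 4 KiB page size and the 32-slot map are all assumptions made for illustration. The ranges are the three SRAT entries from the log, with their ends converted from inclusive to exclusive.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12    /* assumption: 4 KiB pages */
#define NO_NODE    0xff  /* hypothetical marker for an unassigned map slot */

/* Hypothetical range type; 'end' is exclusive. */
struct range { uint64_t start, end; int node; };

/* The three SRAT ranges from the log, ends made exclusive. */
static const struct range srat[] = {
    { 0x000000000ULL, 0x080000000ULL, 0 },
    { 0x100000000ULL, 0x480000000ULL, 0 },
    { 0x480000000ULL, 0x880000000ULL, 1 },
};

/* Shift = index of the lowest set bit across all start frame numbers. */
static int hash_shift(const struct range *r, size_t n)
{
    uint64_t bits = 0;
    int shift = 0;

    for (size_t i = 0; i < n; i++)
        bits |= r[i].start >> PAGE_SHIFT;
    if (!bits)
        return 63;       /* no non-zero start: any shift would do */
    while (!(bits & 1)) {
        bits >>= 1;
        shift++;
    }
    return shift;
}

/*
 * Assign each map slot to the node owning it. Returns 0 on success and
 * -1 if a slot would belong to two nodes or lies beyond the map.
 * With inclusive_bug set, the exclusive end is wrongly treated as the
 * last valid frame, which is the off-by-1 being illustrated.
 */
static int populate(uint8_t *map, size_t slots, const struct range *r,
                    size_t n, int shift, int inclusive_bug)
{
    for (size_t s = 0; s < slots; s++)
        map[s] = NO_NODE;

    for (size_t i = 0; i < n; i++) {
        uint64_t first = (r[i].start >> PAGE_SHIFT) >> shift;
        uint64_t last_pfn = (r[i].end >> PAGE_SHIFT) - (inclusive_bug ? 0 : 1);
        uint64_t last = last_pfn >> shift;

        for (uint64_t s = first; s <= last; s++) {
            if (s >= slots)
                return -1;
            if (map[s] != NO_NODE && map[s] != r[i].node)
                return -1;   /* slot already claimed by another node */
            map[s] = (uint8_t)r[i].node;
        }
    }
    return 0;
}
```

Under these assumptions hash_shift(srat, 3) yields 19, matching the log. populate() with a 32-slot map succeeds when the ends are handled as exclusive, and fails when inclusive_bug is set, because node 0's second range then spills into the first slot of node 1.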