
Re: [PATCH] libacpi: Remove CPU hotplug and GPE handling from PVH DSDTs


  • To: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • From: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Date: Wed, 10 Sep 2025 19:29:42 +0200
  • Cc: Anthony PERARD <anthony.perard@xxxxxxxxxx>, Grygorii Strashko <grygorii_strashko@xxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 10 Sep 2025 17:30:01 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed Sep 10, 2025 at 7:01 PM CEST, Alejandro Vallejo wrote:
> On Wed Sep 10, 2025 at 5:31 PM CEST, Jan Beulich wrote:
>> On 10.09.2025 17:16, Alejandro Vallejo wrote:
>>> On Wed Sep 10, 2025 at 5:02 PM CEST, Jan Beulich wrote:
>>>> On 10.09.2025 16:49, Alejandro Vallejo wrote:
>>>>> CPU hotplug relies on the guest having access to the legacy online CPU
>>>>> bitmap that QEMU provides at PIO 0xAF00. But PVH guests have no DM, so
>>>>> this causes the MADT to get corrupted due to spurious modifications of
>>>>> the "online" flag in MADT entries and the table checksum during the
>>>>> initial acpica passes.
>>>>
>>>> I don't understand this MADT corruption aspect, which - aiui - is why
>>>> there's a Fixes: tag here. The code change itself looks plausible.
>>> 
>>> When there's no DM to provide a real and honest online CPU bitmap on PIO
>>> 0xAF00 then we get all 1s (because there's no IOREQ server). Which confuses
>>> the GPE handler.
>>> 
>>> Somehow, the GPE handler is being triggered. Whether this is due to a real
>>> SCI or just it being spuriously executed as part of the initial acpica pass,
>>> I don't know.
>>> 
>>> Both statements combined mean the checksum and online flags in the MADT get
>>> changed after initial parsing, making it appear as if all 128 CPUs were
>>> plugged.
>>
>> I can follow this part (the online flags one, that is).
>>
>>> This patch makes the checksums be correct after acpica init.
>>
>> I'm still in trouble with this one. If MADT is modified in the process,
>> there's only one of two possible options:
>> 1) It's expected for the checksum to no longer be correct.
>> 2) The checksum is being fixed up in the process.
>> That's independent of being HVM or PVH and independent of guest boot or
>> later.
>> (Of course there's a sub-variant of 2, where the adjusting of the checksum
>> would be broken, but that wouldn't be covered by your change.)
>>
>> Jan
>
> I see what you mean now. The checksum correction code LOOKS correct. But I
> wonder about the table length... We report a table as big as it needs to be,
> but the checksum update is done irrespective of FLG being inside the valid
> range of the MADT. If a guest with 2 vCPUs (in max_vcpus) sees vCPU127 being
> signalled, that'd trigger the (unseen) online flag to be enabled and the
> checksum adjusted, except the checksum must not be adjusted in that case.
>
> I could add even more AML to cover that, but that'd be QEMU misbehaving (or
> being absent). This patch covers the latter case, but it might be good to
> change the commit message to reflect the real problem.
>
> Cheers,
> Alejandro
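
To make the out-of-range hypothesis above concrete, here is a minimal Python
sketch. It is not the libacpi AML. It models a simplified MADT holding only
Processor Local APIC entries, and assumes that the memory behind the reported
table is laid out for 128 entries while the header length covers just the
guest's vCPUs. The helper names (build_madt, apply_bitmap, checksum_ok) are
made up for illustration. The handler model flips the online flag for every
bit set in the bitmap and compensates in the checksum byte, without checking
whether the flag lies inside the reported length.

```python
# Simplified model of the MADT layout (assumption: LAPIC entries only).
HEADER_LEN = 44     # 36-byte SDT header + 8 bytes of MADT-specific fields
ENTRY_LEN = 8       # one Processor Local APIC entry
FLAGS_OFF = 4       # flags field inside an entry; bit 0 is "enabled"
CHECKSUM_OFF = 9    # checksum byte inside the SDT header
MAX_CPUS = 128      # entries the hypothetical handler may touch


def build_madt(n_cpus, online):
    """Return (memory, reported_length) for a table covering n_cpus entries."""
    length = HEADER_LEN + ENTRY_LEN * n_cpus
    mem = bytearray(HEADER_LEN + ENTRY_LEN * MAX_CPUS)
    mem[0:4] = b"APIC"
    mem[4:8] = length.to_bytes(4, "little")
    for cpu in range(n_cpus):
        base = HEADER_LEN + ENTRY_LEN * cpu
        mem[base] = 0                   # entry type: Processor Local APIC
        mem[base + 1] = ENTRY_LEN
        mem[base + 2] = cpu             # ACPI processor UID
        mem[base + 3] = cpu             # APIC ID
        mem[base + FLAGS_OFF] = 1 if cpu in online else 0
    # All bytes within the reported length must sum to 0 modulo 256.
    mem[CHECKSUM_OFF] = (-sum(mem[:length])) % 256
    return mem, length


def checksum_ok(mem, length):
    return sum(mem[:length]) % 256 == 0


def apply_bitmap(mem, bitmap):
    """Model of the handler: sync flags to the bitmap, patch the checksum.

    Deliberately does NOT check the flag offset against the table length.
    """
    for cpu in range(MAX_CPUS):
        want = (bitmap[cpu // 8] >> (cpu % 8)) & 1
        off = HEADER_LEN + ENTRY_LEN * cpu + FLAGS_OFF
        have = mem[off] & 1
        if want != have:
            mem[off] ^= 1
            delta = want - have         # +1 when set, -1 when cleared
            mem[CHECKSUM_OFF] = (mem[CHECKSUM_OFF] - delta) % 256


# In-range update: vCPU1 comes online, the checksum stays valid.
madt, length = build_madt(2, online={0})
apply_bitmap(madt, bytes([0b11]) + bytes(15))
print(checksum_ok(madt, length))

# All-ones bitmap (no IOREQ server): 126 flags outside the table get set,
# and each one still perturbs the checksum byte inside the table.
madt, length = build_madt(2, online={0, 1})
apply_bitmap(madt, b"\xff" * 16)
print(checksum_ok(madt, length))
```

In this model the in-range update keeps the table valid, while the all-ones
bitmap leaves the in-range bytes untouched apart from the checksum byte, so
the table no longer sums to zero.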

It doesn't quite add up in the mismatch though. There might be something else
lurking in there.

Regardless, I don't want this junk in PVH. Would a commit reword suffice to have
it acked?

Cheers,
Alejandro



 

