[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] libacpi: Remove CPU hotplug and GPE handling from PVH DSDTs
On 10.09.2025 19:29, Alejandro Vallejo wrote: > On Wed Sep 10, 2025 at 7:01 PM CEST, Alejandro Vallejo wrote: >> On Wed Sep 10, 2025 at 5:31 PM CEST, Jan Beulich wrote: >>> On 10.09.2025 17:16, Alejandro Vallejo wrote: >>>> On Wed Sep 10, 2025 at 5:02 PM CEST, Jan Beulich wrote: >>>>> On 10.09.2025 16:49, Alejandro Vallejo wrote: >>>>>> CPU hotplug relies on the guest having access to the legacy online CPU >>>>>> bitmap that QEMU provides at PIO 0xAF00. But PVH guests have no DM, so >>>>>> this causes the MADT to get corrupted due to spurious modifications of >>>>>> the "online" flag in MADT entries and the table checksum during the >>>>>> initial acpica passes. >>>>> >>>>> I don't understand this MADT corruption aspect, which - aiui - is why >>>>> there's a Fixes: tag here. The code change itself looks plausible. >>>> >>>> When there's no DM to provide a real and honest online CPU bitmap on PIO >>>> 0xAF00 >>>> then we get all 1s (because there's no IOREQ server). Which confuses the >>>> GPE >>>> handler. >>>> >>>> Somehow, the GPE handler is being triggered. Whether this is due to a real >>>> SCI >>>> or just it being spuriously executed as part of the initial acpica pass, I >>>> don't >>>> know. >>>> >>>> Both statements combined means the checksum and online flags in the MADT >>>> get >>>> changed after initial parsing making it appear as-if all 128 CPUs were >>>> plugged. >>> >>> I can follow this part (the online flags one, that is). >>> >>>> This patch makes the checksums be correct after acpica init. >>> >>> I'm still in trouble with this one. If MADT is modified in the process, >>> there's >>> only one of two possible options: >>> 1) It's expected for the checksum to no longer be correct. >>> 2) The checksum is being fixed up in the process. >>> That's independent of being HVM or PVH and independent of guest boot or >>> later. >>> (Of course there's a sub-variant of 2, where the adjusting of the checksum >>> would be broken, but that wouldn't be covered by your change.) >> >> I see what you mean now. The checksum correction code LOOKS correct. But I >> wonder about the table length... We report a table as big as it needs to be, >> but the checksum update is done irrespective of FLG being inside the valid >> range >> of the MADT. If a guest with 2 vCPUs (in max_vcpus) sees vCPU127 being >> signalled >> that'd trigger the (unseen) online flag to be enabled and the checksum >> adjusted, >> except the checksum must not being adjusted. >> >> I could add even more AML to cover that, but that'd be QEMU misbehaving (or >> being absent). This patch covers the latter case, but it might be good to >> change the commit message to reflect the real problem. > > It doesn't quite add up in the mismatch though. There might be something else > lurking in there. > > Regardless, I don't want this junk in PVH. Would a commit reword suffice to > have > it acked? I think so, yes. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |