
Re: [PATCH 2/2] xen/x86/pvh: copy ACPI tables to Dom0 instead of mapping


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 16 May 2023 11:21:06 +0200
  • Cc: andrew.cooper3@xxxxxxxxxx, jbeulich@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, Xenia.Ragiadakou@xxxxxxx, Stefano Stabellini <stefano.stabellini@xxxxxxx>
  • Delivery-date: Tue, 16 May 2023 09:21:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, May 15, 2023 at 05:11:25PM -0700, Stefano Stabellini wrote:
> On Mon, 15 May 2023, Roger Pau Monné wrote:
> > On Fri, May 12, 2023 at 06:17:20PM -0700, Stefano Stabellini wrote:
> > > From: Stefano Stabellini <stefano.stabellini@xxxxxxx>
> > > 
> > > Mapping the ACPI tables to Dom0 PVH 1:1 leads to memory corruptions of
> > > the tables in the guest. Instead, copy the tables to Dom0.
> > > 
> > > This is a workaround.
> > > 
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxx>
> > > ---
> > > As mentioned in the cover letter, this is a RFC workaround as I don't
> > > know the cause of the underlying problem. I do know that this patch
> > > solves what would be otherwise a hang at boot when Dom0 PVH attempts to
> > > parse ACPI tables.
> > 
> > I'm unsure how safe this is for native systems, as it's possible for
> > firmware to modify the data in the tables, so copying them would
> > break that functionality.
> > 
> > I think we need to get to the root cause that triggers this behavior
> > on QEMU.  Is it the table checksum that fails, or something else?  Is
> > there an error from Linux you could reference?
> 
> I agree with you but so far I haven't managed to find a way to the root
> of the issue. Here is what I know. These are the logs of a successful
> boot using this patch:
> 
> [   10.437488] ACPI: Early table checksum verification disabled
> [   10.439345] ACPI: RSDP 0x000000004005F955 000024 (v02 BOCHS )
> [   10.441033] ACPI: RSDT 0x000000004005F979 000034 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
> [   10.444045] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
> [   10.445984] ACPI: FACP 0x000000004005FA65 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
> [   10.447170] ACPI BIOS Warning (bug): Incorrect checksum in table [FACP] - 0x67, should be 0x30 (20220331/tbprint-174)
> [   10.449522] ACPI: DSDT 0x000000004005FB19 00145D (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
> [   10.451258] ACPI: FACS 0x000000004005FAD9 000040
> [   10.452245] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff]
> [   10.452389] ACPI: Reserving FACP table memory at [mem 0x4005fa65-0x4005fad8]
> [   10.452497] ACPI: Reserving DSDT table memory at [mem 0x4005fb19-0x40060f75]
> [   10.452602] ACPI: Reserving FACS table memory at [mem 0x4005fad9-0x4005fb18]
> 
> 
> And these are the logs of the same boot (unsuccessful) without this
> patch:
> 
> [   10.516015] ACPI: Early table checksum verification disabled
> [   10.517732] ACPI: RSDP 0x0000000040060F1E 000024 (v02 BOCHS )
> [   10.519535] ACPI: RSDT 0x0000000040060F42 000034 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
> [   10.522523] ACPI: APIC 0x0000000040060F76 00008A (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
> [   10.527453] ACPI: ���� 0x000000007FFE149D FFFFFFFF (v255 ������ �������� FFFFFFFF ���� FFFFFFFF)
> [   10.528362] ACPI: Reserving APIC table memory at [mem 0x40060f76-0x40060fff]
> [   10.528491] ACPI: Reserving ���� table memory at [mem 0x7ffe149d-0x17ffe149b]
> 
> It is clearly a memory corruption around FACS but I couldn't find the
> reason for it. The mapping code looks correct. I hope you can suggest a
> way to narrow down the problem. If I could, I would suggest to apply
> this patch just for the QEMU PVH tests but we don't have the
> infrastructure for that in gitlab-ci as there is a single Xen build for
> all tests.

It would be helpful to see the memory map provided to Linux, just in case
we messed up and there's some overlap.
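
A minimal sketch of such an overlap check, in Python for brevity.  The
ACPI ranges are the ones Linux reports reserving in the successful boot
log quoted above; the RAM entry is a hypothetical placeholder, since the
real memory map has not been posted yet, and find_overlaps is an
illustrative helper, not Xen or Linux code:

```python
# Each range is (start, end_inclusive, label).
acpi_tables = [
    (0x40060F76, 0x40060FFF, "APIC"),
    (0x4005FA65, 0x4005FAD8, "FACP"),
    (0x4005FB19, 0x40060F75, "DSDT"),
    (0x4005FAD9, 0x4005FB18, "FACS"),
]

# Hypothetical placeholder: replace with the memory map Linux prints.
memory_map = [
    (0x00100000, 0x3FFFFFFF, "RAM (hypothetical)"),
]


def find_overlaps(ranges):
    """Return pairs of labels whose inclusive ranges intersect."""
    ordered = sorted(ranges)  # sort by start address
    overlaps = []
    for i, (_start1, end1, label1) in enumerate(ordered):
        for start2, _end2, label2 in ordered[i + 1:]:
            if start2 > end1:
                break  # later ranges start even further right
            overlaps.append((label1, label2))
    return overlaps


if __name__ == "__main__":
    for first, second in find_overlaps(acpi_tables + memory_map):
        print(f"overlap: {first} <-> {second}")
```

Feeding it the real memory map entries together with the addresses Xen
picks for the tables would show whether any region is claimed twice.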

It seems like one of the XSDT entries (the FADT one) is corrupt?
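
The bogus entry in the failing log is what a table header read from
memory filled with 0xFF bytes would look like.  A minimal Python sketch,
assuming the standard 36-byte ACPI table header layout; the 0xFF fill is
an assumption that merely matches the log, not a confirmed diagnosis,
and decode_header is an illustrative helper:

```python
import struct

# Signature, Length, Revision, Checksum, OEM ID, OEM Table ID,
# OEM Revision, Creator ID, Creator Revision (36 bytes, little endian).
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


def decode_header(raw):
    """Decode the common ACPI table header from the first 36 bytes."""
    fields = ACPI_HEADER.unpack(raw[:ACPI_HEADER.size])
    names = ("signature", "length", "revision", "checksum", "oem_id",
             "oem_table_id", "oem_revision", "creator_id",
             "creator_revision")
    return dict(zip(names, fields))


# Assumption: the address in the corrupt entry reads back as all ones.
hdr = decode_header(b"\xff" * ACPI_HEADER.size)

# Address Linux printed for the unreadable table in the failing boot.
start = 0x7FFE149D
end = start + hdr["length"] - 1

if __name__ == "__main__":
    print(f"length=0x{hdr['length']:08X} revision={hdr['revision']}")
    print(f"reserved range: 0x{start:x}-0x{end:x}")
```

Under that assumption the length is FFFFFFFF, the revision is 255 and
the reserved range ends at 0x17ffe149b, all matching the log, so the
entry may point at memory that is simply not readable from dom0 rather
than at a table that was overwritten.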

Could you maybe add some debug output to the Xen-crafted XSDT placement?
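
To illustrate what such debug output could check (a Python sketch of the
logic only, not the Xen code): list the entries of the root table and
verify that all its bytes sum to zero modulo 256.  build_root_table and
the "DEMO" identifiers are hypothetical helpers for the demonstration;
the two entry addresses are the APIC and FACP addresses from the
successful boot log:

```python
import struct

HEADER_LEN = 36      # size of the common ACPI table header
CHECKSUM_OFFSET = 9  # signature (4) + length (4) + revision (1)
ENTRY_FORMAT = {b"RSDT": "<I", b"XSDT": "<Q"}  # 32-bit vs 64-bit entries


def checksum_ok(table):
    """An ACPI table is valid when all its bytes sum to 0 modulo 256."""
    return sum(table) % 256 == 0


def root_table_entries(table):
    """Return the physical addresses listed in an RSDT or XSDT."""
    signature, length = struct.unpack_from("<4sI", table, 0)
    fmt = ENTRY_FORMAT[signature]
    width = struct.calcsize(fmt)
    return [struct.unpack_from(fmt, table, offset)[0]
            for offset in range(HEADER_LEN, length, width)]


def build_root_table(signature, entries):
    """Hypothetical helper: craft a root table with a valid checksum."""
    fmt = ENTRY_FORMAT[signature]
    body = b"".join(struct.pack(fmt, entry) for entry in entries)
    header = struct.pack("<4sIBB6s8sI4sI", signature,
                         HEADER_LEN + len(body), 1, 0, b"DEMO  ",
                         b"DEMOTBL ", 1, b"DEMO", 1)
    table = bytearray(header + body)
    table[CHECKSUM_OFFSET] = (-sum(table)) % 256
    return bytes(table)


xsdt = build_root_table(b"XSDT", [0x40060F76, 0x4005FA65])

if __name__ == "__main__":
    print("checksum ok:", checksum_ok(xsdt))
    for address in root_table_entries(xsdt):
        print(f"entry: 0x{address:016X}")
```

Printing the same information from Xen right after the table is built,
and comparing it with what Linux reports, would show whether the entry
is already wrong at creation time or gets changed afterwards.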

> 
> If it helps to repro on your side, you can just do the following,
> assuming your Xen repo is in /local/repos/xen:
> 
> 
> cd /local/repos/xen
> mkdir binaries
> cd binaries
> mkdir -p dist/install/
> 
> docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.12
> cp /initrd* /local/repos/xen/binaries
> exit
> 
> docker run -it -v `pwd`:`pwd` registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19
> cp /bzImage /local/repos/xen/binaries
> exit
> 
> That's it. Now you have enough pre-built binaries to repro the issue.
> Next you can edit automation/scripts/qemu-alpine-x86_64.sh to add
> 
>   dom0=pvh dom0_mem=1G dom0-iommu=none

Do you get to boot with dom0-iommu=none?  Is there also some trick
here in order to identity map dom0?  I would expect things not to work,
because the addresses used for IO with QEMU emulated devices won't be
correct.

> 
> on the Xen command line. I also removed "timeout" and pipe "tee" at the
> end for my own convenience:
> 
>  # Run the test
> -rm -f smoke.serial
> -set +e
> -timeout -k 1 720 \
>  qemu-system-x86_64 \
>      -cpu qemu64,+svm \
>      -m 2G -smp 2 \
>      -monitor none -serial stdio \
>      -nographic \
>      -device virtio-net-pci,netdev=n0 \
> -    -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& tee smoke.serial
> +    -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0
>  
> 
> make sure to build the Xen hypervisor binary and place the binary under
> /local/repos/xen/binaries/
> 
> You can finally run the test with the below:
> 
> cd ..
> docker run -it -v /local/repos/xen:/local/repos/xen registry.gitlab.com/xen-project/xen/debian:unstable
> cd /local/repos/xen
> bash automation/scripts/qemu-alpine-x86_64.sh
> 
> It usually gets stuck halfway through the boot without this patch.

Thanks for the instructions, I will give it a try if I can find some
time.

Roger.



 

