Re: [Xen-devel] PVH dom0 creation fails - the system freezes



On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@xxxxxxxxxx wrote:
On 07/24/2018 12:54 PM, Jan Beulich wrote:
On 23.07.18 at 13:50, <bercarug@xxxxxxxxxx> wrote:
For the last few days, I have been trying to get a PVH dom0 running,
however I encountered the following problem: the system seems to
freeze after the hypervisor boots, the screen goes black. I have tried to
debug it via a serial console (using Minicom) and managed to get some
more Xen output, after the screen turns black.

I should mention that I have tried booting the PVH dom0 with different kernel
images (from 4.9.0 to 4.18-rc3) and different Xen versions (4.10, 4.11, 4.12).

Below I attached my system / hypervisor configuration, as well as the
output captured through the serial console, corresponding to the latest
versions for Xen and the Linux Kernel (Xen staging and Kernel from the
xen/tip tree).
[...]
(XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
Can you figure out which PCI device is 00:14.0?
This is the output of lspci -vvv for device 00:14.0:

00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
        Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 178
        Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [70] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
                Address: 00000000fee0e000  Data: 4021
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
(XEN) root_entry[00] = 1021c60001
(XEN) context[a0] = 2_1021d6d001
(XEN) l4[000] = 9c00001021d6c107
(XEN) l3[002] = 9c00001021d3e107
(XEN) l2[06f] = 9c000010218c0107
(XEN) l1[0b3] = 8000000000000000
(XEN) l1[0b3] not present
(XEN) Dom0 callback via changed to Direct Vector 0xf3
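
The indices printed by print_vtd_entries can be cross-checked against the fault address and the device number. What follows is only a minimal Python sketch, assuming 4 KiB pages, 9 index bits per level and a 4-level table (which matches the l4..l1 lines in the log), and assuming that an entry counts as present when its read or write bit (the two lowest bits) is set. The function names are hypothetical and are not taken from Xen.

```python
PAGE_SHIFT = 12   # 4 KiB pages (assumption)
LEVEL_BITS = 9    # 512 entries per table level (assumption)


def walk_indices(fault_addr, levels=4):
    """Return the table index used at each level, top level first."""
    frame = fault_addr >> PAGE_SHIFT
    mask = (1 << LEVEL_BITS) - 1
    return {
        f"l{lvl}": (frame >> (LEVEL_BITS * (lvl - 1))) & mask
        for lvl in range(levels, 0, -1)
    }


def context_index(dev, func):
    """Context-table index for a device: devfn = (device << 3) | function."""
    return (dev << 3) | func


def entry_present(entry):
    """Assumed rule: present if the read (bit 0) or write (bit 1) bit is set."""
    return (entry & 0x3) != 0


if __name__ == "__main__":
    for name, idx in walk_indices(0x8DEB3000).items():
        print(f"{name}[{idx:03x}]")
    print(f"context[{context_index(0x14, 0):02x}]")
    print("l1 present:", entry_present(0x8000000000000000))
```

For fault address 8deb3000 and device 00:14.0 this gives the indices 000, 002, 06f and 0b3 and the context index a0, agreeing with the log. The l1 entry 8000000000000000 has neither permission bit set, which is consistent with "l1[0b3] not present" and with the reported write fault.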
This might be a hint at a missing RMRR entry in the ACPI tables, as
we've seen to be the case for a number of systems (I dare to guess
that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
and/or mouse connected). You may want to play with the respective
command line option ("rmrr="). Note that "iommu_inclusive_mapping"
as you're using it does not have any meaning for PVH (see
intel_iommu_hwdom_init()).

Jan
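
As an illustration of the "rmrr=" suggestion above, here is a hedged Python sketch that only formats such an option string. It assumes the syntax described in Xen's command line documentation, start[-end]=bdf[,bdf...], with start and end given as hexadecimal page frame numbers. The helper name rmrr_option is hypothetical, and the page used in the example is simply the faulting gmfn from the log. Whether a single page covers what the device needs is not known from this thread.

```python
def rmrr_option(start_pfn, end_pfn, devices):
    """Format an 'rmrr=' option string from page frame numbers and BDFs.

    end_pfn may be None (or equal to start_pfn) when exactly one page is meant.
    """
    page_range = f"{start_pfn:#x}"
    if end_pfn is not None and end_pfn != start_pfn:
        page_range += f"-{end_pfn:#x}"
    return f"rmrr={page_range}=" + ",".join(devices)


if __name__ == "__main__":
    # Faulting page and device taken from the log; the region size is a guess.
    print(rmrr_option(0x8DEB3, None, ["00:14.0"]))
```

The resulting string would be appended to the Xen command line next to the existing iommu= options. If several ranges separated by ';' are needed, the bootloader may require quoting.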



Hello,

Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
I managed to get a PVH dom0 starting. However, some other problems appeared:

1) The USB devices are not usable anymore (keyboard and mouse), so the
system is only accessible through the serial port.
Can you boot with iommu=debug and see if you get any extra IOMMU
information on the serial console?
The debug flag was already set, so the log I attached on the first
message already contains the IOMMU info.
In Xen's command line I used iommu=debug,verbose,workaround_bios_bug.

2) I can run any usual command in dom0, but the ones involving xl (except
for xl info) will make the system run out of memory very fast. Eventually,
when there is no more free memory available, the OOM killer begins removing
processes until the system auto reboots.

I attached a file containing the output of a lsusb, as well as the output of
xl info and xl list -l.
After xl list -l, the “free -m” commands show the available memory
decreasing.
Each command has a timestamp appended, so it can be seen how fast the
available memory decreases.
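
The timestamped "free -m" runs can also be collected by a small script. The sketch below is a minimal, assumed approach rather than what was actually used to produce the attachment: it reads /proc/meminfo directly and prints MemTotal and MemAvailable with a timestamp. The function names, the sample count and the interval are illustrative choices.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def log_memory(samples=5, interval=1.0, path="/proc/meminfo"):
    """Print a timestamped line with total and available memory in MiB."""
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        stamp = time.strftime("%H:%M:%S")
        total = info.get("MemTotal", 0) // 1024
        avail = info.get("MemAvailable", 0) // 1024
        print(f"{stamp} MemTotal={total}M MemAvailable={avail}M")
        time.sleep(interval)


if __name__ == "__main__":
    log_memory()
```

Logging MemTotal alongside MemAvailable helps tell a shrinking total (as ballooning would cause) apart from ordinary memory consumption inside dom0.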

I removed much of the process killing logs and kept the last one, since they
were following the same pattern.

Dom0 still appears to be of type PV (output of xl list -l), however during
boot, the following messages were displayed: “Building a PVH Dom0” and
“Booting paravirtualized kernel on Xen PVH”.

I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
iommu to get dom0 running.
It seems to me like your ACPI DMAR table contains errors, and I
wouldn't be surprised if those also cause the USB devices to
malfunction.

What could be causing the available memory loss problem?
That seems to be Linux aggressively ballooning out memory, you go from
7129M total memory to 246M. Are you creating a lot of domains?

Roger.

I did not create any guest before issuing "xl list -l". However, creating
a PVH domU will work - "xl create <cfg_file>" does not produce this behavior.


Gabriel
