
Re: [Xen-devel] PVH dom0 creation fails - the system freezes


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>, bercarug@xxxxxxxxxx
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Wed, 25 Jul 2018 15:41:11 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, David Woodhouse <dwmw2@xxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, abelgun@xxxxxxxxxx
  • Delivery-date: Wed, 25 Jul 2018 13:41:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25/07/18 15:35, Roger Pau Monné wrote:
> On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@xxxxxxxxxx wrote:
>> On 07/24/2018 12:54 PM, Jan Beulich wrote:
>>>>>> On 23.07.18 at 13:50, <bercarug@xxxxxxxxxx> wrote:
>>>> For the last few days, I have been trying to get a PVH dom0 running,
>>>> however I encountered the following problem: the system seems to
>>>> freeze after the hypervisor boots, the screen goes black. I have tried to
>>>> debug it via a serial console (using Minicom) and managed to get some
>>>> more Xen output, after the screen turns black.
>>>>
>>>> I have tried booting the PVH dom0 with different kernel images (from
>>>> 4.9.0 to 4.18-rc3) and different Xen versions (4.10, 4.11, 4.12).
>>>>
>>>> Below I attached my system / hypervisor configuration, as well as the
>>>> output captured through the serial console, corresponding to the latest
>>>> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
>>>> xen/tip tree).
>>>> [...]
>>>> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
>>>> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
>>>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> 
> Can you figure out which PCI device is 00:14.0?
> 
>>>> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
>>>> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
>>>> (XEN) root_entry[00] = 1021c60001
>>>> (XEN) context[a0] = 2_1021d6d001
>>>> (XEN) l4[000] = 9c00001021d6c107
>>>> (XEN) l3[002] = 9c00001021d3e107
>>>> (XEN) l2[06f] = 9c000010218c0107
>>>> (XEN) l1[0b3] = 8000000000000000
>>>> (XEN) l1[0b3] not present
>>>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>>> This might be a hint at a missing RMRR entry in the ACPI tables, as
>>> we've seen to be the case for a number of systems (I dare to guess
>>> that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
>>> and/or mouse connected). You may want to play with the respective
>>> command line option ("rmrr="). Note that "iommu_inclusive_mapping"
>>> as you're using it does not have any meaning for PVH (see
>>> intel_iommu_hwdom_init()).
>>>
>>> Jan
>>>
>>>
>>>
>> Hello,
>>
>> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
>> I managed to get a PVH dom0 starting. However, some other problems appeared:
>>
>> 1) The USB devices are not usable anymore (keyboard and mouse), so the
>> system is only accessible through the serial port.
> 
> Can you boot with iommu=debug and see if you get any extra IOMMU
> information on the serial console?
> 
>> 2) I can run any usual command in dom0, but the ones involving xl (except
>> for xl info) will make the system run out of memory very fast. Eventually,
>> when there is no more free memory available, the OOM killer begins removing
>> processes until the system auto reboots.
>>
>> I attached a file containing the output of a lsusb, as well as the output of
>> xl info and xl list -l.
>> After xl list -l, the “free -m” commands show the available memory
>> decreasing.
>> Each command has a timestamp appended, so it can be seen how fast the
>> available memory decreases.
>>
>> I removed much of the process killing logs and kept the last one, since they
>> were following the same pattern.
>>
>> Dom0 still appears to be of type PV (output of xl list -l), however during
>> boot, the following messages were displayed: “Building a PVH Dom0” and
>> “Booting paravirtualized kernel on Xen PVH”.
>>
>> I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
>> iommu to get dom0 running.
> 
> It seems to me like your ACPI DMAR table contains errors, and I
> wouldn't be surprised if those also cause the USB devices to
> malfunction.
> 
>>
>> What could be causing the available memory loss problem?
> 
> That seems to be Linux aggressively ballooning out memory: you go from
> 7129M total memory to 246M. Are you creating a lot of domains?

This might be related to the tools thinking dom0 is a PV domain.


Juergen
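
For anyone reading the quoted VT-d fault dump, the indices it prints can
be recomputed from the fault address and the device number. The sketch
below is only an illustration, not code from Xen: it assumes 4 KiB pages,
a 4-level IOMMU page table with 9 index bits per level, and the usual
VT-d second-level entry layout with Read in bit 0 and Write in bit 1.
The function names are made up for this example.

```python
# Illustrative sketch: recompute the indices shown by print_vtd_entries.
# Assumptions (not taken from the Xen source): 4 KiB pages, 4-level table,
# 9 index bits per level, Read = bit 0 and Write = bit 1 in each entry.

PAGE_SHIFT = 12
LEVEL_BITS = 9
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def table_indices(fault_addr):
    """Return the (l4, l3, l2, l1) table indices for a faulting DMA address."""
    frame = fault_addr >> PAGE_SHIFT  # the "gmfn" in the log
    return tuple(
        (frame >> (LEVEL_BITS * level)) & LEVEL_MASK for level in (3, 2, 1, 0)
    )


def context_index(device, function):
    """PCI devfn: five device bits followed by three function bits."""
    return (device << 3) | function


def pte_permissions(entry):
    """Decode the read/write permission bits of a page-table entry."""
    return {"read": bool(entry & 1), "write": bool(entry & 2)}


fault_addr = 0x8DEB3000
print(hex(fault_addr >> PAGE_SHIFT))                      # gmfn
print([f"{i:03x}" for i in table_indices(fault_addr)])    # l4, l3, l2, l1
print(f"{context_index(0x14, 0):02x}")                    # context[] index
print(pte_permissions(0x8000000000000000))                # the l1[0b3] value
```

Under those assumptions the results line up with the dump: gmfn 8deb3,
indices 000/002/06f/0b3, context entry a0 for device 00:14.0, and an l1
entry with neither read nor write set, which is consistent with the
"PTE Write access is not set" fault reason.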

