
Re: [PATCH v2] piix: fix regression during unplug in Xen HVM domUs


  • To: Paolo Bonzini <pbonzini@xxxxxxxxxx>, John Snow <jsnow@xxxxxxxxxx>
  • From: Olaf Hering <olaf@xxxxxxxxx>
  • Date: Mon, 26 Jun 2023 23:19:01 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Stefano Stabellini <sstabellini@xxxxxxxxxx>, qemu-devel@xxxxxxxxxx
  • Delivery-date: Mon, 26 Jun 2023 21:19:46 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

I need advice on how to debug this.

One thing that stands out is uhci_irq().
It reads a u16 from the USBSTS register. 

On the QEMU side, this read is served by bmdma_read. Since the read
size is 2, the result is ~0, and uhci_irq() shuts the controller down.
In other words, memory_region_ops_read for addr=0xc102 is served from
"piix-bmdma".
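
To make the effect of an all-ones status concrete, here is a minimal
sketch in plain C. It is not QEMU or Linux code: byte_only_read() and
status_is_fatal() are hypothetical names, the first modelling a handler
that only answers byte-sized reads and the second a simplified version
of the status check. The bit values are the USBSTS bits from the UHCI
specification; the real uhci_irq() additionally looks at the root hub
state before it declares the controller dead.

```c
#include <stdint.h>
#include <stdio.h>

/* USBSTS bits as defined by the UHCI specification. */
#define USBSTS_USBINT 0x0001 /* interrupt on completion */
#define USBSTS_ERROR  0x0002 /* interrupt due to a transfer error */
#define USBSTS_RD     0x0004 /* resume detect */
#define USBSTS_HSE    0x0008 /* host system error */
#define USBSTS_HCPE   0x0010 /* host controller process error */
#define USBSTS_HCH    0x0020 /* host controller halted */

/*
 * Hypothetical handler that only supports byte reads. Any other access
 * size (2 or 4 in this sketch) is answered with all bits set.
 */
static uint64_t byte_only_read(unsigned size)
{
    if (size != 1) {
        return ((uint64_t)1 << (size * 8)) - 1;
    }
    return 0; /* actual register contents do not matter here */
}

/*
 * Simplified status check (assumption: root hub state is ignored).
 * Returns 1 if the driver would give up on the controller.
 */
static int status_is_fatal(uint16_t status)
{
    if (!(status & ~(USBSTS_USBINT | USBSTS_ERROR | USBSTS_RD))) {
        return 0; /* only ordinary interrupt causes are set */
    }
    if (status & USBSTS_HSE) {
        puts("host system error");
    }
    if (status & USBSTS_HCPE) {
        puts("host controller process error");
    }
    if (status & USBSTS_HCH) {
        puts("host controller halted");
        return 1;
    }
    return 0;
}
```

Passing (uint16_t)byte_only_read(2) to status_is_fatal() sets every
error bit at once, which matches the sequence of uhci_hcd messages in
the quoted log below.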

If the pci_set_word calls in piix_ide_reset are skipped, the read is
served by uhci_port_read. This is the expected behavior.
In other words, memory_region_ops_read for addr=0xc102 is served from "uhci".

So far I have been unable to work out how the pci_set_word calls can
affect which memory region ends up serving memory_region_ops_read.
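
One thing that may be worth checking is the width of the store to
offset 0x20. The sketch below is a standalone model, not QEMU code:
cfg_set_byte(), cfg_set_long(), cfg_get_long(), io_bar_base() and
in_window() are hypothetical helpers, and the 16-byte window size is an
assumption. It only shows what a one-byte store does to a 32-bit
little-endian I/O BAR that previously held 0xc121. The quoted message
below reports the new value as 0x1; it does not say whether that is the
byte or the whole register.

```c
#include <stdint.h>

/* Store one byte into config space; the other three bytes stay as-is. */
static void cfg_set_byte(uint8_t *p, uint8_t v)
{
    p[0] = v;
}

/* Store a 32-bit value in little-endian byte order. */
static void cfg_set_long(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value. */
static uint32_t cfg_get_long(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* I/O BAR: bit 0 marks I/O space, bit 1 is reserved, the rest is the base. */
static uint32_t io_bar_base(uint32_t bar)
{
    return bar & ~(uint32_t)0x3;
}

/* True if addr falls inside [base, base + size). */
static int in_window(uint32_t base, uint32_t size, uint32_t addr)
{
    return addr >= base && addr - base < size;
}
```

Under these assumptions, cfg_set_long(cfg, 0xc121) followed by
cfg_set_byte(cfg, 0x01) leaves 0xc101 in the register, so
io_bar_base() yields 0xc100 and in_window(0xc100, 16, 0xc102) is true.
A full-width cfg_set_long(cfg, 0x1) would leave no stale base address
behind. Whether QEMU actually remaps the region from that stale value
is something I have not verified.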


Thanks,
Olaf

Wed, 10 May 2023 00:58:27 +0200 Olaf Hering <olaf@xxxxxxxxx>:

> Resuming this old thread about an unfixed bug, which was introduced in 
> qemu-4.2:
> 
> qemu ends up in piix_ide_reset from pci_unplug_disks.
> This was not the case prior to 4.2; the call to
> qemu_register_reset(piix3_reset, d) that was removed in
> ee358e919e385fdc79d59d0d47b4a81e349cd5c9 apparently did nothing.
> 
> In my debugging (with v8.0.0) it turned out that the three pci_set_*
> calls cause the domU to hang. In fact, it is just the last one:
> 
>    pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
> 
> It changes the value from 0xc121 to 0x1.
> 
> The question is: what does this do in practice?
> 
> Starting with recent qemu (like 7.2), the domU sometimes proceeds with
> these messages:
> 
>     [    1.631161] uhci_hcd 0000:00:01.2: host system error, PCI problems?
>     [    1.634965] uhci_hcd 0000:00:01.2: host controller process error, something bad happened!
>     [    1.634965] uhci_hcd 0000:00:01.2: host controller halted, very bad!
>     [    1.634965] uhci_hcd 0000:00:01.2: HC died; cleaning up
>     Loading basic drivers...[    2.398048] Disabling IRQ #23

Attachment: pgprMNTZtk6np.pgp
Description: OpenPGP digital signature