
Re: PCI pass-through problem for SN570 NVME SSD


  • To: "G.R." <firemeteor@xxxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 6 Jul 2022 08:33:41 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 06 Jul 2022 06:33:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 06.07.2022 08:25, G.R. wrote:
> On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> Nothing useful in there. Yet independent of that I guess we need to
>> separate the issues you're seeing. Otherwise it'll be impossible to
>> know what piece of data belongs where.
> Yep, I think I'm seeing several different issues here:
> 1. The FLR related DPC / AER message seen on the 1st attempt only when
> pciback tries to seize and release the SN570
>     - Later-on pciback operations appear just fine.
> 2. MSI-X preparation failure message that shows up each time the SN570
> is seized by pciback or when it's passed to domU.
> 3. XEN tries to map BAR from two devices to the same page
> 4. The "write-back to unknown field" message in QEMU log that goes
> away with permissive=1 passthrough config.
> 5. The "irq 16: nobody cared" message shows up *sometimes* in a
> pattern that I haven't figured out  (See attached)
> 6. The FreeBSD domU sees the device but fails to use it because low
> level commands sent to it are aborted.
> 7. The device does not return to the pci-assignable-list when the domU
> it was assigned to shuts down. (See attached)
> 
> #3 appears to be a known issue that could be worked around with
> patches from the list.
> I suspect #1 may have something to do with the device itself. It's
> still not clear if it's deadly or just annoying.
> I was able to update the firmware to the latest version and confirmed
> that the new firmware didn't make any noticeable difference.
> 
> I suspect issue #2, #4, #5, #6, #7 may be related, and the
> pass-through was not completely successful...
> 
> Should I expect a debug build of XEN hypervisor to give better
> diagnostic messages, without the debug patch that Roger mentioned?

Well, "expect" is perhaps too much to say, but with problems like
yours (and even more so with multiple ones) using a debug
hypervisor (or kernel, if such a build mode existed there) is imo
always a good idea. As is using as up-to-date a version as
possible.

Jan



 

