[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH] iommu: make no-quarantine mean no-quarantine


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Scott Davis <scottwd@xxxxxxxxx>
  • From: Scott Davis <scott.davis@xxxxxxxxxx>
  • Date: Tue, 27 Apr 2021 22:00:56 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=starlab.io; dmarc=pass action=none header.from=starlab.io; dkim=pass header.d=starlab.io; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Q8j6s5msbrbc5bV7Il9956h/ZbycnyfmCNCvD4b0gyE=; b=dcw9P13qq6w4CZ9IDqyTzGFo3PWLZMYKqMvvUmPTyFl1ly5nVNxuoUT+CG6pWN9+MkJBluVBLWRWeUMTjbkmdRyXWs5WoifvO8wIKq+7yYyhNjAmDDu7dfKtMmoqRZWvGTssT9sms3W2mYUYGS4z1CHNTSjiCC1t2KjygbiMtukUgmbXSrjWku2dzZGvxyNkkzShtICuZZPKbXnxkVfNvIU03NiIZPA0lJla8aYjyN4grx94UJX1NBerV9haNFRSq47TJVj2F8rqv5JIpA4atBN2UYutPf7OfB+6reBkWdH3GPpjcwSCU5pg8ExLvkV52vUKD5a/FOJP0SpEtFUNlA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=UoQJ1EiPsungv3GiBzBTvZSiDXfG22wewC1QxqzQEH2OCyqdF9jwUGaSFiGXumwHDPGVpr7k6C6cB3s6IdFpKVzNhnMiQCXTcUpKGdPYnb5hW8XW6P6afET7iVwsBKprSS5jNN38xXgSH5E7J7IuJ3ptrjW8ncNLfXJSbo3hkeO9CXXGadULUPM2CgMtJe4PXNXu+C+csA7bJRl0ynnWfMd1KiFoH0jY73R+zQ6f25EnpiaIhxVjhKfeUSzDnXN5o4ZmVljVS+keCX683StlxfGHNkevleNBtTOznx6mq8EVBjdGnZCxF7lllBiqvWE7d4neTZkCuFsl4+seyF416g==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=starlab.io;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 27 Apr 2021 22:01:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHXOsEjPlXZorMQi0Kt8384oCpmdqrH72EAgAC5poA=
  • Thread-topic: [RFC PATCH] iommu: make no-quarantine mean no-quarantine

Thanks for the feedback, Jan!

On 4/27/21, 2:56 AM, Jan Beulich wrote:
> On 26.04.2021 19:25, Scott Davis wrote:
>> This patch modifies Xen's behavior when making devices assignable while the
>> iommu=no-quarantine command line option is in effect. Currently this option
>> only affects device deassignment, causing devices to get immediately assigned
>> back to Dom0 instead of to the quarantine dom_io domain. This patch extends
>> no-quarantine to device assignment as well, preventing devices from being
>> assigned to dom_io when they are made assignable while no-quarantine is in
>> effect.
>
> Well, the term "quarantine" to me means a safety action taken _after_
> possible exposure to something "bad". Therefore I see this being specific
> to device de-assignment as the logical thing. Hence if a mode like what
> you describe was wanted, I don't think it should be the result of
> "iommu=no-quarantine".

Sorry I'm a bit confused by this. Correct me if wrong, but my understanding is  
that the purpose of assigning a device to dom_io on de-assignment is to protect 
 
Dom0 from the effects of in-flight DMA operations initiated by a DomU. I 
assumed  
that the purpose of (temporarily) assigning to dom_io on assignment was the 
same  
but in reverse: protecting a DomU from Dom0-initiated ops. Is this not the case?

Also, documentation and code already refer to the operation in question as a 
"quarantine" (see xl command line docs and libxl__device_pci_assignable_add) 
and to devices that have undergone it as being "quarantined" (see assign_device 
in xen/drivers/passthrough/pci.c). So if that is not the correct term, there 
may 
be some additional changes needed for consistency. Is there another name that 
would be more appropriate?

I would also point out that, currently, there does not appear to be a way for 
an 
xl user to opt out of quarantine functionality in either direction other than 
by 
manually making devices assignable. There is no xl command line flag to disable 
it and iommu=no-quarantine will have no effect because any device that xl itself
makes assignable will have the struct pci_dev.quarantine flag set, which 
overrides iommu=no-quarantine. Is that intentional?

If I misunderstood and your objection is simply that "quarantine-on-assignment" 
 
and "quarantine-on-deassignment" should be controllable by separate iommu  
options, that's an easy enough change to make. Although I think that might also 
negate the need for/effect of struct pci_dev.quarantine as described above. If 
that's what is desired, any thoughts on what the new option(s) should be called?

>> Background: I am setting up a QEMU-based development and testing environment 
>> for
>> the Crucible team at Star Lab that includes emulated PCIe devices for
>> passthrough and hotplug. I encountered an issue with `xl pci-assignable-add`
>> that causes the host QEMU to rapidly allocate memory until getting 
>> OOM-killed.
>> I then found that the issue could be worked around either by using manual 
>> sysfs
>> commands to rebind devices to pciback or by skipping over the quarantine 
>> logic
>> in `libxl__device_pci_assignable_add`, producing a working system. I hoped 
>> that
>> setting iommu=no-quarantine on the command line would have the same effect, 
>> only
>> to be surprised that it did not.
>
> ... some of this "why do we want this" would belong in the commit message
> imo, not just here.

Thanks for the tip, I will include this info in commit message in any future 
revisions.

Good day,
Scott


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.