[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 05/21] IOMMU/x86: restrict IO-APIC mappings for PV Dom0


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 4 May 2022 15:19:16 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=OIegK4KQyJkiHm0c3sXrkiucNtnTVk8d6mJ1eMXyT9k=; b=MqLbC+TG3FP8xG3SWH+1oWgu6+cqSYsTM79sAInC++oUUgkvs9+jbeGJG7u7TE8SYEK2bNJR03mebSEarGANvQEGgYES6UtPyAcch/0oA1MAk4EvRoYdKA4qj+a50/XTEoMZI2RMSRHG7/PVDwUtWggiGY6d2c4VDBXYgZz8NCOfDTkTx3m/4fChiph5FYTxMekHvqCORs7UuHdeN8ljeut8ix+PrdEjRq1PLuxT2M753vmQNLSeEgc3+byoSQysjgA683wQwpNDkSgfPVhAv6J0wZPiyhfxBwubsj1YBsX0EMvw5mHlAFFzCYP/PYtrBulYoiCmmsrtVvP0RkeZ/Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=P5QzvFr1BFpvk6Y08TYMJxTQezJVVmarXWbCoqZcc+c/q4HIpHu5oVGc+sY+MjURvXchRn+jlB1vvY2xGntAVM3Y2sE+TpKe4ds4tPbxHMRG5ig0jnz3T4UvqYb6EDumeyspUq8XkdpcGwXNge9kpQg3plxt4nRajoHD2X1y8yN+gItyal9FYRuGlYPDxY/xAx0rNDv1FhqYicAtFPDisPeaMFQCfbIp5F+xTv2fQD3as0uqnrHnauUAbBdeOewwFWy8NXdBEqejYk2BWXJvfXkZlTj6IpZ5QSnXqkyysZ1yWozp5Oo8ZVebNAFV211qvVHhwDj58eCkKftVPYCKwg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 04 May 2022 13:19:30 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.05.2022 15:00, Roger Pau Monné wrote:
> On Wed, May 04, 2022 at 02:12:58PM +0200, Jan Beulich wrote:
>> On 04.05.2022 14:01, Roger Pau Monné wrote:
>>> On Wed, May 04, 2022 at 12:51:25PM +0200, Jan Beulich wrote:
>>>> On 04.05.2022 12:30, Roger Pau Monné wrote:
>>>>> Right, ->iomem_caps is indeed too wide for our purpose.  What
>>>>> about using something like:
>>>>>
>>>>> else if ( is_pv_domain(d) )
>>>>> {
>>>>>     if ( !iomem_access_permitted(d, pfn, pfn) )
>>>>>         return 0;
>>>>
>>>> We can't return 0 here (as RAM pages also make it here when
>>>> !iommu_hwdom_strict), so I can at best take this as a vague outline
>>>> of what you really mean. And I don't want to rely on RAM pages being
>>>> (imo wrongly) represented by set bits in Dom0's iomem_caps.
>>>
>>> Well, yes, my suggestion was taking into account that ->iomem_caps for
>>> dom0 has mostly holes for things that shouldn't be mapped, but
>>> otherwise contains everything else as allowed (including RAM).
>>>
>>> We could instead do:
>>>
>>> else if ( is_pv_domain(d) && type != RAM_TYPE_CONVENTIONAL )
>>> {
>>>     ...
>>>
>>> So that we don't rely on RAM being 'allowed' in ->iomem_caps?
>>
>> This would feel to me like excess special casing.
> 
> What about placing this in the 'default:' label on the type switch a
> bit above?

I'd really like to stick to the present layout of where the special
casing is done, with PV and PVH logic at least next to each other. I
continue to think the construct I suggested (still visible below)
would do.

>>>>>     if ( rangeset_contains_singleton(mmio_ro_ranges, pfn) )
>>>>>         return IOMMUF_readable;
>>>>> }
>>>>>
>>>>> That would get us a bit closer to allowed CPU side mappings, and we
>>>>> don't need to special case IO-APIC or HPET addresses as those are
>>>>> already added to ->iomem_caps or mmio_ro_ranges respectively by
>>>>> dom0_setup_permissions().
>>>>
>>>> This won't fit in a region of code framed by a (split) comment
>>>> saying "Check that it doesn't overlap with ...". Hence if anything
>>>> I could put something like this further down. Yet even then the
>>>> question remains what to do with ranges which pass
>>>> iomem_access_permitted() but
>>>> - aren't really MMIO,
>>>> - are inside MMCFG,
>>>> - are otherwise special.
>>>>
>>>> Or did you perhaps mean to suggest something like
>>>>
>>>> else if ( is_pv_domain(d) && iomem_access_permitted(d, pfn, pfn) &&
>>>>           rangeset_contains_singleton(mmio_ro_ranges, pfn) )
>>>>     return IOMMUF_readable;
>>>
>>> I don't think this would be fully correct, as we would still allow
>>> mappings of IO-APIC pages explicitly banned in ->iomem_caps by not
>>> handling those?
>>
>> CPU side mappings don't deal with the IO-APICs specifically. They only
>> care about iomem_caps and mmio_ro_ranges. Hence explicitly banned
>> IO-APIC pages cannot be mapped there either. (Of course we only do
>> such banning if IO-APIC pages weren't possible to represent in
>> mmio_ro_ranges, which should effectively be never.)
> 
> I think I haven't expressed myself correctly.
> 
> This construct won't return 0 for pfns not in iomem_caps, and hence
> could allow mapping of addresses not in iomem_caps?

I'm afraid I don't understand: There's an iomem_access_permitted()
in the conditional. How would this allow mapping pages outside of
iomem_caps? The default case higher up has already forced perms to
zero for any non-RAM page (unless iommu_hwdom_inclusive).

>>>> ? Then there would only remain the question of whether mapping r/o
>>>> MMCFG pages is okay (I don't think it is), but that could then be
>>>> special-cased similar to what's done further down for vPCI (by not
>>>> returning in the "else if", but merely updating "perms").
>>>
>>> Well part of the point of this is to make CPU and Device mappings
>>> more similar.  I don't think devices have any business in poking at
>>> the MMCFG range, so it's fine to explicitly ban that range.  But I
>>> would have also said the same for IO-APIC pages, so I'm unsure why are
>>> IO-APIC pages fine to be mapped RO, but not the MMCFG range.
>>
>> I wouldn't have wanted to allow r/o mappings of the IO-APICs, but
>> Linux plus the ACPI tables of certain vendors require us to permit
>> this. If we didn't, Dom0 would crash there during boot.
> 
> Right, but those are required for the CPU only.  I think it's a fine
> goal to try to have similar mappings for CPU and Devices, and then
> that would also cover MMCFG in the PV case.  Or else it fine to assume
> CPU vs Device mappings will be slightly different, and then don't add
> any mappings for IO-APIC, HPET or MMCFG to the Device page tables
> (likely there's more that could be added here).

It being different is what Andrew looks to strongly dislike. And I agree
with this up to a certain point, i.e. I'm having a hard time seeing why
we should put in MMCFG mappings just for this reason. But if consensus
was that consistency across all types of MMIO is the goal, then I could
live with also making MMCFG mappings ...

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.