
Re: [PATCH v4 05/21] IOMMU/x86: restrict IO-APIC mappings for PV Dom0


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 4 May 2022 12:51:25 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 04 May 2022 10:51:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.05.2022 12:30, Roger Pau Monné wrote:
> On Wed, May 04, 2022 at 11:32:51AM +0200, Jan Beulich wrote:
>> On 03.05.2022 16:50, Jan Beulich wrote:
>>> On 03.05.2022 15:00, Roger Pau Monné wrote:
>>>> On Mon, Apr 25, 2022 at 10:34:23AM +0200, Jan Beulich wrote:
>>>>> While already the case for PVH, there's no reason to treat PV
>>>>> differently here, though of course the addresses get taken from another
>>>>> source in this case. Except that, to match CPU side mappings, by default
>>>>> we permit r/o ones. This then also means we now deal consistently with
>>>>> IO-APICs whose MMIO is or is not covered by E820 reserved regions.
>>>>>
>>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>> ---
>>>>> [integrated] v1: Integrate into series.
>>>>> [standalone] v2: Keep IOMMU mappings in sync with CPU ones.
>>>>>
>>>>> --- a/xen/drivers/passthrough/x86/iommu.c
>>>>> +++ b/xen/drivers/passthrough/x86/iommu.c
>>>>> @@ -275,12 +275,12 @@ void iommu_identity_map_teardown(struct
>>>>>      }
>>>>>  }
>>>>>  
>>>>> -static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
>>>>> -                                         unsigned long pfn,
>>>>> -                                         unsigned long max_pfn)
>>>>> +static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d,
>>>>> +                                                 unsigned long pfn,
>>>>> +                                                 unsigned long max_pfn)
>>>>>  {
>>>>>      mfn_t mfn = _mfn(pfn);
>>>>> -    unsigned int i, type;
>>>>> +    unsigned int i, type, perms = IOMMUF_readable | IOMMUF_writable;
>>>>>  
>>>>>      /*
>>>>>       * Set up 1:1 mapping for dom0. Default to include only conventional RAM
>>>>> @@ -289,44 +289,60 @@ static bool __hwdom_init hwdom_iommu_map
>>>>>       * that fall in unusable ranges for PV Dom0.
>>>>>       */
>>>>>      if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
>>>>> -        return false;
>>>>> +        return 0;
>>>>>  
>>>>>      switch ( type = page_get_ram_type(mfn) )
>>>>>      {
>>>>>      case RAM_TYPE_UNUSABLE:
>>>>> -        return false;
>>>>> +        return 0;
>>>>>  
>>>>>      case RAM_TYPE_CONVENTIONAL:
>>>>>          if ( iommu_hwdom_strict )
>>>>> -            return false;
>>>>> +            return 0;
>>>>>          break;
>>>>>  
>>>>>      default:
>>>>>          if ( type & RAM_TYPE_RESERVED )
>>>>>          {
>>>>>              if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
>>>>> -                return false;
>>>>> +                perms = 0;
>>>>>          }
>>>>> -        else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > max_pfn )
>>>>> -            return false;
>>>>> +        else if ( is_hvm_domain(d) )
>>>>> +            return 0;
>>>>> +        else if ( !iommu_hwdom_inclusive || pfn > max_pfn )
>>>>> +            perms = 0;
>>>>>      }
>>>>>  
>>>>>      /* Check that it doesn't overlap with the Interrupt Address Range. */
>>>>>      if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
>>>>> -        return false;
>>>>> +        return 0;
>>>>>      /* ... or the IO-APIC */
>>>>> -    for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
>>>>> -        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
>>>>> -            return false;
>>>>> +    if ( has_vioapic(d) )
>>>>> +    {
>>>>> +        for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
>>>>> +            if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
>>>>> +                return 0;
>>>>> +    }
>>>>> +    else if ( is_pv_domain(d) )
>>>>> +    {
>>>>> +        /*
>>>>> +         * Be consistent with CPU mappings: Dom0 is permitted to establish r/o
>>>>> +         * ones there, so it should also have such established for IOMMUs.
>>>>> +         */
>>>>> +        for ( i = 0; i < nr_ioapics; i++ )
>>>>> +            if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
>>>>> +                return rangeset_contains_singleton(mmio_ro_ranges, pfn)
>>>>> +                       ? IOMMUF_readable : 0;
>>>>
>>>> If we really are after consistency with CPU side mappings, we should
>>>> likely take the whole contents of mmio_ro_ranges and d->iomem_caps
>>>> into account, not just the pages belonging to the IO-APIC?
>>>>
>>>> There could also be HPET pages mapped as RO for PV.
>>>
>>> Hmm. This would be a yet bigger functional change, but indeed would further
>>> improve consistency. But shouldn't we then also establish r/w mappings for
>>> stuff in ->iomem_caps but not in mmio_ro_ranges? This would feel like going
>>> too far ...
>>
>> FTAOD I didn't mean to say that I think such mappings shouldn't be there;
>> I have been of the opinion that e.g. I/O directly to/from the linear
>> frame buffer of a graphics device should in principle be permitted. But
>> which specific mappings to put in place can imo not be derived from
>> ->iomem_caps, as we merely subtract certain ranges after initially having
>> set all bits in it. Besides ranges not mapping any MMIO, even something
>> like the PCI ECAM ranges (parts of which we may also force to r/o, and
>> which we would hence cover here if I followed your suggestion) are
>> questionable in this regard.
> 
> Right, ->iomem_caps is indeed too wide for our purpose.  What
> about using something like:
> 
> else if ( is_pv_domain(d) )
> {
>     if ( !iomem_access_permitted(d, pfn, pfn) )
>         return 0;

We can't return 0 here (as RAM pages also make it here when
!iommu_hwdom_strict), so I can at best take this as a vague outline
of what you really mean. And I don't want to rely on RAM pages being
(imo wrongly) represented by set bits in Dom0's iomem_caps.

>     if ( rangeset_contains_singleton(mmio_ro_ranges, pfn) )
>         return IOMMUF_readable;
> }
> 
> That would get us a bit closer to allowed CPU side mappings, and we
> don't need to special case IO-APIC or HPET addresses as those are
> already added to ->iomem_caps or mmio_ro_ranges respectively by
> dom0_setup_permissions().

This won't fit in a region of code framed by a (split) comment
saying "Check that it doesn't overlap with ...". Hence if anything
I could put something like this further down. Yet even then the
question remains what to do with ranges which pass
iomem_access_permitted() but
- aren't really MMIO,
- are inside MMCFG,
- are otherwise special.

Or did you perhaps mean to suggest something like

else if ( is_pv_domain(d) && iomem_access_permitted(d, pfn, pfn) &&
          rangeset_contains_singleton(mmio_ro_ranges, pfn) )
    return IOMMUF_readable;

? Then there would only remain the question of whether mapping r/o
MMCFG pages is okay (I don't think it is), but that could then be
special-cased similar to what's done further down for vPCI (by not
returning in the "else if", but merely updating "perms").
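
For illustration only, the "update perms instead of returning" variant
could be modelled roughly as below. This is a standalone sketch, not the
actual Xen code: struct page_facts, adjust_perms() and the in_mmcfg flag
are hypothetical stand-ins for the real lookups, the flag values are
illustrative, and assigning IOMMUF_readable (rather than masking) is an
assumption chosen to mirror the "return IOMMUF_readable" form above.

```c
#include <assert.h>
#include <stdbool.h>

/* Names mirror Xen's flags; the values here are illustrative only. */
#define IOMMUF_readable 1u
#define IOMMUF_writable 2u

/* Hypothetical per-page facts that the real code would look up. */
struct page_facts {
    bool pv_domain;        /* is_pv_domain(d) */
    bool access_permitted; /* iomem_access_permitted(d, pfn, pfn) */
    bool in_mmio_ro;       /* rangeset_contains_singleton(mmio_ro_ranges, pfn) */
    bool in_mmcfg;         /* hypothetical: page lies inside an MMCFG range */
};

/* Returns the IOMMU permissions for one page; 0 means "do not map". */
static unsigned int adjust_perms(const struct page_facts *f,
                                 unsigned int perms)
{
    assert(f);

    /* CPU side is r/o for PV Dom0: record r/o, but do not return yet. */
    if ( f->pv_domain && f->access_permitted && f->in_mmio_ro )
        perms = IOMMUF_readable;

    /* A later special case can still veto the mapping, e.g. for MMCFG. */
    if ( f->in_mmcfg )
        perms = 0;

    return perms;
}
```

Because the r/o case only updates perms, a later check (here the assumed
MMCFG one) can still reduce the result to "no mapping", which an early
return would have bypassed.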

Jan




 

