
Re: [Xen-devel] [PATCH v5] x86/p2m: use large pages for MMIO mappings



On 27/01/16 13:37, Jan Beulich wrote:
>>>> On 27.01.16 at 13:32, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 25/01/16 16:18, Jan Beulich wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -2491,7 +2491,7 @@ static int vmx_alloc_vlapic_mapping(stru
>>>      share_xen_page_with_guest(pg, d, XENSHARE_writable);
>>>      d->arch.hvm_domain.vmx.apic_access_mfn = mfn;
>>>      set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE), _mfn(mfn),
>>> -                       p2m_get_hostp2m(d)->default_access);
>>> +                       PAGE_ORDER_4K, p2m_get_hostp2m(d)->default_access);
>>>  
>> This should ASSERT() success, in case we make further changes to the
>> error handling.
> Maybe, but since it didn't before I don't see why this couldn't /
> shouldn't be an independent future patch.

Can be.  IMO it is a bug that it isn't already checked.  (-ENOMEM when
allocating p2m leaves perhaps?)

>
>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -899,48 +899,62 @@ void p2m_change_type_range(struct domain
>>>      p2m_unlock(p2m);
>>>  }
>>>  
>>> -/* Returns: 0 for success, -errno for failure */
>>> +/*
>>> + * Returns:
>>> + *    0        for success
>>> + *    -errno   for failure
>>> + *    order+1  for caller to retry with order (guaranteed smaller than
>>> + *             the order value passed in)
>>> + */
>>> static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
>>> -                               p2m_type_t gfn_p2mt, p2m_access_t access)
>>> +                               unsigned int order, p2m_type_t gfn_p2mt,
>>> +                               p2m_access_t access)
>>>  {
>>>      int rc = 0;
>>>      p2m_access_t a;
>>>      p2m_type_t ot;
>>>      mfn_t omfn;
>>> +    unsigned int cur_order = 0;
>>>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>  
>>>      if ( !paging_mode_translate(d) )
>>>          return -EIO;
>>>  
>>> -    gfn_lock(p2m, gfn, 0);
>>> -    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
>>> +    gfn_lock(p2m, gfn, order);
>>> +    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, &cur_order, NULL);
>>> +    if ( cur_order < order )
>>> +    {
>>> +        gfn_unlock(p2m, gfn, order);
>>> +        return cur_order + 1;
>> Your comment states that the return value is guaranteed to be less than
>> the passed-in order, but this is not the case here.  cur_order could, in
>> principle, be only 1 less than order, at which point your documentation
>> is incorrect.
>>
>> Does this rely on the x86 architectural orders to function as documented?
> No. Maybe the comment text is ambiguous, but I don't see how to
> improve it without making it too lengthy: The return value is
> <order>+1, telling the caller to retry with <order>, which is
> guaranteed to be less than the order that got passed in. I.e. taking
> the variable naming above, the caller would have to retry with
> cur_order, which - due to the if() - is smaller than order.

Ah - I see.  The text is indeed confusing.  How about:

"1 + new order: for caller to retry with smaller order (guaranteed to be
smaller than order passed in)"
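
The convention being discussed can be illustrated with a small, self-contained sketch (hypothetical names, not the actual Xen code): the callee returns 1 + the order the caller should retry with, and since that retry order is strictly smaller than the one passed in, the caller's loop must terminate.

```c
#include <assert.h>

/*
 * Toy model of the "0 / -errno / order+1" return convention discussed
 * above; try_set_entry() and set_entry_with_retry() are hypothetical
 * stand-ins, not Xen functions.
 */
static int try_set_entry(unsigned int order, unsigned int cur_order)
{
    if ( cur_order < order )
        return cur_order + 1;   /* caller must retry with cur_order */
    return 0;                   /* success */
}

static int set_entry_with_retry(unsigned int order, unsigned int cur_order)
{
    int rc;

    for ( ; ; )
    {
        rc = try_set_entry(order, cur_order);
        if ( rc <= 0 )
            return rc;          /* success, or hard -errno failure */
        /* rc - 1 == cur_order, which the if() above made < order */
        order = rc - 1;
    }
}
```

Because each retry order is strictly smaller than the previous one, the loop bottoms out at order 0 in the worst case.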

>
>>> +    }
>>>      if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
>>>      {
>>> -        gfn_unlock(p2m, gfn, 0);
>>> +        gfn_unlock(p2m, gfn, order);
>>>          domain_crash(d);
>>>          return -ENOENT;
>>>      }
>>>      else if ( p2m_is_ram(ot) )
>>>      {
>>> +        unsigned long i;
>>> +
>>>          ASSERT(mfn_valid(omfn));
>> Shouldn't this check be extended to cover the top of the order?
> Well, yes, perhaps better to move it into ...
>
>>> -        set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY);
>>> +        for ( i = 0; i < (1UL << order); ++i )
>>> +            set_gpfn_from_mfn(mfn_x(omfn) + i, INVALID_M2P_ENTRY);
> ... the body of the for(). But I'll wait with v6 until we settled on
> the other aspects you raise.
>
>>>  int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
>>> -                       p2m_access_t access)
>>> +                       unsigned int order, p2m_access_t access)
>>>  {
>>> -    return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct, access);
>>> +    if ( order &&
>>> +         rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
>>> +                                 mfn_x(mfn) + (1UL << order) - 1) &&
>>> +         !rangeset_contains_range(mmio_ro_ranges, mfn_x(mfn),
>>> +                                  mfn_x(mfn) + (1UL << order) - 1) )
>>> +        return order;
>> Should this not be a hard error?  Even retrying with a lower order is
>> going fail.
> Why? At the latest when order == 0, rangeset_overlaps_range()
> will return the same as rangeset_contains_range(), and hence
> the condition above will always be false (one of the two reasons
> for checking order first here).

It isn't the order check which is an issue.

One way or another, if the original (mfn/order) fails the rangeset
checks, the overall call is going to fail, but it will be re-executed
repeatedly with the order decreasing to 0.  Wouldn't it be better just to
short-circuit this back-and-forth?
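
The termination argument can be seen in a toy model of the check (a single read-only range standing in for mmio_ro_ranges; hypothetical names, not the actual Xen code): once order reaches 0, "overlaps" and "contains" coincide and the condition is necessarily false, but each intermediate order still costs one extra round trip through the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* toy rangeset: one read-only MFN range [RO_START, RO_END] */
#define RO_START 0x105UL
#define RO_END   0x105UL

static bool overlaps(unsigned long s, unsigned long e)
{
    return s <= RO_END && e >= RO_START;
}

static bool contains(unsigned long s, unsigned long e)
{
    return s >= RO_START && e <= RO_END;
}

/*
 * Returns order (caller retries with a smaller order) when the
 * 2^order range straddles the read-only set, 0 otherwise.
 */
static int check_mmio_order(unsigned long mfn, unsigned int order)
{
    unsigned long last = mfn + (1UL << order) - 1;

    if ( order && overlaps(mfn, last) && !contains(mfn, last) )
        return order;
    return 0;
}
```

For a single page (order == 0), overlaps(mfn, mfn) equals contains(mfn, mfn), so the condition is always false and the retries bottom out; the question above is whether the intermediate retries are worth skipping.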

Relatedly, is there actually anything wrong with making a superpage
read-only mapping over some scattered read-only 4K pages?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

