[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes for PV mem_access



>> +/* Get the page permission of the mfn from page_info->shadow_flags */
>> +static mfn_t
>> +p2m_mem_access_get_entry(struct p2m_domain *p2m, unsigned long
>gfn,
>> +                         p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
>> +                         unsigned int *page_order)
>> +{
>> +    struct domain *d = p2m->domain;
>> +    /* For PV guests mfn == gfn */
>
>Valid, but why is this not being checked in any way in the earlier
>"set" counterpart?

I will add that check.

>> +    mfn_t mfn = _mfn(gfn);
>> +    struct page_info *page = mfn_to_page(mfn);
>> +
>> +    ASSERT(shadow_mode_enabled(d));
>> +
>> +    *t = p2m_ram_rw;
>> +
>> +    if ( page_get_owner(page) != d )
>
>I think there's a mfn_valid() check missing prior to this re-ref of page.

I will add that too.

>> @@ -1356,7 +1357,7 @@ void shadow_prealloc(struct domain *d, u32 type,
>> unsigned int count)
>>
>>  /* Deliberately free all the memory we can: this will tear down all of
>>   * this domain's shadows */
>> -static void shadow_blow_tables(struct domain *d)
>> +void shadow_blow_tables(struct domain *d)
>
>This doesn't belong here - the patch doesn't add any new/external
>caller of this function.

I was trying to bunch shadow changes in to one patch. I will add this to the 
mem_access patch where the external caller is being added.

>> @@ -2435,15 +2436,20 @@ int sh_remove_all_mappings(struct vcpu *v,
>mfn_t gmfn)
>>      /* If that didn't catch the mapping, something is very wrong */
>>      if ( !sh_check_page_has_no_refs(page) )
>>      {
>> -        /* Don't complain if we're in HVM and there are some extra mappings:
>> +        /*
>> +         * Don't complain if we're in HVM and there are some extra mappings:
>>           * The qemu helper process has an untyped mapping of this dom's RAM
>>
>>           * and the HVM restore program takes another.
>>           * Also allow one typed refcount for xenheap pages, to match
>> -         * share_xen_page_with_guest(). */
>> +         * share_xen_page_with_guest().
>> +         * PV domains that have a mem_access listener, runs in shadow mode
>> +         * without refcounts.
>> +         */
>>          if ( !(shadow_mode_external(v->domain)
>>                 && (page->count_info & PGC_count_mask) <= 3
>>                 && ((page->u.inuse.type_info & PGT_count_mask)
>> -                   == !!is_xen_heap_page(page))) )
>> +                   == !!is_xen_heap_page(page))) &&
>> +             !mem_event_check_ring(&v->domain->mem_event->access) )
>
>To me this doesn't look to be in sync with the comment, as the new
>check is being carried out regardless of domain type. Furthermore
>this continues to have the problem of also hiding issues unrelated
>to mem-access handling.

I will add a check for PV domain. This check is wrt to refcouting. From what 
Tim told me PV domains cannot run in that mode, so I don't think any issues 
will be hidden if I add the check for PV domain.

>> @@ -3009,7 +3020,84 @@ static int sh_page_fault(struct vcpu *v,
>>
>>      /* What mfn is the guest trying to access? */
>>      gfn = guest_l1e_get_gfn(gw.l1e);
>> -    gmfn = get_gfn(d, gfn, &p2mt);
>> +    if ( likely(!mem_event_check_ring(&d->mem_event->access)) )
>> +        gmfn = get_gfn(d, gfn, &p2mt);
>> +    /*
>> +     * A mem_access listener is present, so we will first check if a 
>> violation
>> +     * has occurred.
>> +     */
>> +    else
>> +    {
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
>> +        p2m_access_t p2ma;
>> +
>> +        gmfn = get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, 0,
>NULL);
>> +        if ( mfn_valid(gmfn) && !sh_mfn_is_a_page_table(gmfn)
>> +             && regs->error_code & PFEC_page_present
>> +             && !(regs->error_code & PFEC_reserved_bit) )
>> +        {
>> +            int violation = 0;
>> +            bool_t access_w = !!(regs->error_code & PFEC_write_access);
>> +            bool_t access_x = !!(regs->error_code & PFEC_insn_fetch);
>> +            bool_t access_r = access_x ? 0 : !access_w;
>
>"violation" looks to be a boolean just like the other three, so wants
>to also be bool_t.

Done.

>> +
>> +            /* If the access is against the permissions, then send to 
>> mem_event
>*/
>> +            switch ( p2ma )
>> +            {
>> +            case p2m_access_r:
>> +                violation = access_w || access_x;
>> +                break;
>> +            case p2m_access_rx:
>> +            case p2m_access_rx2rw:
>> +                violation = access_w;
>> +                break;
>> +            case p2m_access_rw:
>> +                violation = access_x;
>> +                break;
>> +            case p2m_access_rwx:
>> +            default:
>> +                break;
>> +            }
>> +
>> +            /*
>> +             * Do not police writes to guest memory from the Xen hypervisor.
>> +             * This keeps PV mem_access on par with HVM. Turn off CR0.WP
>here to
>> +             * allow the write to go through if the guest has marked the 
>> page as
>> +             * writable. Turn it back on in the guest access functions
>> +             * __copy_to_user / __put_user_size() after the write is 
>> completed.
>> +             */
>> +            if ( violation && access_w &&
>> +                 regs->eip >= XEN_VIRT_START && regs->eip <= XEN_VIRT_END )
>
>Definitely < instead of <= on the right side. But - is this safe, the more
>that this again doesn't appear to be sitting in a guest kind specific block?
>I'd at least expect this to be qualified by a regs->cs and/or
>guest_mode() check.

I will add the guest kind check. Should I do it for the whole block i.e add 
is_pv_domain() in addition to mem_event_check_ring()? That will also address 
next comment below. I will add the guest_mode() in addition to the above check 
for policing Xen writes to guest memory.

>> +            {
>> +                unsigned long cr0 = read_cr0();
>> +
>> +                violation = 0;
>> +                if ( cr0 & X86_CR0_WP &&
>> +                     guest_l1e_get_flags(gw.l1e) & _PAGE_RW )
>> +                {
>> +                    cr0 &= ~X86_CR0_WP;
>> +                    write_cr0(cr0);
>> +                    v->arch.pv_vcpu.need_cr0_wp_set = 1;
>
>PV field access within a non-PV-only code block?
>
>> +                }
>
>I wonder how well the window where you're running with CR0.WP clear
>is bounded: The flag serves as a kind of security measure, and hence
>shouldn't be left off for extended periods of time.

I agree. Is there a way I can bound this?

>> +            }
>> +
>> +            if ( violation )
>> +            {
>> +                paddr_t gpa = (mfn_x(gmfn) << PAGE_SHIFT) +
>> +                              (va & ((1 << PAGE_SHIFT) - 1));
>> +                if ( !p2m_mem_access_check(gpa, 1, va, access_r, access_w,
>> +                                           access_x, &req_ptr) )
>> +                {
>> +                    SHADOW_PRINTK("Page access %c%c%c for gmfn=%"PRI_mfn"
>p2ma: %d\n",
>> +                                  (access_r ? 'r' : '-'),
>> +                                  (access_w ? 'w' : '-'),
>> +                                  (access_x ? 'x' : '-'), mfn_x(gmfn), 
>> p2ma);
>> +                    /* Rights not promoted, vcpu paused, work here is done 
>> */
>> +                    goto out_put_gfn;
>
>Rather than re-using just two lines from the normal exit path and
>introducing to it a mem-access specific code block, put the exit
>processing here, at once allowing req_ptr to be limited in scope?

Good idea. I will do that.

>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -43,6 +43,7 @@
>>  #include <asm/page.h>
>>  #include <asm/numa.h>
>>  #include <asm/flushtlb.h>
>> +#include <asm/shadow.h>
>
>Breaking ARM?
>>  #ifdef CONFIG_X86
>>  #include <asm/p2m.h>
>>  #include <asm/setup.h> /* for highmem_start only */
>> @@ -1660,6 +1661,8 @@ int assign_pages(
>>          page_set_owner(&pg[i], d);
>>          smp_wmb(); /* Domain pointer must be visible before updating refcnt.
>*/
>>          pg[i].count_info = PGC_allocated | 1;
>> +        if ( is_pv_domain(d) )
>> +            shadow_set_access(&pg[i], p2m_get_hostp2m(d)->default_access);
>
>I don't think you should call shadow code from here.

Should I add a p2m wrapper for this, so that it is valid only for x86 and a 
no-op for ARM?

>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -380,6 +380,12 @@ struct pv_vcpu
>>      /* Deferred VA-based update state. */
>>      bool_t need_update_runstate_area;
>>      struct vcpu_time_info pending_system_time;
>> +
>> +    /*
>> +     * Flag that tracks if CR0.WP needs to be set after a Xen write to guest
>> +     * memory when a PV domain has a mem_access listener attached to it.
>> +     */
>> +    bool_t need_cr0_wp_set;
>>  };
>
>The new field would better go above need_update_runstate_area
>for better space usage.

Done.

>> --- a/xen/include/asm-x86/x86_64/uaccess.h
>> +++ b/xen/include/asm-x86/x86_64/uaccess.h
>> @@ -1,6 +1,8 @@
>>  #ifndef __X86_64_UACCESS_H
>>  #define __X86_64_UACCESS_H
>>
>> +#include <xen/sched.h>
>
>This is pretty ugly. You only reference the needed fields in a macro,
>so you don't strictly need to include this here as long as all (or at least
>most - the rest could be patched up) use sites include it.

OK, I will give that a shot.

>> @@ -65,6 +67,11 @@ do {
>                       \
>>      case 8: __put_user_asm(x,ptr,retval,"q","","ir",errret);break;  \
>>      default: __put_user_bad();                                      \
>>      }                                                               \
>> +    if ( unlikely(current->arch.pv_vcpu.need_cr0_wp_set) ) \
>> +    { \
>> +        write_cr0(read_cr0() | X86_CR0_WP); \
>> +        current->arch.pv_vcpu.need_cr0_wp_set = 0; \
>> +    } \
>>  } while (0)
>
>Please obey to the original indentation method.

I thought tab characters were not allowed for indentation. But I guess you do 
not want tabs and spaces for indentation purposes to be mixed within a macro?

Thanks,
Aravindh


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.