[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v17 1/2] mem_sharing: fix sharability check during fork reset



On Sat, Apr 25, 2020 at 3:01 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Fri, Apr 24, 2020 at 06:18:24AM -0600, Tamas K Lengyel wrote:
> > On Fri, Apr 24, 2020 at 3:44 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> 
> > wrote:
> > >
> > > On Thu, Apr 23, 2020 at 08:30:06AM -0700, Tamas K Lengyel wrote:
> > > > When resetting a VM fork we ought to only remove pages that were 
> > > > allocated for
> > > > the fork during it's execution and the contents copied over from the 
> > > > parent.
> > > > This can be determined if the page is sharable as special pages used by 
> > > > the
> > > > fork for other purposes will not pass this test. Unfortunately during 
> > > > the fork
> > > > reset loop we only partially check whether that's the case. A page's 
> > > > type may
> > > > indicate it is sharable (pass p2m_is_sharable) but that's not a 
> > > > sufficient
> > > > check by itself. All checks that are normally performed before a page is
> > > > converted to the sharable type need to be performed to avoid removing 
> > > > pages
> > > > from the p2m that may be used for other purposes. For example, 
> > > > currently the
> > > > reset loop also removes the vcpu info pages from the p2m, potentially 
> > > > putting
> > > > the guest into infinite page-fault loops.
> > > >
> > > > Signed-off-by: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
> > >
> > > Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >
> > Thanks!
> >
> > >
> > > I've been looking however and I'm not able to spot where you copy the
> > > shared_info data, which I think it's also quite critical for the
> > > domain, as it contains the info about event channels (when using L2).
> > > Also for HVM forks the shared info should be mapped at the same gfn as
> > > the parent, or else the child trying to access it will trigger an EPT
> > > fault that won't be able to populate such gfn, because the shared_info
> > > on the parent is already shared between Xen and the parent.
> >
> > Pages that cause an EPT fault but can't be made shared get a new page
> > allocated for them and copied in mem_sharing_fork_page. There are many
> > pages like that, mostly due to QEMU mapping them and thus holding an
> > extra reference. That said, shared_info is manually copied during
> > forking in copy_special_pages, at the end of the function you will
> > find:
> >
> > old_mfn = _mfn(virt_to_mfn(d->shared_info));
> > new_mfn = _mfn(virt_to_mfn(cd->shared_info));
> >
> > copy_domain_page(new_mfn, old_mfn);
>
> Yes, that's indeed fine, you need to copy the contents of the shared
> info page, but for HVM you also need to make sure the shared info page
> is mapped at the same gfn as the parent. HVM guest issue a hypercall
> to request the mapping of the shared info page to a specific gfn,
> since it's not added to the guest physmap by default.
> XENMAPSPACE_shared_info is used in order to request the shared info
> page to be mapped at a specific gfn.
>
> AFAICT during fork such shared info mapping is not replicated to the
> child, and hence the child trying to access the gfn of the shared info
> page would just result in EPT faults that won't be fixed (because the
> parent shared info page cannot be shared with the child)?
>
> You should be able to use get_gpfn_from_mfn in order to get the parent
> gfn where the shared info mfn is mapped (if any), and then replicate
> this in the child using it's own shared info.
>
> On fork reset you should check if the child shared info gfn != parent
> shared info gfn and reinstate the parent state if different from the
> child.

OK, I see what you mean. In the way things set up currently the EPT
fault-loop problem doesn't happen since the page does get copied, it
just gets copied to a new mfn not the one d->shared_info points to. So
whatever issue that may bring it must be subtle because we haven't
noticed any instability.

Also, looking at the save/restore code in libxc it seems to me that
shared_info is actually a PV specific page and it doesn't get
saved/restored with an HVM domain. The hvm code paths don't process
REC_TYPE_SHARED_INFO at all. So since forks are exclusively for HVM
domains, do we really need it and if so, why doesn't HVM save/restore
need it?

Tamas



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.