
Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device



Tim, thanks for your review.

>>> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote:
> Hi,
> 
> At 14:09 +0000 on 21 Sep (1442844587), Xu, Quan wrote:
> > George / Tim,
> > Could you help me review these memory patches? Thanks!
> 
> The interrupt-mapping and chipset control parts of this are outside my
> understanding. :)  And I'm not an x86/mm maintainer any more, but I'll
> have a look:
> 

I've heard so much about you from my teammates. :):)


> 7/13: I'm not convinced that making the vcpu spin calling sched_yield() is a
> very good plan.  Better to explicitly pause the domain if you need its vcpus
> not to run.  But first -- why does IOMMU flushing mean that vcpus can't be
> run?
> 

We need to ensure that the required Device-TLB flushes are applied before
returning to guest mode via hypercall completion; otherwise the domain can
still DMA to the freed pages.
For example, suppose the domain calls the do_memory_op hypercall to free a
page X (gfn --> mfn), and there is a stale mapping (gfn --> mfn) in the
Device-TLB. Once the vcpu has returned to guest mode, the domain can still
DMA to the freed page X.
The domain kernel must not use a page that is being freed; doing so would be
a domain kernel bug.
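
A minimal sketch of the required ordering (flush_device_tlb(),
wait_for_devtlb_flush() and gfn_to_mfn_sketch() are hypothetical helper
names, not functions from this series):

    /* Sketch only: the page may be handed back to the allocator only
     * after the Device-TLB flush has completed, otherwise a stale
     * gfn->mfn translation lets the ATS device DMA into a freed page. */
    static void free_guest_page_sketch(struct domain *d, unsigned long gfn)
    {
        unsigned long mfn = gfn_to_mfn_sketch(d, gfn); /* hypothetical lookup */

        iommu_unmap_page(d, gfn);            /* clear the VT-d PTE            */
        flush_device_tlb(d);                 /* hypothetical: queue ATS flush */
        wait_for_devtlb_flush(d);            /* must complete before ...      */
        free_domheap_page(mfn_to_page(mfn)); /* ... the page is reused        */
    }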

In fact, any time we want to reschedule, we need to raise SCHEDULE_SOFTIRQ,
which is then checked and serviced in do_softirq().
Alternatively, we could pause the domain and unpause it in the QI (Queued
Invalidation) interrupt handler, or block the vcpu and unblock all of the
domain's vcpus in the QI interrupt handler. Using sched_yield() was a
performance consideration.
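
Roughly, the yielding variant looks like the sketch below; devtlb_flush_done()
is a hypothetical stand-in for however the QI interrupt handler records
completion:

    /* Yield until the QI interrupt handler marks the flush complete.
     * raise_softirq()/do_softirq() are the standard Xen entry points;
     * devtlb_flush_done() is hypothetical. */
    static void wait_for_devtlb_flush(struct domain *d)
    {
        while ( !devtlb_flush_done(d) )
        {
            raise_softirq(SCHEDULE_SOFTIRQ); /* request a reschedule ... */
            do_softirq();                    /* ... checked and serviced */
        }
    }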



> 8/13: This doesn't seem like it's enough to make this safe. :(  Yes, you
> can't allocate a page to another VM while there are IOTLB entries pointing
> to it, but you also can't use it for other things inside the same domain!
> 
> It might be enough, if you can argue that e.g. the IOMMU tables only ever have
> mappings of pages owned by the domain, and that any other feature that might
> rely on the domain's memory being made read-only (e.g. sharing) is explicitly
> disallowed, but I'd want to see those things mechanically enforced.
> 



As mentioned above, Device-TLB flushes are applied before returning to guest
mode via hypercall completion.
If the hypercall has not completed, the freed page is not yet released back
to Xen. The domain kernel must not use a page that is being freed; doing so
would be a domain kernel bug.
So I think it is enough.   :):)
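
Deferring completion can reuse Xen's standard continuation mechanism; in this
sketch devtlb_flush_pending() is a hypothetical predicate:

    /* If the flush is still in flight, restart the hypercall instead of
     * returning to guest mode, so the freed page is not yet released
     * back to Xen. */
    if ( devtlb_flush_pending(d) )
        return hypercall_create_continuation(__HYPERVISOR_memory_op,
                                             "lh", cmd, arg);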




> I think the safest answer is to make the IOMMU table take typed refcounts to
> anything it points to, and only drop those refcounts when the flush completes,
> but I can imagine that becoming complex.
> 

Yes, I think so too.
I had a similar design that took a typed refcount and increased the page
refcount (as with PGT_pinned / PGC_allocated), but I did it only in the
page-freeing path.
I dropped that design, as it would have required introducing another flag to
indicate that the domain has an ATS device assigned (an ugly design).
I didn't make the IOMMU table take typed refcounts on everything it points
to; that is really complex.
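
For illustration, the typed-refcount scheme would look roughly like this,
using Xen's existing get_page_and_type()/put_page_and_type(); where exactly
to drop the reference from the QI interrupt handler is the complex part:

    /* Take a writable type ref when the page enters the IOMMU table ... */
    if ( !get_page_and_type(page, d, PGT_writable_page) )
        return -EINVAL;
    /* ... install the VT-d mapping here ... */

    /* ... and drop the ref only once the Device-TLB flush completes,
     * e.g. from the QI interrupt handler: */
    put_page_and_type(page);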

> You may also need to consider grant-mapped memory - you need to make sure
> the grant isn't released until after the flush completes.
> 
I can introduce a new grant type for this case (such as GNTMAP_iommu_dma).
Ending foreign access should _only_ be done once the remote domain has
unmapped the frame; gnttab_query_foreign_access(gref) will indicate the
state of any mapping.
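
On the granting (Linux guest) side, the check-before-release pattern is
roughly the following sketch:

    /* Tear down the grant only once no foreign mapping remains; a stale
     * Device-TLB entry in the mapping domain is what the proposed grant
     * type would have to guard against. */
    while ( gnttab_query_foreign_access(gref) )
        cpu_relax();                       /* remote side still mapped */
    gnttab_end_foreign_access(gref, 0 /* read-write */, page_addr);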

> 12/13: Ah, I see you are looking at grant table stuff, at least for the 
> transfer path. :)
> Still the mapping path needs to be looked at.
> 

Yes, I will look at the mapping path as well. :):)


> Cheers,
> 
> Tim.


Tim, thanks again. I really appreciate your review.


Quan



 

