
Re: [Xen-devel] VT-d async invalidation for Device-TLB.



>>> On 03.06.15 at 09:49, <quan.xu@xxxxxxxxx> wrote:
> Design Overview
> =============
> This design implements a non-spinning model for Device-TLB invalidation - 
> using 
> an interrupt-based mechanism. Each domain maintains an invalidation table, and 
> the hypervisor has an entry of invalidation tables. The invalidation table 

entry? Do you mean array or table?

> keeps the count of in-flight Device-TLB invalidation queues, and also 
> provides the same polling parameter for multiple in-flight Device-TLB 
> invalidation queues of each domain.

Which "same polling parameter"? I.e. I'm not sure what this is about
in the first place.

> When a domain issues a request to the Device-TLB invalidation queue, update 
> the invalidation table's count of in-flight Device-TLB invalidation queues 
> and assign the Status Data of the wait descriptor of the invalidation queue. 
> An interrupt is sent to the hypervisor once a Device-TLB invalidation 
> request is done. In the interrupt handler, we will schedule a soft-irq to do 
> the following check: 
>     if invalidation table's count of in-flight Device-TLB invalidation 
> queues == polling parameter:
>          This domain has no in-flight invalidation requests.
>     else
>          This domain has in-flight invalidation requests.
> The domain is put into the "blocked" state if it has in-flight Device-TLB 
> invalidation requests, and is woken when all the requests are done. A fault 
> event will be generated if an invalidation fails. We can either crash the 
> domain or crash Xen.

Crashing Xen can't really be considered an option except when you
can't contain the failed invalidation to a particular VM (which, from
what was written above, should never happen).
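For concreteness, the completion check described in the quoted design could be sketched like this (a minimal sketch; the structure and the names `iommu_invl_demo`, `in_flight_count` and `polling_param` are illustrative assumptions, not actual Xen identifiers):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the per-domain invalidation table. */
struct iommu_invl_demo {
    uint64_t in_flight_count; /* in-flight Device-TLB invalidation queues */
    uint64_t polling_param;   /* value the count equals when nothing is pending */
};

/* The check the soft-irq would perform: the domain is idle (and may be
 * woken) exactly when the in-flight count matches the polling parameter. */
bool domain_flushes_done(const struct iommu_invl_demo *t)
{
    return t->in_flight_count == t->polling_param;
}
```

Whether "polling parameter" really means a fixed comparison value is exactly the open question above; the sketch just encodes the literal reading.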

>     For Context Invalidation and IOTLB invalidation without Device-TLB 
> invalidation, the Invalidation Queue still flushes synchronously, as before 
> (this is a tradeoff, since the interrupt has an overhead cost).

DMAR_OPERATION_TIMEOUT being 1s, are you saying that you're
not intending to replace the current spinning for the non-ATS case?
Considering that expiring these loops results in panic()s, I would
expect these to become asynchronous _and_ contained to the
affected VM alongside the ATS-induced change in behavior. You
talk of overhead - can you quantify it?

> More details:
> 
> 1. invalidation table. We define an iommu_invl structure in the domain:
> struct iommu_invl {
>     volatile u64 iommu_invl_poll_slot:62;
>     domid_t dom_id;
>     u64 iommu_invl_status_data:32;
> } __attribute__((aligned(64)));
> 
>    iommu_invl_poll_slot: Set it equal to the status address of the wait 
> descriptor when the invalidation queue is used with a Device-TLB flush.
>    dom_id: Keep the id of the domain.
>    iommu_invl_status_data: Keep the count of in-flight queues with 
> Device-TLB invalidation.

Without further explanation above/below I don't think I really
understand the purpose of this structure, nor its organization: Is
this something imposed by the VT-d specification? If so, a reference
to the respective section in the spec would be useful. If not, I can't
see why the structure is laid out the (odd) way it is.
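For comparison, assuming the layout is not spec-mandated, a plain (non-bitfield) arrangement of the same fields might look like the sketch below; the field names follow the draft, everything else is an assumption:

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t demo_domid_t; /* stand-in for Xen's domid_t */

/* One cache line per entry; plain fields instead of the draft's 62-/32-bit
 * bit-fields, whose use on a u64 base type is implementation-defined in C
 * anyway. */
struct iommu_invl {
    volatile uint64_t iommu_invl_poll_slot; /* status address of wait descriptor */
    demo_domid_t dom_id;                    /* owning domain */
    uint32_t iommu_invl_status_data;        /* count of in-flight flushes */
} __attribute__((aligned(64)));
```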

> 2. Modification to Device IOTLB invalidation:
>     - Enabled interrupt notification when hardware completes the 
> invalidations: 
>         Set FN, IF and SW bits in Invalidation Wait Descriptor. The reason 

A good design document would either give a (short) explanation of
these bits, or at the very least a precise reference to where in the
spec they're defined. The way the VT-d spec is organized I
generally find it quite hard to locate the definition of specific fields
when I have only a vague reference in hand. Reading the doc
shouldn't require the reader to spend meaningful extra amounts
of time hunting down the corresponding pieces of the spec.
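For reference, per my reading of the VT-d specification's Invalidation Wait Descriptor section, the three flags could be composed as below; treat the bit positions as an assumption to be double-checked against the spec revision in use:

```c
#include <assert.h>
#include <stdint.h>

/* Invalidation Wait Descriptor, low 64 bits (bit positions as I read the
 * VT-d spec; verify before relying on them):
 *   [3:0]   type (0x5 for inv_wait)
 *   [4]     IF - raise an interrupt when the wait completes
 *   [5]     SW - write Status Data to the Status Address
 *   [6]     FN - fence: complete prior descriptors first
 *   [63:32] Status Data */
#define INV_WAIT_TYPE 0x5ULL
#define INV_WAIT_IF   (1ULL << 4)
#define INV_WAIT_SW   (1ULL << 5)
#define INV_WAIT_FN   (1ULL << 6)

uint64_t inv_wait_desc_lo(uint32_t status_data)
{
    return INV_WAIT_TYPE | INV_WAIT_IF | INV_WAIT_SW | INV_WAIT_FN |
           ((uint64_t)status_data << 32);
}
```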

> why the SW bit is also set is that the interrupt for notification is 
> global, not per domain. So we still need to poll the status address in the 
> interrupt handler to know which domain's flush request is completed.

With the above taken care of, I would then hope to also be able to
understand this (kind of an) explanation.

>     - A new per-domain flag (iommu_pending_flush) is used to track the flush 
> status of IOTLB invalidation with Device-TLB invalidation:
>         iommu_pending_flush will be set before flushing the Device-TLB 
> invalidation.

What is "flushing an invalidation" supposed to mean? I think there's
some problem with the wording here...

> 4. New interrupt handler for invalidation completion:
>     - When hardware completes the invalidations with Device IOTLB, it 
> generates an interrupt to notify the hypervisor.
>     - In the interrupt handler, we will schedule a soft-irq to handle the 
> finished invalidations.
>     - soft-irq to handle finished invalidations:
>         Scan the pending flush list
>           for each entry in the list
>             check the values of iommu_invl_poll_slot and 
> iommu_invl_status_data in each domain's invalidation table.
>             if they match, clear iommu_pending_flush and the invalidation 
> table, then wake up the domain.

Did you put some consideration into how long this list may get, and
hence how long it may take you to iterate through the entire list?
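To make the cost concrete: the scan is linear in the number of pending entries. A minimal sketch, where all types and names are hypothetical rather than the proposed implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node on the pending-flush list. */
struct pending_flush {
    uint64_t poll_slot_value; /* value hardware wrote via the wait descriptor */
    uint64_t status_data;     /* expected value once all flushes are done */
    bool blocked;             /* domain still paused on this flush */
    struct pending_flush *next;
};

/* One pass over the list: wake every domain whose status write has landed.
 * Returns how many were woken; cost is O(list length), which is the concern
 * raised above if many domains flush concurrently. */
unsigned int scan_pending_flushes(struct pending_flush *head)
{
    unsigned int woken = 0;
    for (struct pending_flush *p = head; p != NULL; p = p->next) {
        if (p->blocked && p->poll_slot_value == p->status_data) {
            p->blocked = false; /* stand-in for clear-flag + wakeup */
            woken++;
        }
    }
    return woken;
}
```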

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

