
Re: [Xen-devel] Linux grant map/unmap improvement proposal (Draft B)

On 13/10/2014 14:41, David Vrabel wrote:
Packets with foreign pages from other sources cannot be successfully
copied, since netback does not know the grant reference.  Once such
"... One such"
configuration is a VM providing an iSCSI or other network-based
storage that presents a block device in the backend that is then used
by another VM on the same host.
If the packet coming from the storage target VM is delivered to L3 in Dom0's stack, the foreign pages will be swapped out with local copies. That's a feature of the zerocopy framework used by netback, mostly due to fears that strange things can happen in and above the IP layer. So unless the storage backend in Dom0 implements its own TCP/IP stack and uses the vifX.Y device directly, it probably won't see foreign frames from the storage target. Of course it wouldn't be smart to rely on this in the long term; it would be good to remove that copy.

Or do you mean the other direction, when the guest using this storage writes to it, and that data is mapped by the block backend and used to construct an SKB? (By the time I finished the sentence I realized you meant this scenario, but I'll leave the above comments just for the sake of clarification.)

Blkback and network storage

Blkback unmaps the foreign pages in an I/O request when the request is
completed.  If networked storage is used it is possible for requests
to be completed while the skbs referring to those pages are still
queued for transmit (e.g., because a retransmission was queued while
the response to the original packet was in flight).

When the network driver attempts to send the packet with the unmapped
page it may:

- Fault while trying to access the unmapped page.

- Transmit from a frame that is no longer granted (potentially
   transmitting sensitive guest or Xen data).

The fault does not occur with userspace storage backends since gntdev
replaces the foreign mapping with one to a local scratch page.  It
uses GNTTABOP_unmap_and_replace which atomically replaces the foreign
mapping with another (source) mapping.  However, this cannot be used
with batched operations since it clears the source mapping, and it does
not protect against transmitting from a non-granted frame.

Safe grant unmap

Grant references will only be unmapped when they are no longer in use,
i.e., when the page reference count is one.

     int gnttab_unmap_refs_async(struct gnttab_unmap_grant_ref *unmap_ops,
         struct gnttab_unmap_grant_ref *kunmap_ops,
         struct page **pages, unsigned int count,
         void (*done)(void *data), void *data);

The `gnttab_unmap_refs_async()` function will unmap the grant
references using the supplied unmap operations and call `done(data)`.
The grant unmap will only be done once all pages are no longer in use.
I'm a bit confused about this function. I guess it checks the refcount before unmapping. But then what does the done(data) function do?

It shall run synchronously on the first attempt (this is expected to
be the most common case).  If any page is in use, it shall queue the
unmap request to be tried at a later time.
Who will own this queue? The caller (e.g. blkback)? How often should it retry? Is the retry triggered by a timer?

Only the blkback and gntdev devices need to use asynchronous unmaps.
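
The try-once-then-queue behaviour described above can be sketched in
userspace C roughly as follows.  This is only an illustration, not the
proposed kernel code: `mock_page`, `unmap_request`, `try_unmap()`,
`unmap_async()`, `retry_pending()` and `set_flag()` are hypothetical
names, the reference count is a plain integer, and the grant unmap
hypercall is replaced by clearing a flag.  The draft leaves open who owns
the queue and what triggers a retry, so the sketch assumes a single global
list that the caller drains explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct mock_page {
    int refcount;   /* 1 means only the mapper still holds a reference */
    bool mapped;    /* true while the (simulated) grant mapping exists */
};

/* Hypothetical deferred-unmap request. */
struct unmap_request {
    struct mock_page **pages;
    unsigned int count;
    void (*done)(void *data);
    void *data;
    struct unmap_request *next;     /* link in the pending list */
};

static struct unmap_request *pending;   /* requests waiting for a retry */

/* Example completion callback: sets the bool that data points to. */
static void set_flag(void *data)
{
    *(bool *)data = true;
}

/* Unmap only if no page is still in use; returns true if it unmapped. */
static bool try_unmap(struct unmap_request *req)
{
    unsigned int i;

    for (i = 0; i < req->count; i++)
        if (req->pages[i]->refcount > 1)
            return false;               /* e.g. an skb still refers to it */

    for (i = 0; i < req->count; i++)
        req->pages[i]->mapped = false;  /* stands in for the hypercall */

    req->done(req->data);
    return true;
}

/* The first attempt is synchronous; on failure the request is queued. */
static void unmap_async(struct unmap_request *req)
{
    assert(req->done != NULL);
    if (try_unmap(req))
        return;
    req->next = pending;
    pending = req;
}

/* Retry each queued request once; returns how many are still pending. */
static unsigned int retry_pending(void)
{
    struct unmap_request **link = &pending;
    unsigned int left = 0;

    while (*link) {
        struct unmap_request *req = *link;
        struct unmap_request *next = req->next; /* done() may release req */

        if (try_unmap(req)) {
            *link = next;               /* completed: drop from the list */
        } else {
            link = &req->next;
            left++;
        }
    }
    return left;
}
```

In this sketch `done(data)` is the only notification the caller gets that
the pages are really unmapped, so a caller such as blkback would free or
reuse the pages from that callback rather than right after calling
`unmap_async()`.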


Identifying foreign pages

A new page flag is introduced: PG_foreign.  This will alias PG_pinned
so it does not require an additional bit.

If PG_foreign is set then `page->private` contains the grant reference
and domid for this foreign page.  This information can only be packed
into an unsigned long on 64-bit platforms.  32-bit platforms will have
to allocate an additional structure to store the domid and gref.
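
As an illustration of the 64-bit case: a grant reference is 32 bits wide
and a domid is 16 bits wide, so together they need 48 bits.  That fits in
a 64-bit word but not in a 32-bit one.  The layout below (gref in bits
0-31, domid in bits 32-47) and the helper names are assumptions made for
this sketch; the draft does not specify an encoding.

```c
#include <assert.h>
#include <stdint.h>

/* 48 bits of payload must fit in the 64-bit word used for storage. */
static_assert(sizeof(uint32_t) + sizeof(uint16_t) <= sizeof(uint64_t),
              "gref + domid must fit in 64 bits");

/* Assumed layout: bits 0-31 hold the gref, bits 32-47 hold the domid. */
static inline uint64_t foreign_pack(uint16_t domid, uint32_t gref)
{
    return ((uint64_t)domid << 32) | gref;
}

/* Recover the domid from a packed value. */
static inline uint16_t foreign_domid(uint64_t priv)
{
    return (uint16_t)(priv >> 32);
}

/* Recover the grant reference from a packed value. */
static inline uint32_t foreign_gref(uint64_t priv)
{
    return (uint32_t)priv;
}
```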

The aliasing of PG_foreign and PG_pinned is safe because:

- Page table pages will never be foreign.
- Foreign pages shall have `p2m[P] & FOREIGN_FRAME_BIT`.

The use of the private field is safe because:

- The page is allocated by the balloon driver and thus it owns the
   private field.

- The other fields in the union (ptl, slab_cache, and first_page) will
   not be used because the page is not used in a page table, slab or
   compound page.

This flag sounds similar to the flag used in the classic kernel for netback grant mapping. Would it be accepted upstream? Would aliasing PG_pinned make sure of that?

Netback can thus:

1. Test PG_foreign.
2. Verify that the page is foreign via the p2m.
3. Extract the domid and gref from page->private.

The PG_foreign test is not strictly necessary as the p2m lookup is
sufficient, but it should be quicker for non-foreign pages.
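
The three steps above could look roughly like this in a userspace mock-up.
Everything here is a stand-in: `mock_page`, `mock_p2m`, `MOCK_PG_FOREIGN`,
`MOCK_FOREIGN_FRAME` and `foreign_lookup()` are hypothetical names, and
the encoding of `private` (gref in bits 0-31, domid in bits 32-47) is an
assumption for the 64-bit case, not something the draft defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_PG_FOREIGN     (1u << 0)       /* stand-in for PG_foreign */
#define MOCK_FOREIGN_FRAME  (1ull << 63)    /* stand-in for FOREIGN_FRAME_BIT */
#define MOCK_NR_PFNS        4

/* Hypothetical stand-in for struct page. */
struct mock_page {
    unsigned int flags;
    uint64_t private;   /* assumed: bits 0-31 gref, bits 32-47 domid */
};

/* Hypothetical p2m table, indexed by pfn. */
static uint64_t mock_p2m[MOCK_NR_PFNS];

/* Returns true and fills *domid and *gref only for a foreign page. */
static bool foreign_lookup(const struct mock_page *page, unsigned int pfn,
                           uint16_t *domid, uint32_t *gref)
{
    assert(pfn < MOCK_NR_PFNS);

    /* 1. Cheap flag test: rejects most non-foreign pages early. */
    if (!(page->flags & MOCK_PG_FOREIGN))
        return false;

    /* 2. The flag is aliased, so confirm via the p2m entry. */
    if (!(mock_p2m[pfn] & MOCK_FOREIGN_FRAME))
        return false;

    /* 3. Extract the domid and gref from the private field. */
    *domid = (uint16_t)(page->private >> 32);
    *gref = (uint32_t)page->private;
    return true;
}
```

Step 2 is what makes the aliasing safe in the sketch: a page with the
flag bit set but no foreign bit in its p2m entry is rejected.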
