[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: “Backend has not unmapped grant” errors



On 24.08.22 02:20, Marek Marczykowski-Górecki wrote:
FWIW, I hit this issue twice already in this week CI run, while it never
happened before. The difference compared to previous run is Linux
5.15.57 vs 5.15.61. The latter reports persistent grants disabled. The
only related commits I see there are three commits indeed related to
persistent grants:

   c98e956ef489 xen-blkfront: Apply 'feature_persistent' parameter when connect
   ef26b5d530d4 xen-blkback: Apply 'feature_persistent' parameter when connect
   7304be4c985d xen-blkback: fix persistent grants negotiation

But none of the commit messages suggests intentional disabling it
without explicit request for doing so. I did not requested disabling it
in toolstack (although I have set backend as "trusted" - XSA-403).
I have confirmed it's the frontend version that matters. Running older
frontend kernel with 5.15.61 backend results in persistent grants
enabled (and both frontend and backend xenstore "feature-persistent"
entries are "1" in this case).

This is a mess.

I think the main problem seems to be that the feature negotiation process
isn't specified in a sane way.

From the blkif.h header:

Backend-side:
 * feature-persistent
 *      Values:         0/1 (boolean)
 *      Default Value:  0
 *      Notes: 7
 *
 *      A value of "1" indicates that the backend can keep the grants used
 *      by the frontend driver mapped, so the same set of grants should be
 *      used in all transactions. The maximum number of grants the backend
 *      can map persistently depends on the implementation, but ideally it
 *      should be RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST. Using this
 *      feature the backend doesn't need to unmap each grant, preventing
 *      costly TLB flushes. The backend driver should only map grants
 *      persistently if the frontend supports it. If a backend driver chooses
 *      to use the persistent protocol when the frontend doesn't support it,
 *      it will probably hit the maximum number of persistently mapped grants
 *      (due to the fact that the frontend won't be reusing the same grants),
 *      and fall back to non-persistent mode. Backend implementations may
 *      shrink or expand the number of persistently mapped grants without
 *      notifying the frontend depending on memory constraints (this might
 *      cause a performance degradation).

Frontend-side:
 * feature-persistent
 *      Values:         0/1 (boolean)
 *      Default Value:  0
 *      Notes: 7, 8, 9
 *
 *      A value of "1" indicates that the frontend will reuse the same grants
 *      for all transactions, allowing the backend to map them with write
 *      access (even when it should be read-only). If the frontend hits the
 *      maximum number of allowed persistently mapped grants, it can fallback
 *      to non persistent mode. This will cause a performance degradation,
 *      since the the backend driver will still try to map those grants
 *      persistently. Since the persistent grants protocol is compatible with
 *      the previous protocol, a frontend driver can choose to work in
 *      persistent mode even when the backend doesn't support it.

Those definitions don't make clear, which side is the one to decide whether
the feature should be used or not. In my understanding the related drivers
should just advertise their setting (the _ability_ to use the feature), and
it should be used only if both sides have written a "1".

With above patches applied, the frontend will set 'feature-persistent' in
Xenstore only, if the backend has done so, but the backend will set it
only, if the frontend has done it. This results in persistent grants
always being disabled.

This is wrong, as the value written should not reflect the current state
of the interface. That state should be set according to both sides' value,
probably a cached one on the blkback side (using a new flag for caching it,
not the current state).

The blkif.h comments should be updated to make it clear that the values in
Xenstore don't reflect the state of the connection, but the availability of
the feature in the related driver.

Comments?


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.