[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5 2/3] x86/mm: allocate logdirty_ranges for altp2ms


  • To: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Wed, 14 Nov 2018 11:58:54 +0000
  • Autocrypt: addr=george.dunlap@xxxxxxxxxx; prefer-encrypt=mutual; keydata= xsFNBFPqG+MBEACwPYTQpHepyshcufo0dVmqxDo917iWPslB8lauFxVf4WZtGvQSsKStHJSj 92Qkxp4CH2DwudI8qpVbnWCXsZxodDWac9c3PordLwz5/XL41LevEoM3NWRm5TNgJ3ckPA+J K5OfSK04QtmwSHFP3G/SXDJpGs+oDJgASta2AOl9vPV+t3xG6xyfa2NMGn9wmEvvVMD44Z7R W3RhZPn/NEZ5gaJhIUMgTChGwwWDOX0YPY19vcy5fT4bTIxvoZsLOkLSGoZb/jHIzkAAznug Q7PPeZJ1kXpbW9EHHaUHiCD9C87dMyty0N3TmWfp0VvBCaw32yFtM9jUgB7UVneoZUMUKeHA fgIXhJ7I7JFmw3J0PjGLxCLHf2Q5JOD8jeEXpdxugqF7B/fWYYmyIgwKutiGZeoPhl9c/7RE Bf6f9Qv4AtQoJwtLw6+5pDXsTD5q/GwhPjt7ohF7aQZTMMHhZuS52/izKhDzIufl6uiqUBge 0lqG+/ViLKwCkxHDREuSUTtfjRc9/AoAt2V2HOfgKORSCjFC1eI0+8UMxlfdq2z1AAchinU0 eSkRpX2An3CPEjgGFmu2Je4a/R/Kd6nGU8AFaE8ta0oq5BSFDRYdcKchw4TSxetkG6iUtqOO ZFS7VAdF00eqFJNQpi6IUQryhnrOByw+zSobqlOPUO7XC5fjnwARAQABzSRHZW9yZ2UgVy4g RHVubGFwIDxkdW5sYXBnQHVtaWNoLmVkdT7CwYAEEwEKACoCGwMFCwkIBwMFFQoJCAsFFgID AQACHgECF4ACGQEFAlpk2IEFCQo9I54ACgkQpjY8MQWQtG1A1BAAnc0oX3+M/jyv4j/ESJTO U2JhuWUWV6NFuzU10pUmMqpgQtiVEVU2QbCvTcZS1U/S6bqAUoiWQreDMSSgGH3a3BmRNi8n HKtarJqyK81aERM2HrjYkC1ZlRYG+jS8oWzzQrCQiTwn3eFLJrHjqowTbwahoiMw/nJ+OrZO /VXLfNeaxA5GF6emwgbpshwaUtESQ/MC5hFAFmUBZKAxp9CXG2ZhTP6ROV4fwhpnHaz8z+BT NQz8YwA4gkmFJbDUA9I0Cm9D/EZscrCGMeaVvcyldbMhWS+aH8nbqv6brhgbJEQS22eKCZDD J/ng5ea25QnS0fqu3bMrH39tDqeh7rVnt8Yu/YgOwc3XmgzmAhIDyzSinYEWJ1FkOVpIbGl9 uR6seRsfJmUK84KCScjkBhMKTOixWgNEQ/zTcLUsfTh6KQdLTn083Q5aFxWOIal2hiy9UyqR VQydowXy4Xx58rqvZjuYzdGDdAUlZ+D2O3Jp28ez5SikA/ZaaoGI9S1VWvQsQdzNfD2D+xfL qfd9yv7gko9eTJzv5zFr2MedtRb/nCrMTnvLkwNX4abB5+19JGneeRU4jy7yDYAhUXcI/waS /hHioT9MOjMh+DoLCgeZJYaOcgQdORY/IclLiLq4yFnG+4Ocft8igp79dbYYHkAkmC9te/2x Kq9nEd0Hg288EO/OwE0EVFq6vQEIAO2idItaUEplEemV2Q9mBA8YmtgckdLmaE0uzdDWL9To 1PL+qdNe7tBXKOfkKI7v32fe0nB4aecRlQJOZMWQRQ0+KLyXdJyHkq9221sHzcxsdcGs7X3c 17ep9zASq+wIYqAdZvr7pN9a3nVHZ4W7bzezuNDAvn4EpOf/o0RsWNyDlT6KECs1DuzOdRqD oOMJfYmtx9hMzqBoTdr6U20/KgnC/dmWWcJAUZXaAFp+3NYRCkk7k939VaUpoY519CeLrymd Vdke66KCiWBQXMkgtMGvGk5gLQLy4H3KXvpXoDrYKgysy7jeOccxI8owoiOdtbfM8TTDyWPR Ygjzb9LApA8AEQEAAcLBZQQYAQoADwIbDAUCWmTXMwUJB+tP9gAKCRCmNjwxBZC0bb+2D/9h jn1k5WcRHlu19WGuH6q0Kgm1LRT7PnnSz904igHNElMB5a7wRjw5kdNwU3sRm2nnmHeOJH8k Yj2Hn1QgX5SqQsysWTHWOEseGeoXydx9zZZkt3oQJM+9NV1VjK0bOXwqhiQyEUWz5/9l467F S/k4FJ5CHNRumvhLa0l2HEEu5pxq463HQZHDt4YE/9Y74eXOnYCB4nrYxQD/GSXEZvWryEWr eDoaFqzq1TKtzHhFgQG7yFUEepxLRUUtYsEpT6Rks2l4LCqG3hVD0URFIiTyuxJx3VC2Ta4L H3hxQtiaIpuXqq2D4z63h6vCx2wxfZc/WRHGbr4NAlB81l35Q/UHyMocVuYLj0llF0rwU4Aj iKZ5qWNSEdvEpL43fTvZYxQhDCjQTKbb38omu5P4kOf1HT7s+kmQKRtiLBlqHzK17D4K/180 ADw7a3gnmr5RumcZP3NGSSZA6jP5vNqQpNu4gqrPFWNQKQcW8HBiYFgq6SoLQQWbRxJDHvTR YJ2ms7oCe870gh4D1wFFqTLeyXiVqjddENGNaP8ZlCDw6EU82N8Bn5LXKjR1GWo2UK3CjrkH pTt3YYZvrhS2MO2EYEcWjyu6LALF/lS6z6LKeQZ+t9AdQUcILlrx9IxqXv6GvAoBLJY1jjGB q+/kRPrWXpoaQn7FXWGfMqU+NkY9enyrlw==
  • Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • Delivery-date: Wed, 14 Nov 2018 11:59:23 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 11/13/18 6:43 PM, Razvan Cojocaru wrote:
> On 11/13/18 7:57 PM, George Dunlap wrote:
>> On 11/11/18 2:07 PM, Razvan Cojocaru wrote:
[snip]
>> I think everything above here could usefully be in its own patch; it
>> would make it easier to verify that there were no functional changes in
>> the refactoring.
> 
> Right, I'll split this patch then.

Thanks.

>>> @@ -2341,6 +2380,7 @@ int p2m_destroy_altp2m_by_id(struct domain *d, 
>>> unsigned int idx)
>>>          {
>>>              p2m_flush_table(d->arch.altp2m_p2m[idx]);
>>>              /* Uninit and reinit ept to force TLB shootdown */
>>> +            p2m_free_logdirty(d->arch.altp2m_p2m[idx]);
>>>              ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
>>>              ept_p2m_init(d->arch.altp2m_p2m[idx]);
>>>              d->arch.altp2m_eptp[idx] = mfn_x(INVALID_MFN);
>>
>> (In case I forget: Also, this is called without holding the appropriate
>> p2m lock. )
> 
> Could you please provide more details? I have assumed that at the point
> of calling a function called p2m_destroy_altp2m_by_id() it should be
> safe to tear the altp2m down without further precaution.

Are you absolutely positive that at this point there's no way anywhere
else in Xen might be doing something with this p2m struct?

If so, then 1) there should be a comment there explaining why that's the
case, and 2) ideally we should refactor p2m_flush_table such that we can
call what's now p2m_flush_table_locked() without the lock.

> I think you're saying that I should p2m_lock(d->arch.altp2m_p2m[idx])
> just for the duration of the p2m_free_logdirty() call?

The same argument really goes for ept_p2m_uninit/init -- uninit actually
frees a data structure; if anyone else may be using that, you'll run
into a use-after-free bug.  (Although that really needs to be changed as
well -- freeing and re-allocating a structure just to set all the bits
is ridiculous.)

If we need locking, then I'd grab the p2m lock before p2m_flush_table()
(calling p2m_flush_table_locked() instead), and release it after the
ept_p2m_init().

I realize you didn't write this code, and so I'm not holding you
responsible for all the changes I mentioned above.  But if we're going
to add the p2m_free_logdirty() call, we do need to either grab the lock
or add a comment explaining why it's not necessary; we might as well fix
it properly at the same time.

p2m_flush_table() already grabs and releases the lock; so grabbing the
lock over all four calls won't add any more overhead (or risk of
deadlock) than what we already have.

>> I'm a bit suspicious of long strings of these sorts of functions in the
>> middle of another function.  It turns out that there are three copies of
>> this sequence of function calls (p2m_flush_table -> ept_p2m_uninit ->
>> ept_p2m_init):
>>
>> * Here (p2m_destroy_altp2m_id), when the user asks for the alt2m index
>> to be destroyed
>>
>> * In p2m_flush_altp2m(), which is called when altp2m is disabled for a
>> domain
>>
>> * In p2m_reset_altp2m(), which is called when an entry in the hostp2m is
>> set to INVALID_MFN.
>>
>> Presumably in p2m_reset_altp2m() we don't want to call
>> p2m_free_logdirty(), as the altp2m is still active and we want to keep
>> the logdirty ranges around.  But in p2m_flush_altp2m(), I'm pretty sure
>> we do want to discard them: when altp2m is enabled again,
>> p2m_init_logdirty() will return early, leaving the old rangesets in
>> place; if the hostp2m rangesets have changed between the time altp2m was
>> disabled and enabled again, the rangeset_merge() may have incorrect results.
> 
> I'll call p2m_free_logdirty() in p2m_flush_altp2m() as well.

I was more thinking of refactoring those two to share the same code, and
potentially having p2m_reset_altp2m() share the same code as well.  The
reason you missed the p2m_flush_altp2m() there was because of the code
duplication.

Or alternately...

>>> Is there any particular reason we allocate the p2m structures on domain
>>> creation, but not logdirty range structures?  It seems like allocating
>>> altp2m structures on-demand, rather than at domain creation time, might
>>> make a lot of the reasoning here simpler.
>>
>> I assume that this question is not addressed to me, since I'm not able
>> to answer it - I can only assume that having less heap used has been
>> preferred.

I'm asking you because you've recently been going through this code, and
probably have at least as much familiarity with it as I do.  I can't
immediately see any reason to allocate them at domain creation time.
Maybe you can and maybe you can't, but I won't know until I ask. :-)

> Actually I now realize that you're asking why the hostp2m rangeset is
> created via paging_domain_init() in arch_domain_create() (so immediately
> on domain creation) while I'm allocating the altp2m rangesets on altp2m
> init.
>
> I'm doing that to save memory, since we can have MAX_ALTP2M altp2ms
> (which is currently 10), and only two active altp2ms - that means that I
> would allocate 10 rangesets and only use two. In fact we're currently
> only using 2 altp2ms and the hostp2m for our #VE work. That saves the
> space required for 8 rangesets. If that's not much, or if you think that
> the benefits of allocating them early outweigh the costs we can switch
> to allocating them on domain creation, like the hostp2m, and perhaps
> always keeping them in sync.

On the contrary, I was thinking of leaving the altp2m_p2m[N] NULL until
it becomes used; and at that point allocating both the p2m structure and
the logdirty rangesets; and when deactivating altp2m_p2m[N], freeing
both the logdirty rangesets and the p2m structure.

One advantage of that is that we'd reduce the amount of memory used --
not just for you, but for the vast majority of people who aren't using
the altp2m functionality; the other advantage is that it simplifies the
disable / enable logic: Everything that needs to be done is done in one
place, rather than half in one place and half in another.

I don't necessarily expect you to do that refactoring, but as you're
familiar with the code, and have the most investment in its future, it
makes sense to discuss the possibilities with you. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.