
Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early


  • To: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Fri, 16 Nov 2018 17:59:15 +0000
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 16 Nov 2018 17:59:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 11/16/18 2:10 PM, Razvan Cojocaru wrote:
> On 11/16/18 2:03 PM, George Dunlap wrote:
>> The code is definitely complicated enough, though, that I may have
>> missed something, which is why I asked Razvan if there was a reason he
>> changed it.
>>
>> For the purposes of this patch, I propose having p2m_altp2m_init_ept()
>> set max_mapped_pfn to 0 (if that works), and leaving "get rid of
>> max_remapped_pfn" for a future clean-up series.
> 
> I've retraced my previous analysis and re-ran some tests, and I now
> remember (sorry it took a while) why the p2m->max_mapped_pfn =
> hostp2m->max_mapped_pfn was both necessary and not accidental.
> 
> Let's say we set it to 0 in p2m_altp2m_init_ept(). Then,
> hap_track_dirty_vram() calls p2m_change_type_range(), which calls the
> newly added change_type_range().
> 
> change_type_range() looks like this:
> 
> static void change_type_range(struct p2m_domain *p2m,
>                               unsigned long start, unsigned long end,
>                               p2m_type_t ot, p2m_type_t nt)
> {
>     unsigned long gfn = start;
>     struct domain *d = p2m->domain;
>     int rc = 0;
> 
>     p2m->defer_nested_flush = 1;
> 
>     if ( unlikely(end > p2m->max_mapped_pfn) )
>     {
>         if ( !gfn )
>         {
>             p2m->change_entry_type_global(p2m, ot, nt);
>             gfn = end;
>         }
>         end = p2m->max_mapped_pfn + 1;
>     }
>     if ( gfn < end )
>         rc = p2m->change_entry_type_range(p2m, ot, nt, gfn, end - 1);
>     if ( rc )
>     {
>         printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx] from %d to %d\n",
>                rc, d->domain_id, start, end - 1, ot, nt);
>         domain_crash(d);
>     }
> 
>     switch ( nt )
>     {
>     case p2m_ram_rw:
>         if ( ot == p2m_ram_logdirty )
>             rc = rangeset_remove_range(p2m->logdirty_ranges, start, end - 1);
>         break;
>     case p2m_ram_logdirty:
>         if ( ot == p2m_ram_rw )
>             rc = rangeset_add_range(p2m->logdirty_ranges, start, end - 1);
>         break;
>     default:
>         break;
>     }
>     if ( rc )
>     {
>         printk(XENLOG_G_ERR "Error %d manipulating Dom%d's log-dirty ranges\n",
>                rc, d->domain_id);
>         domain_crash(d);
>     }
> 
>     p2m->defer_nested_flush = 0;
>     if ( nestedhvm_enabled(d) )
>         p2m_flush_nestedp2m(d);
> }
> 
> If we set p2m->max_mapped_pfn to 0, we're guaranteed to run into the if
> ( unlikely(end > p2m->max_mapped_pfn) ) body, where end =
> p2m->max_mapped_pfn + 1; will make end 1.
> 
> Then, we will crash the hypervisor in rangeset_add_range(), where
> there's an ASSERT() stating that start <= end.

Ah, right, this was the original crash that you ran into several months
ago, which flagged up the whole logdirty range synchronization issue.

But that's partly a logic hole in change_type_range(), which
assumes that start < p2m->max_mapped_pfn.  It would be better to fix
that than to work around it by changing the meaning of max_mapped_pfn.
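
To make the hole concrete, here is a minimal standalone model of the
clipping arithmetic in the quoted function. It is not Xen code:
clip_like_original(), struct clipped_range and range_is_valid() are
hypothetical names made up for illustration, and only the clipping step
and the [start, end - 1] rangeset bounds are taken from the function
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bounds handed to the logdirty rangeset. */
struct clipped_range {
    unsigned long first;   /* first gfn of the inclusive range (start)   */
    unsigned long last;    /* last gfn of the inclusive range (end - 1)  */
};

/*
 * Model of the quoted clipping: "end" is clipped to max_mapped_pfn + 1,
 * but "start" is left untouched, and the rangeset is then updated with
 * [start, end - 1].
 */
struct clipped_range clip_like_original(unsigned long start,
                                        unsigned long end,
                                        unsigned long max_mapped_pfn)
{
    struct clipped_range r;

    assert(start < end);            /* callers pass a non-empty [start, end) */

    if ( end > max_mapped_pfn )
        end = max_mapped_pfn + 1;   /* may now be <= start */

    r.first = start;
    r.last = end - 1;
    return r;
}

/* The condition that rangeset_add_range() asserts on, per the quoted mail. */
bool range_is_valid(struct clipped_range r)
{
    return r.first <= r.last;
}
```

With an assumed VRAM range of [0xf0000, 0xf0800) and max_mapped_pfn == 0,
this yields first == 0xf0000 and last == 0, i.e. the start > end case that
trips the ASSERT().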

On the other hand, we want the logdirty rangesets to actually match the
host's rangesets; using altp2m->max_mapped_pfn for this is clearly
wrong. The easiest fix would be just to explicitly use the host's
max_mapped_pfn when calculating the clipping.  A more complete fix would
involve calculating two different ranges -- a "rangeset" range and an
"invalidate" range, the second of which would be clipped on altp2ms by
{min,max}_remapped_gfn.
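
As a rough illustration of the two-range idea (a sketch under stated
assumptions, not the contents of the attached patches): struct gfn_range,
rangeset_range() and invalidate_range() below are hypothetical names, and
the host's max_mapped_pfn and the altp2m's {min,max}_remapped_gfn are
passed in as plain parameters rather than read from struct p2m_domain.

```c
#include <assert.h>

/* Half-open GFN range [start, end); start == end means "empty". */
struct gfn_range {
    unsigned long start;
    unsigned long end;
};

/*
 * "Rangeset" range: clip the request against the HOST p2m's
 * max_mapped_pfn, so that an altp2m records the same logdirty range
 * as the host would.
 */
struct gfn_range rangeset_range(unsigned long start, unsigned long end,
                                unsigned long host_max_mapped_pfn)
{
    struct gfn_range r = { start, end };

    assert(start <= end);
    /* Written as "end - 1 > max" so that max + 1 cannot overflow. */
    if ( r.end > 0 && r.end - 1 > host_max_mapped_pfn )
        r.end = host_max_mapped_pfn + 1;
    if ( r.start > r.end )
        r.start = r.end;            /* nothing mapped in this range */
    return r;
}

/*
 * "Invalidate" range: on an altp2m, clip the rangeset range further to
 * the inclusive window [min_remapped_gfn, max_remapped_gfn].
 */
struct gfn_range invalidate_range(struct gfn_range r,
                                  unsigned long min_remapped_gfn,
                                  unsigned long max_remapped_gfn)
{
    if ( r.start < min_remapped_gfn )
        r.start = min_remapped_gfn;
    if ( r.end > 0 && r.end - 1 > max_remapped_gfn )
        r.end = max_remapped_gfn + 1;
    if ( r.start > r.end )
        r.start = r.end;            /* no remapped gfns: nothing to do */
    return r;
}
```

In this model the rangeset update always uses the first range, and only
the entry invalidation uses the second; an empty second range simply
means no entries in that view need touching.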

Something like the attached (compile-tested only).  I'm partial to
having both patches applied, but I'd be open to arguments that we should
only use the first.

 -George

Attachment: 0001-p2m-Always-use-hostp2m-when-clipping-rangesets.patch
Description: Text Data

Attachment: 0002-p2m-change_range_type-Only-invalidate-remapped-gfns.patch
Description: Text Data
