
Re: [Xen-devel] Fix VGA logdirty related display freezes with altp2m



On Thu, Oct 25, 2018 at 9:08 AM Tamas K Lengyel
<tamas.k.lengyel@xxxxxxxxx> wrote:
>
> On Thu, Oct 25, 2018 at 9:02 AM Razvan Cojocaru
> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> >
> > On 10/25/18 5:55 PM, Tamas K Lengyel wrote:
> > > On Thu, Oct 25, 2018 at 8:24 AM Razvan Cojocaru
> > > <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> > >>
> > >> On 10/24/18 8:52 PM, Tamas K Lengyel wrote:
> > >>> On Wed, Oct 24, 2018 at 11:31 AM Tamas K Lengyel
> > >>> <tamas.k.lengyel@xxxxxxxxx> wrote:
> > >>>>
> > >>>> On Wed, Oct 24, 2018 at 11:20 AM Razvan Cojocaru
> > >>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> > >>>>>
> > >>>>> On 10/24/18 8:09 PM, Tamas K Lengyel wrote:
> > >>>>>> On Tue, Oct 23, 2018 at 6:37 AM Razvan Cojocaru
> > >>>>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> > >>>>>>>
> > >>>>>>> Tamas, could you please give this a spin?
> > >>>>>>>
> > >>>>>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take2
> > >>>>>>>
> > >>>>>>> It _should_ solve the crashes.
> > >>>>>>
> > >>>>>> Indeed, I no longer see the crash. However, there might be some
> > >>>>>> locking issue present because the whole system freezes up shortly
> > >>>>>> after starting DRAKVUF on a domain - within a couple seconds. I mean
> > >>>>>> Xen itself locks up: no response on the serial, dom0 screen frozen,
> > >>>>>> etc.
> > >>>>>
> > >>>>> Do you have any type of log / backtrace / way I could reproduce it
> > >>>>> without Drakvuf? All the ways I've tested it were fine (including
> > >>>>> xen-access).
> > >>>>
> > >>>> I don't have a standalone test that produces that error. With DRAKVUF
> > >>>> it is easily reproducible though. If you have a Windows guest
> > >>>> installed, setting up DRAKVUF should really not be much trouble. With
> > >>>> xen-access it indeed doesn't lock up but since the guest is pretty
> > >>>> much unresponsive during that test I can't verify whether the VGA
> > >>>> issue is now resolved or not. Also the xen-access tests are fairly
> > >>>> limited and don't use all aspects of altp2m.
> > >>>>
> > >>>
> > >>> What I see from the DRAKVUF log is that the last thing it prints is
> > >>> sending a vm_event response that both enables singlestepping and
> > >>> switches altp2m view. This looks to be consistent. It didn't matter if
> > >>> the guest had 1 or 2 vCPUs, the freeze occurs just the same. It's
> > >>> definitely racy because it doesn't happen right away; the system
> > >>> works as expected for a couple seconds.
> > >>
> > >> After having to install clang because my GCC couldn't build Drakvuf:
> > >>
> > >> ../../src/plugins/plugins.h:188:1: sorry, unimplemented: non-trivial
> > >> designated initializers not supported
> > >
> > > > Please follow the instructions for compiling it; clang is a
> > > > requirement. I don't even know how you got past the ./configure stage
> > > without clang being installed. You could also just copy-paste things
> > > from the travis script directly:
> > > https://github.com/tklengyel/drakvuf/blob/master/.travis.yml#L51
> > >
> > >>
> > >> then rekall via pip, then having to mount my Windows disk to do "rekal
> > >> peinfo", I finally gave up when "rekall fetch_pdb" couldn't find the
> > >> debug files on the Microsoft server. :)
> > >
> > > > If your version of Windows is that brand new then yes, Microsoft takes
> > > a couple days to publish their debug information and you will just
> > > have to wait or use an older version of Windows.
> > >
> > >>
> > >> So if you could find a way to reproduce the issue with a simple
> > >> libxc-based application alone (or at least with something
> > >> libvmi-related, which I do have set up), I'd really appreciate it.
> > >>
> > >> Or maybe try to hack around with patch no 3 of the series (for a start,
> > >> just revert it and see if the problem persists - of course the display
> > >> will freeze) and see if there's an easy fix?
> > >
> > > Unfortunately I won't have time to do either of these any time soon.
> > > If you are having that much trouble setting it up I can perhaps send
> > > > you a pre-compiled version with a version of Windows for which
> > > > Microsoft already published the debug info.
> >
> > It's a Windows 7 x64 guest. But the problem was that the right command
> > line is:
> >
> > rekall fetch_pdb ntkrnlmp
> >
> > instead of the suggested "rekall fetch_pdb ntkrpamp" on the drakvuf.com
> > website.
>
> The kernel filename is specific to the version of Windows you have
> installed. The instructions specify _an example_ for the 32-bit
> version of Windows 7 and you will need to adjust it according to the
> kernel filename. For 64-bit it is ntkrnlmp. The instructions explicitly
> say that you need to use the PDB filename that was printed for your
> specific kernel version.
>
> >
> > I'll try to continue - in any case should I have more trouble I'll
> > contact you privately so as not to spam the list. Just wanted to leave
> > this here in case someone else has this problem in the hope that it's
> > useful.
>
> Of course, also please feel free to open an issue on GitHub if you run
> into something that's blocking you. Chances are if you run into it,
> others would too :)

We can chalk the freeze issue up to buggy hardware on my side. We
couldn't reproduce the issue on two other systems. The screen issue is
definitely gone now which is awesome! :) Thanks Razvan!

Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

