Re: A KernelShark plugin for Xen traces analysis


  • To: Steven Rostedt <rostedt@xxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 14 Apr 2021 21:05:00 +0100
  • Cc: Giuseppe Eletto <giuseppe.eletto@xxxxxxxxxxxx>, <linux-trace-devel@xxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Dario Faggioli" <dfaggioli@xxxxxxxx>, Enrico Bini <enrico.bini@xxxxxxxx>
  • Delivery-date: Wed, 14 Apr 2021 20:17:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14/04/2021 14:43, Steven Rostedt wrote:
>> This causes major problems for `perf` support under Xen, which assumes
>> that the kernel's idea of CPUs matches that of the system.
> Things are different with KernelShark.

That is very encouraging to hear.

>> When rendering a trace including Xen data, Xen can provide the real
>> system CPUs, and dom0 wants to be rendered as a VM under Xen, similar to
>> trace-Fedora21 in your screenshot above.  (Obviously, if you're doing
>> nested virt, things need to start nesting.)
> Right.
>
> What I envision is that you would produce a
> set of tracing files: one for each guest (including Dom0), and one for the
> Xen hypervisor itself. The trick is to have a way to synchronize the time
> stamps. What we just did with KVM is to have all the tracing record the
> CPU's TSC, including the shift and multiplier that the CPU might change for
> the guests. Then we have a way to convert the TSC to nanoseconds. This way
> all tracing data has the same clock. It's somewhat complicated to get
> right, and requires access to how the guests' clocks are modified by the CPU.

Hmm.  In the past, I have had success by modifying Xen to refuse any
shift/scale settings, at which point VMs and the hypervisor have
directly-comparable raw TSC values.

Xen certainly has enough information to describe what TSC rate/epoch
each guest is seeing, but I doubt any of this is coherently exposed at
the moment.
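
To make the conversion under discussion concrete, here is a minimal sketch of the usual fixed-point TSC-to-nanoseconds arithmetic. It is illustrative only: the function names, the 32-bit shift and the per-guest `offset` parameter are assumptions made for this example, not the format that Xen, KVM or trace-cmd actually record.

```python
NSEC_PER_SEC = 1_000_000_000


def mult_for_freq(tsc_hz, shift=32):
    """Fixed-point multiplier such that (ticks * mult) >> shift yields ns."""
    return (NSEC_PER_SEC << shift) // tsc_hz


def tsc_to_ns(tsc, mult, shift=32, offset=0):
    """Convert a raw TSC reading to nanoseconds.

    `offset` is a hypothetical per-guest TSC offset; it is subtracted
    first so that guest and host readings share the same epoch.
    """
    return ((tsc - offset) * mult) >> shift


# Example: with a 2 GHz TSC, 2e9 ticks correspond to one second.
mult = mult_for_freq(2_000_000_000)
host_ns = tsc_to_ns(2_000_000_000, mult)

# A guest whose TSC reads 500 ticks ahead of the host's (assumed offset).
guest_ns = tsc_to_ns(2_000_000_500, mult, offset=500)
```

If the hypervisor refuses all shift/scale settings, `mult`, `shift` and `offset` are the same for every trace, so raw TSC values can be compared directly and the conversion is only needed for display.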

> For KVM, each machine has a unique id and is stored in the trace.dat files.
> We have the host store a mapping of what thread represents which guest VCPU
> (virtual CPU). Then the "-a" option tells KernelShark to append the
> tracing data as a dependency. I would imagine we can have something like
> this:
>
>  kernelshark xen.dat -a trace-dom0.dat -a trace-guest1.dat -a trace-guest2.dat
>
> The Xen plugin would then need to read how the threads in xen.dat map
> to the virtual CPUs of each of the guest files, which would give you the
> layering.

Looks good.  I suspect we might need to do a little work on Xen's trace
data to make this mesh together nicely.  In particular, Xen doesn't have
a terribly good scheme for unique IDs for "a VM".

We've got domain IDs, which are Xen's unique instances of a running
"thing", but they change across VM reboot/migrate/etc.  I suspect we
have some atomicity problems with unique identification information and
VM-fork too.

There is a UUID field, but we leave that entirely up to the toolstack to
manage.  (A good test for naive toolstack code comes with a localhost
live migrate, because suddenly the toolstack is presented with one
logical VM (=> one UUID) and two concurrent domids.)
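
As a rough illustration of why this matters for any tool that correlates traces, here is a small sketch of a registry that tolerates several concurrent domids per UUID. The class and method names are hypothetical and do not correspond to any Xen or toolstack API; it only shows the one-UUID-to-many-domids shape of the data.

```python
from collections import defaultdict


class DomainRegistry:
    """Track which domids currently belong to each logical VM (UUID)."""

    def __init__(self):
        self._domids_by_uuid = defaultdict(set)
        self._uuid_by_domid = {}

    def domain_created(self, domid, uuid):
        # A UUID may legitimately gain a second domid, e.g. during a
        # localhost live migrate.
        self._domids_by_uuid[uuid].add(domid)
        self._uuid_by_domid[domid] = uuid

    def domain_destroyed(self, domid):
        uuid = self._uuid_by_domid.pop(domid)
        self._domids_by_uuid[uuid].discard(domid)
        if not self._domids_by_uuid[uuid]:
            del self._domids_by_uuid[uuid]

    def domids(self, uuid):
        return sorted(self._domids_by_uuid.get(uuid, ()))


# Localhost live migrate: domid 6 appears before domid 5 goes away.
reg = DomainRegistry()
reg.domain_created(5, "vm-a")
reg.domain_created(6, "vm-a")
during_migration = reg.domids("vm-a")
reg.domain_destroyed(5)
after_migration = reg.domids("vm-a")
```

A naive mapping from UUID to a single domid would overwrite or reject the second entry at the point where both domains exist.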


I'll try to have a play with the plugin in some copious free time, but
this work does look exciting.

~Andrew