
FW: IRQ latency measurements in hypervisor


  • To: "fusa-sig@xxxxxxxxxxxxxxxxxxxx" <fusa-sig@xxxxxxxxxxxxxxxxxxxx>
  • From: Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>
  • Date: Wed, 13 Jan 2021 11:56:09 +0000
  • Accept-language: en-US
  • Delivery-date: Wed, 13 Jan 2021 11:56:18 +0000
  • List-id: This is a discussion list for members of the Xen Project FuSa SIG <fusa-sig.lists.xenproject.org>
  • Thread-index: AQHW6T1firbO0EOYpEKUiW6vsavsPaolcwNQ
  • Thread-topic: IRQ latency measurements in hypervisor

Hi all

We have just published preliminary research on Xen RT performance on the 
xen-devel ML; I think it would be useful for the SIG group to review.

Best regards,
 -- Artem

-----Original Message-----
From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Volodymyr 
Babchuk
Sent: Wednesday, 13 January 2021 01:48
To: xen-devel@xxxxxxxxxxxxxxxxxxxx
Cc: Julien Grall <jgrall@xxxxxxxxxx>; Stefano Stabellini 
<stefano.stabellini@xxxxxxxxxx>; Dario Faggioli <dario.faggioli@xxxxxxxx>
Subject: [!SPOOF] IRQ latency measurements in hypervisor


Hello community,

Recently I was experimenting with context switching latencies in Xen. I want to 
share my findings and discuss some issues I have found.

My target was to ensure that no more than 30us pass between interrupt 
generation and its handling in a guest VM. The main goal was to determine 
whether Xen is ready for soft RT and to identify the main issues that prevent 
this.

TLDR: I was able to achieve a mean latency of 15us with a tuned RTDS scheduler, 
the serial console disabled, and some hacks to the Xen toolstack. This was done 
on a setup with 1 PCPU.

Test setup.

ARMv8 platform, namely the Renesas R-Car H3 SoC on a Salvator board.

To accurately determine latency, I employed one of the timer counter units 
(TMUs) available on the SoC. This is a 32-bit auto-reload timer that can 
generate an interrupt on underflow. I fed it with a 33.275MHz clock, which gave 
me a resolution of about 30ns. I programmed the timer to generate an interrupt 
every 10ms. My ISR then read the current timer value and determined how much 
time had passed since the last underflow. This gave me the time interval 
between IRQ generation and ISR invocation.
Those values were collected, and every 10 seconds statistics were calculated. 
Here is an example of output from my Linux driver:

[   83.873162] rt_eval_tmu e6fc0000.tmu: Mean: 44 (1320 ns) stddev: 8 (240 ns)
[   94.136632] rt_eval_tmu e6fc0000.tmu: Mean: 44 (1320 ns) stddev: 8 (240 ns)
[  104.400098] rt_eval_tmu e6fc0000.tmu: Mean: 50 (1500 ns) stddev: 129 (3870 ns)
[  114.663828] rt_eval_tmu e6fc0000.tmu: Mean: 44 (1320 ns) stddev: 8 (240 ns)
[  124.927296] rt_eval_tmu e6fc0000.tmu: Mean: 56 (1680 ns) stddev: 183 (5490 ns)

This is baremetal Linux. And here is Dom0:

[  237.431003] rt_eval_tmu e6fc0000.tmu: Mean: 306 (9180 ns) stddev: 25 (750 ns)
[  247.694506] rt_eval_tmu e6fc0000.tmu: Mean: 302 (9060 ns) stddev: 17 (510 ns)

The driver outputs both the raw timer value (e.g. 44) and the same value scaled 
to nanoseconds (e.g. 1320 ns). As you can see, the baremetal setup is much 
faster. But experiments showed that Linux does not provide consistent values, 
even when running in baremetal mode. You can see sporadic spikes in the 
"stddev" values.
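For reference, the tick-to-nanosecond scaling implied by the driver output can 
be sketched as follows (a minimal sketch; the flat 30 ns-per-tick factor 
follows from the 33.275MHz clock mentioned above, and the function name is 
mine, not the actual driver's):

```c
#include <stdint.h>

/* One tick of the 33.275 MHz counter is ~30 ns (1e9 / 33275000 is roughly
 * 30.05), which matches the driver printing "44 (1320 ns)": 44 * 30 = 1320. */
static inline uint32_t ticks_to_ns(uint32_t ticks)
{
    return ticks * 30u;   /* 44 ticks -> 1320 ns, 31 ticks -> 930 ns */
}
```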

So my next step was to use a proper RTOS to do the measurements. I chose 
Zephyr. My PR that adds Xen support to Zephyr can be found at [1]. Support for 
RCAR Gen3 is not upstreamed, but it is present on my GitHub ([2]). At [3] you 
can find the source code for the application that does the latency 
measurements. It behaves exactly like my Linux driver, but provides a bit more 
information:

 *** Booting Zephyr OS build zephyr-v2.4.0-2750-g0f2c858a39fc  ***
RT Eval app

Counter freq is 33280000 Hz. Period is 30 ns
Set alarm in 0 sec (332800 ticks)
Mean: 600 (18000 ns) stddev: 3737 (112110 ns) above thr: 0% [265 (7950 ns) - 66955 (2008650 ns)] global [265 (7950 ns) 66955 (2008650 ns)]
Mean: 388 (11640 ns) stddev: 2059 (61770 ns) above thr: 0% [266 (7980 ns) - 58830 (1764900 ns)] global [265 (7950 ns) 66955 (2008650 ns)]
Mean: 358 (10740 ns) stddev: 1796 (53880 ns) above thr: 0% [265 (7950 ns) - 57780 (1733400 ns)] global [265 (7950 ns) 66955 (2008650 ns)]
...

There you can see: the mean latency, the standard deviation, the percentage of 
interrupts that were processed above the 30us threshold, the minimum and 
maximum latency values for the current 10s run, and the global minimum and 
maximum.

Zephyr running as baremetal showed very stable results (this is an older build, 
so there are no extended statistics):

## Starting application at 0x480803C0 ...
 *** Booting Zephyr OS build zephyr-v2.4.0-1137-g5803ee1e8183  ***
RT Eval app

Counter freq is 33280000 Hz. Period is 30 ns
Mean: 31 (930 ns) stddev: 0 (0 ns)
Mean: 31 (930 ns) stddev: 0 (0 ns)
Mean: 31 (930 ns) stddev: 0 (0 ns)
Mean: 31 (930 ns) stddev: 0 (0 ns)
Mean: 31 (930 ns) stddev: 0 (0 ns)
Mean: 31 (930 ns) stddev: 0 (0 ns)
...

As Zephyr provided stable readouts with no jitter, I used it to do all 
subsequent measurements.


IMPORTANT! All subsequent tests were conducted with only 1 CPU core enabled. My 
goal was to ensure that the system can react to an external interrupt in a 
timely manner, even under load.


Test results and latency sources

As you can see, the baremetal OS provides a latency of ~0.9us with no 
deviation. The same code running as a DomU on an idle system shows a mean 
latency of 12us and a deviation of about 10us. Latencies within a 10s batch can 
range from 8us to 25us. This fits into the required 30us threshold, so no big 
issue there.

But this worsens under certain conditions.

1. Serial console. The RCAR serial driver (scif) works in synchronous
   mode, so any heavy serial console output leads to higher
   latency. Tests show a mean latency of 1000us and a deviation of
   1332us. 54% of interrupts are handled outside of the 30us
   threshold. Real values may be even higher, because in some cases the
   timer would complete a full cycle, and instead of, say, 11ms the ISR
   would read 11 % 10 = 1ms of latency. I tried to enable asynchronous
   mode for the serial console. This made things better, but it led to
   gaps in the output, so I just turned the serial console off
   completely.
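The wrap-around effect mentioned in point 1 can be illustrated like this (an 
illustrative sketch; the function name is mine): the ISR can only see time 
since the most recent underflow, so any latency longer than the 10ms reload 
period aliases modulo the period.

```c
#include <stdint.h>

/* With a 10 ms auto-reload period, a latency longer than one period wraps:
 * the ISR reads (actual mod period), so 11 ms is reported as 1 ms. */
static inline uint32_t observed_latency_us(uint32_t actual_us)
{
    return actual_us % 10000u;   /* 10 ms timer period, in microseconds */
}
```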

2. RTDS scheduler. With the console disabled, things like "hexdump -v
   /dev/zero" didn't affect the latency so badly, but I still sometimes
   got ~600us spikes. This is not a surprise, given the default RTDS
   configuration. I changed the period for the DomU from the default
   10ms to 100us and things got better: with Dom0 burning CPU I rarely
   get a max latency of about ~30us, with a mean latency of ~9us and a
   deviation of ~0.5us. On the other hand, when I tried to set the
   period to 30us, the max latency rose to ~60us.
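For reference, this kind of RTDS tuning can be applied at runtime with xl's 
sched-rtds subcommand (period and budget are given in microseconds; the domain 
name and the budget value below are placeholders of mine, since the message 
only states the period):

```shell
# Show current RTDS parameters for all domains
xl sched-rtds

# Shrink the period of the "rt" domain (placeholder name) from the
# default 10000 us to 100 us; the budget must not exceed the period.
xl sched-rtds -d rt -p 100 -b 50
```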

3. Huge latency spike during domain creation. I conducted some
   additional tests, including use of PV drivers, but this didn't
   affect the latency in my "real time" domain. However, an attempt to
   create another domain with a relatively large memory size of 2GB led
   to a huge spike in latency. Debugging led to this call path:

   XENMEM_populate_physmap -> populate_physmap() ->
   alloc_domheap_pages() -> alloc_heap_pages() -> huge
   "for ( i = 0; i < (1 << order); i++ )" loop.

   This loop handles the struct page* for every one of the 262144 pages
   that were allocated by calling populate_physmap(). The function is
   not preemptible, and it takes about a second to complete on my
   setup. Needless to say, it completely wrecks any responsiveness of
   the hypervisor.

I managed to overcome issue #3 by commenting out all calls to
populate_one_size() except the populate_one_size(PFN_4K_SHIFT) one in 
xg_dom_arm.c. This lengthened domain construction, but my "RT" domain didn't 
experience such big latency issues. Apparently all other hypercalls used during 
domain creation are either fast or preemptible. No doubt my hack leads to page 
table inflation and an overall performance drop.

Conclusions.

I wanted to:

1. Share my approach to IRQ latency measurements.

2. Share results of said measurements.

3. Ask about other potential scenarios that can affect IRQ latency, so
   I can test them on my setup.

4. Discuss the issue with alloc_domheap_pages(). I brooded over ways
   to make it preemptible. But it is a) located deep in the call chain
   and b) used not only by hypercalls. So I can't see an easy way to
   make it preemptible. The best solution would, of course, be to make
   the whole hypervisor preemptible. I even heard about some work on
   that a long time ago...

5. Show that under certain conditions Xen can be used in a soft RT
   system. Of course, additional tests are needed, but with some minor
   tweaks it can provide ~15us IRQ latency under a typical workload.

[1] https://github.com/zephyrproject-rtos/zephyr/pull/30164
[2] https://github.com/lorc/zephyr/tree/renesas_salvator_x
[3] https://github.com/lorc/zephyr/blob/rt_eval/samples/rt_eval/src/main.c


--
Volodymyr Babchuk at EPAM



 

