[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 01/10] xen/arm: vpl011: Add pl011 uart emulation in Xen



Hi Bhupinder,

On 26/04/2017 08:49, Bhupinder Thakur wrote:
Regarding the optimization you introduced in this patch, delaying write
notifications until we receive a notification from xenconsoled, how many
notifications from xen to xenconsoled does it actually save? xenconsoled
is going to send a notification for every read: we might end up sending
the same number of notifications, only delayed.


In the PV console design, the events from the guest are sent to
xenconsole for every chunk of data. Since in the pl011 emulation, the
data comes in bytes only, it would generate lot of events to
xenconsole. To reduce the flurry of events, this optimisation was
added.

Xenconsole sends an event in the following conditions:

1. There is data available for Xen to process
2. It has finished processing the data and Xen can send more data

In the 2nd case, xenconsole will keep reading the data from the ring
buffer until it goes empty. At that point, it would send an event to
Xen. Between sending of this event and processing of this event by
Xen, there could be more data added for the xenconsole to process.
While handling an event, the Xen will check for that condition and if
there is data to be processed by xenconsole, it would send an event.

Also sending delayed events helps with the rate limit check in
xenconsole. If there are too many events, they maybe masked by
xenconsole. I could test whether this rate limit check is really
getting hit with and without this optimisation.

I understand the idea behind, my question is whether this approach was
actually verified by any scientific measurements. Did you run a test
to count how many notifications were skipped thanks to this
optimization? If so, what was the scenario? How many notifications were
saved? If you didn't run a test, I suggest you do :-)

Today I did some instrumentation and count the number of events sent
by Xen to xenconsole and how many are really processed by xenconsole.

I could not see any difference in the number of events processed by
xenconsole with or without the optimization. The total number of
events processed by xenconsole were about 500 for the complete guest
booting till the login prompt (for both optimised and non-optimised
case).

Although Xen calls notify_via_xen_event_channel() far more number of
times (about 12000 times until the guest loging prompt comes) without
the optimisation, it does not translate into sending those many events
to xenconsole though. With the optmization it just reduces the number
of times notify_via_xen_event_channel() is called which is about 500
times.

I believe the reason could be that if the event is still pending with
xenconsole when the next event comes via
notify_via_xen_event_channel() then all such events would be coalesced
and delivered to xenconsole as a single event.

So the optimization does not help with saving any processing on
xenconsole though it saves the overhead of calling
notify_via_xen_event_channel() very frequently in Xen.

I don't see any issue to call notify_via_xen_event_channel many time because you should do it for every batch sent.

Yes, the batch consists only of one character, but this is how an UART is designed.

You rely on the behavior of notify_via_xen_event_channel. What if in the future it changes? Then maybe, you will miss event and data.

So I would rather avoid this premature optimization and see how it is behaving. If it is too slow, then we can think about it.

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.