Re: [Xen-devel] [ARM] Native application design and discussion (I hope)

On 26/04/17 22:44, Volodymyr Babchuk wrote:
Hi Julien,

Hi Volodymyr,

On 25 April 2017 at 14:43, Julien Grall <julien.grall@xxxxxxx> wrote:
We will also need another type of application: one which is
periodically called by Xen itself, not servicing any domain request.
This is needed to implement the scheduler of the coprocessor sharing
framework.


EL0 apps can be a powerful new tool for us to use, but they are not the
solution to everything. This is where I would draw the line: if the
workload needs to be scheduled periodically, then it is not a good fit
for an EL0 app.


From my last conversation with Volodymyr I got the feeling that the
notions "EL0" and "Xen native application" are pretty much orthogonal.
In [1] Volodymyr saw no performance gain from changing a domain's
exception level from EL1 to EL0.
Only when he stripped down the domain context switch (i.e. dropped the
GIC context save/restore) did he reach noticeable results.



Do you have numbers for the parts that take time in the save/restore?
You mention the GIC and I am a bit surprised you don't mention the FPU.

I did it in the other thread. Check out [1]. The biggest speed-up I
got was from removing the vGIC context handling.


Oh, yes. Sorry, I forgot this thread. Continuing on that, you said that "Now
profiler shows that hypervisor spends time in spinlocks and p2m code."

Could you expand here? How would the EL0 app spend time in p2m code?
I don't quite remember. It was somewhere around the p2m context
save/restore functions.
I'll try to recreate that setup and provide more details.

Similarly, why do the spinlocks take time? Are they contended?
The problem is that my profiler does not show the call stack, so I
can't say which spinlock causes this. But the profiler didn't show the
CPU spending much time in the spinlock wait loop, so it looks like
there is no contention.


I would have a look at optimizing the context switch path. Some ideas:
        - there are a lot of unnecessary isb/dsb. The registers used
only by the guest will be synchronized by eret.

I have removed (almost) all of them. No significant changes in latency.
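
For reference, the kind of change discussed here, as a minimal sketch
(assuming Xen's WRITE_SYSREG() accessor and hypothetical
arch.ttbr0/arch.ttbr1/arch.sctlr fields on struct vcpu):

    /* Restoring EL1 system registers from EL2: the guest cannot
     * observe them until the eret, and eret is itself a
     * context-synchronizing event on ARMv8, so the isb() that often
     * follows each write can be dropped. */
    WRITE_SYSREG(n->arch.ttbr0, TTBR0_EL1);
    WRITE_SYSREG(n->arch.ttbr1, TTBR1_EL1);
    WRITE_SYSREG(n->arch.sctlr, SCTLR_EL1);
    /* No isb() here: the eret back to the guest synchronizes. */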

        - FPU is taking time to save/restore, you could make it lazy

This also does not take much time.
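
For what it's worth, a lazy scheme would look roughly like the sketch
below. Assumptions: CPTR_EL2.TFP (bit 10) traps FP/SIMD accesses to
EL2, vfp_save_state()/vfp_restore_state() are the helpers from
arch/arm/arm64/vfp.c, and fpu_owner is a hypothetical tracking
variable (per-pCPU in practice):

    #define CPTR_TFP (1u << 10)     /* CPTR_EL2.TFP: trap FP/SIMD */

    static struct vcpu *fpu_owner;  /* hypothetical; per-pCPU really */

    static void ctxt_switch_to(struct vcpu *n)
    {
        /* Don't restore the FPU eagerly: arm the trap instead. */
        WRITE_SYSREG(READ_SYSREG(CPTR_EL2) | CPTR_TFP, CPTR_EL2);
        isb();
    }

    /* Runs only if the new context actually touches FP/SIMD. */
    static void do_fpu_trap(void)
    {
        struct vcpu *v = current;

        /* Let further FP/SIMD accesses through... */
        WRITE_SYSREG(READ_SYSREG(CPTR_EL2) & ~CPTR_TFP, CPTR_EL2);
        isb();

        /* ...and only now pay for the save/restore. */
        if ( fpu_owner != v )
        {
            if ( fpu_owner )
                vfp_save_state(fpu_owner);
            vfp_restore_state(v);
            fpu_owner = v;
        }
    }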

        - It might be possible to limit the number of LRs saved/restored
depending on the number of LRs used by a domain.

Excuse me, what is LR in this context?


Sorry, I meant the GIC LRs (see the GIC save/restore code). They are
used to list the interrupts injected to the guest. Not all of them may
be in use at the time of the context switch.
As I said, I don't call the GIC save and restore routines, so that
should not be an issue (if I got that right).

Well, my point was that maybe you can limit the time spent in the GIC
save/restore code rather than skipping it entirely, along the lines of
the sketch below.
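
A minimal sketch of the idea (assuming the GICv2 driver's readl_gich()
accessor, its nr_lrs count, and a per-pCPU bitmap of in-use LRs along
the lines of this_cpu(lr_mask) in gic.c):

    static void gic_save_lrs(struct vcpu *v)
    {
        unsigned int i;
        unsigned long mask = this_cpu(lr_mask);

        /* Only read back the LRs that actually hold an interrupt,
         * instead of all nr_lrs of them. */
        for_each_set_bit ( i, &mask, nr_lrs )
            v->arch.gic.v2.lr[i] = readl_gich(GICH_LR + i * 4);
    }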

For instance, if you don't save/restore the GIC state, you will need
to disable the vGIC (GICH_HCR.En) to avoid interrupt injection while
the EL0 app is running. I don't see this code here.
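
Something along these lines, as a minimal sketch (assuming the GICv2
driver's readl_gich()/writel_gich() accessors and the GICH_HCR_EN bit
definition; the helper names are hypothetical):

    /* Clear GICH_HCR.En so the virtual CPU interface cannot inject
     * interrupts while the EL0 app runs; re-enable it afterwards. */
    static void app_vgic_disable(void)
    {
        writel_gich(readl_gich(GICH_HCR) & ~GICH_HCR_EN, GICH_HCR);
    }

    static void app_vgic_enable(void)
    {
        writel_gich(readl_gich(GICH_HCR) | GICH_HCR_EN, GICH_HCR);
    }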

Cheers,

--
Julien Grall
