Xen project Mailing List

Re: [Xen-devel] null scheduler bug

From: Milan Boberic <milanboberic94@xxxxxxxxx>

Date: Tue, 25 Sep 2018 13:51:14 +0200

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, stefano@xxxxxxxxxxxxxx, Dario Faggioli <dfaggioli@xxxxxxxx>

Delivery-date: Tue, 25 Sep 2018 11:51:38 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Reply for Julien: yes, my platform has 4 CPUs; it's an UltraZed-EG board
with a carrier card. I use only 2 CPUs: one for dom0, which is PetaLinux,
and one for domU, which is a bare-metal application that blinks an LED on
the board (I use it to measure jitter with an oscilloscope). The other two
CPUs are unused (in the idle loop).

About the command: it comes from the xen-overlay.dtsi file, which is
included in the system-user.dtsi file in my project. The whole file is in
the attachment in my earlier reply.

About these options:

    maxvcpus=1 core_parking=performance cpufreq=xen:performance

I was just testing them to see whether I would get any performance
improvement. I will remove them right away.

Best regards,
Milan Boberic

On Tue, Sep 25, 2018 at 1:15 PM Julien Grall <julien.grall@xxxxxxx> wrote:
>
> Hi Dario,
>
> On 09/25/2018 10:02 AM, Dario Faggioli wrote:
> > On Mon, 2018-09-24 at 22:46 +0100, Julien Grall wrote:
> >> On 09/21/2018 05:20 PM, Dario Faggioli wrote:
> >>>
> >>> What I'm after, is how long, after domain_destroy(),
> >>> complete_domain_destroy() is called, and whether/how it relates to
> >>> the grace period idle timer we've added in the RCU code.
> >>
> >> NULL scheduler and vwfi=native will inevitably introduce a latency
> >> when destroying a domain. vwfi=native means the guest will not trap
> >> when it has nothing to do and switch to the idle vCPU. So, in such
> >> configuration, it is extremely unlikely to execute the idle_loop or
> >> even enter the hypervisor unless there is an interrupt on that pCPU.
> >>
> > Ah! I'm not familiar with vwfi=native --and in fact I was completely
> > ignoring it-- but this analysis makes sense to me.
> >
> >> Per my understanding of call_rcu, the calls will be queued until the
> >> RCU reaches a threshold. We don't have many places where call_rcu is
> >> called, so reaching the threshold may just never happen. But nothing
> >> will tell that vCPU to go in Xen and say "I am done with RCU". Did I
> >> miss anything?
> >>
> > Yeah, and in fact we added the timer _but_, in this case, it does not
> > look that the timer is firing. It looks much more like "some random
> > interrupt happens", as you're suggesting. OTOH, in the case where
> > there are no printk()s, it might be that the timer does fire, but the
> > vcpu has not gone through Xen, so the grace period is, as far as we
> > know, not expired yet (which is also in accordance with Julien's
> > analysis, as far as I understood it).
>
> The timer is only activated when sched_tick_suspend() is called. With
> vwfi=native, you will never reach the idle_loop() and therefore never
> set up a timer.
>
> Milan confirmed that the guest can be destroyed with vwfi=native
> removed. So this confirms my thinking. Trapping wfi will end up
> switching to the idle vCPU and triggering the grace period.
>
> I am not entirely sure you will be able to reproduce it on x86, but I
> don't think it is Xen Arm specific.
>
> When I looked at the code, I didn't see any grace period in any context
> other than idle_loop. Rather than adding another grace period, I would
> just force quiescence for every call_rcu.
>
> This should not have a big performance impact, as we don't use call_rcu
> much, and it would allow a domain to be fully destroyed in a timely
> manner.
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
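
To make the stall described in the thread easier to picture, here is a minimal toy model in Python. It is not Xen's RCU code, only a sketch under stated assumptions: the class ToyRcu, the method report_quiescent() and the force_quiescence flag are all hypothetical names invented for illustration; only the name call_rcu is borrowed from the discussion. The model assumes a queued callback runs only once every CPU has reported a quiescent state, and that a pCPU running a guest with vwfi=native reports nothing until some interrupt brings it into the hypervisor.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ToyRcu:
    """Toy grace-period model (hypothetical, not Xen's implementation)."""
    ncpus: int
    pending: List[Callable[[], None]] = field(default_factory=list)
    quiescent: Set[int] = field(default_factory=set)

    def call_rcu(self, callback: Callable[[], None],
                 force_quiescence: bool = False) -> None:
        # First queued callback starts a new grace period.
        if not self.pending:
            self.quiescent.clear()
        self.pending.append(callback)
        if force_quiescence:
            # Assumption: "forcing quiescence" is modelled as every CPU
            # reporting right away. How Xen would do it is not stated.
            for cpu in range(self.ncpus):
                self.report_quiescent(cpu)

    def report_quiescent(self, cpu: int) -> None:
        if not self.pending:
            return
        self.quiescent.add(cpu)
        # Grace period ends only when *all* CPUs have reported.
        if len(self.quiescent) == self.ncpus:
            callbacks, self.pending = self.pending, []
            self.quiescent.clear()
            for cb in callbacks:
                cb()


# Scenario from the thread: 4 pCPUs, pCPU 1 runs the bare-metal domU
# with vwfi=native and never enters the hypervisor on its own.
rcu = ToyRcu(ncpus=4)
destroyed = []
rcu.call_rcu(lambda: destroyed.append("domU"))
for cpu in (0, 2, 3):
    rcu.report_quiescent(cpu)
stuck = list(destroyed)            # callback has not run yet

rcu.report_quiescent(1)            # "some random interrupt happens"
after_interrupt = list(destroyed)  # callback has now run

# Variant: quiescence forced at call_rcu time.
forced = []
rcu.call_rcu(lambda: forced.append("domU"), force_quiescence=True)
```

In the first scenario the destroy callback stays queued until pCPU 1 finally reports, which mirrors the observation that the domain is only destroyed when an interrupt happens to arrive. In the variant the callback runs as soon as it is queued, which is the effect Julien's suggestion is after.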
