Re: [Xen-devel] null scheduler bug

[Adding Julien as well.

Julien, this seems related to the RCU issue we fought on ARM when using
Credit2. This time it's null rather than Credit2, though, and it's
behaving even more weirdly...]

On Fri, 2018-09-21 at 16:14 +0200, Milan Boberic wrote:
> Hey,
> yes, I can see prink's outputs on console and in xl dmesg. Also added
> timestamps, here are the results (created and destroyed domU a few
> times, just to get more values), this is from xl dmesg:
> NULL SCHEDULER - Not stressed PetaLinux host domain.
> (XEN) t=218000327743:End of a domain_destroy function
> (XEN) t=218000420874:End of a complete_domain_destroy function
> (XEN) <G><3>memory_map:add: dom2 gfn=ff0a0 mfn=ff0a0 nr=1
> ...
> Stressed PetaLinux host with command: yes > /dev/null &
> (XEN) t=3247747255872:End of a domain_destroy function
> (XEN) t=3247747349863:End of a complete_domain_destroy function
> ...
> CREDIT SCHEDULER - not stressed PetaLinux host
> (XEN) t=86245669606:End of a domain_destroy function
> (XEN) t=86245761127:End of a complete_domain_destroy function
> ...
> Stressed PetaLinux host with yes > /dev/null &
> (XEN) t=331229997499:End of a domain_destroy function
> (XEN) t=331230091770:End of a complete_domain_destroy function
> ...
Which, if I'm doing the math properly, tells us that
complete_domain_destroy() finishes ~90us (91-94us in these samples)
after domain_destroy() does, for both schedulers and under both load
conditions. That wouldn't be too bad, I think.
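
For reference, here is the arithmetic as a small user-space sketch. It
assumes the t= values are nanoseconds (which is what the ~90us figure
implies); the struct and function names are made up for illustration,
and only the numbers come from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pair of timestamps copied from the quoted xl dmesg output.
 * Assumption: the t= values are in nanoseconds. */
struct destroy_sample {
    const char *label;
    uint64_t domain_destroy_end;    /* "End of a domain_destroy function" */
    uint64_t complete_destroy_end;  /* "End of a complete_domain_destroy function" */
};

static const struct destroy_sample samples[] = {
    { "null, not stressed",    218000327743ULL,  218000420874ULL },
    { "null, stressed",       3247747255872ULL, 3247747349863ULL },
    { "credit, not stressed",   86245669606ULL,   86245761127ULL },
    { "credit, stressed",      331229997499ULL,  331230091770ULL },
};

#define NR_SAMPLES (sizeof(samples) / sizeof(samples[0]))

/* Gap between the two messages of one sample, in nanoseconds. */
uint64_t delta_ns(const struct destroy_sample *s)
{
    assert(s->complete_destroy_end >= s->domain_destroy_end);
    return s->complete_destroy_end - s->domain_destroy_end;
}

/* Largest gap across all the samples, in nanoseconds. */
uint64_t max_delta_ns(void)
{
    uint64_t max = 0;

    for (size_t i = 0; i < NR_SAMPLES; i++) {
        uint64_t d = delta_ns(&samples[i]);

        if (d > max)
            max = d;
    }
    return max;
}
```

The four gaps come out at 93131, 93991, 91521 and 94271 ns, in the
order listed.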

And in fact, if I remember correctly, you're saying that adding the
printk()s fixes the issue in null. I wonder why that is... Can you like
kill the printks, store 5 or so of the timestamps (or just the delta)
in a static array or something, and print it from somewhere else (like
a debug-key handler in the same file)?
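
Roughly, I'm thinking of something like the sketch below. It is a
user-space mock-up, not a patch: all the names are hypothetical,
printf() stands in for printk(), and the timestamps are passed in by the
caller where the hypervisor would use NOW(). It also assumes only one
domain is being destroyed at a time and does no locking. Hooking the
dump function up to a debug key would go through the key handler
registration code, whose exact interface should be checked against the
tree in use.

```c
#include <stdint.h>
#include <stdio.h>

#define NR_DESTROY_SAMPLES 5

/* Hypothetical storage for the gaps between the end of domain_destroy()
 * and the end of complete_domain_destroy(), in nanoseconds. */
static int64_t destroy_start_ns;
static int64_t destroy_delta_ns[NR_DESTROY_SAMPLES];
static unsigned int destroy_nr_samples;

/* To be called at the end of domain_destroy(), instead of the printk(). */
void record_destroy_start(int64_t now_ns)
{
    destroy_start_ns = now_ns;
}

/* To be called at the end of complete_domain_destroy(), instead of the
 * printk(). Only the first NR_DESTROY_SAMPLES gaps are stored. */
void record_destroy_complete(int64_t now_ns)
{
    if (destroy_nr_samples < NR_DESTROY_SAMPLES)
        destroy_delta_ns[destroy_nr_samples++] = now_ns - destroy_start_ns;
}

/* Body of the would-be debug-key handler: print what was stored.
 * Returns the number of samples printed. */
unsigned int dump_destroy_deltas(void)
{
    for (unsigned int i = 0; i < destroy_nr_samples; i++)
        printf("destroy delta[%u] = %lld ns\n", i,
               (long long)destroy_delta_ns[i]);

    return destroy_nr_samples;
}
```

The point is that nothing is printed on the destroy path itself, so if
it's the printk()s that are hiding the problem, this should not.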

What I'm after is how long, after domain_destroy(),
complete_domain_destroy() is called, and whether/how it relates to the
grace period idle timer we've added in the RCU code.

