
Re: Null scheduler and vwfi native problem



On 1/21/21 7:32 PM, Dario Faggioli wrote:
On Thu, 2021-01-21 at 11:54 +0100, Anders Törnqvist wrote:
Hi,

Hello,

I see a problem with destroy and restart of a domain. Interrupts are
not available when trying to restart a domain.

The situation seems very similar to the thread "null scheduler bug":
https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg01213.html

Right. Back then, PCI passthrough was involved, if I remember
correctly. Is it the case for you as well?

The target system is an iMX8-based ARM board and Xen is version 4.13.0,
built from https://source.codeaurora.org/external/imx/imx-xen.git.

Mmm, perhaps it's me, but neither opening that URL in a browser nor
trying to clone it shows me anything. What am I doing wrong?

Sorry. The link is https://source.codeaurora.org/external/imx/imx-xen.

Xen is booted with sched=null vwfi=native.
One physical CPU core is pinned to the domU.
Some interrupts are passed through to the domU.

Ok, I guess it is involved, since you say "some interrupts are passed
through..."

Destroying the domain with "xl destroy" produces no complaint, but
when I then try to restart the domain with "xl create <domain cfg>"
I get:
(XEN) IRQ 210 is already used by domain 1

"xl list" does not contain the domain.

Repeating the "xl create" command 5-10 times eventually starts the
domain without complaining about the IRQ.
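
The retry behaviour described above can be scripted while the root cause is investigated. This is only a workaround sketch, not a fix. It assumes that "xl create" exits with a non-zero status while the IRQ is still claimed by the old domain; the `run` parameter and the `domu.cfg` name are hypothetical and exist only so the loop can be exercised without a Xen host.

```python
import subprocess
import time


def create_with_retry(cfg_path, attempts=10, delay_s=1.0, run=None):
    """Try `xl create <cfg_path>` up to `attempts` times.

    Returns the 1-based number of the attempt that succeeded, or None
    if every attempt failed. `run` is a hypothetical injection point:
    a callable taking the config path and returning an exit code.
    """
    if run is None:
        def run(path):
            # Assumption: a failed create yields a non-zero exit code.
            return subprocess.run(["xl", "create", path]).returncode

    for attempt in range(1, attempts + 1):
        if run(cfg_path) == 0:
            return attempt
        # Give the hypervisor time to finish the pending teardown.
        time.sleep(delay_s)
    return None
```

On a real host, `create_with_retry("domu.cfg")` would report how many attempts were needed, which gives a rough measure of how long the old domain's resources stay claimed.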

Inspired by the discussion in the thread above, I have put printks in
the xen/common/domain.c file.
At the end of the function domain_destroy I have a
printk("End of domain_destroy function\n").
At the beginning of the function complete_domain_destroy I have a
printk("Begin of complete_domain_destroy function\n").

With these printouts I get at "xl destroy":
(XEN) End of domain_destroy function

So it seems like the function complete_domain_destroy is not called.
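
One possible reading of these printouts, assuming (as discussed in the 2018 thread linked above) that domain_destroy only queues complete_domain_destroy as a deferred, RCU-style callback: the callback cannot run until every CPU has been seen in a quiescent state. The toy model below is not Xen code; `ToyRcu`, `quiescent_state` and the two-CPU setup are hypothetical names chosen for illustration.

```python
class ToyRcu:
    """Toy model of deferred callbacks; NOT Xen's RCU implementation."""

    def __init__(self, cpus):
        self.cpus = set(cpus)
        self.pending = []      # callbacks waiting for a grace period
        self.quiesced = set()  # CPUs seen quiescent in this period

    def call_rcu(self, callback):
        # Queue the callback and start a fresh grace period.
        self.pending.append(callback)
        self.quiesced.clear()

    def quiescent_state(self, cpu):
        # A CPU reports that it has passed through a quiescent state.
        self.quiesced.add(cpu)
        if self.quiesced == self.cpus:
            callbacks, self.pending = self.pending, []
            self.quiesced.clear()
            for cb in callbacks:
                cb()


log = []


def complete_domain_destroy():
    log.append("Begin of complete_domain_destroy function")


def domain_destroy(rcu):
    # The final teardown is only queued here, not executed.
    rcu.call_rcu(complete_domain_destroy)
    log.append("End of domain_destroy function")
```

In this model, calling `domain_destroy(ToyRcu([0, 1]))` logs only the "End of domain_destroy" line. The "Begin of complete_domain_destroy" line appears once both CPU 0 and CPU 1 have called `quiescent_state`. If one CPU never reports, for example because it never leaves the guest, the callback stays pending, which would match the output above.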

Ok, thanks for making these tests. It's helpful to have this
information right away.

"xl create" results in:
(XEN) IRQ 210 is already used by domain 1
(XEN) End of domain_destroy function

Repeated "xl create" attempts look the same until, after a few tries,
I also get:
(XEN) Begin of complete_domain_destroy function

After that the next "xl create" creates the domain.


I have also applied this patch:
https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg02469.html
This does seem to change the results.

Ah... Really? That's a bit unexpected, TBH.

Well, I'll think about it.

Starting the system without "sched=null vwfi=native" does not result in
the problem.

Ok, how about, if you're up for some more testing:

  - booting with "sched=null" but not with "vwfi=native"
  - booting with "sched=null vwfi=native" but not doing the IRQ
    passthrough that you mentioned above

?

Regards





 

