[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: vcpu_show_execution_state() difference between Arm and x86

To: Jan Beulich <jbeulich@xxxxxxxx>
From: Julien Grall <julien@xxxxxxx>
Date: Thu, 2 Sep 2021 12:48:32 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>
Delivery-date: Thu, 02 Sep 2021 11:48:42 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Jan,

On 02/09/2021 07:45, Jan Beulich wrote:

On 01.09.2021 20:11, Julien Grall wrote:

On 01/09/2021 14:39, Jan Beulich wrote:

back in 2016 Andrew added code to x86'es variant to avoid interleaving
of output. The same issue ought to exist on Arm.


Agree. I guess we got away so far because it is pretty rare to have two
CPUs running at the same.


I guess you've meant "crashing", not "running"?


Yes.

The lock acquired,
or more importantly the turning off of IRQs while doing so, is now
getting in the way of having PVH Dom0's state dumped the 2nd time.


I am not quite too sure to understand the problem with PVH dom0. Do you
have a pointer to the issue?


Locking in both cases: For registers it was VMX'es vmx_vmcs_enter()
acquiring a (non-IRQ-safe) lock, and for the stack it is the P2M lock.
Neither can / should sensibly be made IRQ-safe.

For
register state I did find a sufficiently simple (yet not pretty)
workaround. For the stack, where I can't reasonably avoid using p2m
functions, this is going to be more difficult. >
Since I expect Arm to want to also have interleave protection at some
point, and since Arm also acquires the p2m lock while accessing Dom0's
stacks, I wonder whether anyone has any clever idea on how to avoid
the (valid) triggering of check_lock()'s assertion without intrusive
changes. (As to intrusive changes - acquiring the p2m lock up front in
recursive mode, plus silencing check_lock() for nested acquires of a
lock that's already being held by a CPU was my initial idea.)


At least one Arm, the P2M lock is a rwlock which is not yet recursive.


Right; the same is tru on x86. Hence the expected intrusiveness.

But then it feels to me that this solution is only going to cause us
more trouble in the future.


Same here - we'd need to be very careful not only when making such
a change, but also going forward. Hence my desire to come up with a
better approach.

I looked at the original commit to find out the reason to use the
console lock. AFAICT, this was to allow console_force_unlock() to work
properly. But it is not entirely clear why we couldn't get a new lock
(with IRQ enabled) that could be forced unlocked in that function.

Can either you or Andrew clarify it?


AIUI any new lock would need to be IRQ-safe as well, as the lock
would be on paths taken to output stuff when the system crashed.

Hmmm... Just to confirm, what you are saying is some of the callers ofvcpu_show_execution_state() & co may do it with IRQ-disabled. Is thatcorrect?

I have tried to play with it on Arm but then I realized thatcheck_lock() is not properly working on Arm because we don't callspin_debug_enable() at boot. :/ So it would make sense why we never sawany issue there...

Hence it would be pointless to introduce yet another lock when the
one we have is already good enough. But I may be missing aspects,
in which case I'd have to defer to Andrew.

The other solution I can think off is buffering the output for
show_registers and only print it once at the end. The downside is we may
not get any output if there is an issue in the middle of the dump.


Indeed; I'd prefer to avoid that, the more that it may be hard to
predict how much output there's going to be.

And it is not going to work as we couldn't grab the P2M lock with IRQdisabled.

On Arm, the only problem is going to be the P2M lock for dump the guesttrace. A possible controversial approach for Arm is to just not dump theguest stack or move it outside of the new lock and dump when IRQ isenabled (I am not aware of any places where the guest stack will bedumped and we have IRQ disabled).

Bertrand, Stefano? Do you use the guest stack in the dump? If so, whatare the cases?


Cheers,

--
Julien Grall

Follow-Ups:
- Re: vcpu_show_execution_state() difference between Arm and x86
  - From: Jan Beulich

References:
- vcpu_show_execution_state() difference between Arm and x86
  - From: Jan Beulich
- Re: vcpu_show_execution_state() difference between Arm and x86
  - From: Julien Grall
- Re: vcpu_show_execution_state() difference between Arm and x86
  - From: Jan Beulich

Prev by Date: Re: [PATCH v2 0/6] x86/PVH: Dom0 building adjustments
Next by Date: [ovmf test] 164758: regressions - FAIL
Previous by thread: Re: vcpu_show_execution_state() difference between Arm and x86
Next by thread: Re: vcpu_show_execution_state() difference between Arm and x86
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.