[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: vcpu_show_execution_state() difference between Arm and x86



Hi Jan,

On 02/09/2021 07:45, Jan Beulich wrote:
On 01.09.2021 20:11, Julien Grall wrote:
On 01/09/2021 14:39, Jan Beulich wrote:
back in 2016 Andrew added code to x86'es variant to avoid interleaving
of output. The same issue ought to exist on Arm.

Agree. I guess we got away so far because it is pretty rare to have two
CPUs running at the same.

I guess you've meant "crashing", not "running"?

Yes.


The lock acquired,
or more importantly the turning off of IRQs while doing so, is now
getting in the way of having PVH Dom0's state dumped the 2nd time.

I am not quite too sure to understand the problem with PVH dom0. Do you
have a pointer to the issue?

Locking in both cases: For registers it was VMX'es vmx_vmcs_enter()
acquiring a (non-IRQ-safe) lock, and for the stack it is the P2M lock.
Neither can / should sensibly be made IRQ-safe.

For
register state I did find a sufficiently simple (yet not pretty)
workaround. For the stack, where I can't reasonably avoid using p2m
functions, this is going to be more difficult. >
Since I expect Arm to want to also have interleave protection at some
point, and since Arm also acquires the p2m lock while accessing Dom0's
stacks, I wonder whether anyone has any clever idea on how to avoid
the (valid) triggering of check_lock()'s assertion without intrusive
changes. (As to intrusive changes - acquiring the p2m lock up front in
recursive mode, plus silencing check_lock() for nested acquires of a
lock that's already being held by a CPU was my initial idea.)

At least one Arm, the P2M lock is a rwlock which is not yet recursive.

Right; the same is tru on x86. Hence the expected intrusiveness.

But then it feels to me that this solution is only going to cause us
more trouble in the future.

Same here - we'd need to be very careful not only when making such
a change, but also going forward. Hence my desire to come up with a
better approach.

I looked at the original commit to find out the reason to use the
console lock. AFAICT, this was to allow console_force_unlock() to work
properly. But it is not entirely clear why we couldn't get a new lock
(with IRQ enabled) that could be forced unlocked in that function.

Can either you or Andrew clarify it?

AIUI any new lock would need to be IRQ-safe as well, as the lock
would be on paths taken to output stuff when the system crashed.

Hmmm... Just to confirm, what you are saying is some of the callers of vcpu_show_execution_state() & co may do it with IRQ-disabled. Is that correct?

I have tried to play with it on Arm but then I realized that check_lock() is not properly working on Arm because we don't call spin_debug_enable() at boot. :/ So it would make sense why we never saw any issue there...

Hence it would be pointless to introduce yet another lock when the
one we have is already good enough. But I may be missing aspects,
in which case I'd have to defer to Andrew.

The other solution I can think off is buffering the output for
show_registers and only print it once at the end. The downside is we may
not get any output if there is an issue in the middle of the dump.

Indeed; I'd prefer to avoid that, the more that it may be hard to
predict how much output there's going to be.

And it is not going to work as we couldn't grab the P2M lock with IRQ disabled.

On Arm, the only problem is going to be the P2M lock for dump the guest trace. A possible controversial approach for Arm is to just not dump the guest stack or move it outside of the new lock and dump when IRQ is enabled (I am not aware of any places where the guest stack will be dumped and we have IRQ disabled).

Bertrand, Stefano? Do you use the guest stack in the dump? If so, what are the cases?

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.