[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: vcpu_show_execution_state() difference between Arm and x86


  • To: Julien Grall <julien@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 2 Sep 2021 08:45:30 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=2CCz0N6kirv4Y+bKxgrtsTMCCf7h9FVJy0MMlSYiGu8=; b=V2ssBBQLYaczegaAcnkwCLAqnWV1Zecrq6x12M4GdjjPlh70kcy6kLcHQ0ysEDceE9tVMNoE/Gl91niyR5vvQYaSm4mN2tVRK5ZzBx7NERqKzQDr4gyo+cwoz4WykjGkzYIVRiuRouEQ4kZY40O9mo/CItEXna3O2DAlvMs4tRScaQxOAhZhGTsltxBbrcF2BpIu/pyGUDMX9KPQfeNb2+B7EoYyGCz285nhGQmjVzlY+q/OBzhRxDKQmNjQoyU3WgOqXlXLsQPzm+U7pyzXRBvYn2G/EhM1wN2ZvnVE0/ebQ+QcgDfQcTQI/YgsWjpLNbqyLGqwWyZkZCt+2O+9yA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=H98I8P6tAF8SV0fMbBr7XkMZcpT8SHAdmIPkz+Ze0mQ8iieyAUSEAsWrspYe9MZ8qu2kVEhD0UO8TMXu+RHOWvR2o9iesqrranmdOHMOB6VYl4E40RFQXbJwhgTekIgpV+777kSrT4hTVTT89pV6iSdqab/L7/wDFirM4g/GFy+AU2SFylqjsVGW1VNnCP1DMN+pXLdiVI+006NTKO9xgmyowSgPq58AzdXd3t70vWniTAN1q0xgqtqDsxCROmtH5XFfDFioEMbPdV7KXRXi2bebdOdCwazH3MZh0QR99jjdODRWFbGrNq4Eog2n/9puvNDCAjmM0NINecg/H3lJZw==
  • Authentication-results: citrix.com; dkim=none (message not signed) header.d=none;citrix.com; dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Thu, 02 Sep 2021 06:45:43 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.09.2021 20:11, Julien Grall wrote:
> On 01/09/2021 14:39, Jan Beulich wrote:
>> back in 2016 Andrew added code to x86'es variant to avoid interleaving
>> of output. The same issue ought to exist on Arm.
> 
> Agree. I guess we got away so far because it is pretty rare to have two 
> CPUs running at the same.

I guess you've meant "crashing", not "running"?

>> The lock acquired,
>> or more importantly the turning off of IRQs while doing so, is now
>> getting in the way of having PVH Dom0's state dumped the 2nd time. 
> 
> I am not quite too sure to understand the problem with PVH dom0. Do you 
> have a pointer to the issue?

Locking in both cases: For registers it was VMX'es vmx_vmcs_enter()
acquiring a (non-IRQ-safe) lock, and for the stack it is the P2M lock.
Neither can / should sensibly be made IRQ-safe.

>> For
>> register state I did find a sufficiently simple (yet not pretty)
>> workaround. For the stack, where I can't reasonably avoid using p2m
>> functions, this is going to be more difficult. >
>> Since I expect Arm to want to also have interleave protection at some
>> point, and since Arm also acquires the p2m lock while accessing Dom0's
>> stacks, I wonder whether anyone has any clever idea on how to avoid
>> the (valid) triggering of check_lock()'s assertion without intrusive
>> changes. (As to intrusive changes - acquiring the p2m lock up front in
>> recursive mode, plus silencing check_lock() for nested acquires of a
>> lock that's already being held by a CPU was my initial idea.)
> 
> At least one Arm, the P2M lock is a rwlock which is not yet recursive. 

Right; the same is tru on x86. Hence the expected intrusiveness.

> But then it feels to me that this solution is only going to cause us 
> more trouble in the future.

Same here - we'd need to be very careful not only when making such
a change, but also going forward. Hence my desire to come up with a
better approach.

> I looked at the original commit to find out the reason to use the 
> console lock. AFAICT, this was to allow console_force_unlock() to work 
> properly. But it is not entirely clear why we couldn't get a new lock 
> (with IRQ enabled) that could be forced unlocked in that function.
> 
> Can either you or Andrew clarify it?

AIUI any new lock would need to be IRQ-safe as well, as the lock
would be on paths taken to output stuff when the system crashed.
Hence it would be pointless to introduce yet another lock when the
one we have is already good enough. But I may be missing aspects,
in which case I'd have to defer to Andrew.

> The other solution I can think off is buffering the output for 
> show_registers and only print it once at the end. The downside is we may 
> not get any output if there is an issue in the middle of the dump.

Indeed; I'd prefer to avoid that, the more that it may be hard to
predict how much output there's going to be.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.