[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 0/1] xen: fix HVM kexec kernel panic


  • To: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
  • From: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
  • Date: Mon, 28 Feb 2022 17:18:17 -0800
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=78X/H5zf90bXeuGJATSx0g0BM4dbFOBb56DOWNT6poM=; b=OEaQyMXyU4aNN05K0kCc3JblpEEtw6wADCUdRail+gitWE8MzXqdo2IsYoPP1AA1K7cx+IrtU476KNXYhZubp0FzAzt147ZrhB2j2ZKWnsklJxgud3qhLcCZwwJP0XpzgDecT4VOVGmrvMP6q2BGymGvPHmbh09CNRF5DHS2un1LCOL8a35S2MkXLIhSimnFJj1AZM7NwVHYPdihUn/4ceAW+Ygi7vBmz2rrlTKMb3bkAF0R1UDYmpdc7EtmTnaD5cWw20MX7jIOaaWm5t0S3HwzJ6H05d6cnHmpHJYKJzKx34kZiGDXTWbZpZCuAQf3IXrZmLbRxnUUqBeGLzaWmg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XXlyH51NZgdK5KZaemroAOne0FGq38YGpd+k1eZ0q13+3jHwkDU+vys24zSHMtHPXZYW/sRyq8xvODGm9aCxaf8zt9J++UlLXLKhHU3ZJLwq5Jp8zCuEuo5SPzaYareOSr4TePEPhLkRCuQA6NdUBvdQ8N65zqKWVyA2yynaH9pw21x6YafPn56CC+GPRDX/XJbz2aGfzI4kq2FUdCkElPCPYw35aUeXq49jgbE7x+BYb+G0pMmT3+l9SNflSt0o9jCvNU7G88Nw3kPQrgcf6Oj6jOl0hzgNO3p8RCDLPN1R+MjKxmHjH6sR3JwJ2eV/St/zLCNXyIF1BvHycpiihg==
  • Cc: linux-kernel@xxxxxxxxxxxxxxx, jgross@xxxxxxxx, sstabellini@xxxxxxxxxx, tglx@xxxxxxxxxxxxx, mingo@xxxxxxxxxx, bp@xxxxxxxxx, dave.hansen@xxxxxxxxxxxxxxx, hpa@xxxxxxxxx, joe.jin@xxxxxxxxxx
  • Delivery-date: Tue, 01 Mar 2022 01:19:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Boris,

On 2/28/22 12:45 PM, Boris Ostrovsky wrote:
> 
> 
> On 2/25/22 8:17 PM, Dongli Zhang wrote:
>> Hi Boris,
>>
>> On 2/25/22 2:39 PM, Boris Ostrovsky wrote:
>>>
>>> On 2/24/22 4:50 PM, Dongli Zhang wrote:
>>>> This is the v3 of the patch to fix xen kexec kernel panic issue when the
>>>> kexec is triggered on VCPU >= 32.
>>>>
>>>> PANIC: early exception 0x0e IP 10:ffffffffa96679b6 error 0 cr2 0x20
>>>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
>>>> 5.17.0-rc4xen-00054-gf71077a4d84b-dirty #1
>>>> [    0.000000] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 12/15/2020
>>>> [    0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
>>>> ... ...
>>>> [    0.000000] RSP: 0000:ffffffffaae03e10 EFLAGS: 00010082 ORIG_RAX:
>>>> 0000000000000000
>>>> [    0.000000] RAX: 0000000000000000 RBX: 0000000000010000 RCX:
>>>> 0000000000000002
>>>> [    0.000000] RDX: 0000000000000003 RSI: ffffffffaac37515 RDI:
>>>> 0000000000000020
>>>> [    0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09:
>>>> 0000000000000001
>>>> [    0.000000] R10: ffffffffaae03df8 R11: ffffffffaae03c68 R12:
>>>> 0000000040000004
>>>> [    0.000000] R13: ffffffffaae03e50 R14: 0000000000000000 R15:
>>>> 0000000000000000
>>>> [    0.000000] FS:  0000000000000000(0000) GS:ffffffffab588000(0000)
>>>> knlGS:0000000000000000
>>>> [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [    0.000000] CR2: 0000000000000020 CR3: 00000000ea410000 CR4:
>>>> 00000000000406a0
>>>> [    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>> 0000000000000000
>>>> [    0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>> 0000000000000400
>>>> [    0.000000] Call Trace:
>>>> [    0.000000]  <TASK>
>>>> [    0.000000]  ? xen_clocksource_read+0x24/0x40
>>>
>>>
>>> This is done to set xen_sched_clock_offset which I think will not be used 
>>> for a
>>> while, until sched_clock is called (and the other two uses are for
>>> suspend/resume)
>>>
>>>
>>> Can we simply defer 'xen_sched_clock_offset = xen_clocksource_read();' until
>>> after all vcpu areas are properly set? Or are there other uses of
>>> xen_clocksource_read() before ?
>>>
>>
>> I have tested that below patch will panic kdump kernel.
>>
> 
> 
> 
> Oh well, so much for that then. Yes, sched_clock() is at least called from
> printk path.
> 
> 
> I guess we will have to go with v2 then, we don't want to start seeing time
> going back, even if only with older hypervisors. The only thing I might ask is
> that you roll the logic inside xen_hvm_init_time_ops(). Something like
> 
> 
> xen_hvm_init_time_ops()
> {
>     /*
>      * Wait until per_cpu(xen_vcpu, 0) is initialized which may happen
>      * later (e.g. when kdump kernel runs on >=MAX_VIRT_CPUS vcpu)
>      */
>     if (__this_cpu_read(xen_vcpu_nr(0)) == NULL)
>         return;
> 

I think you meant __this_cpu_read(xen_vcpu).

I will call xen_hvm_init_time_ops() at both places, and move the logic into
xen_hvm_init_time_ops().

Thank you very much!

Dongli Zhang



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.