[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 0/1] xen: fix HVM kexec kernel panic


  • To: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • From: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
  • Date: Mon, 28 Feb 2022 20:56:26 -0800
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=LJao+ms5p15QT1W5aB9NeYZncCwkzWmTSSzyACuGHiI=; b=gUMUYIdZt/WdCjZArcvPOlc0yZsEF6rGo2JYfKf7GDmfPEfKsXdfKOjzz2otQk18l61nG0+RQMCqPozJ4jQ8f2angoA3BWQMxF7HGJd/dxqxRXCmOwGEyw9hlx0FmG4AVSVEuN7dpQ4BaxGnKejhyJpP9FkLdO1wvnwJBQNFAkVYngXJ2eLoN0puOdF9xDgN8Bp/TpY+QhQ4EE6pxCyIluKwtO7y7949ZfKcrl/7xrNI2keGZo5qHJJE9smMmOiFjld5AJJdK71SoMfl0kMqs5kmu8h2e/jaA0Y7KBZwbo4kilsHiNQSg8XOFPmtO7JLrpRbEx9LS0mEzqBlkZs+bQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=KRAzGOL40RX5LFXlESdGAPECtSZEJlvKqEPq4lsHkHDuLAEWCUD/JHNlojquJaEJdNpunIj01Jf4dy+hgNuHDi9fTZs/4CRFQIBgbvZRcY8bAaGkbIGtTE46KyvDqFQOXTCOCYYKQiwgXAZfhteZDxPw3gaT/V6yyQVIQgJUqBxtuHTZZmkF3Wpu4OtMKM9Sv26CoyZ3BCrKItu+Z3qOOl+RBUEjR7V/3PlH0QMhC2DEYkeTeU6aSUDQ/fhiHA18RhY2WJn/a6XW2tVGxqka+4OWwipEaEIDF9qxgpLjple4wCpcBjvgGUvuWNDsObpvyM8A5+zBkUOsRnIeRLLbGg==
  • Cc: linux-kernel@xxxxxxxxxxxxxxx, jgross@xxxxxxxx, sstabellini@xxxxxxxxxx, tglx@xxxxxxxxxxxxx, mingo@xxxxxxxxxx, bp@xxxxxxxxx, dave.hansen@xxxxxxxxxxxxxxx, hpa@xxxxxxxxx, joe.jin@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
  • Delivery-date: Tue, 01 Mar 2022 04:57:22 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Boris,

On 2/28/22 5:18 PM, Dongli Zhang wrote:
> Hi Boris,
> 
> On 2/28/22 12:45 PM, Boris Ostrovsky wrote:
>>
>>
>> On 2/25/22 8:17 PM, Dongli Zhang wrote:
>>> Hi Boris,
>>>
>>> On 2/25/22 2:39 PM, Boris Ostrovsky wrote:
>>>>
>>>> On 2/24/22 4:50 PM, Dongli Zhang wrote:
>>>>> This is the v3 of the patch to fix xen kexec kernel panic issue when the
>>>>> kexec is triggered on VCPU >= 32.
>>>>>
>>>>> PANIC: early exception 0x0e IP 10:ffffffffa96679b6 error 0 cr2 0x20
>>>>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
>>>>> 5.17.0-rc4xen-00054-gf71077a4d84b-dirty #1
>>>>> [    0.000000] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 12/15/2020
>>>>> [    0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
>>>>> ... ...
>>>>> [    0.000000] RSP: 0000:ffffffffaae03e10 EFLAGS: 00010082 ORIG_RAX:
>>>>> 0000000000000000
>>>>> [    0.000000] RAX: 0000000000000000 RBX: 0000000000010000 RCX:
>>>>> 0000000000000002
>>>>> [    0.000000] RDX: 0000000000000003 RSI: ffffffffaac37515 RDI:
>>>>> 0000000000000020
>>>>> [    0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09:
>>>>> 0000000000000001
>>>>> [    0.000000] R10: ffffffffaae03df8 R11: ffffffffaae03c68 R12:
>>>>> 0000000040000004
>>>>> [    0.000000] R13: ffffffffaae03e50 R14: 0000000000000000 R15:
>>>>> 0000000000000000
>>>>> [    0.000000] FS:  0000000000000000(0000) GS:ffffffffab588000(0000)
>>>>> knlGS:0000000000000000
>>>>> [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [    0.000000] CR2: 0000000000000020 CR3: 00000000ea410000 CR4:
>>>>> 00000000000406a0
>>>>> [    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>> 0000000000000000
>>>>> [    0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>>> 0000000000000400
>>>>> [    0.000000] Call Trace:
>>>>> [    0.000000]  <TASK>
>>>>> [    0.000000]  ? xen_clocksource_read+0x24/0x40
>>>>
>>>>
>>>> This is done to set xen_sched_clock_offset which I think will not be used 
>>>> for a
>>>> while, until sched_clock is called (and the other two uses are for
>>>> suspend/resume)
>>>>
>>>>
>>>> Can we simply defer 'xen_sched_clock_offset = xen_clocksource_read();' 
>>>> until
>>>> after all vcpu areas are properly set? Or are there other uses of
>>>> xen_clocksource_read() before ?
>>>>
>>>
>>> I have tested that below patch will panic kdump kernel.
>>>
>>
>>
>>
>> Oh well, so much for that then. Yes, sched_clock() is at least called from
>> printk path.
>>
>>
>> I guess we will have to go with v2 then, we don't want to start seeing time
>> going back, even if only with older hypervisors. The only thing I might ask 
>> is
>> that you roll the logic inside xen_hvm_init_time_ops(). Something like
>>
>>
>> xen_hvm_init_time_ops()
>> {
>>     /*
>>      * Wait until per_cpu(xen_vcpu, 0) is initialized which may happen
>>      * later (e.g. when kdump kernel runs on >=MAX_VIRT_CPUS vcpu)
>>      */
>>     if (__this_cpu_read(xen_vcpu_nr(0)) == NULL)
>>         return;
>>
> 
> I think you meant __this_cpu_read(xen_vcpu).
> 
> I will call xen_hvm_init_time_ops() at both places, and move the logic into
> xen_hvm_init_time_ops().
> 
> Thank you very much!
> 
> Dongli Zhang
> 


How about we do not move the logic into xen_hvm_init_time_ops()?

Suppose we move the logic into xen_hvm_init_time_ops() line 573, the line line
570 might be printed twice.


559 void __init xen_hvm_init_time_ops(void)
560 {
561         /*
562          * vector callback is needed otherwise we cannot receive interrupts
563          * on cpu > 0 and at this point we don't know how many cpus are
564          * available.
565          */
566         if (!xen_have_vector_callback)
567                 return;
568
569         if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
570                 pr_info("Xen doesn't support pvclock on HVM, disable pv 
timer");
571                 return;
572         }
573
574         xen_init_time_common();
575
576         x86_init.timers.setup_percpu_clockev = xen_time_init;
577         x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
578
579         x86_platform.set_wallclock = xen_set_wallclock;
580 }

I feel the code looks better if we keep the logic at caller side. Would you mind
letting me know your feedback?

Thank you very much!

Dongli Zhang



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.