
Hit ASSERT in credit2 code with NR_CPUS=1 build


  • To: Dario Faggioli <dfaggioli@xxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 9 Mar 2021 17:24:45 +0100
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 09 Mar 2021 16:25:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello,

While looking at the NR_CPUS == 1 build I realized I could reliably
trigger the following ASSERT by creating a guest (note that dom0 seems
to be fine):

(XEN) Assertion 'i != cpu' failed at credit2.c:1725
(XEN) ----[ Xen-4.15.0-rc  x86_64  debug=y  Tainted:   C   ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d040249399>] common/sched/credit2.c#runq_tickle+0x469/0x571
(XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor (d4v0)
(XEN) rax: ffffffffffffffff   rbx: 0000000000000000   rcx: 0000000000000000
(XEN) rdx: ffff83086c62feb0   rsi: 0000012774fba66c   rdi: ffff8307e11d5d40
(XEN) rbp: ffff83008c8c7cf8   rsp: ffff83008c8c7c68   r8:  ffff83086c66d6c0
(XEN) r9:  ffff82d0405d1218   r10: 0000000000000000   r11: ffff83086c631000
(XEN) r12: ffff83086c6437c0   r13: 0000000000000000   r14: ffff83086c62fe20
(XEN) r15: ffff82d0405d0320   cr0: 0000000080050033   cr4: 00000000003526e0
(XEN) cr3: 00000007e130d000   cr2: ffff88826910cb38
(XEN) fsb: 00007efee038b780   gsb: ffff888273400000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d040249399> (common/sched/credit2.c#runq_tickle+0x469/0x571):
(XEN)  ac ff 75 3d 0f 0b 0f 0b <0f> 0b c7 45 ac 00 00 00 00 48 8d 05 6f 7e 38 00
(XEN) Xen stack trace from rsp=ffff83008c8c7c68:
(XEN)    ffff83008c8c7c88 0000012774fba66c ffff82d0404ab4e0 ffff82d0405d0320
(XEN)    ffff82d0405d0320 ffff83086c62feb0 ffff83086c643760 ffffffff00000002
(XEN)    ffff83008c8c7cd8 ffff82d0402f9234 ffff8307e131e000 0000000000000000
(XEN)    ffff83008c8c7ce8 ffff83086c62feb0 ffff83086c62fe20 ffff83086c6437c0
(XEN)    0000012774fba66c ffff82d0405e64a0 ffff83008c8c7d58 ffff82d040249eb6
(XEN)    ffff83008c8c7fff ffff82d0404ab4e0 ffff83008c8c7d18 ffff83008c8c7d18
(XEN)    ffff83008c8c7d48 ffff83086c62fe20 ffff83086c66d620 ffff8307e11d5d40
(XEN)    ffff83086c66d620 0000000000000000 ffff83008c8c7d78 ffff82d040253752
(XEN)    ffff83086c61a000 ffff8307e11ca000 ffff83008c8c7da8 ffff82d040255d8b
(XEN)    ffff8307e11ca000 ffff83086c61a000 ffff8307e131e000 ffff83086c631000
(XEN)    ffff83008c8c7df8 ffff82d04031edd6 ffff83086c61a060 0000000000000296
(XEN)    0000012774fb8f84 ffff8307e11ca000 ffff83086c61a000 0000000000000001
(XEN)    0000012774fb8f84 0000000000000000 ffff83008c8c7e38 ffff82d040255e37
(XEN)    ffff83008c8c7e28 ffff83086c66d620 0000000000000000 ffff8307e11d5d40
(XEN)    0000000000000000 0000000000000001 ffff83008c8c7e98 ffff82d040256844
(XEN)    ffff83086c61a000 ffff83086c62fe20 ffff82d0403953d4 ffff83086c6437c0
(XEN)    deadbeefdeadf00d ffff82d0405d6e80 ffff82d0405d6e80 ffffffffffffffff
(XEN)    ffff83008c8c7fff 0000000000000000 ffff83008c8c7ed8 ffff82d04022dc68
(XEN)    ffff82d0403953d4 ffff83086c61a000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffff83008c8c7ee8 ffff82d04022dcfd
(XEN) Xen call trace:
(XEN)    [<ffff82d040249399>] R common/sched/credit2.c#runq_tickle+0x469/0x571
(XEN)    [<ffff82d040249eb6>] F common/sched/credit2.c#csched2_context_saved+0x27f/0x284
(XEN)    [<ffff82d040253752>] F common/sched/core.c#unit_context_saved+0x56/0x84
(XEN)    [<ffff82d040255d8b>] F sched_context_switched+0x14d/0x161
(XEN)    [<ffff82d04031edd6>] F context_switch+0x15b/0x11dd
(XEN)    [<ffff82d040255e37>] F common/sched/core.c#sched_context_switch+0x98/0x1e0
(XEN)    [<ffff82d040256844>] F common/sched/core.c#schedule+0x216/0x2e5
(XEN)    [<ffff82d04022dc68>] F common/softirq.c#__do_softirq+0x8a/0xb6
(XEN)    [<ffff82d04022dcfd>] F do_softirq+0x13/0x15
(XEN)    [<ffff82d040318176>] F x86_64/entry.S#process_softirqs+0x6/0x20
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'i != cpu' failed at credit2.c:1725
(XEN) ****************************************

This corresponds to the following chunk of code:

    for_each_cpu(i, &mask)
    {
        s_time_t score;

        /* Already looked at this one above */
        ASSERT(i != cpu); <====

        score = tickle_score(ops, now, new, i);

        if ( score > max )
        {
            max = score;
            ipid = i;
        }
    }

This is in runq_tickle. I'm afraid I have no clue what's going on. FTR,
a non-debug build with NR_CPUS == 1 does seem to work fine and I don't
see any ill effects, although the ASSERT is of course compiled out
there.
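
To make the failing condition concrete, here is a toy model of the
invariant that the ASSERT checks. This is a sketch only, not Xen code:
toy_mask_t and loop_would_assert are hypothetical names, and a plain
64-bit bitmask stands in for the real cpumask. The only thing it shows
is that the assertion fires exactly when cpu is still a member of the
mask being iterated. With NR_CPUS == 1 the only possible value of i is
0, which is also cpu in the trace above, so any iteration of the loop
at all would trip it. It says nothing about why that happens.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU mask: bit n set means CPU n is a candidate. */
typedef uint64_t toy_mask_t;

/*
 * Model of the quoted loop: walk every CPU set in mask and report
 * whether ASSERT(i != cpu) would fail, i.e. whether cpu is visited.
 */
static bool loop_would_assert(toy_mask_t mask, unsigned int cpu)
{
    for (unsigned int i = 0; i < 64; i++) {
        if (!(mask & (UINT64_C(1) << i)))
            continue;           /* models the iterator skipping clear bits */
        if (i == cpu)
            return true;        /* the point where ASSERT(i != cpu) fails */
    }
    return false;
}
```

For example, loop_would_assert(0x1, 0) is true (CPU 0 is still in the
mask), while loop_would_assert(0x0, 0) is false (empty mask, so the
loop body never runs).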

Thanks, Roger.



 

