
x86/HVM: Linux'es apic_pending_intr_clear() warns about stale IRR


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 31 Oct 2022 16:55:49 +0100
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Mon, 31 Oct 2022 15:56:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello,

quite likely this isn't new, but I've ended up noticing it only recently:
On an oldish system where I hand a HVM guest an SR-IOV NIC (not sure yet
whether that actually matters) all APs have that warning issued, with all
reported values zero except for the very first IRR one - that's 00080000.
Which is suspicious by itself, for naming vector 0x13, i.e. below 0x20
and hence within CPU exception range.
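
To spell out the decoding: the IRR is exposed as eight 32-bit registers,
with register N covering vectors N*32 .. N*32+31. Below is a minimal
sketch of that mapping; the helper name and the macro are made up for
illustration and come from neither Linux nor Xen.

```c
#include <assert.h>
#include <stdint.h>

/* IRR/ISR are each split across eight 32-bit registers (vectors 0..255). */
#define APIC_IRR_REGS 8 /* hypothetical name, for illustration only */

/*
 * Return the lowest vector flagged in IRR register 'reg_index' holding
 * value 'val', or -1 if no bit is set. Hypothetical helper.
 */
static int irr_lowest_vector(unsigned int reg_index, uint32_t val)
{
    assert(reg_index < APIC_IRR_REGS);

    for (unsigned int bit = 0; bit < 32; bit++)
        if (val & (UINT32_C(1) << bit))
            return (int)(reg_index * 32 + bit);

    return -1;
}
```

With the first IRR register reading 00080000, bit 19 is the only one set,
which gives vector 0x13.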

For one I wonder about their logic: The function is called after setting
TPR to 0x10, which prevents the handling of vectors below 0x20 (and in
particular their propagation from IRR to ISR, if my understanding of the
process is right and the convoluted and imo partly incomplete SDM
description hasn't confused me). Plus the function runs when IRQs are
still off, which is another reason why nothing would ever propagate from
IRR to ISR while the function performs its work. Yet a comment there says

        /*
         * If the ISR map is not empty. ACK the APIC and run another round
         * to verify whether a pending IRR has been unblocked and turned
         * into a ISR.
         */

suggesting IRR bits could "promote" to ISR ones. And this, to me, is the
only justification for warning about leftover IRR bits (whereas I
certainly agree that the logic should result in all clear ISR bits, and
hence warning when one is still set is appropriate).
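
For reference, here is how I read the SDM's priority rule, as a rough
sketch only. The function names are invented for illustration, and the
model covers fixed interrupts only, ignoring everything else (NMI, SMI,
ExtINT, and so on). 'isrv' stands for the highest vector currently set in
the ISR, or 0 if there is none.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Processor priority as I understand the SDM: if the TPR's priority
 * class is at least that of the highest in-service vector, PPR equals
 * TPR; otherwise PPR takes the in-service vector's class with a zero
 * sub-class. Hypothetical helper.
 */
static uint8_t apic_ppr(uint8_t tpr, uint8_t isrv)
{
    if ((tpr & 0xf0) >= (isrv & 0xf0))
        return tpr;

    return isrv & 0xf0;
}

/*
 * A pending (IRR) vector can only move to the ISR if its priority class
 * is strictly above the PPR's priority class. Hypothetical helper.
 */
static bool vector_deliverable(uint8_t vector, uint8_t tpr, uint8_t isrv)
{
    return (vector & 0xf0) > (apic_ppr(tpr, isrv) & 0xf0);
}
```

Under this reading, with TPR at 0x10 and an empty ISR, vector 0x13 is
never deliverable, while anything from 0x20 upwards is.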

And then I got puzzled by our logic: vlapic_get_ppr() is called only by
vlapic_set_ppr(), vlapic_lowest_prio(), and vlapic_read_aligned(), but
in particular not by vlapic_has_pending_irq(). While it looks like we
don't really ignore TPR during delivery, this appears to be a strange
split approach: hvm_interrupt_blocked() checks TPR, whereas
vlapic_has_pending_irq() checks ISR. I wonder if subtle issues can't
result from that ...

Of course I'm yet to figure out how IRR bit 0x13 ends up being set in
the first place.

Any correction to my understanding as well as any useful insight would
be appreciated.

Jan



 

