Re: Ryzen 6000 (Mobile)


  • To: Dylanger Daly <dylangerdaly@xxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 16 Aug 2022 13:22:58 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Delivery-date: Tue, 16 Aug 2022 11:23:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.08.2022 18:54, Dylanger Daly wrote:
> Please see the attached dom0 dmesg log, verbose lspci output and a tar of all 
> SSDT and DSDT decompiled ACPI tables.

The only way I can currently explain all aspects of the behavior that
I'm aware of is for Dom0's kernel somehow not identifying the page
that ACPI wants to map (via ioremap_cache()) as identity mapped. As
far as ACPI goes, this is what I read out of the tables:

In SSDT27.dsl we have

    Scope (\_SB.PCI0.GP17.AZAL)
    {
        Method (_PS0, 0, NotSerialized)  // _PS0: Power State 0
        {
            Acquire (\M27E, 0xFFFF)
            M460 ("FEA-ASL-\\_SB.PCI0.PBC.AZAL._PS0 CpmAzaliaPresentState = 1\n", Zero, Zero, Zero, Zero, Zero, Zero)
            M279 = One
            M276 ()
            Release (\M27E)
        }

M276() then invokes

                Local0 = M017 (Zero, 0x08, One, 0x19, Zero, 0x08)

with M017() located in SSDT16.dsl:

    Method (M017, 6, Serialized)
    {
        Local0 = M083 /* \M083 */
        Local1 = (M083 >> 0x14)
        Local2 = (Local1 & 0x0F00)
        Local2 += 0x0100
        If (((Local1 + Arg0) >= Local2))
        {
            Local3 = 0x7FFFFFFF
            Local3 |= 0x80000000
            Local4 = ((Local3 >> Arg4) & (Local3 >> (0x20 - Arg5)
                ))
            Return (Local4)
        }

        Local0 += (Arg0 << 0x14)
        Local0 += (Arg1 << 0x0F)
        Local0 += (Arg2 << 0x0C)
        Return (M013 (Local0, Arg3, Arg4, Arg5))
    }

M013 carries out the actual memory access (32 bits at offset 0x19 from
Local0 that was determined here; oddly enough a mis-aligned access,
but that itself isn't a problem). The base address therefore is M083
offset by (0 << 0x14) + (8 << 0xf) + (1 << 0xc) = 0x41000 if I got
things right.
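
To make the arithmetic above easy to re-check, here is a minimal Python
sketch of what M017 appears to do before it hands off to M013. The
function names and the sample M083 value are hypothetical; only the
shifts, masks and constants are taken from the ASL quoted above. The
shift amounts (0x14, 0x0F, 0x0C) match the usual PCI ECAM layout of
bus, device and function, which suggests but doesn't confirm that M083
holds an MMCFG base address.

```python
def m017_target(m083, arg0, arg1, arg2):
    """Mirror M017's address computation.

    Returns the address that would be passed to M013, or None when the
    range check fails and M017 returns a mask instead of touching memory.
    """
    local1 = m083 >> 0x14
    local2 = (local1 & 0x0F00) + 0x0100
    if local1 + arg0 >= local2:
        return None
    return m083 + (arg0 << 0x14) + (arg1 << 0x0F) + (arg2 << 0x0C)


def m017_fallback(arg4, arg5):
    """Value M017 returns when the range check fails (no memory access)."""
    local3 = 0x7FFFFFFF | 0x80000000  # all 32 bits set
    return (local3 >> arg4) & (local3 >> (0x20 - arg5))


# ASSUMPTION: 0xF0000000 is a made-up M083 value, used only to show the offset.
hypothetical_m083 = 0xF0000000
target = m017_target(hypothetical_m083, 0, 0x08, 1)
print(hex(target - hypothetical_m083))  # offset added for (0, 8, 1)
```

With the arguments M276() passes (Zero, 0x08, One), the offset added to
M083 comes out as 0x41000, matching the figure above. The real value of
M083 still has to be read from the machine.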

M083 in turn is a field in

    OperationRegion (CPNV, SystemMemory, 0x7AF67018, 0x000100F7)
    Field (CPNV, AnyAcc, Lock, Preserve)
    {
        M082,   32, 
        M083,   32, 
        M084,   32, 
        ...

so the first few words of machine memory at 0x7af67018 would be of
interest (assuming of course that address doesn't change across
boots). 0x7af67018 itself is within the ACPI NVS range. Could you
perhaps obtain this from one of the /proc or /sys interfaces (perhaps
from a native kernel), or should I make a debugging patch for the
hypervisor? (Making one right away, with further logging added,
doesn't seem useful until it's clear whether you can actually also
observe output slightly before the actual crash, which has a risk of
being overwritten or scrolling off the screen.)
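
In case it helps with obtaining those words, below is a rough Python
sketch that decodes the first three 32-bit fields of the CPNV region
from any seekable binary file object. The helper name is hypothetical,
and whether /dev/mem (or any other interface) actually lets you read
that address is an assumption: kernels built with CONFIG_STRICT_DEVMEM
may refuse the access, and the result under Xen may differ from what a
native kernel sees.

```python
import io
import struct

CPNV_BASE = 0x7AF67018  # from the OperationRegion (CPNV, ...) declaration


def read_cpnv_head(mem, base=CPNV_BASE):
    """Read M082, M083 and M084 (three little-endian 32-bit fields)."""
    mem.seek(base)
    raw = mem.read(12)
    if len(raw) != 12:
        raise OSError("short read at %#x" % base)
    m082, m083, m084 = struct.unpack("<3I", raw)
    return {"M082": m082, "M083": m083, "M084": m084}


# Self-check against an in-memory buffer with made-up contents.
fake = io.BytesIO(b"\x00" * 16 + struct.pack("<3I", 1, 0xF0000000, 3))
print({k: hex(v) for k, v in read_cpnv_head(fake, base=16).items()})

# Hypothetical use on real hardware (needs root, may be blocked):
#   with open("/dev/mem", "rb") as mem:
#       print(read_cpnv_head(mem))
```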

The situation of course isn't helped by either the kernel's PFN <-> MFN
translation asymmetry in pte_pfn_to_mfn() or pte_mfn_to_pfn()'s
anomaly (as already noted over two years ago in
https://lists.xen.org/archives/html/xen-devel/2020-05/msg00549.html),
although the exception error code suggests that the former is what is
getting in the way (and what would then also result in an entirely
silent mapping failure). While I would like to patch the kernel at
least enough for the PFN/MFN to survive and hence appear in the page
table entry dump associated with the page fault, I'm afraid the
resulting entry could be recognized as a swap entry. Such a patch
could hence only be used for debugging purposes when no swap space is
in use.

Jan