Re: [PATCH 0/4] x86/PVH: Dom0 building adjustments
On 01.09.2021 18:13, Roger Pau Monné wrote:
> On Wed, Sep 01, 2021 at 04:19:40PM +0200, Jan Beulich wrote:
>> On 01.09.2021 15:56, Roger Pau Monné wrote:
>>> On Tue, Aug 31, 2021 at 10:53:59AM +0200, Jan Beulich wrote:
>>>> On 30.08.2021 15:01, Jan Beulich wrote:
>>>>> The code building PVH Dom0 made use of sequences of P2M changes
>>>>> which are disallowed as of XSA-378. First of all population of the
>>>>> first Mb of memory needs to be redone. Then, largely as a
>>>>> workaround, checking introduced by XSA-378 needs to be slightly
>>>>> relaxed.
>>>>>
>>>>> Note that with these adjustments I get Dom0 to start booting on my
>>>>> development system, but the Dom0 kernel then gets stuck. Since it
>>>>> was the first time for me to try PVH Dom0 in this context (see
>>>>> below for why I was hesitant), I cannot tell yet whether this is
>>>>> due further fallout from the XSA, or some further unrelated
>>>>> problem.
>>>
>>> Iff you have some time could you check without the XSA applied? I have
>>> to admit I haven't been testing staging, so it's possible some
>>> breakage as slipped in (however osstest seemed fine with it).
>>
>> Well, I'd rather try to use the time to find the actual issue. From
>> osstest being fine I'm kind of inferring this might be machine
>> specific, or this might be due to yet some other of the overly many
>> patches I'm carrying. So if I can't infer anything from the stack
>> once I can actually dump that, I may indeed need to bisect my pile,
>> which would then also include the XSA-378 patches (as I didn't have
>> time to re-base so far).
>>
>>>>> Dom0's BSP is in VPF_blocked state while all APs are
>>>>> still in VPF_down. The 'd' debug key, unhelpfully, doesn't produce
>>>>> any output, so it's non-trivial to check whether (like PV likes to
>>>>> do) Dom0 has panic()ed without leaving any (visible) output.
>>>
>>> Not sure it would help much, but maybe you can post the Xen+Linux
>>> output?
>>
>> There's no Linux output yet by that point (and either
>> "earlyprintk=xen" doesn't work in PVH mode, or it's even too early
>> for that). All Xen has to say is
>>
>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>> (XEN) vmx.c:3265:d0v0 RDMSR 0x0000064e unimplemented
>> (XEN) vmx.c:3265:d0v0 RDMSR 0x00000034 unimplemented
>
> Weird, I don't see why earlyprintk=xen shouldn't work in PVH mode,
> unless it's not properly wired up. Certainly needs checking and
> fixing, or else we won't be able to make much progress I think.

Right - I'm intending to check this, including whether at least
xen_raw_console_write() would work.

>>>> Correction: I did mean '0' here, producing merely
>>>>
>>>> (XEN) '0' pressed -> dumping Dom0's registers
>>>> (XEN) *** Dumping Dom0 vcpu#0 state: ***
>>>> (XEN) *** Dumping Dom0 vcpu#1 state: ***
>>>> (XEN) *** Dumping Dom0 vcpu#2 state: ***
>>>> (XEN) *** Dumping Dom0 vcpu#3 state: ***
>>>>
>>>> 'd' output supports the "system is idle" that was also visible from
>>>> 'q' output.
>>>
>>> Can you dump the state of the VMCS and see where the IP points to in
>>> Linux?
>>
>> Both that and the register dumping I have meanwhile working tell
>> me that it's the HLT in default_idle(). IOW Dom0 gives the impression
>> of also being idle, at the first glance. The stack pointer, however,
>> is farther away from the stack top than I would have expected, so it
>> may still have entered default_idle() for other reasons.
>>
>> The VMCS also told me that the last VM entry was to deliver an
>> interrupt at vector 0xf3 (i.e. the "callback" one).
>
> That's all quite weird. Did dom0 setup the vCPU timer?

Ah - I had meant to check active timers, but then forgot. Otoh I
thought I could observe vCPU0 waking up from HLT, as RIP in the
registers dumped has been pointing either at it or right past it.
Now that I write this I'm wondering though whether that's an artifact
rather than reflection of something that's really happening, in
particular because of this

(XEN) RSP = 0xffffffff81c03eb8 (0xffffffff81c03eb8)  RIP = 0xffffffff814be422 (0xffffffff814be423)

in the VMCS dump.

> What version of Linux are you using?

5.13.2; didn't get around to switching to 5.14 yet, but I also don't
expect this to make a difference.

> It seems to get stuck very early (or either fail to output anything
> while booting), which seems unlikely to be related to your specific
> hardware.

Well, it can't be extremely early - I see the ACPI IRQ getting set up
(from "iommu=debug" output mentioning GSI 9), and I see PCI device
BARs being played with (from debug messages I had added to vPCI to
monitor what P2M adjustments are being requested). As said on another
sub-thread, I get all the way through start_kernel() and rest_init(),
just that apparently some of the steps don't do what they're supposed
to do.

I'm meanwhile wondering whether I'm using a badly configured kernel,
i.e. whether there are any Kconfig settings which I ought to enable,
but which aren't "select"-ed nor have proper "depends on". What I did
is simply take my XEN_PV=y config, replacing that by XEN_PVH=y. I did
observe that this let XEN_DOM0 go off, but according to my checking
(at the time) nothing this crucial should have been affected by that.

Jan