
Re: [Xen-devel] PVH dom0 creation fails - the system freezes



>>> On 08.08.18 at 10:51, <roger.pau@xxxxxxxxxx> wrote:
> On Wed, Aug 08, 2018 at 09:43:39AM +0100, Paul Durrant wrote:
>> > -----Original Message-----
>> > From: Roger Pau Monne
>> > Sent: 08 August 2018 09:08
>> > To: bercarug@xxxxxxxxxx 
>> > Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>;
>> > xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>;
>> > David Woodhouse <dwmw2@xxxxxxxxxxxxx>;
>> > Jan Beulich <JBeulich@xxxxxxxx>; Belgun, Adrian <abelgun@xxxxxxxxxx>
>> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
>> > 
>> > On Wed, Aug 08, 2018 at 10:46:40AM +0300, bercarug@xxxxxxxxxx wrote:
>> > > On 08/02/2018 04:55 PM, Roger Pau Monné wrote:
>> > > > Please try to avoid top posting.
>> > > >
>> > > > On Thu, Aug 02, 2018 at 11:36:26AM +0000, Bercaru, Gabriel wrote:
>> > > > > I applied the patch mentioned, but the system fails to boot.
>> > > > > Instead, it drops to a BusyBox shell. It seems to be a file
>> > > > > system issue.
>> > > > So you have applied 173c7803592065d27bf2e60d50e08e197a0efa83 and it
>> > > > causes a regression for you?
>> > > >
>> > > > As I understand it, before applying 173c780359206 you were capable of
>> > > > booting the PVH Dom0, albeit with non-working USB?
>> > > >
>> > > > And after applying 173c780359206 you are no longer able to boot?
>> > > Right, after applying 173c780359206 the system fails to boot (it drops to
>> > > the BusyBox shell).
>> > > > > Following is a sequence of the boot log regarding the file system
>> > > > > issue.
>> > > > At least part of the issue seems to be that the IO-APIC information
>> > > > provided to Dom0 is wrong (from the attached log):
>> > > >
>> > > > [    0.000000] IOAPIC[0]: apic_id 2, version 152, address 0xfec00000, GSI 0-0
>> > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>> > > > [    0.000000] Failed to find ioapic for gsi : 2
>> > > > [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>> > > > [    0.000000] Failed to find ioapic for gsi : 9
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 1
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 3
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 4
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 5
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 6
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 7
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 8
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 9
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 10
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 11
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 12
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 13
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 14
>> > > > [    0.000000] ERROR: Unable to locate IOAPIC for GSI 15
>> > > >
>> > > > Can you try to boot with just the staging branch (current commit is
>> > > > 008a8fb249b9d433) and see how that goes?
>> > > >
>> > > > Thanks, Roger.
>> > > >
>> > > I recompiled Xen using the staging branch, commit 008a8fb249b9d433, and
>> > > the system boots,
>> > 
>> > OK, so your issues were not caused by 173c780359206 then?
>> > 
>> > 008a8fb249b9d433 already contains 173c780359206 because it was
>> > committed earlier. In any case it's good to know you are able to boot
>> > (albeit with issues) using the current staging branch.
>> > 
>> > > however the USB problem persists. I was able to log in using the serial
>> > > port and after executing
>> > 
>> > Yes, the fixes for this issue have not been committed yet, see:
>> > 
>> > https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00528.html
>> > 
>> > If you want you can give this branch a try, it should hopefully solve
>> > your USB issues.
>> > 
>> > > "xl list -l" the memory decrease problem appeared.
>> > 
>> > Yup, I will look into this now in order to find some kind of
>> > workaround.
>> > 
>> > > I attached the boot log. Following are some lines extracted from the log,
>> > > regarding the USB devices problem:
>> > >
>> > > [    5.864084] usb 1-1: device descriptor read/64, error -71
>> > >
>> > > [    7.564025] usb 1-1: new full-speed USB device number 4 using xhci_hcd
>> > > [    7.571347] usb 1-1: Device not responding to setup address.
>> > >
>> > > [    8.008031] usb 1-1: device not accepting address 4, error -71
>> > >
>> > > [    8.609623] usb 1-1: device not accepting address 5, error -71
>> > >
>> > >
>> > > At the beginning of the log, there is a message regarding
>> > > iommu_inclusive_mapping:
>> > >
>> > >
>> > > (XEN) [VT-D]found ACPI_DMAR_RMRR:
>> > > (XEN) [VT-D]  RMRR address range 3e2e0000..3e2fffff not in reserved memory;
>> > > need "iommu_inclusive_mapping=1"?
>> > > (XEN) [VT-D] endpoint: 0000:00:14.0
>> > >
>> > >
>> > > I mention that I tried to boot again using this command line option, but
>> > > this message and the USB messages persist.
> 
> Does USB work despite the warning message?
> 
>> > Yes, iommu_inclusive_mapping doesn't work for PVH, that's what my
>> > patch series is trying to address. The error is caused by
>> > missing/wrong RMRR regions in the ACPI tables.
>> > 
>> 
>> Looks like this warning is suggesting that there is an RMRR that falls
>> outside of an E820 reserved region. For PV I guess that 'inclusive' will fix
>> this, but for PVH 'reserved' isn't going to fix it. I hope that the range at
>> least falls in a hole, so maybe we also need a dom0_iommu=holes option too?
>> Ugly, but maybe necessary.
> 
> I wanted to avoid adding such an option because I think it's going to
> interact quite badly with BAR mappings.
> 
> Maybe it would make sense to add RMRR regions as long as they don't
> reside in a RAM region and just print the warning message?

But the BAR problem would then still exist, unless we hand Dom0 a
fixed up E820 table.

Jan
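
To make the RMRR question above concrete, here is a minimal sketch of the
check being discussed: classify an RMRR range against an E820-style memory
map and only refuse to map it when it overlaps RAM. This is not Xen code.
The names Region, classify_rmrr and should_map are hypothetical, the memory
map in EXAMPLE_E820 is invented for illustration, and only the RMRR range
3e2e0000..3e2fffff is taken from the log quoted in this thread.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    start: int  # first byte of the region (inclusive)
    end: int    # last byte of the region (inclusive)
    kind: str   # simplified E820 type: "ram" or "reserved"


def classify_rmrr(e820: List[Region], start: int, end: int) -> str:
    """Classify the inclusive range [start, end] against the memory map.

    Returns:
      "ram"      - the range overlaps at least one RAM region
      "hole"     - the range overlaps no region at all
      "reserved" - the range lies entirely inside one reserved region
      "partial"  - the range touches non-RAM regions but is not fully
                   contained in a reserved one
    """
    overlapping = [r for r in e820 if r.start <= end and start <= r.end]
    if any(r.kind == "ram" for r in overlapping):
        return "ram"
    if not overlapping:
        return "hole"
    if any(r.kind == "reserved" and r.start <= start and end <= r.end
           for r in overlapping):
        return "reserved"
    return "partial"


def should_map(kind: str) -> bool:
    """Assumed policy: map anything that does not overlap RAM."""
    return kind != "ram"


# Hypothetical memory map (assumption, not taken from the reported system).
EXAMPLE_E820 = [
    Region(0x00000000, 0x3e2dffff, "ram"),
    Region(0x3e300000, 0x3fffffff, "reserved"),
]

# RMRR range from the "(XEN) [VT-D]" message quoted above.
RMRR_START, RMRR_END = 0x3e2e0000, 0x3e2fffff

if __name__ == "__main__":
    kind = classify_rmrr(EXAMPLE_E820, RMRR_START, RMRR_END)
    if kind != "reserved":
        print("warning: RMRR %x..%x not in reserved memory (%s)"
              % (RMRR_START, RMRR_END, kind))
    print("map into IOMMU:", should_map(kind))
```

Note that a check like this says nothing about the BAR concern raised
above: a range classified as "hole" may still be a place where Dom0 later
positions a BAR unless the E820 table handed to Dom0 is adjusted.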

