Hello,
On Mon, Dec 03, 2018 at 09:06:37AM -0700, Rian Quinn wrote:
> > Can you trace this to the Linux code that's actually making the call
> > by injecting a trap when this happens?
>
> Yes, we can. In some cases, we have to manually backtrace, but so far
> we have been able to map resources to the actual source code.
>
> > Serial port poking?
>
> This would be a great one to locate in the kernel. I suspect that
> serial is the case, but if that is true, something is a bit wrong as
> once again, this device doesn't exist without QEMU.
Maybe Linux pokes at this port to check whether the device exists?

The fact that the device doesn't exist doesn't prevent a guest from
poking at this port, and IMO it's a legit thing to do. Returning all
1s (like bare metal does) should be OK, and would actually signal to
Linux that there's no register there and thus no device.
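To illustrate the behaviour I mean, here is a minimal sketch of a port
read handler, not actual Xen or Linux code. The function names, the
`uart_present` switch and the choice of COM1 (0x3F8) as the example
range are all assumptions made for illustration; the thread doesn't
say which port is being hit.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: legacy COM1 register window. */
#define COM1_BASE 0x3F8u
#define COM1_LEN  8u

/* All ones, truncated to the access width (1, 2 or 4 bytes). */
uint32_t all_ones(unsigned int size)
{
    assert(size == 1 || size == 2 || size == 4);
    return size == 4 ? 0xFFFFFFFFu : (1u << (size * 8)) - 1u;
}

/*
 * Hypervisor side (hypothetical): emulate an IN from 'port'.
 * Ports with no device behind them read as all ones, which is what a
 * guest would typically see on bare metal for an unpopulated port.
 */
uint32_t emulate_port_read(int uart_present, uint16_t port, unsigned int size)
{
    if (uart_present && port >= COM1_BASE && port < COM1_BASE + COM1_LEN)
        return 0; /* placeholder: a real model returns register contents */

    return all_ones(size);
}

/*
 * Guest side (heavily simplified): treat a byte read of 0xFF as
 * "nothing here". Real drivers use more elaborate probe sequences.
 */
int guest_sees_no_device(int uart_present, uint16_t port)
{
    return emulate_port_read(uart_present, port, 1) == 0xFF;
}
```

With this, a guest probing 0x3F8 when no UART is exposed reads 0xFF and
can conclude the device is absent, without any PVH-specific check.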
> There is also a
> little bit of testing that we should do here. Right now we manually
> pass-through a serial device for UART debugging, and that might have
> the side effect of this port showing up so I would want to rule that
> out first.
>
> > APs for PVH can be started using the native way, which means they are
> > started in real mode, that's why Linux uses the real mode trampoline.
>
> Ah... ok. That makes sense. Uhg... emulating INIT/SIPI is no fun. That
> is some pretty fragile code.
It's the same code that we already use for HVM guests, since PVH
guests get an emulated LAPIC like HVM ones.
> > Legacy ROMs from which device?
>
> Video BIOS was one of them. There are several memory regions within
> legacy BIOS that are being scanned so my assumption is that these
> regions are some ROMs, and I am not really sure why PVH would execute
> that logic at all.
Xen signals in the FADT that there's no VGA, but I wouldn't be
surprised if some OSes simply ignore this bit, because there are
systems out there with broken ACPI tables that have the bit set and
still have VGA.
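For reference, the bit in question is bit 2 ("VGA Not Present") of the
FADT IA-PC Boot Architecture Flags. A rough sketch of the check, with
a hypothetical `trust_fadt` knob standing in for an OS that chooses to
ignore the bit (the function names are made up for illustration):

```c
#include <stdint.h>

/* FADT IA-PC Boot Architecture Flags, bit 2: "VGA Not Present". */
#define FADT_IAPC_VGA_NOT_PRESENT (1u << 2)

/* Returns 1 if the firmware claims there is no VGA hardware. */
int fadt_says_no_vga(uint16_t iapc_boot_arch)
{
    return (iapc_boot_arch & FADT_IAPC_VGA_NOT_PRESENT) != 0;
}

/*
 * Hypothetical OS policy: only skip the legacy VGA/ROM probing when
 * the OS is willing to trust the FADT. An OS that distrusts the bit
 * (because of broken tables in the field) probes regardless.
 */
int should_probe_vga(uint16_t iapc_boot_arch, int trust_fadt)
{
    if (trust_fadt && fadt_says_no_vga(iapc_boot_arch))
        return 0;

    return 1;
}
```

An OS that takes the `trust_fadt == 0` path would still scan the video
BIOS region on PVH even though Xen sets the bit.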
> I am pretty sure that it is scanning for MP tables
> as I think I traced that specific logic back to the Linux kernel.
There's no other way to detect MP tables than scanning the different
locations where they can be found, so I think it's fine for Linux to
do so.
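As a rough sketch of what such a scan looks like (this is not the
Linux implementation, and `mp_scan`/`demo_scan` are names I made up):
the MP floating pointer structure is 16 bytes, starts with the "_MP_"
signature on a 16-byte boundary, has a length field of 1 (in 16-byte
units) and a checksum that makes all its bytes sum to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MPF_SIZE 16u /* size of the MP floating pointer structure */

/* Sum of all bytes modulo 256. */
uint8_t byte_sum(const uint8_t *p, size_t len)
{
    uint8_t sum = 0;

    while (len--)
        sum += *p++;
    return sum;
}

/*
 * Scan [base, base + len) on 16-byte boundaries for a valid MP
 * floating pointer structure. Returns its offset, or -1 if none.
 */
long mp_scan(const uint8_t *base, size_t len)
{
    assert(base != NULL);

    for (size_t off = 0; off + MPF_SIZE <= len; off += MPF_SIZE) {
        const uint8_t *p = base + off;

        if (memcmp(p, "_MP_", 4) != 0)
            continue;             /* no signature here */
        if (p[8] != 1)
            continue;             /* length must be one 16-byte unit */
        if (byte_sum(p, MPF_SIZE) != 0)
            continue;             /* checksum mismatch */
        return (long)off;
    }
    return -1;
}

/* Usage example: scan a zeroed page, optionally planting a table at 0x40. */
long demo_scan(int plant_table)
{
    static uint8_t page[4096];

    memset(page, 0, sizeof(page));
    if (plant_table) {
        uint8_t *p = page + 0x40;

        memcpy(p, "_MP_", 4);
        p[8] = 1;                 /* length in 16-byte units */
        p[9] = 4;                 /* spec revision 1.4 */
        p[10] = (uint8_t)(0x100 - byte_sum(p, MPF_SIZE)); /* fix checksum */
    }
    return mp_scan(page, sizeof(page));
}
```

Scanning a page of zeros simply finds nothing (`demo_scan(0)` returns
-1), so backing those regions with zeroed memory is enough for the
guest to conclude there are no MP tables.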
> I
> know for sure that DMI is being scanned as well. Right now we map in a
> read-only zero page and that works fine, but I would think that a lot
> of this logic would not be needed in the Guest case. Dom0 is another
> story.
IMO we should limit the PVH-specific modifications that we have to
make to guests as much as possible. So it's better to let the guest
scan memory or poke at IO ports than to add a specific 'is running on
PVH' check to each driver for a device that we know is not available
when running as PVH.
Poking at such ports or scanning memory is exactly what's done on
bare metal, and it should work fine on PVH to detect the absence of
certain devices.
Thanks, Roger.
>
> > Hello,
> >
> > Thanks, this is very interesting.
> >
> > On Sat, Dec 01, 2018 at 09:21:00AM -0700, Rian Quinn wrote:
> > > We finally have a Linux PVH guest up and running (using an initramfs
> > > right now). I have posted a quick status update video on YouTube that
> > > shows our progress of getting a Windows Dom0 working (which is one of
> > > the many goals of our research).
> > >
> > > As promised in the x86 Community Call, here is the list of things that a
> > > PVH Linux guest requires. You can see the code for this here:
> > >
> > > and here:
> > >
> > >
> > > I would love to put this information somewhere in Xen's project (i.e.
> > > wiki or source), but I am not sure what you would prefer. Any ideas?
> > >
> > > Finally, keep in mind that we will likely keep adding to this list as we
> > > add more features (like front/back support, xenstore, etc...)
> > >
> > > Thanks,
> > > - Rian
> > >
> > > CPUID:
> > > - XEN_CPUID_LEAF(0)
> > > - XEN_CPUID_LEAF(1)
> > > - XEN_CPUID_LEAF(2)
> > > - XEN_CPUID_LEAF(4)
> > > - 0x0, 0x1, 0x2, 0x4, 0x6, 0x7, 0xA, 0xB, 0xD, 0xF, 0x10, 0x15, 0x16
> > > - 0x80000000, 0x80000001, 0x80000002, 0x80000003, 0x80000004
> > > - 0x80000007, 0x80000008
> > >
> > > MSRs:
> > > - Hypercall page (dynamic)
> > > - ia32_star
> > > - ia32_lstar
> > > - ia32_cstar
> > > - ia32_fmask
> > > - ia32_kernel_gs_base
> > > - ia32_pat
> > > - ia32_efer
> > > - ia32_fs_base
> > > - ia32_gs_base
> > > - ia32_sysenter_cs
> > > - ia32_sysenter_eip
> > > - ia32_sysenter_esp
> > > - ia32_apic_base
> > > - platform_info
> > > - 0x34, 0x64E, 0x140, 0x1A0, 0x6e0
> > >
> > > IO Ports (some of these are odd):
> > > - 0xCF8 - 0xCFF
> > > - 0x4D0 (odd since PIT and ACPI are disabled for everything that might
> > > need this)
> >
> > Likely some poking for EISA devices? (same for 0x4D1)
> >
> > Can you trace this to the Linux code that's actually making the call
> > by injecting a trap when this happens?
> >
> > > - 0x4D1