
Re: [Xen-devel] [BUG] Xen vm kernel crash in get_free_entries.



On Thu, Nov 07, 2013 at 01:47:03PM +0000, Ian Campbell wrote:
> On Thu, 2013-11-07 at 09:20 +0400, Astarta wrote:
> > Hello,
> > 
> > Let me bring some new life to this discussion.
> > 
> > I've investigated a bit and found another way to make kernels from
> > 3.8.x onward boot on VMs with platform device_id 0002.
> > Reverting the xen-grant-table-correctly-initialize-grant-table-version-1 
> > patch is not necessary.
> > 
> > We can simply modify struct pci_device_id platform_pci_tbl[] (in 
> > drivers/xen/platform-pci.c) to match device ids 0002 and 0000 as well.
> > With that change the kernel (3.8.x and 3.11.6) boots correctly, and
> > disks and network are also recognized.
> 
> I think this is just working around the problem, by avoiding the
> situation where the error occurs. You could just as well switch to
> platform device id < 2.

I am a bit late to this discussion, but shouldn't there be something
in the kernel to deal with this?

> 
> > IMO, there is no need to add new entries for device id 0002 and device 
> > id 0000 to platform_pci_tbl[]; we can modify the existing one to use 
> > PCI_ANY_ID instead of PCI_DEVICE_ID_XEN_PLATFORM (which is 0001), so if 
> > we have PCI_VENDOR_ID_XEN there is no need to pay attention to the device id.
> 
> That omits the possibility that a future rev might differ in some
> meaningful way though.
> 
> Ian.
> 
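To make the trade-off in the exchange above concrete, here is a small userspace C sketch that models the two options: one wildcard entry using PCI_ANY_ID, versus one explicit entry per known device id. It is not the code from drivers/xen/platform-pci.c and not the attached patch. struct model_pci_id, model_match(), wildcard_binds() and explicit_binds() are hypothetical names made up for illustration, and the matching rule is a simplified assumption about how a PCI id table is compared against a device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as used in the kernel headers; 0x0001 is the id quoted in the thread. */
#define PCI_ANY_ID                  (~0u)
#define PCI_VENDOR_ID_XEN           0x5853u
#define PCI_DEVICE_ID_XEN_PLATFORM  0x0001u

/* Hypothetical, simplified stand-in for struct pci_device_id. */
struct model_pci_id {
    uint32_t vendor;
    uint32_t device;
};

/* Option 1 (proposed in the thread): any device id from the Xen vendor. */
static const struct model_pci_id wildcard_tbl[] = {
    { PCI_VENDOR_ID_XEN, PCI_ANY_ID },
};

/* Option 2: list each known device id explicitly. */
static const struct model_pci_id explicit_tbl[] = {
    { PCI_VENDOR_ID_XEN, 0x0000u },
    { PCI_VENDOR_ID_XEN, PCI_DEVICE_ID_XEN_PLATFORM },
    { PCI_VENDOR_ID_XEN, 0x0002u },
};

/* Simplified matching: a field matches if it is PCI_ANY_ID or equal. */
static bool model_match(const struct model_pci_id *tbl, size_t n,
                        uint32_t vendor, uint32_t device)
{
    assert(tbl != NULL);
    for (size_t i = 0; i < n; i++) {
        bool vendor_ok = tbl[i].vendor == PCI_ANY_ID || tbl[i].vendor == vendor;
        bool device_ok = tbl[i].device == PCI_ANY_ID || tbl[i].device == device;
        if (vendor_ok && device_ok)
            return true;
    }
    return false;
}

/* Would the wildcard table bind to a Xen device with this device id? */
bool wildcard_binds(uint32_t device)
{
    return model_match(wildcard_tbl,
                       sizeof(wildcard_tbl) / sizeof(wildcard_tbl[0]),
                       PCI_VENDOR_ID_XEN, device);
}

/* Would the explicit table bind to a Xen device with this device id? */
bool explicit_binds(uint32_t device)
{
    return model_match(explicit_tbl,
                       sizeof(explicit_tbl) / sizeof(explicit_tbl[0]),
                       PCI_VENDOR_ID_XEN, device);
}
```

In this model both tables bind to ids 0000, 0001 and 0002. Only the wildcard table also binds to an id such as 0003 that does not exist yet, which is the case the objection about a future revision is concerned with.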
> > 
> > So the patch is very simple. See attached. I've tested the resulting 
> > kernel in my environment (with device ids 0002, 0001 and 0000) and it 
> > seems to work well.
> > 
> > 
> > --
> > Marina
> > 
> > On 10/21/2013 02:55 PM, Matt Wilson wrote:
> > > On Sat, Oct 19, 2013 at 01:58:50PM +0200, Sander Eikelenboom wrote:
> > >> Saturday, October 19, 2013, 1:03:17 PM, you wrote:
> > >>
> > >>> On Sat, 2013-10-19 at 14:51 +0400, Astarta wrote:
> > >>>> On 10/19/2013 03:14 AM, Sander Eikelenboom wrote:
> > >>>>> makes a HVM guest (qemu-xen-traditional) with xen_platform_pci=0 boot 
> > >>>>> again using xl, haven't tested it with xend.
> > >>>>>
> > >>>> Great catch!
> > >>>> I also confirm that 3.11.5 kernel boots just fine after reverting the
> > >>>> 'correctly initialize grant table version 1' patch.
> > >>> This could just be down to that patch adding some BUG_ONs to catch bad
> > >>> things going on, e.g. the one in gnttab_expand which I think is being
> > >>> hit here.
> > >>> I have a feeling that it is still wrong (but just more benign) to be
> > >>> hitting that call chain in a configuration where there is no platform
> > >>> device driver running. IOW reverting that patch removes the obvious
> > >>> symptom (blowing up) but not the root cause, i.e. the patch is doing its
> > >>> job.
> > >> That was my suspicion too, but at least it seems like a starting point
> > >> for further debugging.
> > >> (and indication of the kernels affected since this commit went to stable 
> > >> as well)
> > >>
> > >> Since I was still seeing the "Booting PV enabled guest on Xen HVM"
> > >> message, I was wondering
> > >> what is supposed to happen with each of these combinations ....
> > > This is the enlightenment code noticing that it's running in a HVM
> > > guest under Xen via the hypervisor cpuid leaf (cpuid leaf
> > > 0x40000000).
> > >
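As a rough illustration of the detection mentioned above: the hypervisor cpuid leaf returns a 12-byte signature in EBX, ECX and EDX, and for Xen that signature is "XenVMMXenVMM". The sketch below only checks register values that are passed in. How the registers are read from the CPU is left out, and signature_is_xen() and unpack_le32() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypervisor cpuid leaf named in the thread. */
#define HYPERVISOR_CPUID_LEAF 0x40000000u

/* Copy the four bytes of a register into out[], lowest byte first. */
static void unpack_le32(uint32_t reg, char *out)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((reg >> (8 * i)) & 0xffu);
}

/*
 * Return true if the EBX/ECX/EDX values read from the hypervisor
 * leaf spell out the Xen signature.
 */
bool signature_is_xen(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[13];

    unpack_le32(ebx, sig);
    unpack_le32(ecx, sig + 4);
    unpack_le32(edx, sig + 8);
    sig[12] = '\0';

    return strcmp(sig, "XenVMMXenVMM") == 0;
}
```

For example, the register values 0x566e6558, 0x65584d4d and 0x4d4d566e decode to "XenV", "MMXe" and "nVMM", so signature_is_xen() returns true for them. This check is independent of whether the platform PCI device is present, which is why the boot message still appears with xen_platform_pci=0.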
> > >> xen HVM xen_platform_pci=0 + guest kernel without PV guest support and 
> > >> without xen pv drivers (net + block)
> > > This should work.
> > >
> > >> xen HVM xen_platform_pci=0 + guest kernel with PV guest support but 
> > >> without xen pv drivers (net + block)
> > > This should work.
> > >
> > >> xen HVM xen_platform_pci=0 + guest kernel with PV guest support and with 
> > >> xen pv drivers (net + block)
> > >> -- This is the configuration that hits the bug described here.
> > > I don't see how this can be expected to work - the PV net and block
> > > devices need the facilities that are initialized by the Xen platform
> > > PCI device to operate. Of course it shouldn't crash either, it should
> > > just use emulated devices instead of xen-netfront/xen-blkfront.
> > >
> > >> xen HVM xen_platform_pci=1 + guest kernel without PV guest support and 
> > >> without xen pv drivers (net + block)
> > > This should work.
> > >
> > >> xen HVM xen_platform_pci=1 + guest kernel with PV guest support and 
> > >> without xen pv drivers (net + block)
> > > This should work.
> > >
> > >> xen HVM xen_platform_pci=1 + guest kernel with PV guest support and with 
> > >> xen pv drivers (net + block)
> > > This should work.
> > >
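The six combinations above can be summarised in a few lines of C. This is only a restatement of the expectations given in this thread, not of what any kernel actually does: struct guest_cfg, enum io_path and expected_io_path() are hypothetical names, and the assumption that the last combination ends up on the PV devices is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which kind of net/block devices the guest is expected to end up using. */
enum io_path {
    IO_EMULATED,   /* emulated devices */
    IO_PV          /* xen-netfront / xen-blkfront */
};

/* Hypothetical description of one row of the matrix above. */
struct guest_cfg {
    bool xen_platform_pci;   /* xen_platform_pci=1 in the guest config */
    bool pv_guest_support;   /* guest kernel built with PV guest support */
    bool pv_drivers;         /* guest kernel has the PV net + block drivers */
};

/*
 * Expected outcome as described in the thread: PV I/O needs the
 * platform PCI device, PV guest support and the PV drivers together.
 * Every other combination should fall back to emulated devices
 * rather than crash.
 */
enum io_path expected_io_path(const struct guest_cfg *c)
{
    assert(c != NULL);
    if (c->xen_platform_pci && c->pv_guest_support && c->pv_drivers)
        return IO_PV;
    return IO_EMULATED;
}
```

Under this summary the configuration that hits the bug (xen_platform_pci=0 with PV guest support and PV drivers) should come out as IO_EMULATED; the crash is a deviation from that expectation.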
> > >> Booting a guest kernel with PV support as HVM but without using PV 
> > >> doesn't seem possible with a .cfg option ?
> > >> (yes it's a hypothetical option (performance wise), as is running with a 
> > >> guest kernel which supports PV drivers,
> > >>   but not using them with xen_platform_pci=0 .. but it is useful for 
> > >> debugging )
> > > AFAICT the expected behavior would be for the guest kernel to use
> > > basic enlightenment for CPU operations (hotplug, timers) but no PV IO
> > > support (net + block). But perhaps I'm missing something since you
> > > theoretically don't need the PCI device if you have event channel
> > > callback support in the guest kernel and sufficient support in the
> > > hypervisor.
> > >
> > > --msw
> > 
> > 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
