
Re: [Xen-devel] Re: Still struggling with HVM: tx timeouts on emulated nics



On 22.09.2011 12:30, Stefano Stabellini wrote:
> On Wed, 21 Sep 2011, Stefan Bader wrote:
>> On 21.09.2011 15:31, Stefano Stabellini wrote:
>>> On Wed, 21 Sep 2011, Stefan Bader wrote:
>>>> This is on 3.0.4 based dom0 and domU with 4.1.1 hypervisor. I tried using
>>>> the default 8139cp and ne2k_pci emulated nic. The 8139cp one at least
>>>> comes up and gets configured via dhcp. And initial pings also get routed
>>>> and done correctly. But slightly higher traffic (like checking for
>>>> updates) hangs. And after a while there are messages about tx timeouts.
>>>> The ne2k_pci type nic almost immediately has those issues and never comes
>>>> up correctly.
>>>>
>>>> I am attaching the dmesg of the guest with apic=debug enabled. I am not
>>>> sure how this should be but both nics get configured with level,low IRQs.
>>>> Disk emulation seems to be ok but that seems to use IO-APIC-edge. And any
>>>> other IRQs seem to be at least not level.
>>>
>>
>>> Does the e1000 emulated card work correctly?
>>
>> Yes, that one seems to work ok.
>>
>>> What happens if you disable interrupt remapping (see patch below)?
>>
>> 8139cp seems to work correctly now (much higher irq stats as well) and e1000
>> still works. Both then using IOAPIC-fasteoi.
>>
> 
> That means there must be another subtle bug in Xen in interrupt
> remapping that only affects 8139cp emulation.
> 
Right, or to be complete:
- e1000: ok
- 8139cp: unstable (setup is possible)
- ne2k_pci: not working (tx problems from the beginning)

The behaviour feels a bit like interrupts may get lost if occurring at a higher
rate. Why this affects various drivers differently is a bit weird.
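
One way to put numbers on the "much higher irq stats" comparison is to sample
/proc/interrupts before and after a burst of traffic and compare the per-IRQ
totals. The following is only a minimal sketch: the helper names irq_counts and
irq_delta are made up for illustration, and it assumes the usual layout of a
header row of CPU columns followed by "IRQ: counts... type device(s)" rows.

```python
def irq_counts(text, name):
    """Parse /proc/interrupts-style text.

    Returns {irq_label: (total_count, type_column)} for every row whose
    device column contains `name` (plain substring match).
    Assumption: the first line is the CPU header row.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # one count column per CPU
    found = {}
    for line in lines[1:]:
        fields = line.split()
        # Need at least: label, one count per CPU, and a type column.
        if len(fields) < ncpus + 2 or not fields[0].endswith(":"):
            continue
        counts = fields[1:1 + ncpus]
        if not all(c.isdigit() for c in counts):
            continue
        kind = fields[1 + ncpus]                  # e.g. IO-APIC-fasteoi
        devices = " ".join(fields[2 + ncpus:])    # e.g. eth0
        if name and name in devices:
            label = fields[0].rstrip(":")
            found[label] = (sum(int(c) for c in counts), kind)
    return found


def irq_delta(before, after, name):
    """Interrupts delivered between two snapshots, per matching IRQ."""
    first = irq_counts(before, name)
    second = irq_counts(after, name)
    return {irq: second[irq][0] - first[irq][0]
            for irq in second if irq in first}
```

Reading /proc/interrupts twice a few seconds apart while the guest is under
network load and passing both snapshots to irq_delta gives the interrupt count
for the nic in that window. A count that stays nearly flat while the driver
reports tx timeouts would fit the lost-interrupt suspicion, but would not prove
it.
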
> 
>>>> Another problem came up recently though that may just be me doing the wrong
>>>> thing. Normally I boot with xen_emul_unplug=unnecessary as I want the
>>>> emulated devices. xen-blkfront is a module in my case and I thought I once
>>>> had been able to use that by removing the unplug arg and making the
>>>> blkfront driver load. But when I recently tried, the module loaded but no
>>>> disks appeared... Again, not sure whether I just forgot how to do that
>>>> right or whether that was different when still using a 4.1.0 hypervisor...
>>>  
>>> xen_emul_unplug=unnecessary allows the kernel to use PV interfaces on
>>> older hypervisors that didn't support the unplug protocol and had other
>>> ways to cope with multiple drivers accessing the same devices.
>>> You can use xen_emul_unplug=never to prevent any unplug but you won't
>>> get any PV interfaces.
>>
>> Hm, odd. Somehow I thought that I had been using pv interfaces that way when
>> the interrupts for the emulated ide were broken.
>> A bit suboptimal atm, because without any option and a kernel compiled with
>> the platform pci and pv drivers (as modules) booting in HVM mode, the kernel
>> decides that having both is no use and unplugs the emulated devices. Which
>> then leaves you with ... none.
> 
> In theory you would have the PV frontend modules in the initrd.
> On the other hand, having both can easily cause data corruption on your
> drive.

They _are_ in the initrd. And the boot rightfully drops to a maintenance shell
right now (without any argument and the emulated devices unplugged). And
"modprobe xen-blkfront" loads the module but it does _not_ detect any pv device.
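
To rule out a stale boot argument when comparing these cases, it helps to check
what the running kernel was actually given. The sketch below is only
illustrative: the function names are hypothetical, the behaviour notes just
restate what was said in this thread, and merging repeated occurrences of the
option is an assumption of the sketch, not a claim about how the kernel parses
them.

```python
def xen_emul_unplug_options(cmdline):
    """Return the values passed via xen_emul_unplug=, or None if absent.

    Assumption: values are comma separated, and repeated occurrences of
    the option are simply merged in order of appearance.
    """
    prefix = "xen_emul_unplug="
    values = None
    for token in cmdline.split():
        if token.startswith(prefix):
            if values is None:
                values = []
            values.extend(v for v in token[len(prefix):].split(",") if v)
    return values


def describe(options):
    """Summarise the expected behaviour as described in this thread."""
    if options is None:
        return "no option: emulated devices get unplugged if PV drivers are built"
    if "never" in options:
        return "never: nothing is unplugged, no PV interfaces"
    if "unnecessary" in options:
        return "unnecessary: no unplug needed, PV interfaces may still be used"
    return "explicit unplug list: " + ",".join(options)
```

On a running guest the input would be the contents of /proc/cmdline, for
example describe(xen_emul_unplug_options(open("/proc/cmdline").read())).
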

-Stefan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel