
RE: [Xen-devel] [PATCH 0/5] Add MSI support to XEN



Keir, when I tried to get the IP address today, I suddenly found I can't
reproduce it anymore. Also, originally, if I removed the code that
triggers the software LSC interrupt, the NIC could still work and get an
IP address, but now if I remove that code the NIC can't work anymore.
It is really strange to me; I didn't change anything in the system, and I
don't know of any change in the lab environment that might cause this.
But before, I could reproduce it every time.

Really frustrating :-( . Do you think we still need to move the config
space access down? Now the only reason to move it down is that
ack_edge_ioapic_irq() did the mask, and this mask can make the HV more
robust.
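
To be concrete, the masking behaviour I mean is roughly this (a sketch
only, with invented flag and helper names, not the real Xen code):

/*
 * Illustrative sketch of the masking referred to above: if the same edge
 * interrupt fires again while the first instance is still being serviced,
 * mask it at the source so it cannot storm the hypervisor, then ack the
 * local APIC.  The status flags and helpers are invented for the example.
 */
#define IRQ_INPROGRESS  0x1
#define IRQ_PENDING     0x2

struct irq_state {
    unsigned int irq;
    unsigned int status;
};

extern void mask_irq_source(struct irq_state *s);   /* hypothetical */
extern void local_apic_eoi(void);                   /* hypothetical */

static void ack_edge_irq_sketch(struct irq_state *s)
{
    /* A new edge arrived while the previous one is still in progress. */
    if ((s->status & (IRQ_PENDING | IRQ_INPROGRESS)) ==
        (IRQ_PENDING | IRQ_INPROGRESS))
        mask_irq_source(s);

    local_apic_eoi();
}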

Thanks
-- Yunhong Jiang


Jiang, Yunhong <> wrote:
> xen-devel-bounces@xxxxxxxxxxxxxxxxxxx <> wrote:
>> On 28/3/08 08:40, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>> 
>>> The investigation result is:
>>> 1) if we mask and ack the interrupt, the interrupt happens 3 times; the
>>> last 2 are masked because they arrive while the first one is still
>>> pending in the ISR's handler, and the system is ok.
>> 
>> How can you tell it happened three times? If the interrupt is pending
>> in the ISR then only one further pending interrupt can become visible
>> to software, as there is only one pending bit per vector in the IRR.
> 
> There are two types of MSI interrupt, one for receive/transmit and
> one for everything else (this is the one that causes the storm). I added
> a printk for interrupts that happen while the previous one is still in
> progress, then compared the number of printks with the count in
> /proc/interrupts. The count in /proc/interrupts is only 1.
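
Roughly, the diagnostic I added looks like the following (a sketch with
invented names and layout, not the exact patch):

/*
 * Hedged sketch of the counter described above: count MSIs that arrive
 * while the previous instance of the same IRQ is still being handled,
 * so the printk total can be compared with /proc/interrupts afterwards.
 */
#define IRQ_INPROGRESS 0x1

extern void printk(const char *fmt, ...);

struct irq_desc_sketch {
    unsigned int status;       /* IRQ_INPROGRESS etc. */
    unsigned int storm_count;  /* arrivals seen while still in progress */
};

static void do_msi_irq_sketch(struct irq_desc_sketch *desc, int irq)
{
    if (desc->status & IRQ_INPROGRESS) {
        desc->storm_count++;
        printk("irq %d re-raised while in progress (count %u)\n",
               irq, desc->storm_count);
        return;
    }

    desc->status |= IRQ_INPROGRESS;
    /* ... dispatch to the handler / inject into the guest here ... */
    desc->status &= ~IRQ_INPROGRESS;
}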
> 
>> 
>>> So I suppose the problem happens only if the interrupt is triggered by
>>> software. I also consulted the HW engineer but didn't get confirmation;
>>> the only answer I got is that PCI-E needs a rising edge before sending
>>> the 2nd interrupt :(
>> 
>> That answer means very little to me. One interesting question to have
>> answered would be: is this a closed-loop or open-loop interrupt storm?
>> I.e., does the device somehow detect the APIC EOI and then trigger a
>> re-send of the MSI (closed loop), or is this an initialisation-time-only
>> open-loop storm where the device is spitting out the MSI regularly until
>> some device register gets written by the interrupt service routine?
>> 
>> Given the circumstances, I'm inclined to think it is the latter.
>> Especially since I think the former is impossible, as APIC EOI is not
>> visible outside the processor unless the interrupt came from a
>> level-triggered IO-APIC pin, and even then the EOI would not be visible
>> across the PCI bus!
>> 
>> Also it seems *very* likely that this is just an initialisation-time
>> thing, and the device probably behaves very nicely after it is
>> bootstrapped.
> 
> I can't tell, because this interrupt doesn't happen again after the
> device is up. Maybe I can change the driver to do more experiments.
> 
>> In light of this I think we should treat MSI sources as ACKTYPE_NONE
>> in Xen (i.e., require no callback from guest to hypervisor on completion
>> of the interrupt handler). We can then handle the interrupt storm
>> entirely within the hypervisor by detecting the storm, masking the
>> interrupt, and only unmasking it on some timeout.
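
If we go that way, I imagine something roughly like this (a sketch only;
the thresholds, names and timer helper are all made up for illustration):

/*
 * Hedged sketch of the storm handling proposed above: treat the MSI as
 * ACKTYPE_NONE, count arrivals per time window, and if they exceed a
 * threshold, mask the source and arm a timer to unmask it later.
 */
#define STORM_THRESHOLD   100   /* interrupts per window before we mask */
#define STORM_WINDOW_MS    10
#define UNMASK_DELAY_MS   100

struct msi_storm_state {
    unsigned int  count;
    unsigned long window_start_ms;
    int           masked;
};

extern void mask_msi_source(struct msi_storm_state *s);          /* hypothetical */
extern void arm_unmask_timer(struct msi_storm_state *s,
                             unsigned long when_ms);              /* hypothetical */

static void msi_storm_check(struct msi_storm_state *s, unsigned long now_ms)
{
    if (now_ms - s->window_start_ms > STORM_WINDOW_MS) {
        s->window_start_ms = now_ms;   /* start a new counting window */
        s->count = 0;
    }

    if (++s->count > STORM_THRESHOLD && !s->masked) {
        mask_msi_source(s);            /* e.g. via the MSI mask bit */
        s->masked = 1;
        arm_unmask_timer(s, now_ms + UNMASK_DELAY_MS);
    }
}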
>> 
>> In your tests, how aggressive was the IRQ storm? If you looked at the
>> interrupted EIP on each interrupt, was it immediately after the APIC was
>> EOIed and EFLAGS.IF set back to 1, or was it some time after? This tells
>> us how aggressively the device is sending out the MSIs, and may determine
>> how cunning we need to be regarding interrupt storm detection.
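
To collect that data I could log the interrupted EIP and IF flag from the
saved register frame whenever the storming vector fires, roughly like this
(a sketch; the register-frame layout and hook point are assumptions):

/*
 * Hedged sketch of the measurement asked for above: record where the CPU
 * was interrupted each time the storming MSI arrives.  If the EIP is
 * always just past the EOI/sti of the previous handler, the device is
 * re-sending as fast as it can.
 */
extern void printk(const char *fmt, ...);

struct regs_sketch {
    unsigned long eip;
    unsigned long eflags;
};

static void log_storm_eip(const struct regs_sketch *regs, int vector)
{
    /* EFLAGS bit 9 is IF (interrupts enabled). */
    printk("vector %d interrupted eip=%lx IF=%lu\n",
           vector, regs->eip, (regs->eflags >> 9) & 1);
}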
> 
> I will try that.
> 
>> 
>>> I'm not sure if there are any other BRAIN-DEAD devices like this; I
>>> only have this device to test the MSI-X function, but we may need to
>>> make sure it will not break the whole system.
>> 
>> Yes, we have to handle this case, unfortunately.
>> 
>>> The call-back to the guest is because we are using the ACK-new method
>>> to work around this issue. Yes, it is expensive. Also, this ACK-new
>>> method may cause a deadlock, as Haitao suggested in the mail.
>> 
>> Yes, that sucks. See my previous email -- if possible it would be great
>> to teach Xen enough about the PCI config space to be able to mask MSIs.
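
If we teach Xen the MSI capability layout, masking a vector through config
space could look roughly like this (offsets per the PCI spec; the
pci_conf_read16/32 and pci_conf_write32 accessors and their signatures are
assumptions, not necessarily the exact Xen helpers):

/*
 * Hedged sketch: set the per-vector mask bit in a device's MSI capability
 * via PCI config space.  Offsets come from the PCI spec; the accessor
 * names/signatures are assumed for the example.
 */
#define PCI_MSI_FLAGS            0x02    /* Message Control              */
#define  PCI_MSI_FLAGS_64BIT     0x0080  /* 64-bit address capable       */
#define  PCI_MSI_FLAGS_MASKBIT   0x0100  /* per-vector masking capable   */

extern unsigned int pci_conf_read16(int bus, int dev, int fn, int reg);   /* assumed */
extern unsigned int pci_conf_read32(int bus, int dev, int fn, int reg);   /* assumed */
extern void pci_conf_write32(int bus, int dev, int fn, int reg,
                             unsigned int val);                           /* assumed */

static int msi_mask_vector(int bus, int dev, int fn, int cap, int vec)
{
    unsigned int ctrl = pci_conf_read16(bus, dev, fn, cap + PCI_MSI_FLAGS);
    unsigned int mask_off, bits;

    if (!(ctrl & PCI_MSI_FLAGS_MASKBIT))
        return -1;                       /* device has no MSI mask bits */

    /* The Mask Bits register follows the address/data registers. */
    mask_off = cap + ((ctrl & PCI_MSI_FLAGS_64BIT) ? 0x10 : 0x0c);

    bits = pci_conf_read32(bus, dev, fn, mask_off);
    pci_conf_write32(bus, dev, fn, mask_off, bits | (1u << vec));
    return 0;
}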
> In fact, Xen is already trying to access config space, although that is
> still a bug currently. In VT-d, Xen tries to access the FLR directly :)
> 
>> 
>>> But if we move the config space to the HV, then we don't need this
>>> ACK-new method; that should be ok. But admittedly, that should be the
>>> last method we turn to, since config space should be owned by domain0.
>> 
>> A partial movement into the hypervisor may be the best of a
>> choice of evils.
> 
> Sure, we will do that!
> 
>> -- Keir
>> 
>> 
>> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

