
RE: [Xen-devel] [Q] Device error handling discussion -- Was: Is qemu used when we use VTd?



Yuji Shimada <mailto:shimada-yxb@xxxxxxxxxxxxxxx> wrote:
> On Thu, 16 Oct 2008 15:32:40 +0800
> "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>
>>>>>    Non-fatal error on I/O device:
>>>>>        - kill the domain with error source function.
>>>>>        - reset the function.
>>>>
>>>> From the following statement in PCI-E 2.0 section 6.6.2: "Note that Port
>>>> state machines associated with Link functionality including those
>>>> in the Physical and Data Link Layers are not reset by FLR", I'm not
>>>> sure if FLR is a right method to handle the error situation. That's
>>>> the reason I asked on how to handle multiple-function devices.
>>>
>>> I think a non-fatal error is a transaction-layer error and does not
>>> require resetting the lower layers. But I am not sure.
>>
>> By default, a data link layer error is fatal, but the result
>> depends on how the driver sets it up. We can trap accesses to the AER
>> registers and make sure data link layer errors are always reported as
>> fatal. That is easy to implement.
>
> It means that, with the default settings, a non-fatal error is a
> transaction layer error. When a non-fatal error occurs on an I/O device,
> FLR seems able to recover from it.
>
>
>>>
>>>>>    Non-fatal error on PCI-PCI bridge.
>>>>>        - kill all domains with the functions under the PCI-PCI bridge.
>>>>>        - reset PCI-PCI bridge and secondary bus.
>>>>>
>>>>>    Fatal error:
>>>>>        - kill all domains with the functions under the same root port.
>>>>>        - reset the link (secondary bus reset on root port).
>>>>
>>>> Agree. Basically I think the action of "reset PCI-PCI bridge and
>>>> secondary bus" or "reset the link" has been done by AER core
>>>> already. What we need to define is PCI back's error handler. As a
>>>> first step, the error handler will trigger a domain reset; in future,
>>>> more elegant actions can be defined/implemented. Any ideas?
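
To make the three proposed cases easier to compare, here is a minimal C sketch of the policy as listed in this thread (non-fatal on an I/O function, non-fatal on a PCI-PCI bridge, fatal). Every type and function name below is hypothetical and exists only for illustration; none of it is pciback, Xen or Linux AER code, and the thread itself has not settled whether the bridge case should really be handled this way.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not Xen or Linux identifiers. */
enum err_severity { ERR_NONFATAL, ERR_FATAL };
enum err_source   { SRC_IO_FUNCTION, SRC_PCI_BRIDGE };

enum kill_scope {
    KILL_OWNER_DOMAIN,             /* domain owning the failing function */
    KILL_DOMAINS_BELOW_BRIDGE,     /* all domains with functions under the bridge */
    KILL_DOMAINS_BELOW_ROOT_PORT   /* all domains with functions under the root port */
};

enum reset_kind {
    RESET_FLR,                     /* function level reset */
    RESET_BRIDGE_SECONDARY_BUS,    /* reset the bridge and its secondary bus */
    RESET_ROOT_PORT_LINK           /* secondary bus reset on the root port */
};

struct recovery_plan {
    enum kill_scope kill;
    enum reset_kind reset;
};

/* Encode the proposal: a fatal error is handled the same way whatever its source. */
static struct recovery_plan plan_recovery(enum err_severity sev,
                                          enum err_source src)
{
    struct recovery_plan p;

    assert(sev == ERR_NONFATAL || sev == ERR_FATAL);

    if (sev == ERR_FATAL) {
        p.kill  = KILL_DOMAINS_BELOW_ROOT_PORT;
        p.reset = RESET_ROOT_PORT_LINK;
    } else if (src == SRC_PCI_BRIDGE) {
        p.kill  = KILL_DOMAINS_BELOW_BRIDGE;
        p.reset = RESET_BRIDGE_SECONDARY_BUS;
    } else {
        p.kill  = KILL_OWNER_DOMAIN;
        p.reset = RESET_FLR;
    }
    return p;
}
```

The sketch only decides what to do. Actually killing domains and issuing resets would be done by whichever component ends up owning the error handler.
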
>>>
>>> I agree with you basically.
>>>
>>> Current AER core does not reset PCI-PCI bridge and secondary bus,
>>> when Non-fatal error occurs on PCI-PCI bridge. We need to implement
>>> resetting PCI-PCI bridge and secondary bus.
>>
>> I'd keep the AER core as it is unless there is some special reason. For
>> example, why should we kill all domains under same root port and
>> reset root port's secondary link? Currently it will do so only if
>> the impacted device has no aer service register.
>
> On Linux 2.6.27, there is an AER driver which binds to the root port,
> but there is no AER driver for other devices. So when a fatal error
> occurs, Linux resets the root port's secondary link.
>
>        drivers/pci/pcie/aer/aerdrv.c:aer_root_reset
>
>
>> I'm also not sure we need to reset the link for a non-fatal error if
>> the AER core does not do that. Is there any special difference between
>> the virtualization and native situations?
>
> No. There is no difference.
> I agree with you to keep the AER core as it is.
>
>
>>>>> Note: we have to consider to prevent device from destroying other
>>>>> domain's memory.
>>>>
>>>> Why should we worry about destroying other domains' memory? I think
>>>> VT-d should guarantee that this cannot happen.
>>>
>>> The device is re-assigned to dom0 when the HVM domain is destroyed.
>>> If we destroy the domain before resetting the device, the I/O device
>>> can write to dom0's memory. On the other hand, we have to stop guest
>>> software before resetting the device to prevent guest software from
>>> accessing the device.
>>
>> That should be the same as the normal VT-d situation. We need FLR
>> before we re-assign the device to dom0 (if it does not currently work
>> like this, it is a bug).  Also, stopping guest software before
>> resetting the device may be helpful, but may not be so important. Do
>> you think the guest's second access will impact the host?  After all,
>> even in a native environment this is not guaranteed unless the platform
>> supports it. (It is said PPC has such support.)
>>
>> BTW, you stated "We have to solve many difficulties to keep guest
>> domain running". Can you give some details of the difficulties (it
>> may be difficult for HVM, but I'm not sure about the PV side)?
>
> - HVM
For HVM, yes, it is tricky, and we have no plan for it till now.

>        * Implementing root port emulator in ioemu.
>        * Implementing memory mapped configuration access mechanism for guest os.
>        * Enhancing guest aml to allow guest os to handle aer.
>        * Mapping host error to guest error.

This should be the tricky one, considering:
1) How to translate the TLP for the header log register?
2) Need to map the Source Identification register.


>        * Interaction between ioemu and pci back driver.
The main difficulty is mapping the error_handler to AER register operations.

>        * Handling when guest does not work fine.
>
> - PV
>        * Notifying pciback to pcifront.
>        * Handling when guest does not work fine.

We are working on this now.
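
As a rough illustration of the "Mapping host error to guest error" item, and of point 2) about the Source Identification register, here is a minimal C sketch. It assumes a per-guest table that maps each assigned function's host requester ID to the ID the guest sees, and fills in two emulated root port registers. The struct and function names are hypothetical. The bit values follow the AER Root Error Status and Error Source Identification layouts as I understand them, so please check them against the spec before relying on them. It does not touch the header log translation in point 1), and it leaves out the "multiple" and "first fatal" status bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Root Error Status bits (local names; values per the AER capability). */
#define ROOT_STS_UNCOR_RCV     0x00000004u  /* ERR_FATAL/NONFATAL received */
#define ROOT_STS_NONFATAL_RCV  0x00000020u  /* non-fatal message received */
#define ROOT_STS_FATAL_RCV     0x00000040u  /* fatal message received */

/* Hypothetical: severity of the error as seen on the host. */
enum host_err { HOST_ERR_NONFATAL, HOST_ERR_FATAL };

/* Hypothetical: one entry per function assigned to the guest. */
struct bdf_map {
    uint16_t host_bdf;
    uint16_t guest_bdf;
};

/* Hypothetical: the two emulated root port registers that get filled in. */
struct guest_root_err {
    uint32_t root_status;
    uint32_t source_id;
};

/* Requester ID layout: bus[15:8], device[7:3], function[2:0]. */
static uint16_t make_bdf(unsigned bus, unsigned dev, unsigned fn)
{
    assert(bus < 256 && dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Returns 0 and fills *out, or -1 if host_bdf is not assigned to this guest. */
static int map_host_error(const struct bdf_map *map, size_t n,
                          uint16_t host_bdf, enum host_err sev,
                          struct guest_root_err *out)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].host_bdf != host_bdf)
            continue;
        out->root_status = ROOT_STS_UNCOR_RCV |
            (sev == HOST_ERR_FATAL ? ROOT_STS_FATAL_RCV
                                   : ROOT_STS_NONFATAL_RCV);
        /* The ERR_FATAL/NONFATAL Source ID occupies bits 31:16. */
        out->source_id = (uint32_t)map[i].guest_bdf << 16;
        return 0;
    }
    return -1;
}
```

For example, a non-fatal error from host function 0a:00.0 that the guest sees as 00:04.0 would give the guest a Source ID of 0x0020 in the upper half of the register. An error from a function that is not in the table is simply not forwarded to that guest.
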

>
>>> By the way, do you have any plan to implement these function?
>>> I can provide the idea. But I can't provide the code.
>>
>> Yes, we are trying to work on it. But we may not have enough
>> environments to test all types of error. Also, although the AER code
>> can be backported easily, some of the required ACPI fixes are more
>> challenging.
>
> I'm not sure backporting is good. In the long term, dom0 Linux will be
> based on a newer Linux. How and when can developers (we) switch it to
> the newer one? I'd like other developers' comments.

At least we need to do that for internal testing. I'm not sure when the
kernel update will happen.

>
> Thanks,
> --
> Yuji Shimada

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

