Re: [Xen-devel] Re: [RFC] RAS(Part II)--MCA enalbing in XEN
Thanks for your reply. Let me explain my comments a little:

Jiang, Yunhong wrote:
> One note: we deliver a vMCE to dom0/domU only when it is impacted. The idea behind this is that the MCE is handled entirely by the Xen HV, while the guest's vMCE handler only works for the guest itself. For example, when a page is broken, Xen will first mark the page offline on the Xen side (i.e. take the recovery action), then inject a vMCE into the corresponding guest (dom0 or domU); the guest will kill the application using the page, free the page, or take further action. And we always pass the vIRQ to dom0 for logging and telemetry; user space tools can take more proactive action based on this if needed.

I understand this part, and have no problems with the mechanism itself. I think it has advantages over the original concept, where dom0 informs domUs. My question is: what useful action can a domU take without fully knowing the physical system? I'll go into that more below.

>> What would be needed for the Solaris framework, however, is to provide information on what action was taken, along with the telemetry. As
>
> Agreed that this modification is needed. Sorry, we didn't realize the requirement from Dom0 after reboot. Either we can pass the action in the telemetry, or Dom0 can use an action-specific method, like retrieving the offlined pages from Xen before reboot. If we take the former, we may need an interface definition.

Passing the action along with the telemetry seems the best way to go to me. Since the telemetry is used to determine which action to take, any information on actions already taken should come at the same time.

> What do you mean by the effect of the wrmsr instruction? We need to consider injecting #GP on an invalid wrmsr, or removing the event when the guest clears MCi_STATUS_MCA, if needed. We sent this RFC early to get feedback on the design idea first.
> Or do you mean more than this for the wrmsr?

>> To take further action, the MCA code in dom0 (or a domU) needs to know that it is running under Xen, and it needs to have detailed physical
>
> Our purpose is that the guest has no idea it is running under Xen, as described above. And what information do you think a normal guest's MCA handler needs to know, and what would it use the detailed physical information for? After all, a guest cares only about itself. Also, maybe we can't provide a PV handler for every guest (like Windows). Dom0 is a special case: its vIRQ handler knows it is running under Xen, but that is for logging/telemetry and for proactive action.
>
>> information on the system. In other words, the existing code that can be
>
> What do you mean by "existing": our patch or the current Xen implementation?
>
>> used is only the code that gathers some information. So, the only thing that vMCE is good for is that you can run unmodified error logging code. But you can't interpret any of the error information further without knowing more. Especially for a domU, which might not know anything, this doesn't seem useful. What would the user of a domU do with that information?
>>
>> To recap, I think the part where Xen itself takes action is fine, with some modifications. But I don't see any advantages in vMCE delivery, unless I'm missing something of course.
>
> I think the main advantages are:
> a) We don't need to maintain a PV MCA handler for the guest, especially for an HVM guest.
> b) We can benefit from improvements and enhancements to the guest's MCA handling.
> c) Applying this to dom0, we don't need a different mechanism for dom0 and HVM guests.

Ok, my main issue here is: if you want to enable a guest to run unmodified MCA code (which you state as a goal, and as an advantage of the vMCE approach), then what can the guest actually do? Or the dom0, for that matter? MCA information is highly specific to the hardware. Without additional information on the hardware, it is hard, or even impossible, for the unmodified MCA handler in dom0 or a domU to do anything useful.
It will interpret the information to fit the virtualized environment it is in, which doesn't match the reality of the hardware at all. So what can it do? It can just read the MSRs and log the information, but even that information wouldn't be useful; it is already available to dom0, where the code and/or person who can make sense of the data will see it. The unmodified MCA handler also can't take any corrective action; it might think that it is taking action, but in fact, its wrmsr instructions have no effect (and they shouldn't; guests should definitely not be able to do MSR writes).

I only see one possible exception to this: if you translate the ADDR MSR of a bank to a guest address in the vmca info before delivering the vMCE, then the guest could do something useful, because its virtualized MSR reads would then produce a guest address that it can act on. But currently, your code doesn't seem to do this; the virtualized MSR will produce the machine address, which the guest can't do anything with unless it knows it's running under Xen.

So that's my main problem here: there is a contradiction. The vMCE mechanism as you implement it enables guests to run an unmodified MCA handler, but there isn't actually much that the guest can do with that without knowing it runs under Xen. I see only one specific use for this: if you translate the ADDR info to a guest address, it could potentially try to do a "local" page retire.

- Frank

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
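
Editorial illustration: the ADDR translation suggested in the message above could look roughly like the following C sketch. It is not taken from the RFC patch or from the Xen source. The types `struct p2m_entry`, `struct guest`, and `struct vmca_bank`, and the functions `maddr_to_gaddr` and `vmca_fill_bank`, are hypothetical names made up for this example, and a real hypervisor would use its own machine-to-physical tables rather than a linear array. The choice to clear the ADDRV bit when the guest does not own the page is also an assumption, not something the thread specifies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT        12
#define PAGE_OFFSET_MASK  ((1ULL << PAGE_SHIFT) - 1)
#define MCI_STATUS_ADDRV  (1ULL << 58)  /* "ADDR register valid" bit of MCi_STATUS */

/* Hypothetical mapping entry: one guest frame backed by one machine frame. */
struct p2m_entry {
    uint64_t gfn;  /* guest frame number */
    uint64_t mfn;  /* machine frame number */
};

/* Hypothetical guest descriptor holding a small frame map. */
struct guest {
    const struct p2m_entry *map;
    size_t nr_entries;
};

/* Hypothetical virtualized bank state that guest MSR reads would return. */
struct vmca_bank {
    uint64_t status;
    uint64_t addr;
};

/*
 * Translate a machine address to the address the guest uses for the
 * same page. Returns false if the guest does not own the page.
 */
bool maddr_to_gaddr(const struct guest *g, uint64_t maddr, uint64_t *gaddr)
{
    uint64_t mfn = maddr >> PAGE_SHIFT;

    for (size_t i = 0; i < g->nr_entries; i++) {
        if (g->map[i].mfn == mfn) {
            *gaddr = (g->map[i].gfn << PAGE_SHIFT) | (maddr & PAGE_OFFSET_MASK);
            return true;
        }
    }
    return false;
}

/*
 * Fill the virtual bank before the vMCE is delivered. If the physical
 * bank reports a valid address and it maps into the guest, expose the
 * guest address. Otherwise clear ADDRV (an assumption of this sketch)
 * so an unmodified handler never acts on a machine address.
 */
void vmca_fill_bank(const struct guest *g, uint64_t phys_status,
                    uint64_t phys_addr, struct vmca_bank *vbank)
{
    uint64_t gaddr;

    vbank->status = phys_status;
    vbank->addr = 0;

    if (!(phys_status & MCI_STATUS_ADDRV))
        return;

    if (maddr_to_gaddr(g, phys_addr, &gaddr))
        vbank->addr = gaddr;
    else
        vbank->status &= ~MCI_STATUS_ADDRV;
}
```

With a translation step like this, an unmodified guest handler that reads the virtual ADDR register would see an address in its own address space, which is what would make the "local" page retire mentioned above possible.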