Xen project Mailing List

RE: [Xen-devel] Re: [RFC] RAS(Part II)--MCA enalbing in XEN

To: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>, Christoph Egger <Christoph.Egger@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>

From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>

Date: Tue, 10 Mar 2009 09:19:04 +0800

Accept-language: en-US

Acceptlanguage: en-US

Cc: "Kleen, Andi" <andi.kleen@xxxxxxxxx>, "Frank.Vanderlinden@xxxxxxx" <Frank.Vanderlinden@xxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Gavin Maltby <Gavin.Maltby@xxxxxxx>, "Ke, Liping" <liping.ke@xxxxxxxxx>

Delivery-date: Mon, 09 Mar 2009 18:21:00 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Thread-index: Acmdt+qdhPKHFOpIQ024Wo5jup66OQAP1dBAAMmswkA=

Thread-topic: [Xen-devel] Re: [RFC] RAS(Part II)--MCA enalbing in XEN

Christoph/Frank, do you have any comments? Thanks Yunhong Jiang Jiang, Yunhong <> wrote: > Christoph Egger <mailto:Christoph.Egger@xxxxxxx> wrote: >> On Thursday 05 March 2009 16:19:40 Jiang, Yunhong wrote: >>> Christoph Egger <mailto:Christoph.Egger@xxxxxxx> wrote: >>>> MC_ACT_CACHE_SHIRNK <-- typo. should be MC_ACT_CACHE_SHRINK >>> >>> Ahh, yes, I will fix it. >>> >>>> The L3 cache index disable feature works like this: >>>> >>>> You read the bits 17:6 from the MSR 0xC0000408 (which is MC4_MISC1) >>>> and write it into the index field. This MSR does not belong to >>>> the standard >>>> mc bank data and is therefore provided by mcinfo_extended. >>>> The index field are the bits 11:0 of the PCI function 3 register "L3 >>>> Cache Index Disable". >>> >>> So what's the offset of "L3 Cache Index Disable"? Is it in 256 byte or 4K >>> byte? >> >> Sorry, which offset do you mean ? > > I mean the offset of this register in the PCI function's > configuration space. You know for a PCI device, it has 256 > byte configuration register while PCI-E device has 4K > configuration register. > Currently xen can access the 256 byte config register already, > however, to support 4K range, it requires more stuff, like > mmconfig sparse etc. That's the reason I ask the offset of > this register. > >> >>> >>> For the PCI access, I'd prefer to have xen to control all these, i.e. even >>> if dom0 want to disable the L3 cache, it is done through a hypercall. The >>> reason is, Xen control the CPU, so keep it in Xen will make things >>> simpler. >>> >>> Of course, it is ok for me too, if you want to keep Xen for #MC handler >>> and Dom0 for CE handler. >> >> We still need to define the rules to prevent interferes and >> clarify how to >> deal with Dom0/DomU going wild and breaking the rules. > > As discussed previously, we don't need concern about DomU, > all configuration space access from domU will be intercepted by dom0. > > For Dom0, since currently all PCI access to 0xcf8/cfc will be > intercepted by Xen, so Xen can do checking. We can achieve > same checking for mmconfig if remove that range from dom0. But > I have to say I'm not sure if we do need concern too much what > will happen when dom0 going wild ( after all, a crash in dom0 > will lost everything), especially interfere on such access > will not cause security issue (please correct me if I'm wrong ). > >> >>>> Why is the recover action bound to the bank ? >>>> I would like to see a struct mcinfo_recover rather extending >>>> struct mcinfo_bank. That gives us flexibility. >>> >>> I'd get input from Frank or Gavin. Place mcinfo_recover in mcinfo_back has >>> advantage of keep connection of the error source and the action, but it do >>> make the mcinfo_bank more complex. Or we can keep the cpu/bank information >>> in the mcinfo_recover also, so that we keep the flexibility and don't lose >>> the connection. >> >> From your suggestions I prefer the last one, but is still limited due >> to the assumption that each struct mcinfo_bank and each struct >> mcinfo_extended stands for exactly one error. >> >> This assumption doesn't cover follow-up errors which may be needed to >> determine the real root cause. Some of them may even be ignored >> depending on what is going on. > > I think the assumption here is a recover action will be > triggered only by one bank. For example, we offline page > because one MC bank tell us that page is broken. > > The "follow-up errors" is something interesting to me, do you > have any example? It's ok for us to not include the back > information if there are such requirement. > > Thanks > Yunhong Jiang > >> >> Christoph >> >>> >>> Thanks >>> Yunhong Jiang >>> >>>> Christoph >>>> >>>> On Thursday 05 March 2009 09:31:27 Jiang, Yunhong wrote: >>>>> Christoph/Frank, Followed is the interface definition, please have a >>>>> look. >>>>> >>>>> Thanks >>>>> Yunhong Jiang >>>>> >>>>> 1) Interface between Xen/dom0 for passing xen's recovery action >>>>> information to dom0. Usage model: After offlining broken page, Xen might >>>>> pass its page-offline recovery action result information to dom0. Dom0 >>>>> will save the information in non-volatile memory for further proactive >>>>> actions, such as offlining the easy-broken page early when doing next >>>>> reboot. >>>>> >>>>> >>>>> struct page_offline_action >>>>> { >>>>> /* Params for passing the offlined page number to DOM0 */ >>>>> uint64_t mfn; uint64_t status; /* Similar to page offline hypercall */ >>>>> }; >>>>> >>>>> struct cpu_offline_action >>>>> { >>>>> /* Params for passing the identity of the offlined CPU to DOM0 */ >>>>> uint32_t mc_socketid; uint16_t mc_coreid; >>>>> uint16_t mc_core_threadid; >>>>> }; >>>>> >>>>> struct cache_shrink_action >>>>> { >>>>> /* TBD, Christoph, please fill it */ >>>>> }; >>>>> >>>>> /* Recover action flags, giving recovery result information to guest */ >>>>> /* Recovery successfully after taking certain recovery actions below */ >>>>> #define REC_ACT_RECOVERED (0x1 << 0) >>>>> /* For solaris's usage that dom0 will take ownership when crash */ >>>>> #define REC_ACT_RESET (0x1 << 2) >>>>> /* No action is performed by XEN */ >>>>> #define REC_ACT_INFO (0x1 << 3) >>>>> >>>>> /* Recover action type definition, valid only when flags & >>>>> REC_ACT_RECOVERED */ #define MC_ACT_PAGE_OFFLINE 1 >>>>> #define MC_ACT_CPU_OFFLINE 2 >>>>> #define MC_ACT_CACHE_SHIRNK 3 >>>>> >>>>> struct recovery_action >>>>> { >>>>> uint8_t flags; >>>>> uint8_t action_type; >>>>> union >>>>> { >>>>> struct page_offline_action page_retire; >>>>> struct cpu_offline_action cpu_offline; >>>>> struct cache_shrink_action cache_shrink; >>>>> uint8_t pad[MAX_ACTION_SIZE]; >>>>> } action_info; >>>>> } >>>>> >>>>> struct mcinfo_bank { >>>>> struct mcinfo_common common; >>>>> >>>>> uint16_t mc_bank; /* bank nr */ >>>>> uint16_t mc_domid; /* Usecase 5: domain referenced by mc_addr on >>>>> dom0 * and if mc_addr is valid. Never valid on DomU. */ uint64_t >>>>> mc_status; /* bank status */ uint64_t mc_addr; /* bank address, >>>>> only valid * if addr bit is set in mc_status */ >>>>> uint64_t mc_misc; uint64_t mc_ctrl2; >>>>> uint64_t mc_tsc; >>>>> /* Recovery action is performed per bank */ >>>>> struct recovery_action action; >>>>> }; >>>>> >>>>> 2) Below two interfaces are for MCA processing internal use. >>>>> a. pre_handler will be called earlier in MCA ISR context, mainly for >>>>> early need_reset detection for avoiding log missing (flag MCA_RESET). >>>>> Also, pre_handler might be able to find the impacted domain if possible. >>>>> b. mca_error_handler is actually a (error_action_index, >>>>> recovery_handler pointer) pair. The defined recovery_handler function >>>>> performs the actual recovery operations in softIrq context after the >>>>> per_bank MCA error matching the corresponding mca_code index. If >>>>> pre_handler can't judge the impacted domain, recovery_handler must >>>>> figure it out. >>>>> >>>>> /* Error has been recovered successfully */ >>>>> #define MCA_RECOVERD 0 >>>>> /* Error impact one guest as stated in owner field */ #define MCA_OWNER >>>>> 1 /* Error can't be recovered and need reboot system */ #define >>>>> MCA_RESET 2 /* Error should be handled in softIRQ context */ #define >>>>> MCA_MORE_ACTION 3 >>>>> >>>>> struct mca_handle_result >>>>> { >>>>> uint32_t flags; >>>>> /* Valid only when flags & MCA_OWNER */ >>>>> domid_d owner; >>>>> /* valid only when flags & MCA_RECOVERD */ >>>>> struct recovery_action *action; >>>>> }; >>>>> >>>>> struct mca_error_handler >>>>> { >>>>> /* >>>>> * Assume we will need only architecture defined code. If the index >>>>> can't be setup by * mca_code, we will add a function to do the (index, >>>>> recovery_handler) mapping check. * This mca_code represents the recovery >>>>> handler pointer index for identifying this * particular error's >>>>> corresponding recover action */ >>>>> uint16_t mca_code; >>>>> >>>>> /* Handler to be called in softIRQ handler context */ >>>>> int recovery_handler(struct mcinfo_bank *bank, >>>>> struct mcinfo_global *global, >>>>> struct mcinfo_extended *extention, >>>>> struct mca_handle_result *result); >>>>> >>>>> }; >>>>> >>>>> struct mca_error_handler intel_mca_handler[] = >>>>> { >>>>> .... >>>>> }; >>>>> >>>>> struct mca_error_handler amd_mca_handler[] = >>>>> { >>>>> .... >>>>> }; >>>>> >>>>> >>>>> /* HandlVer to be called in MCA ISR in MCA context */ >>>>> int intel_mca_pre_handler(struct cpu_user_regs *regs, >>>>> struct mca_handle_result *result); >>>>> >>>>> int amd_mca_pre_handler(struct cpu_user_regs *regs, >>>>> struct mca_handle_result *result); >>>>> >>>>> Frank.Vanderlinden@xxxxxxx >> <mailto:Frank.Vanderlinden@xxxxxxx> wrote: >>>>>> Jiang, Yunhong wrote: >>>>>>> Frank/Christopher, can you please give more comments for it, or you >>>>>>> are OK with this? For the action reporting mechanism, we will send out >>>>>>> a proposal for review soon. >>>>>> >>>>>> I'm ok with this. We need a little more information on the AMD >>>>>> mechanism, but it seems to me that we can fit this in. >>>>>> >>>>>> Sometime this week, I'll also send out the last of our changes that >>>>>> haven't been sent upstream to xen-unstable yet. Maybe we can combine >>>>>> some things in to one patch, like the telemetry handling changes that >>>>>> Gavin did. The other changes are error injection (for debugging) and >>>>>> panic crash dump support for our FMA tools, but those are probably >>>>>> only interesting to us. >>>>>> >>>>>> - Frank >>>> >>>> -- >>>> ---to satisfy European Law for business letters: >>>> Advanced Micro Devices GmbH >>>> Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen >>>> Geschaeftsfuehrer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni >>>> Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen >>>> Registergericht Muenchen, HRB Nr. 43632 >> >> >> >> -- >> ---to satisfy European Law for business letters: >> Advanced Micro Devices GmbH >> Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen >> Geschaeftsfuehrer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni >> Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen >> Registergericht Muenchen, HRB Nr. 43632 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.