RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
Jan Beulich wrote:
>>>> On 11.10.11 at 11:51, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> Jan Beulich wrote:
>>> If the prefetch was from Xen space (only in guest context),
>>> delivering a vMCE to the guest is pointless (and perhaps confusing
>>> to the guest).
>>>
>>
>> Yes, exactly. how about delay handle it as:
>> * at mce isr
>>   if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>>       xen panic;
>> * at mce softirq
>>   if ( (srar error) && (EIPV ==0) && (broken page owned by
>>        hypervisor) ) xen panic;
>
> Possible, but I'm not convinced.
>
>>>> * guest may kill app, kernel thread, guest itself, or whatever;
>>>>
>>>> The error is still an error, w/ 2 possibilities in the future:
>>>> 1. it may not be consumed as an SRAR error, system keep going,
>>>> h/w mechanism may detect a SRAO error (i.e. memory scrub) at some
>>>> time point and handled then;
>>>> 2. it may be consumed at some time point and a SRAR error
>>>> triggered again. At this time, 1). if srar occurred at
>>>> hypervisor context, xen will panic. or, 2). if srar occurred at
>>>> guest
>>>> context, xen kill the guest as a malicious one (as what the 2nd
>>>> patch do), and move the page to broken page list;
>>>>
>>>> Considering the rare possibility of the above case, I think it's
>>>> acceptable to handle it in this way. Thoughts?
>>>
>>> You're only discussing instruction fetches (which can be discarded),
>>> but you're not covering the other example I gave (GDT access from
>>> guest context - just like this is a ring-0 operations from the
>>> paging unit's pov, this ought to be an out-of-context operation
>>> from MCE's perspective).
>>
>> That would be data load error (EIPV=1), a sync error.
>
> If indeed implemented that way in hardware, that would make the
> handling ambiguous: A GDT access must not (unconditionally) be
> attributed to the (pv) guest, as it is not a problem the guest can
> (necessarily) deal with (considering the split page ownership of
> what constitutes the GDT under Xen, the guest should only be
> accountable for the non-reserved part of the GDT, the rest should
> be attributed back to Xen).
>
> The same would go for (perhaps speculative) page table walks.
>

This does not seem ambiguous to me: whoever owns the page takes the error.
If the error is caused by the hypervisor accessing a broken page, Xen panics;
if the error is caused by a guest access, the guest handles it (I guess it
would normally kill itself);
if the guest maliciously accesses the page again, the hypervisor kills it.

> Furthermore, data prefetching is possible too - how would a problem
> there get reported?
>

It may be reported as an unknown error, or not reported at all, but not as an
SRAR data load error w/ EIPV=1.

Thanks,
Jinsong