RE: [Xen-devel] Critical bug: VT-d fault causes disk corruption or Dom0 kernel panic.
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On
> Behalf Of Keir Fraser
> Sent: Friday, January 23, 2009 10:42 AM
>
> Ah, I know what it is! We actually free up bits of the Xen image at the end
> of Xen bootstrap, and these can now be allocated to a domain (e.g., dom0)
> and DMAed to. But these will be contained within the bounds of __pa(&_start)
> and __pa(&_end) and hence will not have been mapped in dom0'd vtd tables.
>
> Sadly the fact is that Xen relies on validity of memory from the domain heap
> as well as Xen heap anyway, so the avoidance of mapping Xen-critical memory
> in dom0 vtd tables is inadequate anyway, even on x86_32 and ia64.
>
> Also it's going to be hard to do better while keeping efficiency since if
> you only map dom0's pages in its vtd tables then PV backend drivers will not
> work (which rely on DMAing to/from other domain's pages via grant
> references). You'd have to dynamically map/unmap as grants get
> mapped/unmapped, and you may not want the performance hit of that.
>
> I'd personally vote for getting rid of xen_in_range(). Alternatively we
> could have it merely check for is_kernel_text(), but really I think since it
> is not in any way full protection from dom0 I wonder if it is worth the
> bother at all.
>
> What do you think?
>
> -- Keir

Since this is somewhat similar to the issue I'm facing with the TXT patch, it does seem useful to have a good way of knowing where all of the hypervisor memory is. I looked at is_kernel_text() and that only compares against _stext/_etext, which after looking at the xen.lds file, is really just some of the code of the hypervisor.

Is there any reason not to use [_stext, __init_begin) + [__per_cpu_start, __per_cpu_end] + [__bss_start, _end] + [bootsym_phys(trampoline_start), bootsym_phys(trampoline_end)] as a first approximation of hypervisor memory (I'm assuming that the code within [__init_begin, __init_end] is what you reclaim)?
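To make the range proposal concrete, here is a minimal standalone sketch of a multi-range check. It is not the actual xen_in_range() code: the struct, the function name pa_in_ranges(), and every address in the table are made up for illustration, and in Xen the bounds would come from the linker symbols named above. It also treats every range as half-open, which is an assumption. The gap left between the first two ranges stands in for the reclaimed init section, so an address there is deliberately not reported as hypervisor memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). Hypothetical type. */
struct pa_range {
    uint64_t start;
    uint64_t end;
};

/* Return true if paddr lies inside any of the n ranges. */
static bool pa_in_ranges(uint64_t paddr, const struct pa_range *r, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (paddr >= r[i].start && paddr < r[i].end)
            return true;
    return false;
}

/*
 * Made-up layout for illustration only. In Xen these bounds would be
 * derived from the linker symbols; the numbers here are not real.
 */
static const struct pa_range example_xen_ranges[] = {
    { 0x100000, 0x180000 },  /* stands in for [_stext, __init_begin) */
    /* 0x180000..0x1a0000 left out: stands in for the reclaimed init area */
    { 0x1a0000, 0x1b0000 },  /* stands in for the per-cpu area */
    { 0x1b0000, 0x200000 },  /* stands in for [__bss_start, _end) */
    { 0x08c000, 0x08d000 },  /* stands in for the boot trampoline */
};

#define EXAMPLE_NR_RANGES \
    (sizeof(example_xen_ranges) / sizeof(example_xen_ranges[0]))
```

With this layout, pa_in_ranges(0x190000, example_xen_ranges, EXAMPLE_NR_RANGES) is false, so a page in the reclaimed area that later gets handed to dom0 would still be eligible for mapping, while addresses in the text, per-cpu, bss and trampoline stand-ins are reported as hypervisor memory.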
While this still doesn't get the xen heap or domain heap, it at least gets us a little farther.

For the MAC aspect of the TXT patch, we need to know all of the code + data that could be used during resume and before the xen code that MACs everything else. This includes the stack, page tables, etc.

We've also added a fn that checks the ACPI Sx addresses against xen memory (hypervisor + domain) to ensure that tboot can't be tricked into overwriting xen as part of S3. This should be a more comprehensive check than for MAC, since there is no way of detecting if we missed some range.

Joe

>
> On 23/01/2009 17:30, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
>
> > I have not figured out why this is the problem yet but I know comment it out
> > makes the problem go away. Leaving tboot_in_range() in does not cause this
> > problem.
> >
> > Allen
> >
> > -----Original Message-----
> > From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > Sent: Friday, January 23, 2009 12:34 AM
> > To: Kay, Allen M; Li, Xin; Li, Haicheng; 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> > Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
> > Dom0 kernel panic.
> >
> > Are you sure that is the problem? The xen_in_range() change should make the
> > dom0 VT-d table more permissive, and hence if anything less likely to
> > experience VT-d faults. Also it wouldn't seem to explain problems for HVM
> > guest passthrough.
> >
> > -- Keir
> >
> > On 23/01/2009 01:01, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
> >
> >> Looks like the problem is caused by xen_in_range() call in
> >> vtd/iommu.c/intel_iommu_domain_init(). Definition of xen_in_range() was
> >> changed as part of the heap patch.
> >>
> >> I'm looking into change intel_iommu_domain_init() to just map pages in
> >> dom0->page_list. However this looks to be more complicated as d->page_list
> >> is
> >> not initialized at this stage of the boot yet.
> >>
> >> Allen
> >>
> >> -----Original Message-----
> >> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
> >> Sent: Thursday, January 22, 2009 1:23 AM
> >> To: Li, Xin; Li, Haicheng; 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> >> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
> >> Dom0 kernel panic.
> >>
> >> Mmm well not really. :-)
> >>
> >> Is there any assumption in the VT-d setup about preventing access to the
> >> Xen
> >> heap, and could that be broken?
> >>
> >> Perhaps the VT-d pagetables are broken causing bad DMAs leading to data
> >> corruption and bad command packets?
> >>
> >> -- Keir
> >>
> >> On 22/01/2009 08:58, "Li, Xin" <xin.li@xxxxxxxxx> wrote:
> >>
> >>> We are looking into the issue too. If you have any idea on how it's
> >>> caused,
> >>> please tell us :-)
> >>> Thanks!
> >>> -Xin
> >>>
> >>>> -----Original Message-----
> >>>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
> >>>> Sent: Thursday, January 22, 2009 3:40 PM
> >>>> To: Li, Haicheng; 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> >>>> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption
> >>>> or
> >>>> Dom0
> >>>> kernel panic.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> I haven't seen any problems outside of VT-d since c/s 19057, btw.
> >>>>
> >>>> -- Keir
> >>>>
> >>>> On 22/01/2009 03:42, "Li, Haicheng" <haicheng.li@xxxxxxxxx> wrote:
> >>>>
> >>>>> All,
> >>>>>
> >>>>> We met several system failures on different hardware platforms, which
> >>>>> are
> >>>>> all
> >>>>> caused by VT-d fault.
> >>>>> err 1: disk is corrupted by VT-d fault on SATA.
> >>>>> err 2: Dom0 kernel panics at booting, which is caused VT-d fault on
> >>>>> UHCI.
> >>>>> err 3, Dom0 complains disk errors while creating HVM guests.
> >>>>>
> >>>>> The culprit would be changeset 19054 "x86_64: Remove
> >>>>> statically-partitioned
> >>>>> Xen heap.".
> >>>>>
> >>>>> Detailed error logs can be found via BZ#,
> >>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1409.
> >>>>>
> >>>>>
> >>>>> -haicheng
> >>>>> _______________________________________________
> >>>>> Xen-devel mailing list
> >>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>> http://lists.xensource.com/xen-devel