
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?

  • To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>
  • From: "David Stone" <unclestoner@xxxxxxxxx>
  • Date: Fri, 14 Dec 2007 17:43:52 -0500
  • Cc: Xen Developers <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 14 Dec 2007 14:44:23 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

> > root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given
> > guest l1e flags:63 at 10000000 (the effective mm type:6), because the
> > host mtrr type is:0
> > (XEN) CPU 1: Machine Check Exception: 0000000000000005
> > (XEN) Bank 0: b200004000000800
> > (XEN) Bank 5: b200121020080400
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 1:
> > (XEN) CPU context corrupt****************************************
> > (XEN)
> > (XEN) Reboot in five seconds..
> That looks like the CPU toasted itself. Bits 0-16 == 0x0400 in a
> machine-check status register means 'CPU internal timer error'. Perhaps this
> #MC means something else in the context of VT-d though? We probably need
> someone from Intel to help decode what happened here.

Hmm, thanks.  I'll concentrate on the #MC for now.

Regarding which, how are you resolving 0x0400 in the status register to
'CPU internal timer error'?  I'm looking at the "System Programming"
Intel manual and it seems to indicate that an Error Code with bits
0000 01xx xxxx xxxx (like 0x0400) is an "Internal Unclassified" error.
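
To make the comparison concrete, here is a minimal Python sketch that splits
the two bank status values from the log above into their fields. The flag
bit positions and the 15:0 / 31:16 field split are taken from the
IA32_MCi_STATUS layout in the Intel manual as I read it; the function names
are hypothetical, and the pattern check only tests the "0000 01xx xxxx xxxx"
mask quoted above, not any finer-grained table entry.

```python
# Flag bits in IA32_MCi_STATUS (positions as read from the Intel SDM).
STATUS_FLAGS = {
    "VAL": 63,    # register holds valid error information
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # IA32_MCi_MISC holds valid data
    "ADDRV": 58,  # IA32_MCi_ADDR holds valid data
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status):
    """Split a 64-bit bank status value into flags and error-code fields."""
    flags = [name for name, bit in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": status & 0xFFFF,               # bits 15:0
    }


def matches_internal_unclassified(mca_error_code):
    """True if the code fits the pattern 0000 01xx xxxx xxxx."""
    return (mca_error_code & 0xFC00) == 0x0400


# Bank values copied from the (XEN) log quoted above.
for bank, status in ((0, 0xB200004000000800), (5, 0xB200121020080400)):
    fields = decode_mci_status(status)
    print("Bank %d: flags=%s model=0x%04x mca=0x%04x internal-unclassified=%s" % (
        bank,
        ",".join(fields["flags"]),
        fields["model_specific_code"],
        fields["mca_error_code"],
        matches_internal_unclassified(fields["mca_error_code"]),
    ))
```

Both banks decode with the PCC flag set, which is consistent with the "CPU
context corrupt" panic, and only Bank 5 carries an error code that fits the
0000 01xx xxxx xxxx pattern.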

For machine-checks, is there the notion of protecting the hypervisor
from problems encountered in the HVM guest?  I.e., if a #MC happens
when a guest is executing (non-root mode), is the host equally
screwed?  I'm guessing there is no such protection, if the nature of a
#MC is such that it is the processor itself that is screwed, not any
particular level of hardware?

Finally, one thing I'm still not sure about is exactly what PCI
devices (as identified by B:D:F) I should hide from Dom0 and pass
through to the guest.  For my machine, the PCI topology as seen from
Dom0 is:

00:00.0 Host bridge [0600]: Intel Corporation DRAM Controller
[8086:29b0] (rev 02)
00:01.0 PCI bridge [0604]: Intel Corporation PCI Express Root Port
[8086:29b1] (rev 02)
00:1c.0 PCI bridge [0604]: Intel Corporation PCI Express Port 1
[8086:2940] (rev 02)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge
[8086:244e] (rev 92)
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Unknown
device [1002:94c3]

01:00.0 is the 16-lane PCI-Express graphics card I'm trying to pass
through to my Windows DomU.  00:01.0 is the root complex to which it is
attached (I'm pretty sure based on the below).  I think 00:1c.0 is a
switch to a one-lane PCI Express slot on the motherboard.

So I'm hiding/passing through both the root complex (00:01.0) and the
graphics card (01:00.0).  Interestingly, if I explicitly hide only the
root complex, pciback seems to automagically grab the graphics card too.
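
For reference, a small Python sketch of how the B:D:F values in the listing
above can be pulled out and formatted. The helper names are hypothetical,
and the "(BB:DD.F)(BB:DD.F)" output format is an assumption based on my
understanding of the pciback.hide= boot parameter in Xen 3.x dom0 kernels;
check it against your own kernel before relying on it.

```python
import re

# Matches the "BB:DD.F" prefix that lspci prints at the start of each entry.
BDF_RE = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\s", re.IGNORECASE)


def parse_bdf(line):
    """Return (bus, device, function) for an lspci line, or None."""
    match = BDF_RE.match(line)
    if match is None:
        return None
    bus, dev, fn = (int(group, 16) for group in match.groups())
    return bus, dev, fn


def hide_string(bdfs):
    """Format B:D:F tuples as "(BB:DD.F)(BB:DD.F)..." (assumed format)."""
    return "".join("(%02x:%02x.%x)" % bdf for bdf in bdfs)


# The two devices discussed above, as they appear in the listing.
lines = [
    "00:01.0 PCI bridge [0604]: Intel Corporation PCI Express Root Port",
    "01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Unknown",
]
devices = [parse_bdf(line) for line in lines]
print(hide_string(devices))
```

Continuation lines of a wrapped lspci entry (such as "[8086:29b1] (rev 02)")
do not start with a B:D:F, so parse_bdf returns None for them and they can
be filtered out.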

