[Xen-devel] Re: Xen-devel Digest, Vol 25, Issue 93

To: xen-devel@xxxxxxxxxxxxxxxxxxx
From: PUCCETTI Armand <armand.puccetti@xxxxxx>
Date: Mon, 12 Mar 2007 17:10:45 +0100
Delivery-date: Mon, 12 Mar 2007 09:09:27 -0700
List-id: Xen developer discussion <xen-devel.lists.xensource.com>

When the system boots, the processor is normally in "real-mode", and
it's definitely not got paging enabled. So we have to "make the guest OS
believe this is the case". But at the same time, the guest OS is most
likely not loaded at address zero in memory, so we need paging enabled
to remap the GUEST PHYSICAL address to match the machine physical
address. So we have a "linear map" to translate the "address zero" to
the "start of guest memory", and so on for every page of memory in the
guest.
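
The "linear map" described above can be sketched roughly as follows. This
is only an illustration in Python, not Xen code: the function names, the
4 KiB page size and the machine base frame 0x100 are assumptions made for
the example.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, as on x86


def build_linear_map(guest_pages, machine_base_frame):
    """Map guest frame n to machine frame machine_base_frame + n."""
    return {gfn: machine_base_frame + gfn for gfn in range(guest_pages)}


def guest_to_machine(linear_map, guest_phys_addr):
    """Translate a guest-physical address to a machine-physical address."""
    gfn, offset = divmod(guest_phys_addr, PAGE_SIZE)
    return linear_map[gfn] * PAGE_SIZE + offset


# Hypothetical guest: 4 pages of memory placed at machine frame 0x100.
linear_map = build_linear_map(guest_pages=4, machine_base_frame=0x100)

# Guest "address zero" lands at the start of guest memory.
print(hex(guest_to_machine(linear_map, 0x0)))
print(hex(guest_to_machine(linear_map, 0x1234)))
```

The guest keeps believing its memory starts at zero, while every access
is redirected page by page to wherever the guest was actually loaded.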

This is not hard to do, since the AMD-V/VT feature of the processor
expects the paging-bit to be different between what the guest "thinks"
and the actual case. In the AMD-V, there's even support to run real-mode
with paging enabled, so all the BIOS-code and such will be running in
this mode. VT has to do a bunch of tricky stuff to work around that
problem.

Ok fine, does this argument hold true even for non-VT and non-Pacifica
enabled processors?

I doubt it.


Not precisely. I'm talking only about HVM mode, which is "full
virtualization". PV-mode uses a different paging interface, which, at
least for the most part, consists of changing the whole area of code in
the kernel that updates the page-tables, by adding code that is aware of
the THREE types of address (guest-virtual, guest-physical and
machine-physical). This means that there's no real need for the
"read-only page-tables" and "shadow-mode" - the page-table just contains
the right value for the machine-physical address. [That's not to say
that read-only page-tables can't be used in a PV system too - I'm not
100% sure how the page-table management works in the PV mode].
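
A rough sketch of the idea that a PV page-table "just contains the right
value for the machine-physical address" might look like this. It is an
illustration only: the table name p2m, the helper names and all frame
numbers are hypothetical, and it does not show how Xen validates or
applies such updates.

```python
PAGE_SHIFT = 12      # assumed 4 KiB pages
PTE_PRESENT = 0x1    # "present" bit of a page-table entry

# Hypothetical table: guest-physical frame -> machine-physical frame.
p2m = {0: 0x5A0, 1: 0x133, 2: 0x7F2}


def make_pte(guest_frame, flags):
    """Build a page-table entry that holds the MACHINE frame directly."""
    machine_frame = p2m[guest_frame]
    return (machine_frame << PAGE_SHIFT) | flags


def pte_machine_frame(pte):
    """Extract the machine frame number stored in an entry."""
    return pte >> PAGE_SHIFT


# The PV-aware kernel maps guest-virtual page 0x400 to guest frame 1,
# but what ends up in the table is the machine frame.
page_table = {}
page_table[0x400] = make_pte(1, PTE_PRESENT)
print(hex(page_table[0x400]))
```

Because the translation happens in the kernel code that builds the
entry, the processor can use the table as it is, with no second copy.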

That is very interesting info on the paging system. Mats, could you please
explain a bit how the PV paging works? How do the guest+host page
tables work together? What does the guest page table point to, i.e. how
and when is it mapped onto the host page table?

I have seen in the code that there are different cases of guest+host
page table heights. Why?


thanks. Armand

I hope I made myself clear.
Please enlighten me :-).

When paging is enabled, we use a shadow page-table, which is essentially
that the GUEST sees one page-table, and the processor another (thanks to
the fact that the hypervisor intercepts the CR3 read/write operations,
and when CR3 is read back by the guest, we don't send back the value
it's ACTUALLY POINTING TO IN THE PROCESSOR, but the value that was set
by the guest). So there are two page-tables.
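
The CR3 interception described above can be modelled with a small
sketch. This is not the Xen implementation: the class, its attributes
and the shadow root addresses are made up for the illustration.

```python
class ShadowVcpu:
    """Toy model of a virtual CPU whose CR3 accesses are intercepted."""

    def __init__(self):
        self.guest_cr3 = 0          # value the guest believes is in CR3
        self.hardware_cr3 = 0       # value the processor really uses
        self._shadow_roots = {}     # guest root -> shadow root
        self._next_shadow_root = 0x800000  # hypothetical machine address

    def write_cr3(self, guest_value):
        """Intercepted write: remember the guest value, load the shadow."""
        self.guest_cr3 = guest_value
        if guest_value not in self._shadow_roots:
            self._shadow_roots[guest_value] = self._next_shadow_root
            self._next_shadow_root += 0x1000
        self.hardware_cr3 = self._shadow_roots[guest_value]

    def read_cr3(self):
        """Intercepted read: hand back what the guest wrote."""
        return self.guest_cr3


vcpu = ShadowVcpu()
vcpu.write_cr3(0x3000)
print(hex(vcpu.read_cr3()), hex(vcpu.hardware_cr3))
```

The guest only ever sees the value it set, while the processor walks the
shadow table kept by the hypervisor.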

Got this well, thanks Mats :).

To make the page-table updates by the guest visible to the hypervisor,
all of the guest-page-tables are made read-only (by scanning the new CR3
value whenever one is set).

I didn't get this well either :(
Sorry, but do you mean CR3 for the guest or for the
processor? I hope you mean guest?

Yes, scan the guest-CR3 to see where it placed the page-tables.
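
Scanning from the guest CR3 to find the page-table pages could be
sketched like this. It is a simplified model with an assumed two-level
layout; the dictionary standing in for guest memory and the function
name are hypothetical, and real page-table formats have more levels and
more flag bits.

```python
PAGE_SHIFT = 12      # assumed 4 KiB pages
PTE_PRESENT = 0x1    # "present" bit of a page-table entry


def find_page_table_frames(guest_memory, cr3_frame, levels=2):
    """Return the set of guest frames that hold page tables."""
    found = set()

    def walk(frame, level):
        if frame in found:
            return
        found.add(frame)
        if level == 1:
            return  # entries of the lowest level point at data pages
        for entry in guest_memory.get(frame, []):
            if entry & PTE_PRESENT:
                walk(entry >> PAGE_SHIFT, level - 1)

    walk(cr3_frame, levels)
    return found


# Hypothetical guest memory: frame number -> list of entries in that frame.
guest_memory = {
    0x10: [(0x11 << PAGE_SHIFT) | 1, 0, (0x12 << PAGE_SHIFT) | 1],
    0x11: [(0x40 << PAGE_SHIFT) | 1],
    0x12: [(0x41 << PAGE_SHIFT) | 1, (0x42 << PAGE_SHIFT) | 1],
}

read_only_frames = find_page_table_frames(guest_memory, cr3_frame=0x10)
print(sorted(hex(frame) for frame in read_only_frames))
```

The frames collected this way are the ones that would be made read-only,
so that any later write to them faults into the hypervisor.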

Whenever a page-fault happens, the hypervisor has "first look", and
determines if the update is for a page-table or not. If it is a
page-table update, the guest operation is emulated (in x86_emulate.c),
and the result is written to the shadow-page-table AND the

Why do we need emulation? Is there some peculiar reason for emulating?
Do you mean to say that if I am running a 32 bit domU on top of a
64 bit processor, the guest operation for updating the page
table is emulated by the hypervisor? Am I right?

No, it's simply because we need to see the result of the instruction and
write it to two places (with some modification in one of those places).

So if the code is doing, for example: "*pte |= 1;" (set a
page-table-entry to "present"), we need to mark the
guest-page-table-entry as "present", and mark our shadow-entry "present"
(and perhaps do some other work too, but that's the minimum work
needed).
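
Writing one result to two places, with a modification in one of them,
might be sketched as below. This is a toy model, not what
x86_emulate.c does: the function name, the list-based tables and the
frame numbers are assumptions for the example.

```python
PAGE_SHIFT = 12                       # assumed 4 KiB pages
FLAG_MASK = (1 << PAGE_SHIFT) - 1     # low bits of an entry hold flags
PTE_PRESENT = 0x1


def emulate_pte_write(guest_table, shadow_table, index, new_guest_pte,
                      guest_to_machine_frame):
    """Apply a guest page-table write to both the guest and shadow table."""
    # Place 1: the guest table gets exactly what the guest wrote.
    guest_table[index] = new_guest_pte

    # Place 2: the shadow table gets the same flags, but the frame number
    # is replaced by the machine frame backing that guest frame.
    flags = new_guest_pte & FLAG_MASK
    if flags & PTE_PRESENT:
        machine_frame = guest_to_machine_frame[new_guest_pte >> PAGE_SHIFT]
        shadow_table[index] = (machine_frame << PAGE_SHIFT) | flags
    else:
        shadow_table[index] = 0


guest_table = [0x5000]            # entry for guest frame 5, not yet present
shadow_table = [0]
guest_to_machine_frame = {5: 0x105}   # hypothetical translation

# The guest executes "*pte |= 1;" and the write is emulated.
emulate_pte_write(guest_table, shadow_table, 0, guest_table[0] | 1,
                  guest_to_machine_frame)
print(hex(guest_table[0]), hex(shadow_table[0]))
```

The guest entry keeps the guest-physical frame, while the shadow entry
carries the machine frame the processor actually needs.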

This brings one more question to my mind. Why do we use pinning then?


I believe there are two types of pinning! Page-pinning, which is blocking
a page from being accessed in an incorrect way [again, I'm not 100% sure
how this works, or exactly what it does - just that it's a term used in
the general way I described in the previous sentence].

As I see it: to avoid shadow page tables being swapped out before the
page tables they actually point to are swapped. Am I right?
But according to the interface manual, pinning is used to bind a vcpu to
a specific CPU in an SMP environment. These two look like pretty
orthogonal statements to me, which means I may be wrong :(.
Can somebody help me in this regard?


CPU pinning is to tie a VCPU to a (set of) processor(s). For example,
you may want to pin Dom0 to run only on CPU0, and pin a DomU to run on
CPUs 1, 2 and 3. That way, Dom0 is ALWAYS able to run on its own CPU,
and it's never in contention about which CPU to use, and DomU can run on
three CPUs as much as it likes. You could have another DomU pinned to
CPU 3 if you wish. That means that CPUs 1 and 2 are exclusively for the
first DomU, whilst the second DomU shares CPU3 with the first DomU (so
they both get half the CPU performance of one CPU - on average over a
reasonable amount of time).
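
The pinning example above can be written down as a small table to see
which CPUs end up shared. This is just a model of the example, not a Xen
interface: the domain labels DomU-1 and DomU-2 and the helper name are
made up.

```python
from collections import defaultdict

# Pinning from the example: domain -> set of physical CPUs it may run on.
pinning = {
    "Dom0": {0},
    "DomU-1": {1, 2, 3},
    "DomU-2": {3},
}


def domains_per_cpu(pinning):
    """Invert the pinning table: CPU -> set of domains allowed on it."""
    sharing = defaultdict(set)
    for domain, cpus in pinning.items():
        for cpu in cpus:
            sharing[cpu].add(domain)
    return dict(sharing)


sharing = domains_per_cpu(pinning)
for cpu in sorted(sharing):
    print(cpu, sorted(sharing[cpu]))
```

CPU 0 belongs to Dom0 alone, CPUs 1 and 2 to the first DomU alone, and
CPU 3 is the only one where two domains compete.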

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

Follow-Ups:
- RE: [Xen-devel] Re: Xen-devel Digest, Vol 25, Issue 93
  - From: Petersson, Mats
