Xen project Mailing List

RE: [Xen-devel] paging mechanism clarification

To: "Mark Williamson" <mark.williamson@xxxxxxxxxxxx>

From: "Pradeep Singh, TLS-Chennai" <pradeep_s@xxxxxx>

Date: Wed, 14 Mar 2007 09:51:53 +0530

Cc: "Petersson, Mats" <Mats.Petersson@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx

Delivery-date: Tue, 13 Mar 2007 21:25:49 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Thread-index: Acdl7Hx5NmD98dWPTcWXiLW1/il7tQAA87rW

Thread-topic: [Xen-devel] paging mechanism clarification

-----Original Message-----
From: M.A. Williamson on behalf of Mark Williamson
Sent: Wed 14-Mar-07 9:25 AM
To: Pradeep Singh, TLS-Chennai
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Petersson, Mats
Subject: Re: [Xen-devel] paging mechanism clarification

> The most important difference is that in PV mode the pagetables the guest
> generates are directly used by the host processor. For this reason, there
> are various changes required to the way the guest updates its pagetables.
> The most important of these is that the guest must translate its
> pseudophysical addresses to host machine addresses before filling out page
> table entries.
>
> Thanks for the clarification Mark.
> But isn't the Xen hypervisor supposed to do this translation of pseudophysical
> addresses to host machine addresses, instead of the guest itself?

Not for paravirtualised (Xen-aware) guests. They handle their own
translations, which makes it possible to eliminate shadow paging entirely for
them; this is a benefit to performance. Shadow pagetables are normally only
used for a PV guest when live migration is in progress.
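
To make "they handle their own translations" a bit more concrete, here is a
rough sketch in C. It is only an illustration, not the actual Xen or Linux
code: the p2m[] table, its contents, and the helper names
pseudophys_to_machine() and make_pte() are all made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_MASK    ((UINT64_C(1) << PAGE_SHIFT) - 1)
#define PTE_PRESENT  UINT64_C(0x1)
#define PTE_RW       UINT64_C(0x2)

/* Hypothetical pseudophysical-to-machine table (values are invented):
 * index = guest pseudophysical frame number, value = host machine frame. */
#define GUEST_PAGES 4
static const uint64_t p2m[GUEST_PAGES] = { 0x1a2b, 0x0040, 0x7777, 0x0123 };

/* Translate a pseudophysical address to a machine address. */
static uint64_t pseudophys_to_machine(uint64_t paddr)
{
    uint64_t pfn = paddr >> PAGE_SHIFT;
    assert(pfn < GUEST_PAGES);          /* outside this toy guest's memory */
    return (p2m[pfn] << PAGE_SHIFT) | (paddr & PAGE_MASK);
}

/* Build a page-table entry the way a PV guest has to: the entry holds the
 * machine address of the page, not the pseudophysical one. */
static uint64_t make_pte(uint64_t paddr, uint64_t flags)
{
    return (pseudophys_to_machine(paddr) & ~PAGE_MASK) | flags;
}
```

With this toy table, pseudophysical page 2 ends up in the entry as machine
frame 0x7777, which is the value the processor then uses directly.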

Now I am confused again :-(.
You mean shadow tables are not used at all for PV guests if I am not using live migration?
They are used by default only for HVM guests, with or without live migration, right?

One more thing: where can I find the code for this shadow page table handling in my source tree?
I couldn't find even the shadow page struct in my source, but it was available in the LXR repo on xensource.

Thank you very much
--pradeep

HTH,
Cheers,
Mark

> Thank you
> --pradeep
>
> In PV mode, pages that are currently part of a pagetable are only ever
> allowed to be mapped readonly in order to prevent tampering by the guest.
>
> Cheers,
> Mark
>
> > > > I hope I made myself clear.
> > > > Please enlighten me :-).
> > > >
> > > > When paging is enabled, we use a shadow page-table, which essentially
> > > > means that the GUEST sees one page-table and the processor another
> > > > (thanks to the fact that the hypervisor intercepts the CR3 read/write
> > > > operations, and when CR3 is read back by the guest, we don't send back
> > > > the value it's ACTUALLY POINTING TO IN THE PROCESSOR, but the value
> > > > that was set by the guest). So there are two page-tables.
> > > >
> > > > Got this well, thanks Mats :).
> > > >
> > > > To make the page-table updates by the guest visible to the hypervisor,
> > > > all of the guest-page-tables are made read-only (by scanning the new
> > > > CR3 value whenever one is set).
> > > >
> > > > I didn't get this well either :(
> > > > Sorry, but do you mean CR3 for the guest or for the
> > > > processor? I hope you mean the guest?
> > >
> > > Yes, scan the guest-CR3 to see where it placed the page-tables.
> > >
> > > > Whenever a page-fault happens, the hypervisor has "first look", and
> > > > determines if the update is for a page-table or not. If it is a
> > > > page-table update, the guest operation is emulated (in x86_emulate.c),
> > > > and the result is written to the shadow-page-table AND the
> > > >
> > > > Why do we need emulation? Is there some particular reason for emulating?
> > > > Do you mean to say that if I am running a 32-bit domU on top of a
> > > > 64-bit processor, the guest operation for updating the page
> > > > table is emulated by the hypervisor? Am I right?
> > >
> > > No, it's simply because we need to see the result of the instruction
> > > and write it to two places (with some modification in one of those
> > > places). So if the code is doing, for example: "*pte |= 1;" (set a
> > > page-table-entry to "present"), we need to mark both the
> > > guest-page-table-entry "present", and mark our shadow-entry "present"
> > > (and perhaps do some other work too, but that's the minimum work
> > > needed).
> > >
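As a rough sketch of the "write it to two places" idea, something like the
following C could stand in for what the emulation path has to do. This is
not the code in x86_emulate.c or the shadow code: gfn_to_mfn() with its
fixed offset and emulate_pte_write() are hypothetical names invented purely
for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT  12
#define FLAGS_MASK  UINT64_C(0xfff)
#define PTE_PRESENT UINT64_C(0x1)

/* Hypothetical guest-frame to machine-frame lookup. A fixed offset of
 * 0x10000 frames (256MB with 4KB pages) is assumed only for this example. */
static uint64_t gfn_to_mfn(uint64_t gfn)
{
    return gfn + 0x10000;
}

/* Apply one emulated guest page-table write to both tables. */
static void emulate_pte_write(uint64_t *guest_pte, uint64_t *shadow_pte,
                              uint64_t new_val)
{
    /* The guest table holds exactly what the guest wrote. */
    *guest_pte = new_val;

    if (new_val & PTE_PRESENT) {
        /* The shadow entry keeps the flags but points at the machine frame. */
        uint64_t mfn = gfn_to_mfn(new_val >> PAGE_SHIFT);
        *shadow_pte = (mfn << PAGE_SHIFT) | (new_val & FLAGS_MASK);
    } else {
        /* Not present for the guest, so not present for the processor. */
        *shadow_pte = 0;
    }
}
```

So a guest doing "*pte |= 1;" on an entry holding 0x100000 would leave
0x100001 in its own table, while the shadow entry holds the translated
address with the same "present" bit.
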
> > > This brings one more question to my mind. Why do we use pinning then?
> >
> > I believe there are two types of pinning! Page-pinning, which is blocking
> > a page from being accessed in an incorrect way [again, I'm not 100% sure
> > how this works, or exactly what it does - just that it's a term used in
> > the general way I described in the previous sentence].
> >
> > > As I see it, it is to prevent shadow page tables from being swapped out
> > > before the page tables they actually point to are swapped. Am I right?
> > >
> > > But according to the interface manual, we use pinning to bind a VCPU
> > > to a specific CPU in an SMP environment. These two
> > > look like pretty orthogonal statements to me, which means I may be
> > > wrong :(.
> > > Can somebody help me in this regard?
> >
> > CPU pinning is to tie a VCPU to a (set of) processor(s). For example,
> > you may want to pin Dom0 to run only on CPU0, and pin a DomU to run on
> > CPUs 1, 2 and 3. That way, Dom0 is ALWAYS able to run on its own CPU,
> > and it's never in contention about which CPU to use, and DomU can run on
> > three CPUs as much as it likes. You could have another DomU pinned to
> > CPU 3 if you wish. That means that CPUs 1 and 2 are exclusively for the
> > first DomU, whilst the second DomU shares CPU3 with the first DomU (so
> > they both get half the CPU performance of one CPU - on average over a
> > reasonable amount of time).
> >
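The Dom0/DomU pinning example can be pictured with one affinity bitmask per
domain. The C below is only a sketch of that picture and says nothing about
how Xen's scheduler stores affinity: pin_mask, may_run_on() and sharers()
are names invented here.

```c
#include <stdint.h>

/* One bit per physical CPU: bit n set means "allowed to run on CPU n".
 * pin_mask is a made-up type for this sketch. */
typedef uint32_t pin_mask;

#define CPU(n) ((pin_mask)1u << (n))

/* Is something with this affinity allowed on the given CPU? */
static int may_run_on(pin_mask affinity, unsigned cpu)
{
    return (affinity & CPU(cpu)) != 0;
}

/* Count how many of the given domains are allowed on one CPU. */
static unsigned sharers(const pin_mask *masks, unsigned n, unsigned cpu)
{
    unsigned count = 0;
    for (unsigned i = 0; i < n; i++)
        count += may_run_on(masks[i], cpu) ? 1u : 0u;
    return count;
}
```

With masks of CPU(0) for Dom0, CPU(1)|CPU(2)|CPU(3) for the first DomU and
CPU(3) for the second, sharers() reports one domain on CPUs 0, 1 and 2 and
two domains on CPU 3, matching the description.
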
> > --
> > Mats
> >
> > > Pointers to actual code will be of great help.
> > >
> > > Thanks a lot Mats.
> > > Thank you all.
> > >
> > > --pradeep
> > >
> > > > Does this mean that on an x86 platform this overkill or this
> > > > emulation is skipped altogether?
> > > > Please bear with me as I am an absolute Xen newbie out here :-).
> > >
> > > No, it's ALWAYS used for all page-table writes, as far as I understand.
> > >
> > > --
> > > Mats
> > >
> > > > guest-page-table, but in the shadow-page-table, the value is modified
> > > > to reflect the actual address in machine-space, rather than what the
> > > > guest thinks it should be.
> > > >
> > > > In future versions of AMD processors (and I believe Intel are working
> > > > on something very similar if not the same), there will be a mode where
> > > > the processor is able to work in "nested paging mode", which means that
> > > > there are two "parallel" page-tables. The first one is the
> > > > "guest-page-table", the second one is the "host-page-table". In this
> > > > case, every lookup in the guest-page-table will be done through the
> > > > host-page-table. So we have a "simple" way to just take the
> > > > guest-page-table and translate it to machine-physical-address - with
> > > > the good thing that the host-page-table needn't change, since the
> > > > pages that the host consists of are pretty much static for the
> > > > duration of the guest.
> > > >
> > > > Yes, I read about this in an article mentioning how Pacifica
> > > > is better than VT.
> > > >
> > > > Say for example, we have a guest that lives at 256-512MB. The
> > > > guest-page-table would contain, for example, a mapping for
> > > > 0x12200000 -> guest-physical 0x100000 (1MB). The host-page-table
> > > > translates this to 0x10100000 because the 1MB entry in guest-address
> > > > is 256+1MB in machine-address.
> > > >
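The arithmetic in that example can be written out as two lookups. The C
sketch below hard-codes the single guest mapping and treats the guest as
one contiguous 256MB chunk starting at 256MB, which are the assumptions of
the example and not how real hardware or Xen represents things;
guest_translate(), host_translate() and nested_translate() are
hypothetical names. It also skips the detail that the guest page-table
walk itself goes through the host-page-table.

```c
#include <stdint.h>

#define PAGE_OFFSET_MASK UINT64_C(0xfff)
#define GUEST_BASE       UINT64_C(0x10000000) /* guest starts at 256MB */
#define GUEST_SIZE       UINT64_C(0x10000000) /* and is 256MB long     */

/* Stage 1, the guest-page-table: one mapping, 0x12200000 -> 0x100000. */
static int guest_translate(uint64_t va, uint64_t *gpa)
{
    if ((va & ~PAGE_OFFSET_MASK) != UINT64_C(0x12200000))
        return -1;                        /* nothing mapped here */
    *gpa = UINT64_C(0x100000) | (va & PAGE_OFFSET_MASK);
    return 0;
}

/* Stage 2, the host-page-table: guest-physical -> machine address. */
static int host_translate(uint64_t gpa, uint64_t *ma)
{
    if (gpa >= GUEST_SIZE)
        return -1;                        /* outside the guest's memory */
    *ma = GUEST_BASE + gpa;
    return 0;
}

/* Guest-virtual -> machine: run stage 1, then stage 2 on its result. */
static int nested_translate(uint64_t va, uint64_t *ma)
{
    uint64_t gpa;
    if (guest_translate(va, &gpa) != 0)
        return -1;
    return host_translate(gpa, ma);
}
```

Feeding 0x12200000 through both stages gives 0x100000 after the first and
0x10100000 after the second, and the second stage never changes when the
guest edits its own table.
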
> > > > Exactly, got this well on spot :).
> > > >
> > > > [In reality, it's very likely that the guest never gets all the space
> > > > in one big chunk, but rather a few pages here and a few pages there. If
> > > > there are big chunks, we could use large pages to map those!].
> > > >
> > > > Thanks a ton Mats and all.
> > > >
> > > > --pradeep
> > > >
> > > > The support for nested paging (called HAP, Hardware Assisted Paging)
> > > > is in the Unstable version of Xen since a few days back.
> > > >
> > > > --
> > > > Mats
> > > >
> > > > > And this whole 2 level paging constitutes Xen's shadow page
> > > > > tables. Right?
> > > > >
> > > > > Is my understanding of Xen's paging mechanism correct? Or am I
> > > > > missing something?
> > > > >
> > > > > Thank you
> > > > >
> > > > > -pradeep
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
>
> --
> Dave: Just a question. What use is a unicycle with no seat? And no pedals!
> Mark: To answer a question with a question: What use is a skateboard?
> Dave: Skateboards have wheels.
> Mark: My wheel has a wheel!



