
RE: [Xen-devel] paging mechanism clarification




-----Original Message-----
From: M.A. Williamson on behalf of Mark Williamson
Sent: Tue 13-Mar-07 9:48 PM
To: xen-devel@xxxxxxxxxxxxxxxxxxx
Cc: Petersson, Mats; Pradeep Singh, TLS-Chennai
Subject: Re: [Xen-devel] paging mechanism clarification

> Not precisely. I'm talking only about HVM mode, which is "full
> virtualization". PV mode uses a different paging interface which, for
> the most part, consists of changing the kernel code that updates the
> page-tables so that it is aware of the THREE types of address
> (guest-virtual, guest-physical and machine-physical). This means that
> there's no real need for the "read-only page-tables" and "shadow-mode" -
> the page-table just contains the right value for the machine-physical
> address. [That's not to say that read-only page-tables can't be used in
> a PV system too - I'm not 100% sure how the page-table management works
> in PV mode].

The most important difference is that in PV mode the pagetables the guest
generates are directly used by the host processor.  For this reason, there
are various changes required to the way the guest updates its pagetables. 
The most important of these is that the guest must translate its
pseudophysical addresses to host machine addresses before filling out page
table entries.
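
To make that translation step concrete, here is a minimal sketch in Python. It is a toy model, not Xen or guest-kernel code: the names p2m, pfn_to_mfn and make_pv_pte, the flag values and the table contents are all assumptions chosen for illustration.

```python
PAGE_SHIFT = 12        # 4 KiB pages
PTE_PRESENT = 0x1
PTE_WRITABLE = 0x2

# Hypothetical pseudophysical-to-machine table for one guest:
# key = pseudophysical frame number, value = machine frame number.
p2m = {0x000: 0x10000, 0x001: 0x2A3F1, 0x100: 0x10100}


def pfn_to_mfn(pfn):
    """Look up the machine frame that backs a pseudophysical frame."""
    return p2m[pfn]


def make_pv_pte(pseudophys_addr, flags):
    """Build a page table entry holding a machine address, as a PV guest must."""
    pfn = pseudophys_addr >> PAGE_SHIFT
    mfn = pfn_to_mfn(pfn)
    return (mfn << PAGE_SHIFT) | flags


# Pseudophysical 1MB is backed by machine frame 0x10100 in this toy table.
pte = make_pv_pte(0x100000, PTE_PRESENT | PTE_WRITABLE)
print(hex(pte))  # 0x10100003
```

The point of the sketch is only that the lookup happens before the entry is written, so the value the processor later walks already holds a machine address.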

Thanks for the clarification, Mark.
But isn't the Xen hypervisor supposed to do this translation of pseudophysical addresses to host machine addresses, instead of the guest itself?

Thank you
--pradeep

In PV mode, pages that are currently part of a pagetable are only ever allowed
to be mapped read-only, in order to prevent tampering by the guest.
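
As a rough illustration of that rule, the check below rejects any entry that would give a writable mapping of a frame currently in use as a pagetable. It is a sketch under stated assumptions: validate_pte, PagetableWriteError and the pagetable_frames set are hypothetical names, and the entry layout is simplified to a frame number plus two low flag bits.

```python
PAGE_SHIFT = 12
PTE_PRESENT = 0x1
PTE_WRITABLE = 0x2


class PagetableWriteError(Exception):
    """Raised when a guest asks for a writable mapping of a pagetable frame."""


def validate_pte(new_pte, pagetable_frames):
    """Return new_pte if it is acceptable, otherwise raise."""
    if not (new_pte & PTE_PRESENT):
        return new_pte  # a non-present entry maps nothing
    mfn = new_pte >> PAGE_SHIFT
    if (mfn in pagetable_frames) and (new_pte & PTE_WRITABLE):
        raise PagetableWriteError(f"frame {mfn:#x} is part of a pagetable")
    return new_pte


# Assumption: machine frame 0x10100 currently holds a pagetable.
pagetable_frames = {0x10100}

# A read-only mapping of that frame passes the check.
validate_pte((0x10100 << PAGE_SHIFT) | PTE_PRESENT, pagetable_frames)
```

Passing the same frame with PTE_WRITABLE set raises PagetableWriteError, which stands in for the hypervisor refusing the update.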

Cheers,
Mark

> > > I hope I made myself clear.
> > > Please enlighten me :-).
> > >
> > > When paging is enabled, we use a shadow page-table, which
> > > essentially means that the GUEST sees one page-table, and the
> > > processor another (thanks to the fact that the hypervisor intercepts
> > > the CR3 read/write operations, and when CR3 is read back by the
> > > guest, we don't send back the value it's ACTUALLY POINTING TO IN THE
> > > PROCESSOR, but the value that was set by the guest). So there are
> > > two page-tables.
> > >
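
The CR3 interception described above can be modelled in a few lines of Python. This is a toy sketch, not hypervisor code: the VirtualCpu class, its attribute names and the shadow_root_for callback are all hypothetical.

```python
class VirtualCpu:
    """Toy model of CR3 virtualisation under shadow paging."""

    def __init__(self, shadow_root_for):
        # Hypothetical callback: maps the guest's page-table root to the
        # root of the shadow page-table kept by the hypervisor.
        self._shadow_root_for = shadow_root_for
        self.guest_cr3 = 0      # the value the guest believes is in CR3
        self.hardware_cr3 = 0   # the value the processor really uses

    def write_cr3(self, value):
        """Intercepted write: remember the guest value, load the shadow root."""
        self.guest_cr3 = value
        self.hardware_cr3 = self._shadow_root_for(value)

    def read_cr3(self):
        """Intercepted read: hand back what the guest wrote, not the real value."""
        return self.guest_cr3


# Assumption: shadow roots live at a fixed offset, purely for illustration.
vcpu = VirtualCpu(lambda guest_root: guest_root + 0x10000000)
vcpu.write_cr3(0x1000)
print(hex(vcpu.read_cr3()), hex(vcpu.hardware_cr3))  # 0x1000 0x10001000
```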
> > > Got this well, thanks Mats :).
> > >
> > > To make the page-table updates by the guest visible to the
> > > hypervisor, all of the guest-page-tables are made read-only (by
> > > scanning the new CR3 value whenever one is set).
> > >
> > > I didn't quite get this either :(
> > > Sorry, but do you mean CR3 for the guest or for the
> > > processor? I hope you mean the guest?
> >
> > Yes, scan the guest-CR3 to see where it placed the page-tables.
> >
> > > Whenever a page-fault happens, the hypervisor has "first look", and
> > > determines if the update is for a page-table or not. If it is a
> > > page-table update, the guest operation is emulated (in
> > > x86_emulate.c), and the result is written to the shadow-page-table
> > > AND the
> > >
> > > Why do we need emulation? Is there some particular reason for it?
> > > Do you mean that if I am running a 32-bit domU on top of a
> > > 64-bit processor, the guest operation for updating the page
> > > table is emulated by the hypervisor? Am I right?
> >
> > No, it's simply because we need to see the result of the instruction
> > and write it to two places (with some modification in one of those
> > places). So if the code is doing, for example: "*pte |= 1;" (set a
> > page-table-entry to "present"), we need to mark both the
> > guest-page-table-entry and our shadow-entry as "present" (and perhaps
> > do some other work too, but that's the minimum work needed).
> >
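
The "write it to two places" idea can be sketched as follows. This is a simplified Python model and not what x86_emulate.c does internally: emulate_pte_write, the gfn_to_mfn dictionary and the list-based page-tables are assumptions for illustration, and only the "present" flag is handled.

```python
PAGE_SHIFT = 12
FLAG_MASK = (1 << PAGE_SHIFT) - 1
PTE_PRESENT = 0x1


def emulate_pte_write(guest_pt, shadow_pt, index, new_guest_pte, gfn_to_mfn):
    """Apply one guest page-table write to both the guest and shadow tables."""
    # 1. The guest's own table gets exactly the value the guest wrote.
    guest_pt[index] = new_guest_pte

    # 2. The shadow table gets the same flags, but a machine frame number.
    if not (new_guest_pte & PTE_PRESENT):
        shadow_pt[index] = 0
        return
    gfn = new_guest_pte >> PAGE_SHIFT
    flags = new_guest_pte & FLAG_MASK
    shadow_pt[index] = (gfn_to_mfn[gfn] << PAGE_SHIFT) | flags


# Toy setup: guest-physical frame 0x100 is backed by machine frame 0x10100.
gfn_to_mfn = {0x100: 0x10100}
guest_pt = [0x100000]   # entry points at guest-physical 1MB, not yet present
shadow_pt = [0]

# The guest executes the equivalent of "*pte |= 1;".
emulate_pte_write(guest_pt, shadow_pt, 0, guest_pt[0] | PTE_PRESENT, gfn_to_mfn)
print(hex(guest_pt[0]), hex(shadow_pt[0]))  # 0x100001 0x10100001
```

Both entries end up "present", but only the shadow entry carries the machine address.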
> > This brings one more question to mind. Why do we use pinning, then?
>
> I believe there are two types of pinning! One is page-pinning, which
> blocks a page from being accessed in an incorrect way [again, I'm not
> 100% sure how this works, or exactly what it does - just that it's a
> term used in the general way I've just described].
>
> > As I see it, it is to avoid shadow page tables being swapped out
> > before the page tables they actually point to are swapped. Am I right?
> >
> > But according to the interface manual, we use pinning to bind a
> > VCPU to a specific CPU in an SMP environment. These two look like
> > pretty orthogonal statements to me, which means I may be
> > wrong :(.
> > Can somebody help me in this regard?
>
> CPU pinning is to tie a VCPU to a (set of) processor(s). For example,
> you may want to pin Dom0 to run only on CPU0, and pin a DomU to run on
> CPUs 1, 2 and 3. That way, Dom0 is ALWAYS able to run on its own CPU,
> and it's never in contention about which CPU to use, and DomU can run on
> three CPUs as much as it likes. You could have another DomU pinned to
> CPU 3 if you wish. That means that CPUs 1 and 2 are exclusively for the
> first DomU, whilst the second DomU shares CPU3 with the first DomU (so
> they both get half the CPU performance of one CPU - on average over a
> reasonable amount of time).
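
The pinning layout in that example can be written down as a small Python model that reports which domains compete for each CPU. It does not call any Xen tool or API; the affinity dictionary and the domains_per_cpu helper are hypothetical names used only to restate the example.

```python
# Which physical CPUs each domain's VCPUs are allowed to run on.
affinity = {
    "Dom0": {0},
    "DomU-1": {1, 2, 3},
    "DomU-2": {3},
}


def domains_per_cpu(affinity):
    """Invert the pinning map: for each CPU, list the domains that may use it."""
    sharing = {}
    for domain, cpus in affinity.items():
        for cpu in sorted(cpus):
            sharing.setdefault(cpu, []).append(domain)
    return sharing


for cpu, domains in sorted(domains_per_cpu(affinity).items()):
    kind = "exclusive" if len(domains) == 1 else "shared"
    print(f"CPU{cpu}: {', '.join(domains)} ({kind})")
```

With this layout only CPU3 comes out as shared, which matches the description above.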
>
> --
> Mats
>
> > Pointers to actual code will be of great help.
> >
> > Thanks a lot Mats.
> > Thank you all.
> >
> > --pradeep
> >
> > > Does this mean that on an x86 platform this overkill, or rather
> > > this emulation, is skipped altogether?
> > > Please bear with me as I am an absolute Xen newbie out here :-).
> >
> > No, it's ALWAYS used for all page-table writes, as far as I
> > understand.
> >
> > --
> > Mats
> >
> > > guest-page-table, but in the shadow-page-table, the value is
> > > modified to reflect the actual address in machine-space, rather than
> > > what the guest thinks it should be.
> > >
> > > In future versions of AMD processors (and I believe Intel are
> > > working on something very similar if not the same), there will be a
> > > mode where the processor is able to work in "nested paging mode",
> > > which means that there are two "parallel" page-tables. The first one
> > > is the "guest-page-table", the second one is the "host-page-table".
> > > In this case, every lookup in the guest-page-table will be done
> > > through the host-page-table. So we have a "simple" way to just take
> > > the guest-page-table and translate it to machine-physical-address -
> > > with the good thing that the host-page-table needn't change, since
> > > the pages that the host consists of are pretty much static for the
> > > duration of the guest.
> > >
> > > Yes, I read about this in an article mentioning how Pacifica
> > > is better than VT.
> > >
> > > Say for example, we have a guest that lives at 256-512MB. The
> > > guest-page-table would contain, for example, a mapping for
> > > 0x12200000 -> guest-physical 0x100000 (1MB). The host-page-table
> > > translates this to 0x10100000 because the 1MB entry in guest-address
> > > is 256+1MB in machine-address.
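
The arithmetic in that example can be checked with a short Python sketch of the two translation stages. It is a simplification under stated assumptions: the guest is taken to be one contiguous chunk starting at 256MB, the page-tables are plain dictionaries, and the names guest_pt, host_translate and nested_translate are hypothetical. The hardware also sends the guest's own page-table walk through the host table, which is not modelled here.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Stage 1: guest-page-table, guest-virtual page -> guest-physical frame.
guest_pt = {0x12200: 0x100}   # 0x12200000 -> guest-physical 0x100000 (1MB)

# Stage 2: host-page-table. The guest lives at 256-512MB of machine memory,
# so guest-physical frame N is machine frame (256MB / 4KB) + N.
GUEST_BASE_MFN = (256 << 20) >> PAGE_SHIFT


def host_translate(gfn):
    """Guest-physical frame -> machine frame."""
    return GUEST_BASE_MFN + gfn


def nested_translate(guest_virtual):
    """Guest-virtual address -> machine-physical address, via both tables."""
    gfn = guest_pt[guest_virtual >> PAGE_SHIFT]
    mfn = host_translate(gfn)
    return (mfn << PAGE_SHIFT) | (guest_virtual & PAGE_MASK)


print(hex(nested_translate(0x12200000)))  # 0x10100000
```

Note that only guest_pt changes while the guest runs; the second stage stays fixed, which is the benefit the text describes.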
> > >
> > > Exactly, got this well on spot :).
> > >
> > > [In reality, it's very likely that the guest never gets all the
> > > space in one big chunk, but rather a few pages here and a few pages
> > > there. If there are big chunks, we could use large pages to map
> > > those!].
> > >
> > > Thanks a ton Mats and all.
> > >
> > > --pradeep
> > >
> > > The support for nested paging (called HAP, Hardware Assisted
> > > Paging) has been in the Unstable version of Xen since a few days
> > > back.
> > >
> > > --
> > > Mats
> > >
> > > > And this whole 2-level paging constitutes Xen's shadow page
> > > > tables. Right?
> > > >
> > > > Is my understanding of Xen's paging mechanism correct? Or am I
> > > > missing something?
> > > >
> > > > Thank you
> > > >
> > > > -pradeep
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

--
Dave: Just a question. What use is a unicycle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!


 

