
RE: [Xen-devel] Scheduler portability problem


  • To: "Magenheimer, Dan (HP Labs Fort Collins)" <dan.magenheimer@xxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Wed, 9 Mar 2005 16:28:51 +0800
  • Delivery-date: Wed, 09 Mar 2005 08:30:09 +0000
  • List-id: List for Xen developers <xen-devel.lists.sourceforge.net>
  • Thread-index: AcUkMLMmuQFw0055QJSt5e32tY6tcgATgkBg
  • Thread-topic: [Xen-devel] Scheduler portability problem

Hi, Dan,
        Your finding is a real problem for porting Xen to architectures
like IA64, which have a large register file. The current Xen/x86 uses a
so-called continuation mechanism that provides only one hypervisor (HV)
stack per logical processor (LP), shared by all domains running on that
LP. A simple context-switch flow is:

1. Scheduler picks a new domain
2. In switch_to:
        - Save the domain context (xen_regs) to prev's
          thread_struct.execution_context_t
        - Load next's domain context to the bottom of the stack (xen_regs)
3. schedule_tail then simply does assembly tricks, as you said, to
   reset the stack pointer to the xen_regs area and resume the new domain
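
A minimal sketch of the flow above, written as a plain C simulation rather
than real Xen code. All names here (toy_regs, toy_domain, toy_lp, this_lp,
toy_switch_to, toy_schedule_tail) are hypothetical stand-ins: toy_regs is a
heavily reduced xen_regs, and the final "resume" is modelled by returning a
pointer to the frame instead of resetting the stack pointer in assembly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily reduced guest register frame (stand-in for xen_regs). */
struct toy_regs {
    unsigned long ip;
    unsigned long sp;
    unsigned long gpr[4];
};

/* Hypothetical domain descriptor; 'saved' stands in for the execution
 * context kept in the domain's thread_struct. */
struct toy_domain {
    int id;
    struct toy_regs saved;
};

#define TOY_SCRATCH_WORDS 64

/* One hypervisor stack per logical processor, shared by all domains.
 * The guest frame sits at a fixed slot; the rest is scratch call stack. */
struct toy_lp {
    struct toy_regs frame;
    unsigned long scratch[TOY_SCRATCH_WORDS];
    struct toy_domain *current;
};

static struct toy_lp this_lp;

/* Step 2: two arguments are enough because the stack itself never changes;
 * only the frame contents are copied out and in. */
static void toy_switch_to(struct toy_domain *prev, struct toy_domain *next)
{
    assert(this_lp.current == prev);
    prev->saved = this_lp.frame;   /* save outgoing domain context */
    this_lp.frame = next->saved;   /* load incoming domain context */
    this_lp.current = next;
}

/* Step 3: the scratch part of the stack is known to be dead, so it is simply
 * discarded. Real code would reset the stack pointer to the frame and never
 * return; this simulation returns the frame that would be resumed. */
static const struct toy_regs *toy_schedule_tail(struct toy_domain *next)
{
    assert(this_lp.current == next);
    memset(this_lp.scratch, 0, sizeof this_lp.scratch);
    return &this_lp.frame;
}
```

The point of the sketch is that nothing on the scratch stack has to be
unwound, which is exactly the property that does not carry over to an
architecture that has already spilled state to its stacks by this point.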

        This flow is elegant given the small context of x86: it saves
the time of normal function exits, since the stack content is known to
be useless under the continuation mechanism. It also means two
parameters are enough for switch_to, since no stack switch happens at
all.

        IA64, however, has a large register file (n Kbytes) and, in
particular, a hardware engine to manage the stacked registers. Both
performance and implementation difficulty suffer dramatically if we
keep the same mechanism there. So yes, we need to find a generic way to
let both mechanisms (per-LP stack and per-domain stack) co-exist. A
quick read of the code suggests that the first and major blocker is the
BUG at the end of __enter_scheduler. If we move that check into the
arch-specific schedule_tail, letting each arch decide whether it wants a
normal return, per-domain stacks may start to work if we are lucky. As
long as the execution path follows the normal function return path back
to the assembly stub, the ia64_switch_to you ported from IPF Linux can
work smoothly. However, you are right: we need comments from more
developers to see what a complete solution should be. :)
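
To make the suggestion concrete, here is a rough sketch of moving the
"must not return" decision into an arch hook. It is a user-space C
simulation, not a patch: toy_arch_ops, x86_like_tail, ia64_like_tail,
toy_enter_scheduler and toy_run are all made-up names, and setjmp/longjmp
only stands in for the x86 assembly trick of discarding the call stack.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical domain descriptor, just enough to pass around. */
struct toy_domain {
    int id;
};

/* Hypothetical per-arch hook table. */
struct toy_arch_ops {
    void (*schedule_tail)(struct toy_domain *next);
};

/* Models the assembly stub that resumes the guest. */
static jmp_buf resume_point;

/* Per-LP stack style: throw away the call stack and jump straight to the
 * resume point. This hook never returns, so a "should not get here" check
 * belongs inside the arch code, after the jump. */
static void x86_like_tail(struct toy_domain *next)
{
    (void)next;
    longjmp(resume_point, 1);
}

/* Per-domain stack style: do nothing special and return, so every caller
 * unwinds through its normal function exit. */
static void ia64_like_tail(struct toy_domain *next)
{
    (void)next;
}

static const struct toy_arch_ops x86_like_ops = { x86_like_tail };
static const struct toy_arch_ops ia64_like_ops = { ia64_like_tail };

/* Generic scheduler entry: picking next and switch_to are omitted. There is
 * no unconditional BUG after the hook; the arch decides whether to return. */
static void toy_enter_scheduler(const struct toy_arch_ops *ops,
                                struct toy_domain *next)
{
    ops->schedule_tail(next);
}

/* Returns 1 if the guest was resumed via the continuation jump,
 * 0 if control came back through normal function returns. */
static int toy_run(const struct toy_arch_ops *ops, struct toy_domain *next)
{
    if (setjmp(resume_point) != 0)
        return 1;
    toy_enter_scheduler(ops, next);
    return 0;
}
```

With this shape the generic code stays the same for both styles; whether
that is enough for a real ia64 port is exactly the open question.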

Thanks,
Kevin
>-----Original Message-----
>From: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx
>[mailto:xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of
>Magenheimer, Dan (HP Labs Fort Collins)
>Sent: Tuesday, March 08, 2005 2:47 PM
>To: xen-devel@xxxxxxxxxxxxxxxxxxxxx
>Subject: [Xen-devel] Scheduler portability problem
>
>I am working on Xen/ia64 changes (within Xen itself) to support
>multiple domains and ran into the following problem:
>
>It appears that __enter_scheduler was derived from an old version
>of the Linux scheduler ("schedule()"), with some changes made for
>simplification.  Many of the function names are the same but some
>of the syntax and semantics have changed.  In particular, four
>of note:
>
>1) switch_to now takes two arguments instead of three, and
>2) after switch_to is called, "other things" are done which
>   utilize the "next" pointer
>3) schedule_tail is passed the "next" task, rather than "prev"
>4) schedule_tail is assumed to never return
>
>I'm all for simplification if the Linux code is too complicated,
>but in this case, some of the complexity is present to support
>other architectures.  I can speak for ia64 but I suspect that
>similar problems will occur with other non-x86 ports.
>
>On Linux, switch_to is actually a macro and on ia64, another routine
>is called which returns a value that is "passed back" in the
>third switch_to argument.  Why?  Because switch_to actually does a task
>switch and the world may be very different when it returns.
>In particular, the values for prev and next are *different* when
>it returns.  Why?  Because switch_to (at least on ia64) is the
>key point where all of the current task state is put in memory,
>stacks are changed, and the new task state is taken back out
>of memory.  Actually, that's not quite accurate... at the point of
>the call to switch_to, a fair amount of state has *already* been
>put in memory in both the memory stack and the register stack.
>The only way to restore this state (short of some very complex
>stack analysis) is to exit each routine in the call stack the same
>way as it was called.
>
>So, on Linux, after the call to switch_to, "next" is no longer
>valid and is not used.  "Prev" is used only because of the third
>argument macro trick, and "current" has already been changed to
>point to the new task.
>
>On Xen/x86, it appears schedule_tail never returns because some cool
>assembly tricks are used to jump directly to the right place,
>basically as if throwing an exception (I'm guessing because there is no
>useful state on the call stack on x86).  As previously noted, this is
>problematic on ia64.
>
>Bottom line: The current code in __enter_scheduler() does not easily
>accommodate other architectures.  I'll be taking a look at what it
>will take to "fix" it, but wanted to open discussion first.  I know
>there are some that will say "just change the ia64 code"... because
>of architectural constraints, this is far FAR more easily said than
>done.  And there are some that will say that mimicking Linux is
>a mistake because XINL (Xen is not Linux).  However, I believe this
>is a case where leveraging the many many years of experience on many
>many architectures (with said experience only documented in the code
>itself) of Linux will benefit Xen portability in the long run (and,
>in my case, in the short run).
>
>Comments?
>
>Thanks,
>Dan
>
>
>-------------------------------------------------------
>SF email is sponsored by - The IT Product Guide
>Read honest & candid reviews on hundreds of IT Products from real users.
>Discover which products truly live up to the hype. Start reading now.
>http://ads.osdn.com/?ad_ide95&alloc_id396&op=ick
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@xxxxxxxxxxxxxxxxxxxxx
>https://lists.sourceforge.net/lists/listinfo/xen-devel

