[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen



On Thursday 11 September 2008 16:15:14 Keir Fraser wrote:
> I applied the patch with the following changes:
>  * I rewrote your changes to fixup_irqs(). We should force lazy EOIs
> *after* we have serviced any straggling interrupts. Also we should actually
> clear the EOI stack so it is empty next time the CPU comes online.
>  * I simplified your changes to schedule.c in light of the fact we run in
> stop_machine context. Hence we can be quite relaxed about locking, for
> example.
>  * I removed your change to __csched_vcpu_is_migrateable() and instead put
> a similar check in csched_load_balance(). I think this is clearer and also
> cheaper.
>
> I note that the VCPU currently running on the offlined CPU continues to run
> there even after __cpu_disable(), and until that CPU does a final run
> through the scheduler soon after. I hope it does not matter there is one
> vcpu with v->processor == offlined_cpu for a short while 

This is not acceptable regarding to machine check. When Dom0 offlines a
defect cpu, nothing may continue on it or silent data corruption occurs.

Christoph


> On 11/9/08 12:33, "Shan, Haitao" <haitao.shan@xxxxxxxxx> wrote:
> > Thanks!
> > Concerning cpu online/offline development, I have a small question here.
> > Since cpu_online_map is very important, code in different subsystems may
> > use it extensively. If such code is not designed with cpu online/offline
> > in mind, it may introduce race conditions, just like the one fixed in cpu
> > calibration rendezvous.
> > Currently, we solve it in a find-and-fix manner. Do you have any idea
> > that can solve the problem in a cleaner way?
> > Thanks in advance.
> >
> > Shan Haitao
> >
> > -----Original Message-----
> > From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > Sent: 2008å9æ11æ 19:13
> > To: Shan, Haitao; Haitao Shan
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in
> > Xen
> >
> > It looks much better. I'll read through, maybe tweak, and most likely
> > then check it in.
> >
> >  Thanks,
> >  Keir
> >
> > On 11/9/08 09:02, "Shan, Haitao" <haitao.shan@xxxxxxxxx> wrote:
> >> Hi, Keir,
> >>
> >> Attached is the updated patch using the methods as you described in
> >> another mail.
> >> What do you think of the one?
> >>
> >> Signed-off-by: Shan Haitao <haitao.shan@xxxxxxxxx>
> >>
> >> Best Regards
> >> Haitao Shan
> >>
> >> Haitao Shan wrote:
> >>> Agree. Placing migration in stop_machine context will definitely make
> >>> our jobs easier. I will start making a new patch tomorrow. :)
> >>> I place the migraton code outside the stop_machine_run context, partly
> >>> because I am not quite sure how long it will take to migrate all the
> >>> vcpus away. If it takes too much time, all useful works are blocked
> >>> since all cpus are in the stop_machine context. Of course, I borrowed
> >>> the ideas from kernel, which also let me made the desicion.
> >>>
> >>> 2008/9/10 Keir Fraser <keir.fraser@xxxxxxxxxxxxx>:
> >>>> I feel this is more complicated than it needs to be.
> >>>>
> >>>> How about clearing VCPUs from the offlined CPU's runqueue from the
> >>>> very end of __cpu_disable()? At that point all other CPUs are safely
> >>>> in softirq context with IRQs disabled, and we are running on the
> >>>> correct CPU (being offlined). We could have a hook into the
> >>>> scheduler subsystem at that point to break affinities, assign to
> >>>> different runqueues, etc. We would just need to be careful not to
> >>>> try an IPI. :-) This approach would not need a cpu_schedule_map
> >>>> (which is really increasing code fragility imo, by creating possible
> >>>> extra confusion about which cpumask is the wright one to use in a
> >>>> given situation).
> >>>>
> >>>> My feeling, unless I've missed something, is that this would make
> >>>> the patch quite a bit smaller and with a smaller spread of code
> >>>> changes.
> >>>>
> >>>>  -- Keir
> >>>>
> >>>> On 9/9/08 09:59, "Shan, Haitao" <haitao.shan@xxxxxxxxx> wrote:
> >>>>> This patch implements cpu offline feature.
> >>>>>
> >>>>> Best Regards
> >>>>> Haitao Shan


-- 
AMD Saxony, Dresden, Germany
Operating System Research Center

Legal Information:
AMD Saxony Limited Liability Company & Co. KG
Sitz (GeschÃftsanschrift):
   Wilschdorfer Landstr. 101, 01109 Dresden, Deutschland
Registergericht Dresden: HRA 4896
vertretungsberechtigter KomplementÃr:
   AMD Saxony LLC (Sitz Wilmington, Delaware, USA)
GeschÃftsfÃhrer der AMD Saxony LLC:
   Dr. Hans-R. Deppe, Thomas McCoy


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.