[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC 0/N] xen: arm: rework early bring up



On 09/18/2013 10:28 PM, Ian Campbell wrote:
> On Wed, 2013-09-18 at 22:14 +0100, Julien Grall wrote:
>> On 09/18/2013 09:22 PM, Ian Campbell wrote:
>>> On Wed, 2013-09-18 at 21:00 +0100, Julien Grall wrote:
>>>> On 09/18/2013 08:24 PM, Ian Campbell wrote:
>>>>> On Wed, 2013-09-18 at 20:22 +0100, Ian Campbell wrote:
>>>>>> On Wed, 2013-09-18 at 18:33 +0100, Julien Grall wrote:
>>>>>>> On 09/17/2013 02:37 AM, Ian Campbell wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> The following reworks early bring up on ARM to use a separate set of
>>>>>>>> boot pagetables. This simplifies things by avoiding the need to bring 
>>>>>>>> up
>>>>>>>> all CPUs in lock step, which in turn allows us to do secondary CPU
>>>>>>>> bringup in C code.
>>>>>>>>
>>>>>>>> Unfortunately the main bulk of this change is a single large patch 
>>>>>>>> which
>>>>>>>> is hard to decompose any further since it is basically pulling on the
>>>>>>>> thread and then knitting a new jumper from it.
>>>>>>>>
>>>>>>>> With these changes Xen now absolutely requires that the bootloader 
>>>>>>>> calls
>>>>>>>> the hypervisor in HYP mode, the previous workarounds have been removed.
>>>>>>>> For use on models a bootwrapper is now required. See
>>>>>>>>           git://xenbits.xen.org/people/ianc/boot-wrapper.git xen-arm32
>>>>>>>>           git://xenbits.xen.org/people/ianc/boot-wrapper-aarch64.git 
>>>>>>>> xen-arm64
>>>>>>>>
>>>>>>>> I have implemented support for CPU bringup on the fastmodel vexpress
>>>>>>>> platforms (v7and v8) here, I suppose it should work OK on a real
>>>>>>>> vexpress too but I've not tried it.
>>>>>>>
>>>>>>> I have just tried the patch series on the versatile express and it
>>>>>>> doesn't work. I'm using the u-boot from Andre and all the cpus already
>>>>>>> boot in hyp mode (checked without this patch series).
>>>>>>
>>>>>> :-(
>>>>>>
>>>>>> The code should essentially the same as the old kick cpus, but
>>>>>> reimplemented in C. The only difference I can spot is that the C version
>>>>>> is kicking a single CPU at a time while the ASM version is kicking
>>>>>> everyone at once. Does using send_SGI_allbutself instead work as a hack?
>>>>>>
>>>>>> Perhaps our idea of the CPUID is wrong or something? Your patches to
>>>>>> fixup the logical CPU id stuff might help?
>>>>
>>>> I have found the issue, in send_SGI_mask we remove the offline CPUs with
>>>> the following line:
>>>>     mask &= cpumask_bits(&cpu_online_map);
>>>>
>>>> As the secondary cpu (assuming we have a board with only 2 cores) is not
>>>> yet online, we can't send SGI with our code.
>>>> I'm very surprised that the code works on the fast models.
>>>
>>> So am I, now.
>>>
>>> OK. So online mask == 0x1 and we are trying to wake mask 0x2, so the
>>> actual target list we end up using is 0...
>>>
>>> The gic manual says "When TargetListFilter is 0b00, if the CPUTargetList
>>> field is 0x00 the Distributor does not forward the interrupt to any CPU
>>> interface." (footnote to table 4-21), so it shouldn't be waking
>>> anything... It may be a model bug
>>>
>>>> To solve the issue, I think we need to create another helper which will
>>>> send an SGI wihout removing offline CPUs.
>>>
>>> Looking briefly at the callchains leading to send_SGI_mask I don't see
>>> any which deliberately pass an offline CPU (which would be pretty
>>> non-sensical apart from this one case). So we could perhaps just remove
>>> this check and make it the caller's problem.
>>> Or maybe the right check is for cpu_possible mask or something similar
>>> instead of online?
>>
>> You are right, cpu_possible mask is better.
>> I can modify this patch http://patches.linaro.org/20437/ to add the 
>> check in gic_cpu_mask.
> 
> Sounds reasonable.

Hmm ... in fact this solution is not enough :/.
With my patch series "handle non-contiguous CPU ID" (mainly the patch #3
http://patches.linaro.org/20437/), the gic cpu ID of the target cpu is
not yet initialize.

Linux solves this issue by setting each gic cpu mask (our gic_cpu_id) to
0xff and refine each time a new cpu boot. If we choose this solution
that would mean that all secondaries cpus will boot at the same time.
As I understand with your new boot code (patch #5), we are only able to
handle one cpu simultaneously.

Perhaps, we need to reintroduce an holding pen which contains the MPIDR.

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.