[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Minios-devel] [UNIKRAFT early RFC PATCH 08/11] plat/kvm/arm: Implement smp boot on arm64 kvm plat



Hi,

On 21/06/2019 07:57, Jia He wrote:
Signed-off-by: Jia He <justin.he@xxxxxxx>
---
  lib/ukboot/Makefile.uk |   1 +
  lib/ukboot/boot.c      |  15 ++++
  plat/kvm/arm/setup.c   | 188 +++++++++++++++++++++++++++++++++++++++++
  3 files changed, 204 insertions(+)

diff --git a/lib/ukboot/Makefile.uk b/lib/ukboot/Makefile.uk
index 55f205d..c8e0b44 100644
--- a/lib/ukboot/Makefile.uk
+++ b/lib/ukboot/Makefile.uk
@@ -1,6 +1,7 @@
  $(eval $(call addlib_s,libukboot,$(CONFIG_LIBUKBOOT)))
CINCLUDES-$(CONFIG_LIBUKBOOT) += -I$(LIBUKBOOT_BASE)/include
+CINCLUDES-$(CONFIG_LIBUKBOOT)          += -I$(UK_PLAT_COMMON_BASE)/include
  CXXINCLUDES-$(CONFIG_LIBUKBOOT)       += -I$(LIBUKBOOT_BASE)/include
LIBUKBOOT_SRCS-y += $(LIBUKBOOT_BASE)/boot.c
diff --git a/lib/ukboot/boot.c b/lib/ukboot/boot.c
index 4846782..d4d49c1 100644
--- a/lib/ukboot/boot.c
+++ b/lib/ukboot/boot.c
@@ -41,6 +41,10 @@
  #include <stdio.h>
  #include <errno.h>
+#if CONFIG_SMP
+#include <smp.h>
+#endif
+
  #if CONFIG_LIBUKALLOC && CONFIG_LIBUKALLOCBBUDDY && CONFIG_LIBUKBOOT_INITALLOC
  #include <uk/allocbbuddy.h>
  #endif
@@ -255,6 +259,17 @@ void ukplat_entry(int argc, char *argv[])
        tma.argc = argc;
        tma.argv = argv;
+#if CONFIG_SMP

I am usually not a big fan of #ifdef. What's wrong of calling this code in non-smp? After all cpu_possible_map[i] would be just -1.

Also, isn't it common code? If so, I think this should be part of a separate patch so you can discuss more easily with x86 folks.

+       uk_pr_info("before start cpu\n");
+
+       for (i = 1; i < MAXCPU; i++) {
+               if (cpu_possible_map[i] != -1)

What does actually promise you that cpu_possible_map[0] will be the boot CPU?

+                       start_cpu(cpu_possible_map[i]);
+       }
+
+       release_aps();

I don't understand why you park all CPUs until they are all up. You don't seem to do any work between the two, so why not trying to make the secondary CPUs working as soon as possible (or parked by the scheduler)?

+#endif
+
  #if CONFIG_LIBUKSCHED
        main_thread = uk_thread_create("main", main_thread_func, &tma);
        if (unlikely(!main_thread))
diff --git a/plat/kvm/arm/setup.c b/plat/kvm/arm/setup.c
index 1f6e458..4035202 100644
--- a/plat/kvm/arm/setup.c
+++ b/plat/kvm/arm/setup.c
@@ -27,6 +27,29 @@
  #include <arm/cpu.h>
  #include <uk/arch/limits.h>
+#include <stdbool.h>
+#include <uk/plat/bootstrap.h>
+#include <uk/sched.h>
+
+#include <smp.h>
+#include <uk/plat/memory.h>
+#include <uk/plat/io.h>
+#include <uk/plat/lcpu.h>
+#include <arm/psci.h>
+#include <ofw/fdt.h>
+#include <time.h>
+
+#ifdef CONFIG_SMP

See my comment about #ifdef above. However, it feels odd that the variable are under #ifdef, but the implement code is not.

+int aps_ready;

That want to a bool.

+int smp_started;

Same here.

+int mp_ncpus;
+int smp_cpus = 1;      /* how many cpu's running */

The names of mp_ncpus and smp_cpus are too close to easily differentiate them.
Also, a general rule I usually apply is any value that can't be negative should be unsigned.

+void mpentry(void);
+uint8_t secondary_stacks[MAXCPU - 1][__PAGE_SIZE * 4];
+int cpu_possible_map[MAXCPU];

AFAICT, cpu_possible_map contains the MIDR. So this want to be uint64_t.

Overall, this need the whole set of declarations needs more comments. Furthermore, please only export what is required. I am pretty sure that at least aps_ready and smp_started can be static.

+static int cpu0 = -1;

Hmmm, is this only used to know cpu0 MIDR? If so, would not it make sense to try to store the MIDR of CPU0 in cpu_possible_map[0] everytime?

+#endif
+
  void *_libkvmplat_pagetable;
  void *_libkvmplat_heap_start;
  void *_libkvmplat_stack_top;
@@ -191,6 +214,160 @@ static void _libkvmplat_entry2(void *arg 
__attribute__((unused)))
        ukplat_entry_argp(NULL, (char *)cmdline, strlen(cmdline));
  }
+static void _init_dtb_cpu(void)

I don't think this is KVM specific.

+{
+       int fdt_cpu;
+       int naddr, nsize;
+       uint64_t core_id, index;
+       int subnode;
+       int i;
+
+       /* Init the cpu_possible_map */
+       for (i = 0; i < MAXCPU; i++)
+               cpu_possible_map[i] = -1;

This could be avoided by initializing the array at the declaration.

+
+       /* Search for assigned VM cpus in DTB */
+       fdt_cpu = fdt_path_offset(_libkvmplat_dtb, "/cpus");
+       if (fdt_cpu < 0)
+               uk_pr_warn("cpus node is not found in device tree\n");

Don't you miss a return here?

+
+       /* Get address,size cell */
+       naddr = fdt_address_cells(_libkvmplat_dtb, fdt_cpu);
+       if (naddr < 0 || naddr >= FDT_MAX_NCELLS) {
+               UK_CRASH("Could not find cpu address!\n");
+               return;
+       }
+       nsize = fdt_size_cells(_libkvmplat_dtb, fdt_cpu);
+       if (nsize < 0 || nsize >= FDT_MAX_NCELLS) {
+               UK_CRASH("Could not find cpu size!\n");
+               return;
+       }
+
+       /* Search all the cpu nodes in DTB */
+       index = 0;
+       fdt_for_each_subnode(subnode, _libkvmplat_dtb, fdt_cpu) {
+               const struct fdt_property *prop;
+               int prop_len = 0;
+
+               index++;
+
+               prop = fdt_get_property(_libkvmplat_dtb, subnode,
+                                               "enable-method", NULL);
+               if (!prop || strcmp(prop->data, "psci")) {
+                       uk_pr_err("Only support psci method!(%s)\n",
+                                       prop->data);
+                       return;
+               }
+
+               prop = fdt_get_property(_libkvmplat_dtb, subnode,
+                                               "device_type", &prop_len);
+               if (!prop)
+                       continue;
+               if (prop_len < 4)

I don't think this check is necessary. DT is considered well-formed, so the string would always finish by a \0.

+                       continue;
+               if (strcmp(prop->data, "cpu"))
+                       continue;
+
+               prop = fdt_get_property(_libkvmplat_dtb, subnode,
+                                               "reg", &prop_len);
+               if (prop == NULL || prop_len <= 0) {

Why <=0? Don't you want to check that prop_len is big enough for reading the 
number?

Also, I think split the check in 2 would help for debugging. The more...

+                       uk_pr_err("Error when searching reg property\n");

... this is not making much sense when you found the property but the length is not correct.

+                       return;
+               }
+
+               core_id = fdt_reg_read_number((const fdt32_t *)prop->data,
+                                               naddr);

I think we want to make any RES0 bits just in case they are defined for other purpose in the future.

+               cpu_possible_map[index-1] = core_id;
+               mp_ncpus++;
+       }
+}
+
+void release_aps(void)

I am not convinced the model of parking the CPUs in init_secondary is correct. (see above). I will still comment on the code just in case we decide to go ahead.

+{
+       int i, started;
+
+       /* Only release CPUs if they exist */
+       if (mp_ncpus == 1)
+               return;
+
+       //TODO: make aps_ready atomic

I don't think you need aps_ready to be atomic.

+       aps_ready = 1;
+
+       /* Wake up the other CPUs */
+       __asm __volatile(
+               "dsb ishst \n"
+               "sev               \n"
+               ::: "memory");
+
+       uk_pr_info("Release APs...");
+
+       started = 0;
+       for (i = 0; i < 20000; i++) {
+               if (smp_started) {

As smp_started is modified by an external entity (the secondary CPUs), I think the compiler is free to assume this is never modified. You can either make smp_started a volatile or use barrier().

But I don't think the variable is necessary, you could just check:

if ( smp_cpus == mp_cpus )

+                       uk_pr_info("done\n");
+                       return;
+               }
+               /*
+                * Don't time out while we are making progress. Some large
+                * systems can take a while to start all CPUs.
+                */
+               if (smp_cpus > started) {

Same as smp_started, you want a compiler barrier in this code. One for the two should be enough.

+                       i = 0;
+                       started = smp_cpus;
+               }
+
+               uk_pr_info("sleep for a while\n");
+               mdelay(1);
+       }
+
+       uk_pr_err("APs not started\n");
+}
+
+void init_secondary(uint64_t cpu)

A previous patch is using init_secondary before it is actually implemented. Can you make sure that all the patch is at least building one by one? This would avoid breaking bisection with can be useful to pin point a commit when error are introduced.

+{
+       struct uk_sched *s = NULL;
+       struct uk_alloc *a = NULL;

This is defined but not used here.

+
+       uk_pr_info("init secondary cpu=%lu\n", cpu);

How about:

"Booting CPU%lu\n"?

+
+       /* Spin until the BSP releases the APs */
+       while (!aps_ready)

See my comment regarding smp_started.

+               __asm __volatile("wfe");
+       uk_pr_info("after wfe cpu=%lu\n", cpu);

A more useful message would be "CPU%lu released".

+
+       smp_cpus += 1;

As you release all the CPUs together, this needs to be atomic. If you plan to bring-up CPU one by one (as suggested above), then you are saving some trouble here.

+
+       if (smp_cpus == mp_ncpus)
+               smp_started = 1;
+}
+
+void start_cpu(uint64_t target_cpu)
+{
+

NIT: spurious line.

+       uint32_t pa;

I think this want to be a __phys_addr as PA is 64-bit.

+       int err;
+
+       /* Check we are able to start this cpu */
+       UK_ASSERT(target_cpu < MAXCPU);
+
+       uk_pr_info("Starting CPU %lu\n", target_cpu);

NIT: It would be good to stay consistent when printing CPU. So I would do s/CPU %lu/CPU%lu/

+
+       /* We are already running on cpu 0 */
+       if (target_cpu == (uint64_t)cpu0)
+               return;

If you make sure cpu_possible_map[0] is always the CPU0, then you save the trouble to keep cpu0 around.

+
+       pa = ukplat_virt_to_phys(mpentry);
+       err = psci_cpu_on(target_cpu, pa);
+       if (err != PSCI_RET_SUCCESS) {
+               mp_ncpus--;

Isn't mp_ncpus meants to also know the number of element in cpu_map_possible? At least this would avoid to browse the full array.

In any case, I would be careful to decrement the number of mp_ncpus here.

+
+               /* Notify the user that the CPU failed to start */
+               uk_pr_info("Failed to start CPU (%lx)\n", target_cpu);

Ditto here for CPU. Also, you probably want to report the error here.

Lastly, don't you need to return here?

+       }
+
+       uk_pr_info("Starting CPU %lu successfully\n", target_cpu);

Ditto for CPU.

+}
+
  void _libkvmplat_start(void *dtb_pointer)
  {
        _init_dtb(dtb_pointer);
@@ -214,6 +391,17 @@ void _libkvmplat_start(void *dtb_pointer)
        uk_pr_info("     heap start: %p\n", _libkvmplat_heap_start);
        uk_pr_info("      stack top: %p\n", _libkvmplat_stack_top);
+ _init_dtb_cpu();
+
+       if (cpu0 < 0) {

This is a bit confusing. No one is setting cpu0 before here.

+               uint64_t mpidr_reg = SYSREG_READ32(mpidr_el1);

I am afraid that system registers are always 64-bit on Arm64. We made the same the mistakes on Xen.

For a first, Clang will definitely not be happy with it (see commit a17138b0e7 "xen/arm64: sysreg: Implement the 32-bit helpers using the 64-bit helpers").

Secondly, while some register have the top 32-bit RES0, they may be defined in the future. This is the case of SCTLR_EL2 in recent revision of the architecture.

In nutshell, SYSREG_*32 should be completely removed.

+
+               uk_pr_info("get mpidr_el1 0x%lx\n", mpidr_reg);
+
+               if ((mpidr_reg & 0xff00fffffful) == 0)

Please introduce a mask for the value. This is likely going to be helpful somewhere. But I don't think you can assume that CPU0 will always have the MIDR equal to 0 (accounting the mask).

+                       cpu0 = 0;
+       }
+
        /*
         * Switch away from the bootstrap stack as early as possible.
         */


Cheers,

--
Julien Grall

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.