
Re: [Minios-devel] [UNIKRAFT PATCH 3/9] build: Move arm32 libraries to new family/architecture folder



Hi Simon,

> -----Original Message-----
> From: Simon Kuenzer <simon.kuenzer@xxxxxxxxx>
> Sent: April 5, 2018 5:28
> To: Wei Chen <Wei.Chen@xxxxxxx>; minios-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Shijie Huang <Shijie.Huang@xxxxxxx>; Kaly Xin <Kaly.Xin@xxxxxxx>; nd
> <nd@xxxxxxx>
> Subject: Re: [UNIKRAFT PATCH 3/9] build: Move arm32 libraries to new
> family/architecture folder
> 
> On 15.03.2018 04:39, Wei Chen wrote:
> > We use arch/<FAMILY>/ to store the common code and build scripts
> > for the architectures of the same family. The code originally stored
> > in arch/arm32 is arm32-specific, so we move it to the new folder
> > arch/arm/arm32.
> >
> > Signed-off-by: Wei Chen <Wei.Chen@xxxxxxx>
> > ---
> >   arch/arm/Makefile.uk            |  15 +-
> >   arch/arm/arm32/divsi3.S         | 404 ++++++++++++++++++++++++++++++++++++++++
> >   arch/arm/arm32/ldivmod.S        |  68 +++++++
> >   arch/arm/arm32/ldivmod_helper.c |  67 +++++++
> >   arch/arm/arm32/qdivrem.c        | 324 ++++++++++++++++++++++++++++++++
> >   arch/arm/divsi3.S               | 404 ----------------------------------------
> >   arch/arm/ldivmod.S              |  68 -------
> >   arch/arm/ldivmod_helper.c       |  67 -------
> >   arch/arm/qdivrem.c              | 324 --------------------------------
> >   9 files changed, 874 insertions(+), 867 deletions(-)
> >   create mode 100644 arch/arm/arm32/divsi3.S
> >   create mode 100644 arch/arm/arm32/ldivmod.S
> >   create mode 100644 arch/arm/arm32/ldivmod_helper.c
> >   create mode 100644 arch/arm/arm32/qdivrem.c
> >   delete mode 100644 arch/arm/divsi3.S
> >   delete mode 100644 arch/arm/ldivmod.S
> >   delete mode 100644 arch/arm/ldivmod_helper.c
> >   delete mode 100644 arch/arm/qdivrem.c
> >
> > diff --git a/arch/arm/Makefile.uk b/arch/arm/Makefile.uk
> > index 2567dbe..780a035 100644
> > --- a/arch/arm/Makefile.uk
> > +++ b/arch/arm/Makefile.uk
> > @@ -1,3 +1,6 @@
> > +# Setup compiler flags and objects for arm32
> 
> I would call it libraries in the comment instead of objects. libarmmath
> is essentially a library for ARM32. Will you need something similar for
> AARCH64?

The current libarmmath provides integer arithmetic routines for
platforms whose processors don't provide hardware divide support. On
AArch64, hardware floating-point support is mandatory: half-, single-,
and double-precision floating-point can be handled directly by the
AArch64 floating-point instructions. Only for 128-bit floats (long
double) would we need to implement routines such as __divtf3 in
libarmmath.

But I think that, currently, AArch64 doesn't need this libarmmath.
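
(For reference: on AArch64, long double is the 128-bit IEEE format, and
the corresponding libgcc soft-fp routine has the shape

    long double __divtf3(long double a, long double b);

This is only to illustrate the interface we would have to provide, not
proposed code.)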

> If yes, keep the LIBARMMATH_SRCS-$(ARCH_ARM_32) form and do the
> addlib call for both CPU architectures. If a library does not have any
> source files defined, the build system will automatically exclude it
> from the build.
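
I see. A minimal sketch of the form you suggest could look like this
(ARCH_ARM_64 is assumed here as the AArch64 config symbol; this series
does not add it yet):

    $(eval $(call addlib,libarmmath))
    LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/arm32/divsi3.S
    LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/arm32/ldivmod.S
    LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/arm32/ldivmod_helper.c
    LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/arm32/qdivrem.c
    # No LIBARMMATH_SRCS-$(ARCH_ARM_64) entries yet: with an empty source
    # list, the build system excludes libarmmath from AArch64 builds.
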
> 
> > +ifeq ($(UK_ARCH),arm)
> > +
> >   ASFLAGS  += -D__ARM_32__
> >   ASFLAGS  += -marm
> >   CFLAGS   += -D__ARM_32__
> > @@ -13,7 +16,11 @@ CFLAGS-$(MARCH_CORTEXA7)   += -mcpu=cortex-a7 -mtune=cortex-a7
> >   CXXFLAGS-$(MARCH_CORTEXA7) += -mcpu=cortex-a7 -mtune=cortex-a7
> >
> >   $(eval $(call addlib,libarmmath))
> > -LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/divsi3.S
> > -LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/ldivmod.S
> > -LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/ldivmod_helper.c
> > -LIBARMMATH_SRCS-$(ARCH_ARM_32) += $(UK_BASE)/arch/arm/qdivrem.c
> > +LIBARMMATH_SRCS-y += $(UK_BASE)/arch/arm/arm32/divsi3.S
> > +LIBARMMATH_SRCS-y += $(UK_BASE)/arch/arm/arm32/ldivmod.S
> > +LIBARMMATH_SRCS-y += $(UK_BASE)/arch/arm/arm32/ldivmod_helper.c
> > +LIBARMMATH_SRCS-y += $(UK_BASE)/arch/arm/arm32/qdivrem.c
> > +
> > +else
> > +$(error Target architecture ($(UK_ARCH)) is currently not supported.)
> > +endif
> > diff --git a/arch/arm/arm32/divsi3.S b/arch/arm/arm32/divsi3.S
> > new file mode 100644
> > index 0000000..8bf5ac2
> > --- /dev/null
> > +++ b/arch/arm/arm32/divsi3.S
> > @@ -0,0 +1,404 @@
> > +/* $NetBSD: divsi3.S,v 1.4 2003/04/05 23:27:15 bjh21 Exp $ */
> > +
> > +/*-
> > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE.
> > + */
> > +
> > +#define ENTRY_NP(symbol) \
> > +   .globl symbol;  \
> > +   symbol:
> > +
> > +#define END(symbol)
> > +
> > +/*
> > + * stack is aligned as there's a possibility of branching to L_overflow
> > + * which makes a C call
> > + */
> > +
> > +ENTRY_NP(__umodsi3)
> > +   stmfd   sp!, {lr}
> > +   sub     sp, sp, #4      /* align stack */
> > +   bl      .L_udivide
> > +   add     sp, sp, #4      /* unalign stack */
> > +   mov     r0, r1
> > +   ldmfd   sp!, {pc}
> > +END(__umodsi3)
> > +
> > +ENTRY_NP(__modsi3)
> > +   stmfd   sp!, {lr}
> > +   sub     sp, sp, #4      /* align stack */
> > +   bl      .L_divide
> > +   add     sp, sp, #4      /* unalign stack */
> > +   mov     r0, r1
> > +   ldmfd   sp!, {pc}
> > +
> > +.L_overflow:
> > +   /* XXX should cause a fatal error */
> > +   mvn     r0, #0
> > +   mov     pc, lr
> > +
> > +END(__modsi3)
> > +
> > +#ifdef __ARM_EABI__
> > +ENTRY_NP(__aeabi_uidiv)
> > +ENTRY_NP(__aeabi_uidivmod)
> > +#endif
> > +ENTRY_NP(__udivsi3)
> > +.L_udivide:                                /* r0 = r0 / r1; r1 = r0 % r1 */
> > +   eor     r0, r1, r0
> > +   eor     r1, r0, r1
> > +   eor     r0, r1, r0
> > +                                   /* r0 = r1 / r0; r1 = r1 % r0 */
> > +   cmp     r0, #1
> > +   bcc     .L_overflow
> > +   beq     .L_divide_l0
> > +   mov     ip, #0
> > +   movs    r1, r1
> > +   bpl     .L_divide_l1
> > +   orr     ip, ip, #0x20000000     /* ip bit 0x20000000 = -ve r1 */
> > +   movs    r1, r1, lsr #1
> > +   orrcs   ip, ip, #0x10000000     /* ip bit 0x10000000 = bit 0 of r1 */
> > +   b       .L_divide_l1
> > +
> > +.L_divide_l0:                              /* r0 == 1 */
> > +   mov     r0, r1
> > +   mov     r1, #0
> > +   mov     pc, lr
> > +#ifdef __ARM_EABI__
> > +END(__aeabi_uidiv)
> > +END(__aeabi_uidivmod)
> > +#endif
> > +END(__udivsi3)
> > +
> > +#ifdef __ARM_EABI__
> > +ENTRY_NP(__aeabi_idiv)
> > +ENTRY_NP(__aeabi_idivmod)
> > +#endif
> > +ENTRY_NP(__divsi3)
> > +.L_divide:                         /* r0 = r0 / r1; r1 = r0 % r1 */
> > +   eor     r0, r1, r0
> > +   eor     r1, r0, r1
> > +   eor     r0, r1, r0
> > +                                   /* r0 = r1 / r0; r1 = r1 % r0 */
> > +   cmp     r0, #1
> > +   bcc     .L_overflow
> > +   beq     .L_divide_l0
> > +   ands    ip, r0, #0x80000000
> > +   rsbmi   r0, r0, #0
> > +   ands    r2, r1, #0x80000000
> > +   eor     ip, ip, r2
> > +   rsbmi   r1, r1, #0
> > +   orr     ip, r2, ip, lsr #1      /* ip bit 0x40000000 = -ve division */
> > +                                   /* ip bit 0x80000000 = -ve remainder */
> > +
> > +.L_divide_l1:
> > +   mov     r2, #1
> > +   mov     r3, #0
> > +
> > +   /*
> > +    * If the highest bit of the dividend is set, we have to be
> > +    * careful when shifting the divisor. Test this.
> > +    */
> > +   movs    r1,r1
> > +   bpl     .L_old_code
> > +
> > +   /*
> > +    * At this point, the highest bit of r1 is known to be set.
> > +    * We abuse this below in the tst instructions.
> > +    */
> > +   tst     r1, r0 /*, lsl #0 */
> > +   bmi     .L_divide_b1
> > +   tst     r1, r0, lsl #1
> > +   bmi     .L_divide_b2
> > +   tst     r1, r0, lsl #2
> > +   bmi     .L_divide_b3
> > +   tst     r1, r0, lsl #3
> > +   bmi     .L_divide_b4
> > +   tst     r1, r0, lsl #4
> > +   bmi     .L_divide_b5
> > +   tst     r1, r0, lsl #5
> > +   bmi     .L_divide_b6
> > +   tst     r1, r0, lsl #6
> > +   bmi     .L_divide_b7
> > +   tst     r1, r0, lsl #7
> > +   bmi     .L_divide_b8
> > +   tst     r1, r0, lsl #8
> > +   bmi     .L_divide_b9
> > +   tst     r1, r0, lsl #9
> > +   bmi     .L_divide_b10
> > +   tst     r1, r0, lsl #10
> > +   bmi     .L_divide_b11
> > +   tst     r1, r0, lsl #11
> > +   bmi     .L_divide_b12
> > +   tst     r1, r0, lsl #12
> > +   bmi     .L_divide_b13
> > +   tst     r1, r0, lsl #13
> > +   bmi     .L_divide_b14
> > +   tst     r1, r0, lsl #14
> > +   bmi     .L_divide_b15
> > +   tst     r1, r0, lsl #15
> > +   bmi     .L_divide_b16
> > +   tst     r1, r0, lsl #16
> > +   bmi     .L_divide_b17
> > +   tst     r1, r0, lsl #17
> > +   bmi     .L_divide_b18
> > +   tst     r1, r0, lsl #18
> > +   bmi     .L_divide_b19
> > +   tst     r1, r0, lsl #19
> > +   bmi     .L_divide_b20
> > +   tst     r1, r0, lsl #20
> > +   bmi     .L_divide_b21
> > +   tst     r1, r0, lsl #21
> > +   bmi     .L_divide_b22
> > +   tst     r1, r0, lsl #22
> > +   bmi     .L_divide_b23
> > +   tst     r1, r0, lsl #23
> > +   bmi     .L_divide_b24
> > +   tst     r1, r0, lsl #24
> > +   bmi     .L_divide_b25
> > +   tst     r1, r0, lsl #25
> > +   bmi     .L_divide_b26
> > +   tst     r1, r0, lsl #26
> > +   bmi     .L_divide_b27
> > +   tst     r1, r0, lsl #27
> > +   bmi     .L_divide_b28
> > +   tst     r1, r0, lsl #28
> > +   bmi     .L_divide_b29
> > +   tst     r1, r0, lsl #29
> > +   bmi     .L_divide_b30
> > +   tst     r1, r0, lsl #30
> > +   bmi     .L_divide_b31
> > +/*
> > + * instead of:
> > + * tst     r1, r0, lsl #31
> > + * bmi     .L_divide_b32
> > + */
> > +   b       .L_divide_b32
> > +
> > +.L_old_code:
> > +   cmp     r1, r0
> > +   bcc     .L_divide_b0
> > +   cmp     r1, r0, lsl #1
> > +   bcc     .L_divide_b1
> > +   cmp     r1, r0, lsl #2
> > +   bcc     .L_divide_b2
> > +   cmp     r1, r0, lsl #3
> > +   bcc     .L_divide_b3
> > +   cmp     r1, r0, lsl #4
> > +   bcc     .L_divide_b4
> > +   cmp     r1, r0, lsl #5
> > +   bcc     .L_divide_b5
> > +   cmp     r1, r0, lsl #6
> > +   bcc     .L_divide_b6
> > +   cmp     r1, r0, lsl #7
> > +   bcc     .L_divide_b7
> > +   cmp     r1, r0, lsl #8
> > +   bcc     .L_divide_b8
> > +   cmp     r1, r0, lsl #9
> > +   bcc     .L_divide_b9
> > +   cmp     r1, r0, lsl #10
> > +   bcc     .L_divide_b10
> > +   cmp     r1, r0, lsl #11
> > +   bcc     .L_divide_b11
> > +   cmp     r1, r0, lsl #12
> > +   bcc     .L_divide_b12
> > +   cmp     r1, r0, lsl #13
> > +   bcc     .L_divide_b13
> > +   cmp     r1, r0, lsl #14
> > +   bcc     .L_divide_b14
> > +   cmp     r1, r0, lsl #15
> > +   bcc     .L_divide_b15
> > +   cmp     r1, r0, lsl #16
> > +   bcc     .L_divide_b16
> > +   cmp     r1, r0, lsl #17
> > +   bcc     .L_divide_b17
> > +   cmp     r1, r0, lsl #18
> > +   bcc     .L_divide_b18
> > +   cmp     r1, r0, lsl #19
> > +   bcc     .L_divide_b19
> > +   cmp     r1, r0, lsl #20
> > +   bcc     .L_divide_b20
> > +   cmp     r1, r0, lsl #21
> > +   bcc     .L_divide_b21
> > +   cmp     r1, r0, lsl #22
> > +   bcc     .L_divide_b22
> > +   cmp     r1, r0, lsl #23
> > +   bcc     .L_divide_b23
> > +   cmp     r1, r0, lsl #24
> > +   bcc     .L_divide_b24
> > +   cmp     r1, r0, lsl #25
> > +   bcc     .L_divide_b25
> > +   cmp     r1, r0, lsl #26
> > +   bcc     .L_divide_b26
> > +   cmp     r1, r0, lsl #27
> > +   bcc     .L_divide_b27
> > +   cmp     r1, r0, lsl #28
> > +   bcc     .L_divide_b28
> > +   cmp     r1, r0, lsl #29
> > +   bcc     .L_divide_b29
> > +   cmp     r1, r0, lsl #30
> > +   bcc     .L_divide_b30
> > +.L_divide_b32:
> > +   cmp     r1, r0, lsl #31
> > +   subhs   r1, r1,r0, lsl #31
> > +   addhs   r3, r3,r2, lsl #31
> > +.L_divide_b31:
> > +   cmp     r1, r0, lsl #30
> > +   subhs   r1, r1,r0, lsl #30
> > +   addhs   r3, r3,r2, lsl #30
> > +.L_divide_b30:
> > +   cmp     r1, r0, lsl #29
> > +   subhs   r1, r1,r0, lsl #29
> > +   addhs   r3, r3,r2, lsl #29
> > +.L_divide_b29:
> > +   cmp     r1, r0, lsl #28
> > +   subhs   r1, r1,r0, lsl #28
> > +   addhs   r3, r3,r2, lsl #28
> > +.L_divide_b28:
> > +   cmp     r1, r0, lsl #27
> > +   subhs   r1, r1,r0, lsl #27
> > +   addhs   r3, r3,r2, lsl #27
> > +.L_divide_b27:
> > +   cmp     r1, r0, lsl #26
> > +   subhs   r1, r1,r0, lsl #26
> > +   addhs   r3, r3,r2, lsl #26
> > +.L_divide_b26:
> > +   cmp     r1, r0, lsl #25
> > +   subhs   r1, r1,r0, lsl #25
> > +   addhs   r3, r3,r2, lsl #25
> > +.L_divide_b25:
> > +   cmp     r1, r0, lsl #24
> > +   subhs   r1, r1,r0, lsl #24
> > +   addhs   r3, r3,r2, lsl #24
> > +.L_divide_b24:
> > +   cmp     r1, r0, lsl #23
> > +   subhs   r1, r1,r0, lsl #23
> > +   addhs   r3, r3,r2, lsl #23
> > +.L_divide_b23:
> > +   cmp     r1, r0, lsl #22
> > +   subhs   r1, r1,r0, lsl #22
> > +   addhs   r3, r3,r2, lsl #22
> > +.L_divide_b22:
> > +   cmp     r1, r0, lsl #21
> > +   subhs   r1, r1,r0, lsl #21
> > +   addhs   r3, r3,r2, lsl #21
> > +.L_divide_b21:
> > +   cmp     r1, r0, lsl #20
> > +   subhs   r1, r1,r0, lsl #20
> > +   addhs   r3, r3,r2, lsl #20
> > +.L_divide_b20:
> > +   cmp     r1, r0, lsl #19
> > +   subhs   r1, r1,r0, lsl #19
> > +   addhs   r3, r3,r2, lsl #19
> > +.L_divide_b19:
> > +   cmp     r1, r0, lsl #18
> > +   subhs   r1, r1,r0, lsl #18
> > +   addhs   r3, r3,r2, lsl #18
> > +.L_divide_b18:
> > +   cmp     r1, r0, lsl #17
> > +   subhs   r1, r1,r0, lsl #17
> > +   addhs   r3, r3,r2, lsl #17
> > +.L_divide_b17:
> > +   cmp     r1, r0, lsl #16
> > +   subhs   r1, r1,r0, lsl #16
> > +   addhs   r3, r3,r2, lsl #16
> > +.L_divide_b16:
> > +   cmp     r1, r0, lsl #15
> > +   subhs   r1, r1,r0, lsl #15
> > +   addhs   r3, r3,r2, lsl #15
> > +.L_divide_b15:
> > +   cmp     r1, r0, lsl #14
> > +   subhs   r1, r1,r0, lsl #14
> > +   addhs   r3, r3,r2, lsl #14
> > +.L_divide_b14:
> > +   cmp     r1, r0, lsl #13
> > +   subhs   r1, r1,r0, lsl #13
> > +   addhs   r3, r3,r2, lsl #13
> > +.L_divide_b13:
> > +   cmp     r1, r0, lsl #12
> > +   subhs   r1, r1,r0, lsl #12
> > +   addhs   r3, r3,r2, lsl #12
> > +.L_divide_b12:
> > +   cmp     r1, r0, lsl #11
> > +   subhs   r1, r1,r0, lsl #11
> > +   addhs   r3, r3,r2, lsl #11
> > +.L_divide_b11:
> > +   cmp     r1, r0, lsl #10
> > +   subhs   r1, r1,r0, lsl #10
> > +   addhs   r3, r3,r2, lsl #10
> > +.L_divide_b10:
> > +   cmp     r1, r0, lsl #9
> > +   subhs   r1, r1,r0, lsl #9
> > +   addhs   r3, r3,r2, lsl #9
> > +.L_divide_b9:
> > +   cmp     r1, r0, lsl #8
> > +   subhs   r1, r1,r0, lsl #8
> > +   addhs   r3, r3,r2, lsl #8
> > +.L_divide_b8:
> > +   cmp     r1, r0, lsl #7
> > +   subhs   r1, r1,r0, lsl #7
> > +   addhs   r3, r3,r2, lsl #7
> > +.L_divide_b7:
> > +   cmp     r1, r0, lsl #6
> > +   subhs   r1, r1,r0, lsl #6
> > +   addhs   r3, r3,r2, lsl #6
> > +.L_divide_b6:
> > +   cmp     r1, r0, lsl #5
> > +   subhs   r1, r1,r0, lsl #5
> > +   addhs   r3, r3,r2, lsl #5
> > +.L_divide_b5:
> > +   cmp     r1, r0, lsl #4
> > +   subhs   r1, r1,r0, lsl #4
> > +   addhs   r3, r3,r2, lsl #4
> > +.L_divide_b4:
> > +   cmp     r1, r0, lsl #3
> > +   subhs   r1, r1,r0, lsl #3
> > +   addhs   r3, r3,r2, lsl #3
> > +.L_divide_b3:
> > +   cmp     r1, r0, lsl #2
> > +   subhs   r1, r1,r0, lsl #2
> > +   addhs   r3, r3,r2, lsl #2
> > +.L_divide_b2:
> > +   cmp     r1, r0, lsl #1
> > +   subhs   r1, r1,r0, lsl #1
> > +   addhs   r3, r3,r2, lsl #1
> > +.L_divide_b1:
> > +   cmp     r1, r0
> > +   subhs   r1, r1, r0
> > +   addhs   r3, r3, r2
> > +.L_divide_b0:
> > +
> > +   tst     ip, #0x20000000
> > +   bne     .L_udivide_l1
> > +   mov     r0, r3
> > +   cmp     ip, #0
> > +   rsbmi   r1, r1, #0
> > +   movs    ip, ip, lsl #1
> > +   bicmi   r0, r0, #0x80000000     /* Fix in case we divided 0x80000000 */
> > +   rsbmi   r0, r0, #0
> > +   mov     pc, lr
> > +
> > +.L_udivide_l1:
> > +   tst     ip, #0x10000000
> > +   mov     r1, r1, lsl #1
> > +   orrne   r1, r1, #1
> > +   mov     r3, r3, lsl #1
> > +   cmp     r1, r0
> > +   subhs   r1, r1, r0
> > +   addhs   r3, r3, r2
> > +   mov     r0, r3
> > +   mov     pc, lr
> > +END(__aeabi_idiv)
> > +END(__aeabi_idivmod)
> > +END(__divsi3)
> > +
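
(For readers new to this file: the long unrolled cmp/subhs/addhs ladder
above is ordinary shift-and-subtract binary long division. A rough C
model of the unsigned case, with a hypothetical name udiv32 and assuming
d != 0, is:

    unsigned int udiv32(unsigned int n, unsigned int d)
    {
        unsigned int r = n, q = 0;
        int i;

        /* Try each shifted copy of the divisor, highest first. */
        for (i = 31; i >= 0; i--)
            if (((unsigned long long)d << i) <= r) {
                r -= (unsigned int)(d << i);
                q += 1u << i;
            }
        return q;       /* on return, r == n % d */
    }

The assembly additionally keeps sign and spare-dividend-bit bookkeeping
in ip, which is what the 0x10000000..0x80000000 flag bits are for.)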
> > diff --git a/arch/arm/arm32/ldivmod.S b/arch/arm/arm32/ldivmod.S
> > new file mode 100644
> > index 0000000..180227c
> > --- /dev/null
> > +++ b/arch/arm/arm32/ldivmod.S
> > @@ -0,0 +1,68 @@
> > +/* SPDX-License-Identifier: BSD-2-Clause */
> > +/*
> > + * Copyright (C) 2012 Andrew Turner
> > + * All rights reserved.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE.
> > + *
> > + */
> > +
> > +#define ENTRY_NP(symbol) \
> > +   .globl symbol;  \
> > +   symbol:
> > +
> > +#define END(symbol)
> > +
> > +/*
> > + * These calculate:
> > + * q = n / m
> > + * with a remainder r.
> > + *
> > + * They take n in {r0, r1} and m in {r2, r3}, then pass them into the
> > + * helper function. The helper functions return q in {r0, r1} as
> > + * required by the API spec; however, r is returned on the stack. The
> > + * ABI requires us to return r in {r2, r3}.
> > + *
> > + * We need to allocate 8 bytes on the stack to store r, the link
> > + * register, and a pointer to the space where the helper function
> > + * will write r to. After returning from the helper function we load
> > + * the old link register and r from the stack and return.
> > + */
> > +ENTRY_NP(__aeabi_ldivmod)
> > +   sub     sp, sp, #8      /* Space for the remainder */
> > +   stmfd   sp!, {sp, lr}   /* Save a pointer to the above space and lr */
> > +   bl      __kern_ldivmod
> > +   ldr     lr, [sp, #4]    /* Restore lr */
> > +   add     sp, sp, #8      /* Move sp to the remainder value */
> > +   ldmfd   sp!, {r2, r3}   /* Load the remainder */
> > +   mov     pc, lr
> > +END(__aeabi_ldivmod)
> > +
> > +ENTRY_NP(__aeabi_uldivmod)
> > +   sub     sp, sp, #8      /* Space for the remainder */
> > +   stmfd   sp!, {sp, lr}   /* Save a pointer to the above space and lr */
> > +   bl      __qdivrem
> > +   ldr     lr, [sp, #4]    /* Restore lr */
> > +   add     sp, sp, #8      /* Move sp to the remainder value */
> > +   ldmfd   sp!, {r2, r3}   /* Load the remainder */
> > +   mov     pc, lr
> > +END(__aeabi_uldivmod)
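
(Context: these helpers are normally not called explicitly. On arm32
EABI the compiler lowers 64-bit division and modulo to them itself;
for some long long n, m and unsigned long long un, um, roughly:

    long long q = n / m;             /* becomes a call to __aeabi_ldivmod  */
    unsigned long long r = un % um;  /* becomes a call to __aeabi_uldivmod */

so all they have to do is honour the AEABI run-time convention described
in the comment above.)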
> > diff --git a/arch/arm/arm32/ldivmod_helper.c b/arch/arm/arm32/ldivmod_helper.c
> > new file mode 100644
> > index 0000000..098523e
> > --- /dev/null
> > +++ b/arch/arm/arm32/ldivmod_helper.c
> > @@ -0,0 +1,67 @@
> > +/* SPDX-License-Identifier: BSD-2-Clause */
> > +/*
> > + * Copyright (C) 2012 Andrew Turner
> > + * All rights reserved.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE.
> > + *
> > + */
> > +
> > +#include <uk/arch/types.h>
> > +
> > +__u64 __qdivrem(__u64 u, __u64 v, __u64 *rem);
> > +
> > +#ifndef HAVE_LIBC
> > +__s64 __divdi3(__s64 a, __s64 b)
> > +{
> > +   __u64 ua, ub, uq;
> > +   int neg;
> > +
> > +   if (a < 0)
> > +           ua = -(__u64)a, neg = 1;
> > +   else
> > +           ua = a, neg = 0;
> > +   if (b < 0)
> > +           ub = -(__u64)b, neg ^= 1;
> > +   else
> > +           ub = b;
> > +   uq = __qdivrem(ua, ub, (__u64 *)0);
> > +   return neg ? -uq : uq;
> > +}
> > +#endif
> > +
> > +/*
> > + * Helper for __aeabi_ldivmod.
> > + * TODO: __divdi3 calls __qdivrem. We should do the same and use the
> > + * remainder value rather than re-calculating it.
> > + */
> > +long long __kern_ldivmod(long long, long long, long long *);
> > +
> > +long long __kern_ldivmod(long long n, long long m, long long *rem)
> > +{
> > +   long long q;
> > +
> > +   q = __divdi3(n, m); /* q = n / m */
> > +   *rem = n - m * q;
> > +
> > +   return q;
> > +}
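
As an aside on the TODO above: a rough, untested sketch that keeps
__qdivrem's current signature and reuses its remainder instead of
recomputing n - m * q could be:

    long long __kern_ldivmod(long long n, long long m, long long *rem)
    {
        __u64 ua, ub, ur, uq;
        int neg = 0;

        if (n < 0) { ua = -(__u64)n; neg ^= 1; } else { ua = n; }
        if (m < 0) { ub = -(__u64)m; neg ^= 1; } else { ub = m; }
        uq = __qdivrem(ua, ub, &ur);
        /* C division truncates towards zero: the quotient is negative
         * iff exactly one operand is; the remainder takes the sign of
         * the dividend n. */
        *rem = (n < 0) ? -(long long)ur : (long long)ur;
        return neg ? -(long long)uq : (long long)uq;
    }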
> > diff --git a/arch/arm/arm32/qdivrem.c b/arch/arm/arm32/qdivrem.c
> > new file mode 100644
> > index 0000000..e7d1471
> > --- /dev/null
> > +++ b/arch/arm/arm32/qdivrem.c
> > @@ -0,0 +1,324 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause */
> > +/*-
> > + * Copyright (c) 1992, 1993
> > + * The Regents of the University of California.  All rights reserved.
> > + *
> > + * This software was developed by the Computer Systems Engineering group
> > + * at Lawrence Berkeley Laboratory under DARPA contract BG 91-66 and
> > + * contributed to Berkeley.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + * 4. Neither the name of the University nor the names of its contributors
> > + *    may be used to endorse or promote products derived from this software
> > + *    without specific prior written permission.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE.
> > + */
> > +
> > +#include <uk/arch/types.h>
> > +#include <uk/arch/limits.h>
> > +
> > +/*
> > + * Multiprecision divide.  This algorithm is from Knuth vol. 2 (2nd ed),
> > + * section 4.3.1, pp. 257--259.
> > + */
> > +
> > +/*
> > + * From
> > + * @(#)quad.h      8.1 (Berkeley) 6/4/93
> > + */
> > +
> > +#ifdef __BIG_ENDIAN
> > +#define _QUAD_HIGHWORD 0
> > +#define _QUAD_LOWWORD 1
> > +#else /* __LITTLE_ENDIAN */
> > +#define _QUAD_HIGHWORD 1
> > +#define _QUAD_LOWWORD 0
> > +#endif
> > +
> > +/*
> > + * Define high and low longwords.
> > + */
> > +#define QUADH               _QUAD_HIGHWORD
> > +#define QUADL               _QUAD_LOWWORD
> > +
> > +/*
> > + * Total number of bits in a quad_t and in the pieces that make it up.
> > + * These are used for shifting, and also below for halfword extraction
> > + * and assembly.
> > + */
> > +#define CHAR_BIT        8               /* number of bits in a char */
> > +#define QUAD_BITS       (sizeof(__s64) * CHAR_BIT)
> > +#define LONG_BITS       (sizeof(long) * CHAR_BIT)
> > +#define HALF_BITS       (sizeof(long) * CHAR_BIT / 2)
> > +
> > +#define DIGIT_BASE (1 << HALF_BITS) /* digit base */
> > +/*
> > + * Extract high and low shortwords from longword, and move low shortword of
> > + * longword to upper half of long, i.e., produce the upper longword of
> > + * ((quad_t)(x) << (number_of_bits_in_long/2)).  (`x' must actually be u_long.)
> > + *
> > + * These are used in the multiply code, to split a longword into upper
> > + * and lower halves, and to reassemble a product as a quad_t, shifted left
> > + * (sizeof(long)*CHAR_BIT/2).
> > + */
> > +#define HHALF(x)        ((x) >> HALF_BITS)
> > +#define LHALF(x)        ((x) & ((1 << HALF_BITS) - 1))
> > +#define LHUP(x)         ((x) << HALF_BITS)
> > +
> > +#define COMBINE(a, b) (((unsigned long)(a) << HALF_BITS) | (b))
> > +
> > +/*
> > + * Depending on the desired operation, we view a `long long' (aka quad_t) in
> > + * one or more of the following formats.
> > + */
> > +union uu {
> > +   __s64           q;             /* as a (signed) quad */
> > +   __u64          uq;             /* as an unsigned quad */
> > +   long           sl[2];          /* as two signed longs */
> > +   unsigned long  ul[2];          /* as two unsigned longs */
> > +};
> > +
> > +#define    B       (1 << HALF_BITS)        /* digit base */
> > +
> > +/* select a type for digits in base B: use unsigned short if they fit */
> > +#if __UL_MAX == 0xffffffff && __US_MAX >= 0xffff
> > +typedef unsigned short digit;
> > +#else
> > +typedef unsigned long digit;
> > +#endif
> > +
> > +/*
> > + * Shift p[0]..p[len] left `sh' bits, ignoring any bits that
> > + * `fall out' the left (there never will be any such anyway).
> > + * We may assume len >= 0.  NOTE THAT THIS WRITES len+1 DIGITS.
> > + */
> > +static void __shl(register digit *p, register int len, register int sh)
> > +{
> > +   register int i;
> > +
> > +   for (i = 0; i < len; i++)
> > +           p[i] = LHALF(p[i] << sh) | (p[i + 1] >> (HALF_BITS - sh));
> > +   p[i] = LHALF(p[i] << sh);
> > +}
> > +
> > +/*
> > + * __qdivrem(u, v, rem) returns u/v and, optionally, sets *rem to u%v.
> > + *
> > + * We do this in base 2-sup-HALF_BITS, so that all intermediate products
> > + * fit within u_long.  As a consequence, the maximum length dividend and
> > + * divisor are 4 `digits' in this base (they are shorter if they have
> > + * leading zeros).
> > + */
> > +__u64 __qdivrem(__u64 uq, __u64 vq, __u64 *arq)
> > +{
> > +   union uu tmp;
> > +   digit *u, *v, *q;
> > +   register digit v1, v2;
> > +   unsigned long qhat, rhat, t;
> > +   int m, n, d, j, i;
> > +   digit uspace[5], vspace[5], qspace[5];
> > +
> > +   /*
> > +    * Take care of special cases: divide by zero, and u < v.
> > +    */
> > +   if (vq == 0) {
> > +           /* divide by zero. */
> > +           static volatile const unsigned int zero = 0;
> > +
> > +           tmp.ul[QUADH] = tmp.ul[QUADL] = 1 / zero;
> > +           if (arq)
> > +                   *arq = uq;
> > +           return tmp.q;
> > +   }
> > +   if (uq < vq) {
> > +           if (arq)
> > +                   *arq = uq;
> > +           return 0;
> > +   }
> > +   u = &uspace[0];
> > +   v = &vspace[0];
> > +   q = &qspace[0];
> > +
> > +   /*
> > +    * Break dividend and divisor into digits in base B, then
> > +    * count leading zeros to determine m and n.  When done, we
> > +    * will have:
> > +    *      u = (u[1]u[2]...u[m+n]) sub B
> > +    *      v = (v[1]v[2]...v[n]) sub B
> > +    *      v[1] != 0
> > +    *      1 < n <= 4 (if n = 1, we use a different division algorithm)
> > +    *      m >= 0 (otherwise u < v, which we already checked)
> > +    *      m + n = 4
> > +    * and thus
> > +    *      m = 4 - n <= 2
> > +    */
> > +   tmp.uq = uq;
> > +   u[0] = 0;
> > +   u[1] = HHALF(tmp.ul[QUADH]);
> > +   u[2] = LHALF(tmp.ul[QUADH]);
> > +   u[3] = HHALF(tmp.ul[QUADL]);
> > +   u[4] = LHALF(tmp.ul[QUADL]);
> > +   tmp.uq = vq;
> > +   v[1] = HHALF(tmp.ul[QUADH]);
> > +   v[2] = LHALF(tmp.ul[QUADH]);
> > +   v[3] = HHALF(tmp.ul[QUADL]);
> > +   v[4] = LHALF(tmp.ul[QUADL]);
> > +   for (n = 4; v[1] == 0; v++) {
> > +           if (--n == 1) {
> > +                   unsigned long rbj; /* r*B+u[j] (not root boy jim) */
> > +                   digit q1, q2, q3, q4;
> > +
> > +                   /*
> > +                    * Change of plan, per exercise 16.
> > +                    *      r = 0;
> > +                    *      for j = 1..4:
> > +                    *              q[j] = floor((r*B + u[j]) / v),
> > +                    *              r = (r*B + u[j]) % v;
> > +                    * We unroll this completely here.
> > +                    */
> > +                   t = v[2];       /* nonzero, by definition */
> > +                   q1 = u[1] / t;
> > +                   rbj = COMBINE(u[1] % t, u[2]);
> > +                   q2 = rbj / t;
> > +                   rbj = COMBINE(rbj % t, u[3]);
> > +                   q3 = rbj / t;
> > +                   rbj = COMBINE(rbj % t, u[4]);
> > +                   q4 = rbj / t;
> > +                   if (arq)
> > +                           *arq = rbj % t;
> > +                   tmp.ul[QUADH] = COMBINE(q1, q2);
> > +                   tmp.ul[QUADL] = COMBINE(q3, q4);
> > +                   return tmp.q;
> > +           }
> > +   }
> > +
> > +   /*
> > +    * By adjusting q once we determine m, we can guarantee that
> > +    * there is a complete four-digit quotient at &qspace[1] when
> > +    * we finally stop.
> > +    */
> > +   for (m = 4 - n; u[1] == 0; u++)
> > +           m--;
> > +   for (i = 4 - m; --i >= 0;)
> > +           q[i] = 0;
> > +   q += 4 - m;
> > +
> > +   /*
> > +    * Here we run Program D, translated from MIX to C and acquiring
> > +    * a few minor changes.
> > +    *
> > +    * D1: choose multiplier 1 << d to ensure v[1] >= B/2.
> > +    */
> > +   d = 0;
> > +   for (t = v[1]; t < B / 2; t <<= 1)
> > +           d++;
> > +   if (d > 0) {
> > +           __shl(&u[0], m + n, d);         /* u <<= d */
> > +           __shl(&v[1], n - 1, d);         /* v <<= d */
> > +   }
> > +   /*
> > +    * D2: j = 0.
> > +    */
> > +   j = 0;
> > +   v1 = v[1];      /* for D3 -- note that v[1..n] are constant */
> > +   v2 = v[2];      /* for D3 */
> > +   do {
> > +           register digit uj0, uj1, uj2;
> > +
> > +           /*
> > +            * D3: Calculate qhat (\^q, in TeX notation).
> > +            * Let qhat = min((u[j]*B + u[j+1])/v[1], B-1), and
> > +            * let rhat = (u[j]*B + u[j+1]) mod v[1].
> > +            * While rhat < B and v[2]*qhat > rhat*B+u[j+2],
> > +            * decrement qhat and increase rhat correspondingly.
> > +            * Note that if rhat >= B, v[2]*qhat < rhat*B.
> > +            */
> > +           uj0 = u[j + 0]; /* for D3 only -- note that u[j+...] change */
> > +           uj1 = u[j + 1]; /* for D3 only */
> > +           uj2 = u[j + 2]; /* for D3 only */
> > +           if (uj0 == v1) {
> > +                   qhat = B;
> > +                   rhat = uj1;
> > +                   goto qhat_too_big;
> > +           } else {
> > +                   unsigned long nn = COMBINE(uj0, uj1);
> > +
> > +                   qhat = nn / v1;
> > +                   rhat = nn % v1;
> > +           }
> > +           while (v2 * qhat > COMBINE(rhat, uj2)) {
> > +qhat_too_big:
> > +                   qhat--;
> > +                   if ((rhat += v1) >= B)
> > +                           break;
> > +           }
> > +           /*
> > +            * D4: Multiply and subtract.
> > +            * The variable `t' holds any borrows across the loop.
> > +            * We split this up so that we do not require v[0] = 0,
> > +            * and to eliminate a final special case.
> > +            */
> > +           for (t = 0, i = n; i > 0; i--) {
> > +                   t = u[i + j] - v[i] * qhat - t;
> > +                   u[i + j] = LHALF(t);
> > +                   t = (B - HHALF(t)) & (B - 1);
> > +           }
> > +           t = u[j] - t;
> > +           u[j] = LHALF(t);
> > +           /*
> > +            * D5: test remainder.
> > +            * There is a borrow if and only if HHALF(t) is nonzero;
> > +            * in that (rare) case, qhat was too large (by exactly 1).
> > +            * Fix it by adding v[1..n] to u[j..j+n].
> > +            */
> > +           if (HHALF(t)) {
> > +                   qhat--;
> > +                   for (t = 0, i = n; i > 0; i--) { /* D6: add back. */
> > +                           t += u[i + j] + v[i];
> > +                           u[i + j] = LHALF(t);
> > +                           t = HHALF(t);
> > +                   }
> > +                   u[j] = LHALF(u[j] + t);
> > +           }
> > +           q[j] = qhat;
> > +   } while (++j <= m);             /* D7: loop on j. */
> > +
> > +   /*
> > +    * If caller wants the remainder, we have to calculate it as
> > +    * u[m..m+n] >> d (this is at most n digits and thus fits in
> > +    * u[m+1..m+n], but we may need more source digits).
> > +    */
> > +   if (arq) {
> > +           if (d) {
> > +                   for (i = m + n; i > m; --i)
> > +                           u[i] = (u[i] >> d) |
> > +                               LHALF(u[i - 1] << (HALF_BITS - d));
> > +                   u[i] = 0;
> > +           }
> > +           tmp.ul[QUADH] = COMBINE(uspace[1], uspace[2]);
> > +           tmp.ul[QUADL] = COMBINE(uspace[3], uspace[4]);
> > +           *arq = tmp.q;
> > +   }
> > +
> > +   tmp.ul[QUADH] = COMBINE(qspace[1], qspace[2]);
> > +   tmp.ul[QUADL] = COMBINE(qspace[3], qspace[4]);
> > +   return tmp.q;
> > +}
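
(A quick usage example for __qdivrem, e.g. from a unit test:

    __u64 r;
    __u64 q = __qdivrem(1000000007ULL, 13ULL, &r);
    /* q == 76923077 and r == 6, since 13 * 76923077 + 6 == 1000000007 */

The divide-by-zero branch above deliberately evaluates 1 / 0 on a
volatile zero so a trap is raised at run time instead of the division
being optimized away.)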
> > diff --git a/arch/arm/divsi3.S b/arch/arm/divsi3.S
> > deleted file mode 100644
> > index fa92233..0000000
> > --- a/arch/arm/divsi3.S
> > +++ /dev/null
> > @@ -1,404 +0,0 @@
> > -/* $NetBSD: divsi3.S,v 1.4 2003/04/05 23:27:15 bjh21 Exp $ */
> > -
> > -/*-
> > - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > - * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > - * SUCH DAMAGE.
> > - */
> > -
> > -#define ENTRY_NP(symbol) \
> > -   .globl symbol;  \
> > -   symbol:
> > -
> > -#define END(symbol)
> > -
> > -/*
> > - * stack is aligned as there's a possibility of branching to L_overflow
> > - * which makes a C call
> > - */
> > -
> > -ENTRY_NP(__umodsi3)
> > -   stmfd   sp!, {lr}
> > -   sub     sp, sp, #4      /* align stack */
> > -   bl      .L_udivide
> > -   add     sp, sp, #4      /* unalign stack */
> > -   mov     r0, r1
> > -   ldmfd   sp!, {pc}
> > -END(__umodsi3)
> > -
> > -ENTRY_NP(__modsi3)
> > -   stmfd   sp!, {lr}
> > -   sub     sp, sp, #4      /* align stack */
> > -   bl      .L_divide
> > -   add     sp, sp, #4      /* unalign stack */
> > -   mov     r0, r1
> > -   ldmfd   sp!, {pc}
> > -
> > -.L_overflow:
> > -   /* XXX should cause a fatal error */
> > -   mvn     r0, #0
> > -   mov     pc, lr
> > -
> > -END(__modsi3)
> > -
> > -#ifdef __ARM_EABI__
> > -ENTRY_NP(__aeabi_uidiv)
> > -ENTRY_NP(__aeabi_uidivmod)
> > -#endif
> > -ENTRY_NP(__udivsi3)
> > -.L_udivide:                                /* r0 = r0 / r1; r1 = r0 % r1 */
> > -   eor     r0, r1, r0
> > -   eor     r1, r0, r1
> > -   eor     r0, r1, r0
> > -                                   /* r0 = r1 / r0; r1 = r1 % r0 */
> > -   cmp     r0, #1
> > -   bcc     .L_overflow
> > -   beq     .L_divide_l0
> > -   mov     ip, #0
> > -   movs    r1, r1
> > -   bpl     .L_divide_l1
> > -   orr     ip, ip, #0x20000000     /* ip bit 0x20000000 = -ve r1 */
> > -   movs    r1, r1, lsr #1
> > -   orrcs   ip, ip, #0x10000000     /* ip bit 0x10000000 = bit 0 of r1 */
> > -   b       .L_divide_l1
> > -
> > -.L_divide_l0:                              /* r0 == 1 */
> > -   mov     r0, r1
> > -   mov     r1, #0
> > -   mov     pc, lr
> > -#ifdef __ARM_EABI__
> > -END(__aeabi_uidiv)
> > -END(__aeabi_uidivmod)
> > -#endif
> > -END(__udivsi3)
> > -
> > -#ifdef __ARM_EABI__
> > -ENTRY_NP(__aeabi_idiv)
> > -ENTRY_NP(__aeabi_idivmod)
> > -#endif
> > -ENTRY_NP(__divsi3)
> > -.L_divide:                         /* r0 = r0 / r1; r1 = r0 % r1 */
> > -   eor     r0, r1, r0
> > -   eor     r1, r0, r1
> > -   eor     r0, r1, r0
> > -                                   /* r0 = r1 / r0; r1 = r1 % r0 */
> > -   cmp     r0, #1
> > -   bcc     .L_overflow
> > -   beq     .L_divide_l0
> > -   ands    ip, r0, #0x80000000
> > -   rsbmi   r0, r0, #0
> > -   ands    r2, r1, #0x80000000
> > -   eor     ip, ip, r2
> > -   rsbmi   r1, r1, #0
> > -   orr     ip, r2, ip, lsr #1      /* ip bit 0x40000000 = -ve division */
> > -                                   /* ip bit 0x80000000 = -ve remainder */
> > -
> > -.L_divide_l1:
> > -   mov     r2, #1
> > -   mov     r3, #0
> > -
> > -   /*
> > -    * If the highest bit of the dividend is set, we have to be
> > -    * careful when shifting the divisor. Test this.
> > -    */
> > -   movs    r1,r1
> > -   bpl     .L_old_code
> > -
> > -   /*
> > -    * At this point, the highest bit of r1 is known to be set.
> > -    * We abuse this below in the tst instructions.
> > -    */
> > -   tst     r1, r0 /*, lsl #0 */
> > -   bmi     .L_divide_b1
> > -   tst     r1, r0, lsl #1
> > -   bmi     .L_divide_b2
> > -   tst     r1, r0, lsl #2
> > -   bmi     .L_divide_b3
> > -   tst     r1, r0, lsl #3
> > -   bmi     .L_divide_b4
> > -   tst     r1, r0, lsl #4
> > -   bmi     .L_divide_b5
> > -   tst     r1, r0, lsl #5
> > -   bmi     .L_divide_b6
> > -   tst     r1, r0, lsl #6
> > -   bmi     .L_divide_b7
> > -   tst     r1, r0, lsl #7
> > -   bmi     .L_divide_b8
> > -   tst     r1, r0, lsl #8
> > -   bmi     .L_divide_b9
> > -   tst     r1, r0, lsl #9
> > -   bmi     .L_divide_b10
> > -   tst     r1, r0, lsl #10
> > -   bmi     .L_divide_b11
> > -   tst     r1, r0, lsl #11
> > -   bmi     .L_divide_b12
> > -   tst     r1, r0, lsl #12
> > -   bmi     .L_divide_b13
> > -   tst     r1, r0, lsl #13
> > -   bmi     .L_divide_b14
> > -   tst     r1, r0, lsl #14
> > -   bmi     .L_divide_b15
> > -   tst     r1, r0, lsl #15
> > -   bmi     .L_divide_b16
> > -   tst     r1, r0, lsl #16
> > -   bmi     .L_divide_b17
> > -   tst     r1, r0, lsl #17
> > -   bmi     .L_divide_b18
> > -   tst     r1, r0, lsl #18
> > -   bmi     .L_divide_b19
> > -   tst     r1, r0, lsl #19
> > -   bmi     .L_divide_b20
> > -   tst     r1, r0, lsl #20
> > -   bmi     .L_divide_b21
> > -   tst     r1, r0, lsl #21
> > -   bmi     .L_divide_b22
> > -   tst     r1, r0, lsl #22
> > -   bmi     .L_divide_b23
> > -   tst     r1, r0, lsl #23
> > -   bmi     .L_divide_b24
> > -   tst     r1, r0, lsl #24
> > -   bmi     .L_divide_b25
> > -   tst     r1, r0, lsl #25
> > -   bmi     .L_divide_b26
> > -   tst     r1, r0, lsl #26
> > -   bmi     .L_divide_b27
> > -   tst     r1, r0, lsl #27
> > -   bmi     .L_divide_b28
> > -   tst     r1, r0, lsl #28
> > -   bmi     .L_divide_b29
> > -   tst     r1, r0, lsl #29
> > -   bmi     .L_divide_b30
> > -   tst     r1, r0, lsl #30
> > -   bmi     .L_divide_b31
> > -/*
> > - * instead of:
> > - * tst     r1, r0, lsl #31
> > - * bmi     .L_divide_b32
> > - */
> > -   b       .L_divide_b32
> > -
> > -.L_old_code:
> > -   cmp     r1, r0
> > -   bcc     .L_divide_b0
> > -   cmp     r1, r0, lsl #1
> > -   bcc     .L_divide_b1
> > -   cmp     r1, r0, lsl #2
> > -   bcc     .L_divide_b2
> > -   cmp     r1, r0, lsl #3
> > -   bcc     .L_divide_b3
> > -   cmp     r1, r0, lsl #4
> > -   bcc     .L_divide_b4
> > -   cmp     r1, r0, lsl #5
> > -   bcc     .L_divide_b5
> > -   cmp     r1, r0, lsl #6
> > -   bcc     .L_divide_b6
> > -   cmp     r1, r0, lsl #7
> > -   bcc     .L_divide_b7
> > -   cmp     r1, r0, lsl #8
> > -   bcc     .L_divide_b8
> > -   cmp     r1, r0, lsl #9
> > -   bcc     .L_divide_b9
> > -   cmp     r1, r0, lsl #10
> > -   bcc     .L_divide_b10
> > -   cmp     r1, r0, lsl #11
> > -   bcc     .L_divide_b11
> > -   cmp     r1, r0, lsl #12
> > -   bcc     .L_divide_b12
> > -   cmp     r1, r0, lsl #13
> > -   bcc     .L_divide_b13
> > -   cmp     r1, r0, lsl #14
> > -   bcc     .L_divide_b14
> > -   cmp     r1, r0, lsl #15
> > -   bcc     .L_divide_b15
> > -   cmp     r1, r0, lsl #16
> > -   bcc     .L_divide_b16
> > -   cmp     r1, r0, lsl #17
> > -   bcc     .L_divide_b17
> > -   cmp     r1, r0, lsl #18
> > -   bcc     .L_divide_b18
> > -   cmp     r1, r0, lsl #19
> > -   bcc     .L_divide_b19
> > -   cmp     r1, r0, lsl #20
> > -   bcc     .L_divide_b20
> > -   cmp     r1, r0, lsl #21
> > -   bcc     .L_divide_b21
> > -   cmp     r1, r0, lsl #22
> > -   bcc     .L_divide_b22
> > -   cmp     r1, r0, lsl #23
> > -   bcc     .L_divide_b23
> > -   cmp     r1, r0, lsl #24
> > -   bcc     .L_divide_b24
> > -   cmp     r1, r0, lsl #25
> > -   bcc     .L_divide_b25
> > -   cmp     r1, r0, lsl #26
> > -   bcc     .L_divide_b26
> > -   cmp     r1, r0, lsl #27
> > -   bcc     .L_divide_b27
> > -   cmp     r1, r0, lsl #28
> > -   bcc     .L_divide_b28
> > -   cmp     r1, r0, lsl #29
> > -   bcc     .L_divide_b29
> > -   cmp     r1, r0, lsl #30
> > -   bcc     .L_divide_b30
> > -.L_divide_b32:
> > -   cmp     r1, r0, lsl #31
> > -   subhs   r1, r1,r0, lsl #31
> > -   addhs   r3, r3,r2, lsl #31
> > -.L_divide_b31:
> > -   cmp     r1, r0, lsl #30
> > -   subhs   r1, r1,r0, lsl #30
> > -   addhs   r3, r3,r2, lsl #30
> > -.L_divide_b30:
> > -   cmp     r1, r0, lsl #29
> > -   subhs   r1, r1,r0, lsl #29
> > -   addhs   r3, r3,r2, lsl #29
> > -.L_divide_b29:
> > -   cmp     r1, r0, lsl #28
> > -   subhs   r1, r1,r0, lsl #28
> > -   addhs   r3, r3,r2, lsl #28
> > -.L_divide_b28:
> > -   cmp     r1, r0, lsl #27
> > -   subhs   r1, r1,r0, lsl #27
> > -   addhs   r3, r3,r2, lsl #27
> > -.L_divide_b27:
> > -   cmp     r1, r0, lsl #26
> > -   subhs   r1, r1,r0, lsl #26
> > -   addhs   r3, r3,r2, lsl #26
> > -.L_divide_b26:
> > -   cmp     r1, r0, lsl #25
> > -   subhs   r1, r1,r0, lsl #25
> > -   addhs   r3, r3,r2, lsl #25
> > -.L_divide_b25:
> > -   cmp     r1, r0, lsl #24
> > -   subhs   r1, r1,r0, lsl #24
> > -   addhs   r3, r3,r2, lsl #24
> > -.L_divide_b24:
> > -   cmp     r1, r0, lsl #23
> > -   subhs   r1, r1,r0, lsl #23
> > -   addhs   r3, r3,r2, lsl #23
> > -.L_divide_b23:
> > -   cmp     r1, r0, lsl #22
> > -   subhs   r1, r1,r0, lsl #22
> > -   addhs   r3, r3,r2, lsl #22
> > -.L_divide_b22:
> > -   cmp     r1, r0, lsl #21
> > -   subhs   r1, r1,r0, lsl #21
> > -   addhs   r3, r3,r2, lsl #21
> > -.L_divide_b21:
> > -   cmp     r1, r0, lsl #20
> > -   subhs   r1, r1,r0, lsl #20
> > -   addhs   r3, r3,r2, lsl #20
> > -.L_divide_b20:
> > -   cmp     r1, r0, lsl #19
> > -   subhs   r1, r1,r0, lsl #19
> > -   addhs   r3, r3,r2, lsl #19
> > -.L_divide_b19:
> > -   cmp     r1, r0, lsl #18
> > -   subhs   r1, r1,r0, lsl #18
> > -   addhs   r3, r3,r2, lsl #18
> > -.L_divide_b18:
> > -   cmp     r1, r0, lsl #17
> > -   subhs   r1, r1,r0, lsl #17
> > -   addhs   r3, r3,r2, lsl #17
> > -.L_divide_b17:
> > -   cmp     r1, r0, lsl #16
> > -   subhs   r1, r1,r0, lsl #16
> > -   addhs   r3, r3,r2, lsl #16
> > -.L_divide_b16:
> > -   cmp     r1, r0, lsl #15
> > -   subhs   r1, r1,r0, lsl #15
> > -   addhs   r3, r3,r2, lsl #15
> > -.L_divide_b15:
> > -   cmp     r1, r0, lsl #14
> > -   subhs   r1, r1,r0, lsl #14
> > -   addhs   r3, r3,r2, lsl #14
> > -.L_divide_b14:
> > -   cmp     r1, r0, lsl #13
> > -   subhs   r1, r1,r0, lsl #13
> > -   addhs   r3, r3,r2, lsl #13
> > -.L_divide_b13:
> > -   cmp     r1, r0, lsl #12
> > -   subhs   r1, r1,r0, lsl #12
> > -   addhs   r3, r3,r2, lsl #12
> > -.L_divide_b12:
> > -   cmp     r1, r0, lsl #11
> > -   subhs   r1, r1,r0, lsl #11
> > -   addhs   r3, r3,r2, lsl #11
> > -.L_divide_b11:
> > -   cmp     r1, r0, lsl #10
> > -   subhs   r1, r1,r0, lsl #10
> > -   addhs   r3, r3,r2, lsl #10
> > -.L_divide_b10:
> > -   cmp     r1, r0, lsl #9
> > -   subhs   r1, r1,r0, lsl #9
> > -   addhs   r3, r3,r2, lsl #9
> > -.L_divide_b9:
> > -   cmp     r1, r0, lsl #8
> > -   subhs   r1, r1,r0, lsl #8
> > -   addhs   r3, r3,r2, lsl #8
> > -.L_divide_b8:
> > -   cmp     r1, r0, lsl #7
> > -   subhs   r1, r1,r0, lsl #7
> > -   addhs   r3, r3,r2, lsl #7
> > -.L_divide_b7:
> > -   cmp     r1, r0, lsl #6
> > -   subhs   r1, r1,r0, lsl #6
> > -   addhs   r3, r3,r2, lsl #6
> > -.L_divide_b6:
> > -   cmp     r1, r0, lsl #5
> > -   subhs   r1, r1,r0, lsl #5
> > -   addhs   r3, r3,r2, lsl #5
> > -.L_divide_b5:
> > -   cmp     r1, r0, lsl #4
> > -   subhs   r1, r1,r0, lsl #4
> > -   addhs   r3, r3,r2, lsl #4
> > -.L_divide_b4:
> > -   cmp     r1, r0, lsl #3
> > -   subhs   r1, r1,r0, lsl #3
> > -   addhs   r3, r3,r2, lsl #3
> > -.L_divide_b3:
> > -   cmp     r1, r0, lsl #2
> > -   subhs   r1, r1,r0, lsl #2
> > -   addhs   r3, r3,r2, lsl #2
> > -.L_divide_b2:
> > -   cmp     r1, r0, lsl #1
> > -   subhs   r1, r1,r0, lsl #1
> > -   addhs   r3, r3,r2, lsl #1
> > -.L_divide_b1:
> > -   cmp     r1, r0
> > -   subhs   r1, r1, r0
> > -   addhs   r3, r3, r2
> > -.L_divide_b0:
> > -
> > -   tst     ip, #0x20000000
> > -   bne     .L_udivide_l1
> > -   mov     r0, r3
> > -   cmp     ip, #0
> > -   rsbmi   r1, r1, #0
> > -   movs    ip, ip, lsl #1
> > -   bicmi   r0, r0, #0x80000000     /* Fix in case we divided 0x80000000 */
> > -   rsbmi   r0, r0, #0
> > -   mov     pc, lr
> > -
> > -.L_udivide_l1:
> > -   tst     ip, #0x10000000
> > -   mov     r1, r1, lsl #1
> > -   orrne   r1, r1, #1
> > -   mov     r3, r3, lsl #1
> > -   cmp     r1, r0
> > -   subhs   r1, r1, r0
> > -   addhs   r3, r3, r2
> > -   mov     r0, r3
> > -   mov     pc, lr
> > -END(__aeabi_idiv)
> > -END(__aeabi_idivmod)
> > -END(__divsi3)
> > -
> > diff --git a/arch/arm/ldivmod.S b/arch/arm/ldivmod.S
> > deleted file mode 100644
> > index 3c3083b..0000000
> > --- a/arch/arm/ldivmod.S
> > +++ /dev/null
> > @@ -1,68 +0,0 @@
> > -/* SPDX-License-Identifier: BSD-2-Clause */
> > -/*
> > - * Copyright (C) 2012 Andrew Turner
> > - * All rights reserved.
> > - *
> > - * Redistribution and use in source and binary forms, with or without
> > - * modification, are permitted provided that the following conditions
> > - * are met:
> > - * 1. Redistributions of source code must retain the above copyright
> > - *    notice, this list of conditions and the following disclaimer.
> > - * 2. Redistributions in binary form must reproduce the above copyright
> > - *    notice, this list of conditions and the following disclaimer in the
> > - *    documentation and/or other materials provided with the distribution.
> > - *
> > - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > - * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > - * SUCH DAMAGE.
> > - *
> > - */
> > -
> > -#define ENTRY_NP(symbol) \
> > -   .globl symbol;  \
> > -   symbol:
> > -
> > -#define END(symbol)
> > -
> > -/*
> > - * These calculate:
> > - * q = n / m
> > - * with a remainder r.
> > - *
> > - * They take n in {r0, r1} and m in {r2, r3}, then pass them into the
> > - * helper function. The helper functions return q in {r0, r1} as
> > - * required by the API spec; however, r is returned on the stack. The
> > - * ABI requires us to return r in {r2, r3}.
> > - *
> > - * We need to allocate 8 bytes on the stack to store r, the link
> > - * register, and a pointer to the space where the helper function
> > - * will write r to. After returning from the helper function we load
> > - * the old link register and r from the stack and return.
> > - */
> > -ENTRY_NP(__aeabi_ldivmod)
> > -   sub     sp, sp, #8      /* Space for the remainder */
> > -   stmfd   sp!, {sp, lr}   /* Save a pointer to the above space and lr */
> > -   bl      __kern_ldivmod
> > -   ldr     lr, [sp, #4]    /* Restore lr */
> > -   add     sp, sp, #8      /* Move sp to the remainder value */
> > -   ldmfd   sp!, {r2, r3}   /* Load the remainder */
> > -   mov     pc, lr
> > -END(__aeabi_ldivmod)
> > -
> > -ENTRY_NP(__aeabi_uldivmod)
> > -   sub     sp, sp, #8      /* Space for the remainder */
> > -   stmfd   sp!, {sp, lr}   /* Save a pointer to the above space and lr */
> > -   bl      __qdivrem
> > -   ldr     lr, [sp, #4]    /* Restore lr */
> > -   add     sp, sp, #8      /* Move sp to the remainder value */
> > -   ldmfd   sp!, {r2, r3}   /* Load the remainder */
> > -   mov     pc, lr
> > -END(__aeabi_uldivmod)
> > diff --git a/arch/arm/ldivmod_helper.c b/arch/arm/ldivmod_helper.c
> > deleted file mode 100644
> > index 098523e..0000000
> > --- a/arch/arm/ldivmod_helper.c
> > +++ /dev/null
> > @@ -1,67 +0,0 @@
> > -/* SPDX-License-Identifier: BSD-2-Clause */
> > -/*
> > - * Copyright (C) 2012 Andrew Turner
> > - * All rights reserved.
> > - *
> > - * Redistribution and use in source and binary forms, with or without
> > - * modification, are permitted provided that the following conditions
> > - * are met:
> > - * 1. Redistributions of source code must retain the above copyright
> > - *    notice, this list of conditions and the following disclaimer.
> > - * 2. Redistributions in binary form must reproduce the above copyright
> > - *    notice, this list of conditions and the following disclaimer in the
> > - *    documentation and/or other materials provided with the distribution.
> > - *
> > - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > - * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > - * SUCH DAMAGE.
> > - *
> > - */
> > -
> > -#include <uk/arch/types.h>
> > -
> > -__u64 __qdivrem(__u64 u, __u64 v, __u64 *rem);
> > -
> > -#ifndef HAVE_LIBC
> > -__s64 __divdi3(__s64 a, __s64 b)
> > -{
> > -   __u64 ua, ub, uq;
> > -   int neg;
> > -
> > -   if (a < 0)
> > -           ua = -(__u64)a, neg = 1;
> > -   else
> > -           ua = a, neg = 0;
> > -   if (b < 0)
> > -           ub = -(__u64)b, neg ^= 1;
> > -   else
> > -           ub = b;
> > -   uq = __qdivrem(ua, ub, (__u64 *)0);
> > -   return neg ? -uq : uq;
> > -}
> > -#endif
> > -
> > -/*
> > - * Helper for __aeabi_ldivmod.
> > - * TODO: __divdi3 calls __qdivrem. We should do the same and use the
> > - * remainder value rather than re-calculating it.
> > - */
> > -long long __kern_ldivmod(long long, long long, long long *);
> > -
> > -long long __kern_ldivmod(long long n, long long m, long long *rem)
> > -{
> > -   long long q;
> > -
> > -   q = __divdi3(n, m); /* q = n / m */
> > -   *rem = n - m * q;
> > -
> > -   return q;
> > -}
> > diff --git a/arch/arm/qdivrem.c b/arch/arm/qdivrem.c
> > deleted file mode 100644
> > index e7d1471..0000000
> > --- a/arch/arm/qdivrem.c
> > +++ /dev/null
> > @@ -1,324 +0,0 @@
> > -/* SPDX-License-Identifier: BSD-3-Clause */
> > -/*-
> > - * Copyright (c) 1992, 1993
> > - * The Regents of the University of California.  All rights reserved.
> > - *
> > - * This software was developed by the Computer Systems Engineering group
> > - * at Lawrence Berkeley Laboratory under DARPA contract BG 91-66 and
> > - * contributed to Berkeley.
> > - *
> > - * Redistribution and use in source and binary forms, with or without
> > - * modification, are permitted provided that the following conditions
> > - * are met:
> > - * 1. Redistributions of source code must retain the above copyright
> > - *    notice, this list of conditions and the following disclaimer.
> > - * 2. Redistributions in binary form must reproduce the above copyright
> > - *    notice, this list of conditions and the following disclaimer in the
> > - *    documentation and/or other materials provided with the distribution.
> > - * 4. Neither the name of the University nor the names of its contributors
> > - *    may be used to endorse or promote products derived from this software
> > - *    without specific prior written permission.
> > - *
> > - * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
> > - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > - * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
> > - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > - * SUCH DAMAGE.
> > - */
> > -
> > -#include <uk/arch/types.h>
> > -#include <uk/arch/limits.h>
> > -
> > -/*
> > - * Multiprecision divide.  This algorithm is from Knuth vol. 2 (2nd ed),
> > - * section 4.3.1, pp. 257--259.
> > - */
> > -
> > -/*
> > - * From
> > - * @(#)quad.h      8.1 (Berkeley) 6/4/93
> > - */
> > -
> > -#ifdef __BIG_ENDIAN
> > -#define _QUAD_HIGHWORD 0
> > -#define _QUAD_LOWWORD 1
> > -#else /* __LITTLE_ENDIAN */
> > -#define _QUAD_HIGHWORD 1
> > -#define _QUAD_LOWWORD 0
> > -#endif
> > -
> > -/*
> > - * Define high and low longwords.
> > - */
> > -#define QUADH               _QUAD_HIGHWORD
> > -#define QUADL               _QUAD_LOWWORD
> > -
> > -/*
> > - * Total number of bits in a quad_t and in the pieces that make it up.
> > - * These are used for shifting, and also below for halfword extraction
> > - * and assembly.
> > - */
> > -#define CHAR_BIT        8               /* number of bits in a char */
> > -#define QUAD_BITS       (sizeof(__s64) * CHAR_BIT)
> > -#define LONG_BITS       (sizeof(long) * CHAR_BIT)
> > -#define HALF_BITS       (sizeof(long) * CHAR_BIT / 2)
> > -
> > -#define DIGIT_BASE (1 << HALF_BITS) /* digit base */
> > -/*
> > - * Extract high and low shortwords from longword, and move low shortword of
> > - * longword to upper half of long, i.e., produce the upper longword of
> > - * ((quad_t)(x) << (number_of_bits_in_long/2)).  (`x' must actually be u_long.)
> > - *
> > - * These are used in the multiply code, to split a longword into upper
> > - * and lower halves, and to reassemble a product as a quad_t, shifted left
> > - * (sizeof(long)*CHAR_BIT/2).
> > - */
> > -#define HHALF(x)        ((x) >> HALF_BITS)
> > -#define LHALF(x)        ((x) & ((1 << HALF_BITS) - 1))
> > -#define LHUP(x)         ((x) << HALF_BITS)
> > -
> > -#define COMBINE(a, b) (((unsigned long)(a) << HALF_BITS) | (b))
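
To make the halfword macros concrete, assuming the 32-bit long of arm32
(so HALF_BITS == 16):

    HHALF(0x12345678UL)        /* == 0x1234UL, the high halfword */
    LHALF(0x12345678UL)        /* == 0x5678UL, the low halfword  */
    COMBINE(0x1234, 0x5678)    /* == 0x12345678UL, reassembled   */

Each base-B "digit" in the division code below is therefore one 16-bit
halfword.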
> > -
> > -/*
> > - * Depending on the desired operation, we view a `long long' (aka quad_t) in
> > - * one or more of the following formats.
> > - */
> > -union uu {
> > -   __s64           q;             /* as a (signed) quad */
> > -   __u64          uq;             /* as an unsigned quad */
> > -   long           sl[2];          /* as two signed longs */
> > -   unsigned long  ul[2];          /* as two unsigned longs */
> > -};
> > -
> > -#define    B       (1 << HALF_BITS)        /* digit base */
> > -
> > -/* select a type for digits in base B: use unsigned short if they fit */
> > -#if __UL_MAX == 0xffffffff && __US_MAX >= 0xffff
> > -typedef unsigned short digit;
> > -#else
> > -typedef unsigned long digit;
> > -#endif
> > -
> > -/*
> > - * Shift p[0]..p[len] left `sh' bits, ignoring any bits that
> > - * `fall out' the left (there never will be any such anyway).
> > - * We may assume len >= 0.  NOTE THAT THIS WRITES len+1 DIGITS.
> > - */
> > -static void __shl(register digit *p, register int len, register int sh)
> > -{
> > -   register int i;
> > -
> > -   for (i = 0; i < len; i++)
> > -           p[i] = LHALF(p[i] << sh) | (p[i + 1] >> (HALF_BITS - sh));
> > -   p[i] = LHALF(p[i] << sh);
> > -}
> > -
> > -/*
> > - * __qdivrem(u, v, rem) returns u/v and, optionally, sets *rem to u%v.
> > - *
> > - * We do this in base 2-sup-HALF_BITS, so that all intermediate products
> > - * fit within u_long.  As a consequence, the maximum length dividend and
> > - * divisor are 4 `digits' in this base (they are shorter if they have
> > - * leading zeros).
> > - */
> > -__u64 __qdivrem(__u64 uq, __u64 vq, __u64 *arq)
> > -{
> > -   union uu tmp;
> > -   digit *u, *v, *q;
> > -   register digit v1, v2;
> > -   unsigned long qhat, rhat, t;
> > -   int m, n, d, j, i;
> > -   digit uspace[5], vspace[5], qspace[5];
> > -
> > -   /*
> > -    * Take care of special cases: divide by zero, and u < v.
> > -    */
> > -   if (vq == 0) {
> > -           /* divide by zero. */
> > -           static volatile const unsigned int zero = 0;
> > -
> > -           tmp.ul[QUADH] = tmp.ul[QUADL] = 1 / zero;
> > -           if (arq)
> > -                   *arq = uq;
> > -           return tmp.q;
> > -   }
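
A subtlety worth noting in the divide-by-zero path above: `zero' is
volatile precisely so that the compiler cannot fold `1 / zero' away at
build time; the division is forced to happen at run time, where the
platform's integer divide-by-zero behavior applies (a trap, or a call
to the EABI __aeabi_idiv0 handler when software division is in use).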
> > -   if (uq < vq) {
> > -           if (arq)
> > -                   *arq = uq;
> > -           return 0;
> > -   }
> > -   u = &uspace[0];
> > -   v = &vspace[0];
> > -   q = &qspace[0];
> > -
> > -   /*
> > -    * Break dividend and divisor into digits in base B, then
> > -    * count leading zeros to determine m and n.  When done, we
> > -    * will have:
> > -    *      u = (u[1]u[2]...u[m+n]) sub B
> > -    *      v = (v[1]v[2]...v[n]) sub B
> > -    *      v[1] != 0
> > -    *      1 < n <= 4 (if n = 1, we use a different division algorithm)
> > -    *      m >= 0 (otherwise u < v, which we already checked)
> > -    *      m + n = 4
> > -    * and thus
> > -    *      m = 4 - n <= 2
> > -    */
> > -   tmp.uq = uq;
> > -   u[0] = 0;
> > -   u[1] = HHALF(tmp.ul[QUADH]);
> > -   u[2] = LHALF(tmp.ul[QUADH]);
> > -   u[3] = HHALF(tmp.ul[QUADL]);
> > -   u[4] = LHALF(tmp.ul[QUADL]);
> > -   tmp.uq = vq;
> > -   v[1] = HHALF(tmp.ul[QUADH]);
> > -   v[2] = LHALF(tmp.ul[QUADH]);
> > -   v[3] = HHALF(tmp.ul[QUADL]);
> > -   v[4] = LHALF(tmp.ul[QUADL]);
> > -   for (n = 4; v[1] == 0; v++) {
> > -           if (--n == 1) {
> > -                   unsigned long rbj; /* r*B+u[j] (not root boy jim) */
> > -                   digit q1, q2, q3, q4;
> > -
> > -                   /*
> > -                    * Change of plan, per exercise 16.
> > -                    *      r = 0;
> > -                    *      for j = 1..4:
> > -                    *              q[j] = floor((r*B + u[j]) / v),
> > -                    *              r = (r*B + u[j]) % v;
> > -                    * We unroll this completely here.
> > -                    */
> > -                   t = v[2];       /* nonzero, by definition */
> > -                   q1 = u[1] / t;
> > -                   rbj = COMBINE(u[1] % t, u[2]);
> > -                   q2 = rbj / t;
> > -                   rbj = COMBINE(rbj % t, u[3]);
> > -                   q3 = rbj / t;
> > -                   rbj = COMBINE(rbj % t, u[4]);
> > -                   q4 = rbj / t;
> > -                   if (arq)
> > -                           *arq = rbj % t;
> > -                   tmp.ul[QUADH] = COMBINE(q1, q2);
> > -                   tmp.ul[QUADL] = COMBINE(q3, q4);
> > -                   return tmp.q;
> > -           }
> > -   }
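
The n == 1 branch above is plain schoolbook short division, one base-B
digit at a time starting from the most significant end. The same
unrolled steps in base 10, dividing 7256 by 3:

    q1 = 7 / 3  = 2, r = 1
    q2 = 12 / 3 = 4, r = 0     /* 12 == r*10 + 2 */
    q3 = 5 / 3  = 1, r = 2     /*  5 == r*10 + 5 */
    q4 = 26 / 3 = 8, r = 2     /* 26 == r*10 + 6 */

i.e. quotient 2418 with remainder 2; here the digit base B is
1 << HALF_BITS rather than 10.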
> > -
> > -   /*
> > -    * By adjusting q once we determine m, we can guarantee that
> > -    * there is a complete four-digit quotient at &qspace[1] when
> > -    * we finally stop.
> > -    */
> > -   for (m = 4 - n; u[1] == 0; u++)
> > -           m--;
> > -   for (i = 4 - m; --i >= 0;)
> > -           q[i] = 0;
> > -   q += 4 - m;
> > -
> > -   /*
> > -    * Here we run Program D, translated from MIX to C and acquiring
> > -    * a few minor changes.
> > -    *
> > -    * D1: choose multiplier 1 << d to ensure v[1] >= B/2.
> > -    */
> > -   d = 0;
> > -   for (t = v[1]; t < B / 2; t <<= 1)
> > -           d++;
> > -   if (d > 0) {
> > -           __shl(&u[0], m + n, d);         /* u <<= d */
> > -           __shl(&v[1], n - 1, d);         /* v <<= d */
> > -   }
> > -   /*
> > -    * D2: j = 0.
> > -    */
> > -   j = 0;
> > -   v1 = v[1];      /* for D3 -- note that v[1..n] are constant */
> > -   v2 = v[2];      /* for D3 */
> > -   do {
> > -           register digit uj0, uj1, uj2;
> > -
> > -           /*
> > -            * D3: Calculate qhat (\^q, in TeX notation).
> > -            * Let qhat = min((u[j]*B + u[j+1])/v[1], B-1), and
> > -            * let rhat = (u[j]*B + u[j+1]) mod v[1].
> > -            * While rhat < B and v[2]*qhat > rhat*B+u[j+2],
> > -            * decrement qhat and increase rhat correspondingly.
> > -            * Note that if rhat >= B, v[2]*qhat < rhat*B.
> > -            */
> > -           uj0 = u[j + 0]; /* for D3 only -- note that u[j+...] change */
> > -           uj1 = u[j + 1]; /* for D3 only */
> > -           uj2 = u[j + 2]; /* for D3 only */
> > -           if (uj0 == v1) {
> > -                   qhat = B;
> > -                   rhat = uj1;
> > -                   goto qhat_too_big;
> > -           } else {
> > -                   unsigned long nn = COMBINE(uj0, uj1);
> > -
> > -                   qhat = nn / v1;
> > -                   rhat = nn % v1;
> > -           }
> > -           while (v2 * qhat > COMBINE(rhat, uj2)) {
> > -qhat_too_big:
> > -                   qhat--;
> > -                   if ((rhat += v1) >= B)
> > -                           break;
> > -           }
> > -           /*
> > -            * D4: Multiply and subtract.
> > -            * The variable `t' holds any borrows across the loop.
> > -            * We split this up so that we do not require v[0] = 0,
> > -            * and to eliminate a final special case.
> > -            */
> > -           for (t = 0, i = n; i > 0; i--) {
> > -                   t = u[i + j] - v[i] * qhat - t;
> > -                   u[i + j] = LHALF(t);
> > -                   t = (B - HHALF(t)) & (B - 1);
> > -           }
> > -           t = u[j] - t;
> > -           u[j] = LHALF(t);
> > -           /*
> > -            * D5: test remainder.
> > -            * There is a borrow if and only if HHALF(t) is nonzero;
> > -            * in that (rare) case, qhat was too large (by exactly 1).
> > -            * Fix it by adding v[1..n] to u[j..j+n].
> > -            */
> > -           if (HHALF(t)) {
> > -                   qhat--;
> > -                   for (t = 0, i = n; i > 0; i--) { /* D6: add back. */
> > -                           t += u[i + j] + v[i];
> > -                           u[i + j] = LHALF(t);
> > -                           t = HHALF(t);
> > -                   }
> > -                   u[j] = LHALF(u[j] + t);
> > -           }
> > -           q[j] = qhat;
> > -   } while (++j <= m);             /* D7: loop on j. */
> > -
> > -   /*
> > -    * If caller wants the remainder, we have to calculate it as
> > -    * u[m..m+n] >> d (this is at most n digits and thus fits in
> > -    * u[m+1..m+n], but we may need more source digits).
> > -    */
> > -   if (arq) {
> > -           if (d) {
> > -                   for (i = m + n; i > m; --i)
> > -                           u[i] = (u[i] >> d) |
> > -                               LHALF(u[i - 1] << (HALF_BITS - d));
> > -                   u[i] = 0;
> > -           }
> > -           tmp.ul[QUADH] = COMBINE(uspace[1], uspace[2]);
> > -           tmp.ul[QUADL] = COMBINE(uspace[3], uspace[4]);
> > -           *arq = tmp.q;
> > -   }
> > -
> > -   tmp.ul[QUADH] = COMBINE(qspace[1], qspace[2]);
> > -   tmp.ul[QUADL] = COMBINE(qspace[3], qspace[4]);
> > -   return tmp.q;
> > -}
> >
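For completeness, a usage note on __qdivrem as moved: the remainder
pointer may be NULL when only the quotient is wanted, e.g.:

    __u64 q, r;

    q = __qdivrem(1000000000000ULL, 7ULL, &r);
    /* q == 142857142857, r == 1 */

    q = __qdivrem(1000000000000ULL, 7ULL, (__u64 *)0); /* quotient only */

The file contents are otherwise untouched by this patch; it is a pure
move into arch/arm/arm32/.
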
_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel