[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 5/7] xen/riscv: introduce and initialize SBI RFENCE extension



On Tue, 2024-08-27 at 16:19 +0200, Jan Beulich wrote:
> On 21.08.2024 18:06, Oleksii Kurochko wrote:
> > --- a/xen/arch/riscv/include/asm/sbi.h
> > +++ b/xen/arch/riscv/include/asm/sbi.h
> > @@ -31,4 +64,34 @@ struct sbiret sbi_ecall(unsigned long ext,
> > unsigned long fid,
> >   */
> >  void sbi_console_putchar(int ch);
> >  
> > +/*
> > + * Check underlying SBI implementation has RFENCE
> > + *
> > + * @return true for supported AND false for not-supported
> > + */
> > +bool sbi_has_rfence(void);
> > +
> > +/*
> > + * Instructs the remote harts to execute one or more SFENCE.VMA
> > + * instructions, covering the range of virtual addresses between
> > + * [start_addr, start_addr + size).
> > + *
> > + * Returns 0 if IPI was sent to all the targeted harts
> > successfully
> > + * or negative value if start_addr or size is not valid.
> > + *
> > + * @hart_mask a cpu mask containing all the target harts.
> > + * @param start virtual address start
> > + * @param size virtual address range size
> > + */
> > +int sbi_remote_sfence_vma(const cpumask_t *cpu_mask,
> > +                          unsigned long start_addr,
> > +                          unsigned long size);
> 
> I may have asked before: Why not vaddr_t and size_t respectively?
Just to follow how this arguments are declared in RISC-V SBI spec but
considering that that the prototype of this function has been already
change I think we can also change types of start_addr and size.

> 
> > @@ -38,7 +51,265 @@ struct sbiret sbi_ecall(unsigned long ext,
> > unsigned long fid,
> >      return ret;
> >  }
> >  
> > +static int sbi_err_map_xen_errno(int err)
> > +{
> > +    switch ( err )
> > +    {
> > +    case SBI_SUCCESS:
> > +        return 0;
> > +    case SBI_ERR_DENIED:
> > +        return -EACCES;
> > +    case SBI_ERR_INVALID_PARAM:
> > +        return -EINVAL;
> > +    case SBI_ERR_INVALID_ADDRESS:
> > +        return -EFAULT;
> > +    case SBI_ERR_NOT_SUPPORTED:
> > +        return -EOPNOTSUPP;
> > +    case SBI_ERR_FAILURE:
> > +        fallthrough;
> > +    default:
> 
> What's the significance of the "fallthrough" here?
To indicate that the fallthrough from the case SBI_ERR_FAILURE and
default labels is intentional and should not be diagnosed by a compiler
that warns on fallthrough. Or it is needed only when fallthough happen
between switch's cases label ( not default label ) like in the
following code ( should it be fallthrough here? ):
    case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
    case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
    case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:

Additionally, I am considering whether the case SBI_ERR_FAILURE should
be removed or if we should find the appropriate Xen error code for this
case. I am uncertain which Xen error code from xen/errno.h would be
appropriate.

> 
> > +static unsigned long sbi_major_version(void)
> > +{
> > +    return MASK_EXTR(sbi_spec_version,
> > SBI_SPEC_VERSION_MAJOR_MASK);
> > +}
> > +
> > +static unsigned long sbi_minor_version(void)
> > +{
> > +    return MASK_EXTR(sbi_spec_version,
> > SBI_SPEC_VERSION_MINOR_MASK);
> > +}
> 
> Both functions return less than 32-bit wide values. Why unsigned long
> return types?
We had this discussion in the previous patch series. Please look here:
https://lore.kernel.org/xen-devel/253638c4-2256-4bdd-9f12-7f99e373355e@xxxxxxxx/

If it would be better I can add the comment for these functions why
they returns 'unsigned long'.

> 
> > +                               unsigned long arg4)
> > +{
> > +    struct sbiret ret = {0};
> > +    int result = 0;
> > +
> > +    switch ( fid )
> > +    {
> > +    case SBI_EXT_RFENCE_REMOTE_FENCE_I:
> > +        ret = sbi_ecall(SBI_EXT_RFENCE, fid, hmask, hbase,
> > +                        0, 0, 0, 0);
> > +        break;
> > +
> > +    case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA:
> > +    case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA:
> > +    case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA:
> > +        ret = sbi_ecall(SBI_EXT_RFENCE, fid, hmask, hbase,
> > +                        start, size, 0, 0);
> > +        break;
> > +
> > +    case SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID:
> > +    case SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID:
> > +    case SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID:
> > +        ret = sbi_ecall(SBI_EXT_RFENCE, fid, hmask, hbase,
> > +                        start, size, arg4, 0);
> > +        break;
> > +
> > +    default:
> > +        printk("%s: unknown function ID [%lu]\n",
> 
> I wonder how useful the logging in decimal of (perhaps large) unknown
> values is.
Agree, it would much better in hex.

> 
> > +               __func__, fid);
> > +        result = -EINVAL;
> > +        break;
> > +    };
> > +
> > +    if ( ret.error )
> > +    {
> > +        result = sbi_err_map_xen_errno(ret.error);
> > +        printk("%s: hbase=%lu hmask=%#lx failed (error %d)\n",
> > +               __func__, hbase, hmask, result);
> 
> Considering that sbi_err_map_xen_errno() may lose information, I'd
> recommend logging ret.error here.
By 'lose information' you mean case SBI_ERR_FAILURE?

> 
> > +static int cf_check sbi_rfence_v02(unsigned long fid,
> > +                                   const cpumask_t *cpu_mask,
> > +                                   unsigned long start, unsigned
> > long size,
> > +                                   unsigned long arg4, unsigned
> > long arg5)
> > +{
> > +    unsigned long hartid, cpuid, hmask = 0, hbase = 0, htop = 0;
> > +    int result;
> > +
> > +    /*
> > +     * hart_mask_base can be set to -1 to indicate that hart_mask
> > can be
> > +     * ignored and all available harts must be considered.
> > +     */
> > +    if ( !cpu_mask )
> > +        return sbi_rfence_v02_real(fid, 0UL, -1UL, start, size,
> > arg4);
> > +
> > +    for_each_cpu ( cpuid, cpu_mask )
> > +    {
> > +        /*
> > +        * Hart IDs might not necessarily be numbered contiguously
> > in
> > +        * a multiprocessor system, but at least one hart must have
> > a
> > +        * hart ID of zero.
> 
> Does this latter fact matter here in any way?
It doesn't, just copy from the RISC-V spec the full sentence. If it
would be better to drop the latter fact I will be happy to do that in
the next patch version.

> 
> > +        * This means that it is possible for the hart ID mapping
> > to look like:
> > +        *  0, 1, 3, 65, 66, 69
> > +        * In such cases, more than one call to
> > sbi_rfence_v02_real() will be
> > +        * needed, as a single hmask can only cover sizeof(unsigned
> > long) CPUs:
> > +        *  1. sbi_rfence_v02_real(hmask=0b1011, hbase=0)
> > +        *  2. sbi_rfence_v02_real(hmask=0b1011, hbase=65)
> > +        *
> > +        * The algorithm below tries to batch as many harts as
> > possible before
> > +        * making an SBI call. However, batching may not always be
> > possible.
> > +        * For example, consider the hart ID mapping:
> > +        *   0, 64, 1, 65, 2, 66
> 
> Just to mention it: Batching is also possible here: First (0,1,2),
> then
> (64,65,66). It just requires a different approach. Whether switching
> is
> worthwhile depends on how numbering is done on real world systems.
For sure, it's possible to do that. I was just trying to describe the
currently implemented algorithm. If you think it's beneficial to add
that information to the comment, I can include it as well.

> 
> > +static int (* __ro_after_init sbi_rfence)(unsigned long fid,
> > +                                          const cpumask_t
> > *cpu_mask,
> > +                                          unsigned long start,
> > +                                          unsigned long size,
> > +                                          unsigned long arg4,
> > +                                          unsigned long arg5);
> > +
> > +int sbi_remote_sfence_vma(const cpumask_t *cpu_mask,
> > +                          unsigned long start_addr,
> 
> To match other functions, perhaps just "start"?
It would be better, RISC-V spec is using 'start' everywhere too, at
least, for FENCE Extension.

> 
> > +int sbi_probe_extension(long extid)
> > +{
> > +    struct sbiret ret;
> > +
> > +    ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, extid,
> > +                    0, 0, 0, 0, 0);
> > +    if ( !ret.error && ret.value )
> > +        return ret.value;
> > +
> > +    return -EOPNOTSUPP;
> 
> Any reason not to use sbi_err_map_xen_errno() here?
We can, just missed that.

Thanks.

~ Oleksii

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.