
Re: [PATCH v6 1/3] xen/vpci: Move ecam access functions to common code


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Fri, 15 Oct 2021 07:37:10 +0000
  • Cc: Ian Jackson <iwj@xxxxxxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>
  • Delivery-date: Fri, 15 Oct 2021 07:37:27 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Jan,

> On 15 Oct 2021, at 07:29, Jan Beulich <jbeulich@xxxxxxxx> wrote:
> 
> On 14.10.2021 19:09, Bertrand Marquis wrote:
>>> On 14 Oct 2021, at 17:06, Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>> On 14.10.2021 16:49, Bertrand Marquis wrote:
>>>> @@ -305,7 +291,7 @@ static int vpci_portio_read(const struct hvm_io_handler *handler,
>>>> 
>>>>    reg = hvm_pci_decode_addr(cf8, addr, &sbdf);
>>>> 
>>>> -    if ( !vpci_access_allowed(reg, size) )
>>>> +    if ( !vpci_ecam_access_allowed(reg, size) )
>>>>        return X86EMUL_OKAY;
>>>> 
>>>>    *data = vpci_read(sbdf, reg, size);
>>>> @@ -335,7 +321,7 @@ static int vpci_portio_write(const struct hvm_io_handler *handler,
>>>> 
>>>>    reg = hvm_pci_decode_addr(cf8, addr, &sbdf);
>>>> 
>>>> -    if ( !vpci_access_allowed(reg, size) )
>>>> +    if ( !vpci_ecam_access_allowed(reg, size) )
>>>>        return X86EMUL_OKAY;
>>>> 
>>>>    vpci_write(sbdf, reg, size, data);
>>> 
>>> Why would port I/O functions call an ECAM helper? And in how far is
>>> that helper actually ECAM-specific?
>> 
>> The function was global before.
> 
> I'm not objecting to the function being global, but to the "ecam" in
> its name.

Adding ecam to the name was a request from Roger.
The rename at the port I/O call sites is just a consequence of that.

One suggestion here could be to turn vpci_ecam_access_allowed back into
vpci_access_allowed and maybe put it into vpci.h as a static inline?
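
To make the suggestion concrete, here is a rough sketch of what such a static inline could look like. The size check is taken from the hunk quoted further down; the alignment check is my assumption, based only on the helper's comment saying it checks "size and alignment". Treat it as an illustration, not as the final patch:

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of a bus-neutral helper that could live in vpci.h.
 * The size check mirrors the quoted hunk; the alignment check is an
 * assumption derived from the helper's comment ("size and alignment").
 */
static inline bool vpci_access_allowed(unsigned int reg, unsigned int len)
{
    /* Only 1, 2, 4 and 8 byte accesses are accepted. */
    if ( len != 1 && len != 2 && len != 4 && len != 8 )
        return false;

    /* Assumed: the access must be naturally aligned to its size. */
    if ( reg & (len - 1) )
        return false;

    return true;
}
```

With that shape, nothing in the helper is ECAM specific, so both the port I/O and the MMCFG paths could call it under the neutral name.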

> 
>>>> @@ -434,25 +420,8 @@ static int vpci_mmcfg_read(struct vcpu *v, unsigned long addr,
>>>>    reg = vpci_mmcfg_decode_addr(mmcfg, addr, &sbdf);
>>>>    read_unlock(&d->arch.hvm.mmcfg_lock);
>>>> 
>>>> -    if ( !vpci_access_allowed(reg, len) ||
>>>> -         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
>>>> -        return X86EMUL_OKAY;
>>> 
>>> While I assume this earlier behavior is the reason for ...
>> 
>> Yes :-)
>> 
>>> 
>>>> -    /*
>>>> -     * According to the PCIe 3.1A specification:
>>>> -     *  - Configuration Reads and Writes must usually be DWORD or smaller
>>>> -     *    in size.
>>>> -     *  - Because Root Complex implementations are not required to support
>>>> -     *    accesses to a RCRB that cross DW boundaries [...] software
>>>> -     *    should take care not to cause the generation of such accesses
>>>> -     *    when accessing a RCRB unless the Root Complex will support the
>>>> -     *    access.
>>>> -     *  Xen however supports 8byte accesses by splitting them into two
>>>> -     *  4byte accesses.
>>>> -     */
>>>> -    *data = vpci_read(sbdf, reg, min(4u, len));
>>>> -    if ( len == 8 )
>>>> -        *data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>> +    /* Ignore return code */
>>>> +    vpci_ecam_mmio_read(sbdf, reg, len, data);
>>> 
>>> ... the commented-upon ignoring of the return value, I don't think
>>> that's a good way to deal with things anymore. Instead I think
>>> *data should be written to ~0 upon failure, unless it is intended
>>> for vpci_ecam_mmio_read() to take care of that case (in which case
>>> I'm not sure I would see why it needs to return an error indicator
>>> in the first place).
>> 
>> I am not sure in the first place why this is actually ignored and just
>> returning a -1 value.
>> If an access is not right, an exception should be generated to the
>> Guest instead.
> 
> No. That's also not what happens on bare metal, at least not on x86.
> Faults cannot be raised for reasons outside of the CPU; such errors
> (if these are errors in the first place) need to be dealt with
> differently. Signaling an error on the PCI bus would be possible,
> but would leave open how that's actually to be dealt with. Instead
> bad reads return all ones, while bad writes simply get dropped.

So that behaviour is kept here on x86, and since the function is
generic I think it is right for it to return an error here. It is up to the
caller to ignore it or not.
To make this more generic I could return 0 on success and -EACCES on
failure; the caller would then handle it as it sees fit.
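
A minimal sketch of that split of responsibilities, to show what I have in mind. Everything here is illustrative: fake_cfg and fake_vpci_read are hypothetical stand-ins for a device's config space and for vpci_read(), CFG_SPACE_EXP_SIZE stands in for PCI_CFG_SPACE_EXP_SIZE, and the alignment check is assumed. The generic function only reports the failure, and the x86-style caller turns a rejected read into all ones:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define CFG_SPACE_EXP_SIZE 4096u  /* stands in for PCI_CFG_SPACE_EXP_SIZE */

/* Hypothetical backing store standing in for a device's config space. */
static uint8_t fake_cfg[CFG_SPACE_EXP_SIZE];

/* Hypothetical stand-in for vpci_read(): little-endian read of size bytes. */
static uint32_t fake_vpci_read(unsigned int reg, unsigned int size)
{
    uint32_t val = 0;

    for ( unsigned int i = 0; i < size; i++ )
        val |= (uint32_t)fake_cfg[reg + i] << (8 * i);

    return val;
}

/* Size check as in the patch; the alignment check is an assumption. */
static bool access_allowed(unsigned int reg, unsigned int len)
{
    if ( len != 1 && len != 2 && len != 4 && len != 8 )
        return false;

    return !(reg & (len - 1));
}

/* Generic part: returns 0 on success, -EACCES if the access is rejected. */
static int ecam_read_sketch(unsigned int reg, unsigned int len, uint64_t *data)
{
    if ( !access_allowed(reg, len) || (reg + len) > CFG_SPACE_EXP_SIZE )
        return -EACCES;

    /* 8-byte accesses are split into two 4-byte accesses. */
    *data = fake_vpci_read(reg, len < 4 ? len : 4);
    if ( len == 8 )
        *data |= (uint64_t)fake_vpci_read(reg + 4, 4) << 32;

    return 0;
}

/* x86-style caller: a rejected read yields all ones, as on bare metal. */
static uint64_t x86_read_sketch(unsigned int reg, unsigned int len)
{
    uint64_t data;

    if ( ecam_read_sketch(reg, len, &data) )
        data = ~(uint64_t)0;

    return data;
}
```

An Arm caller could instead forward the error to its upper layer, which is what the Arm code does today. The write side would look the same, with the x86 caller simply dropping a rejected write.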

> 
>> When we do that on arm the function is returning an error to the upper
>> layer in that case, that’s why I did keep a generic function informing the
>> caller.
> 
> While you're the Arm expert, with the above in mind I wonder what
> the actual action in that case ought to be there. Would you explain
> to me how, say, a misaligned 2-byte read that the CPU permits but
> the PCI subsystem doesn't like would be dealt with by bare metal?

The hardware will probably return a bus error as the access did not
work. But the CPU might just prevent unaligned MMIO accesses in the
first place.
@Julien: Maybe you know that one? Otherwise I can dig to make sure
I answer that right.

> 
>>>> @@ -476,13 +445,8 @@ static int vpci_mmcfg_write(struct vcpu *v, unsigned long addr,
>>>>    reg = vpci_mmcfg_decode_addr(mmcfg, addr, &sbdf);
>>>>    read_unlock(&d->arch.hvm.mmcfg_lock);
>>>> 
>>>> -    if ( !vpci_access_allowed(reg, len) ||
>>>> -         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
>>>> -        return X86EMUL_OKAY;
>>>> -
>>>> -    vpci_write(sbdf, reg, min(4u, len), data);
>>>> -    if ( len == 8 )
>>>> -        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>>> +    /* Ignore return code */
>>>> +    vpci_ecam_mmio_write(sbdf, reg, len, data);
>>> 
>>> Here ignoring is fine imo, but if you feel it is important to
>>> comment on this, then I think you need to prefer "why" over "what".
>> 
>> Agree I would just need some help on the why.
>> Now there was no comment before to explain why so I could also
>> remove the comment altogether.
> 
> The latter would be my preference.

Ok I will do that.

> 
>>>> --- a/xen/drivers/vpci/vpci.c
>>>> +++ b/xen/drivers/vpci/vpci.c
>>>> @@ -478,6 +478,66 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>>>>    spin_unlock(&pdev->vpci->lock);
>>>> }
>>>> 
>>>> +/* Helper function to check an access size and alignment on vpci space. */
>>>> +bool vpci_ecam_access_allowed(unsigned int reg, unsigned int len)
>>>> +{
>>>> +    /*
>>>> +     * Check access size.
>>>> +     *
>>>> +     * On arm32 or for 32bit guests on arm, 64bit accesses should be forbidden
>>>> +     * but as for those platform ISV register, which gives the access size,
>>>> +     * cannot have a value 3, checking this would just harden the code.
>>>> +     */
>>>> +    if ( len != 1 && len != 2 && len != 4 && len != 8 )
>>>> +        return false;
>>> 
>>> I'm not convinced talking about Arm specifically here is
>>> warranted, unless there's something there that's clearly
>>> different from all other architectures. Otherwise the comment
>>> should imo be written in more general terms.
>> 
>> Other architectures might allow this case. So this is specific to Arm.
> 
> If it really is, I consider it wrong to live in common code. If
> per-arch tweaking is necessary, and if earlier handling of the
> intercepted access doesn't already exclude "bad" cases, then a
> per-arch hook would imo be the way to go here. Given the size
> of the function I would then wonder why it doesn't remain per-
> arch in the first place.

Having this in common code was a request from Roger, and as
the code is the same I think that is OK.
I suggested above turning this into a static inline and removing “ecam”
from the name.
As for the comment, it was a request from Julien in the first place, but
that was before this was moved to common code.
I can remove the comment from the common code and put it in the
Arm vpci code instead.
@Julien: would that be acceptable for you now?
Otherwise I can remove it altogether.

> 
>>>> +int vpci_ecam_mmio_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int len,
>>>> +                         unsigned long data)
>>>> +{
>>>> +    if ( !vpci_ecam_access_allowed(reg, len) ||
>>>> +         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
>>>> +        return 0;
>>>> +
>>>> +    vpci_write(sbdf, reg, min(4u, len), data);
>>>> +    if ( len == 8 )
>>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>>> +
>>>> +    return 1;
>>>> +}
>>>> +
>>>> +int vpci_ecam_mmio_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int len,
>>>> +                        unsigned long *data)
>>>> +{
>>>> +    if ( !vpci_ecam_access_allowed(reg, len) ||
>>>> +         (reg + len) > PCI_CFG_SPACE_EXP_SIZE )
>>>> +        return 0;
>>>> +
>>>> +    /*
>>>> +     * According to the PCIe 3.1A specification:
>>>> +     *  - Configuration Reads and Writes must usually be DWORD or smaller
>>>> +     *    in size.
>>>> +     *  - Because Root Complex implementations are not required to support
>>>> +     *    accesses to a RCRB that cross DW boundaries [...] software
>>>> +     *    should take care not to cause the generation of such accesses
>>>> +     *    when accessing a RCRB unless the Root Complex will support the
>>>> +     *    access.
>>>> +     *  Xen however supports 8byte accesses by splitting them into two
>>>> +     *  4byte accesses.
>>>> +     */
>>>> +    *data = vpci_read(sbdf, reg, min(4u, len));
>>>> +    if ( len == 8 )
>>>> +        *data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>> +
>>>> +    return 1;
>>>> +}
>>> 
>>> Why do these two functions return int/0/1 instead of
>>> bool/false/true (assuming, as per above, that them returning non-
>>> void is warranted at all)?
>> 
>> This is what the mmio handlers should return to say that an access
>> was ok or not so the function stick to this standard.
> 
> Sticking to this would be okay if the functions here needed their
> address taken, such that they can be installed as hooks for a
> more general framework to invoke. The functions, however, only get
> called directly. Hence there's no reason to mirror what is in need
> of cleaning up elsewhere. I'm sure you're aware there we're in the
> (slow going) process of improving which types get used where.
> While the functions you refer to may not have undergone such
> cleanup yet, we generally expect new code to conform to the new
> model.

I am OK with renaming those to vpci_ecam_{read/write}.
Is that what you want?

Regards
Bertrand

> 
> Jan
> 


 

