[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 5/9] vpci/header: Implement guest BAR register handlers


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Oleksandr Andrushchenko <andr2000@xxxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Wed, 8 Sep 2021 09:43:59 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=X2aKd0k+XEncRABb6WT/koWSa4++HRFGx8CZSLC/ryw=; b=dbGjh8a2Aa9nmAJX9Uo2tAWU2fkRocCXSVj8ZGyqsXpLRSZ5MpJyHeMJwJVAQm28TeYWKnSxzgNrDQbaEl7TI8JFZxOgZiBEF5t59tMl6hLKoCqpxzYCD67FskBR3AtAGLNfn929vlqOnjVwyHynKNeRFrgGkz4V0yA7VEx5zkhXP3OwC6wBuTmR4jsxRfoBwEtLffxnhJDad+7HLWRa18/oImX1PrlOejbD0DYtCU/M+7c08ggseOyqIbHzQcuNZFPkk17tvJZ5OZTkR+nntKrtolUBG7DRiGsNq6PkOBEpDCeL65oBBdo7KzeSdvXwmao+LrgzZxHVXNUPBnRQ4Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=VSFa/X+79PDEBINK7vW5RYS39EJ8wso4+uI1RozkiTUHmKBcySqoxW89V8sJti/sef79MWdSMCzdxWJHgx0ZM+iPPiePEhzJEaOXyytdv9zsUeL8tsgADslosXT5szef2IT4qEjNTDo1xTyzZpNEGibCeDwPcyHNlA0nZx36/jNopxP6dy/hqAKo0Tnawz9PkNh/x+fCMi+VX7EO7HLy/qc3Grajgp4OEeCV971s8+Gks+g0WjktZp+fFySltxUscARmMaAwOCGpWm/KyHg3iW7WRgmjxfWir4uxQxqhpb7V/VCUgIU50cwyQ7xzljFh3JilYSJi+3opTykpGHDHgw==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=epam.com;
  • Cc: "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, "roger.pau@xxxxxxxxxx" <roger.pau@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 08 Sep 2021 09:44:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHXoKxmxnIdZosa7ESbvTmPlG6KHauXFn+AgAGCLICAADFtgIAAE0qAgAEIooCAAAS6AA==
  • Thread-topic: [PATCH 5/9] vpci/header: Implement guest BAR register handlers

On 08.09.21 12:27, Jan Beulich wrote:
> On 07.09.2021 19:39, Oleksandr Andrushchenko wrote:
>> On 07.09.21 19:30, Jan Beulich wrote:
>>> On 07.09.2021 15:33, Oleksandr Andrushchenko wrote:
>>>> On 06.09.21 17:31, Jan Beulich wrote:
>>>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>> @@ -400,12 +400,72 @@ static void bar_write(const struct pci_dev *pdev, 
>>>>>> unsigned int reg,
>>>>>>     static void guest_bar_write(const struct pci_dev *pdev, unsigned int 
>>>>>> reg,
>>>>>>                                 uint32_t val, void *data)
>>>>>>     {
>>>>>> +    struct vpci_bar *bar = data;
>>>>>> +    bool hi = false;
>>>>>> +
>>>>>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>>>>>> +    {
>>>>>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>>>>>> +        bar--;
>>>>>> +        hi = true;
>>>>>> +    }
>>>>>> +    else
>>>>>> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
>>>>>> +    bar->guest_addr &= ~(0xffffffffull << (hi ? 32 : 0));
>>>>>> +    bar->guest_addr |= (uint64_t)val << (hi ? 32 : 0);
>>>>> What you store here is not the address that's going to be used,
>>>>>     as
>>>>> you don't mask off the low bits (to account for the BAR's size).
>>>>> When a BAR gets written with all ones, all writable bits get these
>>>>> ones stored. The address of the BAR, aiui, really changes to
>>>>> (typically) close below 4Gb (in the case of a 32-bit BAR), which
>>>>> is why memory / I/O decoding should be off while sizing BARs.
>>>>> Therefore you shouldn't look for the specific "all writable bits
>>>>> are ones" pattern (or worse, as you presently do, the "all bits
>>>>> outside of the type specifier are ones" one) on the read path.
>>>>> Instead mask the value appropriately here, and simply return back
>>>>> the stored value from the read path.

But in case of BAR sizing I need to actually return BAR size.

So, the comparison is the way to tell if the guest wants to read

actual (configured) BAR value or it tries to determine BAR's size.

This is why I compare and use the result as the answer to what needs

to be supplied to the guest. So, if I don't compare with 0xffffffff for the

hi part and 0xfffffff0 for the low then how do I know when to return

configured BAR or return the size?

>>>> "PCI LOCAL BUS SPECIFICATION, REV. 3.0", "IMPLEMENTATION NOTE
>>>>
>>>> Sizing a 32-bit Base Address Register Example" says, that
>>>>
>>>> "Software saves the original value of the Base Address register, writes
>>>> 0 FFFF FFFFh to the register, then reads it back."
>>>>
>>>> The same applies for 64-bit BARs. So what's wrong if I try to catch such
>>>> a write when a guest tries to size the BAR? The only difference is that
>>>> I compare as
>>>>            if ( (val & PCI_BASE_ADDRESS_MEM_MASK_32) == 
>>>> PCI_BASE_ADDRESS_MEM_MASK_32 )
>>>> which is because val in the question has lower bits cleared.
>>> Because while this matches what the spec says, it's not enough to
>>> match how hardware behaves.
>> But we should implement as the spec says, not like buggy hardware behaves
> The behavior I'm describing doesn't violate the spec. There merely is
> more to it than what the spec says, or one might also view it the way
> that the spec doesn't use the best way of expressing things.

Well, the spec explicitly says "write 0xffffffff and read back"

I can't see any way it can be read differently

>
>>>    Yet you want to mimic hardware behavior
>>> as closely as possible here. There is (iirc) at least one other
>>> place in the source tree were we had to adjust a similar initial
>>> implementation to be closer to one expected by guests,
>> Could you please point me to that code so I can consult and possibly
>> re-use the approach?
> I only recall the fact; this might have been hvmloader, vpci, or yet
> somewhere else. I'm sorry.
No problem
>
>>>>>> @@ -522,6 +582,13 @@ static int add_bar_handlers(struct pci_dev *pdev, 
>>>>>> bool is_hwdom)
>>>>>>                 if ( rc )
>>>>>>                     return rc;
>>>>>>             }
>>>>>> +        /*
>>>>>> +         * It is neither safe nor secure to initialize guest's view of 
>>>>>> the BARs
>>>>>> +         * with real values which are used by the hardware domain, so 
>>>>>> assign
>>>>>> +         * all zeros to guest's view of the BARs, so the guest can 
>>>>>> perform
>>>>>> +         * proper PCI device enumeration and assign BARs on its own.
>>>>>> +         */
>>>>>> +        bars[i].guest_addr = 0;
>>>>> I'm afraid I don't understand the comment: Without memory decoding
>>>>> enabled, the BARs are simple registers (with a few r/o bits).
>>>> My first implementation was that bar->guest_addr was initialized with
>>>> the value of bar->addr (physical BAR value), but talking on IRC with
>>>> Roger he suggested that this might be a security issue to let guest
>>>> a hint about physical values, so then I changed the assignment to be 0.
>>> Well, yes, that's certainly correct. It's perhaps too unobvious to me
>>> why one may want to use the host value here in the first place. It
>>> simply has no meaning here.
>> Do you want me to remove the comment?
> Yes. I wonder whether the assignment is necessary in the first place:
> I'd somehow expect the structure to come from xzalloc(). Albeit I
> guess this function can be run through more than once without freeing
> the structure and then allocating is anew?

I'll check that and if the structure is already zeroed then I'll remove both the

assignment and the comment

>
> Jan
>
Thank you,

Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.