[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 07/11] vpci/bars: add handlers to map the BARs



>>> On 27.02.18 at 10:21, <roger.pau@xxxxxxxxxx> wrote:
> On Tue, Feb 27, 2018 at 01:32:24AM -0700, Jan Beulich wrote:
>> >>> On 26.02.18 at 19:00, <roger.pau@xxxxxxxxxx> wrote:
>> > On Mon, Feb 26, 2018 at 04:20:17AM -0700, Jan Beulich wrote:
>> >> >>> On 23.01.18 at 16:07, <roger.pau@xxxxxxxxxx> wrote:
>> >> > +static void maybe_defer_map(struct domain *d, const struct pci_dev 
>> >> > *pdev,
>> >> > +                            struct rangeset *mem, bool map, bool rom)
>> >> > +{
>> >> > +    struct vcpu *curr = current;
>> >> > +
>> >> > +    if ( is_idle_vcpu(curr) )
>> >> > +    {
>> >> > +        struct map_data data = { .d = d, .map = true };
>> >> > +
>> >> > +        /*
>> >> > +         * Only used for domain construction in order to map the BARs
>> >> > +         * of devices with memory decoding enabled.
>> >> > +         */
>> >> > +        ASSERT(map && !rom);
>> >> > +        rangeset_consume_ranges(mem, map_range, &data);
>> >> 
>> >> What if this produces -ENOMEM? And despite having looked at
>> >> several revisions of this, I can't make the connection to why this
>> >> is in an is_idle_vcpu() conditional (neither the direct caller nor the
>> >> next level up make this obvious to me). There's clearly a need
>> >> for extending the comment.
>> > 
>> > I thought the comment above that mentions domain construction would be
>> > enough. I can try to expand this, maybe like:
>> > 
>> > "This function will only be called from the idle vCPU while building
>> > the domain, in which case it's not possible to defer the operation
>> > (like done in the else branch). Call rangeset_consume_ranges in order
>> > to establish the mappings right away."
>> 
>> And what again is the connection between is_idle_domain() and
>> domain construction? I think the comment belongs ahead of the
>> if(), and it needs to make that connection.
> 
> Oh, domain constructions runs on the idle vCPU, that's the relation.
> 
> "This function will only be called from the idle vCPU while building
> the domain (because Dom0 building runs on the idle vCPU), in which
> case it's not possible to defer the operation (like done in the else
> branch). Call rangeset_consume_ranges in order to establish the
> mappings right away."
> 
> Does that seem clearer if placed ahead of the if?

Can be shorter imo - the thing I didn't pay attention to is that
this is all about _dom0_ building, not general _domain_ building.
Hence perhaps:

"Dom0 building runs on the idle vCPU, in which case it's not possible
 to defer the operation (like done in the else branch). Call
 rangeset_consume_ranges in order to establish the mappings right
 away."

>> >> > +static void modify_bars(const struct pci_dev *pdev, bool map, bool rom)
>> >> > +{
>> >> > +    struct vpci_header *header = &pdev->vpci->header;
>> >> > +    struct rangeset *mem = rangeset_new(NULL, NULL, 0);
>> >> > +    const struct pci_dev *tmp;
>> >> > +    unsigned int i;
>> >> > +    int rc;
>> >> > +
>> >> > +    if ( !map )
>> >> > +        modify_decoding(pdev, false, rom);
>> >> > +
>> >> > +    if ( !mem )
>> >> > +        return;
>> >> 
>> >> Similarly here - why is it okay (or what effect will it have) to return
>> >> here when we're out of memory, but the caller won't know?
>> > 
>> > The behaviour here depends on the change to the memory decoding bit:
>> > 
>> >  - Clearing: memory decoding on device will be disabled, BARs won't be
>> >    unmapped.
>> >  - Setting: no change to device memory decoding bit, BARs won't be
>> >    mapped.
>> > 
>> > Do you think this is suitable? IMO it's fine to disable the memory
>> > decoding bit on the device and leave the memory regions mapped.
>> 
>> As long as subsequent changes to the decoding bit can't leave
>> stale mappings. Plus there needs to be a comment to explain this.
> 
> With the current approach in the unmap case there will be stale
> mappings left behind.
> 
> I guess it's better then to not modify the memory decoding bit at all
> until the operation finishes. That also rises the question of whether
> the memory decoding bit should be modified if p2m mapping/unmapping
> reports an error.
> 
> Should we also attempt to rollback failed map/unmap operations? What
> happens if the rollback also fails?

It is in particular this last question why I don't think rollback makes
sense. If there's any failure, I think the decode bit should be sticky
clear; we may want (need?) to invent some magical mechanism to
get a device back out of this state later on. But that way stale
mappings are not an immediate problem (I think).

> What about the following:
> 
>     +---------+   SUCCESS  +---------------------------------+
>     |map/unmap+------------>Change decoding or ROM enable bit|
>     +----+----+            +---------------------------------+
>          |
>          |FAILURE
>          |
> +--------v-------+ SUCCESS +---------------------------------------+
> |revert operation+--------->No change to decoding or ROM enable bit|
> +--------+-------+         +---------------------------------------+
>          |
>          |FAILURE
>          |
>    +-----v-----+
>    |Kill domain|
>    +-----------+

Possibly. Killing Dom0 is a bad thing though, if just some random
device has a problem.

>> >> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>> >> > +    {
>> >> > +        const struct vpci_bar *bar = &header->bars[i];
>> >> > +
>> >> > +        if ( !MAPPABLE_BAR(bar) ||
>> >> > +             (rom ? bar->type != VPCI_BAR_ROM
>> >> > +                  : (bar->type == VPCI_BAR_ROM && 
>> >> > !header->rom_enabled)) )
>> >> > +            continue;
>> >> 
>> >> Why does rom_enabled matter for the !rom case rather than for
>> >> the rom one? I.e.
>> >> 
>> >>         if ( !MAPPABLE_BAR(bar) ||
>> >>              (rom ? bar->type != VPCI_BAR_ROM || !header->rom_enabled
>> >>                   : bar->type == VPCI_BAR_ROM) )
>> >>             continue;
>> >> 
>> >> ?
>> > 
>> > No, for the ROM case we only want to map/unmap the ROM, so that's the
>> > only thing added to the rangeset. For the non-ROM case Xen will also
>> > map/unmap the ROM if the enable bit is set.
>> > 
>> > Your proposed code would always map/unmap the ROM into the p2m when
>> > the memory decoding bit is changed even if it's not enabled.
>> 
>> I don't understand. Taking apart the conditional I've suggested,
>> and converting to human language:
>> - if the BAR is no mappable, continue
>> - if we want to deal with ROM (rom=true), if the BAR isn't ROM
>>   or isn't enabled, continue
> 
> With the current flow rom_enabled is set after the ROM BAR mappings
> are established, which means that when modify_bars is called with
> rom=true rom_enabled is not yet set, and using the logic above the
> mappings won't be created.
> 
>> - if we want to deal with non-ROM (rom=false), if the BAR is ROM,
>>   continue
> 
> Xen also has to deal with the ROM if it's enabled when the memory
> decoding bit is toggled, hence in rom=false case the ROM also needs to
> be mapped/unampped if it's enabled.
> 
> This dependency between the memory decoding bit and the ROM enable bit
> is quite convoluted TBH.
> 
>> To me this is in line with the 2nd paragraph of your reply. It's not
>> in line with the first, which makes me wonder whether "rom" is
>> misnamed and wants to be "rom_only". Still, the question would
>> remain of why rom_enabled doesn't matter when the variable is
>> true.
> 
> Yes, rom means rom_only (ie: ROM enable bit has been toggled and
> memory decoding bit is enabled). I assume you would prefer to change
> the name to rom_only.

Yes, and perhaps provide an abridged version of your explanation
above in a comment next to the conditional.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.