[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 07/11] vpci/bars: add handlers to map the BARs



On Tue, Feb 27, 2018 at 01:32:24AM -0700, Jan Beulich wrote:
> >>> On 26.02.18 at 19:00, <roger.pau@xxxxxxxxxx> wrote:
> > On Mon, Feb 26, 2018 at 04:20:17AM -0700, Jan Beulich wrote:
> >> >>> On 23.01.18 at 16:07, <roger.pau@xxxxxxxxxx> wrote:
> >> > +bool vpci_process_pending(struct vcpu *v)
> >> > +{
> >> > +    while ( v->vpci.mem )
> >> > +    {
> >> > +        struct map_data data = {
> >> > +            .d = v->domain,
> >> > +            .map = v->vpci.map,
> >> > +        };
> >> > +
> >> > +        switch ( rangeset_consume_ranges(v->vpci.mem, map_range, &data) 
> >> > )
> >> > +        {
> >> > +        case -ERESTART:
> >> > +            return true;
> >> > +
> >> > +        default:
> >> > +            if ( v->vpci.map )
> >> > +            {
> >> > +                spin_lock(&v->vpci.pdev->vpci->lock);
> >> > +                modify_decoding(v->vpci.pdev, v->vpci.map, v->vpci.rom);
> >> > +                spin_unlock(&v->vpci.pdev->vpci->lock);
> >> > +            }
> >> > +            /* fallthrough. */
> >> > +        case -ENOMEM:
> >> 
> >> You carefully handle this error here.
> > 
> > On second thought, I'm not sure handling ENOMEM separately makes
> > sense. Unless you object I plan to remove this special casing.
> 
> Indeed I was never really happy with the special casing, but I
> did recall this being done intentionally, so I also didn't complain.

IIRC we had some discussion about the different error codes and
somehow decided that ENOMEM might be worse than others, in the sense
that less memory would be mapped and the chances of the device
actually working would be slim. IMO we can always special case errors
later, let's start with something simpler.

> >> > +static void maybe_defer_map(struct domain *d, const struct pci_dev 
> >> > *pdev,
> >> > +                            struct rangeset *mem, bool map, bool rom)
> >> > +{
> >> > +    struct vcpu *curr = current;
> >> > +
> >> > +    if ( is_idle_vcpu(curr) )
> >> > +    {
> >> > +        struct map_data data = { .d = d, .map = true };
> >> > +
> >> > +        /*
> >> > +         * Only used for domain construction in order to map the BARs
> >> > +         * of devices with memory decoding enabled.
> >> > +         */
> >> > +        ASSERT(map && !rom);
> >> > +        rangeset_consume_ranges(mem, map_range, &data);
> >> 
> >> What if this produces -ENOMEM? And despite having looked at
> >> several revisions of this, I can't make the connection to why this
> >> is in an is_idle_vcpu() conditional (neither the direct caller nor the
> >> next level up make this obvious to me). There's clearly a need
> >> for extending the comment.
> > 
> > I thought the comment above that mentions domain construction would be
> > enough. I can try to expand this, maybe like:
> > 
> > "This function will only be called from the idle vCPU while building
> > the domain, in which case it's not possible to defer the operation
> > (like done in the else branch). Call rangeset_consume_ranges in order
> > to establish the mappings right away."
> 
> And what again is the connection between is_idle_domain() and
> domain construction? I think the comment belongs ahead of the
> if(), and it needs to make that connection.

Oh, domain constructions runs on the idle vCPU, that's the relation.

"This function will only be called from the idle vCPU while building
the domain (because Dom0 building runs on the idle vCPU), in which
case it's not possible to defer the operation (like done in the else
branch). Call rangeset_consume_ranges in order to establish the
mappings right away."

Does that seem clearer if placed ahead of the if?

> >> > +static void modify_bars(const struct pci_dev *pdev, bool map, bool rom)
> >> > +{
> >> > +    struct vpci_header *header = &pdev->vpci->header;
> >> > +    struct rangeset *mem = rangeset_new(NULL, NULL, 0);
> >> > +    const struct pci_dev *tmp;
> >> > +    unsigned int i;
> >> > +    int rc;
> >> > +
> >> > +    if ( !map )
> >> > +        modify_decoding(pdev, false, rom);
> >> > +
> >> > +    if ( !mem )
> >> > +        return;
> >> 
> >> Similarly here - why is it okay (or what effect will it have) to return
> >> here when we're out of memory, but the caller won't know?
> > 
> > The behaviour here depends on the change to the memory decoding bit:
> > 
> >  - Clearing: memory decoding on device will be disabled, BARs won't be
> >    unmapped.
> >  - Setting: no change to device memory decoding bit, BARs won't be
> >    mapped.
> > 
> > Do you think this is suitable? IMO it's fine to disable the memory
> > decoding bit on the device and leave the memory regions mapped.
> 
> As long as subsequent changes to the decoding bit can't leave
> stale mappings. Plus there needs to be a comment to explain this.

With the current approach in the unmap case there will be stale
mappings left behind.

I guess it's better then to not modify the memory decoding bit at all
until the operation finishes. That also rises the question of whether
the memory decoding bit should be modified if p2m mapping/unmapping
reports an error.

Should we also attempt to rollback failed map/unmap operations? What
happens if the rollback also fails?

What about the following:

    +---------+   SUCCESS  +---------------------------------+
    |map/unmap+------------>Change decoding or ROM enable bit|
    +----+----+            +---------------------------------+
         |
         |FAILURE
         |
+--------v-------+ SUCCESS +---------------------------------------+
|revert operation+--------->No change to decoding or ROM enable bit|
+--------+-------+         +---------------------------------------+
         |
         |FAILURE
         |
   +-----v-----+
   |Kill domain|
   +-----------+

> >> > +    /*
> >> > +     * Create a rangeset that represents the current device BARs memory 
> >> > region
> >> > +     * and compare it against all the currently active BAR memory 
> >> > regions. If
> >> > +     * an overlap is found, subtract it from the region to be
> >> > +     * mapped/unmapped.
> >> > +     *
> >> > +     * NB: the rangeset uses inclusive frame numbers.
> >> > +     */
> >> > +
> >> > +    /*
> >> > +     * First fill the rangeset with all the BARs of this device or with 
> >> > the ROM
> >> > +     * BAR only, depending on whether the guest is toggling the memory 
> >> > decode
> >> > +     * bit of the command register, or the enable bit of the ROM BAR 
> >> > register.
> >> > +     */
> >> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> >> > +    {
> >> > +        const struct vpci_bar *bar = &header->bars[i];
> >> > +
> >> > +        if ( !MAPPABLE_BAR(bar) ||
> >> > +             (rom ? bar->type != VPCI_BAR_ROM
> >> > +                  : (bar->type == VPCI_BAR_ROM && 
> >> > !header->rom_enabled)) )
> >> > +            continue;
> >> 
> >> Why does rom_enabled matter for the !rom case rather than for
> >> the rom one? I.e.
> >> 
> >>         if ( !MAPPABLE_BAR(bar) ||
> >>              (rom ? bar->type != VPCI_BAR_ROM || !header->rom_enabled
> >>                   : bar->type == VPCI_BAR_ROM) )
> >>             continue;
> >> 
> >> ?
> > 
> > No, for the ROM case we only want to map/unmap the ROM, so that's the
> > only thing added to the rangeset. For the non-ROM case Xen will also
> > map/unmap the ROM if the enable bit is set.
> > 
> > Your proposed code would always map/unmap the ROM into the p2m when
> > the memory decoding bit is changed even if it's not enabled.
> 
> I don't understand. Taking apart the conditional I've suggested,
> and converting to human language:
> - if the BAR is no mappable, continue
> - if we want to deal with ROM (rom=true), if the BAR isn't ROM
>   or isn't enabled, continue

With the current flow rom_enabled is set after the ROM BAR mappings
are established, which means that when modify_bars is called with
rom=true rom_enabled is not yet set, and using the logic above the
mappings won't be created.

> - if we want to deal with non-ROM (rom=false), if the BAR is ROM,
>   continue

Xen also has to deal with the ROM if it's enabled when the memory
decoding bit is toggled, hence in rom=false case the ROM also needs to
be mapped/unampped if it's enabled.

This dependency between the memory decoding bit and the ROM enable bit
is quite convoluted TBH.

> To me this is in line with the 2nd paragraph of your reply. It's not
> in line with the first, which makes me wonder whether "rom" is
> misnamed and wants to be "rom_only". Still, the question would
> remain of why rom_enabled doesn't matter when the variable is
> true.

Yes, rom means rom_only (ie: ROM enable bit has been toggled and
memory decoding bit is enabled). I assume you would prefer to change
the name to rom_only.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.