|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v9 07/11] vpci/bars: add handlers to map the BARs
On Thu, Mar 15, 2018 at 06:41:00AM -0600, Jan Beulich wrote:
> >>> On 15.03.18 at 12:33, <roger.pau@xxxxxxxxxx> wrote:
> > On Wed, Mar 14, 2018 at 10:13:16AM -0600, Jan Beulich wrote:
> >> >>> On 14.03.18 at 15:04, <roger.pau@xxxxxxxxxx> wrote:
> >> > +static int maybe_defer_map(struct domain *d, struct pci_dev *pdev,
> >> > + struct rangeset *mem, bool map, bool rom)
> >> > +{
> >> > + struct vcpu *curr = current;
> >> > + int rc;
> >> > +
> >> > + if ( is_idle_vcpu(curr) )
> >> > + {
> >> > + struct map_data data = { .d = d, .map = true };
> >> > +
> >> > + /*
> >> > + * Dom0 building runs on the idle vCPU, in which case it's not
> >> > possible
> >> > + * to defer the operation (like done in the else branch). Call
> >> > + * rangeset_consume_ranges in order to establish the mappings
> >> > right
> >> > + * away.
> >> > + */
> >>
> >> For one I think this comment belongs above the if(), as that's what
> >> it explains, not the ASSERT() that follows. And then it clarifies only
> >> half of what needs clarifying: Why can't we make it here on an idle
> >> vCPU outside of Dom0 building (e.g. through a tasklet), or if we can,
> >> why is the given behavior the intended one?
> >
> > Since this seems to be causing confusion, what about using:
> >
> > system_state != SYS_STATE_active
> >
> > Instead of checking if running on the idle vpcu. Do you think that
> > would make it clearer?
>
> Yes, I think so. That would then make clear that if you moved the
> conditional into the only caller and split the function, the Dom0 one
> could even become __init.
I've splitted the function into defer_map and apply_map, and added the
following at the end of modify_bars:
if ( system_state != SYS_STATE_active )
{
/*
* Mappings might be created when building Dom0 if the memory decoding
* bit of PCI devices is enabled. In that case it's not possible to
* defer the operation, so call apply_map in order to create the
* mappings right away. Note that at build time this function will only
* be called iff the memory decoding bit is enabled, thus the operation
* will always be to establish mappings and process all the BARs.
*/
ASSERT(map && !rom_only);
return apply_map(pdev->domain, dev, mem);
}
defer_map(pdev->domain, dev, mem, map, rom_only);
Would you also like me to change the is_idle_vcpu check in map_range
to use system_state == SYS_STATE_active also?
> >> > + ASSERT(map && !rom);
> >>
> >> I can see why you assume it's not an un-mapping request (albeit
> >> I wonder whether you couldn't instead simply set .map above to
> >> the incoming value), but why the !rom part?
> >
> > This branch will only be used at Dom0 build time, when none of the
> > BARs are mapped into the p2m, so asking for an unmap in this case
> > would be wrong.
>
> Yes, that's the part I've said I understand, yet was saying that the
> code wouldn't become incorrect if you set .map to map. You don't
> explain the !rom part at all, though.
Let's see if the comment above clarifies a little bit the non deferred
map.
> >> > + */
> >> > +
> >> > + /*
> >> > + * First fill the rangeset with all the BARs of this device or with
> >> > the ROM
> >> > + * BAR only, depending on whether the guest is toggling the memory
> >> > decode
> >> > + * bit of the command register, or the enable bit of the ROM BAR
> >> > register.
> >> > + */
> >> > + for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> >> > + {
> >> > + const struct vpci_bar *bar = &header->bars[i];
> >> > +
> >> > + if ( !MAPPABLE_BAR(bar) ||
> >> > + (rom_only ? bar->type != VPCI_BAR_ROM
> >> > + : (bar->type == VPCI_BAR_ROM &&
> >> > !header->rom_enabled)) )
> >> > + continue;
> >> > +
> >> > + rc = rangeset_add_range(mem, PFN_DOWN(bar->addr),
> >> > + PFN_UP(bar->addr + bar->size - 1));
> >> > + if ( rc )
> >> > + {
> >> > + printk(XENLOG_G_WARNING
> >> > + "Failed to add [%" PRI_gfn ", %" PRI_gfn "]: %d\n",
> >> > + PFN_DOWN(bar->addr), PFN_UP(bar->addr + bar->size -
> >> > 1),
> >>
> >> I thought we had agreed that the parenthesization of tuples
> >> like this one should match meaning they want to convey. I'm
> >> having a hard time to see how PFN_UP() could ever go together
> >> with a closing square bracket.
> >
> > There's a -1 in the PFN_UP, and it's exactly what we are adding to the
> > rangeset. Ie: rangeset_add_range(..., e, e) is adding the range [e,
> > e], not [e, e).
>
> Oh, I didn't spot that - it's even worse then afaict, because you
> then also have the insertion wrong. Just consider a BAR covering
> up to a page and starting at a page boundary: You'd insert two
> pages when you mean just one.
Ug, right. This should be PFN_DOWN and then the range added and the ]
notation used is correct.
> >> > +static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
> >> > + uint32_t cmd, void *data)
> >> > +{
> >> > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> >> > + uint16_t current_cmd = pci_conf_read16(pdev->seg, pdev->bus, slot,
> >> > func,
> >> > + reg);
> >> > +
> >> > + /*
> >> > + * Let Dom0 play with all the bits directly except for the memory
> >> > + * decoding one.
> >> > + */
> >> > + if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY )
> >> > + /*
> >> > + * Ignore the error. No memory has been added or removed from
> >> > the p2m,
> >> > + * and the memory decoding has not been changed, so leave
> >> > everything
> >> > + * as-is, hoping the guest will realize and try again.
> >> > + */
> >> > + modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
> >>
> >> So that comment appears to be correct, but I wonder if the reader
> >> could get a little more assistance, as it's not exactly obvious why no
> >> p2m changes would have occurred in case of failure: modify_bars()
> >> produces all its errors before doing any mapping, and
> >> maybe_defer_map() takes the "else" branch which doesn't cause
> >> any (direct) errors. Same for the similar comment in rom_write().
> >
> > Let me expand that a little bit then to give some more context:
> >
> > /*
> > * Ignore the error. No memory has been added or removed from the p2m
> > * (because the actual p2m changes are deferred in maybe_defer_map)
> > * and the memory decoding bit has not been changed, so leave
> > * everything as-is, hoping the guest will realize and try again.
> > */
>
> That's better, but with the suggested split of maybe_defer_map()
> things would end up even less confusing (because the new text
> you suggest still has an apparent [but not actual] conflict between
> the "maybe" in the function name and what you say is happening).
Let's see if the proposed change/comment makes this easier to
understand.
Thanks, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |