
Re: [Xen-devel] [PATCH v2] x86/hvmloader: round up memory BAR size to 4K



On Wed, Jan 22, 2020 at 11:27:24AM +0100, Jan Beulich wrote:
> On 21.01.2020 17:57, Roger Pau Monné wrote:
> > On Tue, Jan 21, 2020 at 05:15:20PM +0100, Jan Beulich wrote:
> >> On 21.01.2020 16:57, Roger Pau Monné wrote:
> >>> On Tue, Jan 21, 2020 at 11:43:58AM +0100, Jan Beulich wrote:
> >>>> On 21.01.2020 11:29, Roger Pau Monné wrote:
> >>>>> So I'm not sure how to progress with this patch, are we fine with
> >>>>> those limitations?
> >>>>
> >>>> I'm afraid this depends on ...
> >>>>
> >>>>> As I said, Xen hasn't got enough knowledge to correctly isolate the
> >>>>> BARs, and hence we have to rely on dom0 DTRT. We could add checks in
> >>>>> Xen to make sure no BARs share a page, but it's a non-trivial amount
> >>>>> of scanning and sizing each possible BAR on the system.
> >>>>
> >>>> ... whether Dom0 actually "DTRT", which in turn is complicated by there
> >>>> not being a specific Dom0 kernel incarnation to check against. Perhaps
> >>>> rather than having Xen check _all_ BARs, Xen or the tool stack could
> >>>> check BARs of devices about to be handed to a guest? Perhaps we need to
> >>>> pass auxiliary information to hvmloader to be able to judge whether a
> >>>> BAR shares a page with another one? Perhaps there also needs to be a
> >>>> way for hvmloader to know what offset into a page has to be maintained
> >>>> for any particular BAR, as follows from Jason's recent reply?
> >>>
> >>> Linux has an option to force resource alignment (as reported by
> >>> Jason), maybe we could force all BARs to be aligned to page size in
> >>> order to be passed through?
> >>>
> >>> That would make it easier to check (as Xen/Qemu would only need to
> >>> assert that the BAR address is aligned), and won't require much extra
> >>> work in Xen apart from the check itself.
> >>>
> >>> Do you think this would be an acceptable solution?
> >>
> >> In principle yes, but there are loose ends:
> >> - What do you mean by "we could force"? We have no control over the
> >>   Dom0 kernel.
> > 
> > I should rephrase:
> > 
> > ... maybe we should require dom0 to align all memory BARs to page size
> > in order to be passed through?
> > 
> > Ie: Xen should refuse to pass through any memory BAR that's not page
> > aligned. How the alignment is accomplished is out of the scope of Xen,
> > as long as memory BARs are aligned.
> 
> That's an acceptable model, as long as it wouldn't typically break
> existing configurations, and as long as for those who we would
> break there are easy to follow steps to unbreak their setups.

Jason, do you think you could take a stab at adding a check in order
to make sure memory BAR addresses are 4K aligned when assigning a
device to a guest?
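
Something along these lines is what I have in mind. Untested sketch
only, to show the shape of the check: the struct and helper names are
made up for illustration and don't correspond to any existing Xen or
toolstack interface.

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096U

struct mem_bar {
    uint64_t addr;      /* BAR base address as programmed by dom0 */
    uint64_t size;      /* BAR size obtained from sizing the BAR */
};

/* Return true if every memory BAR of the device starts on a 4K boundary. */
static bool mem_bars_page_aligned(const struct mem_bar *bars, unsigned int nr)
{
    unsigned int i;

    for ( i = 0; i < nr; i++ )
        if ( bars[i].addr & (PAGE_SIZE - 1) )
            /* Misaligned memory BAR: refuse to assign the device. */
            return false;

    return true;
}

Whether such a check lives in Xen at device assignment time or in the
toolstack doesn't matter much; the point is simply to refuse
assignment when a memory BAR isn't 4K aligned.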

> >> - What about non-Linux Dom0?
> > 
> > Other OSes would have to provide similar functionality in order to
> > align the memory BARs. Right now Linux is the only dom0 that supports
> > PCI passthrough AFAIK.
> > 
> >> Also, apart from extra resource (address space) consumption,
> > 
> > The PCI spec actually recommends memory BARs to be at least of page
> > size, but that's not a strict requirement. I would hope there aren't
> > that many devices with memory BARs smaller than a page.
> 
> I've simply gone and grep-ed all the stored lspci output I have
> for some of the test systems I have here:
> 
> 0/12
> 3/31 (all 4k-aligned)
> 6/13 (all 4k-aligned)
> 3/12
> 6/19 (all 4k-aligned)
> 3/7 (all 4k-aligned)

What does the X/Y at the beginning of each line stand for?

> This is without regard to what specific devices these are, and
> hence whether there would be any point in wanting to pass it to
> a guest in the first place. I'd like to note though that there
> are a fair amount of USB controllers among the ones with BARs
> smaller than a page's worth.
> 
> >> what's
> >> the point of forcing a single device's BARs to separate pages?
> > 
> > Makes the placement logic in hvmloader easier IMO, and I don't think
> > that would be such a waste of space since I expect most devices will
> > follow the PCI spec recommendation and round up memory BAR sizes to a
> > page size.
> 
> Especially for devices with very little space needed (serial
> cards with one port per BAR, for example) the waste may be
> noticeable.

But you can only have 6 BARs per device, so unless you have a huge
number of USB or serial cards (as those tend to be the ones with
memory BARs < 4K) it shouldn't be a concern IMO.
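
To put a rough number on it (my own back-of-the-envelope figure, not
measured anywhere): even a device implementing all 6 BARs as memory
BARs smaller than 4K would waste at most 6 * 4KiB = 24KiB of MMIO
address space when rounded up, and such devices usually implement
only one or two of those BARs anyway.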

> >> (I'm
> >> assuming here that hvmloader would have a way to know of the
> >> potentially resulting non-zero offsets into a page. And I'm still
> >> puzzled that the lack thereof hasn't been reported as a bug by
> >> anyone, afaik.)
> > 
> > As said above I would like to think that most devices have memory BARs
> > at least of page size, as recommended by the PCI spec, and hence
> > that's why we haven't got any reports so far.
> 
> I'm curious about this recommendation, as what size a page is
> varies across CPU architectures, and PCI devices shouldn't
> typically be tied to specific CPUs (unless of course they come
> as part of ones, but such devices are rarely ones you may want
> to hand to a guest). Is there really a recommendation towards
> BAR size, not towards BAR placement?

This is from the PCI local bus specification 3.0, section 6.2.5.1.

"This design implies that all address spaces used are a power of two
in size and are naturally aligned. Devices are free to consume more
address space than required, but decoding down to a 4 KB space for
memory is suggested for devices that need less than that amount. For
instance, a device that has 64 bytes of registers to be mapped into
Memory Space may consume up to 4 KB of address space in order to
minimize the number of bits in the address decoder."

So while the reasoning is not related to isolation, it's still a
recommendation of the spec. I should have said 4K rather than page
size in my previous replies: as you say, the spec is not tied to any
CPU architecture, so framing it in terms of page size was wrong.
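
FWIW, the power-of-two size / natural alignment property quoted above
is also what the usual BAR sizing sequence relies on. Rough sketch
for a 32-bit memory BAR (untested; pci_readl/pci_writel stand in for
whatever config space accessors are at hand and aren't meant to match
hvmloader's helpers exactly):

uint32_t bar_size(unsigned int bdf, unsigned int reg)
{
    uint32_t orig = pci_readl(bdf, reg);
    uint32_t mask;

    pci_writel(bdf, reg, ~0u);      /* write all ones to the BAR */
    mask = pci_readl(bdf, reg);     /* read back the decoded bits */
    pci_writel(bdf, reg, orig);     /* restore the original value */

    mask &= ~0xfU;                  /* strip the memory BAR flag bits */

    return mask ? -mask : 0;        /* size == lowest decoded address bit */
}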

Thanks, Roger.


 

