[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RESEND] [RFC PATCH] xen/arm: domain_build: Ignore empty memory bank



On Mon, Dec 21, 2020 at 06:28:35PM +0000, Julien Grall wrote:
> I was planning to review the first version today, but as you sent a new 
> version I will answer on this one directly.

Mostly the commentary has been increasing, not so much the commit.

> On 21/12/2020 17:30, Elliott Mitchell wrote:
> > Previously Xen had stopped processing Device Trees if an empty
> > (size == 0) memory bank was found.
> > 
> > Commit 5a37207df52066efefe419c677b089a654d37afc changed this behavior to
> > ignore such banks.  Unfortunately this means these empty nodes are
> > visible to code which accesses the device trees.  Have domain_build also
> > ignore these entries.
> 
> I am probably missing something here. The commit you pointed out will 
> only take care of memory nodes (including reserved-memory).
> 
> It should not be possible to reach handle_device() with actual RAM. 
> Although, it would with the reserved memory node.
> 
> Could you provide a bit more details on the issue? In particular, I am 
> interested to see the offending node and its content.

Define "see" in this context.  The message which shows up is:
"Unable to retrieve address 0 for /scb/pcie@7d500000/pci@1,0/usb@1,0"

According to Linux "name", "reg" and "resets" exist there.

(which as stated "0" looks suspiciously like NULL, rather than an
index)

> > I doubt this is the only bug exposed by
> > 5a37207df52066efefe419c677b089a654d37afc.
> 
> Are you saying that with my patch dropped, Xen will boot but with it 
> will not?

Hmm, realizing that I'd found a fix, but not actually tested to
confirm...   Seems I had it wrong, dropping
5a37207df52066efefe419c677b089a654d37afc doesn't fix the issue, so that
is NOT the cause.

Sorry about misattributing the cause.  Now to start doing builds to
identify the cause...  (this of course will take a bit)

Hmm, this is being rather more difficult to identify than I expected.


> > As I thought the 0 it was reporting was an address of 0.  Perhaps the
> > message should instead be:
> > "Unable to retrieve address for index %u of %s\n"?
> > ---
> >   xen/arch/arm/domain_build.c | 5 +++++
> >   1 file changed, 5 insertions(+)
> > 
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index e824ba34b0..0b83384bd3 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -1405,6 +1405,11 @@ static int __init handle_device(struct domain *d, 
> > struct dt_device_node *dev,
> >       {
> >           struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >           res = dt_device_get_address(dev, i, &addr, &size);
> > +
> > +        /* Some DT may describe empty bank, ignore them */
> > +        if ( !size )
> > +            continue;
> 
> ... dt_device_get_address() will not set the size if the node is bogus. 
> So you can't rely on either addr or size when res is non-zero.
> 
> handle_device() (at least on unstable) will not initialize the two 
> variables to 0. So I guess you are lucky that you compiler zeroed them 
> for you, but that's not the normal behavior.
> 
> So I think we first need to figure out what is the offending node and 
> why it is dt_device_get_address() is returning an error for it.
> 
> That said, I agree that we possibly want a check size == 0 (action TBD) 
> in map_range_to_domain() as the code would do the wrong thing for 0.

Seems like returning to a working build without that commit is going to
take a bit.  :-(


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@xxxxxxx  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445





 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.