[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RESEND] [RFC PATCH] xen/arm: domain_build: Ignore empty memory bank



Hi Elliott,

I was planning to review the first version today, but as you sent a new version I will answer on this one directly.

On 21/12/2020 17:30, Elliott Mitchell wrote:
Previously Xen had stopped processing Device Trees if an empty
(size == 0) memory bank was found.

Commit 5a37207df52066efefe419c677b089a654d37afc changed this behavior to
ignore such banks.  Unfortunately this means these empty nodes are
visible to code which accesses the device trees.  Have domain_build also
ignore these entries.

I am probably missing something here. The commit you pointed out will only take care of memory nodes (including reserved-memory).

It should not be possible to reach handle_device() with actual RAM. Although, it would with the reserved memory node.

Could you provide a bit more details on the issue? In particular, I am interested to see the offending node and its content.

---
This is tagged "RFC" due to issues.

Authorship of this is unclean.  In the first version (checked in, but
never sent to the list and never compiled) a different condition was used
and the comment was absent.  When examing the code it became clear a
condition identical to
5a37207df52066efefe419c677b089a654d37afc was appropriate and so I changed
to !size.  Since what the code is doing was sufficiently similar, the
comment was grabbed.
How far does this dilute authorship?  I diagnosed the bug and figured out
where to add the lines, but the amount inspired by Julien Grall gives
Julien Grall some level of claim of authorship.  Advice is needed.

You did all the investigation of the bug and the code is small enough. So I think it is fine for you to claim the authorship.


Commit 7d2b21fd36c2a47799eed71c67bae7faa1ec4272 is an outright bug for
me.  I don't know what percentage of users will experience this bug, but
being observed this quickly suggests this is major enough to be urgent
for the stable-4.14 branch.

We usually work on fix for upstream first and then backport it.

I doubt this is the only bug exposed by
5a37207df52066efefe419c677b089a654d37afc.

Are you saying that with my patch dropped, Xen will boot but with it will not?

This might actually effect
most uses of the device-tree code.  I think either the core needs to be
fixed to hide zero-sized entries from anything outside of
xen/common/device_tree.c, otherwise all uses of the device-tree core need
to be audited to ensure they ignore zero-sized entries.

The meaning of zero-sized is not the same everywhere. In the case of memory banks, it can be safely ignored.

For other devices (e.g. GIC), hiding it may make things worse because a size 0 means the node is bogus.

Notably this is
the second location where zero-size device-tree entries need to be
ignored, preemptive action should be taken before a third is found by
bugreport. >
Perhaps this fix is appropriate for the stable-4.14 branch and a proper
solution should be implemented for the main branch?

The error message which first showed was
"Unable to retrieve address %u for %s\n".  Where the number in %u was
0, this seems a poor error message.  Version 0.1 (which never got
compiled) had been:  if(!addr) continue;

Usually, dt_get_address() will fail because it can't translate the address, not because the size is 0. The more...


As I thought the 0 it was reporting was an address of 0.  Perhaps the
message should instead be:
"Unable to retrieve address for index %u of %s\n"?
---
  xen/arch/arm/domain_build.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index e824ba34b0..0b83384bd3 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1405,6 +1405,11 @@ static int __init handle_device(struct domain *d, struct 
dt_device_node *dev,
      {
          struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
          res = dt_device_get_address(dev, i, &addr, &size);
+
+        /* Some DT may describe empty bank, ignore them */
+        if ( !size )
+            continue;

... dt_device_get_address() will not set the size if the node is bogus. So you can't rely on either addr or size when res is non-zero.

handle_device() (at least on unstable) will not initialize the two variables to 0. So I guess you are lucky that you compiler zeroed them for you, but that's not the normal behavior.

So I think we first need to figure out what is the offending node and why it is dt_device_get_address() is returning an error for it.

That said, I agree that we possibly want a check size == 0 (action TBD) in map_range_to_domain() as the code would do the wrong thing for 0.

          if ( res )
          {
              printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",


Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.