[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG



On 29/08/2019 09:00, Roger Pau Monné wrote:
>>
>> I think we need to have this option to at least have a way to workaround
>> problem (1). Problem (2) could be solved in Dom0 kernel by invoking
>> xen_mcfg_late() earlier but before the first PCI bus enumertaion which
>> currently happens somwhere during ACPI scan I think.
>>
>> The question is what the defult value for this option should be?
> 
> Have you tested this against a variety of hardware?
Not yet, I presume it's possible in theory that there is such a box in
the wild that will misbehave if we attempt to access MCFG earlier it'd
be discovered through ACPI. Other than that I don't see a reason why it
would cause any problem. But you're right - we need to run some tests
with this option set to true on our fleet before deciding.

> Have you found any box that has a wrong MMCFG area in the MCFG static
> table? (ie: one that doesn't match the motherboard resource
> reservation in the dynamic ACPI tables?)

I think it's possible that size of the table reported from ACPI is
smaller which would be a problem if we access out-of-bounds preliminary
- hence the ability to fall back. But I'm not aware of any hardware like
that.

>>
>> ---
>>  docs/misc/xen-command-line.pandoc     |  8 +++++---
>>  xen/arch/x86/e820.c                   | 20 ++++++++++++++++++++
>>  xen/arch/x86/x86_64/mmconfig-shared.c | 34 
>> ++++++++++++++++++++--------------
>>  xen/include/asm-x86/e820.h            |  1 +
>>  4 files changed, 46 insertions(+), 17 deletions(-)
>>
>> diff --git a/docs/misc/xen-command-line.pandoc 
>> b/docs/misc/xen-command-line.pandoc
>> index 7c72e31..f13b10c 100644
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and 
>> - when supported
>>  by the platform - DomU with pass-through device assigned).
>>  
>>  ### mmcfg (x86)
>> -> `= <boolean>[,amd-fam10]`
>> +> `= List of [ <boolean>, force, amd-fam10 ]`
>>  
>> -> Default: `1`
>> +> Default: `true,force`
>>  
>> -Specify if the MMConfig space should be enabled.
>> +Specify if the MMConfig space should be enabled. If force option is 
>> specified
>> +(default) MMConfig space will be used by Xen early in boot even if it's
>> +not reserved by firmware in the E820 table.
>>  
>>  ### mmio-relax (x86)
>>  > `= <boolean> | all`
>> diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c
>> index 8e8a2c4..edef72a 100644
>> --- a/xen/arch/x86/e820.c
>> +++ b/xen/arch/x86/e820.c
>> @@ -37,6 +37,26 @@ struct e820map e820;
>>  struct e820map __initdata e820_raw;
>>  
>>  /*
>> + * This function checks if any part of the range <start,end> is mapped
>> + * with type.
>> + */
>> +int __init e820_any_mapped(u64 start, u64 end, unsigned type)
> 
> Please use uint64_t and unsigned int. Also it looks like this
> function wants to return a bool instead of an int.
> 

The problem here is that I want this function be similar to the one
below (e820_all_mapped) which is copied from Linux as well as the rest
of the file. At the same time I don't want to introduce code churn
fixing unrelated style issues. Perhaps it's better to keep style of this
function as is? Or do you think it's still better to unify the style
across both of them (perhaps in a separate commit)?

>> +{
>> +       int i;
> 
> unsigned int.
> 
>> +
>> +       for (i = 0; i < e820.nr_map; i++) {
> 
> Coding style. Some parts of this file are using the Linux coding
> style I think, but new additions should be using the Xen coding
> style, see e820_change_range_type for example.
> 
>> +               struct e820entry *ei = &e820.map[i];
> 
> const.
> 
>> +
>> +               if (type && ei->type != type)
>> +                       continue;
>> +               if (ei->addr >= end || ei->addr + ei->size <= start)
>> +                       continue;
>> +               return 1;
>> +       }
>> +       return 0;
>> +}
>> +
>> +/*
>>   * This function checks if the entire range <start,end> is mapped with type.
>>   *
>>   * Note: this function only works correct if the e820 table is sorted and
>> diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c 
>> b/xen/arch/x86/x86_64/mmconfig-shared.c
>> index cc08b52..9fc0865 100644
>> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
>> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
>> @@ -26,33 +26,34 @@
>>  
>>  #include "mmconfig.h"
>>  
>> +static bool_t __read_mostly force_mmcfg = true;
>>  unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
>>  
>>  static int __init parse_mmcfg(const char *s)
>>  {
>>      const char *ss;
>> -    int rc = 0;
>> +    int val, rc = 0;
>>  
>>      do {
>>          ss = strchr(s, ',');
>>          if ( !ss )
>>              ss = strchr(s, '\0');
>>  
>> -        switch ( parse_bool(s, ss) )
>> -        {
>> -        case 0:
>> -            pci_probe &= ~PCI_PROBE_MMCONF;
>> -            break;
>> -        case 1:
>> -            break;
>> -        default:
>> -            if ( !cmdline_strcmp(s, "amd_fam10") ||
>> -                 !cmdline_strcmp(s, "amd-fam10") )
>> +        if ( (val = parse_bool(s, ss)) >= 0 ) {
>> +            if ( val )
>> +               pci_probe |= PCI_PROBE_MMCONF;
>> +            else
>> +               pci_probe &= ~PCI_PROBE_MMCONF;
>> +        } else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 ||
>> +                    (val = parse_boolean("amd-fam10", s, ss)) >= 0 ) {
>> +            if ( val )
>>                  pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF;
>>              else
>> -                rc = -EINVAL;
>> -            break;
>> -        }
>> +                pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF;
>> +        } else if ( (val = parse_boolean("force", s, ss)) >= 0)
>> +            force_mmcfg = val;
> 
> You could also consider adding a new flag to pci_probe, ie:
> PCI_PROBE_FORCE_MMCFG.

I was thinking about that too. Will switch to that method if you think
that is better.

>> +        else
>> +            rc = -EINVAL;
>>  
>>          s = ss + 1;
>>      } while ( *ss );
>> @@ -355,6 +356,11 @@ static int __init is_mmconf_reserved(
>>                     (unsigned int)cfg->start_bus_number,
>>                     (unsigned int)cfg->end_bus_number);
>>          }
>> +    } else if (!e820_any_mapped(addr, addr + old_size - 1, 0)) {
> 
> I think it should be fine to also accept a MMCFG area that's partially
> reserved and partially a hole in the memory map?
> 
> AFAICT the proposed code will only accept MMCFG regions that are
> fully in a memory map hole.

What case particularly are you referring to? Partially covered case is
handled above where the beginning of reserved region corresponds to the
beginning of the table. Other than that (if a reserved region is
randomly located within a reported area) I think is totally ambiguous
and should be considered broken.

Igor


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.