[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH][XSA-126] xen: limit guest control of PCI command register



CC'ing the author of the patch and xen-devel.
FYI I think that Jan is going to be on vacation for a couple of weeks.

On Wed, 1 Apr 2015, Michael S. Tsirkin wrote:
> On Tue, Mar 31, 2015 at 03:18:03PM +0100, Stefano Stabellini wrote:
> > From: Jan Beulich <jbeulich@xxxxxxxx>
> > 
> > Otherwise the guest can abuse that control to cause e.g. PCIe
> > Unsupported Request responses (by disabling memory and/or I/O decoding
> > and subsequently causing [CPU side] accesses to the respective address
> > ranges), which (depending on system configuration) may be fatal to the
> > host.
> > 
> > This is CVE-2015-2756 / XSA-126.
> > 
> > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> > Reviewed-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> > Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> 
> The patch description seems somewhat incorrect to me.
> UR should not be fatal to the system, and it's not platform
> specific.

I think that people have been able to repro this, but I'll let Jan
comment on it.


> In particular, there could be more reasons for devices
> to generate URs, for example, if they get a transaction
> during FLR. I don't think we ever tried to prevent this.

That cannot be triggered by guest behaviour.


> Here's the description from pci express spec:
> 
> IMPLEMENTATION NOTE
> Software UR Reporting Compatibility with 1.0a Devices
> 
>       With 1.0a device Functions, 96 if the Unsupported Request Reporting 
> Enable bit is set, the Function
>       when operating as a Completer will send an uncorrectable error Message 
> (if enabled) when a UR
>       error is detected. On platforms where an uncorrectable error Message is 
> handled as a System Error,
>       this will break PC-compatible Configuration Space probing, so 
> software/firmware on such
>       platforms may need to avoid setting the Unsupported Request Reporting 
> Enable bit.
>       With device Functions implementing Role-Based Error Reporting, setting 
> the Unsupported Request
>       Reporting Enable bit will not interfere with PC-compatible 
> Configuration Space probing, assuming
>       that the severity for UR is left at its default of non-fatal. However, 
> setting the Unsupported Request
>       Reporting Enable bit will enable the Function to report UR errors 
> detected with posted Requests,
>       helping avoid this case for potential silent data corruption.
>       On platforms where robust error handling and PC-compatible 
> Configuration Space probing is
>       required, it is suggested that software or firmware have the 
> Unsupported Request Reporting Enable
>       bit Set for Role-Based Error Reporting Functions, but clear for 1.0a 
> Functions. Software or
>       firmware can distinguish the two classes of Functions by examining the 
> Role-Based Error Reporting
>       bit in the Device Capabilities register.
> 
> 
> What I think you have is a very old 1.0a system, and host set Unsupported
> Request Reporting Enable.
> 
> The right thing is for host kernel to do is what the note says, and disable UR
> reporting for this system.
> 
> As a work around for broken kernels, this is OK I guess, though
> guest and host being out of sync is always a risk.
> But I think it's a good idea to add documentation explaining
> this, and work on the correct fix in linux, before we
> add workarounds.

There can be guest kernels other than Linux, we cannot fix them all, and
we cannot allow PCI passthrough only with Linux guests.


> > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> > index f2893b2..d095c08 100644
> > --- a/hw/xen/xen_pt.c
> > +++ b/hw/xen/xen_pt.c
> > @@ -388,7 +388,7 @@ static const MemoryRegionOps ops = {
> >      .write = xen_pt_bar_write,
> >  };
> >  
> > -static int xen_pt_register_regions(XenPCIPassthroughState *s)
> > +static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t 
> > *cmd)
> >  {
> >      int i = 0;
> >      XenHostPCIDevice *d = &s->real_device;
> > @@ -406,6 +406,7 @@ static int 
> > xen_pt_register_regions(XenPCIPassthroughState *s)
> >  
> >          if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
> >              type = PCI_BASE_ADDRESS_SPACE_IO;
> > +            *cmd |= PCI_COMMAND_IO;
> >          } else {
> >              type = PCI_BASE_ADDRESS_SPACE_MEMORY;
> >              if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
> > @@ -414,6 +415,7 @@ static int 
> > xen_pt_register_regions(XenPCIPassthroughState *s)
> >              if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64) {
> >                  type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> >              }
> > +            *cmd |= PCI_COMMAND_MEMORY;
> >          }
> >  
> >          memory_region_init_io(&s->bar[i], OBJECT(s), &ops, &s->dev,
> > @@ -638,6 +640,7 @@ static int xen_pt_initfn(PCIDevice *d)
> >      XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, d);
> >      int rc = 0;
> >      uint8_t machine_irq = 0;
> > +    uint16_t cmd = 0;
> >      int pirq = XEN_PT_UNASSIGNED_PIRQ;
> >  
> >      /* register real device */
> > @@ -672,7 +675,7 @@ static int xen_pt_initfn(PCIDevice *d)
> >      s->io_listener = xen_pt_io_listener;
> >  
> >      /* Handle real device's MMIO/PIO BARs */
> > -    xen_pt_register_regions(s);
> > +    xen_pt_register_regions(s, &cmd);
> >  
> >      /* reinitialize each config register to be emulated */
> >      if (xen_pt_config_init(s)) {
> > @@ -736,6 +739,11 @@ static int xen_pt_initfn(PCIDevice *d)
> >      }
> >  
> >  out:
> > +    if (cmd) {
> > +        xen_host_pci_set_word(&s->real_device, PCI_COMMAND,
> > +                              pci_get_word(d->config + PCI_COMMAND) | cmd);
> > +    }
> > +
> 
> Is this writing to host device, forcing cmd and io bits to enabled simply
> because they are present? If yes, I think that's wrong: you don't know whether
> bios enabled them, if it didn't host BARs might conflict with other devices,
> and this will crash host.  I don't see why you are touching host command
> register at all.  The point in the description is to avoid touching host.

pciback clears the command register when seizing the device:

pcistub_seize() -> pcistub_init_device() -> xen_pcibk_reset_device()


> >      memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
> >      memory_listener_register(&s->io_listener, &address_space_io);
> >      XEN_PT_LOG(d,
> > diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> > index d99c22e..95a51db 100644
> > --- a/hw/xen/xen_pt_config_init.c
> > +++ b/hw/xen/xen_pt_config_init.c
> > @@ -286,23 +286,6 @@ static int 
> > xen_pt_irqpin_reg_init(XenPCIPassthroughState *s,
> >  }
> >  
> >  /* Command register */
> > -static int xen_pt_cmd_reg_read(XenPCIPassthroughState *s, XenPTReg 
> > *cfg_entry,
> > -                               uint16_t *value, uint16_t valid_mask)
> > -{
> > -    XenPTRegInfo *reg = cfg_entry->reg;
> > -    uint16_t valid_emu_mask = 0;
> > -    uint16_t emu_mask = reg->emu_mask;
> > -
> > -    if (s->is_virtfn) {
> > -        emu_mask |= PCI_COMMAND_MEMORY;
> > -    }
> > -
> > -    /* emulate word register */
> > -    valid_emu_mask = emu_mask & valid_mask;
> > -    *value = XEN_PT_MERGE_VALUE(*value, cfg_entry->data, ~valid_emu_mask);
> > -
> > -    return 0;
> > -}
> >  static int xen_pt_cmd_reg_write(XenPCIPassthroughState *s, XenPTReg 
> > *cfg_entry,
> >                                  uint16_t *val, uint16_t dev_value,
> >                                  uint16_t valid_mask)
> > @@ -310,18 +293,13 @@ static int 
> > xen_pt_cmd_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
> >      XenPTRegInfo *reg = cfg_entry->reg;
> >      uint16_t writable_mask = 0;
> >      uint16_t throughable_mask = 0;
> > -    uint16_t emu_mask = reg->emu_mask;
> > -
> > -    if (s->is_virtfn) {
> > -        emu_mask |= PCI_COMMAND_MEMORY;
> > -    }
> >  
> >      /* modify emulate register */
> >      writable_mask = ~reg->ro_mask & valid_mask;
> >      cfg_entry->data = XEN_PT_MERGE_VALUE(*val, cfg_entry->data, 
> > writable_mask);
> >  
> >      /* create value for writing to I/O device register */
> > -    throughable_mask = ~emu_mask & valid_mask;
> > +    throughable_mask = ~reg->emu_mask & valid_mask;
> >  
> >      if (*val & PCI_COMMAND_INTX_DISABLE) {
> >          throughable_mask |= PCI_COMMAND_INTX_DISABLE;
> > @@ -603,9 +581,9 @@ static XenPTRegInfo xen_pt_emu_reg_header0[] = {
> >          .size       = 2,
> >          .init_val   = 0x0000,
> >          .ro_mask    = 0xF880,
> > -        .emu_mask   = 0x0740,
> > +        .emu_mask   = 0x0743,
> >          .init       = xen_pt_common_reg_init,
> > -        .u.w.read   = xen_pt_cmd_reg_read,
> > +        .u.w.read   = xen_pt_word_reg_read,
> >          .u.w.write  = xen_pt_cmd_reg_write,
> >      },
> >      /* Capabilities Pointer reg */
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.