[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support



On Mon, Jun 30, 2014 at 06:20:22PM +0800, Chen, Tiejun wrote:
> On 2014/6/30 17:55, Michael S. Tsirkin wrote:
> >On Mon, Jun 30, 2014 at 05:38:21PM +0800, Chen, Tiejun wrote:
> >>On 2014/6/30 17:05, Michael S. Tsirkin wrote:
> >>>On Mon, Jun 30, 2014 at 03:24:58PM +0800, Chen, Tiejun wrote:
> >>>>On 2014/6/30 14:48, Michael S. Tsirkin wrote:
> >>>>>On Mon, Jun 30, 2014 at 10:51:49AM +0800, Chen, Tiejun wrote:
> >>>>>>On 2014/6/26 18:03, Paolo Bonzini wrote:
> >>>>>>>Il 26/06/2014 11:18, Chen, Tiejun ha scritto:
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>- offsets 0x0000..0x0fff map to configuration space of the host MCH
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>Are you saying the config space in the video device?
> >>>>>>>
> >>>>>>>No, I am saying in a new BAR, or at some magic offset of an existing
> >>>>>>>MMIO BAR.
> >>>>>>>
> >>>>>>
> >>>>>>As I mentioned previously, the IGD guy told me we have no any unused a
> >>>>>>offset or BAR in the config space.
> >>>>>>
> >>>>>>And guy who are responsible for the native driver seems not be accept to
> >>>>>>extend some magic offset of an existing MMIO BAR.
> >>>>>>
> >>>>>>In addition I think in a short time its not possible to migrate i440fx 
> >>>>>>to
> >>>>>>q35 as a PCIe machine of xen.
> >>>>>
> >>>>>That seems like a weak motivation.  I don't see a need to get something
> >>>>>merged upstream in a short time: this seems sure to miss 2.1,
> >>>>>so you have the time to make it architecturally sound.
> >>>>>"Making existing guests work" would be a better motivation.
> >>>>
> >>>>Yes.
> >>>
> >>>So focus on this then. Existing guests will probably work
> >>>fine on a newer chipset - likely better than on i440fx.
> >>>xen management tools need to do some work to support this?
> >>>That will just give everyone more long term benefits.
> >>>
> >>>If instead we create a hack that does not resemble
> >>>any existing hardware even remotely, what's the
> >>>chance that it will not break with any future
> >>>guest modification?
> >>>
> >>>
> >>>>>Isn't this possible with an mch chipset?
> >>>>
> >>>>If you're saying q35, I mean AFAIK we have no any plan to migrate to this
> >>>>MCH in xen case.
> >>>
> >>>q35 or a newer chipset that's closer to what guests expect.
> >>>
> >>>>Additionally, I think I should follow this feature after
> >>>>q35 can work for xen scenario.
> >>>
> >>>What's stopping you?
> >>
> >>I mean if you want create an new machine based on q35, actually this is
> >>equal to start making xen to migrate to q35 now. Right? I can't image how
> >>much effort should be done.
> >
> >I don't see why you don't try. Sounds like a more robust solution to me.
> 
> As I think this is another requirement to us. I'm not sure if I have enough
> time to touch this currently.
> 
> >
> >>So this is a reason why I'm saying I'd like to follow this feature after q35
> >>can work with xen completely.
> >
> >Then we'll end up with more configurations to support, and to what end?
> >
> >>>
> >>>>>
> >>>>>
> >>>>>>So could we do this step by step:
> >>>>>>
> >>>>>>#1 phase: We just cover current qemu-xen implementation based on i44fx, 
> >>>>>>so
> >>>>>>still provide that pseudo ISA bridge at 00:1f.0 as we already did.
> >>>>>
> >>>>>By the way there is no reason to put it at 00:1f.0 specifically I think.
> >>>>>So it seems simple: create a dummy device that gets device and
> >>>>>vendor id as properties. If you really like, add an option to get values
> >>>>
> >>>>Yes, this is just what we did in [Xen-devel] [v5][PATCH 2/5] xen, gfx
> >>>>passthrough: create pseudo intel isa bridge. There, we fake this device 
> >>>>just
> >>>>at 00:1f.0.
> >>>>But you guys don't like this, and shouldn't this be just this point we
> >>>>discussing now?
> >>>>
> >>>>If you guys agree that , everything is fine.
> >>>
> >>>Actually, this isn't what you did.
> >>>Don't tie it to xen, and don't tie it to 1f.
> >>>Just make it a simple stub pci device.
> >>>Whoever wants it, creates it.
> >>>
> >>>The thing I worry about, is the chance this will break going forward.
> >>>So you created a system with 2 isa bridges.
> >>
> >>I don't set class type to claim this is an ISA bridge, just a normal PCI
> >>device.
> >>+static int create_pseudo_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice
> >>*hdev)
> >>+{
> >>+    struct PCIDevice *dev;
> >>+
> >>+    char rid;
> >>+
> >>+    /* We havt to use a simple PCI device to fake this ISA bridge
> >>+     * to avoid making some confusion to BIOS and ACPI.
> >>+     */
> >>+    dev = pci_create(bus, PCI_DEVFN(0x1f, 0),
> >>"pseudo-intel-pch-isa-bridge");
> >>+
> >>+    qdev_init_nofail(&dev->qdev);
> >>+
> >>+    pci_config_set_vendor_id(dev->config, hdev->vendor_id);
> >>+    pci_config_set_device_id(dev->config, hdev->device_id);
> >>+
> >>+    xen_host_pci_get_block(hdev, PCI_REVISION_ID, (uint8_t *)&rid, 1);
> >>+
> >>+    pci_config_set_revision(dev->config, rid);
> >>+
> >>+    XEN_PT_LOG(dev, "The pseudo Intel PCH ISA bridge created.\n");
> >>+    return 0;
> >>+}
> >
> >Then I don't see how it's supposed to work.
> >Doesn't i915 look for an isa bridge?
> >
> >         /*
> >          * The reason to probe ISA bridge instead of Dev31:Fun0 is to
> >          * make graphics device passthrough work easy for VMM, that only
> >          * need to expose ISA bridge to let driver know the real hardware
> >          * underneath. This is a requirement from virtualization team.
> >          *
> >          * In some virtualized environments (e.g. XEN), there is irrelevant
> >          * ISA bridge in the system. To work reliably, we should scan 
> > trhough
> >          * all the ISA bridge devices and check for the first match, instead
> >          * of only checking the first one.
> >          */
> >         while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
> >                 if (pch->vendor == PCI_VENDOR_ID_INTEL) {
> >                         unsigned short id = pch->device & 
> > INTEL_PCH_DEVICE_ID_MASK;
> >                         dev_priv->pch_id = id;
> >
> >                         if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {
> >                                 dev_priv->pch_type = PCH_IBX;
> >
> >etc
> >
> 
> I already post this to mainline to change as follows:
> 
> -     while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
> +     pch = pci_get_bus_and_slot(0, PCI_DEVFN(0x1f, 0));
> +     if (pch) {

I see - responded on that mail - but I thought the point is to make
existing guests run? In fact you said so explicitly.


> Please refer to this,
> 
> [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of
> check class type
> 
> Linux Native guys would like to accept this. And actually Windows always use
> devfn to detect this.

In fact I see this:

    linux 2.6.35-3.9 probes the 1st IA bridge

        no idea how would you fix this.
        try changing default class for the main bridge?

    linux 3.10 probes all ISA bridges

        requires your stub to be the isa bridge?


I don't see why are driver guys so willing to do crazy things.  They
want to match specific device/vendor id pairs, why don't they do just
that? Why poke at class, random slot numbers etc etc?



> >
> >
> >>>This is already not something that exists on real hardware.
> >>>So it might break some guests that will get confused.
> >>>Maybe we are lucky and most guests see an unfamiliar device
> >>>and ignore it. It seems believable.
> >>>
> >>>But your MCH hack overrides registers in the pci host.
> >>
> >>We just try to write *one* register we already confirm this is safe enough.
> >
> >This should go in code in form of comments:
> >document what this register does on 440fx
> >and why it's safe to override.
> >We don't see what you
> >confirmed off-line.
> 
> That offset is one specific to IGD usage, not for i440fx common. This is why
> we need to expose something in the host bridge. They're just introduced to
> support IGD.
> 
> >
> >>Other register are read-only.
> >
> >Doesn't matter, need to document these as well.
> 
> I think everything are covered in igd_pci_write()/igd_pci_write().
> 
> >
> >>>Are we lucky and there's nothing in these registers
> >>>of interest to guests? This seems much more fragile.
> >>>So please poke at the spec, and compile the list
> >>>of registers you want to touch, figure out why they are
> >>>safe to override, and put this all in code comments.
> >>>
> >>>And the same thing that applies to the isa bridge
> >>
> >>We just expose its own vendor/device ids here. We don't access to change
> >>anything in the isa bridge.
> >>
> >>>applies here too. It should work without QEMU touching
> >>>hosts' hardware.
> >>>
> >>>
> >>>
> >>>>>from sysfs: device and vendor id are world readable, so just get them
> >>>>>directly and not through xen wrappers, this way you can open the files
> >>>>>RO and not RW.
> >>>>>You seem to poke at revision as well, I don't see
> >>>>>driver looking at that - strictly necessary?
> >>>>>If yes please patch host kernel to expose that info in sysfs,
> >>>>>though we can fall back on pci config if not there.
> >>>>>
> >>>>>MCH (bridge_dev) hacks in i915 are nastier.
> >>>>>To clean them up, we really have to have an explicit driver for this
> >>>>>bridge, not a pass-through device.  Long term, the right thing to do is
> >>>>>likely to extend host driver and expose the necessary information in
> >>>>>sysfs on host kernel.
> >>>>>
> >>>>>
> >>>>
> >>>>I'm a bit confused. Any sysfs should be filled by the associated PCIe
> >>>>device, shouldn't it? So qemu still need to emulate this PCIe device
> >>>>firstly, then set properties into sysfs.
> >>>
> >>>I am talking about getting host properties into qemu.
> >>>You don't want to give qemu R/W root access to host sysfs system
> >>>of the root bridge, that's not secure.
> >>
> >>igd_pci_read()
> >>    |
> >>    + xen_host_pci_get_block()
> >>            |
> >>            + xen_host_pci_config_read(()
> >>                    |
> >>                    +  pread()
> >>
> >>So shouldn't we already expose these information via sysfs?
> >
> >That's poking at sysfs of a real device, and after opening it RW.
> 
> I don't think we can really write anything to those read-only sysfs
> interface even we open them as RW.
> 
> Tiejun
> 
> >
> >>>Avoiding read only access to filesystem is a good idea too, so it
> >>>should be possible to pass all parameters in as
> >>>device properties, and let whoever starts qemu
> >>>figure out what are reasonable values.
> >>>
> >>>>>
> >>>>>
> >>>>>>#2 phase: Now, we will choose a capability ID that won't be conflicting 
> >>>>>>with
> >>>>>>others. To do this properly, we need to get one from PCI SIG group. To 
> >>>>>>have
> >>>>>>this workable and consistently validated, this method shouldn't be virt
> >>>>>>specific. Then native driver should use the same method.
> >>>>>
> >>>>>You mean you will be able to talk sense into hardware guys?
> >>>>>I doubt that. If they could be convinced to make e.g. i915 base a
> >>>>
> >>>>We're negotiating this, so this is just our long term solution we figure 
> >>>>out
> >>>>currently.
> >>>>
> >>>>>proper BAR, why didn't they?
> >>>>
> >>>>We already have no any free BAR as we mentioned previously.
> >>>
> >>>I thought you were talking about modifying hardware?
> >>
> >>Yes.
> >>
> >>Tiejun
> >
> >And they can't figure out how to stick extra info in an existing BAR?
> >Oh well, that's hardware for you.
> >
> >>>
> >>>>>
> >>>>>
> >>>>>>So when xen work on
> >>>>>>q35 PCIe machine, we can walk this way.
> >>>>>
> >>>>>If you are emulating MCH anyway, pick one that is close
> >>>>>to what i915 driver expects. It would then work with existing
> >>>>
> >>>>Looks you guys prefer we create a new MCH anyway, right? But is it 
> >>>>necessary
> >>>>to create a new based on i440fx just for a little change?
> >>>>
> >>>>Thanks
> >>>>Tiejun
> >>>
> >>>You can inherit it. Maybe you are lucky and this happens to
> >>>work without conflicting with whatever other guests want to do.
> >>>But if you ask me, you are really just piling up hacks.
> >>>If some guest does not work on i440, you should just work on
> >>>emulating whatever it does work on.
> >>>That would have real value.
> >>>
> >>>>>devices, without new capability IDs.
> >>>>>
> >>>>>
> >>>>>>Anthony,
> >>>>>>
> >>>>>>Any comments to address this in xen case?
> >>>>>>
> >>>>>>Thanks
> >>>>>>Tiejun
> >>>>>
> >>>>>
> >>>
> >>>
> >
> >_______________________________________________
> >Xen-devel mailing list
> >Xen-devel@xxxxxxxxxxxxx
> >http://lists.xen.org/xen-devel
> >
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.