[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [Qemu-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough



On Thu, Oct 09, 2014 at 01:53:16PM +0800, Chen, Tiejun wrote:
> 
> 
> On 2014/10/7 15:26, Michael S. Tsirkin wrote:
> >On Tue, Sep 30, 2014 at 10:43:09AM +0800, Chen, Tiejun wrote:
> >>On 2014/9/29 18:01, Michael S. Tsirkin wrote:
> >>>On Sun, Sep 28, 2014 at 10:59:05AM +0800, Chen, Tiejun wrote:
> >>>>On 2014/9/3 9:40, Kay, Allen M wrote:
> >>>>>
> >>>>>
> >>>>>>-----Original Message-----
> >>>>>>From: Chen, Tiejun
> >>>>>>Sent: Monday, September 01, 2014 12:50 AM
> >>>>>>To: Michael S. Tsirkin
> >>>>>>Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Kay, Allen M; qemu-devel@xxxxxxxxxx;
> >>>>>>Konrad Rzeszutek Wilk
> >>>>>>Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: 
> >>>>>>create
> >>>>>>isa bridge specific to IGD passthrough
> >>>>>>
> >>>>>>On 2014/9/1 14:05, Michael S. Tsirkin wrote:
> >>>>>>>On Mon, Sep 01, 2014 at 10:50:37AM +0800, Chen, Tiejun wrote:
> >>>>>>>>On 2014/8/31 16:58, Michael S. Tsirkin wrote:
> >>>>>>>>>On Fri, Aug 29, 2014 at 09:28:50AM +0800, Chen, Tiejun wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>On 2014/8/28 8:56, Chen, Tiejun wrote:
> >>>>>>>>>>>>>>>>>+     */
> >>>>>>>>>>>>>>>>>+    dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
> >>>>>>>>>>>>>>>>>+                            
> >>>>>>>>>>>>>>>>>"xen-igd-passthrough-isa-bridge");
> >>>>>>>>>>>>>>>>>+    if (dev) {
> >>>>>>>>>>>>>>>>>+        r = xen_host_pci_device_get(&hdev, 0, 0,
> >>>>>>>>>>>>>>>>>+ PCI_DEVFN(0x1f,
> >>>>>>>>>>>>>>>>>0), 0);
> >>>>>>>>>>>>>>>>>+        if (!r) {
> >>>>>>>>>>>>>>>>>+            pci_config_set_vendor_id(dev->config,
> >>>>>>hdev.vendor_id);
> >>>>>>>>>>>>>>>>>+            pci_config_set_device_id(dev->config,
> >>>>>>>>>>>>>>>>>+ hdev.device_id);
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>Can you, instead, implement the reverse logic, probing the card
> >>>>>>>>>>>>>>and supplying the correct device id for PCH?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>Here what is your so-called reverse logic as I already asked you
> >>>>>>>>>>>>>previously? Do you mean I should list all PCHs with a combo
> >>>>>>>>>>>>>illustrated with the vendor/device id in advance? Then look up
> >>>>>>>>>>>>>if we can find a
> >>>>>>>>>>>>
> >>>>>>>>>>>>Michael,
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>Ping.
> >>>>>>>>>>>
> >>>>>>>>>>>Thanks
> >>>>>>>>>>>Tiejun
> >>>>>>>>>>>
> >>>>>>>>>>>>Could you explain this exactly? Then I can try follow-up your
> >>>>>>>>>>>>idea ASAP if this is necessary and possible.
> >>>>>>>>>>
> >>>>>>>>>>Michel,
> >>>>>>>>>>
> >>>>>>>>>>Could you give us some explanation for your "reverse logic" when
> >>>>>>>>>>you're free?
> >>>>>>>>>>
> >>>>>>>>>>Thanks
> >>>>>>>>>>Tiejun
> >>>>>>>>>
> >>>>>>>>>So future drivers will look at device ID for the card and figure out
> >>>>>>>>>how things should work from there.
> >>>>>>>>>Old drivers still poke at device id of the chipset for this, but
> >>>>>>>>>maybe qemu can do what newer drivers do:
> >>>>>>>>>look at the card and figure out what guest should do, then present
> >>>>>>>>>the appropriate chipset id.
> >>>>>>>>>
> >>>>>>>>>This is based on what Jesse said:
> >>>>>>>>>     Practically speaking, we could probably assume specific GPU/PCH
> >>>>>>combos,
> >>>>>>>>>     as I don't think they're generally mixed across generations, 
> >>>>>>>>> though
> >>>>>>SNB
> >>>>>>>>>     and IVB did have compatible sockets, so there is the 
> >>>>>>>>> possibility of
> >>>>>>>>>     mixing CPT and PPT PCHs, but those are register identical as 
> >>>>>>>>> far as the
> >>>>>>>>>     graphics driver is concerned, so even that should be safe.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>Michael,
> >>>>>>>>
> >>>>>>>>Thanks for your explanation.
> >>>>>>>>
> >>>>>>>>>so the idea is to have a reverse table mapping GPU back to PCH.
> >>>>>>>>>Present to guest the ID that will let it assume the correct GPU.
> >>>>>>>>
> >>>>>>>>I guess you mean we should create to maintain such a table:
> >>>>>>>>
> >>>>>>>>[GPU Card: device_id(s), PCH: device_id]
> >>>>>>>>
> >>>>>>>>Then each time, instead of exposing that real PCH device id directly,
> >>>>>>>>qemu first can get the real GPU device id to lookup this table to
> >>>>>>>>present a appropriate PCH's device id to the guest.
> >>>>>>>>
> >>>>>>>>And looks here that appropriate PCH's device id is not always a that
> >>>>>>>>real PCH's device id. Right? If I'm wrong please correct me.
> >>>>>>>
> >>>>>>>Exactly: we don't really care what the PCH ID is - we only need it for
> >>>>>>>the guest driver to do the right thing.
> >>>>>>
> >>>>>>Agreed.
> >>>>>>
> >>>>>>I need to ask Allen to check if one given GPU card device id is always
> >>>>>>corresponding to one given PCH on both HSW and BDW currently. If yes, I 
> >>>>>>can
> >>>>>>do this quickly. Otherwise I need some time to establish that sort
> >>>>>>of connection.
> >>>>
> >>>>Michael,
> >>>>
> >>>>Sorry for this delay response but please keep in mind we really are in 
> >>>>this
> >>>>process.
> >>>>
> >>>>You know this involve different GPU components we have to take some time 
> >>>>to
> >>>>communicate or even discuss internal.
> >>>>
> >>>>Now I have a draft codes, could you take a look at this? I hope that 
> >>>>comment
> >>>>can help us to understand something, but if you have any question, we can
> >>>>further respond inline.
> >>>>
> >>>>---
> >>>>typedef struct {
> >>>>     uint16_t gpu_device_id;
> >>>>     uint16_t pch_device_id;
> >>>>} XenIGDDeviceIDInfo;
> >>>>
> >>>>/* In real world different GPU should have different PCH. But actually
> >>>>  * the different PCH DIDs likely map to different PCH SKUs. We do the
> >>>>  * same thing for the GPU. For PCH, the different SKUs are going to be
> >>>>  * all the same silicon design and implementation, just different
> >>>>  * features turn on and off with fuses. The SW interfaces should be
> >>>>  * consistent across all SKUs in a given family (eg LPT). But just same
> >>>>  * features may not be supported.
> >>>>  *
> >>>>  * Most of these different PCH features probably don't matter to the
> >>>>  * Gfx driver, but obviously any difference in display port connections
> >>>>  * will so it should be fine with any PCH in case of passthrough.
> >>>>  *
> >>>>  * So currently use one PCH version, 0x8c4e, to cover all HSW scenarios,
> >>>>  * 0x9cc3 for BDW.
> >>>
> >>>Pls write Haswell and Broadwell in full in comment.
> >>
> >>Are you saying I should list all PCH device ids both of HSW and BDW here
> >>like this format?
> >>
> >>HSWGT1D: 0xABCD
> >>...
> >
> >I am merely saying don't say "HSW, BDW" in comments. Say "Haswell,
> 
> Sorry I'm misunderstanding what you mean.
> 
> >Broadwell, or if you like HWS(Haswell), BDW(Broadwell).
> 
> Looks the latter is moderate.
> 
> + * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
> + * scenarios, 0x9cc3 for BDW(Broadwell).
> + */

Exactly.

> 
> >
> >>>
> >>>>  */
> >>>>static const XenIGDDeviceIDInfo xen_igd_combo_id_infos[] = {
> >>>>     /* HSW Classic */
> >>>>     {0x0402, 0x8c4e}, /* HSWGT1D, HSWD_w7 */
> >>>>     {0x0406, 0x8c4e}, /* HSWGT1M, HSWM_w7 */
> >>>>     {0x0412, 0x8c4e}, /* HSWGT2D, HSWD_w7 */
> >>>>     {0x0416, 0x8c4e}, /* HSWGT2M, HSWM_w7 */
> >>>>     {0x041E, 0x8c4e}, /* HSWGT15D, HSWD_w7 */
> >>>>     /* HSW ULT */
> >>>>     {0x0A06, 0x8c4e}, /* HSWGT1UT, HSWM_w7 */
> >>>>     {0x0A16, 0x8c4e}, /* HSWGT2UT, HSWM_w7 */
> >>>>     {0x0A26, 0x8c4e}, /* HSWGT3UT, HSWM_w7 */
> >>>>     {0x0A2E, 0x8c4e}, /* HSWGT3UT28W, HSWM_w7 */
> >>>>     {0x0A1E, 0x8c4e}, /* HSWGT2UX, HSWM_w7 */
> >>>>     {0x0A0E, 0x8c4e}, /* HSWGT1ULX, HSWM_w7 */
> >>>>     /* HSW CRW */
> >>>>     {0x0D26, 0x8c4e}, /* HSWGT3CW, HSWM_w7 */
> >>>>     {0x0D22, 0x8c4e}, /* HSWGT3CWDT, HSWD_w7 */
> >>>>     /* HSW Server */
> >>>>     {0x041A, 0x8c4e}, /* HSWSVGT2, HSWD_w7 */
> >>>>     /* HSW SRVR */
> >>>>     {0x040A, 0x8c4e}, /* HSWSVGT1, HSWD_w7 */
> >>>>     /* BSW */
> >>>>     {0x1606, 0x9cc3}, /* BDWULTGT1, BDWM_w7 */
> >>>>     {0x1616, 0x9cc3}, /* BDWULTGT2, BDWM_w7 */
> >>>>     {0x1626, 0x9cc3}, /* BDWULTGT3, BDWM_w7 */
> >>>>     {0x160E, 0x9cc3}, /* BDWULXGT1, BDWM_w7 */
> >>>>     {0x161E, 0x9cc3}, /* BDWULXGT2, BDWM_w7 */
> >>>>     {0x1602, 0x9cc3}, /* BDWHALOGT1, BDWM_w7 */
> >>>>     {0x1612, 0x9cc3}, /* BDWHALOGT2, BDWM_w7 */
> >>>>     {0x1622, 0x9cc3}, /* BDWHALOGT3, BDWM_w7 */
> >>>>     {0x162B, 0x9cc3}, /* BDWHALO28W, BDWM_w7 */
> >>>>     {0x162A, 0x9cc3}, /* BDWGT3WRKS, BDWM_w7 */
> >>>>     {0x162D, 0x9cc3}, /* BDWGT3SRVR, BDWM_w7 */
> >>>>};
> >>>>
> >>>>static void xen_igd_passthrough_pciisabridge_get_pci_device_id(Object 
> >>>>*obj,
> >>>>                                                                Visitor 
> >>>> *v,
> >>>>                                                                void 
> >>>> *opaque,
> >>>>                                                                const char
> >>>>*name,
> >>>>                                                                Error 
> >>>> **errp)
> >>>>{
> >>>>     uint32_t value = 0;
> >>>>     XenHostPCIDevice hdev;
> >>>>     int r = 0, num;
> >>>>
> >>>>     r = xen_host_pci_device_get(&hdev, 0, 0, 0x02, 0);
> >>>>     if (!r) {
> >>>>         value = hdev.device_id;
> >>>>
> >>>>         num = sizeof(xen_igd_combo_id_infos)/sizeof(uint16_t);
> >>>>         for (r = 0; r < num; r++)
> >>>>             if (value == xen_igd_combo_id_infos[r].gpu_device_id)
> >>>>                 value = xen_igd_combo_id_infos[r].pch_device_id;
> >>>
> >>>that's messy, I would use different variables for ID
> >>>of GPU and PCH.
> >>
> >>         for (r = 0; r < num; r++)
> >>             if (gpu_id == xen_igd_combo_id_infos[r].gpu_device_id)
> >>                 pch_id = xen_igd_combo_id_infos[r].pch_device_id;
> >
> >yea
> >
> >>>
> >>>>     }
> >>>>
> >>>>     visit_type_uint32(v, &value, name, errp);
> >>>
> >>>
> >>>what I was suggesting is to start with the GPU device ID
> >>>(you can get it e.g. from the config array).
> >>
> >>I think current codes is doing this:
> >>
> >>     r = xen_host_pci_device_get(&hdev, 0, 0, 0x02, 0);
> >>     if (!r) {
> >>         gpu_id = hdev.device_id;
> >>
> >>Here xen_host_pci_device_get() is a wrapper of reading pci sysfs.
> >
> >
> >Do you want the device ID of the GPU?
> 
> Yes.
> 
> >So please don't poke at it directly in sysfs.
> >Instead, find the GPU's PCIDevice and do
> >     gpu->config_read();
> >or just get from gpu->config.
> 
> Good, but how can we get gpu as pci_dev instance here? Then we can call
> pci_dev->config_read as expect.

You can't do this at host init but you can do this when
gpu is added or at machine done.

> >
> >>>Use that to look up the PCH ID in xen_igd_combo_id_infos.
> >>>If there, override the PCH ID.
> >>>
> >>>If not there, this is a new device so its driver will
> >>>not look at PCH at all, we can make it whatever or
> >>>skip it completely.
> >>
> >>I think we should return one initialized value, 0xffff, since often this
> >>represents an invalid PCI device.
> >
> >I'm not sure you really want an invalid PCI device,
> >would be nicer not to have a PCH device there at all.
> 
> Yes, I means 0xffff indicates that there's no any PCI device.
> 
> >
> >
> >>>
> >>>This seems to almost do this, however
> >>>- Why are you looking at host PCH device ID at all?
> >>>- Why don't you look at the GPU device ID?
> >>
> >>We still fix this bridge at 1f.0, and current our implementation can cover
> >>our requirement and safe. I means this bridge should not be used for other
> >>use cases, so if its still be accessed we mightn't take care of them, right?
> >
> >I was asking about the host bridge device, not guest.
> 
> Are you assuming what will happen to a new device?

Yes, the assumption is a new devices will use a new driver which does
everything through BAR, so it will not need these hacks.


> >
> >>>
> >>>
> >>>>}
> >>>>
> >>>>static void xen_igd_passthrough_isa_bridge_initfn(Object *obj)
> >>>>{
> >>>>     object_property_add(obj, "device-id", "int",
> >>>>                         
> >>>> xen_igd_passthrough_pciisabridge_get_pci_device_id,
> >>>>                         NULL, NULL, NULL, NULL);
> >>>>}
> >>>
> >>>OK and what reads this property?
> >>
> >>In sequent patch I will do something like this,
> >>
> >>@@ -464,6 +464,32 @@ static void pc_xen_hvm_init(MachineState *machine)
> >>      }
> >>  }
> >>
> >>+static void xen_igd_passthrough_isa_bridge_create(PCIBus *bus)
> >>+{
> >>+    struct PCIDevice *dev;
> >>+    Error *local_err = NULL;
> >>+    uint16_t device_id = 0xffff;
> >>+
> >>+    /* Currently IGD drivers always need to access PCH by 1f.0. */
> >>+    dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
> >>+                            "xen-igd-passthrough-isa-bridge");
> >>+
> >>+    /* Identify PCH card with its own real vendor/device ids.
> >>+     * Here that vendor id is always PCI_VENDOR_ID_INTEL.
> >>+     */
> >>+    if (dev) {
> >>+        device_id = (uint16_t)object_property_get_int(OBJECT(dev),
> >>+                                                      "device-id",
> >>+                                                      &local_err);
> >
> >cast is not needed here.
> >
> >>+        if ((!local_err) && (device_id != 0xffff)) {
> >
> >too many ().
> 
> +        device_id = object_property_get_int(OBJECT(dev), "device-id",
> +                                            &local_err);
> +        if (!local_err && device_id != 0xffff) {
> 
> >
> >>+            pci_config_set_device_id(dev->config, device_id);
> >>+            return;
> >>+        }
> >>+    }
> >>+
> >>+    fprintf(stderr, "xen set xen-igd-passthrough-isa-bridge failed\n");
> >>+}
> >>+
> >>  static void xen_igd_passthrough_pc_hvm_init(MachineState *machine)
> >>  {
> >>      PCIBus *bus;
> >>@@ -473,6 +499,7 @@ static void xen_igd_passthrough_pc_hvm_init(MachineState
> >>*machine)
> >>      bus = pci_find_primary_bus();
> >>      if (bus != NULL) {
> >>          pci_create_simple(bus, -1, "xen-platform");
> >>+        xen_igd_passthrough_isa_bridge_create(bus);
> >>      }
> >>  }
> >>  #endif
> >>
> >>Tiejun
> >>
> >
> >By the way, I wonder: would it make sense to add this code in xen_pt_initfn?
> >
> >You would then do the following there:
> >
> >     - look up GPU by device/vendor ID
> >     - if a legacy GPU is found, get the ID, translate
> >       to host bridge ID, and create the compat bridge.
> >
> >saves the hassle of using a distinct machine type for passthrough, no?
> 
> As I remember Paolo thought we'd better have a separate machine specific to
> IGD passthrough, so I'm not sure if its fine to him now.
> 
> Thanks
> Tiejun

I think the point was mostly to reserve 1f to prevent
devices from using it.
As we populate slots in order it doesn't seem to important ...
Paolo?



> >
> >>>
> >>>>Thanks
> >>>>Tiejun
> >>>>
> >>>>
> >>>>>>
> >>>>>
> >>>>>If I understand this correctly, the only difference is instead of 
> >>>>>reading PCH DevID/RevID from the host hardware, QEMU inserts those 
> >>>>>values into PCH virtual device by looking at the reverse mapping table 
> >>>>>it maintains.
> >>>>>
> >>>>>I agree the downside of doing this is the reverse mapping table may be 
> >>>>>hard to maintain.
> >>>>>
> >>>>>What is the advantage of doing this instead of having QEMU reading it 
> >>>>>from the host?  Is it to test to make sure reverse mapping methods works 
> >>>>>before it is adopted in the new drivers?
> >>>>>
> >>>>>>Thanks
> >>>>>>Tiejun
> >>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>the problem with these tables is they are hard to keep up to date
> >>>>>>>>
> >>>>>>>>Yeah. But I think currently we can just start from some modern CPU
> >>>>>>>>such as HSW and BDW, then things could be easy.
> >>>>>>>>
> >>>>>>>>Allen,
> >>>>>>>>
> >>>>>>>>Any idea to this suggestion?
> >>>>>>>>
> >>>>>>>>>as new hardware comes out, but as future hardware won't need these
> >>>>>>>>>hacks, we shall be fine.
> >>>>>>>>
> >>>>>>>>Yeah.
> >>>>>>>>
> >>>>>>>>Thanks
> >>>>>>>>Tiejun
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>Thanks
> >>>>>>>>>>>>Tiejun
> >>>>>>>>>>>>
> >>>>>>>>>>>>>matched PCH? If yes, what is that benefit you expect in
> >>>>>>>>>>>>>passthrough case? Shouldn't we pass these info to VM directly in
> >>>>>>passthrough case?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>Thanks
> >>>>>>>>>>>>>Tiejun
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>Allen
> >>>>>
> >>>>>
> >>>
> >
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.