
Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci



On Wed, 25 Sep 2024, Julien Grall wrote:
> On 25/09/2024 00:16, Stefano Stabellini wrote:
> > On Tue, 24 Sep 2024, Julien Grall wrote:
> > > On 24/09/2024 18:11, Edgar E. Iglesias wrote:
> > > > On Tue, Sep 24, 2024 at 05:35:20PM +0100, Julien Grall wrote:
> > > > > Hi Edgar,
> > > > > 
> > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote:
> > > > > > From: "Edgar E. Iglesias" <edgar.iglesias@xxxxxxx>
> > > > > > 
> > > > > > Reserve memory ranges and interrupt lines for an externally
> > > > > > emulated PCI controller (e.g by QEMU) dedicated to hosting
> > > > > > Virtio devices and potentially other emulated devices.
> > > > > > 
> > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xxxxxxx>
> > > > > > ---
> > > > > >     xen/include/public/arch-arm.h | 17 +++++++++++++++++
> > > > > >     1 file changed, 17 insertions(+)
> > > > > > 
> > > > > > diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> > > > > > index e19f0251a6..654b827715 100644
> > > > > > --- a/xen/include/public/arch-arm.h
> > > > > > +++ b/xen/include/public/arch-arm.h
> > > > > > @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t;
> > > > > >  #define GUEST_RAM1_BASE   xen_mk_ullong(0x0200000000) /* 952GB of RAM @ 8GB */
> > > > > >  #define GUEST_RAM1_SIZE   xen_mk_ullong(0xee00000000)
> > > > > > +/* Virtio PCI - Ordered by decreasing size to keep things aligned */
> > > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE  xen_mk_ullong(0x43000000)
> > > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE  xen_mk_ullong(0x0f000000000)
> > > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE  xen_mk_ullong(0x100000000)
> > > > > > +
> > > > > > +#define GUEST_VIRTIO_PCI_ECAM_BASE      (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \
> > > > > > +                                         GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE)
> > > > > > +#define GUEST_VIRTIO_PCI_ECAM_SIZE      xen_mk_ullong(0x10000000)
> > > > > > +
> > > > > > +#define GUEST_VIRTIO_PCI_MEM_TYPE         xen_mk_ullong(0x02000000)
> > > > > > +#define GUEST_VIRTIO_PCI_MEM_BASE       (GUEST_VIRTIO_PCI_ECAM_BASE + \
> > > > > > +                                         GUEST_VIRTIO_PCI_ECAM_SIZE)
> > > > > > +#define GUEST_VIRTIO_PCI_MEM_SIZE       xen_mk_ullong(0x00002000000)
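
(For the record, these values work out to: 4GB of prefetchable MEM at
0xf00000000, then 256MB of ECAM at 0x1000000000, then 32MB of
non-prefetchable MEM at 0x1010000000, i.e. ordered by decreasing size
as the comment says.)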
> > > > > 
> > > > > Why is this specific to virtio PCI? However, I am not entirely
> > > > > convinced we should have a second PCI hostbridge exposed to the
> > > > > guest, for a few reasons:
> > > > >     1. This requires reserving yet another range in the address
> > > > >        space (could be solved with a more dynamic layout)
> > > > >     2. From your instructions, the guest needs to explicitly do
> > > > >        a PCI rescan.
> > > 
> > > Another big advantage I forgot to mention is disaggregation. In a
> > > world where the hostbridge is handled in Xen, you could have one PCI
> > > device emulated by dom0 while another is emulated by a stub domain.
> > > 
> > > > > 
> > > > > So rather than having a second hostbridge, have you considered
> > > > > extending the existing hostbridge (implemented in Xen) to
> > > > > support a mix of physical and virtual PCI devices?
> > > > > 
> > > > 
> > > > Thanks Julien,
> > > > 
> > > > It's briefly come up in a couple of discussions but I haven't looked
> > > > carefully at it. It is a good idea and it's probably worth prototyping
> > > > to see what the gaps are in hypercall interfaces, QEMU support etc.
> > > 
> > > I also vaguely recall discussing it on xen-devel, but I couldn't
> > > find the thread... :(.
> > > 
> > > I think all the hypercalls should be there, but some plumbing will
> > > be required in Xen on Arm. QEMU should be able to request that Xen
> > > forward configuration accesses for a given PCI device (see
> > > XEN_DMOP_IO_RANGE_PCI). These accesses will then be forwarded to
> > > QEMU using IOREQ_TYPE_PCI_CONFIG.
> > > 
> > > We also have a hypercall to inject interrupts from QEMU (see
> > > XEN_DMOP_set_irq_level).
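
A rough, untested sketch of what that plumbing could look like from the
device model side, using the libxendevicemodel wrappers for the two
DMOPs Julien mentions (the helper names here are just for illustration):

    #include <stdint.h>
    #include <xendevicemodel.h>

    /* Ask Xen to forward config-space accesses for one virtual PCI
     * device (seg:bus:dev.fn) to the given ioreq server; Xen turns
     * them into IOREQ_TYPE_PCI_CONFIG requests for the device model
     * (XEN_DMOP_IO_RANGE_PCI underneath). */
    static int attach_virtual_pci_dev(xendevicemodel_handle *dmod,
                                      domid_t domid, ioservid_t ioservid,
                                      uint16_t seg, uint8_t bus,
                                      uint8_t dev, uint8_t fn)
    {
        return xendevicemodel_map_pcidev_to_ioreq_server(dmod, domid,
                                                         ioservid, seg,
                                                         bus, dev, fn);
    }

    /* Arm-only: assert then deassert the device's interrupt line
     * (XEN_DMOP_set_irq_level underneath). */
    static void pulse_irq(xendevicemodel_handle *dmod, domid_t domid,
                          uint32_t irq)
    {
        xendevicemodel_set_irq_level(dmod, domid, irq, 1);
        xendevicemodel_set_irq_level(dmod, domid, irq, 0);
    }

Whether all of this already works end-to-end on Arm is exactly the gap
analysis Edgar mentions above.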
> > 
> > Hi Julien,
> > 
> > Yes, I remember a thread about this topic on xen-devel too, when EPAM
> > suggested a similar two-hostbridge approach. I was one of the people
> > suggesting a single hostbridge at the time.
> > 
> > However, when we looked at the implementation more closely, the
> > two-hostbridge approach was easier to get up and running. It works
> > (almost) out of the box. Currently, we have the two-hostbridge solution
> > working on both ARM and x86 to enable virtio-pci to work alongside vPCI
> > in Dom0less/Hyperlaunch configurations.
> 
> I understand this is the easiest solution... However, this requires
> code in Xen that I am not yet convinced is good to have.
> 
> I am not too concerned about the MMIO range part; that can be (easily)
> solved. I am more concerned about the support for background regions
> and the fact that the OS needs to be able to rescan.
> 
> I am definitely not a PCI expert, but AFAIK it is possible for the
> guest to be notified when a PCI device is hotplugged. Why can't we use
> that?

Yes, that is the cleanest solution, and Xenia has been working on it for
the last couple of weeks. I am not sure if she has it fully functional
yet. PCI rescan is just a crude but effective way to solve the problem.
We also have a prototype of a special "flag" on the PCI root complex to
tell the guest whether the bus is ready or it should wait. In any case,
I am confident this issue can be solved well, and we were already aiming
for PCI hotplug as the final solution.
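
(For reference, the crude rescan is just the standard Linux sysfs knob
run inside the guest:

    echo 1 > /sys/bus/pci/rescan

which forces the kernel to re-enumerate the bus and pick up devices that
appeared after it booted.)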


> > While I think that a single hostbridge is better architecturally, it is
> > important to consider that virtio is moving toward a new transport
> > (virtio-msg, Bertrand is also involved) which does not require a
> > hostbridge. This new transport is key for all our use-cases as it
> > addresses safety requirements and supports AMP configurations without a
> > shared hypervisor between the frontend and backend. Edgar is one of the
> > top contributors to virtio-msg. Given this, I don't think it's
> > worthwhile to invest much effort in virtio-pci, as it will be replaced
> > soon in embedded applications.
>
> To me this raises the question: why should we have a temporary
> solution upstream then?

Having virtio-pci support as a stopgap seems beneficial, especially
since it's reasonable to expect it will be needed for 1-2 years, which
is a considerable period. This would put Xen in a good position with
respect to other hypervisors in the same space. However, I also
recognize that this implementation isn't ideal.

 

