
Re: Metadata and signalling channels for Zephyr virtio-backends on Xen



Stefano Stabellini <stefano.stabellini@xxxxxxxxxx> writes:

> On Mon, 7 Feb 2022, Alex Bennée wrote:
>> Hi Stefano,
>> 
>> Vincent gave an update on his virtio-scmi work at the last Stratos sync
>> call and the discussion moved onto next steps.
>
> Hi Alex,
>
> I don't know the specifics of virtio-scmi, but if it is about power,
> clocks, reset, etc. like the original SCMI protocol, then virtio-scmi is
> likely going to be very different from all the other virtio frontends
> and backends. That's because SCMI requires a full view of the system,
> which is different from something like virtio-net that is limited to the
> emulation of 1 device. For this reason, it is likely that the
> virtio-scmi backend would be a better fit in Xen itself, rather than run
> in userspace inside a VM.

That may be a good solution for Xen, but I still think it's worthwhile
being able to package SCMI in a VM for other hypervisors. We just
happen to be using Xen as a convenient type-1 example.

Vincent's SCMI server code is portable anyway and can reside in a
Zephyr app, a firmware blob or a userspace vhost-user client.

> FYI, a good and promising approach to handle both SCMI and SCPI is the
> series recently submitted by EPAM to mediate SCMI and SCPI requests in
> Xen: https://marc.info/?l=xen-devel&m=163947444032590
>
> (Another "special" virtio backend is virtio-iommu for similar reasons:
> the guest p2m address mappings and also the IOMMU drivers are in Xen.
> It is not immediately clear whether a virtio-iommu backend would need to
> be in Xen or run as a process in dom0/domU.)
>
> On the other hand, for all the other "normal" protocols (e.g.
> virtio-net, virtio-block, etc.) the backend would naturally run as a
> process in dom0 or domU (e.g. QEMU in Dom0) as one would expect.

Can domUs not be given particular access to HW they might want to
tweak? I assume at some point a block device backend needs to actually
talk to real HW to store the blocks (even if in most cases it would be
a kernel doing the HW access on its behalf).

>> Currently the demo setup
>> is intermediated by a double-ended vhost-user daemon running on the
>> devbox, acting as a go-between for a number of QEMU instances
>> representing the front and back-ends. You can view the architecture
>> in Vincent's diagram here:
>> 
>>   
>> https://docs.google.com/drawings/d/1YSuJUSjEdTi2oEUq4oG4A9pBKSEJTAp6hhcHKKhmYHs/edit?usp=sharing
>> 
>> The key virtq handling is done over the special carve-outs of shared
>> memory between the front-end and back-end. However, the signalling is
>> currently over a virtio device on the backend. This is useful for the
>> PoC, but obviously in a real system we don't have a hidden POSIX
>> system acting as a go-between, not to mention the additional latency
>> it causes with all those context switches.
>> 
>> I was hoping we could get some more of the Xen experts to the next
>> Stratos sync (17th Feb) to go over approaches for hosting this
>> properly on Xen. From my recollection of last week (Vincent, please
>> correct me if I'm wrong), the issues that need solving are:
>
> Unfortunately I have a regular conflict which prevents me from being
> able to join the Stratos calls. However, I can certainly make myself
> available for one call (unless something unexpected comes up).
>
>
>>  * How to handle configuration steps as FE guests come up
>> 
>> The SCMI server will be a long-running persistent backend because it
>> is managing real HW resources. However, the guests may be ephemeral
>> (or just restarted), so we can't just hard-code everything in a DTB.
>> While the virtio negotiation in the config space covers most things,
>> we still need information like where in the guest's address space the
>> shared memory lives and at what offset into that the queues are
>> created. As far as I'm aware the canonical source of domain
>> information is XenStore
>> (https://wiki.xenproject.org/wiki/XenStore), but this relies on a
>> Dom0-type approach. Is there an alternative for dom0less systems, or
>> do we need a dom0-light approach, for example using STR-21 (Ensure
>> Zephyr can run cleanly as a Dom0 guest) to provide just enough
>> services for FEs to register metadata and BEs to read it?
>
> I'll try to answer the question for a generic virtio frontend and
> backend instead (not SCMI because SCMI is unique due to the reasons
> above.)
>
> Yes, xenstore is the easiest way to exchange configuration information
> between domains. I think EPAM used xenstore to exchange the
> configuration information in their virtio-block demo. There is a way to
> use xenstore even between dom0less VMs:
> https://marc.info/?l=xen-devel&m=164340547602391
> Not just xenstore, but full PV drivers too. However, in the dom0less
> case xenstore is going to
> become available some time after boot, not immediately at startup time.
> That's because you need to wait until xenstored is up and running.
>
> There are other ways to send data from one VM to another which are
> available immediately at boot, such as Argo and static shared memory.
>
> But dom0less is all about static partitioning, so it makes sense to
> exploit the build-time tools to the fullest. In the dom0less case, we
> already know what is going to run on the target before it is even turned
> on. As an example, we might have already prepared an environment with 3
> VMs using Yocto and ImageBuilder. We could also generate all
> configurations needed and place them inside each VM using Yocto's
> standard tools and ImageBuilder. So for dom0less, I recommend going a
> different route and pre-generating the configuration directly where
> needed instead of doing dynamic discovery.

Even in a full dom0less setup, you still need to manage guest
lifetimes somehow if a guest reboots.
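
On the metadata question, assuming xenstored is reachable, the BE side
of that exchange could be as simple as the following libxenstore
sketch (the key path and domid are purely illustrative, not an agreed
layout):

  /* Sketch: a backend reading FE metadata out of xenstore.
   * The key path below is hypothetical. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <xenstore.h>

  int main(void)
  {
      struct xs_handle *xsh = xs_open(0);
      if (!xsh) {
          perror("xs_open");
          return 1;
      }

      /* Hypothetical location where the FE (here domid 1) would
       * publish the base of its shared-memory carve-out. */
      unsigned int len;
      char *val = xs_read(xsh, XBT_NULL,
                          "/local/domain/1/data/virtio/shm-base", &len);
      if (val) {
          printf("FE shared memory base: %s\n", val);
          free(val);
      }

      xs_close(xsh);
      return 0;
  }

A Zephyr BE would presumably need a small xenstore client of its own
rather than libxenstore, but the keys exchanged would be the same.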

>
>
>>  * How to handle mapping of memory
>> 
>> AIUI the Xen model is that the FE guest explicitly makes grant table
>> requests to expose portions of its memory to other domains. Can the
>> BE query the hypervisor itself to discover the available grants, or
>> does it require coordination with Dom0/XenStore for that information
>> to be available to the BE domain?
>
> Typically the frontend passes grant table references to the backend
> (i.e. instead of plain guest physical addresses on the virtio ring).
> Then, the backend maps the grants; Xen checks that the mapping is
> allowed.
>
> We might be able to use the same model with virtio devices. A special
> pseudo-IOMMU driver in Linux would return a grant table reference and an
> offset as "DMA address". The "DMA address" is passed to the virtio
> backend over the virtio ring. The backend would map the grant table
> reference using the regular grant table hypercalls.
>
>
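
That model seems straightforward to handle on the BE side. A minimal
sketch with libxengnttab, assuming fe_domid and the grant reference
arrive over the ring encoded in the "DMA address" as you describe:

  /* Sketch: backend mapping one granted page from the frontend.
   * fe_domid and ref are assumed to come from the virtio ring. */
  #include <stdint.h>
  #include <sys/mman.h>
  #include <xengnttab.h>

  int process_fe_page(uint32_t fe_domid, uint32_t ref)
  {
      xengnttab_handle *xgt = xengnttab_open(NULL, 0);
      if (!xgt)
          return -1;

      /* Xen checks that the FE really granted this page to us. */
      void *page = xengnttab_map_grant_ref(xgt, fe_domid, ref,
                                           PROT_READ | PROT_WRITE);
      if (!page) {
          xengnttab_close(xgt);
          return -1;
      }

      /* ... read/write the shared page here ... */

      xengnttab_unmap(xgt, page, 1);  /* 1 = number of pages */
      xengnttab_close(xgt);
      return 0;
  }

A real backend would of course keep the handle open and cache mappings
rather than mapping and unmapping per request.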
>>  * How to handle signalling
>> 
>> I guess this requires a minimal implementation of the IOREQ calls for
>> Zephyr so we can register the handler in the backend? Does the IOREQ
>> API allow for IPI-style notifications using the global GIC IRQs?
>> 
>> Forgive the incomplete notes from the Stratos sync; I was trying to
>> type while participating in the discussion, so hopefully this email
>> captures what was missed:
>> 
>>   
>> https://linaro.atlassian.net/wiki/spaces/STR/pages/28682518685/2022-02-03+Project+Stratos+Sync+Meeting+Notes
>
> Yes, any emulation backend (including virtio backends) would require an
> IOREQ implementation, which includes notifications via event channels.
> Event channels are delivered as a GIC PPI interrupt to the Linux kernel.
> Then, the kernel sends the notification to userspace via a file
> descriptor.
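
That makes sense. For the record, the userspace end of that event
channel path would look something like this with libxenevtchn (a
sketch only; the remote domid and port would come out of the IOREQ
server setup, which I've not shown):

  /* Sketch: backend waiting for a kick from the FE over an event
   * channel, then notifying it back. */
  #include <stdint.h>
  #include <xenevtchn.h>

  int wait_for_kick(uint32_t fe_domid, uint32_t remote_port)
  {
      xenevtchn_handle *xce = xenevtchn_open(NULL, 0);
      if (!xce)
          return -1;

      /* Bind a local port to the FE's event channel. */
      xenevtchn_port_or_error_t port =
          xenevtchn_bind_interdomain(xce, fe_domid, remote_port);
      if (port < 0) {
          xenevtchn_close(xce);
          return -1;
      }

      /* Blocks until the FE kicks us; xenevtchn_fd() would give a
       * pollable fd for an event-loop design instead. */
      xenevtchn_port_or_error_t fired = xenevtchn_pending(xce);
      if (fired >= 0)
          xenevtchn_unmask(xce, fired);

      /* ... process the vring, then kick the FE back ... */
      xenevtchn_notify(xce, port);

      xenevtchn_close(xce);
      return 0;
  }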

Thanks.

-- 
Alex Bennée



 

