
Design Sessions notes: Xen system boot: launching VMs (DomB mode of dom0less)



# Session Notes on Xen system boot: launching VMs (DomB mode of dom0less)

Sessions Host: Christopher Clark. Scribing: Daniel Smith & Christopher Clark.

The DomB-mode-for-dom0less topic was covered in two design session slots
at the Xen Design & Developer Summit 2020.

## Session 1: Xen system boot: launching VMs (DomB)

A talk presenting background on the project and a progress update on the
development work sponsored by Star Lab Corp. This talk is preparatory
material for the following second session.

> A presentation of progress towards building DomB: a new mode of
> starting Xen with guest workloads launched at host boot - including
> support for x86 platforms, system disaggregation and running without
> dom0, and architecture to support measurement of system launch.

Slides are available here:
https://static.sched.com/hosted_files/xen2020/91/DomB%20-%20Xen%20Design%20%26%20Developer%20Summit%202020.pdf

## Session 2: Next steps for Xen system boot: launching VMs (DomB)

A Design Session discussion of the forward direction and of topics
identified while building the initial prototype.

### Session seed notes, shared on-screen during the session:

* Basics: general structure:
    - bootloader loads domain materials into RAM (kernels, ramdisks)
    - some metadata, in binary form, describes the domains to be launched
    - hypervisor performs domain construction
        - PVH and PV supported
    - only one guest is unpaused by the hypervisor: domB
    - domB unpauses other domains when ready to do so
        -> allows measurement to be performed by domB
        -> allows configuration to be applied by domB
        -> allows domB to sequence startup of other domains, if necessary
    - domB permissions: no hardware access, limited privileges to do
      setup operations
    - hardware domain permissions: subset of the current dom0, no
      is_priv for control ops

Needed for usability:
    - support for bringup of PV devices
        -> toolstack needs to be aware of Initial Domains as started
           and initiate the bringup of backends
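
The launch sequence in the Basics notes above can be sketched as a small Python model. This is an illustration only, not Xen code: the names `Domain`, `hypervisor_construct` and `launch`, the `measure` and `configure` callbacks, and the domain names in the example are all assumptions made for the sketch. It does not model the exit of domB or the "atomic handoff" raised in the questions below.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PAUSED = "paused"
    RUNNING = "running"


@dataclass
class Domain:
    name: str
    kernel: bytes                      # stands in for the bootloader-loaded module
    state: State = State.PAUSED
    config: dict = field(default_factory=dict)


def hypervisor_construct(descriptions):
    """Build every described domain, leaving all of them paused."""
    return {name: Domain(name, kernel) for name, kernel in descriptions.items()}


def launch(descriptions, boot_order, measure, configure):
    """Model of the boot flow: only domB is unpaused by the hypervisor."""
    domains = hypervisor_construct(descriptions)
    domains["domB"].state = State.RUNNING
    log = []
    # domB measures, configures and then unpauses the others, in sequence.
    for name in boot_order:
        dom = domains[name]
        log.append((name, measure(dom.kernel)))
        dom.config.update(configure(name))
        dom.state = State.RUNNING
    return domains, log


if __name__ == "__main__":
    # Hypothetical set of Initial Domains, keyed by name.
    descriptions = {
        "domB": b"domB kernel",
        "hwdom": b"hardware domain kernel",
        "control": b"control domain kernel",
    }
    domains, log = launch(
        descriptions,
        boot_order=["hwdom", "control"],
        measure=lambda kernel: hashlib.sha256(kernel).hexdigest(),
        configure=lambda name: {"configured_by": "domB"},
    )
    for name, digest in log:
        print(name, digest[:16], domains[name].state.value)
```

Keeping measurement and configuration as callbacks mirrors the point in the notes that these steps belong to domB rather than to the hypervisor.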

* Questions:
    - is claiming the first multiboot module, and dynamically toggling
      dom0/domB mode, acceptable?
    - is there a tree + binary format outside of Xen that provides what we need?
    - is there momentum behind a technology elsewhere that Xen needs to support?
    - logic for building ACPI tables:
        - enable DomB to do this for other Initial Domains?
    - how best to implement "atomic handoff":
        - exit of DomB
        - continuation of the Initial Domains after their configuration by DomB
    - how best to surface the Launch Control Module contents to DomB?
        - ACPI tables? (PVH)
        - what about PV mode?

* Guidance:
    - how to bringup PV disk and network (etc) for the Initial Domains?
        - A: the toolstack domain interrogates Xen, gets data on the
          Initial Domains, and then uses its own database to bring them up
            - means coordination between data in the toolstack and
              config in the LCM
    - guest kernel decompression
        - complex, and implementation is not shared with anything else
        - would prefer to do the decompress in a guest context rather
          than the hypervisor and use a bootloader-supplied decompressor
          binary, outside the hypervisor

* To Research / Investigate:
    - from Stefano: "system device tree"
    - Implementing support for HVM-mode initial domains:
        - primary use case is "non-PV VMs that can have devices
          directly assigned"
            - so PVH with working PCI passthrough would suffice
            - but having the ability to launch HVM too would be nice
        - needs bringup of the device emulator, and Xen configured to connect it


### Comments and discussion during the session:

_Jason Andryuk, Q: does the hypervisor construct multiple domains or domB only?_
Christopher, A: the hypervisor constructs multiple domains

_Damien Thenot (dthenot), Q: Could DomB be used to explore hardware
and create driver domains as needed?_
Christopher: yes, it could, but we don't want domB to become a dom0 again

> post-session note: this is about wanting to avoid unbounded scope creep for a
> single domB: the domB structure will enable doing these functions in other
> independent initial domains that are also launched and run at host boot. DomB
> is unlikely to have permission to perform any domain creation by default,
> since it won't need it - it just applies configuration and unpause to the
> other domains that Xen builds at host boot.

_Bobby Eshleman, Q: does no hardware access imply no TPM access here?
Just thinking about the measurement capabilities of DomB._
Christopher: yes, but this is under discussion. A possibility is to put a
minimal TPM driver in Xen so that DomB can be measured before launch.
Roger Pau Monné: TPM is just assigned to dom0 (or the hw domain),
there's no special handling of it in Xen
Jason Andryuk: You can have the bootloader measure all the pieces into
the TPM before transitioning to Xen/domB, but those would be the
compressed artifacts.

> post-session note: enabling a strong full-system architecture for measured
> launch and virtual TPM support for domains, where the vTPM is rooted in the
> physical TPM, is important and a motivation behind the DomB architecture.

_Bertrand Marquis: Could you explain a bit more about the decompression? I
do not quite get why it is done in Xen._
Christopher: if the dom0 kernel is detected as compressed, Xen will
decompress it.
Andy Cooper: Xen needs to decompress the kernel to reach the ELF header
and read the ELF notes needed to boot a PV domain.
Christopher: one thought is to do it in another VMCS context
Andy: yes, but that adds a lot of overhead
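
As background to the exchange above, spotting that a boot module is compressed usually comes down to checking its leading magic bytes. A minimal Python sketch follows, assuming the module is a plain compressed stream or a raw ELF file. The function name `identify_kernel_image` and the table of formats are choices made for this illustration and do not reproduce Xen's implementation; real kernel images may wrap the compressed payload in a further container, which the sketch ignores.

```python
import gzip

# Leading magic bytes of a few common container formats.
MAGICS = (
    ("gzip", b"\x1f\x8b"),
    ("bzip2", b"BZh"),
    ("xz", b"\xfd7zXZ\x00"),
    ("zstd", b"\x28\xb5\x2f\xfd"),
    ("elf", b"\x7fELF"),
)


def identify_kernel_image(image: bytes) -> str:
    """Return the name of the first format whose magic prefixes the image."""
    for name, magic in MAGICS:
        if image.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    fake_elf = b"\x7fELF" + bytes(60)   # placeholder, not a valid ELF file
    print(identify_kernel_image(fake_elf))
    print(identify_kernel_image(gzip.compress(fake_elf)))
```

Detection is the cheap part. The ELF notes only become readable after the payload has been decompressed, which is why the decompressor itself is the component under discussion.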

_Christopher: Is the proposed LCM detection a reasonable upstreamable approach?_
Andy: yes it is acceptable
??: Arm uses device tree
- Christopher: isn't it fixed, to describe hardware?
Bertrand: Xen already has logic to extend the tree
Stefano: could DomB use a small key/value device tree with LCM fields
and use the existing DTB parser in Xen on Arm?
Julien: don't use libfdt on untrusted device trees, it is not suitable
for the hypervisor
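
To illustrate the concern about parsing untrusted launch metadata, here is a minimal Python sketch of a strict, bounds-checked key/value parser. The binary layout (little-endian 16-bit key and value lengths followed by the bytes), the size limits, and the names `parse_kv_blob` and `encode_kv` are all invented for this example. It is neither the LCM format nor a device tree format; it only shows the kind of validation a security-capable parser performs before trusting any length field.

```python
import struct

# Hypothetical limits and record header: key length, value length.
MAX_ENTRIES = 64
MAX_KEY_LEN = 32
MAX_VALUE_LEN = 256
HEADER = struct.Struct("<HH")


def parse_kv_blob(blob: bytes) -> dict:
    """Parse untrusted bytes into {key: value}, raising ValueError on any defect."""
    entries = {}
    offset = 0
    while offset < len(blob):
        if len(entries) >= MAX_ENTRIES:
            raise ValueError("too many entries")
        if len(blob) - offset < HEADER.size:
            raise ValueError("truncated record header")
        key_len, value_len = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        if not 0 < key_len <= MAX_KEY_LEN or value_len > MAX_VALUE_LEN:
            raise ValueError("field length out of range")
        if len(blob) - offset < key_len + value_len:
            raise ValueError("record overruns blob")
        # Non-ASCII keys raise UnicodeDecodeError, a subclass of ValueError.
        key = blob[offset:offset + key_len].decode("ascii")
        offset += key_len
        if key in entries:
            raise ValueError("duplicate key")
        entries[key] = blob[offset:offset + value_len]
        offset += value_len
    return entries


def encode_kv(pairs: dict) -> bytes:
    """Helper for the example: serialise {str: bytes} into the same layout."""
    out = bytearray()
    for key, value in pairs.items():
        raw_key = key.encode("ascii")
        out += HEADER.pack(len(raw_key), len(value)) + raw_key + value
    return bytes(out)


if __name__ == "__main__":
    blob = encode_kv({"mode": b"pvh", "vcpus": b"2"})
    print(parse_kv_blob(blob))
```

Every length is checked against both a fixed limit and the bytes actually remaining before it is used, so a malformed blob is rejected rather than read past its end.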

_Christopher: Is it foreign to use ACPI to expose LCM to guests on ARM?_
Bertrand, Stefano: ARM now has ACPI, so it's not really foreign; it is OK

_Topic: Getting device info to the toolstack after launch_
Jürgen Groß: a xenstore stubdom is upstream and available; xl/libxl is a
separate issue. The problem with dom0less is getting xenstored up before
a domU starts trying to do xenstore/device setup.

< xenstore setup discussion >
  - basic conclusion is that it is a bit of a mess and needs to be cleaned up

_Nicolas Poirot, Q: if domB starts guests and quits, will there be no
management (reboot, shutdown) of the guests?_
Rich Persaud [stacktrust]: I think yes for the static partitioning use
case, which overlaps somewhat with dom0less.  If management is needed,
one of the started guests can be a privileged toolstack domain.

> post-session note: domB doesn't host the management, control or toolstack
> software that does that, and it does not have the control domain privileged
> permissions that are needed to do it. However, you can start a control domain
> at host boot, with the DomB architecture, and that will handle it. You
> describe the control domain you want in the Launch Control Module, provide a
> kernel and optional ramdisk to the bootloader, and then Xen will build it and
> DomB will assist with configuring and starting that domain.


_Rich Persaud [stacktrust] : For those new to domB, some material for
offline reading:_

* Dec 2019 design meeting in Cambridge: https://lists.gt.net/xen/devel/577800
* May 2020 domB design doc v1: https://lists.gt.net/xen/devel/586365
* TrenchBoot (DRTM):
https://github.com/TrenchBoot/documentation/tree/master/presentations
* OpenXT & boot integrity: https://openxt.org/ecosystems/
* PSEC 2019: https://www.platformsecuritysummit.com/2019/videos/
* PSEC 2018: https://www.platformsecuritysummit.com/2018/videos/

We can also schedule a separate conference call after Xen Summit.
You can email rp@xxxxxxxxxxxxxx if you're interested in being included
in a future domB conference call to review v2 of the design doc.

> post-session note: DomB is being built with intent to support the
> 'Hardened Access Terminals' (HAT) architecture, also presented at the
> summit, with slides available here:

https://static.sched.com/hosted_files/xen2020/46/Reliable%20Platform%20Security_%20Xen%20and%20the%20Fidelis%20Platform%20for%20Hardened%20Access%20Terminals%20%28HAT%29.pdf

### Observations

> - general tone was supportive from many sides
> - device tree needs looking at; if it is adopted, it will need a
>   security-capable parser (libfdt is specifically not suitable for it)
> - xenstore is a pain point (yet again)
> - we can’t ditch the existing kernel decompressor since PV needs to read
>   the ELF notes, which need decompressing to access
> - TPM access needs explaining in our documentation

A big thanks to the conference attendees for the interest expressed in
the two sessions that enabled both of these to be scheduled, and for the
positive and active engagement in the discussions.



 

