
Re: [Xen-devel] RFC: Automatically making a PCI device assignable in the config file



On 07/09/2013 03:25 PM, Konrad Rzeszutek Wilk wrote:
On Tue, Jul 09, 2013 at 01:52:38PM +0100, George Dunlap wrote:
On 07/08/2013 08:23 PM, Konrad Rzeszutek Wilk wrote:
On Fri, Jul 05, 2013 at 02:52:08PM +0100, George Dunlap wrote:
On 05/07/13 14:48, Andrew Cooper wrote:
On 05/07/13 14:45, George Dunlap wrote:
On 05/07/13 14:39, Andrew Cooper wrote:
On 05/07/13 12:01, George Dunlap wrote:
I've been doing some work to try to make driver domains easier to set
up and use.  At the moment, in order to pass a device through to a
guest, you first need to assign it to pciback.  This involves doing
one of three things:
* Running xl pci-assignable-add for the device
* Specifying the device to be grabbed on the dom0 Linux command-line
* Doing some hackery in /etc/modules.d
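(For concreteness, with an illustrative BDF of 08:04.1 and the upstream
xen-pciback driver, those look roughly like:

  # 1. At runtime, from dom0:
  xl pci-assignable-add 08:04.1

  # 2. On the dom0 Linux command line, when pciback is built in:
  xen-pciback.hide=(0000:08:04.1)

  # 3. The modules.d hackery, when pciback is a module -- something like
  #    /etc/modprobe.d/xen-pciback.conf, plus arranging for the module to
  #    load before the device's normal driver grabs it:
  options xen-pciback hide=(0000:08:04.1)

with the exact spelling depending on the kernel in question.)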

None of these are very satisfying.  What I think would be better is if
there was a way to specify in the guest config file, "If device X is
not assignable, try to make it assignable".  That way you can have a
driver domain grab the appropriate device just by running "xl create
domnet"; and once we have the xendomains script up and running with
xl, you can simply configure your domnet appropriately, and then put
it in /etc/xen/auto, to be started automatically on boot.

My initial idea was to add a parameter to the pci argument in the
config file; for example:

pci = ['08:04.1,permissive=1,seize=1']

The 'seize=1' would indicate that if BDF 08:04.1 is not already
assignable, xl should try to make it assignable.
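(In other words, with seize=1 in place, "xl create domnet" would do
roughly what you can do by hand today:

  xl pci-assignable-add 08:04.1
  xl create domnet

but as a single step, driven by the guest config.)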

The problem here is that this would need to be parsed by
xlu_pci_parse_bdf(), which only takes an argument of type
libxl_device_pci.

Now it seems to me that the right place to do this "seizing" is in xl,
not inside libxl -- the functions for doing assignment exist already,
and are simple and straightforward.  But doing it in xl as a
parameter of the "pci" setting means changing xlu_pci_parse_bdf() to
pass something else back, which begins to get awkward.

So it seems to me we have a couple of options:
1. Create a new argument, "pci_seize" or something like that, which
would be processed separately from pci (see the sketch below)
2. Change xlu_pci_parse_bdf to take a pointer to an extra struct, for
arguments directed at xl rather than libxl
3. Add "seize" to libxl_device_pci, but have it only used by xl
4. Add "seize" to libxl_device_pci, and have libxl do the seizing.

Any preference -- or any other ideas?

   -George
How about a setting in xl.conf of "auto-seize pci devices"?  That way
the seizing is entirely part of xl.
Auto-seizing is fairly dangerous; you could easily accidentally yank
out the ethernet card, or even the disk that dom0 is using.  I really
think it should have to be enabled on a device-by-device basis.

I suppose another option would be to be able to set, in xl.conf, a
list of auto-seizable devices.  I don't really like that option
either, though.  I'd rather be able to keep all the configuration in one
place.

  -George
Or a slightly less extreme version.

If xl sees that it would need to seize a device, it could ask "You are
trying to create a domain with device $FOO.  Would you like to seize it
from dom0?"

That won't work for driver domains, as we want it all to happen
automatically when the host is booting. :-)

The high-level goal is that we want to put the network devices with a
network backend and storage devices with a storage backend. Ignoring,
for now, that for network devices you might want separate backends for
each device (say one backend for wireless, one for Ethernet, and so on).

Perhaps the logic ought to do grouping - so you say:
  a) "backends:all-network" (which would created one backend with all of the
    wireless, ethernet, etc PCI devices), or
  b) "backends:all-network,seperate-storage", which  create one backend with
   all of the wireless, ethernet in one backend; and one backend domain for each
   storage device?

Naturally the user gets to choose which grouping they would like?

We seem to be talking about different things.  You seem to be
talking about automatically starting some pre-made VMs and assigning
devices and backends to them?  But I'm not really sure.

I am trying to look at it from a high-level perspective to see whether we can
make this automated for 99% of people out of the box. Hence the
idea of grouping. And yes to '..assigning devices and backends to them'.

I was assuming that the user was going to be installing and
configuring their own driver domains.  The user already has to
specify "pci=['$BDF']" in their config file to get specific devices
passed through -- this would just be making it easy to have the
device assigned to pciback as well.
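For example, the kind of config I have in mind for a network driver
domain is nothing more than something like this (names and paths
illustrative, and 'seize' being the proposed new flag):

  # /etc/xen/domnet.cfg
  name       = "domnet"
  memory     = 512
  bootloader = "pygrub"
  disk       = [ 'phy:/dev/vg0/domnet,xvda,w' ]
  pci        = [ '08:04.1,permissive=1,seize=1' ]

Drop that in /etc/xen/auto and, once the xendomains script works with
xl, the NIC gets seized and handed to domnet at boot, with no pciback
fiddling in dom0.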

I think the technical bits of what libxl does when yanking devices
around are driven either by the admin or by a policy. If the policy
is this idea of grouping (which is a terrible name, now that I think
of it), then perhaps we should think about how to make that work, and
then the details (such as this automatic yanking of devices over to
pciback) can be filled in.



I suspect that a lot of people will want to have one network card
assigned to domain 0 as a "management network", and only have other
devices assigned to driver domains.  I think that having one device
per domain is probably the best recommendation; although we
obviously want to support someone who wants a single "manage all the
devices" domain, we should assume that people are going to have one
device per driver domain.

I don't know. My feeble idea was that we would have at minimum _two_
guests on bootup. One is a control one that has no devices - but is
the one that launches the guests.

Then there is the dom1 which would have all (or some) of the storage
and network devices plugged in along with the backends. Then a dom2
which would be the old-style-dom0 - so it would have the graphics card
and the rest of the PCI devices.

In other words, when I boot I would have two tiny domains launch
right before "old-style-dom0" is started. But I am getting in specifics
here.

Perhaps you could explain to me how you envision the device driver
domain idea working? How would you want it to work on your
laptop?

Or are we right now just thinking of the small pieces of making the
code be able to yank the devices around and assign them?

I was thinking for now of just making the "manually configure it" case easier. I decided to switch one of my test boxen to using a network driver domain by default, and although the core is there, there are a bunch of things that are unnecessarily crufty.

I do agree that long term it would be nice to make it easy to make driver domains the default, but that's not what I had in mind for this conversation. :-)

The hard part of making it really automated, it seems to me, comes from two things.

One, you have to make sure your driver domain has the appropriate hardware drivers for your system as well. We don't want to be in the business of maintaining a distro; most people will probably want the driver domain to be from the same distro they're using for dom0, which means that setting up such a domain will need to be done differently on a distro-by-distro basis.

Two, you have the configuration problem. In Debian, for instance, if you wanted to switch a device from being owned by dom0 to being in a driver domain, you'd have to:
* Copy over the udev rules recognizing the MAC address, so it gets the same ethN
* Copy over the eth and bridge info from dom0's /etc/network/interfaces into the guest's /etc/network/interfaces
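Concretely, that means carrying over something like the following into
the driver domain (the MAC, device, and bridge names are just examples):

  # /etc/udev/rules.d/70-persistent-net.rules -- keep the same ethN name
  SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:11:22:33:44:55", NAME="eth1"

  # /etc/network/interfaces -- recreate the bridge that used to live in dom0
  auto xenbr1
  iface xenbr1 inet manual
      bridge_ports eth1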

I'm not sure exactly what you have to do in Fedora, but I bet it's something similar.

It might be nice to work with distros to make the process of making driver domains / stub domains easier, and to make it easy to configure driver domain networking options from the distro's network scripts; but that's kind of another level of functionality.

I think first things first: make manually-set-up driver domains actually easy to use.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

