
Re: [Xen-devel] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu



On 16.10.2015 at 18:20, Paul Durrant wrote:
> > -----Original Message-----
> > From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> > Sent: 16 October 2015 17:12
> > To: Paul Durrant
> > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > qemu-devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > missed in qemu
> > 
> > On 16.10.2015 at 17:10, Paul Durrant wrote:
> > > > -----Original Message-----
> > > > From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> > > > Sent: 16 October 2015 16:02
> > > > To: Paul Durrant
> > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > > qemu-devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > > > missed in qemu
> > > >
> > > > On 16.10.2015 at 16:24, Paul Durrant wrote:
> > > > > > -----Original Message-----
> > > > > > From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> > > > > > Sent: 16 October 2015 15:04
> > > > > > To: Paul Durrant
> > > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > > > > qemu-devel@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx
> > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > > > > > missed in qemu
> > > > > >
> > > > > > On 14.10.2015 at 14:48, Paul Durrant wrote:
> > > > > > > > -----Original Message-----
> > > > > > > > From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> > > > > > > > Sent: 14 October 2015 12:12
> > > > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > > > Cc: John Snow; Anthony Perard; qemu-devel@xxxxxxxxxx;
> > > > > > > > xen-devel@xxxxxxxxxxxxx; qemu-block@xxxxxxxxxx; Paul Durrant
> > > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > > > > > > > missed in qemu
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On 14/10/2015 11:47, Kevin Wolf wrote:
> > > > > > > > > [ CC qemu-block ]
> > > > > > > > >
> > > > > > > > > On 13.10.2015 at 19:10, Stefano Stabellini wrote:
> > > > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > > > > >>>> I added ahci disk support in libxl and using it for week seems that was
> > > > > > > > > >>>> ok, after a reply of Stefano Stabellini seems that xen disk unplug
> > > > > > > > > >>>> support only ide disks:
> > > > > > > > >>>>
> > > > > > > >
> > > > > >
> > > >
> > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > > > > > c905374ee8663d5d8
> > > > > > > > >>>>
> > > > > > > > > >>>> Today Paul Durrant told me that even if pv disk is ok also with ahci and
> > > > > > > > > >>>> the emulated one is offline can be a risk:
> > > > > > > > > >>>> http://lists.xenproject.org/archives/html/win-pv-devel/2015-10/msg00021.html
> > > > > > > > >>>>
> > > > > > > > >>>>
> > > > > > > > > >>>> I tried to take a fast look in qemu code but I not understand the needed
> > > > > > > > > >>>> thing for add the xen disk unplug support also for ahci, can someone do
> > > > > > > > > >>>> it or tell me useful information for do it please?
> > > > > > > > >>>>
> > > > > > > > >>>> Thanks for any reply and sorry for my bad english.
> > > > > > > > >>>>
> > > > > > > > > >>> I'm not entirely sure what features you need AHCI to support in order
> > > > > > > > > >>> for Xen to be happy.
> > > > > > > > >>>
> > > > > > > > > >>> I'd guess hotplugging, but where I get confused is that IDE disks don't
> > > > > > > > > >>> support hotplugging either, so I guess I'm not sure what you need.
> > > > > > > > >>>
> > > > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > > > >>
> > > > > > > > >> Hi John,
> > > > > > > > >>
> > > > > > > > > >> we need something like hw/i386/xen/xen_platform.c:unplug_disks but that
> > > > > > > > > >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > > > > > > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > > > > > > > Maybe this would be the right time to stop the craziness with your
> > > > > > > > > > hybrid IDE/xendisk setup. It's a horrible thing that would never happen
> > > > > > > > > > on real hardware.
> > > > > > >
> > > > > > > Unfortunately, it's going to be difficult to remove such 'craziness' when you
> > > > > > > don't know a priori whether the VM has PV drivers or not.
> > > > > >
> > > > > > Why wouldn't you know that beforehand? I mean, even on real hardware you
> > > > > > can have different disk interfaces (IDE, AHCI, SCSI) and you install
> > > > > > the exact driver that your hardware needs. You just do the same thing on
> > > > > > VM: If your hardware is PV, you install a PV driver. If your hardware is
> > > > > > IDE, you install an IDE driver. Whether it's PV or IDE is something that
> > > > > > you, the user, decided when configuring the VM, so you definitely know.
> > > > > >
> > > > >
> > > > > That's not necessarily true. The host admin that provisions the VM does not
> > > > > necessarily know what OS the user of that VM will install. The admin may just
> > > > > be providing a generic VM with an emulated CD drive that the user can point
> > > > > at any ISO they want.
> > > > >
> > > > > So, as a host admin, if you provide a VM with only PV backends and your
> > > > > user is trying to boot an OS with no PV drivers they are not going to be
> > > > > happy, so you provide emulated devices. Then, at some point later, when
> > > > > the user installs PV drivers, there really should be some way for those drivers
> > > > > to start up without any need to contact the host admin and have the VM
> > > > > reconfigured.
> > > >
> > > > Why only IDE and xendisk then? Maybe I have an OS that works great with
> > > > AHCI, or virtio-blk, or an LSI SCSI controller, or a Megasas SCSI
> > > > controller, or USB sticks, or... (and IDE will hardly ever be the
> > > > optimal one)
> > > >
> > > > What about network cards? My OS might support the Xen PV one, or it
> > > > might support rtl8139, or e1000, or virtio-net, or pcnet, or...
> > > >
> > > > Should we always put all of the hardware that can possibly be emulated
> > > > in a VM just so that the one right device is definitely included even
> > > > though we don't know what OS will be running?
> > > >
> > > > This is ridiculous.
> > >
> > > It might be, but to some extent it's reality. The reason that the
> > > default emulated network device chosen by xl is rtl8139 is that it has
> > > drivers in just about every OS. The same reason for IDE being the
> > > default choice for storage.
> > 
> > So what does this mean for a justification for the AHCI + xendisk hybrid
> > proposal?
> > 
> > > > Just tell your admin what virtual hardware you really need. (Or tell
> > > > them to give you a proper interface to configure your VMs yourself.)
> > > >
> > >
> > > My point is that the virtual hardware that the OS user wants will
> > > change. Before they install PV drivers, they will need an emulated
> > > device. After installing PV drivers they will want PV devices. Should
> > > they really have to contact their cloud provider to make the switch,
> > > when at the moment it happens automatically and transparently (the
> > > AHCI problem aside)?
> > 
> > My point is that such a magic change shouldn't happen. It doesn't happen
> > on real hardware either and people still get things installed to non-IDE
> > disks.
> > 
> > There is no reason to install the OS onto a different device than will
> > be used later. With Linux, it's no problem at all because the PV drivers
> > are already included on the installation media anyway, and on Windows or
> > presumably any other OS you can load and install the drivers right from
> > the beginning.
> > 
> > In fact, I would be surprised if using xendisk instead of IDE for
> > installing Windows didn't result in a noticeably faster installation.
> >
> 
> It most certainly would, but requiring users to do it this way is likely to
> meet some resistance, I suspect.

Why do you think so? Installing the PV drivers afterwards doesn't seem
easier than just providing them during the installation.

> > Now, if you really insist on providing a legacy interface even to guests
> > that eventually use PV drivers, there actually are sane ways to
> > implement this. It will be tricky to make that transition now without
> > breaking compatibility, but it could have been done from the start.
> > 
> > Sane means for example that you don't open the same image twice (and
> > even read-write!) at the same time. This is a recipe for disaster and
> > it's surprising that you don't see corrupted images more often.
> > 
> 
> We don't because unplug is supposed to ensure the emulated device is
> gone before the PV frontend is started.

The important part is the backend, but it seems that you open the second
instance of the image only when starting the PV frontend?

As long as you don't enable the user to use most of qemu's functionality
like starting block jobs (which would keep the IDE instance around even
after unplugging the disk), it might actually be safe assuming that the
guest cooperates. Not sure what a malicious guest could do, though, as
nobody seems to check whether IDE is really unplugged before the second
instance is opened. raw and qcow2 should be safe these days, but in
earlier times it would probably have been possible for the guest to
overwrite the image header and access arbitrary files on the host as
backing file. It might still be true for other image formats.
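
To make the missing check concrete, here is a minimal sketch in C of a guard
that refuses to open the PV instance while the emulated one may still be
alive. Everything in it (struct image_users, image_unplug_emulated,
image_open_pv) is hypothetical bookkeeping invented for illustration; it is
not an existing qemu or Xen interface, and it only models the ordering rule,
not real block-layer behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-image bookkeeping; not a qemu structure. */
struct image_users {
    bool emulated_attached; /* emulated IDE device still uses the image */
    bool job_running;       /* e.g. a block job keeping the IDE instance alive */
    bool pv_open;           /* PV backend has opened its own instance */
};

/* Detach the emulated device. Returns true only if the emulated
 * instance is really gone, i.e. nothing else still references it. */
static bool image_unplug_emulated(struct image_users *u)
{
    assert(u != NULL);
    u->emulated_attached = false;
    return !u->job_running;
}

/* Open the PV instance only when no emulated user can still write. */
static bool image_open_pv(struct image_users *u)
{
    assert(u != NULL);
    if (u->emulated_attached || u->job_running) {
        return false;       /* would mean two read-write openers */
    }
    u->pv_open = true;
    return true;
}
```

With a check of this kind, a guest that starts the PV frontend without
unplugging first would simply fail to get its PV disk instead of ending up
with two writers on one image.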

> > So if you wanted to have a clean solution, try to think how real
> > hardware would solve the problem. If you want me to suggest something
> > off the top of my head, I would come up with an extended IDE device (one
> > single device!) that provides the IDE I/O ports and additionally some
> > MMIO BAR that enables access to PV functionality.
> > 
> > Once you enable PV functionality, the IDE ports stop working; device
> > reset disables the PV ring and goes back to IDE mode. No hard disk
> > suddenly disappearing from the machine, no image corruption if the IDE
> > device is written to before enabling PV, etc.
> > 
> 
> That's not sufficient though. The IDE device must not be enumerated by
> the OS and, in Windows at least, that enumeration occurs before the PV
> frontend has started up.

The trick is that it's only a single device, so there is no second
device that must be prevented from being enumerated. You provide a
driver for this specific IDE controller, so Windows wouldn't even try
the generic IDE driver when your driver is available.

It's kind of the same sort of IDE controller extension as Bus Master
DMA, which just added a new BAR. If you had an old driver, it would just
ignore the new registers. If you had a new one, it would use them. But
in no way would the old appearance of the device simply disappear, you
just use an extended register set on the same device.
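
As a rough illustration of the single-device idea, the following C sketch
models one controller with two register sets and a mode switch. All names
(struct hybrid_disk, hybrid_mmio_enable_pv and so on) are made up for this
example and do not exist in qemu; the request counters merely stand in for
real I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one device, two interfaces, one owner at a time. */
enum disk_mode { MODE_IDE, MODE_PV };

struct hybrid_disk {
    enum disk_mode mode;    /* which interface currently owns the disk */
    uint64_t ide_requests;  /* requests accepted via the legacy I/O ports */
    uint64_t pv_requests;   /* requests accepted via the PV ring */
};

/* Device reset: disable the PV ring and go back to IDE mode. */
static void hybrid_reset(struct hybrid_disk *d)
{
    assert(d != NULL);
    d->mode = MODE_IDE;
}

/* Write to a (hypothetical) enable register in the extra MMIO BAR.
 * An old driver never touches this BAR, so it stays in IDE mode. */
static void hybrid_mmio_enable_pv(struct hybrid_disk *d)
{
    d->mode = MODE_PV;
}

/* Legacy port access is honoured only while in IDE mode. */
static bool hybrid_ide_submit(struct hybrid_disk *d)
{
    if (d->mode != MODE_IDE) {
        return false;       /* IDE ports stop working once PV is enabled */
    }
    d->ide_requests++;
    return true;
}

/* PV ring access is honoured only after PV has been enabled. */
static bool hybrid_pv_submit(struct hybrid_disk *d)
{
    if (d->mode != MODE_PV) {
        return false;
    }
    d->pv_requests++;
    return true;
}
```

Because both paths live in one device with one backend, there is never a
second opener of the image, and nothing has to vanish from the bus.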

> > But it's your choice. You can keep your broken hack in IDE. Just don't
> > expect anyone to support adding new broken hacks to other devices.
> > 
> 
> I'd prefer to have a cleaner solution, and I believe I can achieve that in
> Windows by obscuring the emulated disks using filter drivers, so that's the
> way I'll probably go.

I wouldn't consider anything that works with two distinct disk devices
and two separate BlockDriverStates for the same image file a clean
solution.

Kevin



 

