
Re: [Xen-devel] Full virtualization and I/O



Hi Mats

Thanks a lot for your detailed reply!

You wrote:
> For fully virtualized mode (hardware supported virtual machine, such as
> AMD-V or Intel VT, aka HVM), there is a different model, where a "device
> model" is involved to perform the hardware modelling. In Xen, this is
> using a modified version of qemu (called qemu-dm), which has a fairly
> complete set of "hardware" in its model. It has, for example, an IDE
> controller, several types of network devices, graphics and
> mouse/keyboard models. The things you'd usually find in a PC, that is.
> The way it works is that the hypervisor intercepts IOIO and memory
> mapped IO regions that match the devices involved (such as the
> A0000-BFFFF region for VGA frame buffer memory or the 0x1F0-0x1F7 IO
> ports for the IDE controller), and forwards a request from the
> hypervisor to qemu-dm, where the operation changes the current state,
> and when it's necessary, the state-change will result in for example a
> read-request to the "hard-disk" (which may be a real disk, a file on a
> local disk, or a file on a network storage device, to give some
> examples).

This is very interesting. So qemu models the low-level device interface
(the I/O interface) in software and translates I/O actions either into
model state changes or into library or system calls (since QEMU runs as a
normal process).
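To make that concrete, here is a minimal sketch of the idea in Python. It is
my own illustration, not qemu-dm code: the ToyDiskModel class, its three
port numbers and the 0x20 command value are all hypothetical and do not
follow the real IDE register layout. It only shows the two kinds of outcome:
a port write that merely changes model state, and one that ends up as a
seek/read on a host-side file object.

```python
import io

SECTOR_SIZE = 512


class ToyDiskModel:
    """Hypothetical disk device model; NOT the real IDE register layout."""

    PORT_SECTOR = 0x0   # write: select the sector number
    PORT_COMMAND = 0x1  # write: CMD_READ loads the selected sector
    PORT_DATA = 0x2     # read: next byte of the loaded sector
    CMD_READ = 0x20     # assumed command code, for illustration only

    def __init__(self, backing):
        self.backing = backing  # any seekable binary file object
        self.sector = 0
        self.buffer = b""
        self.pos = 0

    def port_write(self, port, value):
        if port == self.PORT_SECTOR:
            # Pure model state change, no host call needed.
            self.sector = value
        elif port == self.PORT_COMMAND and value == self.CMD_READ:
            # State change that is translated into host-side file calls.
            self.backing.seek(self.sector * SECTOR_SIZE)
            self.buffer = self.backing.read(SECTOR_SIZE)
            self.pos = 0

    def port_read(self, port):
        if port == self.PORT_DATA and self.pos < len(self.buffer):
            byte = self.buffer[self.pos]
            self.pos += 1
            return byte
        return 0xFF  # value returned for reads the model does not handle


# Usage: a two-sector in-memory "disk image" with text in sector 1.
image = io.BytesIO(bytes(SECTOR_SIZE) + b"hello" + bytes(SECTOR_SIZE - 5))
disk = ToyDiskModel(image)
disk.port_write(ToyDiskModel.PORT_SECTOR, 1)
disk.port_write(ToyDiskModel.PORT_COMMAND, ToyDiskModel.CMD_READ)
first_bytes = bytes(disk.port_read(ToyDiskModel.PORT_DATA) for _ in range(5))
```

The backing object could just as well be an open file on a local or network
file system; the guest-visible side of the model would not change.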

Is there any documentation about this, or is the source the doc? ;)

> Do you by ISA mean "Instruction Set Architecture" or something else (I
> presume it's NOT meaning ISA-bus...)?

Yes, I mean instruction set architecture.

> Intercepting IOIO instructions or MMIO instructions is not that hard -
> in HVM the two processor architectures have specific intercepts and
> bitmaps to indicate which IO instructions should be intercepted. MMIO
> will require the page-tables to be set up such that the memory mapped
> region is mapped "not present" so that any operation to this region
> gives a page-fault, and then the page-fault is analyzed to see if it's
> for a MMIO address or for a "real page fault".
>
> For para-virtualization, the model is similar, but the exact model of
> how to intercept the IOIO or MMIO instruction is slightly different -
> but in essence it's the same principle. Let me know if you really need
> to know how Xen goes about doing this, as it's quite complicated (more
> so than the HVM version, for sure).

Although it is interesting to see how interception works in detail, I am
currently more interested in how device state is modelled and translated
into system/library calls or sequences of I/O instructions; that is, in
what happens after the interception has taken place.
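As a rough picture of the interception step described in the quoted text,
the decision could be sketched as below. This is my own illustration and not
Xen code: the InterceptTable class and its method names are hypothetical.
It assumes one bit per I/O port in a bitmap, and a list of MMIO ranges that
is consulted when a page fault arrives, so that a fault is either attributed
to a device or treated as a "real" page fault.

```python
class InterceptTable:
    """Hypothetical lookup structure for the intercept decision."""

    def __init__(self):
        # One bit per port in the 64K x86 I/O port space.
        self.port_bitmap = bytearray(65536 // 8)
        # List of (start, end_inclusive, device_name) tuples.
        self.mmio_ranges = []

    def intercept_ports(self, first, last):
        """Mark ports first..last (inclusive) as intercepted."""
        for port in range(first, last + 1):
            self.port_bitmap[port // 8] |= 1 << (port % 8)

    def port_is_intercepted(self, port):
        return bool(self.port_bitmap[port // 8] & (1 << (port % 8)))

    def add_mmio(self, start, end, name):
        self.mmio_ranges.append((start, end, name))

    def classify_fault(self, addr):
        """Return the device name for an MMIO address, else None."""
        for start, end, name in self.mmio_ranges:
            if start <= addr <= end:
                return name
        return None  # not MMIO: handle as an ordinary page fault


# Usage with the two examples from the quoted text.
table = InterceptTable()
table.intercept_ports(0x1F0, 0x1F7)        # IDE controller ports
table.add_mmio(0xA0000, 0xBFFFF, "vga")    # VGA frame buffer region
```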

> Not sure what you're asking for here. Since the devices are either
> modeled after a REAL device (qemu-dm) and as such will resemble as
> closely as possible the REAL hardware device that it's emulating, or in
> the frontend/backend driver, there is an "idealized model", such that
> the request contains just the basic data that the OS provides normally
> to the driver, and it's placed in a queue with a message-signaling
> system to tell the other side that it's got something in the queue.

I am basically asking about the general/theoretical concepts behind device
modelling as done by, for example, qemu. I think it would be a good idea to
understand how qemu actually does this.
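For comparison, the "idealized model" of the frontend/backend drivers from
the quoted text might be pictured like this. It is only a sketch under my
own assumptions: RequestQueue, Backend and the request fields are
hypothetical names, and a plain Python callback stands in for the
message-signalling mechanism between the two sides.

```python
from collections import deque


class RequestQueue:
    """Hypothetical shared queue between a frontend and a backend."""

    def __init__(self, on_notify):
        self.requests = deque()
        self.on_notify = on_notify  # stands in for the signalling mechanism

    def submit(self, op, sector, data=None):
        # The request carries only what the OS would hand to a driver.
        self.requests.append({"op": op, "sector": sector, "data": data})
        self.on_notify(self)  # tell the other side something is queued


class Backend:
    """Hypothetical backend that serves requests from an in-memory store."""

    def __init__(self):
        self.sectors = {}
        self.responses = []

    def handle(self, queue):
        while queue.requests:
            req = queue.requests.popleft()
            if req["op"] == "write":
                self.sectors[req["sector"]] = req["data"]
                self.responses.append(("ok", None))
            elif req["op"] == "read":
                data = self.sectors.get(req["sector"], b"")
                self.responses.append(("ok", data))


# Usage: the frontend submits whole requests, not register accesses.
backend = Backend()
queue = RequestQueue(on_notify=backend.handle)
queue.submit("write", 3, b"abc")
queue.submit("read", 3)
```

Note that no device registers appear anywhere here, which is the main
difference from modelling a real hardware device.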

> Certainly not - I would say that almost all devices are NOT time
> partitionable, as the state in the device is dependent on the current
> usage. The more complex the device is, the more likely it is to have
> difficulties, but even such a simple device as a serial port would
> struggle to work in a time-shared fashion (not to mention that serial
> ports generally are used for multiple transactions to make a whole
> "bigger picture transaction", so for example a web-server connected via
> a serial modem would send a packet of several hundred bytes to the
> serial port driver, which is then portioned out as and when the serial
> port is ready to send another few bytes. If you switch from one guest to
> another during this process, and the second guest also has something to
> send on the serial port, you'd end up with a very scrambled message from
> the first guest and quite likely the second guest's message completely
> lost!).

Very nice example. Clearly, high level driver interfaces (e.g. 
send/receive, read/write) can be designed in a way that time-sharing is 
possible, e.g. using message/transaction queues. On the I/O level, it is 
likely to be harder to reconstruct the "full transaction". It might also 
be necessary to make assumptions about the actual guest, i.e. the way the 
device is being used.
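A small sketch of the difference, under my own simplifying assumptions: the
serial line is reduced to a byte string, a "time slice" to a fixed number of
bytes, and both function names are hypothetical. It only models the
scrambling, not the lost message mentioned in the quoted text.

```python
from collections import deque


def interleave_bytes(msg_a, msg_b, slice_len):
    """Naive time-sharing: each guest sends slice_len bytes per turn."""
    wire = bytearray()
    pending = [bytearray(msg_a), bytearray(msg_b)]
    while any(pending):  # an empty bytearray is falsy
        for buf in pending:
            wire += buf[:slice_len]
            del buf[:slice_len]
    return bytes(wire)


def queue_messages(messages):
    """Sharing at message level: each whole message is sent in turn."""
    wire = bytearray()
    queue = deque(messages)
    while queue:
        wire += queue.popleft()
    return bytes(wire)


scrambled = interleave_bytes(b"AAAA", b"bbbb", 2)
intact = queue_messages([b"AAAA", b"bbbb"])
```

With these inputs the first call mixes the two messages on the wire, while
the second keeps each message contiguous.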

> A particular problem is devices where you can't necessarily read back
> the last mode-setting, which may well be the case in many different
> devices. You can't, for example, read back all the registers on an IDE
> device, because the read of a particular address may give the status
> rather than the current command sent, or some such.

The last value written could be stored in memory when you have a virtual
(in-memory) device model.
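As a sketch of what I mean, assuming a hypothetical register where a write
sends a command and a read returns the status (the class name and the 0x40
status value are made up for illustration): the model keeps a shadow copy of
the last command, which the guest can never read back but which a snapshot
of the model can save and restore.

```python
class AliasedRegisterModel:
    """Hypothetical register: writes are commands, reads return status."""

    STATUS_READY = 0x40  # assumed status value, for illustration only

    def __init__(self):
        self.last_command = None  # shadow copy, not readable by the guest
        self.status = self.STATUS_READY

    def write(self, value):
        self.last_command = value  # remember what the guest sent

    def read(self):
        return self.status  # the same address yields status, not the command

    def snapshot(self):
        return {"last_command": self.last_command, "status": self.status}

    @classmethod
    def restore(cls, state):
        model = cls()
        model.last_command = state["last_command"]
        model.status = state["status"]
        return model


# Usage: the command survives a save/restore although it cannot be read.
reg = AliasedRegisterModel()
reg.write(0x20)
saved = reg.snapshot()
copy = AliasedRegisterModel.restore(saved)
```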


Best wishes

Thomas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

