
Re: [Xen-devel] blktap2 and CONFIG_XEN_BLKBACK_PAGEMAP



On Mon, 2010-07-19 at 09:36 -0400, Ian Campbell wrote:
> On Fri, 2010-07-16 at 21:16 +0100, Daniel Stodden wrote:
> > 
> > The reason for the duplicate mapping is that userspace has to re-queue
> > those frames at the physical device layer, and -- iirc -- the problem
> > was that queuing pages twice, once on the blktap2 bdev and once on the
> > underlying disk, will deadlock.
> 
> I was wondering what the duplicate mappings were for just last week.
> 
> So is this need to play tricks with the p2m to avoid a deadlock the only
> dependency blktap2 has on Xen? IOW if we could find another way around
> the deadlock would a) blktap2 be usable on native and/or b) would all
> the Xen specific bits (grant mappings etc) be confined to blkback only?

[cc Jake. He did most of the mapping code, and is still the one who
knows best what prevents that path from getting simpler.]

Both the xen and native datapaths are presently inlined in the same disk
type. The solution to that would be an ops struct to separate the
handling. But that's certainly not a hard problem.
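Just to illustrate what I mean by the split -- the names and hooks below
are made up, it's only a sketch of hiding the two paths behind an ops
struct, not actual blktap2 code:

struct blktap;                  /* opaque here */
struct request;

/* one instance per disk, picked when the tapdev is set up */
struct blktap_datapath_ops {
        /* make the request's data pages visible to tapdisk */
        int  (*map_request)(struct blktap *tap, struct request *rq);
        /* tear the mapping down again on completion */
        void (*unmap_request)(struct blktap *tap, struct request *rq);
};

/* native path: plain page references, no Xen involved */
extern const struct blktap_datapath_ops blktap_native_ops;

/* xen path: grant mappings etc., only ever selected under blkback */
extern const struct blktap_datapath_ops blktap_grant_ops;

The request code would then only call through the ops pointer, keeping
the grant-table bits confined to the xen variant.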

Apart from that, I believe native was more of a problem than blkback.

Only from memory: consider a non-foreign r/w in dom0. There's going to
be a page lock taken before the request is queued on the tapdev, and a
second lock attempt on the path from tapdisk down to the physical
device, because what userland sends down the native I/O path is passed
off as normal user memory -- while it's really the same page, still
locked from the first submission.

So it's probably rather a tribute to zero-copy than anything else. The
problem might evaporate if the physical I/O were bounced through
anonymous memory instead. That would be one possible alternative.
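Something along these lines on the tapdisk side is what I mean by
bouncing -- just a sketch, assuming a pwrite-style submission to the
physical device; the point is only that the mapped request page itself
never goes back down the native I/O path, so the second lock attempt
never happens:

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* write 'len' bytes from the mapped request page via a private,
 * anonymously allocated bounce buffer */
static ssize_t bounce_pwrite(int fd, const void *mapped_req,
                             size_t len, off_t off)
{
        void *bounce;
        ssize_t ret;

        /* 512-byte alignment in case the fd was opened O_DIRECT */
        if (posix_memalign(&bounce, 512, len))
                return -1;

        memcpy(bounce, mapped_req, len);   /* the copy zero-copy avoids */
        ret = pwrite(fd, bounce, len, off);

        free(bounce);
        return ret;
}

Costs a copy per request, of course, which is exactly what the current
scheme is trying not to pay.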

Note that the blkback path is different, because it goes straight to
the disk queue, not through the filemap. I'd expect that to just work.
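For contrast, going straight for the disk queue looks roughly like the
following with the bio interface of that era -- nothing blkback-specific
here, just the generic pattern, written from memory, so the filemap and
its page locks never enter the picture:

#include <linux/bio.h>
#include <linux/blkdev.h>

static void direct_write_end_io(struct bio *bio, int error)
{
        /* per-request completion bookkeeping would go here */
        bio_put(bio);
}

/* write one page at 'sector' of 'bdev', bypassing the page cache */
static int direct_write_page(struct block_device *bdev,
                             struct page *page, sector_t sector)
{
        struct bio *bio = bio_alloc(GFP_NOIO, 1);

        if (!bio)
                return -ENOMEM;

        bio->bi_bdev   = bdev;
        bio->bi_sector = sector;
        bio->bi_end_io = direct_write_end_io;

        if (bio_add_page(bio, page, PAGE_SIZE, 0) != PAGE_SIZE) {
                bio_put(bio);
                return -EIO;
        }

        submit_bio(WRITE, bio);   /* hands the bio to the disk queue */
        return 0;
}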


> I guess the difference between blktap and e.g. device mapper is that in
> the latter case the requeuing is done in the kernel and in the former the
> page goes via userspace and hence the association with the original I/O
> is lost?

Yep.

I think another difference was that dm nodes only do request
translation, then just pass requests on to the physical layer. So dm
nodes are rather thin compared to a tapdev. But that might not matter
here.
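To give an idea of how thin: a dm target's map hook is essentially just
a remap of bi_bdev/bi_sector plus a DM_MAPIO_REMAPPED back to dm core.
Roughly like dm-linear, with the map signature of that era, again from
memory:

#include <linux/device-mapper.h>

/* per-target context; 'dev' and 'start' would be set up by the
 * target's ctr hook */
struct thin_remap_ctx {
        struct dm_dev *dev;
        sector_t start;
};

static int thin_remap_map(struct dm_target *ti, struct bio *bio,
                          union map_info *map_context)
{
        struct thin_remap_ctx *ctx = ti->private;

        /* pure request translation: point the bio at the backing
         * device, shift the sector, let dm core resubmit it */
        bio->bi_bdev   = ctx->dev->bdev;
        bio->bi_sector = ctx->start + (bio->bi_sector - ti->begin);

        return DM_MAPIO_REMAPPED;
}

Nothing there ever touches the data pages themselves, which is the
sense in which a dm node is thinner than a tapdev.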

Daniel



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

