[Xen-devel] Re: Weird Issue with raid 5+0
I'm forwarding this to xen-devel because it appears to be a bug in the
dom0 kernel. I recently experienced a strange issue with software
raid1+0 under Xen on a new machine. I was getting corruption in my
guest volumes and tons of kernel messages such as:

[305044.571962] raid0_make_request bug: can't convert block across
chunks or bigger than 64k 14147455 4

The full thread is located at
http://marc.info/?t=126672694700001&r=1&w=2

Detailed output is at http://pastebin.com/f6a52db74

After speaking with the linux-raid mailing list, it appears that this
is due to a bug which has been fixed, but the fix is not included in
the dom0 kernel. I'm not sure what sources kernel 2.6.26-2-xen-amd64
is based on, but since xenlinux is still at 2.6.18 I was assuming that
this bug would still exist.

My questions for xen-devel are: Is there any dom0 kernel where this
issue is fixed? Is there anything I can do to help get this resolved?
Testing? Patching?

- chris

On Mon, Mar 8, 2010 at 12:50 AM, Neil Brown <neilb@xxxxxxx> wrote:
> On Sun, 21 Feb 2010 19:16:40 +1100
> Neil Brown <neilb@xxxxxxx> wrote:
>
>> On Sun, 21 Feb 2010 02:26:42 -0500
>> chris <tknchris@xxxxxxxxx> wrote:
>>
>> > That is exactly what I didn't want to hear :( I am running
>> > 2.6.26-2-xen-amd64. Are you sure it's a kernel problem and nothing
>> > to do with my chunk/block sizes? If this is a bug, what versions
>> > are affected? I'll build a new domU kernel and see if I can get it
>> > working there.
>> >
>> > - chris
>>
>> I'm absolutely sure it is a kernel bug.
>
> And I think I now know what the bug is.
>
> A patch was recently posted to dm-devel which I think addresses
> exactly this problem.
>
> I reproduce it below.
>
> NeilBrown
>
> -------------------
> If the lower device exposes a merge_bvec_fn,
> dm_set_device_limits() restricts max_sectors
> to PAGE_SIZE "just to be safe".
>
> This is not sufficient, however.
>
> If someone uses bio_add_page() to add 8 disjunct 512 byte partial
> pages to a bio, it would succeed, but could still cross a border
> of whatever restrictions are below us (e.g. raid10 stripe boundary).
> An attempted bio_split() would not succeed, because bi_vcnt is 8.
>
> One example that triggered this frequently is the xen io layer.
>
> raid10_make_request bug: can't convert block across chunks or bigger
> than 64k 209265151 1
>
> Signed-off-by: Lars <lars.ellenberg@xxxxxxxxxx>
>
> ---
>  drivers/md/dm-table.c |   12 ++++++++++--
>  1 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 4b22feb..c686ff4 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -515,14 +515,22 @@ int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
>
>  	/*
>  	 * Check if merge fn is supported.
> -	 * If not we'll force DM to use PAGE_SIZE or
> +	 * If not we'll force DM to use single bio_vec of PAGE_SIZE or
>  	 * smaller I/O, just to be safe.
>  	 */
>
> -	if (q->merge_bvec_fn && !ti->type->merge)
> +	if (q->merge_bvec_fn && !ti->type->merge) {
>  		limits->max_sectors =
>  			min_not_zero(limits->max_sectors,
>  				     (unsigned int) (PAGE_SIZE >> 9));
> +		/* Restricting max_sectors is not enough.
> +		 * If someone uses bio_add_page to add 8 disjunct 512 byte
> +		 * partial pages to a bio, it would succeed,
> +		 * but could still cross a border of whatever restrictions
> +		 * are below us (e.g. raid0 stripe boundary). An attempted
> +		 * bio_split() would not succeed, because bi_vcnt is 8.
> +		 */
> +		limits->max_segments = 1;
> +	}
>
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(dm_set_device_limits);
> --
> 1.6.3.3
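
To make Neil's explanation concrete: in the 2.6.x md/raid0 driver, a
bio that crosses a chunk boundary can only be split if it carries a
single bio_vec, so an 8-segment bio built by bio_add_page() falls
through to the error path and prints exactly the message seen above.
Below is a minimal standalone model of that check, not the kernel
source: toy_bio, submit() and the constants are illustrative stand-ins
for struct bio and raid0_make_request() in drivers/md/raid0.c.

/* Toy model: why a multi-segment bio trips the raid0 boundary check.
 * A sector is 512 bytes; chunk_sects is the chunk size in sectors
 * and is a power of two.
 */
#include <stdio.h>

struct toy_bio {
	unsigned long long sector;  /* start sector (bi_sector) */
	unsigned int bytes;         /* total size in bytes (bi_size) */
	int segments;               /* number of bio_vecs (bi_vcnt) */
};

static void submit(const struct toy_bio *b, unsigned int chunk_sects)
{
	unsigned int offset = (unsigned int)(b->sector & (chunk_sects - 1));

	if (offset + (b->bytes >> 9) <= chunk_sects) {
		printf("ok: fits inside one chunk\n");
		return;
	}
	/* The bio crosses a chunk boundary.  raid0 can only split a
	 * bio that carries a single segment (the old bio_split()
	 * insisted on bi_vcnt == 1), so multi-segment bios bail out.
	 */
	if (b->segments == 1) {
		printf("ok: split at sector %llu\n",
		       b->sector + (chunk_sects - offset));
		return;
	}
	printf("raid0_make_request bug: can't convert block across chunks "
	       "or bigger than %uk %llu %u\n",
	       chunk_sects / 2, b->sector, b->bytes >> 10);
}

int main(void)
{
	unsigned int chunk_sects = 128;  /* 64 KiB chunks */

	/* Eight 512-byte segments straddling a chunk boundary, as in
	 * the commit message above: bio_add_page() accepts them, but
	 * the boundary check cannot split the result.  This reproduces
	 * the log line from the original report.
	 */
	struct toy_bio bad = { 14147455ULL, 8 * 512, 8 };
	submit(&bad, chunk_sects);

	/* With the fix (limits->max_segments = 1), device-mapper only
	 * passes down single-segment bios, which raid0 can split.
	 */
	struct toy_bio good = { 14147455ULL, 8 * 512, 1 };
	submit(&good, chunk_sects);
	return 0;
}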