
Re: [Xen-devel] Fwd: NetBSD xl core-dump not working... Memory fault (core dumped)



On Fri, 2013-11-08 at 09:20 -0800, John Nemeth wrote:
> On Nov 8, 10:29am, Ian Campbell wrote:
> } On Thu, 2013-11-07 at 21:04 +0000, Miguel C. wrote:
> } > yes its 4.2 from pkgsrc.
> } 
> } Thanks, that might be enough.
> 
>      More specifically, it's 4.2.3.

Thanks. This seems to confirm that it is the memcpy I pointed to below.

I'm afraid that any further progress here is going to require input from
you on the other questions I asked, and perhaps from someone who
understands how the NetBSD kernel (in particular the privcmd driver)
operates.

Ian.

> 
> } >  how can i get the changeset id?
> } 
> } that'd be one for the port-xen folks I think. It might be printed in the
> } xen dmesg, but that depends on how it was built and 4.2 may be too old
> } to have such functionality.
> 
>      xl dmesg says:
> 
> (XEN) Latest ChangeSet: unavailable
> 
> The package was built using this tarball:
> 
> http://bits.xensource.com/oss-xen/release/4.2.3/xen-4.2.3.tar.gz
> 
> And, just for reference, this is the info we have on the tarball:
> 
> SHA1 (xen-4.2.3.tar.gz) = 7c72e1aa870cc938afdc50bd9f2d879118aa8b99
> RMD160 (xen-4.2.3.tar.gz) = da0fbb7bbc0796bd83c223f7d21015ce0d4c8553
> Size (xen-4.2.3.tar.gz) = 15613235 bytes
> 
> } > Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> } > >On Mon, 2013-11-04 at 22:13 +0000, Mike C. wrote:
> } > >> On 31.10.13 04:34, Miguel Clara wrote:
> } > >> 
> } > >> > I was trying to get a core-dump for a domU with xl and got this error:
> } > >> >
> } > >> > # xl dump-core 20 test.core
> } > >> > Memory fault
> } > >> >
> } > >> > GDB shows this:
> } > >> >
> } > >> > a# gdb xl xl.core
> } > >> > GNU gdb (GDB) 7.3.1
> } > >> > Copyright (C) 2011 Free Software Foundation, Inc.
> } > >> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> } > >> > This is free software: you are free to change and redistribute it.
> } > >> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> } > >> > and "show warranty" for details.
> } > >> > This GDB was configured as "x86_64--netbsd".
> } > >> > For bug reporting instructions, please see:
> } > >> > <http://www.gnu.org/software/gdb/bugs/>...
> } > >> > Reading symbols from /usr/sbin/xl...done.
> } > >> > [New process 1]
> } > >> > Core was generated by `xl'.
> } > >> > Program terminated with signal 11, Segmentation fault.
> } > >> > #0  0x00007f7ff7007b45 in xc_domain_dumpcore_via_callback
> } > >> > (xch=0x7f7ff7b0d800, domid=20, args=0x7f7fffffdae0,
> } > >> > dump_rtn=0x7f7ff700632c<local_file_dump>)
> } > >> >      at xc_core.c:860
> } > >
> } 
> } In 4.2.0 this corresponds to
> }  memcpy(dump_mem, vaddr, PAGE_SIZE);
> } which is a plausible source of a segfault.
> } 
> } xc_core.c has only changed in immaterial ways (although ways which
> } caused all the line numbers to shift) since 4.2.0 AFAICT so it is likely
> } that this bug is still present.
> } 
> } Can you tell via gdb what the faulting address was and whether it
> } corresponds to dump_mem or vaddr? gdb's "info locals" might give you at
> } least some of that? Also you can use disas to identify the precise
> } instruction at 0x00007f7ff7007b45, which will show you the registers
> } which might lead you to the faulting address.
> } 
> } vaddr is certainly not NULL, it's checked right before. It could be
> } non-NULL and still invalid if xc_map_foreign_range were buggy on NetBSD,
> } but that is surely used elsewhere? I suppose it might have mapped an MFN
> } which was invalid (or became invalid, but your bug is deterministic,
> } right?). IIRC NetBSD's privcmd foreign mappings are
> } populated lazily and not immediately like on Linux? If that were the
> } case (and I'm only vaguely aware of how NetBSD operates) then it would
> } be plausible that xc_map_foreign_range would succeed but that a
> } subsequent attempt to access the region would fault?
> } 
> } dump_mem isn't NULL, it's a pointer into the dump_mem_start array which
> } has a check for failure when it is allocated. Since dump_mem is just
> } normal process memory and vaddr is a magic foreign mapping I'd be
> } inclined to suspect vaddr was not right in some way...
> } 
> } Does "xl -vvv dump-core" give any useful additional logging?
> } 
> } Unfortunately I don't think anyone has done valgrind support for
> } debugging processes which use Xen hypercalls on *BSD. If you were very
> } keen you could probably follow what was done for Linux:
> } 
> } http://blog.xen.org/index.php/2013/01/18/using-valgrind-to-debug-xen-toolstacks/
> } 
> } and wire up the BSD privcmd ioctl to the generic Xen hypercall code I
> } added.
> } 
> } I fear this bug is going to take someone on the ground with a NetBSD
> } system and the ability to dive into BSD kernel internals to get to the
> } bottom of...
> } 
> } Ian.
> } 
> }-- End of excerpt from Ian Campbell
> 
> 
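
For illustration only, here is a minimal user-space sketch of the failure
mode hypothesised in the quoted text: a mapping call succeeds and returns a
non-NULL pointer, yet the first access through it faults. It does not use
privcmd or xc_map_foreign_range; a PROT_NONE anonymous mmap stands in for a
foreign mapping that was never populated, and that substitution is an
assumption. The names guarded_copy, in_range and probe_mapping are
hypothetical helpers, not libxc functions. The signal handler records
si_addr, which is the same information gdb would give for the faulting
address, and the code then checks whether it falls in the source (vaddr) or
the destination (dump_mem) buffer.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* glibc: expose MAP_ANON and sigsetjmp */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jmp;      /* resume point after a fault */
static void *volatile fault_addr; /* faulting address reported by the kernel */

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    fault_addr = info->si_addr;
    siglongjmp(fault_jmp, 1);
}

/* Hypothetical helper: copy len bytes, return 0 on success, -1 on a fault.
 * A volatile byte loop stands in for memcpy so the reads are not elided. */
int guarded_copy(unsigned char *dst, const void *src, size_t len)
{
    const volatile unsigned char *s = src;
    struct sigaction sa, old_segv, old_bus;
    volatile int faulted = 0;

    assert(dst != NULL && src != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    fault_addr = NULL;
    if (sigsetjmp(fault_jmp, 1) == 0) {
        for (size_t i = 0; i < len; i++)
            dst[i] = s[i];
    } else {
        faulted = 1; /* reached via siglongjmp from on_fault */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted ? -1 : 0;
}

/* Hypothetical helper: is addr inside [base, base + len)? */
int in_range(const void *addr, const void *base, size_t len)
{
    uintptr_t a = (uintptr_t)addr, b = (uintptr_t)base;
    return a >= b && a - b < len;
}

/* Returns 1 if the copy faulted on the source mapping, 2 if on the
 * destination, 0 if the copy succeeded, -1 on setup failure or otherwise. */
int probe_mapping(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *dump_mem = malloc(page); /* ordinary process memory */
    void *vaddr;
    int result = -1;

    if (dump_mem == NULL)
        return -1;
    /* Stand-in for an unpopulated foreign mapping: the call succeeds,
     * but no access is permitted. */
    vaddr = mmap(NULL, page, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (vaddr == MAP_FAILED) {
        free(dump_mem);
        return -1;
    }

    if (guarded_copy(dump_mem, vaddr, page) == 0)
        result = 0;
    else if (in_range(fault_addr, vaddr, page))
        result = 1;
    else if (in_range(fault_addr, dump_mem, page))
        result = 2;

    munmap(vaddr, page);
    free(dump_mem);
    return result;
}
```

Calling probe_mapping() from a small main() would be expected to report a
fault on the source side, which is the pattern the quoted analysis suspects
for vaddr. Whether NetBSD's privcmd mappings actually behave this way is
exactly the open question in this thread; the sketch only shows that a
successful map call and a non-NULL pointer do not by themselves rule it out.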



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel