
Re: [Xen-devel] PATastic fun



Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 22, 2013 at 02:54:16PM +0100, Stefan Bader wrote:
>> Hi Konrad,
> 
> Hey Stefan,
>> 
>> here is another one from the hm-what? department:
> 
> Heh - the really good-bug-hunting one. Let's also include Jinsong as
> he has been tracking a similar one with mcelog.
>> 
>> Colin discovered that running the attached program with the fork
>> active (e.g. "./mmap-example -f 0x10000"; the address can be that
>> or an iomem address) triggers the following weird messages:
>> 
>> [ 6824.453724] mmap-example:3481 map pfn expected mapping type write-back for [mem 0x00010000-0x00010fff], got uncached-minus
>> [ 6824.453776] ------------[ cut here ]------------
>> [ 6824.453796] WARNING: at /build/buildd/linux-3.8.0/arch/x86/mm/pat.c:774 untrack_pfn+0xb8/0xd0()
>> ...
>> [ 6824.453920] Pid: 3481, comm: mmap-example Tainted: GF 3.8.0-6-generic #13-Ubuntu
>> [ 6824.453926] Call Trace:
>> [ 6824.453944]  [<ffffffff8105879f>] warn_slowpath_common+0x7f/0xc0
>> [ 6824.453954]  [<ffffffff810587fa>] warn_slowpath_null+0x1a/0x20
>> [ 6824.453963]  [<ffffffff8104bcc8>] untrack_pfn+0xb8/0xd0
>> [ 6824.453975]  [<ffffffff81156c1c>] unmap_single_vma+0xac/0x100
>> [ 6824.453985]  [<ffffffff81157459>] unmap_vmas+0x49/0x90
>> [ 6824.453995]  [<ffffffff8115f808>] exit_mmap+0x98/0x170
>> [ 6824.454007]  [<ffffffff810559a4>] mmput+0x64/0x100
>> [ 6824.454017]  [<ffffffff810560f5>] dup_mm+0x445/0x660
>> [ 6824.454027]  [<ffffffff81056d9f>] copy_process.part.22+0xa5f/0x1510
>> [ 6824.454038]  [<ffffffff81057931>] do_fork+0x91/0x350
>> [ 6824.454048]  [<ffffffff81057c76>] sys_clone+0x16/0x20
>> [ 6824.454060]  [<ffffffff816ccbf9>] stub_clone+0x69/0x90
>> [ 6824.454069]  [<ffffffff816cc89d>] ? system_call_fastpath+0x1a/0x1f
>> [ 6824.454076] ---[ end trace 4918cdd0a4c9fea4 ]---
>> 
>> I found that this is related to your bandaid patch
>> 
>> commit 8eaffa67b43e99ae581622c5133e20b0f48bcef1
>> Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>> Date:   Fri Feb 10 09:16:27 2012 -0500
>> 
>>     xen/pat: Disable PAT support for now.
>> 
>> I just do not understand how this happens. From the trace it seems
>> the fork fails when duplicating the VMAs (dup_mm calls mmput on
>> failure). So maybe the warning is just related to this. So primarily
>> the question is: how can the _PAGE_PCD bit become set on fork? That
>> and _PAGE_PWT are cleared from the supported mask by the patch, so
>> somehow I would think nothing should be able to set it...
>> But apparently not so.
>> Not sure it is a big deal, since I never saw this in normal operation
>> and it seems to be OK when unmapping before doing the fork. It is
>> just plain odd.
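To make the question above concrete, here is a minimal user-space sketch of how a "supported bits" mask filters cache attributes out of a page table entry. The bit positions of PWT (bit 3) and PCD (bit 4) follow the x86 PTE layout for 4K pages; the names supported_pte_mask, disable_pat_bits, make_pte and pte_cache_bits are hypothetical stand-ins chosen for illustration and are not the kernel's code. The sketch models only the filtering step and makes no claim about where the mismatch in the trace actually originates.

```c
#include <assert.h>
#include <stdint.h>

/* x86 PTE cache-control bits (4K pages). */
#define PAGE_PWT        (1ULL << 3)     /* page write-through */
#define PAGE_PCD        (1ULL << 4)     /* page cache disable */
#define PAGE_CACHE_MASK (PAGE_PCD | PAGE_PWT)

/* Hypothetical stand-in for the mask of PTE bits the platform accepts. */
static uint64_t supported_pte_mask = ~0ULL;

/* Models the effect described for the bandaid patch: drop PWT and PCD. */
static void disable_pat_bits(void)
{
        supported_pte_mask &= ~(PAGE_PWT | PAGE_PCD);
}

/* Build an entry from a frame number and protection bits, filtered by the mask. */
static uint64_t make_pte(uint64_t pfn, uint64_t prot)
{
        /* Sketch assumption: the frame number fits the 40-bit address field. */
        assert(pfn < (1ULL << 40));
        return (pfn << 12) | (prot & supported_pte_mask);
}

/* Extract the cache-control bits that actually ended up in the entry. */
static uint64_t pte_cache_bits(uint64_t pte)
{
        return pte & PAGE_CACHE_MASK;
}
```

In this model, once disable_pat_bits() has run, a request with PCD set yields an entry whose cache bits are both zero, so any code that later derives the cache type from the entry alone sees something different from what was requested.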
> 
> Jinsong mentioned that there is some oddity with the MTRR. Somehow the
> ranges are swapped or not correct. Jinsong, could you shed some light
> on what you have found so far?
> 

Yes, Sander also once reported a similar weird warning when starting the mcelog
daemon, as attached.

Basically, it occurs when the mcelog user daemon starts:
do_fork
  --> copy_process
    --> dup_mm
      --> dup_mmap
        --> copy_page_range
          --> track_pfn_copy
            --> reserve_pfn_range
              --> line 624: flags != want_flags
It comes from the page table and the MTRR reporting different memory types
(_PAGE_CACHE_WB and _PAGE_CACHE_UC_MINUS, respectively).

However, why the page table and the MTRR yield different memory types is still
unclear; the bug is difficult to reproduce and does not trigger reliably.
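
The comparison at the end of the call chain above can be sketched in user space as follows. This is a simplified model, not the kernel function: the enum values, memtype_name and check_copied_memtype are hypothetical names, and the "strict" flag is an assumption standing in for whatever policy the fork path applies. It only shows how a type derived from the page table ("want") that differs from the type recorded for the physical range ("got") turns into an error that makes the copy fail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Cache types named in the warning; the numeric values are arbitrary here. */
enum memtype { MEMTYPE_WB, MEMTYPE_WC, MEMTYPE_UC_MINUS, MEMTYPE_UC };

static const char *memtype_name(enum memtype t)
{
        switch (t) {
        case MEMTYPE_WB:       return "write-back";
        case MEMTYPE_WC:       return "write-combining";
        case MEMTYPE_UC_MINUS: return "uncached-minus";
        case MEMTYPE_UC:       return "uncached";
        }
        return "unknown";
}

/*
 * Hypothetical model of the check on the fork path: 'want' is the type
 * derived from the parent's page table entry, 'got' is the type already
 * recorded for the physical range. With strict checking, a mismatch is
 * reported and the copy fails.
 */
static int check_copied_memtype(enum memtype want, enum memtype got, bool strict)
{
        if (want == got)
                return 0;
        if (strict) {
                fprintf(stderr, "map pfn expected mapping type %s, got %s\n",
                        memtype_name(want), memtype_name(got));
                return -EINVAL;
        }
        return 0;
}
```

Calling check_copied_memtype(MEMTYPE_WB, MEMTYPE_UC_MINUS, true) in this model prints a message of the same shape as the one in the logs and returns -EINVAL, which matches the observation that dup_mm bails out and tears the new mm down again.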

Thanks,
Jinsong

>> 
>> -Stefan
> 
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <stdint.h>
>> #include <stdbool.h>
>> #include <unistd.h>
>> #include <sys/mman.h>
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <sys/wait.h>
>> #include <fcntl.h>
>> 
>> int main(int argc, char **argv)
>> {
>>      uint8_t *data;
>>      int fd;
>>      unsigned long long offset;
>>      pid_t pid;
>>      int status;
>>      int opt;
>>      bool opt_fork = false;
>> 
>>      while ((opt = getopt(argc, argv, "f")) != -1) {
>>              switch (opt) {
>>              case 'f':
>>                      opt_fork = true;
>>                      break;
>>              }
>>      }
>> 
>>      if (argc <= optind) {
>>              fprintf(stderr, "%s: [-f] address\n", argv[0]);
>>              fprintf(stderr, "\t-f specifies if we should fork with the mmap\n");
>>              exit(EXIT_FAILURE);
>>      }
>>      if (sscanf(argv[optind], "%lli", &offset) != 1) {
>>              fprintf(stderr, "Cannot determine mmap address from %s\n", argv[optind]);
>>              exit(EXIT_FAILURE);
>>      }
>> 
>>      if ((fd = open("/dev/mem", O_RDONLY)) < 0) {
>>              fprintf(stderr, "Cannot open /dev/mem\n");
>>              exit(EXIT_FAILURE);
>>      }
>> 
>>      printf("mmap: 0x%llx..0x%llx\n", offset, offset + 4095);
>> 
>>      if ((data = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd,
>>              (off_t)offset)) == MAP_FAILED) {
>>              fprintf(stderr, "Cannot mmap 0x%llx\n", offset);
>>              exit(EXIT_FAILURE);
>>      }
>> 
>>      close(fd);
>> 
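>>      if (opt_fork) {
>>              pid = fork();
>>              if (pid < 0) {
>>                      /* fork failed, e.g. while duplicating the mapping */
>>                      perror("fork");
>>              } else if (pid == 0) {
>>                      /* child */
>>                      _exit(0);
>>              } else {
>>                      /* parent */
>>                      waitpid(pid, &status, 0);
>>              }
>>      }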
>> 
>>      if (munmap(data, 4096) < 0) {
>>              fprintf(stderr, "Cannot munmap %p\n", data);
>>              exit(EXIT_FAILURE);
>>      }
>>      exit(EXIT_SUCCESS);
>> }
>> 
> 
> 
> 
> 
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxx
>> http://lists.xen.org/xen-devel

--- Begin Message ---
Saturday, November 17, 2012, 3:14:10 PM, you wrote:

> Konrad Rzeszutek Wilk wrote:
>> On Fri, Nov 16, 2012 at 05:47:54PM +0100, Sander Eikelenboom wrote:
>>> 
>>> Friday, November 16, 2012, 5:07:33 PM, you wrote:
>>> 
>>>> On Fri, Nov 16, 2012 at 01:40:56PM +0100, Sander Eikelenboom wrote:
>>>>> Hi Konrad,
>>>>> 
>>>>> Some time ago I reported this one at boot up:
>>>>> 
>>>>> [ 3009.778974] mcelog:16842 map pfn expected mapping type write-back for [mem 0x0009f000-0x000a0fff], got uncached-minus
>>>>> [ 3009.788570] ------------[ cut here ]------------
>>>>> [ 3009.798175] WARNING: at arch/x86/mm/pat.c:774 untrack_pfn+0xa1/0xb0()
>>>>> [ 3009.807966] Hardware name: MS-7640
>>>>> [ 3009.817677] Modules linked in:
>>>>> [ 3009.827524] Pid: 16842, comm: mcelog Tainted: G        W    3.7.0-rc5-20121116-reverted-persistent-warn-patwarn #1
>>>>> [ 3009.837415] Call Trace:
>>>>> [ 3009.847110]  [<ffffffff810674fa>] warn_slowpath_common+0x7a/0xb0
>>>>> [ 3009.856857]  [<ffffffff81067545>] warn_slowpath_null+0x15/0x20
>>>>> [ 3009.866562]  [<ffffffff81042041>] untrack_pfn+0xa1/0xb0
>>>>> [ 3009.876201]  [<ffffffff8111a59b>] unmap_single_vma+0x86b/0x8e0
>>>>> [ 3009.885895]  [<ffffffff81100f16>] ? release_pages+0x196/0x1f0
>>>>> [ 3009.895488]  [<ffffffff8111a65c>] unmap_vmas+0x4c/0xa0
>>>>> [ 3009.905134]  [<ffffffff8111c8fa>] exit_mmap+0x9a/0x180
>>>>> [ 3009.914706]  [<ffffffff81064e72>] mmput+0x52/0xd0
>>>>> [ 3009.924252]  [<ffffffff810652b7>] dup_mm+0x3c7/0x510
>>>>> [ 3009.933839]  [<ffffffff81065fd5>] copy_process+0xac5/0x14a0
>>>>> [ 3009.943430]  [<ffffffff81066af3>] do_fork+0x53/0x360
>>>>> [ 3009.952843]  [<ffffffff810b25c7>] ? lock_release+0x117/0x250
>>>>> [ 3009.962283]  [<ffffffff817d26c0>] ? _raw_spin_unlock+0x30/0x60
>>>>> [ 3009.971532]  [<ffffffff817d3495>] ? sysret_check+0x22/0x5d
>>>>> [ 3009.980820]  [<ffffffff81017523>] sys_clone+0x23/0x30
>>>>> [ 3009.990046]  [<ffffffff817d37f3>] stub_clone+0x13/0x20
>>>>> [ 3009.999335]  [<ffffffff817d3469>] ? system_call_fastpath+0x16/0x1b
>>>>> [ 3010.008667] ---[ end trace 2d9694c2c0a24da8 ]---
>>>>> 
>>>>> 
>>>>> It seems to be due to the "mcelog" userspace tool provided with
>>>>> Debian Squeeze (mcelog 1.0~pre3-3  x86-64 Machine Check Exceptions
>>>>> collector and decoder). I can trigger this warning easily by
>>>>> restarting the mcelog tool with /etc/init.d/mcelog restart  
>>>>> 
>>>>> Should that one also function with the Xen mcelog driver, or is a
>>>>> newer version required?
>>> 
>>>> The reason we get this is b/c I had to disable the PAT functionality
>>>> in the Linux kernel for Xen. This is b/c it only worked one way -
>>>> meaning you could convert a page from WriteBack to WriteCombine or
>>>> WriteBack to Uncached. But you could not do WriteCombine back to
>>>> WriteBack - because one of the functions that changes the bits was
>>>> using an "unfiltered" way to identify the bits on the page.
>>> 
>>>> Anyhow, we had a disaster b/c some of these pages that used to be
>>>> WriteBack (WB) got converted to WriteCombine (WC) and then were
>>>> returned as such to the page pool. And if they were re-used by a
>>>> filesystem, we invariably got corruptions.
>>> 
>>>> So until the PAT table lookup thing that H. Peter Anvin suggested
>>>> gets implemented this splat gotta show up :-(
>>> 
>>> Not a big problem for me, I was just wondering :-)
>>> I'm more interested in the netfront troubles, since it's already rc5.
>>> 
>>>> Does mcelog still work even with this warning?
>>> 
>>> Not the daemon:
>>> 
>>> serveerstertje:~# sh /etc/init.d/mcelog start
>>> Starting Machine Check Exceptions decoder: daemon: Cannot allocate
>>> memory 
>>> 
>> Ugh.
>> CC-ing Liu here.
>> 

> How do I reproduce it (sorry, I missed the history of this thread)?
> I gave it a try on the Xen side, w/ kernel 3.6.0-rc7+, and saw no problem on my side.

I'm using:
- Xen-unstable, latest changeset=26152
- Linux 3.7.0-rc5 kernel, latest changeset=79e979eae0df58831e85281e3285f63663f3cf76
- Dom0 OS = Debian Squeeze
- mcelog package: 1.0~pre3-3 (x86-64 Machine Check Exceptions collector and decoder)

I don't have a particular interest in mcelog; I just encountered this warning on
boot.
The warning (see above) is shown when the mcelog daemon (from the Debian package)
tries to start (for example on boot).

I have attached the .config from the kernel I used.

If you need any more info, please ask!

--
Sander


> Thanks,
> Jinsong

>>> 
>>>>> 
>>>>> --
>>>>> Sander

Attachment: dotconfig
Description: dotconfig


--- End Message ---
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

