
[Xen-devel] Problem for modifying block split driver model (blkback and blkfront)

  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Hwandori <hwandori@xxxxxxxxx>
  • Date: Tue, 20 Feb 2007 23:15:16 +0900
  • Delivery-date: Tue, 20 Feb 2007 06:14:34 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Dear experts,

I'm trying to modify the structure of the blkback and blkfront drivers.
In the original design, the blkfront driver grants foreign access to its own pages, and the blkback driver then uses the HYPERVISOR_grant_table_op hypercall to map the pages that will hold the I/O data into its own address space so that DMA can be performed on them. (As you know, this hypercall first checks that the calling domain is permitted to access the foreign domain's pages, and then updates the virtual address mapping of one of the caller's own pages to point to the page of the target domain, i.e. the foreign domain or domU.)
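
The permission check described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name in it (toy_grant, toy_grant_access, toy_map_grant) is hypothetical and is not part of the Xen or Linux API, and it models nothing except the "was this frame granted to the caller, and with what access" decision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_GRANT_ENTRIES 16

/* One slot in a toy grant table owned by the granting domain (the frontend).
 * Hypothetical layout, chosen for readability only. */
struct toy_grant {
    int      in_use;
    uint16_t granted_to;   /* the only domain allowed to map this frame */
    uint32_t frame;        /* frame number being shared */
    int      readonly;     /* nonzero: the peer may read but not write */
};

struct toy_grant_table {
    struct toy_grant slot[TOY_GRANT_ENTRIES];
};

void toy_grant_table_init(struct toy_grant_table *gt)
{
    memset(gt, 0, sizeof(*gt));
}

/* Frontend side: publish a frame for one peer domain.
 * Returns the grant reference (slot index), or -1 if the table is full. */
int toy_grant_access(struct toy_grant_table *gt, uint16_t peer,
                     uint32_t frame, int readonly)
{
    for (int ref = 0; ref < TOY_GRANT_ENTRIES; ref++) {
        if (!gt->slot[ref].in_use) {
            gt->slot[ref] = (struct toy_grant){ 1, peer, frame, readonly };
            return ref;
        }
    }
    return -1;
}

/* Stand-in for the check made when the backend asks to map a grant.
 * Returns 0 and stores the frame on success, -1 if the request is refused. */
int toy_map_grant(const struct toy_grant_table *gt, int ref, uint16_t caller,
                  int want_write, uint32_t *frame_out)
{
    assert(frame_out != NULL);
    if (ref < 0 || ref >= TOY_GRANT_ENTRIES || !gt->slot[ref].in_use)
        return -1;                      /* no such grant */
    if (gt->slot[ref].granted_to != caller)
        return -1;                      /* granted to a different domain */
    if (want_write && gt->slot[ref].readonly)
        return -1;                      /* write requested on a read-only grant */
    *frame_out = gt->slot[ref].frame;
    return 0;
}
```

In this model a frame granted to domain 0 can be mapped by domain 0 but not by any other domain, which is the property that is lost once the grant table is taken out of the path.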

For testing purposes, I'm modifying this structure of the split driver model. In my design, blkback holds a global page cache, in other words dom0 holds all of the page cache, so I removed the grant table mechanism from the original model.
First, the blkfront driver simply places the I/O request on the I/O ring without granting references to its pages. blkback then receives this request, allocates as many pages as the number of segments blkfront requested, and submits the I/O to the generic block layer (through the bio interface) with the pages allocated in dom0 as the target pages. (In the original design, blkback submits the I/O using pending_pages, which actually point to the foreign pages, as the target pages.)
Second, blkback sends blkfront a response that includes the MFNs of the pages allocated in dom0, because blkfront will update its page table mappings to point to those pages.
Finally, blkfront receives the response through the virtual interrupt mechanism, in the blkif_int interrupt handler. At this point blkfront learns the MFNs of the dom0 pages that actually contain the requested I/O data, so it remaps the virtual addresses of the requested page cache pages in domU to the machine addresses of the dom0 pages.
At first I used the grant table mechanism, but the grant table is not intended for this kind of use and does not suit the modified design, so I used the HYPERVISOR_do_update_va_mapping_otherdomain hypercall instead. This hypercall is permitted only for the privileged domain (dom0), but in my design the caller is domU, so I temporarily removed the code that checks for a privileged domain. Also, this hypercall modifies only the virtual-to-machine mapping in the calling domain. So I added code to this hypercall that sets the machine-to-physical mapping, and code to the blkfront driver that sets the physical-to-machine mapping after the hypercall returns.
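
The three mappings mentioned above (virtual-to-machine, physical-to-machine, machine-to-physical) have to agree with each other after a remap. The following is a minimal user-space sketch of that invariant, not Xen code: the structure, the table sizes, and all toy_ names are assumptions made for illustration. It also leaves out things the real system tracks, such as which domain owns a frame and the fact that the real machine-to-physical table is a single machine-wide table rather than a per-guest one.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_GUEST_PAGES    8u
#define TOY_MACHINE_FRAMES 32u
#define TOY_INVALID        0xffffffffu

/* Toy model of one guest's view: one virtual page per guest pfn. */
struct toy_maps {
    uint32_t v2m[TOY_GUEST_PAGES];     /* stands in for the PTE: page -> mfn */
    uint32_t p2m[TOY_GUEST_PAGES];     /* guest pfn -> mfn */
    uint32_t m2p[TOY_MACHINE_FRAMES];  /* mfn -> guest pfn */
};

/* Start with guest pfn p backed by machine frame p + 8. */
void toy_init(struct toy_maps *t)
{
    for (uint32_t m = 0; m < TOY_MACHINE_FRAMES; m++)
        t->m2p[m] = TOY_INVALID;
    for (uint32_t p = 0; p < TOY_GUEST_PAGES; p++) {
        uint32_t mfn = p + 8u;
        t->v2m[p] = mfn;
        t->p2m[p] = mfn;
        t->m2p[mfn] = p;
    }
}

/* Remap that touches only the PTE, as the post describes for the
 * unmodified hypercall. P2M and M2P are left stale. */
void toy_remap_pte_only(struct toy_maps *t, uint32_t pfn, uint32_t new_mfn)
{
    assert(pfn < TOY_GUEST_PAGES && new_mfn < TOY_MACHINE_FRAMES);
    t->v2m[pfn] = new_mfn;
}

/* Remap that updates all three tables together. */
void toy_remap_all(struct toy_maps *t, uint32_t pfn, uint32_t new_mfn)
{
    assert(pfn < TOY_GUEST_PAGES && new_mfn < TOY_MACHINE_FRAMES);
    t->m2p[t->p2m[pfn]] = TOY_INVALID;  /* old frame no longer backs this pfn */
    t->v2m[pfn] = new_mfn;
    t->p2m[pfn] = new_mfn;
    t->m2p[new_mfn] = pfn;
}

/* Returns 1 if every pfn round-trips through P2M and M2P and matches its PTE. */
int toy_consistent(const struct toy_maps *t)
{
    for (uint32_t p = 0; p < TOY_GUEST_PAGES; p++) {
        uint32_t mfn = t->p2m[p];
        if (mfn >= TOY_MACHINE_FRAMES || t->m2p[mfn] != p || t->v2m[p] != mfn)
            return 0;
    }
    return 1;
}
```

A check like toy_consistent only shows that the tables agree with one another. It says nothing about whether the hypervisor would accept the new frame in a page table entry written later by another code path.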

Is there anything I have overlooked in this design and implementation?
In testing, the log shows that the V2M, M2P, and P2M mappings are all updated correctly, but while domU is booting (specifically, while it mounts the root filesystem) a kernel panic occurs. The panic messages are below.

Begin: Loading essential drivers... ...
Begin: Loading blkfront driver(xenblk.ko) ...
Registering block device major 3
blkfront: hda2: barriers enabled
blkfront: hda1: barriers enabled
Begin: Running /scripts/init-premount ...
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...

Unable to handle kernel paging request at virtual address c85008cc
 printing eip:
*pde = ma 01ce2067 pa 0001c067
*pte = ma 23745061 pa 08500061
Oops: 0003 [#1]
Modules linked in: ide_generic processor xenblk
CPU:    0
EIP:    0061:[<c0154b90>]    Not tainted VLI
EFLAGS: 00010282   ( #17)
EIP is at __handle_mm_fault+0x8a0/0xc40
eax: ffffffea   ebx: b7e33000   ecx: 2e7ad067   edx: 00000000
esi: 00000000   edi: 2e7ad067   ebp: c85008cc   esp: c8663efc
ds: 007b   es: 007b   ss: 0069
Process udevd (pid: 1590, threadinfo=c8662000 task=c124f030)
Stack: <0>00000000 00000000 c0158754 00000ea2 000081a4 c71cd7ac c71cd7ac 00000001
       b7e33000 c71cd7ac c03c2ac0 000008cc c03a4b7c c03c2b04 00000000 b7e33000
       00000000 c0159060 00000001 00100073 00000000 00000000 000b7e33 00000000
Call Trace:
 [<c0158754>] arch_get_unmapped_area_topdown+0x64/0x160
 [<c03a4b7c>] profile_setup+0x7c/0x110
 [<c0159060>] do_mmap_pgoff+0x540/0x72d
 [<c011490d>] do_page_fault+0x43d/0x8df
 [<c01144d0>] do_page_fault+0x0/0x8df
 [<c01055cf>] error_code+0x2b/0x30
Code: 48 78 74 10 81 f9 00 45 33 c0 74 08 89 7d 00 e9 67 fc ff ff 31 f6 8b 5c 24 20 89 f9 89 f2 e8 38 c6 fa ff 85 c0 0f 84 50 fc ff ff <89> 7d 00 e9 48 fc ff ff 8b 74 24 30 8b 4c 24 20 8b 54 24 24 8b

BUG: soft lockup detected on CPU#0!

Pid: 1590, comm:                udevd
EIP: 0061:[<c02db58a>] CPU: 0
EIP is at _spin_lock+0xa/0x10
 EFLAGS: 00000286    Not tainted  ( #17)
EAX: c03c2b04 EBX: c03c2ac0 ECX: 00000000 EDX: c03c2ac0
ESI: 00007ff0 EDI: c03c2ac0 EBP: c124f030 DS: 007b ES: 007b
CR0: 8005003b CR2: c85008cc CR3: 0130a000 CR4: 00000640
 [<c01134d8>] mm_unpin+0x18/0x30
 [<c0113578>] _arch_exit_mmap+0x88/0x190
 [<c0144e35>] __do_IRQ+0xc5/0x110
 [<c01574ec>] exit_mmap+0x1c/0x100
 [<c02da057>] cond_resched+0x37/0x50
 [<c011ccc3>] mmput+0x33/0xa0
 [<c012226f>] do_exit+0xcf/0x850
 [<c011007b>] nmi_watchdog_tick+0x7b/0xd0
 [<c0105f3d>] die+0x24d/0x250
 [<c01147de>] do_page_fault+0x30e/0x8df
 [<c01144d0>] do_page_fault+0x0/0x8df
 [<c01055cf>] error_code+0x2b/0x30
 [<c0154b90>] __handle_mm_fault+0x8a0/0xc40
 [<c0158754>] arch_get_unmapped_area_topdown+0x64/0x160
 [<c03a4b7c>] profile_setup+0x7c/0x110
 [<c0159060>] do_mmap_pgoff+0x540/0x72d
 [<c011490d>] do_page_fault+0x43d/0x8df
 [<c01144d0>] do_page_fault+0x0/0x8df
 [<c01055cf>] error_code+0x2b/0x30

The first panic comes from a page fault. According to my debugging log, the PTE reported for the faulting address (ma 23745061) is not one of the PTEs I updated.
The second panic message is related to mmap.
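
The numbers in the first oops can be decoded mechanically. The sketch below uses only the architectural x86 bit layouts (page-fault error code bits and 32-bit page table entry flags) and assumes 2-level, non-PAE paging, which the 8-digit *pde and *pte values suggest but do not prove. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Page-fault error code bits pushed by the CPU ("Oops: 0003"). */
#define PF_PRESENT 0x1u   /* fault on a present page (protection violation) */
#define PF_WRITE   0x2u   /* the access was a write */
#define PF_USER    0x4u   /* the access came from user mode */

/* Flag bits in the low 12 bits of a 32-bit PTE. */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_USER     0x004u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

struct va_parts {
    uint32_t pde_index;   /* top 10 bits */
    uint32_t pte_index;   /* middle 10 bits */
    uint32_t offset;      /* low 12 bits */
};

/* Split a 32-bit virtual address for 2-level (non-PAE) paging. */
struct va_parts split_va(uint32_t va)
{
    struct va_parts p = { va >> 22, (va >> 12) & 0x3ffu, va & 0xfffu };
    return p;
}

uint32_t pte_flags(uint32_t pte) { return pte & 0xfffu; }

/* Frame number held in a PTE; only meaningful for a present entry. */
uint32_t pte_frame(uint32_t pte)
{
    assert(pte & PTE_PRESENT);
    return pte >> 12;
}

/* Byte offset of the PTE for 'va' inside its page-table page (4-byte entries). */
uint32_t pte_slot_offset(uint32_t va)
{
    return split_va(va).pte_index * 4u;
}

/* Reinterpret a register value as a signed 32-bit number. */
int32_t as_signed(uint32_t reg) { return (int32_t)reg; }
```

Applied to the log: error code 0003 has PF_PRESENT and PF_WRITE set and PF_USER clear, so this was a kernel-mode write to a page that is mapped. The PTE value 23745061 has flags 0x061 (present, accessed, dirty) with PTE_RW clear, so the page at c8500000 is mapped read-only. pte_slot_offset(0xb7e33000) is 0x8cc, the same as the page offset of the faulting address c85008cc, and b7e33000 is the value in ebx. The faulting instruction marked in the Code line, 89 7d 00, stores edi (2e7ad067) to the address in ebp. One reading, offered as a guess rather than a diagnosis, is that the kernel was writing a new page table entry for user address b7e33000 directly into a page-table page that is mapped read-only. as_signed(0xffffffea) is -22, which is -EINVAL on Linux, and that would be consistent with a hypercall having failed just before the store, though the log alone does not show which call returned it.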

Both panics involve udev, and I don't know what causes them. (When I implemented this with grant_table_op earlier, the panic occurred in the usplash_write process and was also related to an mmap operation, so I suspect the panic is related to either the mmap operation or the udev process.)
Please help me.
