
Re: ocplib+endian improvement



Hm, but trying out TCP seems to trigger a soft lockup in Xen netback (on both
kernel 3.2 and the latest 3.7). Will need to do some more debugging tomorrow...

Steven (Smith): any luck with the grant-free netback modification?  I could try 
it out at the same time as debugging this particular issue.

-anil

[  277.249069] Code: 00 89 c1 7c c8 41 59 c3 90 90 90 65 c6 04 25 41 b1 00 00 
00 65 f6 04 25 40 b1 00 00 ff 74 05 e8 47 00 00 00 c3 66 0f 1f 44 00 00 <65> c6 
04 25 41 b1 00 00 01 c3 66 0f 1f 44 00 00 65 f6 04 25 41 
[  305.248782] BUG: soft lockup - CPU#0 stuck for 23s! [netback/0:3772]
[  305.248883] Modules linked in: xt_physdev iptable_filter ip_tables x_tables 
xen_netback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss nfs_acl 
nfs lockd dns_resolver fscache sunrpc bridge stp llc loop crc32c_intel 
ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd xts lrw gf128mul 
snd_pcm sp5100_tco snd_page_alloc snd_timer snd soundcore tpm_tis tpm 
amd64_edac_mod edac_mce_amd i2c_piix4 i2c_core dcdbas evdev pcspkr tpm_bios 
edac_core microcode psmouse fam15h_power k10temp serio_raw acpi_power_meter 
button processor thermal_sys ext4 crc16 jbd2 mbcache dm_mod sg sd_mod 
crc_t10dif ata_generic ohci_hcd pata_atiixp ixgbe ahci ptp libahci pps_core 
libata ehci_hcd dca mdio scsi_mod bnx2 usbcore usb_common
[  305.248932] CPU 0 
[  305.248934] Pid: 3772, comm: netback/0 Tainted: G        W    
3.7-trunk-amd64 #1 Debian 3.7.1-1~experimental.2 Dell Inc. PowerEdge R415/08WNM9
[  305.248936] RIP: e030:[<ffffffffa021c153>]  [<ffffffffa021c153>] 
xen_netbk_tx_build_gops+0x19d/0x7ad [xen_netback]
[  305.248941] RSP: e02b:ffff88020eba3ca8  EFLAGS: 00000217
[  305.248943] RAX: 0000000073202626 RBX: ffffc90007ab5000 RCX: 000000001ce71e08
[  305.248945] RDX: 000000001ce71e08 RSI: ffffc90007ab02a8 RDI: ffff880653403800
[  305.248946] RBP: ffffc90007ab70c0 R08: ffff8806534038d8 R09: ffff88020eba3c74
[  305.248947] R10: ffffc90007ab0208 R11: ffffc90007ab0208 R12: ffff880653403800
[  305.248949] R13: 0000000000007320 R14: 0000000000000000 R15: 000000000104210e
[  305.248953] FS:  00007f59123a4700(0000) GS:ffff8807ff800000(0000) 
knlGS:0000000000000000
[  305.248955] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  305.248956] CR2: 00007fd7c822b070 CR3: 00000007f2177000 CR4: 0000000000000660
[  305.248958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  305.248960] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  305.248961] Process netback/0 (pid: 3772, threadinfo ffff88020eba2000, task 
ffff8807ee921180)
[  305.248963] Stack:
[  305.248964]  ffffffff81004067 ffffffff81004202 1ce71e0800000003 
ffff8807ee921180
[  305.248968]  ffffc90007ab70c0 ffff8807ff811740 0000000000000000 
2074706fffff2e2f
[  305.248972]  ffff4f2073202626 2074706f20646c72 ffffffff810037f7 
ffff8807ee921180
[  305.248975] Call Trace:
[  305.248978]  [<ffffffff81004067>] ? arch_local_irq_restore+0x7/0x8
[  305.248982]  [<ffffffff81004202>] ? xen_mc_flush+0x11d/0x160
[  305.248985]  [<ffffffff810037f7>] ? xen_mc_issue.constprop.22+0x10/0x4d
[  305.248988]  [<ffffffff8100d02f>] ? load_TLS+0x7/0xa
[  305.248991]  [<ffffffff8100d60c>] ? __switch_to+0x195/0x3f8
[  305.248994]  [<ffffffff8105fadb>] ? mmdrop+0xd/0x1c
[  305.248996]  [<ffffffff81061390>] ? finish_task_switch+0x83/0xb4
[  305.249000]  [<ffffffff813778e9>] ? __schedule+0x4b2/0x4e0
[  305.249003]  [<ffffffff8107f0d3>] ? arch_local_irq_disable+0x7/0x8
[  305.249006]  [<ffffffff81378465>] ? _raw_spin_lock_irqsave+0x14/0x35
[  305.249009]  [<ffffffff8107f0d3>] ? arch_local_irq_disable+0x7/0x8
[  305.249012]  [<ffffffff81378465>] ? _raw_spin_lock_irqsave+0x14/0x35
[  305.249016]  [<ffffffffa021c897>] ? xen_netbk_kthread+0x134/0x78d 
[xen_netback]
[  305.249019]  [<ffffffff8105d78f>] ? arch_local_irq_enable+0x7/0x8
[  305.249022]  [<ffffffff81061357>] ? finish_task_switch+0x4a/0xb4
[  305.249025]  [<ffffffff81057987>] ? abort_exclusive_wait+0x79/0x79
[  305.249029]  [<ffffffffa021c763>] ? xen_netbk_tx_build_gops+0x7ad/0x7ad 
[xen_netback]
[  305.249032]  [<ffffffffa021c763>] ? xen_netbk_tx_build_gops+0x7ad/0x7ad 
[xen_netback]
[  305.249035]  [<ffffffff810570ac>] ? kthread+0x81/0x89
[  305.249038]  [<ffffffff810037f7>] ? xen_mc_issue.constprop.22+0x10/0x4d
[  305.249041]  [<ffffffff8105702b>] ? __kthread_parkme+0x5c/0x5c
[  305.249043]  [<ffffffff8137d6bc>] ? ret_from_fork+0x7c/0xb0
[  305.249046]  [<ffffffff8105702b>] ? __kthread_parkme+0x5c/0x5c
[  305.249048] Code: bc 24 80 00 00 00 4d 89 a4 24 a8 00 00 00 49 c7 84 24 a0 
00 00 00 58 bf 21 a0 4c 89 f6 e8 86 d9 e2 e0 e9 c3 05 00 00 8b 54 24 14 <66> 8b 
74 24 3e 41 8d 4f ff 0f b7 44 24 42 48 c7 44 24 30 00 00 
avsm@gabriel:~/src/git/mirage/mirage-www$ 
Message from syslogd@gabriel at Jan  6 21:06:38 ...

On 6 Jan 2013, at 18:44, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:

> I've been porting the network stack to take advantage of the cstruct 
> turbo-boost that Pierre and Thomas worked on. This optimisation adds compiler 
> built-ins (in 4.01.0+) which let the code generator optimise away many of the 
> temporary values required for low-level optimisation.
> 
> Here's a (very quick) before/after for a ping flood (which is a good stress 
> test of the low-level shared ring, network driver and protocol stack).
> 
> For a ping flood with 4.00.1, without the optimisation:
> 73755 packets transmitted, 73702 received, +49 duplicates, 0% packet loss, time 6283ms
> rtt min/avg/max/mdev = 0.031/0.228/1209.178/9.887 ms, pipe 14850, ipg/ewma 0.085/0.036 ms
> 
> and with the optimisation:
> 41791 packets transmitted, 41764 received, +25 duplicates, 0% packet loss, time 3539ms
> rtt min/avg/max/mdev = 0.030/0.188/1261.042/8.459 ms, pipe 14742, ipg/ewma 0.084/0.039 ms
> 
> So our average latency drops quite significantly (0.228 -> 0.188), as does 
> CPU load (not shown above).
> 
> I've not committed these changes to the mainstream yet, as I want to test
> TCP more first, but it's getting there!
> 
> -anil
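
For anyone wanting to put a number on the "quite significantly" in the quoted message, here is a minimal sketch that works out the relative drop in average RTT and the send rate from the two ping summaries. The function names are made up for illustration and are not part of any tool mentioned in this thread; the input figures are copied from the ping output quoted above.

```python
# Hypothetical helpers (names invented for this sketch) that summarise
# the two ping flood runs quoted above.

def relative_drop(before: float, after: float) -> float:
    """Fractional decrease going from `before` to `after`."""
    return (before - after) / before


def packets_per_second(packets: int, elapsed_ms: int) -> float:
    """Average transmit rate over the whole run."""
    return packets / (elapsed_ms / 1000.0)


# Figures taken from the quoted ping summaries.
AVG_RTT_BEFORE_MS = 0.228   # 4.00.1, without the optimisation
AVG_RTT_AFTER_MS = 0.188    # with the optimisation

if __name__ == "__main__":
    drop = relative_drop(AVG_RTT_BEFORE_MS, AVG_RTT_AFTER_MS)
    print(f"average RTT drop: {drop * 100:.1f}%")
    print(f"send rate before: {packets_per_second(73755, 6283):.0f} pkt/s")
    print(f"send rate after:  {packets_per_second(41791, 3539):.0f} pkt/s")
```

On these figures the average RTT falls by roughly 17.5%. The send rate is about the same in both runs, which fits the near-identical ipg values in the ping output. These are single runs, so treat the numbers as indicative only.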




 

