Re: [Xen-devel] Poor SMP performance pv_ops domU
I've tried various kernels today. pv_ops seems to use only 1 core out of 8.
PV spinlocks make no difference.

The thing that sticks out most is that I cannot get the dom0 (xen-3.4.2) to
show more than about 99.7% cpu usage for any pv_ops kernel.

#!/usr/bin/perl
while () {}

Running 8 of these loads 2.6.18.8-xenU to nearly 800% cpu, as shown in dom0.
Running the same 8 in any pv_ops kernel only gets as high as about 99.7%.

Inside the pv and xenU kernels, top -s shows all 8 cores being used.

John

On 18 May 2010, at 19:38, Jeremy Fitzhardinge wrote:

> On 05/18/2010 10:34 AM, John Morrison wrote:
>> Hi,
>>
>> Over the last year we have tried many times to get acceptable performance
>> from pv_ops kernels.
>>
>> Tests done with 1, 2, 4 and 8 cores. The more cores, the lower the score.
>>
>> Inside the domU it shows all cores, top -s shows all cores in use.
>> xentop in dom0 never shows over 99% cpu.
>>
>> 2.6.18.8-xenU kernel shows over 700% cpu and the scores are about 8 x the
>> pv_ops score.
>>
>> Any ideas?
>>
>
> Well, I guess some kind of bad serialization is going on in there, and
> it should be fairly obvious with a bit of examination.
>
> Have you tried building your own pvops domu kernels? Does enabling PV
> spinlocks make any difference? Also enabling some of the lock
> debugging/profiling/contention monitoring stuff may give useful results.
>
> Can you post the corresponding 2.6.18 results? Are there specific
> sub-tests which show the effect more strongly than the others?
>
> How does the 2.6.32 kernel fare when booted native?
>
> Thanks,
> J
>
>>
>> John
>>
>>
>> 1 core
>>
>> BYTE UNIX Benchmarks (Version 4.1-wht.2)
>> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
>> 2010 x86_64 GNU/Linux
>> /dev/xvda1 141110136 1066476 132875660 1% /
>>
>> Start Benchmark Run: Tue May 18 13:54:54 BST 2010
>> 13:54:54 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>>
>> End Benchmark Run: Tue May 18 14:06:12 BST 2010
>> 14:06:12 up 11 min, 2 users, load average: 11.48, 5.20, 2.43
>>
>>
>> INDEX VALUES
>> TEST                                    BASELINE      RESULT   INDEX
>>
>> Dhrystone 2 using register variables    376783.7   8950813.0   237.6
>> Double-Precision Whetstone                  83.1      2103.7   253.2
>> Execl Throughput                           188.3      1568.4    83.3
>> File Copy 1024 bufsize 2000 maxblocks     2672.0     64198.0   240.3
>> File Copy 256 bufsize 500 maxblocks       1077.0     17781.0   165.1
>> File Read 4096 bufsize 8000 maxblocks    15382.0    643717.0   418.5
>> Pipe-based Context Switching             15448.6     85379.4    55.3
>> Pipe Throughput                         111814.6    478490.1    42.8
>> Process Creation                           569.3      3329.6    58.5
>> Shell Scripts (8 concurrent)                44.8       380.7    85.0
>> System Call Overhead                    114433.5    498712.3    43.6
>>                                                            =========
>> FINAL SCORE                                                    114.1
>>
>> 2-cores
>>
>> ==============================================================
>> BYTE UNIX Benchmarks (Version 4.1-wht.2)
>> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
>> 2010 x86_64 GNU/Linux
>> /dev/xvda1 141110136 1066548 132875588 1% /
>>
>> Start Benchmark Run: Tue May 18 14:07:27 BST 2010
>> 14:07:27 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>>
>> End Benchmark Run: Tue May 18 14:18:04 BST 2010
>> 14:18:04 up 10 min, 1 user, load average: 12.78, 5.53, 2.49
>>
>>
>> INDEX VALUES
>> TEST                                    BASELINE      RESULT   INDEX
>>
>> Dhrystone 2 using register variables    376783.7  10124838.6   268.7
>> Double-Precision Whetstone                  83.1      1188.7   143.0
>> Execl Throughput                           188.3      1596.2    84.8
>> File Copy 1024 bufsize 2000 maxblocks     2672.0     58323.0   218.3
>> File Copy 256 bufsize 500 maxblocks       1077.0     17776.0   165.1
>> File Read 4096 bufsize 8000 maxblocks    15382.0    568217.0   369.4
>> Pipe-based Context Switching             15448.6     86111.3    55.7
>> Pipe Throughput                         111814.6    469957.8    42.0
>> Process Creation                           569.3      3298.1    57.9
>> Shell Scripts (8 concurrent)                44.8       378.9    84.6
>> System Call Overhead                    114433.5    532828.4    46.6
>>                                                            =========
>> FINAL SCORE                                                    107.9
>>
>> 4-cores
>>
>> ==============================================================
>> BYTE UNIX Benchmarks (Version 4.1-wht.2)
>> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
>> 2010 x86_64 GNU/Linux
>> /dev/xvda1 141110136 1066628 132875508 1% /
>>
>> Start Benchmark Run: Tue May 18 14:19:17 BST 2010
>> 14:19:17 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>>
>> End Benchmark Run: Tue May 18 14:29:53 BST 2010
>> 14:29:53 up 10 min, 1 user, load average: 13.59, 6.35, 2.97
>>
>>
>> INDEX VALUES
>> TEST                                    BASELINE      RESULT   INDEX
>>
>> Dhrystone 2 using register variables    376783.7  10185429.8   270.3
>> Double-Precision Whetstone                  83.1       759.8    91.4
>> Execl Throughput                           188.3      1386.2    73.6
>> File Copy 1024 bufsize 2000 maxblocks     2672.0     62331.0   233.3
>> File Copy 256 bufsize 500 maxblocks       1077.0     16492.0   153.1
>> File Read 4096 bufsize 8000 maxblocks    15382.0    563402.0   366.3
>> Pipe-based Context Switching             15448.6     87176.0    56.4
>> Pipe Throughput                         111814.6    481068.1    43.0
>> Process Creation                           569.3      3128.9    55.0
>> Shell Scripts (8 concurrent)                44.8       394.9    88.1
>> System Call Overhead                    114433.5    539996.1    47.2
>>                                                            =========
>> FINAL SCORE                                                    102.6
>>
>> 8-cores
>>
>> ==============================================================
>> BYTE UNIX Benchmarks (Version 4.1-wht.2, 8 threads)
>> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
>> 2010 x86_64 GNU/Linux
>> /dev/xvda1 141110136 1066680 132875456 1% /
>>
>> Start Benchmark Run: Tue May 18 14:30:59 BST 2010
>> 14:30:59 up 0 min, 1 user, load average: 0.07, 0.02, 0.00
>>
>> End Benchmark Run: Tue May 18 14:42:52 BST 2010
>> 14:42:52 up 12 min, 1 user, load average: 25.56, 10.84, 4.96
>>
>>
>> INDEX VALUES
>> TEST                                    BASELINE      RESULT   INDEX
>>
>> Dhrystone 2 using register variables    376783.7   9972130.3   264.7
>> Double-Precision Whetstone                  83.1       755.2    90.9
>> Execl Throughput                           188.3      1584.7    84.2
>> File Copy 1024 bufsize 2000 maxblocks     2672.0     58981.0   220.7
>> File Copy 256 bufsize 500 maxblocks       1077.0     16904.0   157.0
>> File Read 4096 bufsize 8000 maxblocks    15382.0    557735.0   362.6
>> Pipe-based Context Switching             15448.6     80738.2    52.3
>> Pipe Throughput                         111814.6    450891.2    40.3
>> Process Creation                           569.3      2948.5    51.8
>> Shell Scripts (8 concurrent)                44.8       378.1    84.4
>> System Call Overhead                    114433.5    537443.2    47.0
>>                                                            =========
>> FINAL SCORE                                                    100.9
>>
>>
>>
>> --
>> Professional hosting without compromise
>> www.clustered.net
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
>>
>
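
For anyone wanting to sanity-check the quoted tables, here is a minimal sketch of how the INDEX column and FINAL SCORE appear to be derived. It assumes the usual UnixBench convention (not stated anywhere in this thread) that each index is RESULT / BASELINE * 10 and that the final score is the geometric mean of the indices. The names `ONE_CORE`, `index_value` and `final_score` are made up for illustration; the numbers are copied from the 1-core run above.

```python
import math

# (baseline, result) pairs copied from the 1-core run quoted above.
ONE_CORE = {
    "Dhrystone 2 using register variables": (376783.7, 8950813.0),
    "Double-Precision Whetstone": (83.1, 2103.7),
    "Execl Throughput": (188.3, 1568.4),
    "File Copy 1024 bufsize 2000 maxblocks": (2672.0, 64198.0),
    "File Copy 256 bufsize 500 maxblocks": (1077.0, 17781.0),
    "File Read 4096 bufsize 8000 maxblocks": (15382.0, 643717.0),
    "Pipe-based Context Switching": (15448.6, 85379.4),
    "Pipe Throughput": (111814.6, 478490.1),
    "Process Creation": (569.3, 3329.6),
    "Shell Scripts (8 concurrent)": (44.8, 380.7),
    "System Call Overhead": (114433.5, 498712.3),
}


def index_value(baseline, result):
    """Assumed convention: the baseline system scores 10 on every test."""
    return result / baseline * 10.0


def final_score(indices):
    """Assumed convention: geometric mean of the per-test indices."""
    values = list(indices)
    return math.exp(sum(math.log(v) for v in values) / len(values))


indices = {name: index_value(b, r) for name, (b, r) in ONE_CORE.items()}

for name, idx in indices.items():
    print(f"{name:<38}{idx:8.1f}")
print(f"{'FINAL SCORE':<38}{final_score(indices.values()):8.1f}")
```

Under those assumptions the recomputed values should land close to the INDEX column and the 114.1 final score reported for the 1-core run. Because the score is a geometric mean, a single sub-test that stops scaling (Double-Precision Whetstone falls from 253.2 to 90.9 between 1 and 8 cores) pulls the total down even when the other indices barely move.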