[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: ocplib+endian improvement



Fixed; it was a minor problem in the C start_info stubs that was truncating the 
size of the bigarray mapping in OCaml. This only showed up once bounds checking 
was activated, hence you didn't notice it before.
https://github.com/mirage/mirage-platform/commit/b34dd44c5e20f9534d74cef71a8d875005e3cc7e

I'm cutting stable releases of all these changes now...

On 8 Jan 2013, at 18:54, Dave Scott <Dave.Scott@xxxxxxxxxxxxx> wrote:

> That's interesting-- line 306 in ring.ml is an attempt to read the ring input 
> (from the frontend PoV) producer pointer, in a function (suspiciously) named 
> "unsafe_read". That ought to be at offset 2048+12 in the page. Perhaps this 
> buffer isnt actually the xenstore ring?
> 
> Is this easy to repro?
> 
> --
> Dave Scott
> 
> On Jan 8, 2013, at 5:42 PM, "Anil Madhavapeddy" <anil@xxxxxxxxxx> wrote:
> 
>> This turns out to due to a lack of bounds checking, causing a buffer
>> overflow.  I've re-added the bounds checks into the new cstruct, and now
>> get a nice exception instead of a crash from the netfront end.  The fact
>> that the dom0 crashes is a good case for a Mirage fuzz tester in the
>> future ;-)
>> 
>> Dave, it seems like the real problem is in xenstore-client somewhere,
>> according to the backtrace:
>> 
>> x86_mm.c: Demand map pfns at 8001000-2008001000.
>> gnttab_init: 2047
>> gnttab_stubs.c: gnttab_table mapped at 0x8001000
>> Caught: Invalid_argument("index out of bounds")
>> Called from file "lib/ring.ml", line 306, characters 27-51
>> Called from file "core/xs_protocol.ml", line 274, characters 6-31
>> Called from file "client_lwt/xs_client.ml", line 110, characters 30-42
>> Called from file "src/core/lwt.ml", line 679, characters 20-24
>> 
>> Is this something to do with multipage rings? It seems to read beyond the
>> actual xenstore page, which is odd...
>> 
>> -anil
>> 
>>> On Mon, Jan 07, 2013 at 03:08:51PM +0000, Steven Smith wrote:
>>>> Right.  So I'm going to stop looking at this now.  I'd like to say
>>>> that's because I've fixed it and everything's fine, but, actually,
>>>> it's because the BMC on my test box has failed and I can't find any
>>>> other machines which are in a usable state.  Pretty much the only
>>>> concrete thing which I've found is that the ring indices in the shared
>>>> page are corrupted, with what looks like ASCII; not sure if that helps
>>>> you at all.
>>>> 
>>>>> [   39.830497] <0>In xen_netbk_tx_build_gops; done 95000 iterations so far
>>>>> [   39.837138] <0>nr_pending_reqs 0
>>>>> [   39.840485] <0>Ring status: rsp_prod_pvt 24cda, req_cons 238da
>>>>> [   39.846437] <0>Shared: req_prod 75422121, req_event 69646c69, rsp_prod 
>>>>> 24cda, rsp_event 6c6d6143
>>>> 
>>>> Decode indices as ascii:
>>>> 
>>>> 75422121 -> uB!!   !!Bu
>>>> 69646c69 -> idli   ildi
>>>> 6c6d6143 -> lmaC   Caml
>>>> 
>>>> Which could plausibly be fragments of strings which might be present
>>>> in an ocaml program, but it's not really terribly convincing.
>>>> 
>>>> I do have a patch which might at least stop the bug from taking down
>>>> dom0, which I've attached, but I've not been able to test it all.
>>>> 
>>>> So that's not desperately helpful, really.  Sorry about that.
>>>> 
>>>> Steven.
>>>> 
>>>> 
>>>> 
>>>>> http://www.recoil.org/~avsm/www-crashes-pvops.xen is the offending 
>>>>> kernel, which always gives me the traceback below when booted using xm 
>>>>> (on Xen 4.1, either dom0 kernel of 3.2 or 3.7)... can you repro on your 
>>>>> setup?  You have to provide one VIF.
>>>>> 
>>>>> -anil
>>>>> 
>>>>> Begin forwarded message:
>>>>> 
>>>>>> From: Anil Madhavapeddy <anil@xxxxxxxxxx>
>>>>>> Subject: Re: ocplib+endian improvement
>>>>>> Date: 6 January 2013 21:08:18 GMT
>>>>>> To: "<cl-mirage@xxxxxxxxxxxxxxx> List" <cl-mirage@xxxxxxxxxxxxxxx>
>>>>>> Cc: Steven Smith <steven.smith@xxxxxxxxxxxx>
>>>>>> 
>>>>>> Hm, but trying out TCP seems to trigger a softlockup in Xen netback 
>>>>>> (both kernels 3.2 and latest 3.7). Will need to do some more debugging 
>>>>>> tomorrow...
>>>>>> 
>>>>>> Steven (Smith): any luck with the grant-free netback modification?  I 
>>>>>> could try it out at the same time as debugging this particular issue.
>>>>>> 
>>>>>> -anil
>>>>>> 
>>>>>> [  277.249069] Code: 00 89 c1 7c c8 41 59 c3 90 90 90 65 c6 04 25 41 b1 
>>>>>> 00 00 00 65 f6 04 25 40 b1 00 00 ff 74 05 e8 47 00 00 00 c3 66 0f 1f 44 
>>>>>> 00 00 <65> c6 04 25 41 b1 00 00 01 c3 66 0f 1f 44 00 00 65 f6 04 25 41
>>>>>> [  305.248782] BUG: soft lockup - CPU#0 stuck for 23s! [netback/0:3772]
>>>>>> [  305.248883] Modules linked in: xt_physdev iptable_filter ip_tables 
>>>>>> x_tables xen_netback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd 
>>>>>> auth_rpcgss nfs_acl nfs lockd dns_resolver fscache sunrpc bridge stp llc 
>>>>>> loop crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper 
>>>>>> cryptd xts lrw gf128mul snd_pcm sp5100_tco snd_page_alloc snd_timer snd 
>>>>>> soundcore tpm_tis tpm amd64_edac_mod edac_mce_amd i2c_piix4 i2c_core 
>>>>>> dcdbas evdev pcspkr tpm_bios edac_core microcode psmouse fam15h_power 
>>>>>> k10temp serio_raw acpi_power_meter button processor thermal_sys ext4 
>>>>>> crc16 jbd2 mbcache dm_mod sg sd_mod crc_t10dif ata_generic ohci_hcd 
>>>>>> pata_atiixp ixgbe ahci ptp libahci pps_core libata ehci_hcd dca mdio 
>>>>>> scsi_mod bnx2 usbcore usb_common
>>>>>> [  305.248932] CPU 0
>>>>>> [  305.248934] Pid: 3772, comm: netback/0 Tainted: G        W    
>>>>>> 3.7-trunk-amd64 #1 Debian 3.7.1-1~experimental.2 Dell Inc. PowerEdge 
>>>>>> R415/08WNM9
>>>>>> [  305.248936] RIP: e030:[<ffffffffa021c153>]  [<ffffffffa021c153>] 
>>>>>> xen_netbk_tx_build_gops+0x19d/0x7ad [xen_netback]
>>>>>> [  305.248941] RSP: e02b:ffff88020eba3ca8  EFLAGS: 00000217
>>>>>> [  305.248943] RAX: 0000000073202626 RBX: ffffc90007ab5000 RCX: 
>>>>>> 000000001ce71e08
>>>>>> [  305.248945] RDX: 000000001ce71e08 RSI: ffffc90007ab02a8 RDI: 
>>>>>> ffff880653403800
>>>>>> [  305.248946] RBP: ffffc90007ab70c0 R08: ffff8806534038d8 R09: 
>>>>>> ffff88020eba3c74
>>>>>> [  305.248947] R10: ffffc90007ab0208 R11: ffffc90007ab0208 R12: 
>>>>>> ffff880653403800
>>>>>> [  305.248949] R13: 0000000000007320 R14: 0000000000000000 R15: 
>>>>>> 000000000104210e
>>>>>> [  305.248953] FS:  00007f59123a4700(0000) GS:ffff8807ff800000(0000) 
>>>>>> knlGS:0000000000000000
>>>>>> [  305.248955] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>> [  305.248956] CR2: 00007fd7c822b070 CR3: 00000007f2177000 CR4: 
>>>>>> 0000000000000660
>>>>>> [  305.248958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>>>>>> 0000000000000000
>>>>>> [  305.248960] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
>>>>>> 0000000000000400
>>>>>> [  305.248961] Process netback/0 (pid: 3772, threadinfo 
>>>>>> ffff88020eba2000, task ffff8807ee921180)
>>>>>> [  305.248963] Stack:
>>>>>> [  305.248964]  ffffffff81004067 ffffffff81004202 1ce71e0800000003 
>>>>>> ffff8807ee921180
>>>>>> [  305.248968]  ffffc90007ab70c0 ffff8807ff811740 0000000000000000 
>>>>>> 2074706fffff2e2f
>>>>>> [  305.248972]  ffff4f2073202626 2074706f20646c72 ffffffff810037f7 
>>>>>> ffff8807ee921180
>>>>>> [  305.248975] Call Trace:
>>>>>> [  305.248978]  [<ffffffff81004067>] ? arch_local_irq_restore+0x7/0x8
>>>>>> [  305.248982]  [<ffffffff81004202>] ? xen_mc_flush+0x11d/0x160
>>>>>> [  305.248985]  [<ffffffff810037f7>] ? 
>>>>>> xen_mc_issue.constprop.22+0x10/0x4d
>>>>>> [  305.248988]  [<ffffffff8100d02f>] ? load_TLS+0x7/0xa
>>>>>> [  305.248991]  [<ffffffff8100d60c>] ? __switch_to+0x195/0x3f8
>>>>>> [  305.248994]  [<ffffffff8105fadb>] ? mmdrop+0xd/0x1c
>>>>>> [  305.248996]  [<ffffffff81061390>] ? finish_task_switch+0x83/0xb4
>>>>>> [  305.249000]  [<ffffffff813778e9>] ? __schedule+0x4b2/0x4e0
>>>>>> [  305.249003]  [<ffffffff8107f0d3>] ? arch_local_irq_disable+0x7/0x8
>>>>>> [  305.249006]  [<ffffffff81378465>] ? _raw_spin_lock_irqsave+0x14/0x35
>>>>>> [  305.249009]  [<ffffffff8107f0d3>] ? arch_local_irq_disable+0x7/0x8
>>>>>> [  305.249012]  [<ffffffff81378465>] ? _raw_spin_lock_irqsave+0x14/0x35
>>>>>> [  305.249016]  [<ffffffffa021c897>] ? xen_netbk_kthread+0x134/0x78d 
>>>>>> [xen_netback]
>>>>>> [  305.249019]  [<ffffffff8105d78f>] ? arch_local_irq_enable+0x7/0x8
>>>>>> [  305.249022]  [<ffffffff81061357>] ? finish_task_switch+0x4a/0xb4
>>>>>> [  305.249025]  [<ffffffff81057987>] ? abort_exclusive_wait+0x79/0x79
>>>>>> [  305.249029]  [<ffffffffa021c763>] ? 
>>>>>> xen_netbk_tx_build_gops+0x7ad/0x7ad [xen_netback]
>>>>>> [  305.249032]  [<ffffffffa021c763>] ? 
>>>>>> xen_netbk_tx_build_gops+0x7ad/0x7ad [xen_netback]
>>>>>> [  305.249035]  [<ffffffff810570ac>] ? kthread+0x81/0x89
>>>>>> [  305.249038]  [<ffffffff810037f7>] ? 
>>>>>> xen_mc_issue.constprop.22+0x10/0x4d
>>>>>> [  305.249041]  [<ffffffff8105702b>] ? __kthread_parkme+0x5c/0x5c
>>>>>> [  305.249043]  [<ffffffff8137d6bc>] ? ret_from_fork+0x7c/0xb0
>>>>>> [  305.249046]  [<ffffffff8105702b>] ? __kthread_parkme+0x5c/0x5c
>>>>>> [  305.249048] Code: bc 24 80 00 00 00 4d 89 a4 24 a8 00 00 00 49 c7 84 
>>>>>> 24 a0 00 00 00 58 bf 21 a0 4c 89 f6 e8 86 d9 e2 e0 e9 c3 05 00 00 8b 54 
>>>>>> 24 14 <66> 8b 74 24 3e 41 8d 4f ff 0f b7 44 24 42 48 c7 44 24 30 00 00
>>>>>> avsm@gabriel:~/src/git/mirage/mirage-www$
>>>>>> Message from syslogd@gabriel at Jan  6 21:06:38 ...
>>>>>> 
>>>>>> On 6 Jan 2013, at 18:44, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>>> I've been porting the network stack to take advantage of the cstruct 
>>>>>>> turbo-boost that Pierre and Thomas worked on. This optimisation adds 
>>>>>>> compiler built-ins (in 4.01.0+) which let the code generator optimise 
>>>>>>> away many of the temporary values required for low-level optimisation.
>>>>>>> 
>>>>>>> Here's a (very quick) before/after for a ping flood (which is a good 
>>>>>>> stress test of the low-level shared ring, network driver and protocol 
>>>>>>> stack).
>>>>>>> 
>>>>>>> For a ping flood With 4.00.1 without the optimisation:
>>>>>>> 73755 packets transmitted, 73702 received, +49 duplicates, 0% packet 
>>>>>>> loss, time 6283ms
>>>>>>> rtt min/avg/max/mdev = 0.031/0.228/1209.178/9.887 ms, pipe 14850, 
>>>>>>> ipg/ewma 0.085/0.036 ms
>>>>>>> 
>>>>>>> and with the optimisation:
>>>>>>> 41791 packets transmitted, 41764 received, +25 duplicates, 0% packet 
>>>>>>> loss, time 3539msrtt min/avg/max/mdev = 0.030/0.188/1261.042/8.459 ms, 
>>>>>>> pipe 14742, ipg/ewma 0.084/0.039 ms
>>>>>>> 
>>>>>>> So our average latency drops quite significantly (0.228 -> 0.188), as 
>>>>>>> does CPU load (not shown above).
>>>>>>> 
>>>>>>> I've not committed these changes to the mainstream yet until I test out 
>>>>>>> TCP more, but it's getting there!
>>>>>>> 
>>>>>>> -anil
>>>>>> 
>>>>>> 
>>>>> 
>>> 
>>>> diff --git a/drivers/net/xen-netback/netback.c 
>>>> b/drivers/net/xen-netback/netback.c
>>>> index 3f27707..1a6f816 100644
>>>> --- a/drivers/net/xen-netback/netback.c
>>>> +++ b/drivers/net/xen-netback/netback.c
>>>> @@ -776,8 +776,7 @@ static struct xenvif *poll_net_schedule_list(struct 
>>>> xen_netbk *netbk)
>>>> 
>>>>   vif = list_first_entry(&netbk->net_schedule_list,
>>>>                          struct xenvif, schedule_list);
>>>> -   if (!vif)
>>>> -           goto out;
>>>> +   BUG_ON(!vif);
>>>> 
>>>>   xenvif_get(vif);
>>>> 
>>>> @@ -857,7 +856,9 @@ static void tx_credit_callback(unsigned long data)
>>>> }
>>>> 
>>>> static void netbk_tx_err(struct xenvif *vif,
>>>> -                    struct xen_netif_tx_request *txp, RING_IDX end)
>>>> +                    struct xen_netif_tx_request *txp,
>>>> +                    RING_IDX end,
>>>> +                    int fatal)
>>>> {
>>>>   RING_IDX cons = vif->tx.req_cons;
>>>> 
>>>> @@ -868,8 +869,15 @@ static void netbk_tx_err(struct xenvif *vif,
>>>>           txp = RING_GET_REQUEST(&vif->tx, cons++);
>>>>   } while (1);
>>>>   vif->tx.req_cons = cons;
>>>> -   xen_netbk_check_rx_xenvif(vif);
>>>> +   if (!fatal) {
>>>> +           /* If this is a fatal error then we don't need to poll
>>>> +              the device again (and doing so might lead to some
>>>> +              quite bad behaviour if the ring structure has
>>>> +              become corrupted in some way). */
>>>> +           xen_netbk_check_rx_xenvif(vif);
>>>> +   }
>>>>   xenvif_put(vif);
>>>> +
>>>> }
>>>> 
>>>> static int netbk_count_requests(struct xenvif *vif,
>>>> @@ -1232,7 +1240,9 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>   struct gnttab_copy *gop = netbk->tx_copy_ops, *request_gop;
>>>>   struct sk_buff *skb;
>>>>   int ret;
>>>> +   int cntr;
>>>> 
>>>> +   cntr = 0;
>>>>   while (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) &&
>>>>           !list_empty(&netbk->net_schedule_list)) {
>>>>           struct xenvif *vif;
>>>> @@ -1245,15 +1255,43 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>           int work_to_do;
>>>>           unsigned int data_len;
>>>>           pending_ring_idx_t index;
>>>> +           int loud;
>>>> +
>>>> +           cntr++;
>>>> +           loud = 0;
>>>> +           if (cntr % 1000 == 0) {
>>>> +                   printk("<0>In xen_netbk_tx_build_gops; done %d 
>>>> iterations so far\n",
>>>> +                          cntr);
>>>> +                   printk("<0>nr_pending_reqs %d\n",
>>>> +                          nr_pending_reqs(netbk));
>>>> +                   loud = 1;
>>>> +           }
>>>> 
>>>>           /* Get a netif from the list with work to do. */
>>>>           vif = poll_net_schedule_list(netbk);
>>>> -           if (!vif)
>>>> +           if (!vif) {
>>>> +                   /* This can sometimes happen because the test
>>>> +                      of list_empty(net_schedule_list) at the top
>>>> +                      of the loop is unlocked.  Just go back and
>>>> +                      have another look. */
>>>> +                   printk("<0>No vif, keep going\n");
>>>>                   continue;
>>>> +           }
>>>> +
>>>> +           if (loud) {
>>>> +                   printk("<0>Ring status: rsp_prod_pvt %x, req_cons 
>>>> %x\n",
>>>> +                          vif->tx.rsp_prod_pvt, vif->tx.req_cons);
>>>> +                   printk("<0>Shared: req_prod %x, req_event %x, rsp_prod 
>>>> %x, rsp_event %x\n",
>>>> +                          vif->tx.sring->req_prod,
>>>> +                          vif->tx.sring->req_event,
>>>> +                          vif->tx.sring->rsp_prod,
>>>> +                          vif->tx.sring->rsp_event);
>>>> +           }
>>>> 
>>>>           RING_FINAL_CHECK_FOR_REQUESTS(&vif->tx, work_to_do);
>>>>           if (!work_to_do) {
>>>>                   xenvif_put(vif);
>>>> +                   printk("<0>No work on %p\n", vif);
>>>>                   continue;
>>>>           }
>>>> 
>>>> @@ -1264,6 +1302,7 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>           /* Credit-based scheduling. */
>>>>           if (txreq.size > vif->remaining_credit &&
>>>>               tx_credit_exceeded(vif, txreq.size)) {
>>>> +                   printk("<0>Stopped by credit scheduler\n");
>>>>                   xenvif_put(vif);
>>>>                   continue;
>>>>           }
>>>> @@ -1279,14 +1318,14 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>                                                     work_to_do);
>>>>                   idx = vif->tx.req_cons;
>>>>                   if (unlikely(work_to_do < 0)) {
>>>> -                           netbk_tx_err(vif, &txreq, idx);
>>>> +                           netbk_tx_err(vif, &txreq, idx, 1);
>>>>                           continue;
>>>>                   }
>>>>           }
>>>> 
>>>>           ret = netbk_count_requests(vif, &txreq, txfrags, work_to_do);
>>>>           if (unlikely(ret < 0)) {
>>>> -                   netbk_tx_err(vif, &txreq, idx - ret);
>>>> +                   netbk_tx_err(vif, &txreq, idx - ret, 1);
>>>>                   continue;
>>>>           }
>>>>           idx += ret;
>>>> @@ -1294,7 +1333,7 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>           if (unlikely(txreq.size < ETH_HLEN)) {
>>>>                   netdev_dbg(vif->dev,
>>>>                              "Bad packet size: %d\n", txreq.size);
>>>> -                   netbk_tx_err(vif, &txreq, idx);
>>>> +                   netbk_tx_err(vif, &txreq, idx, 1);
>>>>                   continue;
>>>>           }
>>>> 
>>>> @@ -1304,7 +1343,7 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>                              "txreq.offset: %x, size: %u, end: %lu\n",
>>>>                              txreq.offset, txreq.size,
>>>>                              (txreq.offset&~PAGE_MASK) + txreq.size);
>>>> -                   netbk_tx_err(vif, &txreq, idx);
>>>> +                   netbk_tx_err(vif, &txreq, idx, 1);
>>>>                   continue;
>>>>           }
>>>> 
>>>> @@ -1320,7 +1359,7 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>           if (unlikely(skb == NULL)) {
>>>>                   netdev_dbg(vif->dev,
>>>>                              "Can't allocate a skb in start_xmit.\n");
>>>> -                   netbk_tx_err(vif, &txreq, idx);
>>>> +                   netbk_tx_err(vif, &txreq, idx, 0);
>>>>                   break;
>>>>           }
>>>> 
>>>> @@ -1333,7 +1372,8 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>> 
>>>>                   if (netbk_set_skb_gso(vif, skb, gso)) {
>>>>                           kfree_skb(skb);
>>>> -                           netbk_tx_err(vif, &txreq, idx);
>>>> +                           netbk_tx_err(vif, &txreq, idx, 1);
>>>> +                           printk("<0>Killed by netbk_set_skb_gso\n");
>>>>                           continue;
>>>>                   }
>>>>           }
>>>> @@ -1342,7 +1382,8 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>           page = xen_netbk_alloc_page(netbk, skb, pending_idx);
>>>>           if (!page) {
>>>>                   kfree_skb(skb);
>>>> -                   netbk_tx_err(vif, &txreq, idx);
>>>> +                   netbk_tx_err(vif, &txreq, idx, 0);
>>>> +                   printk("<0>No pages for payload?\n");
>>>>                   continue;
>>>>           }
>>>> 
>>>> @@ -1382,7 +1423,7 @@ static unsigned xen_netbk_tx_build_gops(struct 
>>>> xen_netbk *netbk)
>>>>                                                skb, txfrags, gop);
>>>>           if (request_gop == NULL) {
>>>>                   kfree_skb(skb);
>>>> -                   netbk_tx_err(vif, &txreq, idx);
>>>> +                   netbk_tx_err(vif, &txreq, idx, 0);
>>>>                   continue;
>>>>           }
>>>>           gop = request_gop;
>>> 
>>> 
>>> --
>>> Anil Madhavapeddy                                 http://anil.recoil.org
>>> 
>> 
>> --
>> Anil Madhavapeddy                                 http://anil.recoil.org
>> 
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.