[Xen-devel] arp during live migration


I am having some trouble with send_fake_arp() in the netfront driver.

Normally, on my domU, which has no queuing disciplines compiled in, the packets are sent via dev_queue_xmit in net/core/dev.c and enqueued using pfifo_fast_enqueue in net/sched/sch_generic.c.

However, during live migration, send_fake_arp() returns -2 and the packet no longer reaches pfifo_fast_enqueue. I have not been able to trace it further than this code in dev_queue_xmit:

        if (q->enqueue) {
                /* Grab device queue */
                rc = q->enqueue(skb, q);
                rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc;
                goto out;
        }

I noticed that the error code returned by send_fake_arp() is not checked. Would it make sense to check the error code and reschedule the ARP broadcast for a later time?
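To make the idea concrete, here is a rough user-space sketch of "check the return code and retry later". Nothing in it is netfront code: send_fake_arp_stub(), struct fake_link and arp_send_with_retry() are hypothetical names made up for illustration, and the stub simply fails with -2 a fixed number of times. In the driver the loop would presumably be replaced by re-arming a timer or a delayed work item rather than retrying inline.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's send_fake_arp(): it fails with -2
 * (the value seen during migration) until the simulated link is ready. */
struct fake_link {
    int calls;        /* number of send attempts so far */
    int ready_after;  /* attempts that fail before the link accepts packets */
};

static int send_fake_arp_stub(struct fake_link *link)
{
    link->calls++;
    return (link->calls > link->ready_after) ? 0 : -2;
}

/* Try up to max_tries times, doubling the planned delay after each failure.
 * Returns the 1-based attempt number that succeeded, or the last negative
 * error code if every attempt failed. */
static int arp_send_with_retry(struct fake_link *link, int max_tries,
                               unsigned int first_delay_ms)
{
    unsigned int delay_ms = first_delay_ms;
    int rc = -1;
    int attempt;

    assert(max_tries > 0);

    for (attempt = 1; attempt <= max_tries; attempt++) {
        rc = send_fake_arp_stub(link);
        if (rc == 0)
            return attempt;
        /* A driver would schedule the next attempt here instead of looping;
         * this sketch only reports the delay it would have used. */
        printf("attempt %d failed (rc=%d), retry in %u ms\n",
               attempt, rc, delay_ms);
        delay_ms *= 2;
    }
    return rc;
}
```

With a link that becomes ready after two failed sends, arp_send_with_retry(&link, 5, 10) gives up nothing and succeeds on the third attempt; with a link that never becomes ready, it returns the last error so the caller can log it instead of silently dropping the broadcast.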

I have made some changes to Xen 3.0.3 regarding block device migration, so I might have broken something myself. That could be why only a few people have reported this problem on xen-users. Of course, the problem can also go unnoticed if a downtime of 1-2 seconds is tolerated.

Does anyone have any hints on why this might happen or how to search for more clues?

Thank you.


