
Linux: balloon_process() causing workqueue lockups?


  • To: Juergen Gross <jgross@xxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 27 Aug 2021 11:01:30 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 27 Aug 2021 09:01:45 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello,

Ballooning down Dom0 by about 16G in one go occasionally triggers:

BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3
    in-flight: 229:balloon_process
    pending: cache_reap
workqueue events_freezable_power_: flags=0x84
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
    pending: disk_events_workfn
workqueue mm_percpu_wq: flags=0x8
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
    pending: vmstat_update
pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43

I've tried to double-check that this isn't related to my IOMMU work
in the hypervisor, and I'm pretty sure it isn't. Looking at
balloon_process(), I see it has a cond_resched(), but as I understand
it this won't help further items pending on the same workqueue.
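
To illustrate the concern, here is a toy user-space model, not kernel
code. It assumes a single worker draining a FIFO queue, with one long
item (standing in for balloon_process) queued ahead of one short item
(standing in for cache_reap). The function name, the time units, and
the chunk-and-requeue variant are all hypothetical; the variant is
shown only for contrast and is not something proposed here.

```python
from collections import deque


def pending_wait(total_units, chunk=None):
    """Return how many simulated time units the short item waits.

    One worker serves the queue in FIFO order. The long item needs
    total_units of work. With chunk=None it runs to completion in a
    single invocation; yielding the CPU inside the item is not modelled,
    because in this toy model it would not let the queue advance anyway.
    With a chunk size, the long item does at most `chunk` units per
    invocation and then re-queues itself at the tail.
    """
    queue = deque([("long", total_units), ("short", 1)])
    clock = 0
    while queue:
        name, remaining = queue.popleft()
        if name == "short":
            # The short item finally gets picked up by the worker.
            return clock
        step = remaining if chunk is None else min(chunk, remaining)
        clock += step
        if remaining - step > 0:
            # Hypothetical mitigation: go to the back of the queue.
            queue.append((name, remaining - step))
    return clock


if __name__ == "__main__":
    print("run to completion:", pending_wait(64))
    print("chunked, requeued:", pending_wait(64, chunk=8))
```

In this model the short item's wait equals the full run time of the
long item unless the long item returns to the queue between chunks.
Whether the real worker pool behaves the same way here is exactly the
open question.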

Thoughts?

Thanks, Jan




 

