Xen project Mailing List

Re: [Xen-devel] BUG_ON() vs ASSERT()

To: George Dunlap <george.dunlap@xxxxxxxxxx>

From: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>

Date: Wed, 14 Sep 2016 11:26:32 +0200

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Wed, 14 Sep 2016 09:26:57 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Wednesday, September 14, 2016, 10:35:30 AM, you wrote: > On Tue, Sep 13, 2016 at 7:16 PM, Andrew Cooper > <andrew.cooper3@xxxxxxxxxx> wrote: >> On 12/09/16 16:23, Jan Beulich wrote: >>> All, >>> >>> in >>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01201.html >>> and >>> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01210.html >>> Andrew basically suggests that we should switch away from using >>> ASSERT() and over to BUG_ON() in perhaps quite broad a set of >>> cases. And honestly I'm not convinced of this: We've been adding >>> quite a few ASSERT()s over the last years with the aim of doing >>> sanity checking in debug builds, without adding overhead to non- >>> debug builds. I can certainly see possible cases where using >>> BUG_ON() to prevent further possible damage is appropriate, but >>> I don't think we should overdo here. >> >> I am not advocating switching all ASSERT()s to BUG_ON()s. That would be >> silly. >> >> However, ASSERT()'s as a bounds check very definitely are dangerous. If >> there is any uncertainty about the bounds, the check must not disappear >> in a release build. (Better yet, code which copes cleanly with >> insufficient bounds). >> >> >> For anyone reading this email who hasn't already worked out the details >> of XSA-186, the data corruption issue is here: >> >> static int hvmemul_insn_fetch(...) >> { >> unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip; >> ... >> ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf)); >> memcpy(&hvmemul_ctxt->insn_buf[insn_off], p_data, bytes); >> ... >> >> It is left as an exercise to the reader to work out how to exploit this >> on a release build of Xen, but it is hopefully obvious that the ASSERT() >> isn't helpful. A BUG_ON() in this case would have been a host DoS, >> which is substantially better than a guest escape. > This seems quite sensible, and I'm glad Andy clarified. (Although in > a lot of these cases, a domain_crash() would be preferable to a > BUG_ON() if possible.) > -George Just my two cents, but please try to limit BUG_ON()'s to only really unrecoverable cases. Because it is quite a disaster especially on remote administrated machines to have the host crash on a BUG_ON() (with or without auto restart). Try to recover (for instance if necessary by crashing the guest instead of the host) as much as you can to make at least the hypervisor and dom0 survive. Put a big fat warning in xl dmesg and taint the hypervisor (preferably in a way you could easily read out with monitoring tools), so you could propagate that to the sysadmin that the system is now deemed instable, so he can investigate (and report the bug), triage and schedule a reboot. Instead of only having a hard to debug crashing machine with probably no or limited logs to go by. In the Linux kernel the problem now is to get a lot of them converted back to ratelimited warnings instead of BUG_ON()'s. -- Sander _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.