Xen project Mailing List

Re: [Xen-devel] [PATCH v2 13/13] fuzz/x86_emulate: Add an option to limit the number of instructions executed

From: George Dunlap <george.dunlap@xxxxxxxxxx>

Date: Fri, 6 Oct 2017 11:40:32 +0100

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Ian Jackson <ian.jackson@xxxxxxxxxx>

Delivery-date: Fri, 06 Oct 2017 10:40:42 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>> On 25.09.17 at 16:26, <george.dunlap@xxxxxxxxxx> wrote:
>> AFL considers a testcase to be a useful addition not only if there are
>> tuples exercised by that testcase which were not exercised otherwise,
>> but also if the *number* of times an individual tuple is exercised
>> changes significantly; in particular, if the number of the highest bit
>> changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).
>
> Perhaps I simply don't know about AFL (yet) to understand how "highest
> bit" matters here, or even whose highest bits there's talk of.

Probably the easiest way to get this would be to read the
'technical_details.txt' [1] document about AFL, specifically the section
"Detecting new behaviors".  The section isn't long, and I'm not sure I
could explain the situation more concisely than the author has there.

[1] http://lcamtuf.coredump.cx/afl/technical_details.txt

>> Unfortunately, one simple way to increase these stats is to execute
>> the same (or similar) instructions multiple times.
>
> But the change here doesn't look at instruction similarity at all.

I'm talking about how the blind changes AFL makes to the input affect
what AFL sees at the "output".

Suppose it has a testcase where instruction A is executed once, and it
sees tuple N executed twice.  Now suppose it morphs the input so that
instruction A is executed twice.  It will now see tuple N executed 4
times.  This is seen as 'new behavior', and so it will add that as a
'new' test case to its set of interesting things to fuzz.

Then suppose it morphs one of those so that instruction A is executed
four times.  Tuple N will be executed 8 times, which again is new
behavior.  The highest tuple count it sees as unique is 128; so in our
example, it will generate sample inputs up to 64 instructions -- even
if the actual path through the code for each instruction is identical
to the single-instruction one.

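The doubling argument above can be sketched in a few lines of C.  This
is only a simplified model of the "highest bit" bucketing as described
in this mail, not AFL's actual code; the exact bucket boundaries are in
the technical_details.txt document linked above and may differ slightly.
The names hit_bucket() and distinct_buckets() are made up for this
illustration.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not AFL's implementation): a hit count
 * is classified by the position of its highest set bit, and every count
 * of 128 or more lands in the same bucket.  Bucket 0 means "never hit".
 */
unsigned int hit_bucket(unsigned int count)
{
    unsigned int bucket = 0;

    if ( count > 128 )
        count = 128;    /* counts above 128 are indistinguishable */

    while ( count )
    {
        bucket++;
        count >>= 1;
    }

    return bucket;
}

/*
 * Hypothetical helper: for inputs that execute instruction A 1, 2, 4, ...
 * up to max_insns times, with each execution hitting tuple N
 * hits_per_insn times, count how many of them land in a bucket not seen
 * before, i.e. how many would look like "new behavior".
 */
unsigned int distinct_buckets(unsigned int hits_per_insn,
                              unsigned int max_insns)
{
    unsigned int seen = 0, found = 0, insns;

    /* Keep the doubling loop and the multiplication well within range. */
    assert(hits_per_insn > 0 && hits_per_insn <= 1024);
    assert(max_insns <= 1024);

    for ( insns = 1; insns <= max_insns; insns *= 2 )
    {
        unsigned int b = hit_bucket(insns * hits_per_insn);

        if ( !(seen & (1u << b)) )
        {
            seen |= 1u << b;
            found++;
        }
    }

    return found;
}
```

With two hits of tuple N per instruction, as in the example,
distinct_buckets(2, 64) and distinct_buckets(2, 128) give the same
answer in this model: going beyond 64 instructions no longer produces a
count that looks new.
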
A 64-instruction test case will take at least 64x as long to execute as
a 1-instruction test case; and it will generally also take 64x as long
to fuzz.  This means AFL is spending nearly 1000x as much time fuzzing
that test case as the 1-instruction test case, but for no very good
reason -- if you can't get actual new behavior we care about out of 2-3
instructions, you're not going to get it out of 60 instructions.

IOW, arbitrary numbers of instructions fool AFL into thinking it's found
something new and interesting when it hasn't.  Limiting the number of
instructions should in theory keep AFL from getting distracted with
test cases it thinks are new and unique but aren't.  And we see that for
the old format, this is true.

I suspect there's some number of instructions past which we get
diminishing returns even for the 'compact' format, but since testing
involves running things for 24 hours, there are also diminishing returns
for that kind of optimization. :-)

>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -960,10 +960,13 @@ void setup_fuzz_state(struct fuzz_state *state, const
>> uint8_t *data_p, size_t si
>>      state->data_num = size;
>>  }
>>
>> +int opt_instruction_limit = 0;
>
> unsigned int (and formally no need for an initializer)
>
>>  int runtest(struct fuzz_state *state) {
>>      int rc;
>>
>>      struct x86_emulate_ctxt *ctxt = &state->ctxt;
>> +    int icount = 0;
>
> unsigned int

Ack

>> @@ -988,7 +991,9 @@ int runtest(struct fuzz_state *state) {
>>
>>          rc = x86_emulate(ctxt, &state->ops);
>>          printf("Emulation result: %d\n", rc);
>> -    } while ( rc == X86EMUL_OKAY );
>> +    } while ( rc == X86EMUL_OKAY &&
>> +              (!opt_instruction_limit ||
>> +               (++icount < opt_instruction_limit)) );
>
> Hmm, if the initializer of opt_instruction_limit was UINT_MAX, I think
> this wouldn't severely impact results (running 4 billion emulations is
> simply going to take too long) and this expression could be a simple
> comparison.

Yes, we could do that -- we'd have to change the argument parsing code
to handle that case instead, but that's probably a better trade-off.

And I don't have to argue about how having an initializer makes it
easier to understand what's going on even if it's not strictly
necessary. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
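
For illustration, the simplified loop condition discussed above might
look roughly like the sketch below.  This is not the patch as
committed: run_loop(), EMUL_OKAY and EMUL_DONE are hypothetical
stand-ins for runtest(), X86EMUL_OKAY and a non-okay result, and the
emulator call is replaced by a counter so the fragment compiles on its
own.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for X86EMUL_OKAY and any non-okay result. */
enum { EMUL_OKAY = 0, EMUL_DONE = 1 };

/* UINT_MAX as the default means "effectively unlimited". */
unsigned int opt_instruction_limit = UINT_MAX;

/*
 * Sketch of the emulation loop with a plain comparison.  The fake
 * emulator succeeds okay_calls times and then reports EMUL_DONE.
 * Returns the number of emulation calls that were made.
 */
unsigned int run_loop(unsigned int okay_calls)
{
    unsigned int icount = 0, calls = 0;
    int rc;

    /*
     * With a plain comparison, 0 no longer means "no limit"; the
     * argument parsing would have to map that case to UINT_MAX.
     */
    assert(opt_instruction_limit > 0);

    do {
        /* Stand-in for rc = x86_emulate(ctxt, &state->ops); */
        rc = (calls < okay_calls) ? EMUL_OKAY : EMUL_DONE;
        calls++;
    } while ( rc == EMUL_OKAY && ++icount < opt_instruction_limit );

    return calls;
}
```

With opt_instruction_limit set to 3, the loop stops after three
emulation calls even if the fake emulator would keep succeeding; with
the UINT_MAX default, it runs until the first non-okay result.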
