[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 11/13] fuzz/x86_emulate: Add --rerun option to try to track down instability
On 10/04/2017 09:27 AM, Jan Beulich wrote: >>>> On 25.09.17 at 16:26, <george.dunlap@xxxxxxxxxx> wrote: >> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c >> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c >> @@ -14,6 +14,7 @@ extern unsigned int fuzz_minimal_input_size(void); >> static uint8_t input[INPUT_SIZE]; >> >> extern bool opt_compact; >> +extern bool opt_rerun; > > Seeing a second such variable appear, I think it would really be better > to introduce a local header, included by both producer and consumer. Yes, sounds good. > >> @@ -886,21 +896,138 @@ int LLVMFuzzerInitialize(int *argc, char ***argv) >> return 0; >> } >> >> -int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size) >> +void setup_fuzz_state(struct fuzz_state *state, const uint8_t *data_p, >> size_t size) > > static (also for other new helper functions)? Ack > >> { >> - struct fuzz_state state = { >> - .ops = all_fuzzer_ops, >> - }; >> - struct x86_emulate_ctxt ctxt = { >> - .data = &state, >> - .regs = &state.regs, >> - .addr_size = 8 * sizeof(void *), >> - .sp_size = 8 * sizeof(void *), >> - }; >> + memset(state, 0, sizeof(*state)); >> + state->corpus = (struct fuzz_corpus *)data_p; > > Please don't cast away constness here. Perhaps best to make the parameter > const void *, allowing for the cast to be dropped altogether. Didn't notice that I was casting away const-ness, because state->corpus is const. :-) But I like `const void *`. > >> +int runtest(struct fuzz_state *state) { >> int rc; >> >> - /* Reset all global state variables */ >> - memset(&input, 0, sizeof(input)); >> + struct x86_emulate_ctxt *ctxt = &state->ctxt; >> + >> + state->ops = all_fuzzer_ops; >> + >> + ctxt->data = state; >> + ctxt->regs = &state->regs; >> + ctxt->addr_size = ctxt->sp_size = 8 * sizeof(void *); > > Is this actually necessary? I don't see a way for set_sizes() to be > bypassed. I was just duplicating the functionality that was already there (these were initialized at declaration). My instinct is to want to leave these initialize these for safety, but the code definitely should call set_sizes() first, so I'll take this out. > >> +void compare_states(struct fuzz_state state[2]) >> +{ >> + // First zero any "internal" pointers >> + state[0].corpus = state[1].corpus = NULL; >> + state[0].ctxt.data = state[1].ctxt.data = NULL; >> + state[0].ctxt.regs = state[1].ctxt.regs = NULL; >> + >> + > > No double blank lines please. > >> + if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) ) >> + { >> + int i; > > unsigned int (and then %u in the format strings below) Is there really an advantage to specifying 'unsigned int' for something like a loop variable? It hardly seems worth the effort to consider signed / unsigned in such a case. >> + printf("State mismatch\n"); >> + >> + for ( i=0; i<5; i++) > > Blanks missing and please use ARRAY_SIZE() again (also further down). Ack. > >> + if ( state[0].cr[i] != state[1].cr[i] ) >> + printf("cr[%d]: %lx != %lx\n", >> + i, state[0].cr[i], state[1].cr[i]); >> + >> + for ( i=0; i<MSR_INDEX_MAX; i++) >> + if ( state[0].msr[i] != state[1].msr[i] ) >> + printf("msr[%d]: %lx != %lx\n", >> + i, state[0].msr[i], state[1].msr[i]); >> + >> + for ( i=0; i<SEG_NUM; i++) >> + if ( memcmp(&state[0].segments[i], &state[1].segments[i], >> + sizeof(state[0].segments[0])) ) >> + printf("segments[%d] differ!\n", i); > > The actual values would likely be helpful to be printed here, just like > you do for all other state elements. Sure. > >> + if ( state[0].data_num != state[1].data_num ) >> + printf("data_num: %lx != %lx\n", state[0].data_num, >> + state[1].data_num); >> + if ( state[0].data_index != state[1].data_index ) >> + printf("data_index: %lx != %lx\n", state[0].data_index, >> + state[1].data_index); >> + >> + if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) ) >> + { >> + printf("registers differ!\n"); >> + /* Print If Not Equal */ >> +#define PINE(elem)\ >> + if ( state[0].elem != state[1].elem ) \ >> + printf(#elem " differ: %lx != %lx\n", \ >> + (unsigned long)state[0].elem, \ >> + (unsigned long)state[1].elem) >> + PINE(regs.r15); >> + PINE(regs.r14); >> + PINE(regs.r13); >> + PINE(regs.r12); >> + PINE(regs.rbp); >> + PINE(regs.rbx); >> + PINE(regs.r10); >> + PINE(regs.r11); >> + PINE(regs.r9); >> + PINE(regs.r8); >> + PINE(regs.rax); >> + PINE(regs.rcx); >> + PINE(regs.rdx); >> + PINE(regs.rsi); >> + PINE(regs.rdi); >> + >> + for ( i = offsetof(struct cpu_user_regs, error_code) / >> sizeof(unsigned); >> + i < sizeof(state[1].regs)/sizeof(unsigned); i++ ) >> + { >> + printf("[%04lu] %08x %08x\n", > > I think this wants to be %04zu (or perhaps better %4zu or %04zx). Same > for ctxt printing further down. > >> @@ -908,7 +1035,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, >> size_t size) >> return 1; >> } >> >> - if ( size > sizeof(input) ) >> + if ( size > sizeof(struct fuzz_corpus) ) > > What's the difference between the two variants? One fewer 'dereferences'. Rather than input -> struct fuzz_corpus -> [structure definition], you can just do struct fuzz_corpus -> [structure definition]. >> @@ -916,25 +1043,24 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, >> size_t size) >> >> memcpy(&input, data_p, size); >> >> - state.corpus = &input; >> - state.data_num = size; >> - >> - setup_state(&ctxt); >> + setup_fuzz_state(&state[0], data_p, size); >> + >> + if ( opt_rerun ) >> + printf("||| INITIAL RUN |||\n"); >> + >> + runtest(&state[0]); >> >> - sanitize_input(&ctxt); >> + if ( !opt_rerun ) >> + return 0; > > Could I talk you into inverting the condition such that there'll be > only a single "return 0" at the end of the function? Why is that valuable? If I don't return here, then the rerun code has to be indented, which to me makes it slightly more difficult to see that it's identical to the state setup & running code above. > And then - has this patch actually helped pinpoint any problems? The > ones deaslt with by the next patch perhaps? Yes, it helped find the one dealt with in the subsequent patch. Surprisingly, it didn't find anything else. Since the patch represented a non-trivial amount of work, I thought it might be useful to include so nobody would have to re-implement it again in the future. But I'd also be happy discarding this patch, as it's fairly invasive and I don't expect it to be used very often. Let me know what you think so I know whether to try to print something sensible for the segment differences or not bother. :-) -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |