[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 3/3] xen/arm: stack check instrumentation
On 7/30/24 14:07, Julien Grall wrote: > Hi, > > On 30/07/2024 18:50, Stewart Hildebrand wrote: >> On 7/29/24 14:36, Julien Grall wrote: >>> Hi Stewart, >>> >>> On 29/07/2024 15:24, Stewart Hildebrand wrote: >>>> +DEFINE_PER_CPU(unsigned int, stack_check_nesting); >>>> +DEFINE_PER_CPU(unsigned char *, stack_base); >>> >>> I think this could be "const unsigned char *" as the stack should not be >>> modified directly. >> >> Every time there's a vcpu context switch we will have a new stack. > > I am not sure I follow. "const unsigned char *" should still allow you to > update stack_base. It will just prevent anyone to try to write to modify the > stack via stack_base. Ah, of course. You're right. I'll change it to "const unsigned char *". > >> >>> >>>> + >>>> +void __attribute__((no_instrument_function)) stack_set(unsigned char >>>> *base) >>>> +{ >>>> + this_cpu(stack_base) = base; >>>> +} >>>> + >>>> +void __init __attribute__((no_instrument_function)) stack_check_init(void) >>>> +{ >>>> + this_cpu(stack_check_nesting) = 0; >>>> + stack_set(init_data.stack); >>>> +} >>>> + >>>> +__attribute__((no_instrument_function)) >>>> +void __cyg_profile_func_enter(void *this_fn, void *call_site) >>>> +{ >>>> + unsigned char *sp; >>>> + >>>> + if ( get_per_cpu_offset() == INVALID_PER_CPU_OFFSET ) >>>> + return; >>>> + >>>> + asm volatile ("mov %0, sp" : "=r" (sp) ); >>>> + >>>> + if ( sp < this_cpu(stack_base) || >>>> + sp > (this_cpu(stack_base) + STACK_SIZE) ) >>> >>> The top of the stack is used to store struct cpu_info. So you want to >>> substract its size (see arch_vcpu_create()). >> >> Will do. >> >>> >>>> + { >>>> + if ( this_cpu(stack_check_nesting) ) >>>> + return; >>>> + >>>> + this_cpu(stack_check_nesting)++; >>> >>> Looking at the use, it only seems to be used as "print/panic once". So >>> maybe use a bool to avoid any overflow. >> >> It will only ever be incremented once. I'll still change it to a bool, >> this should make it more obvious. >> >>> >>>> + printk("CPU %d stack pointer out of bounds (sp %#lx, stack base >>>> %#lx)\n", >>>> + smp_processor_id(), (uintptr_t)sp, >>>> + (uintptr_t)this_cpu(stack_base)); >>>> + BUG(); >>> >>> I would consider to call panic(). >> >> panic() alone doesn't show the stack trace / call trace. > > Ah good point. But TBH, I have never really understood why panic() didn't > return a call stack. There are a few places where I found beneficial when > debugging. > > Anyway, I guess this could be handled separately. > >> >>> But is it safe to call any of this if we blew the stack? >> >> Nope, it sure isn't! >> >>> IOW, should we have a buffer? >> >> Yes. After some experimentation, I found that this printk and a WARN >> (similar to BUG, but resumes execution and allows me to collect these >> metrics) uses approximately 1632 bytes of stack. Assuming BUG uses a >> similar amount of stack as WARN, and adding in a comfortable margin for >> error, I'll add a 4096 byte buffer (i.e. invoke the print/BUG with 4096 >> bytes remaining on the stack). > > AFAICT, the stack on Arm is 32KB. So we 1/8 of the stack as a buffer. Do you > know the current stack use in a normal setup (e.g. boot a guest)? In my particular test case simply booting a dom0, it uses about 14k of stack. Of course this could vary with booting dom0less domUs, complexity of device tree parsing, etc. > > Anyway, so long the feature is not enabled in production, then it might be ok > to steal 4KB. We could increase the stack if we see any issue. > > Cheers, >
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |