Re: [Xen-devel] [PATCH v3 2/2] xen: merge temporary vcpu pinning scenarios

On 26.07.19 12:13, Andrew Cooper wrote:
> On 26/07/2019 10:53, Juergen Gross wrote:
>> On 26.07.19 11:46, Andrew Cooper wrote:
>>> On 24/07/2019 12:26, Juergen Gross wrote:
>>>> diff --git a/xen/common/wait.c b/xen/common/wait.c
>>>> index 4f830a14e8..3fc5f68611 100644
>>>> --- a/xen/common/wait.c
>>>> +++ b/xen/common/wait.c
>>>> @@ -34,8 +34,6 @@ struct waitqueue_vcpu {
>>>>       */
>>>>      void *esp;
>>>>      char *stack;
>>>> -    cpumask_t saved_affinity;
>>>> -    unsigned int wakeup_cpu;
>>>>  #endif
>>>>  };
>>>> @@ -131,12 +129,10 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>>>>      ASSERT(wqv->esp == 0);
>>>>
>>>>      /* Save current VCPU affinity; force wakeup on *this* CPU only. */
>>>> -    wqv->wakeup_cpu = smp_processor_id();
>>>> -    cpumask_copy(&wqv->saved_affinity, curr->cpu_hard_affinity);
>>>> -    if ( vcpu_set_hard_affinity(curr, cpumask_of(wqv->wakeup_cpu)) )
>>>> +    if ( vcpu_temporary_affinity(curr, smp_processor_id(), VCPU_AFFINITY_WAIT) )
>>>>      {
>>>>          gdprintk(XENLOG_ERR, "Unable to set vcpu affinity\n");
>>>> -        domain_crash(current->domain);
>>>> +        domain_crash(curr->domain);
>>>>
>>>>          for ( ; ; )
>>>>              do_softirq();
>>>> @@ -170,7 +166,7 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>>>>      if ( unlikely(wqv->esp == 0) )
>>>>      {
>>>>          gdprintk(XENLOG_ERR, "Stack too large in %s\n", __func__);
>>>> -        domain_crash(current->domain);
>>>> +        domain_crash(curr->domain);
>>>>
>>>>          for ( ; ; )
>>>>              do_softirq();
>>>> @@ -182,30 +178,24 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>>>>  static void __finish_wait(struct waitqueue_vcpu *wqv)
>>>>  {
>>>>      wqv->esp = NULL;
>>>> -    (void)vcpu_set_hard_affinity(current, &wqv->saved_affinity);
>>>> +    vcpu_temporary_affinity(current, NR_CPUS, VCPU_AFFINITY_WAIT);
>>>>  }
>>>>
>>>>  void check_wakeup_from_wait(void)
>>>>  {
>>>> -    struct waitqueue_vcpu *wqv = current->waitqueue_vcpu;
>>>> +    struct vcpu *curr = current;
>>>> +    struct waitqueue_vcpu *wqv = curr->waitqueue_vcpu;
>>>>
>>>>      ASSERT(list_empty(&wqv->list));
>>>>
>>>>      if ( likely(wqv->esp == NULL) )
>>>>          return;
>>>>
>>>> -    /* Check if we woke up on the wrong CPU. */
>>>> -    if ( unlikely(smp_processor_id() != wqv->wakeup_cpu) )
>>>> +    /* Check if we are still pinned. */
>>>> +    if ( unlikely(!(curr->affinity_broken & VCPU_AFFINITY_WAIT)) )
>>>>      {
>>>> -        /* Re-set VCPU affinity and re-enter the scheduler. */
>>>> -        struct vcpu *curr = current;
>>>> -        cpumask_copy(&wqv->saved_affinity, curr->cpu_hard_affinity);
>>>> -        if ( vcpu_set_hard_affinity(curr, cpumask_of(wqv->wakeup_cpu)) )
>>>> -        {
>>>> -            gdprintk(XENLOG_ERR, "Unable to set vcpu affinity\n");
>>>> -            domain_crash(current->domain);
>>>> -        }
>>>> -        wait(); /* takes us back into the scheduler */
>>>> +        gdprintk(XENLOG_ERR, "vcpu affinity lost\n");
>>>> +        domain_crash(curr->domain);
>>>>      }
>>>
>>> I'm sorry to retract my R-by after the fact, but I've only just noticed
>>> (while rebasing some of my pending work over this) that it is buggy.
>>>
>>> The reason wait() was called is because it is not safe to leave that
>>> if() clause.
>>>
>>> With this change in place, we'll arrange for the VM to be crashed, then
>>> longjmp back into the stack of the waiting vCPU, on the wrong CPU. Any
>>> caller with smp_processor_id() or thread-local variables cached by
>>> pointer on the stack will then cause memory corruption.
>>>
>>> It's not immediately obvious how to fix this, but bear in mind that as
>>> soon as the vm-event interface is done, I plan to delete this whole
>>> waitqueue infrastructure anyway.
>>
>> Shouldn't just calling wait() after domain_crash() be fine then? That's
>> what would have happened in the original error case, too.
>
> No - I don't think so. That was to try and get back into a position
> where the scheduler rescheduled this vcpu on the correct cpu, so it
> could safely longjmp back into context.
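
To make the failure mode described above concrete, here is a hypothetical
waitqueue caller. It is only a sketch, not code from the patch or from the
tree; the names my_counter, my_wq, some_condition() and
hypothetical_waiter() are made up. It caches a pointer to per-CPU data in a
stack local across wait_event(), which is exactly the kind of state that
goes stale if the saved stack frame is longjmp'd back into on a different
pCPU:

    /* Hypothetical example only -- none of these names exist in the tree. */
    static DEFINE_PER_CPU(unsigned long, my_counter);
    static struct waitqueue_head my_wq;   /* assumed initialised elsewhere */
    bool some_condition(void);            /* assumed event predicate */

    static void hypothetical_waiter(void)
    {
        /* Pointer into *this* pCPU's data, cached in a stack local. */
        unsigned long *cnt = &this_cpu(my_counter);

        /*
         * wait_event() saves this stack frame and deschedules the vCPU.
         * Resuming is only safe on the pCPU the frame was saved on: if the
         * vCPU is instead allowed to continue on another pCPU after
         * domain_crash(), 'cnt' now points into a different CPU's data.
         */
        wait_event(my_wq, some_condition());

        (*cnt)++;   /* memory corruption if resumed on the wrong pCPU */
    }

Anything else derived from smp_processor_id() before the wait has the same
problem.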
But there was a domain_crash() in the code I removed.

If this already was a problem, then I guess the domain_crash() might need
to be replaced by panic(). The only case I'm aware of where this situation
could arise would be a suspend/resume cycle where wait_event() was active
and not all cpus came up again on resume. That seems to be quite
improbable.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
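
For reference, a rough sketch of what the panic() variant suggested above
might look like in check_wakeup_from_wait(); this is illustrative only, not
a posted patch, and the message text is made up:

    /* Check if we are still pinned. */
    if ( unlikely(!(curr->affinity_broken & VCPU_AFFINITY_WAIT)) )
        /*
         * Losing the VCPU_AFFINITY_WAIT pin should only be possible if a
         * pCPU failed to come back up across suspend/resume, so treat it
         * as fatal for the host instead of crashing the domain and then
         * resuming the saved stack on the wrong pCPU.
         */
        panic("vcpu affinity lost while on a waitqueue\n");

Whether a host-wide panic() is acceptable here hinges on how improbable the
failed-resume case really is, as discussed above.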