[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 05/18] xen/arm: Trigger Xen suspend when Dom0 completes suspend



Hi Julien,

On Wed, Nov 14, 2018 at 6:10 PM Julien Grall <julien.grall@xxxxxxx> wrote:
>
> Hi Mirela,
>
> On 14/11/2018 15:40, Mirela Simonovic wrote:
> > On Wed, Nov 14, 2018 at 4:07 PM Julien Grall <julien.grall@xxxxxxx> wrote:
> >> On 12/11/2018 11:30, Mirela Simonovic wrote:
> >>> When Dom0 finalizes its suspend procedure the suspend of Xen is
> >>> triggered by calling system_suspend(). Dom0 finalizes the suspend from
> >>> its boot core (VCPU#0), which could be mapped to any physical CPU,
> >>> i.e. the system_suspend() function could be executed by any physical
> >>> CPU. Since Xen suspend procedure has to be run by the boot CPU
> >>> (non-boot CPUs will be disabled at some point in suspend procedure),
> >>> system_suspend() execution has to continue on CPU#0.
> >>>
> >>> When the system_suspend() returns 0, it means that the system was
> >>> suspended and it is coming out of the resume procedure. Regardless
> >>> of the system_suspend() return value, after this function returns
> >>> Xen is fully functional, and its state, including all devices and data
> >>> structures, matches the state prior to calling system_suspend().
> >>> The status is returned by system_suspend() for debugging/logging
> >>> purposes and function prototype compatibility.
> >>>
> >>> Signed-off-by: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>
> >>> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xxxxxxxxxx>
> >>> ---
> >>>    xen/arch/arm/suspend.c | 34 ++++++++++++++++++++++++++++++++++
> >>>    1 file changed, 34 insertions(+)
> >>>
> >>> diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> >>> index f2338e41db..21b45f8248 100644
> >>> --- a/xen/arch/arm/suspend.c
> >>> +++ b/xen/arch/arm/suspend.c
> >>> @@ -112,11 +112,20 @@ static void vcpu_suspend(register_t epoint, 
> >>> register_t cid)
> >>>        _arch_set_info_guest(v, &ctxt);
> >>>    }
> >>>
> >>> +/* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
> >>> +static long system_suspend(void *data)
> >>> +{
> >>> +    BUG_ON(system_state != SYS_STATE_active);
> >>> +
> >>> +    return -ENOSYS;
> >>> +}
> >>> +
> >>>    int32_t domain_suspend(register_t epoint, register_t cid)
> >>>    {
> >>>        struct vcpu *v;
> >>>        struct domain *d = current->domain;
> >>>        bool is_thumb = epoint & 1;
> >>> +    int status;
> >>>
> >>>        dprintk(XENLOG_DEBUG,
> >>>                "Dom%d suspend: epoint=0x%"PRIregister", 
> >>> cid=0x%"PRIregister"\n",
> >>> @@ -156,6 +165,31 @@ int32_t domain_suspend(register_t epoint, register_t 
> >>> cid)
> >>>         */
> >>>        vcpu_block_unless_event_pending(current);
> >>>
> >>> +    /* If this was dom0 the whole system should suspend: trigger Xen 
> >>> suspend */
> >>> +    if ( is_hardware_domain(d) )
> >>> +    {
> >>> +        /*
> >>> +         * system_suspend should be called when Dom0 finalizes the 
> >>> suspend
> >>> +         * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 
> >>> could
> >>> +         * be mapped to any PCPU (this function could be executed by any 
> >>> PCPU).
> >>> +         * The suspend procedure has to be finalized by the PCPU#0 
> >>> (non-boot
> >>> +         * PCPUs will be disabled during the suspend).
> >>> +         */
> >>> +        status = continue_hypercall_on_cpu(0, system_suspend, NULL);
> >>
> >> Based on my comment in patch #2, I don't think this will do the correct 
> >> thing on
> >> Dom0. x0 will not contain cid but PSCI_SUCCESS has it is overriden in
> >> do_vpsci_0_2_call.
> >>
> >
> > Could you please explain? I can't follow
>
> General purpose (e.g xN, pc) registers live at the bottom of the vCPU stack. 
> The
> function vcpu_suspend will reset all of them to 0 but x0 (Context ID) and pc
> (entry point).
>
> You rely on those registers to be untouched in the return path. However, this 
> is
> not the case. If you look at do_vpsci_0_2_call, x0 will be set to whatever is
> the return value of domain_suspend (e.g PSCI_SUSPEND). So x0 will not contain
> anymore the Context ID as expected by the guest.

This is expected, the system should behave that way. If the guest
managed to suspend, i.e. domain_suspend has returned PSCI_SUCCESS, the
return value is meaningless to the guest because it won't return to
it. The guest has suspended, just the fact the it will start from an
another entry point implicitly carries the information that the
suspend was successful.
However, if the return value from domain_suspend is an error then the
error will be returned to the guest because the suspend has failed.
Then the x0 register should contain error code, not the context ID.
The PC should be untouched, i.e. it should not contain the resume
entry point, but whatever was in there once the hvc/system_suspend was
issued by the guest. Guests handle errors right below the code from
which they tried to suspend.

>
> You probably haven't noticed it because Linux is currently using 0 for the
> context ID. This is the same value as PSCI_SUCCESS.
>
> In the case of Dom0, this is a bit different to what I explained in my 
> previous
> e-mail because I got confused. The function continue_hypercall_on_cpu is not
> doing what you expect. The function will pause the vCPU, schedule the tasklet
> and then return directly.
>
> At some point the tasklet will get scheduled on CPU#0 and system_suspend will 
> be
> called. The return value of that function will be copied to x0. The vCPU will
> then get unpaused and continue to run.
>
> So x0 will not contain the Context ID but whatever system_suspend return.
>

I agree with all you described above, but that is intended - the
system should behave that way. I believe the cause of misunderstanding
could be in how the return value versus context ID is handled. Those
are different paths, and only one of the following executes: 1) either
the suspend is successful and nothing will be returned to the guest
because it is suspended; or 2) the suspend of the guest has failed so
context ID is useless.

> The more I look at the code, the more I think that domain_suspend should be
> executed in a tasklet (e.g continue_hypercall_on_cpu) because the vCPU would 
> not
> be running anymore. So you don't have to worry of anyone modifying the 
> registers
> and therefore can access them safely as they would all be synced to the
> structure vCPU.
>

Lets revisit this once we align on the points above. Please let me
know if there is anything else I need to clarify, or if I
misunderstood your point.

Thanks,
Mirela

> The tasklet code would be very similar to hvm_s3_suspend.
>
> >
> >> As upper layer may modify the vCPU context, I think it would be best to run
> >> system_suspend in a tasklet bound to CPU#0.
> >>
> >
> > I'm not following this too, please explain
> I tried to cover it in my explanation above. Let me know if it is unclear.
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.