
Re: [Xen-devel] Using SYSCALL/SYSRET with a minios kernel



Daniel Stodden <stodden@xxxxxxxxxx> writes:

> On Mon, 2008-02-25 at 11:04 +0100, Goswin von Brederlow wrote:
>
>> >> --- kernel.c ---
>> >>   HYPERVISOR_set_callbacks((unsigned long)hypervisor_callback,
>> >>                      (unsigned long)failsafe_callback,
>> >>                      (unsigned long)syscall_callback);
>> >> 
>> >>   __asm__ __volatile__("syscall");
>> >> 
>> >> If I understood you right that should set the RIP to syscall_callback
>> >> and execute from there.
>> >
>> > Mööp! Only when calling in from virtual user mode. Otherwise, you're
>> > triggering a hypercall service routine, and one might suspect you're
>> > presently just generating an error condition with that. :)
>> 
>> That sounds very odd. I'm getting no indication of it from xen.
>
> Why odd? That's how e.g. syscall processing in Xen's entry.S is structured.
> Many hypercalls fail with messages. But e.g. an invalid hypercall number
> would silently return -ENOSYS, so it does not appear too unlikely. 
> What do you get instead?

Nothing. The 'syscall' instruction behaves like a 'nop'. If Xen's
syscall emulation fails then I would expect it to raise some
exception, e.g. illegal instruction.

The amd64 tech docs indicate that syscall can be used recursively
(indicated by SS == 0) and no check on the CPL is performed by
'syscall'. So I would expect that I could call 'syscall' even in
kernel mode in Xen too. But now that I know better that is ok too.

>> But ok. How do I test that? Or differently phrased: What is the best
>> way to go into user space for the very first time? Do I really have
>> to create a fake stack frame and call HYPERVISOR_iret?
>
> iret is the only method I am aware of, can't think about anything else. Doubt
> that a stack switch would be forcibly required.
>
> Does not necessarily mean much, however, since I did not write the freaky
> thing.

I added the following code to x86_64.S:

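ENTRY(fail)
        syscall                 /* issued from user mode after the iret */
        int $3
        jmp fail

ENTRY(go_user)
        pushq $0xe02b           /* SS */
        pushq %rsp              /* RSP: stack pointer before this push ... */
        subq  $64,(%rsp)        /* ... moved down by 64 bytes */
        pushq $0x10212          /* RFLAGS, IF set */
        pushq $0xe030           /* CS */
        pushq $fail             /* RIP: resume at 'fail' */
        orb   $3,1*8(%rsp)      /* CS |= 3 (ring 3) */
        orb   $3,4*8(%rsp)      /* SS |= 3 (ring 3) */
        pushq $0                /* flags word of the iret frame */
        jmp  hypercall_page + (__HYPERVISOR_iret * 32)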

So I construct a stack frame that looks like an interrupt happened and
the next instruction to run is at 'fail'. I set the ring to ring3
(orb) and then do an iret.
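
For anyone who prefers to see the frame as a data structure, here is a
rough C sketch of what go_user leaves on the stack. The field order is
meant to follow struct iret_context from Xen's public x86_64 header. The
names iret_frame and build_user_frame are made up for this mail, and the
three register slots are really filled by the hypercall_page stub, so
the zeros there are only placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame as seen by HYPERVISOR_iret, lowest address first.
 * Hypothetical mirror of Xen's struct iret_context (x86_64). */
struct iret_frame {
    uint64_t rax, r11, rcx;  /* pushed by the hypercall_page stub */
    uint64_t flags;          /* the "pushq $0" in go_user */
    uint64_t rip, cs, rflags, rsp, ss;
};

/* Fill the frame with the same constants go_user pushes. */
static void build_user_frame(struct iret_frame *f,
                             uint64_t entry, uint64_t user_rsp)
{
    assert(f != NULL);
    f->rax = f->r11 = f->rcx = 0;  /* placeholders, see note above */
    f->flags  = 0;
    f->rip    = entry;             /* address of 'fail' */
    f->cs     = 0xe030 | 3;        /* orb $3 on the CS slot */
    f->rflags = 0x10212;
    f->rsp    = user_rsp;
    f->ss     = 0xe02b | 3;        /* orb $3 on the SS slot */
}
```

The rip slot ends up 32 bytes above the lowest slot, which is why the
two orb instructions address CS and SS as 1*8(%rsp) and 4*8(%rsp) before
the final pushq $0.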

The code switches RSP and RIP and continues executing 'fail' in
user mode. I know for sure it is user mode because at first it gave an
error that there is no user page table set (see below). Anyway, in
user mode the 'syscall' then works.

>> > BTW: I found building Xen with 'debug=y' generates a helpful comment on
>> > the console every now and xen.
>> 
>> I did that and added a patch that makes HYPERVISOR_console_io work for
>> domU so it shows up in "xm dmesg".
>
> Ah, I see. Good idea.
>
>> >> But still, the syscall opcode does nothing.
>> >> In case you wonder. The "int $80" is there to crash the domain and
>> >> tell me it reached that point.
>
> Shouldn't that just get you a GPF? 

Which calls do_general_protection, if installed, which dumps the
registers:
GPF rip: 00000000001031fc, error_code=282
RIP: e030:[<00000000001031fc>] 
RSP: e02b:0000000000121fc8  EFLAGS: 00010212
RAX: 0000000000000000 RBX: 0000000000119000 RCX: 00000000001031fc
RDX: 0000000000000100 RSI: 00000000deadbeef RDI: 00000000deadbeef
RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
R10: 00000000fffffff9 R11: 0000000000000212 R12: 00000000001033ea
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
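
As a cross-check, the error code in that dump can be decoded by hand.
The sketch below assumes the value is printed in hex (so 0x282) and uses
the standard x86 selector error code layout: bit 0 EXT, bit 1 IDT,
bit 2 TI, bits 3-15 index. The names sel_error, decode_selector_error
and check_dump_value are made up for this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 selector-style exception error code. */
struct sel_error {
    unsigned ext;    /* bit 0: caused by an external event */
    unsigned idt;    /* bit 1: index refers to an IDT gate */
    unsigned ti;     /* bit 2: LDT (1) or GDT (0), only if idt == 0 */
    unsigned index;  /* bits 3-15: selector or vector index */
};

static struct sel_error decode_selector_error(uint32_t code)
{
    struct sel_error e;
    e.ext   = code & 1u;
    e.idt   = (code >> 1) & 1u;
    e.ti    = (code >> 2) & 1u;
    e.index = (code >> 3) & 0x1fffu;
    return e;
}

/* Check against the dump: 0x282 should name IDT entry 80. */
static void check_dump_value(void)
{
    struct sel_error e = decode_selector_error(0x282);
    assert(e.idt == 1 && e.index == 80);
}
```

Under that assumption the fault points at IDT entry 80 (decimal), which
matches the "int $80" used to crash the domain.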

Without a handler (or in other instances) the domain truly crashes and
'xm dmesg' has a nice register and stack dump like:

(XEN) traps.c:212:d18 Guest switching to user mode with no user page tables
(XEN) traps.c:241:d18 Fatal error
(XEN) Domain 18 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.0.4-1  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<00000000001022eb>]
(XEN) RFLAGS: 0000000000000206   CONTEXT: guest
(XEN) rax: 0000000000000017   rbx: 0000000000119000   rcx: 00000000001022eb
(XEN) rdx: 0000000000000100   rsi: 00000000deadbeef   rdi: 00000000deadbeef
(XEN) rbp: 0000000000000000   rsp: 0000000000108d00   r8:  00000000ffffffff
(XEN) r9:  0000000000000000   r10: 00000000fffffffc   r11: 0000000000000206
(XEN) r12: 00000000001033ea   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 000000005b50f000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=0000000000108d00:
(XEN)    0000000000000000 0000000000000001 000000000011b000 0000000000000000
(XEN)    00000000001033ea 000000000000e033 0000000000010212 0000000000108d00
(XEN)    000000000000e02b 0000000000105872 0000000000000000 0000000000000000
...

I find both rather helpful in debugging. :)

> regards,
> Daniel

MfG
        Goswin
