ABI Interface

From Linux/Xtensa
Revision as of 01:17, 10 November 2018 by Jcmvbkbc (talk | contribs) (fix ordinary call return value description)

Argument Passing

Arguments are passed in both registers and memory. The first six incoming arguments are stored in registers a2 through a7, and additional arguments are stored on the stack starting at the current stack pointer a1. Large (8-byte) arguments are always passed in an even/odd register pair, even if that means skipping a register for alignment.

Because Xtensa uses register windows that rotate during a function call, the caller must store outgoing arguments in register numbers different from those in which the callee sees them as incoming arguments. Depending on the call instruction and, thus, the rotation of the register window, the arguments are passed starting with register a(2+N), where N is the size of the window rotation. Therefore, the first argument is placed into a6 for a call4 instruction and into a10 for a call8 instruction.

Return values are stored in a2 through a5 of the callee. With call12, only the callee's a2 and a3 map to registers visible to the caller (a14 and a15), so a function whose return value occupies more than two registers may not be called with call12.

          return addr  stack ptr       arg0, arg1, arg2, arg3, arg4, arg5
          -----------  ---------       ----------------------------------
            a0           a1              a2,   a3,   a4,   a5,   a6,   a7

call4       a4           a5              a6,   a7,   a8,   a9,  a10,  a11
call8       a8           a9             a10,  a11,  a12,  a13,  a14,  a15
call12     a12          a13             a14,  a15   ---   ---   ---   --- 
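
The register numbering in the table can be reproduced with a little arithmetic. The following C sketch is only an illustration: the helper names outgoing_arg_reg and first_slot_for are hypothetical and not part of any toolchain. It assumes argument slots are numbered 0 to 5 and that a rotation of 0 stands for the callee's own view (a2 through a7).

```c
#include <assert.h>

enum { FIRST_ARG_REG = 2, NUM_ARG_REGS = 6, NUM_VISIBLE_REGS = 16 };

/*
 * Register number the caller uses for argument slot `slot` when the call
 * rotates the window by `rotation` registers (4 for call4, 8 for call8,
 * 12 for call12; 0 gives the callee's own numbering).
 * Returns -1 if that register lies beyond a15.
 */
static int outgoing_arg_reg(int rotation, int slot)
{
    assert(rotation == 0 || rotation == 4 || rotation == 8 || rotation == 12);
    assert(slot >= 0 && slot < NUM_ARG_REGS);
    int reg = FIRST_ARG_REG + rotation + slot;
    return reg < NUM_VISIBLE_REGS ? reg : -1;
}

/*
 * First slot used by an argument of `size_bytes` (4 or 8) when
 * `next_free_slot` is the next unused slot. An 8-byte argument must start
 * at an even slot, so an odd slot is skipped. Because a2 is even and every
 * rotation is a multiple of 4, an even slot is always an even register.
 */
static int first_slot_for(int next_free_slot, int size_bytes)
{
    assert(size_bytes == 4 || size_bytes == 8);
    if (size_bytes == 8 && (next_free_slot % 2) != 0)
        return next_free_slot + 1;
    return next_free_slot;
}
```

For a hypothetical function taking (int, long long, int), the slots would be 0, 2, and 4: the callee sees the arguments in a2, a4/a5, and a6, and a3 is left unused.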

Syscall ABI

Linux takes system-call arguments in registers. The ABI and Xtensa software conventions require the system-call number in a2. For efficiency, the arguments are not all shifted up by one register to preserve their original order. Instead, only the displaced ones move: the first argument, which would normally occupy a2, is moved to a6, and the arguments that would normally occupy a6 and a7 are moved to a8 and a9, respectively, if the system call takes that many arguments.

syscall number               arg0, arg1, arg2, arg3, arg4, arg5
--------------               ----------------------------------
a2                           a6,   a3,   a4,   a5,   a8,   a9

Upon return, a2 contains the return value or error code. All other registers are preserved.
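
As a rough illustration of this mapping, the following C sketch fills a simulated register file the way a system-call stub would. The struct regs type and the load_syscall_regs helper are hypothetical names used only for this example; real code would place the values in the actual registers before executing the syscall instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Register number that holds the system-call number. */
enum { SYSCALL_NR_REG = 2, MAX_SYSCALL_ARGS = 6 };

/* Register number for each system-call argument, in argument order. */
static const int syscall_arg_reg[MAX_SYSCALL_ARGS] = { 6, 3, 4, 5, 8, 9 };

/* Simulated register file a0..a15 (illustration only). */
struct regs {
    long a[16];
};

/* Place the system-call number and up to six arguments in their registers. */
static void load_syscall_regs(struct regs *r, long nr,
                              const long *args, size_t nargs)
{
    assert(nargs <= MAX_SYSCALL_ARGS);
    r->a[SYSCALL_NR_REG] = nr;
    for (size_t i = 0; i < nargs; i++)
        r->a[syscall_arg_reg[i]] = args[i];
}
```

Only arg0, arg4, and arg5 end up in a register other than the one an ordinary call would use; arg1 through arg3 stay in a3 through a5.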


TLS and NPTL implementation

This is a description of the Xtensa-specific aspects of supporting thread-local storage (TLS). Originally developed by Sun, the implementation concept for TLS has been modified and generalized by Ulrich Drepper for Linux on a variety of platforms. "ELF Handling for Thread-Local Storage" provides background information and descriptions of the machine-independent aspects. Alex Oliva later developed an alternative and heavily optimized version for a Linux port to the FR-V processor and subsequently ported it to x86, x86_64, and ARM Linux systems. See also Oliva's implementation details for x86 and ARM. The current status of this alternative approach is somewhat ambiguous: it has been accepted into GCC and Binutils, but the glibc patches have not yet made it into the mainline tree. As a new port, however, Xtensa can use this optimized implementation without affecting legacy software. Although it would be convenient to share some of the platform-independent code, this is not required, and the Xtensa implementation will work regardless of whether those patches are ever accepted.


Run-Time Handling of TLS

The __tls_get_addr function has a different prototype with this approach:

extern void *__tls_get_addr (tls_index *ti);

(This is actually the same prototype for __tls_get_addr that is used for most processors with the standard approach.)


TLS Access Models

The Initial Exec and Local Exec models are the same as with the standard approach and are not shown here.

General Dynamic Model

For the general dynamic model, the code sequence loads a function pointer and a single argument from the GOT and calls the function with that argument. If the thread-local symbol is in the static TLS, the runtime linker will set the argument to the TLS offset and the function to a short routine that returns the offset plus the thread pointer. Otherwise, the function is set to __tls_get_addr and the argument is a pointer to the tls_index structure holding the offset and module values. This tls_index structure is dynamically allocated in a hash table by the runtime linker.

This approach allows lazy relocation processing for TLS references, which should improve the start-up times. The runtime linker can initialize the TLS function pointer in the GOT to a resolver function. The details of this are presumably similar to other processors and are not yet specified for Xtensa.
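
The two cases described above can be modeled in plain C. This is a minimal sketch under stated assumptions, not the runtime linker's actual code: the tls_index field names, the struct tls_descriptor type, the functions tlsdesc_static, tlsdesc_dynamic, and model_tls_get_addr, and the two byte arrays standing in for TLS blocks are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout; the text only says it holds the module and offset values. */
typedef struct {
    unsigned long ti_module;
    unsigned long ti_offset;
} tls_index;

/* Stand-in for the static TLS area addressed through THREADPTR. */
static char static_tls_block[64];
static char *thread_pointer = static_tls_block;

/* Stand-in for the dynamically allocated TLS block of module 1. */
static char dynamic_tls_block[64];

/* The function word in the GOT: takes the argument word, returns an address. */
typedef void *(*tls_desc_fn)(uintptr_t arg);

/* Static TLS case: the argument is the TLS offset itself. */
static void *tlsdesc_static(uintptr_t arg)
{
    return thread_pointer + arg;
}

/* Simplified model of __tls_get_addr for a single module. */
static void *model_tls_get_addr(tls_index *ti)
{
    assert(ti->ti_module == 1);
    return dynamic_tls_block + ti->ti_offset;
}

/* Dynamic TLS case: the argument is a pointer to a tls_index. */
static void *tlsdesc_dynamic(uintptr_t arg)
{
    return model_tls_get_addr((tls_index *)arg);
}

/* The two GOT words that the code sequence loads before the call. */
struct tls_descriptor {
    tls_desc_fn fn;
    uintptr_t arg;
};

/* What the compiled code does: call the function with the argument. */
static void *tls_address(const struct tls_descriptor *d)
{
    return d->fn(d->arg);
}
```

The calling code is identical in both cases; only the two words filled in by the runtime linker differ, which is what lets the static TLS case avoid the cost of a full lookup.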

Location   Code Sequence             Initial Relocations        Dynamic Relocations
--------   -------------             -------------------        -------------------
.literal   .LC0                      R_XTENSA_TLSDESC_FN (x)    R_XTENSA_TLSDESC_FN (x)
           .LC1                      R_XTENSA_TLSDESC_ARG (x)   R_XTENSA_TLSDESC_ARG (x)

.text      0x00  l32r   a8, .LC0     R_XTENSA_TLS_FUNC (x)
           0x03  l32r   a10, .LC1    R_XTENSA_TLS_ARG (x)
           0x06  callx8 a8           R_XTENSA_TLS_CALL (x)

The assembly syntax for this instruction sequence is:

movi   a8, x@TLSFUNC
movi   a10, x@TLSARG
callx8.tls a8, x@TLSCALL

This relies on the assembler to relax the movi instructions to l32r instructions, so that both the literals and the instructions get the appropriate relocations. The callx8.tls assembler macro generates a callx8 instruction with an extra relocation specified by its second operand.

Local Dynamic Model

This model uses the same relocations and functions as the general dynamic model except that it references a special linker-defined _TLS_MODULE_BASE_ symbol which is set to the start of the local TLS space.

Location   Code Sequence              Initial Relocations                        Dynamic Relocations
--------   -------------              -------------------                        -------------------
.literal   .LC0                       R_XTENSA_TLSDESC_FN (_TLS_MODULE_BASE_)    R_XTENSA_TLSDESC_FN (_TLS_MODULE_BASE_)
           .LC1                       R_XTENSA_TLSDESC_ARG (_TLS_MODULE_BASE_)   R_XTENSA_TLSDESC_ARG (_TLS_MODULE_BASE_)
           .LC2                       R_XTENSA_TLS_DTPOFF (x)
           .LC3                       R_XTENSA_TLS_DTPOFF (y)

.text      0x00  l32r   a8, .LC0      R_XTENSA_TLS_FUNC (_TLS_MODULE_BASE_)
           0x03  l32r   a10, .LC1     R_XTENSA_TLS_ARG (_TLS_MODULE_BASE_)
           0x06  callx8 a8            R_XTENSA_TLS_CALL (_TLS_MODULE_BASE_)
           ...
           0x09  l32r   a12, .LC2
           0x0c  add    a12, a12, a10
           ...
           0x0f  l32r   a13, .LC3
           0x12  add    a13, a13, a10

The assembly syntax for this code sequence is:

movi   a8, _TLS_MODULE_BASE_@TLSFUNC
movi   a10, _TLS_MODULE_BASE_@TLSARG
callx8.tls a8, _TLS_MODULE_BASE_@TLSCALL
...
movi   a12, x@DTPOFF
add    a12, a12, a10
...
movi   a13, y@DTPOFF
add    a13, a13, a10
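
The benefit of this model is that the module base is resolved once and then reused for every local thread-local variable. The following C sketch illustrates the idea only; tls_module_base, the DTPOFF constants, and the byte array standing in for the module's TLS block are hypothetical.

```c
#include <assert.h>

/* Stand-in for this module's TLS block in the current thread. */
static char module_tls_block[64];

/* Counts how often the module base is resolved. */
static int base_lookups;

/* Models the call made for _TLS_MODULE_BASE_: returns the block start. */
static char *tls_module_base(void)
{
    base_lookups++;
    return module_tls_block;
}

/* Assumed offsets of two thread-local variables x and y within the block. */
enum { X_DTPOFF = 0, Y_DTPOFF = 4 };

/* Compute both addresses from a single base lookup. */
static void addresses_of_x_and_y(char **px, char **py)
{
    char *base = tls_module_base();  /* result left in a10 by the call */
    *px = base + X_DTPOFF;           /* l32r a12, .LC2 ; add a12, a12, a10 */
    *py = base + Y_DTPOFF;           /* l32r a13, .LC3 ; add a13, a13, a10 */
}
```

Each additional variable costs one literal load and one add, with no further call.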


Linker Optimizations

General Dynamic -> Initial Exec

Code Sequence                 Relocations
-------------                 -----------
.literal .LC0                 R_XTENSA_TLSDESC_FN (x)
.literal .LC1                 R_XTENSA_TLSDESC_ARG (x)

0x00  l32r   a8, .LC0         R_XTENSA_TLS_FUNC (x)
0x03  l32r   a10, .LC1        R_XTENSA_TLS_ARG (x)
0x06  callx8 a8               R_XTENSA_TLS_CALL (x)

->

.literal .LC1                 R_XTENSA_TLS_TPOFF (x)

0x00  rur    a8, THREADPTR
0x03  l32r   a10, .LC1
0x06  add    a10, a10, a8

Local Dynamic -> Local Exec

Code Sequence                 Relocations
-------------                 -----------
.literal .LC0                 R_XTENSA_TLSDESC_FN (_TLS_MODULE_BASE_)
.literal .LC1                 R_XTENSA_TLSDESC_ARG (_TLS_MODULE_BASE_)
.literal .LC2                 R_XTENSA_TLS_DTPOFF (x)

0x00  l32r   a8, .LC0         R_XTENSA_TLS_FUNC (_TLS_MODULE_BASE_)
0x03  l32r   a10, .LC1        R_XTENSA_TLS_ARG (_TLS_MODULE_BASE_)
0x06  callx8 a8               R_XTENSA_TLS_CALL (_TLS_MODULE_BASE_)
...
0x09  l32r   a12, .LC2
0x0c  add    a12, a12, a10

->

.literal .LC2                 R_XTENSA_TLS_TPOFF (x)

0x00  nop
0x03  nop
0x06  rur    a10, THREADPTR
...
0x09  l32r   a12, .LC2
0x0c  add    a12, a12, a10


New Xtensa ELF Definitions

Location   Relocation
--------   ----------
.literal   R_XTENSA_TLSDESC_FN
           R_XTENSA_TLSDESC_ARG
           R_XTENSA_TLS_DTPOFF
           R_XTENSA_TLS_TPOFF

.text      R_XTENSA_TLS_FUNC
           R_XTENSA_TLS_ARG
           R_XTENSA_TLS_CALL