Passing Function Arguments in AMD64 Assembly
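On Intel386, function arguments are passed on the stack and the callee reads them relative to the ebp register. AMD64 is somewhat different: arguments are passed in registers. Starting from the first argument, integer and pointer arguments go in rdi, rsi, rdx, rcx, r8, r9, in that order. These six registers can be thought of as belonging to the called function, which may overwrite them freely. rbp, rbx, r12, r13, r14, r15 belong to the calling function; if the called function wants to use these six registers, it has to save and restore them on the stack with push/pop. Once this is clear, a look at assembly code shows that most of the work goes into preparing rdi, rsi, rdx, rcx, r8, r9, that is, the function arguments. The function bge_alloc_dma_mem is used below as an example of how the arguments are prepared.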
/*
 * Allocate an area of memory and a DMA handle for accessing it
 */
static int
bge_alloc_dma_mem(bge_t *bgep, size_t memsize, ddi_device_acc_attr_t *attr_p,
    uint_t dma_flags, dma_area_t *dma_p)
{
    caddr_t va;
    int err;

    BGE_TRACE(("bge_alloc_dma_mem($%p, %ld, $%p, 0x%x, $%p)",
        (void *)bgep, memsize, attr_p, dma_flags, dma_p));

    /*
     * Allocate handle
     */
    err = ddi_dma_alloc_handle(bgep->devinfo, &dma_attr,
        DDI_DMA_DONTWAIT, NULL, &dma_p->dma_hdl);
    if (err != DDI_SUCCESS)
        return (DDI_FAILURE);

    /*
     * Allocate memory
     */
    err = ddi_dma_mem_alloc(dma_p->dma_hdl, memsize, attr_p,
        dma_flags, DDI_DMA_DONTWAIT, NULL, &va, &dma_p->alength,
        &dma_p->acc_hdl);
    if (err != DDI_SUCCESS)
        return (DDI_FAILURE);

    /*
     * Bind the two together
     */
    dma_p->mem_va = va;
    err = ddi_dma_addr_bind_handle(dma_p->dma_hdl, NULL,
        va, dma_p->alength, dma_flags, DDI_DMA_DONTWAIT, NULL,
        &dma_p->cookie, &dma_p->ncookies);
    ...
}
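Before turning to the disassembly, the register assignment described above can be written down as a small C sketch. arg_location and arg_passed_in are hypothetical helpers made up for this illustration, not part of any ABI document or library, and they only cover integer and pointer arguments, which is all that bge_alloc_dma_mem and the DDI functions it calls use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Registers that carry the first six integer or pointer arguments, in order. */
static const char *const INT_ARG_REGS[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};
#define NUM_INT_ARG_REGS (sizeof INT_ARG_REGS / sizeof INT_ARG_REGS[0])

/*
 * Hypothetical helper: given the 1-based position of an integer or pointer
 * argument, return the register that carries it, or "stack" for the seventh
 * argument onward.
 */
const char *arg_location(size_t position)
{
    assert(position >= 1);
    if (position <= NUM_INT_ARG_REGS)
        return INT_ARG_REGS[position - 1];
    return "stack";
}

/* Hypothetical helper: nonzero if the argument at `position` lives in `where`. */
int arg_passed_in(size_t position, const char *where)
{
    return strcmp(arg_location(position), where) == 0;
}
```

For bge_alloc_dma_mem, arg_location(5) gives "r8", which is where dma_p arrives. For the nine-argument ddi_dma_mem_alloc, positions 7 to 9 give "stack".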
$ pfexec mdb -k
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp scsi_vhci zfs sockfs ip hook neti sctp arp usba uhci s1394 qlc fctl random md lofs fcip sd fcp cpc crypto logindmux ptm ufs nsmb sppp nfs ipc mpt emlxs ]
> bge_alloc_dma_mem::dis
bge_alloc_dma_mem(bge_t *bgep, size_t memsize, ddi_device_acc_attr_t *attr_p, uint_t dma_flags, dma_area_t *dma_p)
At this point the registers hold the following values:
rdi = bgep
rsi = memsize
rdx = attr_p
rcx = dma_flags
r8 = dma_p
bge_alloc_dma_mem: pushq %rbp
bge_alloc_dma_mem+1: movq %rsp,%rbp
bge_alloc_dma_mem+4: subq $0x28,%rsp
bge_alloc_dma_mem+8: movq %rdi,-0x8(%rbp)
err = ddi_dma_alloc_handle(bgep->devinfo, &dma_attr,
DDI_DMA_DONTWAIT, NULL, &dma_p->dma_hdl);
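ddi_dma_alloc_handle takes five arguments, and the instructions below prepare them:
rdi = bgep->devinfo
rsi = &dma_attr
rdx = 0 (DDI_DMA_DONTWAIT)
rcx = 0 (NULL)
r8 = &dma_p->dma_hdl
ddi_dma_alloc_handle receives its arguments in the same registers (rdi, rsi, rdx, rcx, r8, r9), so the arguments of bge_alloc_dma_mem have to be saved elsewhere first.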
bge_alloc_dma_mem+0x27: movq %rsi,%r14    memsize (saved in r14)
bge_alloc_dma_mem+0x32: movq (%rdi),%rdi    bgep->devinfo
bge_alloc_dma_mem+0x35: leaq -0x38166f34(%rip),%rsi    &dma_attr
bge_alloc_dma_mem+0x41: xorq %rdx,%rdx
As on Intel386, eax holds the return value of the function (ddi_dma_alloc_handle).
bge_alloc_dma_mem+0x4c: testl %eax,%eax
bge_alloc_dma_mem+0x4e: je +0xa
If the call to ddi_dma_alloc_handle failed, the return value is set to eax = -1 (DDI_FAILURE).
bge_alloc_dma_mem+0x50: movl $-0x1,%eax
bge_alloc_dma_mem+0x55: jmp +0xa8
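Why the incoming arguments have to be moved out of the way can be seen from a reduced C sketch of the same call pattern. alloc_handle_sketch, alloc_dma_mem_sketch and attr_sketch are hypothetical stand-ins with simplified types, not the real DDI or bge definitions; the only point is that the caller still needs memsize, attr_p and flags after a call that reuses rdi, rsi, rdx, rcx and r8.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SUCCESS 0
#define SKETCH_FAILURE (-1)

/* Hypothetical stand-in for the driver's file-scope dma_attr. */
static const int attr_sketch = 7;

/*
 * Hypothetical stand-in for ddi_dma_alloc_handle(): five arguments, so it
 * receives them in rdi, rsi, rdx, rcx and r8. Fails for a negative dev.
 */
int alloc_handle_sketch(int dev, const int *attr, int wait, void *arg,
    long *handle_out)
{
    (void)wait;
    (void)arg;
    if (dev < 0 || attr == NULL)
        return SKETCH_FAILURE;
    *handle_out = 1000L + dev + *attr;
    return SKETCH_SUCCESS;
}

/* Hypothetical stand-in for bge_alloc_dma_mem(): five incoming arguments. */
long alloc_dma_mem_sketch(const int *devp, size_t memsize, const int *attr_p,
    unsigned flags, long *handle_out)
{
    int err;

    assert(devp != NULL && attr_p != NULL && handle_out != NULL);

    /*
     * Setting up this call overwrites rdi, rsi, rdx, rcx and r8 with the
     * callee's arguments. memsize, attr_p and flags are used again below,
     * so the compiler has to park them in callee-saved registers (or in
     * stack slots) before the call.
     */
    err = alloc_handle_sketch(*devp, &attr_sketch, 0, NULL, handle_out);
    if (err != SKETCH_SUCCESS)
        return SKETCH_FAILURE;

    /* The incoming arguments are still needed after the call returns. */
    return *handle_out + (long)memsize + *attr_p + (long)flags;
}
```

In the listing the parked values end up in ebx, r12, r13 and r14, which are exactly the registers that the callee must preserve.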
err = ddi_dma_mem_alloc(dma_p->dma_hdl, memsize, attr_p,
dma_flags, DDI_DMA_DONTWAIT, NULL, &va, &dma_p->alength,
&dma_p->acc_hdl);
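At this point the registers hold the following values:
ebx = dma_flags
r12 = dma_p
r13 = attr_p
r14 = memsize
ddi_dma_mem_alloc takes nine arguments, as follows:
rdi = dma_p->dma_hdl
rsi = memsize
rdx = attr_p
rcx = dma_flags
r8 = DDI_DMA_DONTWAIT
r9 = NULL
stack: &va
stack: &dma_p->alength
stack: &dma_p->acc_hdl
Only six registers are available for passing arguments. As the code below shows, the remaining arguments are passed the Intel386 way, on the stack, and the order is unchanged: later arguments are pushed first.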
bge_alloc_dma_mem+0x5a: subq $0x8,%rsp
bge_alloc_dma_mem+0x5e: movq 0x20(%r12),%rdi    dma_p->dma_hdl
bge_alloc_dma_mem+0x63: pushq %r12    &dma_p->acc_hdl (acc_hdl is the first member of *dma_p, so its address equals dma_p)
bge_alloc_dma_mem+0x65: leaq 0x18(%r12),%r8
bge_alloc_dma_mem+0x6a: pushq %r8    &dma_p->alength
bge_alloc_dma_mem+0x6c: leaq -0x58(%rbp),%r8
bge_alloc_dma_mem+0x70: pushq %r8    &va
bge_alloc_dma_mem+0x85: addq $0x20,%rsp
bge_alloc_dma_mem+0x89: testl %eax,%eax
bge_alloc_dma_mem+0x8b: je +0x7
bge_alloc_dma_mem+0x8d: movl $-0x1,%eax
bge_alloc_dma_mem+0x92: jmp +0x6e
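The stack adjustment around this call can be checked with a little arithmetic: three pushq instructions plus the leading subq $0x8,%rsp account for the 0x20 bytes that addq $0x20,%rsp releases afterwards. The sketch below assumes that every stack argument is an 8-byte integer or pointer and that rsp is 16-byte aligned before the arguments are set up, which makes the extra 8 bytes look like alignment padding. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define INT_ARG_REGS 6u   /* rdi, rsi, rdx, rcx, r8, r9 */
#define STACK_SLOT   8u   /* each stack argument occupies one 8-byte slot */
#define STACK_ALIGN  16u  /* assumed alignment of rsp at the call */

/* Number of integer/pointer arguments that do not fit in registers. */
size_t stack_arg_count(size_t nargs)
{
    return nargs > INT_ARG_REGS ? nargs - INT_ARG_REGS : 0;
}

/*
 * Bytes the caller reserves for stack arguments, padding included.
 * Assumes rsp is 16-byte aligned before the arguments are set up.
 */
size_t stack_arg_bytes(size_t nargs)
{
    size_t raw = stack_arg_count(nargs) * STACK_SLOT;

    /* Round up to the next multiple of STACK_ALIGN. */
    return (raw + STACK_ALIGN - 1) / STACK_ALIGN * STACK_ALIGN;
}

/* Compare the arithmetic with the values visible in the listing. */
void check_against_listing(void)
{
    assert(stack_arg_count(9) == 3);    /* &va, &alength, &acc_hdl */
    assert(stack_arg_bytes(9) == 0x20); /* addq $0x20,%rsp */
    assert(stack_arg_bytes(5) == 0);    /* ddi_dma_alloc_handle: registers only */
}
```

The call to ddi_dma_addr_bind_handle further down also has nine arguments, and it is likewise followed by addq $0x20,%rsp.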
bge_alloc_dma_mem+0x94: movq -0x58(%rbp),%r8
bge_alloc_dma_mem+0x98: movq %r8,0x8(%r12)
bge_alloc_dma_mem+0x9d: subq $0x8,%rsp
bge_alloc_dma_mem+0xa1: movq 0x20(%r12),%rdi
bge_alloc_dma_mem+0xa6: movq -0x58(%rbp),%rdx
bge_alloc_dma_mem+0xaa: movq 0x18(%r12),%rcx
bge_alloc_dma_mem+0xaf: leaq 0x48(%r12),%r8
bge_alloc_dma_mem+0xb4: pushq %r8
bge_alloc_dma_mem+0xb6: leaq 0x30(%r12),%r8
bge_alloc_dma_mem+0xbb: pushq %r8
bge_alloc_dma_mem+0xbd: pushq $0x0
bge_alloc_dma_mem+0xbf: xorq %rsi,%rsi
bge_alloc_dma_mem+0xc2: movl %ebx,%r8d
bge_alloc_dma_mem+0xc5: xorq %r9,%r9
bge_alloc_dma_mem+0xc8: call +0x36bbdcb
bge_alloc_dma_mem+0xcd: addq $0x20,%rsp
bge_alloc_dma_mem+0xd1: testl %eax,%eax
bge_alloc_dma_mem+0xd3: jne +0x28
bge_alloc_dma_mem+0xd5: cmpl $0x1,0x48(%r12)
bge_alloc_dma_mem+0xdb: jne +0x20
bge_alloc_dma_mem+0xdd: movl $-0x1,%eax
bge_alloc_dma_mem+0xe2: movl %eax,0x10(%r12)
bge_alloc_dma_mem+0xe7: movl %eax,0x14(%r12)
bge_alloc_dma_mem+0xec: movl %eax,0x4c(%r12)
bge_alloc_dma_mem+0xf1: xorq %r8,%r8
bge_alloc_dma_mem+0xf4: movq %r8,0x28(%r12)
bge_alloc_dma_mem+0xf9: xorl %eax,%eax
bge_alloc_dma_mem+0xfb: jmp +0x5
bge_alloc_dma_mem+0xfd: movl $-0x1,%eax
bge_alloc_dma_mem+0x102: addq $0x18,%rsp
bge_alloc_dma_mem+0x106: popq %r14
bge_alloc_dma_mem+0x108: popq %r13
bge_alloc_dma_mem+0x10a: popq %r12
bge_alloc_dma_mem+0x10c: popq %rbx
bge_alloc_dma_mem+0x10d: leave
bge_alloc_dma_mem+0x10e: ret
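The constant offsets from r12 in the listing describe the layout of the structure that dma_p points to. The sketch below is a hypothetical reconstruction of that layout from the offsets alone, not the real dma_area_t definition: fixed-width integers stand in for the actual handle and pointer types, the unknownXX members are placeholders for fields the listing touches but the C excerpt does not name, and the 24-byte size of cookie is inferred from the gap between offsets 0x30 and 0x48.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout inferred from the offsets used with %r12. */
typedef struct {
    uint64_t acc_hdl;    /* 0x00: first member, pushed as plain %r12 */
    uint64_t mem_va;     /* 0x08: movq %r8,0x8(%r12) */
    uint32_t unknown10;  /* 0x10: set to -1 on the error path */
    uint32_t unknown14;  /* 0x14: set to -1 on the error path */
    uint64_t alength;    /* 0x18: leaq 0x18(%r12),%r8 */
    uint64_t dma_hdl;    /* 0x20: movq 0x20(%r12),%rdi */
    uint64_t unknown28;  /* 0x28: set to 0 on the error path */
    uint64_t cookie[3];  /* 0x30: leaq 0x30(%r12),%r8 (size assumed) */
    uint32_t ncookies;   /* 0x48: cmpl $0x1,0x48(%r12) */
    uint32_t unknown4c;  /* 0x4c: set to -1 on the error path */
} dma_area_sketch_t;

/* Returns 1 when the sketch reproduces every named offset in the listing. */
int layout_matches_listing(void)
{
    return offsetof(dma_area_sketch_t, acc_hdl) == 0x00 &&
        offsetof(dma_area_sketch_t, mem_va) == 0x08 &&
        offsetof(dma_area_sketch_t, alength) == 0x18 &&
        offsetof(dma_area_sketch_t, dma_hdl) == 0x20 &&
        offsetof(dma_area_sketch_t, cookie) == 0x30 &&
        offsetof(dma_area_sketch_t, ncookies) == 0x48;
}

/* Address of the first member; it is the same as the struct's own address. */
const void *acc_hdl_address(const dma_area_sketch_t *area)
{
    assert(area != NULL);
    return &area->acc_hdl;
}
```

Because acc_hdl sits at offset 0, passing &dma_p->acc_hdl needs no leaq at all, which is why the listing simply pushes r12.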
References:
http://www.x86-64.org/documentation/abi.pdf
http://www.x86-64.org/documentation/assembly.html
This article comes from the ChinaUnix blog. To read the original, see: http://blog.chinaunix.net/u/23177/showart_1986031.html