Posted on 2006-10-23 23:43

AT&T Assembly Syntax
By vivek
Updated: May/10 '06
This article is a 'quick-n-dirty' introduction to the AT&T assembly language syntax, as implemented in the GNU Assembler as(1). For the first timer the AT&T syntax may seem a bit confusing, but if you have any kind of assembly language programming background, it's easy to catch up once you have a few rules in mind. I assume you have some familiarity with what is commonly referred to as the INTEL-syntax for assembly language instructions, as described in the x86 manuals. Because of its simplicity, I use the NASM (Netwide Assembler) variant of the INTEL-syntax to point out differences between the formats.
The GNU assembler is a part of the GNU Binary Utilities (binutils), and a back-end to the GNU Compiler Collection. Although as is not the preferred assembler for writing reasonably big assembler programs, it's a vital part of contemporary Unix-like systems, especially for kernel-level hacking. It is often criticised for its cryptic AT&T-style syntax, and it is argued that as was written with an emphasis on being used as a back-end to GCC, with little concern for "developer-friendliness". If you are an assembler programmer hailing from an INTEL-syntax background, you may find it constraining with regard to code readability and code generation. Nevertheless, it must be stated that many operating systems' code bases depend on as as the assembler for generating low-level code.
The Basic Format
The structure of a program in AT&T-syntax is similar to that of any other assembler syntax: a series of directives, labels and instructions, where each instruction is composed of a mnemonic followed by a maximum of three operands. The most prominent difference in the AT&T-syntax stems from the ordering of the operands.
For example, the general format of a basic data movement instruction in INTEL-syntax is,
mnemonic        destination, source
whereas, in the case of AT&T, the general format is
mnemonic        source, destination
To some (including myself), this format is more intuitive. The following sections describe the types of operands to AT&T assembler instructions for the x86 architecture.
Registers
All register names of the IA-32 architecture must be prefixed by a '%' sign, e.g. %al, %bx, %ds, %cr0, etc.
mov        %ax, %bx
The above example is the mov instruction that moves the value from the 16-bit register AX to 16-bit register BX.
Literal Values
All literal values must be prefixed by a '$' sign. For example,
               
mov        $100,        %bx
mov        $'A',        %al
The first instruction moves the value 100 into the register BX, and the second one moves the numerical value of the ASCII character A (65) into the AL register. Note that a character literal is written in quotes: a bare $A would refer to the address of a symbol named A. To make things clearer, note that the example below is not a valid instruction,
mov        %bx,        $100
as it tries to move the value in register BX into a literal value, and a literal cannot be a destination.
Memory Addressing
In the AT&T Syntax, memory is referenced in the following way,
segment-override:signed-offset(base,index,scale)
parts of which can be omitted depending on the address you want.
%es:100(%eax,%ebx,2)
Please note that the offsets and the scale should not be prefixed by '$'. A few more examples, with their equivalent NASM syntax, should make things clearer:
GAS memory operand              NASM memory operand
------------------              -------------------
100                             [100]
%es:100                         [es:100]
(%eax)                          [eax]
(%eax,%ebx)                     [eax+ebx]
(%ecx,%ebx,2)                   [ecx+ebx*2]
(,%ebx,2)                       [ebx*2]
-10(%eax)                       [eax-10]
%ds:-10(%ebp)                   [ds:ebp-10]
Example instructions,
mov        %ax,        100
mov        %eax,        -100(%eax)
The first instruction moves the value in register AX to offset 100 in the data segment (DS is used by default), and the second one moves the value in the EAX register to [eax-100].
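The offset that a memory operand refers to is the sum of the signed offset, the base, and the index multiplied by the scale. As a rough illustration, the C sketch below computes that sum for operands of the form offset(base,index,scale). The function name effective_address and its parameters are hypothetical, chosen only for this example, and segment overrides are left out because they select a segment rather than change the offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset selected by the AT&T operand  disp(base,index,scale).
 * Omitted parts are passed as 0 (pass scale = 1 when it is omitted).
 * The arithmetic wraps modulo 2^32, like 32-bit address calculation.
 */
uint32_t effective_address(int32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    /* x86 can only encode scale factors of 1, 2, 4 and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}
```

For instance, assuming %eax holds 0x1000 and %ebx holds 0x20, the operand 100(%eax,%ebx,2) corresponds to effective_address(100, 0x1000, 0x20, 2), which is 0x10a4, and -10(%eax) corresponds to effective_address(-10, 0x1000, 0, 1), which is 0xff6.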
Operand Sizes
At times, especially when moving literal values to memory, it becomes necessary to specify the size-of-transfer or the operand-size. For example the instruction,
mov        $10,        100
only specifies that the value 10 is to be moved to the memory offset 100, but not the transfer size. In NASM this is done by adding a size keyword (byte, word, dword, etc.) to one of the operands. In AT&T syntax, it is done by adding a suffix (b, w or l) to the instruction mnemonic. For example,
movb        $10,        %es:(%eax)
moves a byte value 10 to the memory location [es:eax], whereas,
movl        $10,        %es:(%eax)
moves a long value (dword) 10 to the same place.
A few more examples,
movl        $100, %ebx
pushl        %eax
popw        %ax
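To see why the suffix matters, the C sketch below models what a store of an immediate does to memory for each operand size. It is only a model under stated assumptions: the function name store_immediate and its size parameter (1 for 'b', 2 for 'w', 4 for 'l') are hypothetical, and the bytes are written least significant first because x86 is little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of storing an immediate value to memory with an explicit
 * operand size: 1 for the 'b' suffix, 2 for 'w', 4 for 'l'.
 * Only `size` bytes are overwritten; the rest are left untouched.
 */
void store_immediate(uint8_t *mem, uint32_t value, size_t size)
{
    assert(size == 1 || size == 2 || size == 4);
    for (size_t i = 0; i < size; i++) {
        /* Little-endian: least significant byte at the lowest address. */
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}
```

Starting from four bytes that all hold 0xFF, a byte-sized store of 10 (as with movb) leaves 0A FF FF FF, while a long-sized store (as with movl) leaves 0A 00 00 00. The same literal therefore changes one byte or four, depending only on the suffix.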
Control Transfer Instructions
The jmp, call, ret, etc., instructions transfer the control from one part of a program to another. They can be classified as control transfers to the same code segment (near) or to different code segments (far). The possible types of branch addressing are - relative offset (label), register, memory operand, and segment-offset pointers.
Relative offsets are specified using labels, as shown below.
label1:
        .
        .
  jmp        label1
Branch addressing using registers or memory operands must be prefixed by a '*'. To specify a "far" control transfer, an 'l' must be prefixed to the mnemonic, as in 'ljmp', 'lcall', etc. For example,
GAS syntax                      NASM syntax
==========                      ===========
jmp        *100                 jmp  near [100]
call       *100                 call near [100]
jmp        *%eax                jmp  near eax
call       *%ecx                call near ecx
jmp        *(%eax)              jmp  near [eax]
call       *(%ebx)              call near [ebx]
ljmp       *100                 jmp  far  [100]
lcall      *100                 call far  [100]
ljmp       *(%eax)              jmp  far  [eax]
lcall      *(%ebx)              call far  [ebx]
ret                             retn
lret                            retf
lret $0x100                     retf 0x100
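A loose C analogy may help with the two '*' forms. With a register operand such as call *%eax, the branch target is the value itself, much like calling through a function pointer variable. With a memory operand such as call *(%ebx), the target is first read from memory, much like calling through a pointer to a function pointer. The names below (handler_fn, add_one, call_via_value, call_via_memory) are hypothetical, and this is an analogy rather than a statement of what a compiler will emit.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

/* Hypothetical branch target, used only for this illustration. */
int add_one(int x)
{
    return x + 1;
}

/* Like `call *%eax`: the target address is the value that is passed in. */
int call_via_value(handler_fn target, int arg)
{
    assert(target != NULL);
    return target(arg);
}

/* Like `call *(%ebx)`: the target address is first loaded from memory. */
int call_via_memory(const handler_fn *slot, int arg)
{
    assert(slot != NULL && *slot != NULL);
    return (*slot)(arg);
}
```

Given a one-entry table holding add_one, both call_via_value(add_one, 41) and call_via_memory(&table[0], 41) return 42; the only difference is whether the target is read from memory before the branch.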
Segment-offset pointers are specified using the following format:
ljmp        $segment, $offset
For example:
ljmp        $0x10, $0x100000
If you keep these few things in mind, you'll catch up real soon. For more details on the GNU assembler, you could try the documentation.
This article comes from the ChinaUnix blog. To view the original, see: http://blog.chinaunix.net/u/1883/showart_189332.html