###notes
-
在8086处理器上,如果要用寄存器来提供偏移地址,只能使用BX,si,di,bP,不能使用其它寄存器.以下指令是非法的: mov [ax], dl
-
8086处理器只支持以下几种基地址寄存器和变址寄存器的组合:[bx+si] [bx+di] [bp+si] [bp+di]
条件转移指令
- js 当SF == 1则跳转 , jns与之相反
- jz 当zF == 1则跳转, jnz与之相反
- jo 当OF == 1则跳转, jno与之相反
- jc 当CF == 1则跳转, jnc与之相反
- jp 当PF == 1则跳转, jnp与之相反
常用指令
- movsb | movsw
DS:SI --> ES:DI
DF = 0 low->high
DF = 1 high->low
instruction 'cld' can clear df
- div A / B. A是被除数, B是为除数.
被除数A默认存在AX中, 或者AX和DX中(DX存高x位)
如果除数是8位,那么除法的结果AL保存商,AH保存余数.
如果除数16位,那么除法的结果 AX保存商,DX保存余数。
-
cli && sti 清除和设置IF(中断标志位)
-
repe/repne 相等(ZF==0)/不相等的时候重复
-
scasb(w|d) 两个对比数分别为 EAX / AX / AL 和 DS:EDI, 并且将edi-+1(2|4)
ret
- ret 弹栈到IP寄存器
- retf 弹栈到IP,再弹栈到CS
- iret 弹栈到IP,再弹栈到CS,再弹栈到FLAGS
call 指令
call 指令有三种形式:
第一种是16位相对近调用, 近调用的意思是被调用的目标过程位于当前代码段内,而非另一个不同的代码段,所以只需要得到偏移地址即可.
16位相对近调用是三字节指令, 操作码为0xE8. 在指令执行阶段, 处理器看到操作码0xE8, 就知道它应当调用一个过程. 于是,它用指令指针寄存器IP的当前内容加上指令中的操作数, 再加上3, 得到一个新的偏移地址. 接着将IP的原有内容压入栈. 最后,用刚才的偏移地址取代IP原油的内容. 这直接导致处理器的执行流程转移到目标位置处.
第二种是16位间接绝对近调用, 这种调用也是近调用, 只能调用当前代码段内的过程, 指令中的操作数不是偏移量, 而是被调用过程的真实偏移地址, 故称绝对地址. 这个地址不是直接出现在指令中,而是由16位的通用寄存器或者16位的内存单元简介给出.
第三种是16位直接绝对远调用, 调用另一个代码段内的过程. 比如:
call 0x20000:0x0030
如果被调用过程处于当前代码段, 也没关系.
第四种是16位间接绝对远调用, 这也属于段间调用,被调用过程属于另一个代码段. 比如:
call far [0x2000]
call far [bx]
间接远调用必须使用far.
中断
实模式下,处理器要求中断向量表需要存放在物理地址0x00000->0x003ff,同1k的空间内,共256个中断,每个中断向量4个字节.
0x00000 ... -> ....0x003ff
---------------------------------------------
偏移地址(2字节) | 段地址(2字节) | 偏移地址..
---------------------------------------------
内联汇编
常用约束
- r : Register(s)
- a: %eax, %ax, %al
- b: %ebx, %bx, %bl
- c: %ecx, %cx, %cl
- d: %edx, %dx, %dl
- S: %esi, %si
- D: %edi, %di
- i: 直接操作数
- =: 只写
函数调用寄存器约定
- rax - temporary register; when we call a syscal, rax must contain syscall number
- rdi - used to pass 1st argument to functions
- rsi - pointer used to pass 2nd argument to functions
- rdx - used to pass 3rd argument to functions
- rcx - fourth argument
- r8 - fifth argument
- r9 - sixth
- stack - over sixth
待整理
栈
- RBP is the base pointer register. It points to the base of the current stack frame.
- RSP is the stack pointer, which points to the top of current stack frame.
two operators:
- push argument - increments stack pointer (RSP) and stores argument in location pointed by stack pointer
- pop argument - copied data to argument from location pointed by stack pointer
段
- data - section is used for declaring initialized data or constants
- bss - section is used for declaring non initialized variables
- text - section is used for code
以sys_write为例:
size_t sys_write(unsigned int fd, const char * buf, size_t count);
section .data
msg db "hello, world!" ; db代表声明单字节类型的
section .text
global _start
_start:
mov rax, 1 ; sys_write调用号
mov rdi, 1 ; stdcout
mov rsi, msg
mov rdx, 13
syscall
mov rax, 60
mov rdi, 0
syscall
简单程序示例
section .data
; Define constants
num1: equ 100
num2: equ 50
; initialize message
msg: db "correct"
section .text
global _start
;; entry point
_start:
; set num1's value to rax
mov rax, num1
; set num2's value to rbx
mov rbx, num2
; get sum of rax and rbx, and store it's value in rax
add rax, rbx
; compare rax and 150
cmp rax, 150
; go to .exit label if rax and 150 are not equal
jne .exit
; go to .rightSum label if rax and 150 are equal
jmp .rightSum
; Print message that sum is correct
.rightSum:
;; write syscall
mov rax, 1
;; file descritor, standard output
mov rdi, 1
;; message address
mov rsi, msg
;; length of message
mov rdx, 8
;; call write syscall
syscall
; exit from program
jmp .exit
; exit procedure
; exit syscall
mov rax, 60
; exit code
mov rdi, 0
; call exit syscall
syscall
声明
initialized:
- equ(equate) similar to C’s define
- db define and allocate a byte,
- dw,dd,dt,do,dy and dz
non initialized variables:
- RESB, RESW, RESD, RESQ, REST, RESO, RESY and RESZ
简单程序示例
%macro PRINT 1
pusha
pushf
jmp %%astr
%%str db %1, 0
%%strln equ $-%%str
%%astr: _syscall_write %%str, %%strln
popf
popa
%endmacro
%macro _syscall_write 2
mov rax, 1
mov rdi, 1
mov rsi, %%str
mov rdx, %%strln
syscall
%endmacro
AT&T 语法
.data
// initialized data definition
.text
.global _start
_start:
// main routine
data definition
.section .data
// 1 byte
var1: .byte 10
// 2 byte
var2: .word 10
// 4 byte
var3: .int 10
// 8 byte
var4: .quad 10
// 16 byte
var5: .octa 10
// assembles each string (with no automatic trailing zero byte) into consecutive addresses
str1: .asci "Hello world"
// just like .ascii, but each string is followed by a zero byte
str2: .asciz "Hello world"
// Copy the characters in str to the object file
str3: .string "Hello world"
oprands order
;move source, destination
movw $10, %ax
GNU assembler has 6 postfixes for operations:
- b - 1 byte operands
- w - 2 bytes operands
- l - 4 bytes operands
- q - 8 bytes operands
- t - 10 bytes operands
- o - 16 bytes operands
This rule is ot only mov instruction, but also for all another like addl, xorb, cmpw and etc…