An addressing mode specifies how the processor reaches data in memory. The x86 architecture implements immediate, direct and indirect memory addressing. Indirect addressing can use one or two registers and a constant to calculate the final address, and each addressing mode takes its name from the registers used. The 32-bit mode makes the choice of registers for addressing more flexible and adds scaling: multiplying one register by a small constant. The 64-bit mode adds addressing relative to the instruction pointer, which makes programs easy to relocate in memory. In this chapter, we will focus on the details of all addressing modes in 16-, 32- and 64-bit processors.

For each addressing mode, we use simple examples with the mov instruction, which copies data from the source operand to the destination operand. The order of the operands is similar to that of an assignment in high-level languages: the left operand is the destination and the right operand is the source, as in the following example:
mov destination, source
The immediate argument is a constant encoded as part of the instruction. This means that the value is stored in the code section of the program and cannot be modified during program execution. In 16-bit mode, the size of the constant can be 8 or 16 bits, and in 32- and 64-bit mode, it can be up to 32 bits (the exception is the mov instruction to a 64-bit register, which also accepts a full 64-bit constant). The use of the immediate operand depends on the instruction. It can be, for example, a numerical constant, an offset of an address or an additional displacement used for efficient address calculation, a value which specifies the number of iterations in shift instructions, a mask for choosing the elements of a vector to operate on, or a modifier which influences the instruction behaviour. Please refer to the specific instruction documentation for a detailed description. Examples of instructions using immediate addressing:
mov ax, 10   ; move the constant value of 10 to the accumulator AX
mov ebx, 35  ; move the constant 35 to the EBX register
mov rcx, 5   ; move the constant 5 to the RCX register
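To make "encoded as part of the instruction" concrete, the following Python sketch builds the machine-code bytes of the first two examples. It assumes the short form of mov (opcode B8h plus the register number, followed by the constant in little-endian order) and a processor mode whose default operand size matches the register, so no prefix is needed. The helper name `encode_mov_reg_imm` is hypothetical and only a few registers are covered.

```python
import struct

# Register numbers used in the opcode byte (same for the 16- and 32-bit names)
REG_NUMBER = {"ax": 0, "cx": 1, "dx": 2, "bx": 3,
              "eax": 0, "ecx": 1, "edx": 2, "ebx": 3}


def encode_mov_reg_imm(reg: str, imm: int) -> bytes:
    """Encode `mov reg, imm` as opcode B8h+reg followed by the constant.

    Assumption: the default operand size of the current mode matches the
    register, so no operand-size prefix is emitted.
    """
    fmt, mask = ("<I", 0xFFFFFFFF) if reg.startswith("e") else ("<H", 0xFFFF)
    return bytes([0xB8 + REG_NUMBER[reg]]) + struct.pack(fmt, imm & mask)


print(encode_mov_reg_imm("ax", 10).hex(" "))   # b8 0a 00
print(encode_mov_reg_imm("ebx", 35).hex(" "))  # bb 23 00 00 00
```

The constant sits directly behind the opcode byte, which is why it lives in the code section rather than in a data area.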
In direct addressing mode, the data is reached by specifying the target offset (displacement) as a constant. The processor uses this offset together with the appropriate segment register to access the byte in memory. The displacement can be specified in the instruction as a number, a previously defined constant or a constant expression. With segmentation enabled, it is possible to use the segment prefix to select the segment register we want to use. The example instructions which use numbers or a variable name as the displacement are shown in the following code and presented in figure 1.
; copy one byte from the BL register to memory address 0800h in the data segment
mov ds:[0800h], bl
; copy two bytes from memory at address 0600h to the AX accumulator
mov ax, ds:[0600h]
; copy eight bytes from the variable in memory (previously defined) to RAX
mov rax, variable
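How the offset is combined with the segment register depends on the processor mode. As a minimal sketch, assuming 16-bit real mode (where the segment value is multiplied by 16 and added to the offset), the address reached by the first example can be modelled in Python. The function name `real_mode_address` and the DS value are illustrative assumptions.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Model of real-mode address formation: segment * 16 + offset.

    Assumption: 16-bit real mode with a 20-bit address bus (8086 style),
    so the result wraps at 1 MB.
    """
    if not 0 <= segment <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# mov ds:[0800h], bl with a hypothetical DS = 1000h
print(hex(real_mode_address(0x1000, 0x0800)))  # 0x10800
```

In protected mode the segment register selects a descriptor instead, so this arithmetic does not apply there.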
64-bit processors have some specific addressing modes. The default mode for direct addressing in the x64 architecture is addressing relative to the RIP register. If there is no base or index register in the instruction, the address is encoded as a 32-bit signed value. This value represents the distance between the data byte in memory and the address of the next instruction (the current value of RIP). Figure 2 shows RIP-relative addressing used to move a variable to AL.
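The relation between the encoded value and the accessed address can be sketched in Python. The two helper names and the addresses in the example are assumptions made for illustration; the only rule taken from the text is that the signed 32-bit value is added to the address of the next instruction.

```python
def rip_relative_target(next_rip: int, disp32: int) -> int:
    """Address accessed by a RIP-relative operand (with 64-bit wraparound)."""
    if not -2**31 <= disp32 < 2**31:
        raise ValueError("displacement must fit in a signed 32-bit value")
    return (next_rip + disp32) & 0xFFFFFFFFFFFFFFFF


def rip_relative_disp(next_rip: int, target: int) -> int:
    """Displacement that has to be encoded to reach `target`."""
    disp = target - next_rip
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is outside the +/-2 GB range")
    return disp


# Hypothetical layout: the instruction ends at 401006h, the variable is at 404000h
disp = rip_relative_disp(0x401006, 0x404000)
print(hex(disp))                                 # 0x2ffa
print(hex(rip_relative_target(0x401006, disp)))  # 0x404000
```

Because only the distance is encoded, the code and its data can be loaded at a different address together without changing the instruction bytes.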
In the 32-bit absolute direct addressing mode, the instruction holds a 32-bit signed value, which is sign-extended to 64 bits by the processor. This limits the addressable space to two regions. The first region starts at address 0 and covers the first 2 GB of memory. The second is the last 2 GB at the very end of the entire address space. As 2 GB of memory is insufficient for modern operating systems, this addressing mode is not supported by Windows compilers. MASM can use this mode in conjunction with a base or index register.
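The two regions follow directly from sign extension. A short Python sketch (the function name is an assumption) extends the four boundary values of a 32-bit field to 64 bits:

```python
def sign_extend_32_to_64(value: int) -> int:
    """Sign-extend a 32-bit pattern to an unsigned 64-bit address."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:           # top bit set: fill the upper half with ones
        value |= 0xFFFFFFFF00000000
    return value


for disp in (0x00000000, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
    print(f"{disp:08X} -> {sign_extend_32_to_64(disp):016X}")
```

Values up to 7FFFFFFFh stay in the lowest 2 GB, while values from 80000000h upwards land in the range FFFFFFFF80000000h to FFFFFFFFFFFFFFFFh, the top 2 GB of the address space.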
In the 64-bit absolute direct addressing mode, the address is a 64-bit unsigned value. Because instruction arguments are generally limited to 32 bits, this addressing mode can be used only with a specific version of the MOV instruction and only with the accumulator (AL, AX, EAX, RAX). The MASM assembler does not use this mode.
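For illustration, the bytes of such an instruction can be assembled by hand. The sketch below assumes the load form of this special MOV for RAX (REX.W prefix 48h, opcode A1h, then the eight address bytes in little-endian order); the function name and the address are hypothetical.

```python
import struct


def encode_mov_rax_abs64(address: int) -> bytes:
    """Encode a load of RAX from a full 64-bit absolute address.

    Bytes: REX.W prefix (48h), opcode A1h, then the address, little-endian.
    """
    if not 0 <= address <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("address must be an unsigned 64-bit value")
    return bytes([0x48, 0xA1]) + struct.pack("<Q", address)


code = encode_mov_rax_abs64(0x1122334455667788)
print(code.hex(" "))  # 48 a1 88 77 66 55 44 33 22 11
```

The instruction is ten bytes long, eight of which are the address itself.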
In the x86 architecture, one or two registers can be used in one instruction to calculate the effective address, which is the final offset within the current segment (or section). When two registers are used, one of them is called the base register and the other the index register. In 16-bit processors, the base register can only be BX or BP, while the index register can only be SI or DI. If BP is used, the processor chooses the stack segment by default. If BX is used as the base register, or if the instruction has an index register only, the processor accesses the data segment by default. The 32-bit architecture makes the choice of registers much more flexible: any of the eight general-purpose registers (including the stack pointer) can be used as the base register. Here, the stack segment is chosen if the base register is EBP or ESP. The index register can be any of the general-purpose registers, excluding the stack pointer. Additionally, the index register can be scaled by a factor of 1, 2, 4 or 8. The 64-bit architecture introduces eight additional registers, and any of the sixteen general-purpose registers can be the base register or the index register (excluding the stack pointer, which can be the base only). In the following sections, we will show examples of all possible addressing modes.
Base addressing mode uses the base register only. The base register is not scaled, no index register is used, and no additional displacement is added. In figure 3, the use of BX as the base register is shown to transfer data from memory to AL.
The code shows other examples of base addressing.
; copy one byte from the data segment in memory at the address from the BX register to AL
mov al, [bx]
; copy two bytes from the data segment in memory at the address from the EBX register to AX
mov ax, [ebx]
; copy eight bytes from memory at address from RBX to RAX (no segmentation in x64)
mov rax, [rbx]
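The effect of these three loads can be modelled in Python with a byte array standing in for memory. The memory contents and the register value are made up for illustration; what the sketch shows is that the register supplies only the address, while the destination register determines how many bytes are read (in little-endian order).

```python
def load(memory: bytes, address: int, size: int) -> int:
    """Read `size` bytes at `address`, little-endian as on x86."""
    return int.from_bytes(memory[address:address + size], "little")


# Hypothetical 16-byte data area holding the bytes 10h, 11h, ..., 1Fh
memory = bytes(range(0x10, 0x20))
rbx = 4                       # the base register holds offset 4

al = load(memory, rbx, 1)     # mov al, [bx]
ax = load(memory, rbx, 2)     # mov ax, [ebx]
rax = load(memory, rbx, 8)    # mov rax, [rbx]
print(hex(al), hex(ax), hex(rax))  # 0x14 0x1514 0x1b1a191817161514
```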
Base addressing mode with displacement uses the base register with an additional constant added, so the final effective address is the sum of the content of the base register and the constant. One interpretation is that the base register holds the address of a data table and the constant is the offset of a byte in that table. Figure 4 shows BX used as the base register with an additional displacement to transfer data from memory to AL.
Some code examples are shown below. The number represents the previously defined constant. MASM accepts various syntaxes of base plus displacement address notation.
; copy one byte from the data segment in memory at the address from the BX register
; increased by "number" to AL
mov al, [bx] + number
mov al, [bx + number]
mov al, [bx] number
mov al, number [bx]
mov al, number + [bx]
mov al, [number + bx]
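All six notations produce the same effective address. A minimal Python sketch of the calculation follows, with made-up register and constant values; the mask reflects that a 16-bit effective address wraps around at 64 KB.

```python
def ea16_base_disp(base: int, disp: int) -> int:
    """Effective address of [bx + number]; 16-bit offsets wrap at 64 KB."""
    return (base + disp) & 0xFFFF


print(hex(ea16_base_disp(0x2000, 5)))     # 0x2005
print(hex(ea16_base_disp(0x2000, -2)))    # 0x1ffe
print(hex(ea16_base_disp(0xFFF0, 0x20)))  # 0x10
```

The last line shows the wraparound: the sum exceeds FFFFh, so the address continues from the start of the segment.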
Index addressing mode with displacement uses the index register with an additional constant added, so the final effective address is the sum of the content of the index register and the constant. One interpretation is that the constant is the address of a data table and the index register holds the offset of a byte in that table. Figure 5 shows DI used as the index register with an additional displacement to transfer data from memory to AL.
Some code examples are shown below. The table represents the previously defined table in memory. MASM accepts various syntaxes of index plus displacement address notation.
; copy one byte from the data segment in memory at the address calculated
; as the sum of the table increased by index from DI to AL
mov al, [di] + table
mov al, [di + table]
mov al, [di] table
mov al, table [di]
mov al, table + [di]
mov al, [table + di]
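This mode is a natural fit for walking through a table at a fixed address. The Python sketch below models a loop in which DI runs over the elements while the table address stays constant; the memory layout, the table address and the function name are assumptions for illustration.

```python
def sum_table(memory: bytes, table: int, length: int) -> int:
    """Add up `length` bytes of a table, reading each one as table[di]."""
    total = 0
    for di in range(length):                    # DI acts as the running index
        total += memory[(table + di) & 0xFFFF]  # mov al, table [di]
    return total


memory = bytearray(64)                          # hypothetical data segment
table = 0x20                                    # constant address of the table
memory[table:table + 4] = bytes([1, 2, 3, 4])
print(sum_table(memory, table, 4))  # 10
```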
In this addressing mode, the combination of two registers is used. In a 16-bit processor, the base register is BX or BP, and the index register is SI or DI. In a 32- or 64-bit processor, any register can be the base, and all except the stack pointer can be the index. The final effective address is the sum of the contents of two registers. In figure 6, the use of BP as the base and DI as the index register to transfer data from memory to AL is shown.
The MASM assembler accepts different notations of the base + index register combination, as shown in the code. In 16-bit x86 code, the order in which the registers are written in the instruction is irrelevant.
; copy one byte from the data segment in the memory at the address calculated
; as the sum of the base (BX) register and the index (SI) register to AL
mov al, [bx] + [si]
mov al, [bx + si]
mov al, [bx] [si]
mov al, [si] [bx]
In 16-bit x86 code, there are only four possible combinations of base and index registers. They are shown in the code.
; copy one byte from the data or stack segment in memory at the address calculated
; as the sum of the base register and index register to AL
mov al, [bx] + [si] ; data segment
mov al, [bx] + [di] ; data segment
mov al, [bp] + [si] ; stack segment
mov al, [bp] + [di] ; stack segment
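The four combinations and their default segments can be enumerated mechanically. The following Python sketch uses the rule described in this chapter (BP as the base selects the stack segment, otherwise the data segment is used); the names are illustrative.

```python
from itertools import product

BASES = ("bx", "bp")
INDEXES = ("si", "di")


def default_segment(base: str) -> str:
    """Default segment register for a 16-bit base register."""
    return "ss" if base == "bp" else "ds"


combos = [(b, i, default_segment(b)) for b, i in product(BASES, INDEXES)]
for base, index, segment in combos:
    print(f"[{base} + {index}] -> {segment}")
```

The loop prints the same four pairs as the listing above, in the same order.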
In 32- or 64-bit processors, the first register used in the instruction is the base register, and the second is the index register. While segmentation is enabled, the use of EBP or ESP as the base register determines the segment register choice. Notice that it is possible to use the same register as both base and index in one instruction.
; copy one byte from the data or stack segment in memory at the address calculated
; as the sum of the base register and index register to AL
mov al, [eax] + [esi] ; data segment
mov al, [ebx] + [edi] ; data segment
mov al, [ecx] + [esi] ; data segment
mov al, [edx] + [edi] ; data segment
mov al, [esi] + [esi] ; data segment
mov al, [edi] + [edi] ; data segment
mov al, [ebp] + [esi] ; stack segment
mov al, [esp] + [edi] ; stack segment
In this addressing mode, the combination of two registers with an additional constant is used. In a 16-bit processor, the base register is BX or BP, and the index register is SI or DI. The constant can be encoded as an 8- or 16-bit value. In such a processor, this is the most complex mode available. In a 32- or 64-bit processor, any register can be the base, and all except the stack pointer can be the index. The constant is up to a 32-bit signed value. The final effective address is the sum of the contents of the two registers and the displacement. Figure 7 shows BX used as the base and SI as the index register with a displacement to transfer data from memory to AL. One interpretation is that the constant is the address of a table of structures, the base register holds the offset of a structure in the table, and the index register holds the offset of an element within the structure.
The MASM assembler accepts different notations of the base + index + displacement combination, as shown in the code. In 16-bit x86 code, the order in which the registers are written in the instruction is irrelevant.
; copy one byte from the data segment in the memory at the address calculated
; as the sum of the base (BX) register, the index (SI) register and a displacement to AL
mov al, [bx] + [si] + table
mov al, [bx + si] + table
mov al, [bx] [si] + table
mov al, table [si] [bx]
mov al, table [si] + [bx]
mov al, table [si + bx]
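The table-of-structures interpretation can be worked through with numbers. In the Python sketch below, the table address, the structure size and the field offset are hypothetical values chosen for illustration; the calculation itself is the sum of base, index and displacement, kept within 16 bits.

```python
def ea16(base: int, index: int, disp: int) -> int:
    """Effective address of table [bx] [si] in 16-bit addressing."""
    return (base + index + disp) & 0xFFFF


TABLE = 0x0100         # displacement: where the table of structures starts
RECORD_SIZE = 8        # hypothetical size of one structure in bytes

bx = 2 * RECORD_SIZE   # base: offset of the third structure in the table
si = 5                 # index: offset of the element inside the structure
print(hex(ea16(bx, si, TABLE)))  # 0x115
```

Moving to the next structure only requires adding the structure size to BX; the constant and SI stay unchanged.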
In 32- or 64-bit processors, the first register used in the instruction is the base register, and the second is the index register. While segmentation is enabled, the use of EBP or ESP as a base register determines the segment register choice. The displacement can be placed at any position in the address argument expression. Some examples are shown below.
; copy one byte from the data or stack segment in memory at the address calculated
; as the sum of the base, index and displacement (table) to AL
mov al, [eax] + [esi] + table ; data segment
mov al, table + [ebx] + [edi] ; data segment
mov al, table [ecx] [esi]     ; data segment
mov al, [edx] [edi] + table   ; data segment
mov al, table + [ebp] + [esi] ; stack segment
mov al, [esp] + [edi] + table ; stack segment
Index addressing mode with scaling uses the index register multiplied by a simple constant of 1, 2, 4 or 8. This addressing mode is available for 32- or 64-bit processors and can use any general-purpose register except the stack pointer. Figure 8 shows EBX used as the index register with a scaling factor of 2 to transfer data from memory to AL.
Because these instructions use no base register, the data segment is always chosen when segmentation is enabled.
; copy one byte from the data segment in memory at the address calculated
; as the index register multiplied by a constant to AL
mov al, [eax * 2] ; data segment
mov al, [ebx * 4] ; data segment
mov al, [ecx * 8] ; data segment
mov al, [edx * 2] ; data segment
mov al, [esi * 4] ; data segment
mov al, [edi * 8] ; data segment
mov al, [ebp * 2] ; data segment
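The scaled address calculation is a single multiplication. The Python sketch below assumes 32-bit addressing and a made-up register value, and rejects scaling factors other than the four allowed ones.

```python
VALID_SCALES = (1, 2, 4, 8)


def scaled_index_ea(index: int, scale: int) -> int:
    """Effective address of [index * scale] with 32-bit addressing."""
    if scale not in VALID_SCALES:
        raise ValueError(f"scale must be one of {VALID_SCALES}")
    return (index * scale) & 0xFFFFFFFF


ebx = 6
for scale in VALID_SCALES:
    print(scale, hex(scaled_index_ea(ebx, scale)))
```

The four factors correspond to element sizes of one, two, four and eight bytes, so the register can hold an element number rather than a byte offset.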
Base indexed addressing mode with scaling uses the sum of the base register and the content of the index register multiplied by a simple constant of 1, 2, 4 or 8. This addressing mode is available for 32- or 64-bit processors and can use any general-purpose register as the base and any general-purpose register except the stack pointer as the index. Figure 9 presents the use of the EDI register as the base and EAX as the index register with a scaling factor of 4 to transfer data from memory to AL.
The scaled register is treated as the index and the other one as the base (even if it is not written first in the instruction). While segmentation is enabled, the use of EBP or ESP as the base register determines the segment register choice.
; copy one byte from the data or stack segment in memory at the address calculated
; as the sum of the base register and the scaled index register to AL
mov al, [eax] + [esi * 2] ; data segment
mov al, [ebx] + [edi * 4] ; data segment
mov al, [ecx] + [eax * 8] ; data segment
mov al, [edx] + [ecx * 2] ; data segment
mov al, [esi] + [edx * 4] ; data segment
mov al, [edi] + [edi * 8] ; data segment
mov al, [ebp] + [esi * 2] ; stack segment
mov al, [esp] + [edi * 4] ; stack segment
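The rules for deciding which register is the base and which is the index can be written down as a small Python function. This is a simplified sketch of the rules stated in this chapter, not a description of how any particular assembler is implemented; the function name and the tuple representation of the operands are assumptions.

```python
def classify(first, second):
    """Split two (register, scale) terms into base, index, scale, segment.

    Simplification: an unscaled ESP written second is rejected here,
    although a real assembler might swap the two registers instead.
    """
    (reg1, scale1), (reg2, scale2) = first, second
    if scale1 != 1 and scale2 != 1:
        raise ValueError("only one register can be scaled")
    if scale1 != 1:
        # the scaled register is the index even when it is written first
        base, index, scale = reg2, reg1, scale1
    else:
        base, index, scale = reg1, reg2, scale2
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    if index == "esp":
        raise ValueError("the stack pointer cannot be the index register")
    segment = "ss" if base in ("ebp", "esp") else "ds"
    return base, index, scale, segment


print(classify(("ebp", 1), ("esi", 2)))  # ('ebp', 'esi', 2, 'ss')
print(classify(("esi", 2), ("ebp", 1)))  # ('ebp', 'esi', 2, 'ss')
print(classify(("esi", 1), ("ebp", 2)))  # ('esi', 'ebp', 2, 'ds')
```

The third call shows that EBP selects the stack segment only in the base role: as a scaled index it leaves the default at the data segment.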
Base indexed addressing mode with displacement and scaling uses the sum of the base register, the content of the index register multiplied by a simple constant of 1, 2, 4 or 8, and an additional constant. This addressing mode is available for 32- or 64-bit processors and can use any general-purpose register as the base and any general-purpose register except the stack pointer as the index. The displacement can be up to a 32-bit signed value, even in a 64-bit processor. Figure 10 presents the use of the EDI register as the base and EAX as the index register with a scaling factor of 4 to transfer data from memory to AL. The interpretation is similar to that of the base-indexed addressing mode with displacement for the 16-bit x86 machine: the constant is the address of the beginning of the table of structures, the base register contains the offset of the structure, and the index register holds the number of the element in the structure, which after scaling points to the chosen byte.
As in the base indexed mode with scaling and without displacement, the scaled register is treated as the index and the other one as the base (even if it is not written first in the instruction). While segmentation is enabled, the use of EBP or ESP as the base register determines the segment register choice. The displacement can be placed at any position in the address expression.
; copy one byte from the data or stack segment in memory at the address calculated
; as the sum of the base, scaled index and displacement (table) to AL
mov al, [eax] + [esi * 2] + table ; data segment
mov al, table + [ebx] + [edi * 4] ; data segment
mov al, table + [ebp] + [esi * 2] ; stack segment
mov al, [esp] + [edi * 4] + table ; stack segment
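In machine code, the base, the index and the scaling factor of such an operand are packed into a single byte, the SIB byte: two bits for the scale, three for the index and three for the base. The Python sketch below builds only this byte for 32-bit registers (the preceding ModR/M byte and the displacement are not modelled) and is meant as an illustration of why the stack pointer cannot serve as the index. The function name is an assumption.

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}   # the scale is stored as log2


def sib_byte(base: str, index: str, scale: int) -> int:
    """Pack scale, index and base into one byte (ModR/M not modelled)."""
    if index == "esp":
        # the index field value 100b is reserved to mean "no index register"
        raise ValueError("esp cannot be encoded as an index")
    return (SCALE_BITS[scale] << 6) | (REG32[index] << 3) | REG32[base]


print(f"{sib_byte('edi', 'eax', 4):02X}")  # 87
print(f"{sib_byte('ebp', 'esi', 2):02X}")  # 75
```

The register number of ESP in the index field is the code the processor uses for "no index", so there is no encoding left for ESP itself.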
In 16-bit processors, the base register can only be BX or BP. The first one is used to access the data segment, while the second one chooses the stack segment by default. The additional offset can be unused or can be encoded as an 8- or 16-bit signed value. The schematic of the x86 effective address calculation for indirect address generation is shown in figure 11.
The 32-bit architecture makes the choice of registers much more flexible, and any of the eight general-purpose registers (including the stack pointer) can be used as the base register. Here, the stack segment is chosen if the base register is EBP or ESP. The index register can be any of the general-purpose registers except the stack pointer, and it can be scaled by a factor of 1, 2, 4 or 8. The additional displacement can be unused or can be encoded as an 8- or 32-bit signed value (16-bit displacements belong to the 16-bit addressing forms).
In the 64-bit architecture, any of the sixteen general-purpose registers can be the base register. As in 32-bit processors, the index register cannot be the stack pointer and can be scaled by 1, 2, 4 or 8. The additional displacement can be unused or encoded as an 8- or 32-bit signed number.
Figure 12 shows the schematic of the effective address calculation and the use of registers for indirect addressing in the IA-32 architecture.
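The whole family of indirect addressing modes reduces to one formula: base + index × scale + displacement, truncated to the address width. The Python sketch below summarises it for the three address sizes. It is a simplified model under stated assumptions: register contents are passed as plain numbers, unused components default to zero, and checks on register choice and displacement size are omitted.

```python
def effective_address(base=0, index=0, scale=1, disp=0, bits=32):
    """Effective address = base + index * scale + disp, modulo 2**bits.

    Simplified model: register legality and displacement size are not checked.
    """
    if bits not in (16, 32, 64):
        raise ValueError("bits must be 16, 32 or 64")
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    if bits == 16 and scale != 1:
        raise ValueError("16-bit addressing has no scaling")
    return (base + index * scale + disp) & ((1 << bits) - 1)


# Hypothetical register contents and displacements
print(hex(effective_address(base=0xFFF0, index=0x20, bits=16)))           # 0x10
print(hex(effective_address(base=0x1000, index=3, scale=4, disp=0x20)))   # 0x102c
print(hex(effective_address(base=0x7FFF00000000, index=2, scale=8,
                            disp=-16, bits=64)))                          # 0x7fff00000000
```

Each addressing mode described in this chapter corresponds to leaving some of the arguments at their defaults.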