| Both sides previous revisionPrevious revisionNext revision | Previous revision |
| en:multiasm:paarm:chapter_5_6 [2025/12/03 20:32] – [Arithmetical instructions] eriks.klavins | en:multiasm:paarm:chapter_5_6 [2025/12/03 21:41] (current) – [Data copy/move instructions] eriks.klavins |
|---|
| ''<fc #800000>ADD</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ adds the X1 and X2 values X0= X1 + X2</fc>'' | ''<fc #800000>ADD</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ adds the X1 and X2 values X0= X1 + X2</fc>'' |
| |
| If the postfix S is added (ADDS), the status register is updated. \\ | If the postfix S is added (ADDS), the status register is updated. \\ |
| ''<fc #800000>ADDS</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ X0 = X1 + X2 Status register SR is updated</fc>''\\ | ''<fc #800000>ADDS</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ X0 = X1 + X2 Status register SR is updated</fc>''\\ |
| ''<fc #800000>ADCS</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ X0 = X1 + X2 + C from the SR register. The status register SR is updated</fc>'' | ''<fc #800000>ADCS</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ X0 = X1 + X2 + C from the SR register. The status register SR is updated</fc>'' |
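The pairing of ADDS and ADCS is what makes additions wider than 64 bits possible: the low halves are added first, and the carry they produce is fed into the addition of the high halves. The following is a minimal Python sketch of that idea, not real ARM code. The helper names `adds`, `adcs` and `add128` are hypothetical, and only the carry flag C is modelled (the N, Z and V flags are ignored).

```python
MASK64 = (1 << 64) - 1  # a 64-bit register can hold values 0 .. 2**64 - 1


def adds(a, b):
    """Model of ADDS: return the 64-bit result and the carry flag C."""
    total = a + b
    return total & MASK64, total >> 64


def adcs(a, b, carry_in):
    """Model of ADCS: add two registers plus the incoming carry flag."""
    total = a + b + carry_in
    return total & MASK64, total >> 64


def add128(a_lo, a_hi, b_lo, b_hi):
    """Add two 128-bit numbers, each split into a low and a high register."""
    lo, carry = adds(a_lo, b_lo)         # ADDS X0, X1, X2
    hi, carry = adcs(a_hi, b_hi, carry)  # ADCS X3, X4, X5
    return lo, hi, carry
```

For instance, adding 1 to a number whose low register is all ones makes the low register wrap to zero, and the carry raises the high register by one.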
| |
| The ''<fc #800000>SUB</fc>'' and division (''<fc #800000>UDIV</fc>'', ''<fc #800000>SDIV</fc>'') instructions are not commutative, so the order of the source operands must be preserved to obtain a correct mathematical expression. | The ''<fc #800000>SUB</fc>'' and division (''<fc #800000>UDIV</fc>'', ''<fc #800000>SDIV</fc>'') instructions are not commutative, so the order of the source operands must be preserved to obtain a correct mathematical expression. |
| |
| ''<fc #800000>SUB</fc> <fc #008000>X0</fc>, <fc #008000>X0</fc>, <fc #ffa500>#1</fc> <fc #6495ed>@ X0 = X0 – 1</fc>'' \\ | ''<fc #800000>SUB</fc> <fc #008000>X0</fc>, <fc #008000>X0</fc>, <fc #ffa500>#1</fc> <fc #6495ed>@ X0 = X0 – 1</fc>'' \\ |
| ''<fc #800000>SUB</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #ffa500>#1</fc> <fc #6495ed>@ X0 = X1 – 1</fc>'' \\ | ''<fc #800000>SUB</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #ffa500>#1</fc> <fc #6495ed>@ X0 = X1 – 1</fc>'' \\ |
| ''<fc #800000>UDIV</fc> <fc #008000>X3</fc>, <fc #008000>X4</fc>, <fc #008000>X5</fc> <fc #6495ed>@ X3 = X4 / X5</fc>'' | ''<fc #800000>UDIV</fc> <fc #008000>X3</fc>, <fc #008000>X4</fc>, <fc #008000>X5</fc> <fc #6495ed>@ X3 = X4 / X5</fc>'' |
| |
| Some instructions combine two arithmetic operations to achieve better computational performance. In such cases, the first two source registers are multiplied first, and then the product is added to or subtracted from the third source register. Such instructions are: ''<fc #800000>MADD</fc>'', ''<fc #800000>MSUB</fc>'', ''<fc #800000>SMADDL</fc>'', ''<fc #800000>SMSUBL</fc>'', ''<fc #800000>UMADDL</fc>'' and ''<fc #800000>UMSUBL</fc>''. The long variants follow the same pattern as ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'', but multiply two 32-bit signed or unsigned values and accumulate into a 64-bit register. Let's look at the ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'' instructions. | Some instructions combine two arithmetic operations to achieve better computational performance. In such cases, the first two source registers are multiplied first, and then the product is added to or subtracted from the third source register. Such instructions are: ''<fc #800000>MADD</fc>'', ''<fc #800000>MSUB</fc>'', ''<fc #800000>SMADDL</fc>'', ''<fc #800000>SMSUBL</fc>'', ''<fc #800000>UMADDL</fc>'' and ''<fc #800000>UMSUBL</fc>''. The long variants follow the same pattern as ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'', but multiply two 32-bit signed or unsigned values and accumulate into a 64-bit register. Let's look at the ''<fc #800000>MADD</fc>'' and ''<fc #800000>MSUB</fc>'' instructions. |
| |
| ''<fc #800000>MADD</fc> <fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc> <fc #6495ed>@ X1 = X4 + X2*X3</fc>''\\ | ''<fc #800000>MADD</fc> <fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc> <fc #6495ed>@ X1 = X4 + X2*X3</fc>''\\ |
| ''<fc #800000>MSUB</fc> <fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc> <fc #6495ed>@ X1 = X4 - X2*X3</fc>'' | ''<fc #800000>MSUB</fc> <fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc> <fc #6495ed>@ X1 = X4 - X2*X3</fc>'' |
| |
| The ''<fc #800000>ADDG</fc>'' instruction means ''<fc #800000>ADD</fc>'' with Tag and is focused on pointers. It belongs to the Memory Tagging Extension (MTE). The Tag is a small 4-bit identifier kept in the upper bits of the pointer, allowing detection of pointer corruption or incorrect usage, among other options. Primarily, these instructions are used to ensure memory safety, for example, by detecting accesses that fall outside the boundaries of a memory region. | The ''<fc #800000>ADDG</fc>'' instruction means ''<fc #800000>ADD</fc>'' with Tag and is focused on pointers. It belongs to the Memory Tagging Extension (MTE). The Tag is a small 4-bit identifier kept in the upper bits of the pointer, allowing detection of pointer corruption or incorrect usage, among other options. Primarily, these instructions are used to ensure memory safety, for example, by detecting accesses that fall outside the boundaries of a memory region. |
| |
| For example: ''<fc #800000>ADDG</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #ffa500>#16</fc>, <fc #ffa500>#5</fc>''\\ | For example: ''<fc #800000>ADDG</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #ffa500>#16</fc>, <fc #ffa500>#5</fc>''\\ |
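| The CPU takes the pointer from the ''<fc #008000>X1</fc>'' register and adds the first constant ''<fc #ffa500>#16</fc>'', a byte offset that must be a multiple of 16 (the size of one tag granule). The second constant ''<fc #ffa500>#5</fc>'' is added to the tag already present in the pointer, skipping any tag values the system has excluded. ''<fc #008000>X0</fc>'' now points 16 bytes ahead of the memory address stored in the register ''<fc #008000>X1</fc>''; if the tag of ''<fc #008000>X1</fc>'' was 0 and no tags are excluded, the tag of ''<fc #008000>X0</fc>'' is 5, or in binary form ''0101<sub>2</sub>''. | The CPU takes the pointer from the ''<fc #008000>X1</fc>'' register and adds the first constant ''<fc #ffa500>#16</fc>'', a byte offset that must be a multiple of 16 (the size of one tag granule). The second constant ''<fc #ffa500>#5</fc>'' is added to the tag already present in the pointer, skipping any tag values the system has excluded. ''<fc #008000>X0</fc>'' now points 16 bytes ahead of the memory address stored in the register ''<fc #008000>X1</fc>''; if the tag of ''<fc #008000>X1</fc>'' was 0 and no tags are excluded, the tag of ''<fc #008000>X0</fc>'' is 5, or in binary form ''0101<sub>2</sub>''. |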
| |
| Postfix PT adds support for pointer tagging or authentication. For example, ''<fc #800000>ADDPT</fc> adds authenticated pointers and preserves the PAC.\\ | Postfix PT adds support for pointer tagging or authentication. For example, ''<fc #800000>ADDPT</fc>'' adds authenticated pointers and preserves the PAC.\\ |
| ''<fc #800000>ADDPT</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc>''\\ | ''<fc #800000>ADDPT</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc>'' |
| The ''<fc #008000>X1</fc>'' register contains an authenticated pointer; this can be signed before with the ''<fc #800000>PACIA</fc>'' or other PAC-enabled instruction. Register ''<fc #008000>X2</fc>'' is the value, an offset from the ''<fc #008000>X1</fc>'' pointer. The result is a pointer with an offset and tagged with the same tag as the ''<fc #008000>X1</fc>'' pointer. Such arithmetic operations are also available for the ''<fc #800000>SUB</fc>'' instruction, but not available for the ''<fc #800000>MUL</fc>'' multiplication and ''<fc #800000>DIV</fc>'' division instructions. Such a system enables powerful system-level encryption. | The ''<fc #008000>X1</fc>'' register contains an authenticated pointer; this can be signed before with the ''<fc #800000>PACIA</fc>'' or other PAC-enabled instruction. Register ''<fc #008000>X2</fc>'' is the value, an offset from the ''<fc #008000>X1</fc>'' pointer. The result is a pointer with an offset and tagged with the same tag as the ''<fc #008000>X1</fc>'' pointer. Such arithmetic operations are also available for the ''<fc #800000>SUB</fc>'' instruction, but not available for the ''<fc #800000>MUL</fc>'' multiplication and ''<fc #800000>DIV</fc>'' division instructions. Such a system enables powerful system-level encryption. |
| |
| Similar options are available for many other ARMv8 instructions. The instruction set documentation may provide the necessary information to determine the possibilities and restrictions on instruction usage. By examining the instruction's binary form, it is possible to identify its capabilities and limitations. Assembler code is converted to binary, and the final binary code for the instruction depends on the provided operands and, if available, options. | Similar options are available for many other ARMv8 instructions. The instruction set documentation may provide the necessary information to determine the possibilities and restrictions on instruction usage. By examining the instruction's binary form, it is possible to identify its capabilities and limitations. Assembler code is converted to binary, and the final binary code for the instruction depends on the provided operands and, if available, options. |
| |
| | |
| | ===== Data copy/move instructions ===== |
| | |
| | Remember, the processor primarily performs operations on data stored in registers. The data must be loaded into registers, and the result must be stored back in memory. For example, to change the value stored at a particular memory address, the ARM would require three instructions. First, the value from memory needs to be loaded into a register, then modified, and finally stored back into the memory from the register. Other architectures, such as x86, may allow operations on data directly in memory without register use. |
| | |
| | The ''<fc #800000>LDR</fc>'' and ''<fc #800000>STR</fc>'' are basic instructions that load data from memory into a register and store data from a register into memory, respectively.\\ |
| | ''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ fill the register X0 with the data located at the address stored in the X1 register</fc>''\\ |
| | ''<fc #800000>STR</fc> <fc #008000>X1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ store the content of register X1 into memory at the address given in the X2 register</fc>'' |
| | |
| | The ''<fc #800000>LDR</fc>'' instruction loads the data from the memory address pointed to in the ''<fc #008000>X1</fc>'' register into the destination register ''<fc #008000>X0</fc>''. The register in square brackets, ''[<fc #008000>X1</fc>]'', is called the base register because its value is used as a memory address. Similarly, the STR instruction stores data from the ''<fc #008000>X1</fc>'' register to the memory location specified by the ''<fc #008000>X2</fc>'' register. |
| | If the register holding the memory address must be updated after each memory access, then post-indexed or pre-indexed modes can be used. Pre-indexed mode updates the base register before reading the value from memory. Post-indexed mode will update the base register after reading the value from memory. |
| | |
| | ''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>, <fc #ffa500>#8</fc>]<fc #800080>**!**</fc> <fc #6495ed>@ Read the data located at address X1+8 and write into register X0 {PRE-INDEXED MODE X1 = X1 + 8}</fc>''\\ |
| | ''<fc #800000>LDR</fc> <fc #008000>X6</fc>, [<fc #008000>X7</fc>], <fc #ffa500>#16</fc> <fc #6495ed>@ loads a value to X6 register and then increases X7 by 16. {POST-INDEXED MODE X7 = X7 + 16}</fc>''\\ |
| | ''<fc #800000>STR</fc> <fc #008000>X6</fc>, [<fc #008000>X7</fc>], <fc #ffa500>#16</fc> <fc #6495ed>@ Store the value and then increase X7 by 16.</fc>'' |
| | |
| | There is also a third option: using an immediate offset without updating the base register. The offset is written in bytes. For 64-bit accesses, the scaled form accepts multiples of 8 from 0 to 32760, because the instruction encoding stores the offset divided by 8; the assembler performs this scaling itself.\\ |
| | ''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>, <fc #ffa500>#8</fc>] <fc #6495ed>@ Read the data located at address X1+8 and write into register X0 {X1 is not changed}</fc>'' |
| | |
| | <note important>Note that the exclamation mark after the square bracket makes a significant difference in how the data is accessed.</note> |
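The difference between the three addressing modes is easiest to see when the base register is inspected after each access. The following Python sketch models them on a small byte buffer. It is only an illustration under stated assumptions: the function names, the `regs` dictionary and the memory contents are hypothetical, and memory is assumed to be little-endian.

```python
import struct


def load64(mem, addr):
    """Read one little-endian 64-bit value from the byte buffer."""
    return struct.unpack_from("<Q", mem, addr)[0]


def ldr_offset(regs, mem, rt, rn, imm):
    """LDR Xt, [Xn, #imm]  : read from Xn + imm, Xn is left unchanged."""
    regs[rt] = load64(mem, regs[rn] + imm)


def ldr_pre(regs, mem, rt, rn, imm):
    """LDR Xt, [Xn, #imm]! : update Xn first, then read from the new Xn."""
    regs[rn] += imm
    regs[rt] = load64(mem, regs[rn])


def ldr_post(regs, mem, rt, rn, imm):
    """LDR Xt, [Xn], #imm  : read from Xn, then update Xn."""
    regs[rt] = load64(mem, regs[rn])
    regs[rn] += imm


# Four 64-bit values stored at byte addresses 0, 8, 16 and 24.
memory = struct.pack("<4Q", 10, 20, 30, 40)
```

With `X1` starting at 0, the offset form and the pre-indexed form both read the value stored at address 8, but only the pre-indexed form leaves `X1` pointing at that address afterwards.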
| | |
| | Load and store instructions have the most additional options, more than the arithmetical and logical operations. For example, the ''<fc #800000>LDADD</fc>'' instruction combines a load, an arithmetic operation, and a store. It is one of the so-called atomic operations, introduced with the ARMv8.1 Large System Extensions. The ''<fc #800000>LDADD</fc>'' instruction atomically loads a value from memory, adds the value held in a register, and stores the result back in memory at the same location. The value originally read from memory is returned in a second register. NOTE that the registers used in this instruction must not be the same. This is similar to the x86 architecture, where instructions can operate directly on memory. Besides addition, the only other arithmetic-style atomic operations are signed and unsigned minimum and maximum; there is no atomic subtraction or multiplication.\\ |
| | ''<fc #800000>LDADD</fc> <fc #008000>W1</fc>, <fc #008000>W2</fc>, [<fc #008000>X0</fc>]'' \\ |
| | The register ''<fc #008000>X0</fc>'' holds a memory address. The value at that address is loaded into the ''<fc #008000>W2</fc>'' register, the ''<fc #008000>W1</fc>'' register value is added to it, and the new value ''[<fc #008000>X0</fc>]+<fc #008000>W1</fc>'' is stored back into memory at the same location pointed to by ''<fc #008000>X0</fc>''. As a result, the ''<fc #008000>W2</fc>'' register holds the data that was at ''[<fc #008000>X0</fc>]'' before the ''<fc #008000>W1</fc>'' value was added. Similar instructions are available to perform atomic logic operations on the memory data. |
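The data flow of LDADD can be summarised as "return the old value, leave the sum in memory". The Python sketch below models only that data flow. The function name `ldadd` and the dictionary used as memory are hypothetical, and the atomicity that the real instruction guarantees is not modelled here.

```python
def ldadd(memory, address, addend, bits=32):
    """Model of LDADD Ws, Wt, [Xn].

    memory  : dict mapping an address to the value stored there
    address : value of the base register Xn
    addend  : value of the source register Ws
    Returns the value loaded into Wt, i.e. the old memory content.
    """
    mask = (1 << bits) - 1
    old_value = memory[address]                    # load
    memory[address] = (old_value + addend) & mask  # add and store back
    return old_value                               # Wt receives the old value


counter = {0x1000: 7}                  # hypothetical memory cell
previous = ldadd(counter, 0x1000, 5)   # previous is 7, the cell now holds 12
```

This "fetch and add" behaviour is what shared counters rely on: every caller receives a distinct previous value even when several cores increment the same location.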
| | |
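| | To copy content from one register to another, the ''<fc #800000>MOV</fc>'' instruction is used. The ''<fc #800000>FMOV</fc>'' instruction copies floating-point values and can also transfer raw bits between floating-point and general-purpose registers. These copies do not convert the value: turning a floating-point number into an integer or vice versa requires conversion instructions such as ''<fc #800000>FCVTZS</fc>'' and ''<fc #800000>SCVTF</fc>''. Here are some independent instruction examples:\\ |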
| | ''<fc #800000>MOV</fc> <fc #008000>X1</fc>, <fc #008000>X0</fc> <fc #6495ed>@ X1 = X0 (64 bit register copy)</fc>''\\ |
| | ''<fc #800000>MOV</fc> <fc #008000>W1</fc>, <fc #008000>W0</fc> <fc #6495ed>@ W1 = W0 (32 bit register copy)</fc>''\\ |
| | ''<fc #800000>FMOV</fc> <fc #008000>S1</fc>, <fc #008000>S0</fc> <fc #6495ed>@ float → float (32-bit floating-point copy between vector registers)</fc>''\\ |
| | ''<fc #800000>FMOV</fc> <fc #008000>X0</fc>, <fc #008000>D1</fc> <fc #6495ed> @ FP64 → int64 register (raw 64-bit copy from vector register to general-purpose register, no conversion)</fc>''\\ |
| | ''<fc #800000>FMOV</fc> <fc #008000>D2</fc>, <fc #008000>X3</fc> <fc #6495ed>@ int64 → FP64 register (raw 64-bit copy from general-purpose register to vector register, no conversion)</fc>''\\ |
| | ''<fc #800000>MOV</fc> <fc #008000>V1</fc>.<fc #808000>16b</fc>, <fc #008000>V0</fc>.<fc #808000>16b</fc> <fc #6495ed>@ vector register copy, all 16 bytes of V0 are copied to V1</fc>''\\ |
| | The ''<fc #800000>MOV</fc>'' instructions can also be used to write an immediate value into a register. In the following example, the instructions are executed one after another:\\ |
| | ''<fc #800000>MOV</fc> <fc #008000>X0</fc>, <fc #ffa500>#123</fc> <fc #6495ed>@ assign value 123 to the register</fc>''\\ |
| | ''<fc #800000>MOVZ</fc> <fc #008000>X0</fc>, <fc #ffa500>#0x1234</fc>, <fc #800080>LSL</fc> <fc #ffa500>#48</fc><fc #6495ed> @ X0 = 0x1234 0000 0000 0000. The X0 value gets overwritten</fc>''\\ |
| | ''<fc #800000>MOVK</fc> <fc #008000>X0</fc>, <fc #ffa500>#0xABCD</fc>, <fc #800080>LSL</fc> <fc #ffa500>#0</fc> <fc #6495ed>@ X0 = 0x1234 0000 0000 ABCD, if before instruction execution the register value was 0x1234 0000 0000 0000</fc>'' |
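MOVZ and MOVK are typically combined to build a constant that does not fit into a single instruction: MOVZ writes one 16-bit field and zeroes the rest, and each MOVK replaces one further 16-bit field while keeping the others. The Python sketch below models this on plain integers. The helper names `movz`, `movk` and `load_constant` are hypothetical and serve only to illustrate the bit manipulation.

```python
MASK64 = (1 << 64) - 1


def movz(imm16, shift=0):
    """MOVZ Xd, #imm16, LSL #shift : all other bits become zero."""
    return (imm16 << shift) & MASK64


def movk(reg, imm16, shift=0):
    """MOVK Xd, #imm16, LSL #shift : only the selected 16 bits change."""
    cleared = reg & ~(0xFFFF << shift)
    return (cleared | (imm16 << shift)) & MASK64


def load_constant(value):
    """Build any 64-bit constant with one MOVZ followed by three MOVK."""
    reg = movz(value & 0xFFFF, 0)
    for shift in (16, 32, 48):
        reg = movk(reg, (value >> shift) & 0xFFFF, shift)
    return reg
```

Applying `movk(movz(0x1234, 48), 0xABCD, 0)` reproduces the final value `0x1234 0000 0000 ABCD` from the example above.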
| | |
| | |
| | ===== Logical instructions ===== |
| | |
| | Logical instructions do not calculate with register contents as numbers. Instead, they manipulate individual bits in registers and are widely used to test or verify values and to perform other functions. Basic logic instructions for AArch64 are:\\ |
| | ''<fc #800000>AND</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, <fc #008000>X2</fc> <fc #6495ed>@ logical AND between X1 and X2, result is stored in X0</fc>''\\ |
| | ''<fc #800000>ORR</fc> <fc #008000>X6</fc>, <fc #008000>X7</fc>, <fc #008000>X8</fc> <fc #6495ed>@ logical OR between X7 and X8, result is stored in X6</fc>''\\ |
| | ''<fc #800000>EOR</fc> <fc #008000>X12</fc>, <fc #008000>X13</fc>, <fc #008000>X14</fc> <fc #6495ed>@ logical XOR between X13 and X14, result is stored in X12</fc>''\\ |
| | ''<fc #800000>MVN</fc> <fc #008000>X24</fc>, <fc #008000>X25</fc> <fc #6495ed>@ bitwise NOT, X24 is set to inverted X25</fc>'' |
| | |
| | Remember that many instructions that operate on registers can update the status register when the postfix S is added to the instruction. Among the logical instructions, only ''<fc #800000>AND</fc>'' and ''<fc #800000>BIC</fc>'' have such flag-setting forms (''<fc #800000>ANDS</fc>'' and ''<fc #800000>BICS</fc>''). Logical instructions are fundamental for low-level programming. These instructions allow taking control over bits and are widely used in system code, device drivers, and embedded systems. Some instructions can perform combined bitwise operations, like ''<fc #800000>ORN</fc>'', which performs an OR operation with the inverted second operand. |