All assembly languages use similar mnemonics for arithmetic operations, although some require additional suffixes to select instruction options. In A32, most instructions can be executed conditionally by adding a condition suffix, which is encoded in the four most significant bits of the instruction. A64 has no such general mechanism, but it does provide dedicated conditional instructions, which we will describe later.
We looked at a straightforward instruction and its exact machine code in the previous section. Examining the machine code of each instruction is a good way to learn all the available options and restrictions. To help with reading and understanding the instruction set documentation, this section works through another example: the ADD instruction in the A64 instruction set.
The ADD instruction: let's first look at the assembler instruction that adds two registers and stores the result in a third register.
ADD X0, X1, X2 // X0 = X1 + X2
We need to look at the instruction set documentation to determine the possible options for this instruction. The documentation lists three main variants of the ADD instruction. In addition, as with other data manipulation instructions, the ‘S’ suffix can be added to update the status flags in the processor Status Register.
1. The ADD and ADDS instructions with extended registers:
ADD X3, X4, W5, UXTW // X3 = X4 + W5 (W5 zero-extended to 64 bits)
ADDS X3, X4, W5, UXTW // X3 = X4 + W5 (W5 zero-extended to 64 bits) and update the status flags
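The effect of the ‘S’ suffix can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name `adds64` is hypothetical, only the 64-bit case is modelled, and the flags are returned as a plain (N, Z, C, V) tuple rather than written to a real status register.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit register


def adds64(a, b):
    """Model a 64-bit ADDS: return (result, (N, Z, C, V)).

    Hypothetical helper for illustration only.
    """
    a &= MASK64
    b &= MASK64
    full = a + b                      # unbounded Python integer sum
    result = full & MASK64            # what fits in the 64-bit register
    n = result >> 63                  # N: result is negative when read as signed
    z = int(result == 0)              # Z: result is zero
    c = int(full > MASK64)            # C: unsigned carry out of bit 63
    # V: signed overflow, both operands have the same sign
    # and the sign of the result differs from it
    v = (((a ^ result) & (b ^ result)) >> 63) & 1
    return result, (n, z, c, v)


# ADDS X3, X4, W5, UXTW with X4 = 1 and W5 = 2 (already zero-extended)
x3, flags = adds64(1, 2)
```

A plain ADD computes the same `result` but leaves the flags untouched, which is why the two mnemonics differ only in a single bit of the machine code.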
The machine code representation of this assembler instruction is as follows:
ADDS X3, X4, W5, UXTW
We already know that the ‘sf’ bit identifies the length of the data (32 or 64 bits). The main difference between these two instructions is the ‘S’ bit, which corresponds to the ‘S’ suffix in the instruction name. The ‘S’ bit tells the processor to update the status flags after the instruction is executed. These status flags are crucial for conditional operations. The ‘op’ bit (bit 30) and the ‘opt’ bits are fixed for this instruction. The three ‘option’ bits (bits 13 to 15) select how the second source operand (Rm) is extended. This is handy when the source operands differ in length, for example when the second operand holds an 8-bit or 16-bit value that must be added to a wider first operand: the second operand is extended to the operation size before the addition. Three bits give 8 different options for extending the second source operand. The table below explains all these options. Let's look only at the options themselves; the bit values are irrelevant for learning the assembler.
| UXTB or SXTB | Unsigned or signed byte (8-bit) is extended to the operation size (32-bit or 64-bit) |
| UXTH or SXTH | Unsigned or signed halfword (16-bit) is extended to the operation size (32-bit or 64-bit) |
| UXTW or SXTW | Unsigned or signed word (32-bit) is extended to a double word (64-bit) |
| UXTX, SXTX or LSL | The double word (64-bit) is used as is; no actual extension takes place, so the unsigned and signed variants give the same result |
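To make the field layout described above more concrete, the following Python sketch packs the fields of an ADD/ADDS (extended register) instruction into a 32-bit word. The function name `encode_add_extended` and the `EXTEND_OPTION` dictionary are hypothetical helpers introduced here for illustration; the bit positions follow the layout discussed in this section (sf, op, S, the fixed bits, Rm, option, imm3, Rn, Rd), and the result should be checked against the instruction set documentation or a real assembler before being relied upon.

```python
# 'option' field values (bits 15:13) for each extension mnemonic
EXTEND_OPTION = {
    "UXTB": 0, "UXTH": 1, "UXTW": 2, "UXTX": 3,
    "SXTB": 4, "SXTH": 5, "SXTW": 6, "SXTX": 7,
}


def encode_add_extended(rd, rn, rm, extend, amount=0, set_flags=False, sf=1):
    """Pack ADD/ADDS (extended register) fields into a 32-bit word.

    Hypothetical helper; register arguments are numbers 0..31.
    """
    if extend not in EXTEND_OPTION:
        raise ValueError("unknown extension: " + extend)
    if not 0 <= amount <= 4:
        raise ValueError("shift amount must be 0..4")
    word = 0x0B200000                    # fixed bits; op = 0 (ADD), opt = 00
    word |= sf << 31                     # 1 = 64-bit, 0 = 32-bit operation
    word |= int(set_flags) << 29         # the 'S' bit: ADDS instead of ADD
    word |= (rm & 0x1F) << 16            # second source register
    word |= EXTEND_OPTION[extend] << 13  # how Rm is extended
    word |= amount << 10                 # imm3: left shift after extension
    word |= (rn & 0x1F) << 5             # first source register
    word |= rd & 0x1F                    # destination register
    return word


# ADDS X3, X4, W5, UXTW
print(hex(encode_add_extended(3, 4, 5, "UXTW", set_flags=True)))
```

Calling the function with `set_flags=False` changes only bit 29 of the result, which matches the observation that ADD and ADDS differ only in the ‘S’ bit.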
When ‘Rd’ or ‘Rn’ is ‘11111’, which in this instruction denotes the stack pointer (SP), LSL is the preferred name for the UXTX extension; in all other cases, the name UXTX is used. The ‘imm3’ bits hold a left-shift amount from 0 to 4. The values 5 to 7 are reserved and must not be used. To conclude, this instruction type is handy when the operands are of different lengths, but that’s not all. The shift applied to the extended second operand allows us to multiply it by 2, 4, 8 or 16. The shift amount is limited to a maximum of 4, even though the three ‘imm3’ bits could represent larger values. Also, the SXTB/H/W/X extensions are used when the second operand can hold negative integers.
ADDS X3, X4, W5, SXTW #2
/* Sign-extend the W5 register to 64 bits and then shift it left by 2 (LSL), which multiplies it by 4. Add the multiplied value to X4, store the result in the X3 register and update the status flags: X3 = X4 + (W5*4). W5 itself is not modified. */
ADD X3, X4, W5, UXTB #1
/* Take the lowest byte from W5 (W5[7:0])
Zero-extend it to 64 bits
Shift left by 1 (multiply by 2)
Add to X4 and store in X3 */
ADD X7, X8, W9, SXTH
/* Take W9[15:0], sign-extend to 64 bits without shifting; add to X8 and store in X7; X7 = X8 + W9[15:0] */
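The behaviour described in the comments above (take the low byte, halfword or word, extend it, then shift left) can be modelled in a few lines of Python. This is a sketch under assumptions: `extend_operand` and `add_extended` are hypothetical names, only the 64-bit form is modelled, and the LSL alias is not handled.

```python
MASK64 = (1 << 64) - 1
WIDTHS = {"B": 8, "H": 16, "W": 32, "X": 64}  # source width per mnemonic suffix


def extend_operand(value, extend, amount=0):
    """Extend the low bits of 'value' to 64 bits, then shift left by 'amount'.

    'extend' is one of UXTB/UXTH/UXTW/UXTX/SXTB/SXTH/SXTW/SXTX.
    """
    if not 0 <= amount <= 4:
        raise ValueError("shift amount must be 0..4")
    width = WIDTHS[extend[3]]
    value &= (1 << width) - 1              # keep only the low 'width' bits
    if extend[0] == "S" and value >> (width - 1):
        value -= 1 << width                # sign bit set: make the value negative
    return (value << amount) & MASK64      # shift and wrap to 64 bits


def add_extended(xn, rm, extend, amount=0):
    """Model ADD Xd, Xn, Rm, <extend> #amount and return the value of Xd."""
    return (xn + extend_operand(rm, extend, amount)) & MASK64


# W5 = 0xFFFFFFFF is -1 as a signed word, so X4 + (-1 * 4) = 100 - 4
print(add_extended(100, 0xFFFFFFFF, "SXTW", 2))   # prints 96
# Only the low byte 0xFF of 0x1FF is used: 100 + (255 * 2)
print(add_extended(100, 0x1FF, "UXTB", 1))        # prints 610
```

The first call shows why the signed extensions matter: with UXTW instead of SXTW, the same register contents would be treated as 4294967295 rather than -1.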
2. The ADDS (ADD) instructions with an immediate value: The machine code determines the maximum value that can be added to a register. The ‘imm12’ bits limit the value to 0-4095. In addition, the ‘sh’ bit allows the immediate value to be shifted left (LSL) by 12 bits.
Examples of the ADD instruction with an immediate value:
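The consequence of the ‘imm12’ and ‘sh’ fields is that only some constants can be added with a single instruction. The Python sketch below illustrates this; the function name `encode_add_immediate` is hypothetical, and the field positions (sh at bit 22, imm12 at bits 10 to 21) are taken from the A64 encoding of ADD (immediate) and should be verified against the documentation.

```python
def encode_add_immediate(rd, rn, value, set_flags=False, sf=1):
    """Pack ADD/ADDS (immediate) fields into a 32-bit word.

    Hypothetical helper; raises ValueError when 'value' cannot be
    expressed as imm12 or as imm12 shifted left by 12 bits.
    """
    if 0 <= value <= 0xFFF:
        sh, imm12 = 0, value                 # fits directly in 12 bits
    elif value & 0xFFF == 0 and 0 < (value >> 12) <= 0xFFF:
        sh, imm12 = 1, value >> 12           # multiple of 4096: use LSL #12
    else:
        raise ValueError("immediate cannot be encoded: %d" % value)
    word = 0x11000000                        # fixed bits; op = 0 (ADD)
    word |= sf << 31                         # 1 = 64-bit, 0 = 32-bit operation
    word |= int(set_flags) << 29             # the 'S' bit
    word |= sh << 22                         # shift the immediate by 12 bits
    word |= imm12 << 10
    word |= (rn & 0x1F) << 5
    word |= rd & 0x1F
    return word


print(hex(encode_add_immediate(0, 1, 4095)))   # ADD X0, X1, #4095
print(hex(encode_add_immediate(2, 3, 4096)))   # ADD X2, X3, #1, LSL #12
```

A value such as 4097 is rejected by this sketch because it needs bits from both ranges at once; in real code such a constant has to be loaded into a register first or added in two steps.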
ADD W0, W1, #100 // Basic 32-bit ADD: add 100 to W1 and store the result in W0; no shift is performed
ADD X0, X1, #4095 // Basic 64-bit ADD: add 4095 to X1 and store the result in X0
ADD X2, X3, #1, LSL #12 // 64-bit ADD with shifted immediate (LSL #12): add 4096 (1 << 12 = 4096) to X3 and store the result in X2
ADD W5, W6, #2, LSL #12 // 32-bit ADD with shifted immediate: add 8192 (2 << 12 = 8192) to W6 and store the result in W5
ADD X4, SP, #256 // Using SP as the base register: add 256 to SP; useful for frame setup or stack management
ADDS X7, X8, #42 // ADDS (immediate), flag-setting: add 42 to X8, store the result in X7 and update the condition flags (NZCV)
ADDS X9, X10, #3, LSL #12 // ADDS with shifted immediate: add 12288 (3 << 12 = 12288) to X10, store the result in X9 and update the condition flags in the status register
ADDS X11, SP, #512 // ADDS with SP as the base register: add 512 to SP, store the result in X11 and update the condition flags
3. The ADDS (ADD) instruction with a shifted register: The final ADD instruction type adds two registers together, with the second register optionally shifted; the shift can be LSL (Logical Shift Left), LSR (Logical Shift Right), or ASR (Arithmetic Shift Right). The fourth shift option (ROR) is not available for this instruction. The value of the ‘imm6’ field gives the number of bits by which the ‘Rm’ register is shifted before it is added to the ‘Rn’ register.
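The shifted-register form can be sketched in Python in the same way. The names `shift_operand` and `encode_add_shifted` are hypothetical; the model covers the three permitted shift types and, as an assumption based on the A64 encoding of ADD (shifted register), places the shift type at bits 22 to 23 and ‘imm6’ at bits 10 to 15.

```python
SHIFT_TYPE = {"LSL": 0, "LSR": 1, "ASR": 2}  # the fourth value (ROR) is not allowed


def shift_operand(value, shift, amount, datasize=64):
    """Apply LSL, LSR or ASR to a register value of 'datasize' bits."""
    if shift not in SHIFT_TYPE:
        raise ValueError("shift type not available for ADD: " + shift)
    if not 0 <= amount < datasize:
        raise ValueError("shift amount out of range")
    mask = (1 << datasize) - 1
    value &= mask
    if shift == "LSL":
        return (value << amount) & mask
    if shift == "LSR":
        return value >> amount               # zeros are shifted in from the left
    if value >> (datasize - 1):              # ASR: copies of the sign bit are shifted in
        value -= 1 << datasize
    return (value >> amount) & mask


def encode_add_shifted(rd, rn, rm, shift="LSL", amount=0, set_flags=False, sf=1):
    """Pack ADD/ADDS (shifted register) fields into a 32-bit word."""
    if shift not in SHIFT_TYPE:
        raise ValueError("shift type not available for ADD: " + shift)
    if not 0 <= amount < (64 if sf else 32):
        raise ValueError("shift amount out of range")
    word = 0x0B000000                        # fixed bits; op = 0 (ADD)
    word |= sf << 31
    word |= int(set_flags) << 29
    word |= SHIFT_TYPE[shift] << 22
    word |= (rm & 0x1F) << 16
    word |= amount << 10                     # imm6: number of bits to shift Rm
    word |= (rn & 0x1F) << 5
    word |= rd & 0x1F
    return word


print(hex(encode_add_shifted(0, 1, 2)))              # ADD X0, X1, X2
print(hex(encode_add_shifted(0, 1, 2, "LSL", 3)))    # ADD X0, X1, X2, LSL #3
```

With no shift given, the encoding is simply LSL #0, so the plain three-register ADD from the beginning of this section is a special case of the shifted-register form.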
Similar options are available for many other ARMv8 instructions. The instruction set documentation may provide the necessary information to determine the possibilities and restrictions on instruction usage. By examining the instruction's binary form, it is possible to identify its capabilities and limitations. Assembler code is converted to binary, and the final binary code for the instruction depends on the provided operands and, if available, options.