Differences

This shows you the differences between two versions of the page.

en:multiasm:paarm:chapter_5_5 [2024/09/27 20:59] pczekalskien:multiasm:paarm:chapter_5_5 [2025/12/03 20:03] (current) – [Complex addressing modes] eriks.klavins
 ====== Addressing Modes ======
 +
 +ARMv8 is a load/store architecture with a small set of addressing modes. ARM cores perform Arithmetic Logic Unit (ALU) operations only on registers. The only memory operations are load (which reads data from memory into a register) and store (which writes data from a register to memory). All data must therefore be loaded into general-purpose registers with LDR before any operation can be performed on them, and the results are written back to memory with STR. Load and store instructions support several addressing modes:
 +
 +====Register addressing====
 +The register holds the memory address of the data. A load reads the data from that address, and a store writes the data to that address.
 +
 +''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ fill register X0 with the data located at the address stored in register X1</fc>''
 +
 +''<fc #800000>STR</fc> <fc #008000>X1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ store the content of register X1 into memory at the address given in register X2</fc>''
 +
 +====Pre-indexed addressing====
 +An offset is added to the base register before the memory access; the base register itself is not modified. ARM documentation calls this form base plus offset addressing. The address is calculated by adding the two registers: ''[<fc #008000>X1</fc>, <fc #008000>X2</fc>]'' results in X1+X2. The register <fc #008000>X1</fc> is the base register, and <fc #008000>X2</fc> is the offset. The offset value can also be negative, but the calculated address must still be a valid memory address.
 +
 +''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>, <fc #008000>X2</fc>] <fc #6495ed>@ address pointed to by X1 + X2</fc>''
 +
 +''<fc #800000>STR</fc> <fc #008000>X2</fc>, [<fc #008000>X4</fc>, <fc #008000>X3</fc>] <fc #6495ed>@ address pointed to by X4 + X3</fc>''
 +
 +The LDR instruction encoding also allows the offset register to be shifted within the same addressing mode. The shift amount must be 0 or match the access size: ''LSL #2'' for a 32-bit register and ''LSL #3'' for a 64-bit register.
 +
 +''<fc #800000>LDR</fc> <fc #008000>W0</fc>, [<fc #008000>X1</fc>, <fc #008000>X2</fc>, <fc #cd5c5c>LSL</fc> <fc #ffa500>#2</fc>] <fc #6495ed>@ load a 32-bit word, address is X1 + (X2*4)</fc>''
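 +
 +The address calculation for the register offset form can be sketched in C. This is a minimal model rather than ARM code: the function name ''effective_address'' and its parameters are hypothetical, and registers are represented as plain 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of [Xn, Xm, LSL #shift]:
 * address = base + (index << shift). The base register is not modified. */
uint64_t effective_address(uint64_t base, int64_t index, unsigned shift)
{
    /* Limit of this model only; a real LDR accepts just 0 or log2(access size). */
    assert(shift < 64);
    /* Unsigned arithmetic wraps modulo 2^64, so a negative index lowers the address. */
    return base + ((uint64_t)index << shift);
}
```

 +For example, a base of 0x1000 with an index of 3 and ''LSL #2'' gives the address 0x100C, which is the fourth 32-bit element of an array that starts at 0x1000.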
 +
 +====Pre-indexed addressing with write back====
 +ARM treats this as a separate addressing mode because the base register that holds the memory address is itself modified. The offset is added to the base register before the access is performed, and the updated value is written back to the base register.
 +
 +''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>, <fc #ffa500>#32</fc>]<fc #800080>**!**</fc> <fc #6495ed>@ read the data into <fc #008000>X0</fc> register from address pointed to by X1+32, then X1=X1 + 32</fc>''
 +
 +The exclamation mark after the closing bracket requests write back: the register <fc #008000>X1</fc> is first increased by 32, the memory access then uses the updated address, and <fc #008000>X1</fc> keeps the new value.
 +
 +
 +====Post-index with write back====
 +Post-indexed addressing is also treated as a separate mode. As the name says, ‘post’ means that the base register is updated after the memory access: the access uses the original address, and the offset is then added to the base register.
 +
 +''<fc #800000>LDR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>],<fc #ffa500> #32</fc> <fc #6495ed>@ read X0 from address pointed to by X1, then X1=X1 + 32</fc>''
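 +
 +The difference between the two write-back modes is only the order of the update and the access. A minimal C sketch, assuming a byte array ''mem'' that stands for memory and a variable ''*base'' that stands for register X1 (both names, and the function names, are hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of LDR X0, [X1, #imm]! : update the base first, then load. */
uint64_t load_pre_index(const uint8_t *mem, uint64_t *base, int64_t imm)
{
    uint64_t value;
    assert(mem != NULL && base != NULL);
    *base += (uint64_t)imm;                     /* write back before the access */
    memcpy(&value, mem + *base, sizeof value);  /* load from the updated address */
    return value;
}

/* Model of LDR X0, [X1], #imm : load first, then update the base. */
uint64_t load_post_index(const uint8_t *mem, uint64_t *base, int64_t imm)
{
    uint64_t value;
    assert(mem != NULL && base != NULL);
    memcpy(&value, mem + *base, sizeof value);  /* load from the original address */
    *base += (uint64_t)imm;                     /* write back after the access */
    return value;
}
```

 +Both functions leave the base increased by ''imm''; they differ only in which address supplies the data. Post-indexing is the natural fit for walking through an array, much like ''*p++'' in C.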
 +
 +====Other addressing modes====
 +There are some other addressing modes available. Some addressing modes are used only for function or procedure calls, while others combine previously described addressing modes. This subsection will introduce some of the other available addressing modes.
 +  * Register to register, also called register direct addressing mode. It is used to copy data from one register to another. No memory access is performed with such operations.
 +  * Immediate addressing places the data directly in the instruction, so no memory access is needed to obtain the operand.
 +  * Literal addressing refers to data by a label (a literal name) in the program code.
 +  * PC-relative addressing calculates the address as an offset from the current value of the program counter (PC).
 +
 +Literal addressing mode allows the use of labels as addresses in the program code. Something similar is done with function names, but in this situation, the data are addressed by label. The assembler encodes such a reference as a PC-relative offset.
 +
 +Some instructions load or store a pair of registers (the LDP and STP instructions).
 +
 +''<fc #800000>LDP</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, [<fc #008000>X2</fc>,<fc #ffa500> #32</fc>] <fc #6495ed>@ read X0 from address X2 + 32, and then read X1 from the next 8 bytes at X2 + 40</fc>''
 +
 +These instructions are commonly used in function and procedure calls, for example to save and restore pairs of registers on the stack.
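 +
 +LDP with 64-bit registers reads two consecutive 64-bit values starting at the calculated address. A minimal C sketch, assuming a byte array ''mem'' that stands for memory (the function and parameter names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of LDP X0, X1, [X2, #imm] with 64-bit registers. */
void load_pair(const uint8_t *mem, uint64_t base, int64_t imm,
               uint64_t *first, uint64_t *second)
{
    uint64_t addr = base + (uint64_t)imm;
    assert(mem != NULL && first != NULL && second != NULL);
    memcpy(first, mem + addr, 8);       /* X0 <- [X2 + imm]     */
    memcpy(second, mem + addr + 8, 8);  /* X1 <- [X2 + imm + 8] */
}
```

 +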
 +<note important>Remember that general-purpose registers can be accessed in two different ways: as 64-bit registers <fc #008000>X0</fc>..<fc #008000>X30</fc> or as 32-bit registers <fc #008000>W0</fc>..<fc #008000>W30</fc>. </note>
 +
 +The load and store instructions also work with 32-bit registers. Load instructions include additional variants that not only load the data but also control how the value is placed in the register. For example, to load a single data byte into the register, use the ''<fc #800000>LDRB</fc>'' instruction. If the byte holds a negative value, the sign must be preserved across the whole register – this is called sign extension and is performed with the ''<fc #800000>LDRSB</fc>'' instruction. For example, the 8-bit value -100 is ''<fc #ffa500>0x9C</fc>'' in hexadecimal (10011100 in binary two's complement).
 +
 +{{ :en:multiasm:paarm:ldrsbw0.svg |}}
 +
 +The byte is loaded from memory and sign-extended only to 32 bits when the destination is addressed as a 32-bit register; the upper 32 bits of the corresponding 64-bit register are cleared. If a 64-bit register is used as the destination, the sign bit is extended across the entire 64-bit register.
 +
 +{{ :en:multiasm:paarm:ldrsbx0.svg |}}
 +
 +Zero extension with ''<fc #800000>LDRB</fc>'' is only available with a 32-bit destination register, because writing a 32-bit register already clears the most significant 32 bits of the corresponding 64-bit register.
 +
 +{{ :en:multiasm:paarm:ldrbw0.svg |}}
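 +
 +The three byte loads can be modelled in C with integer casts. This is a sketch under stated assumptions: the function names are hypothetical, each function returns the full 64-bit register content after the load, and the narrowing cast to ''int8_t'' relies on the two's complement behaviour that common compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* LDRSB W0: sign-extend the byte to 32 bits; writing W0 clears bits 63..32 of X0. */
uint64_t ldrsb_w(uint8_t byte)
{
    uint32_t w = (uint32_t)(int32_t)(int8_t)byte;
    return (uint64_t)w;
}

/* LDRSB X0: sign-extend the byte across all 64 bits. */
uint64_t ldrsb_x(uint8_t byte)
{
    return (uint64_t)(int64_t)(int8_t)byte;
}

/* LDRB W0: zero-extend the byte; all upper bits are cleared. */
uint64_t ldrb_w(uint8_t byte)
{
    return (uint64_t)byte;
}

/* The byte 0x9C (-100) from the text, loaded in the three ways. */
void extension_demo(void)
{
    assert(ldrsb_w(0x9C) == 0x00000000FFFFFF9CULL);
    assert(ldrsb_x(0x9C) == 0xFFFFFFFFFFFFFF9CULL);
    assert(ldrb_w(0x9C)  == 0x000000000000009CULL);
}
```

 +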
 +====Complex addressing modes ====
 +These are not separate addressing modes, but some instructions combine addressing with more complex data handling. One example is loading data from memory into vector registers (or storing data from them into memory). The LD1, LD2, LD3 and LD4 instructions load vector registers:
 +
 +''<fc #800000>LD1</fc> {<fc #008000>V1</fc>.<fc #808000>16b</fc>}, [<fc #008000>X1</fc>] <fc #6495ed>@ Load 16 bytes (128 bits) from memory address X1 into the vector register V1.</fc>''
 +
 +''<fc #800000>LD1</fc> {<fc #008000>V0</fc>.<fc #808000>4s</fc>, <fc #008000>V1</fc>.<fc #808000>4s</fc>}, [<fc #008000>X1</fc>], <fc #ffa500>#32</fc> <fc #6495ed>@ Load first 16 bytes in the V0 register and next 16 bytes into the V1 register (32 bytes in total). X1 is incremented by 32 after the load.</fc>''
 +
 +''<fc #800000>LD1</fc> {<fc #008000>V0</fc>.<fc #808000>4s</fc>}, [<fc #008000>X1</fc>], <fc #008000>X2</fc> <fc #6495ed>@ Load 16 bytes into the V0 register from the address in X1, then increment X1 by the value in X2 (post-index by register). LD1 has no shifted register offset form such as [X1, X2, LSL #4].</fc>''
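 +
 +The two-register LD1 with a post-index immediate can be sketched in C as two 16-byte copies followed by a base update. The type ''vreg_t'' and the function name are hypothetical and only model a 128-bit vector register as raw bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit vector register. */
typedef struct { uint8_t bytes[16]; } vreg_t;

/* Model of LD1 {V0.4s, V1.4s}, [X1], #32 */
void ld1_two_regs_post(const uint8_t *mem, uint64_t *base, vreg_t *v0, vreg_t *v1)
{
    assert(mem != NULL && base != NULL && v0 != NULL && v1 != NULL);
    memcpy(v0->bytes, mem + *base, 16);       /* first 16 bytes -> V0 */
    memcpy(v1->bytes, mem + *base + 16, 16);  /* next 16 bytes  -> V1 */
    *base += 32;                              /* post-index: X1 = X1 + 32 */
}
```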
 +
 +** Unprivileged addressing mode**
 +
 +The unprivileged addressing mode performs the memory access with EL0 (unprivileged) permissions even when the CPU is running at the EL1 exception level. It is typically used when privileged code, such as an operating system kernel, copies data to or from memory that belongs to EL0 software.
 +
 +''<fc #800000>LDTR  </fc> <fc #008000>W0</fc>, [<fc #008000>X1</fc>]''
 +
 +''<fc #800000>STTR   </fc> <fc #008000>W2</fc>, [<fc #008000>X1</fc>]''
 +
 +These two instruction examples load/store a 32-bit word from/to memory at address X1, but the data access is performed using EL0 permissions even if the CPU is currently running in EL1. If <fc #008000>X1</fc> register points to invalid user memory, the load instruction will fail with a fault.
 +
 +** Atomic/exclusive addressing**
 +
 +Exclusive addressing allows the processor to update memory shared between processes without data races. Exclusive operations use a load–reserve/store–conditional technique. The ''<fc #800000>LDXR</fc>'' instruction reads a value from memory and marks the address in the exclusive monitor so that the core can detect interference from other processes or cores that access the same memory location. A matching ''<fc #800000>STXR</fc>'' commits the new value only if no conflicting write has occurred. If another process (or core) has changed the location in the meantime, the store fails.
 +
 +''<fc #800000>LDXR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ Load 64-bit value from memory address pointed by X1 and mark the address in the CPU exclusive monitor.</fc>''
 +
 +Such instructions can be used as part of an atomic read-modify-write operation.
 +
 +''<fc #800000>STXR</fc> <fc #008000>W0</fc>, <fc #008000>X1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ Try to store X1 into memory at address [X2]. If no other process or CPU has written to this address since the LDXR instruction, the store succeeds and W0 = 0. Otherwise the store fails and W0 = 1. The status register must be a 32-bit W register.</fc>''
 +
 +Complete example of atomic read-modify-write technique:
 +
 +''rmw:''\\
 + ''<fc #800000>LDXR </fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ load the current value and set exclusive monitor</fc>''\\
 + ''<fc #800000>ADD </fc> <fc #008000>X0</fc>, <fc #008000>X0</fc>, <fc #ffa500>#1</fc> <fc #6495ed>@ compute new value</fc>''\\
 + ''<fc #800000>STXR </fc> <fc #008000>W2</fc>, <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ try to store the new value atomically</fc>''\\
 + ''<fc #800000>CBNZ</fc> <fc #008000>W2</fc>, rmw <fc #6495ed>@ if store failed, retry</fc>''
 +
 +Atomic read-modify-write instructions such as <fc #800000>LDADD</fc>, <fc #800000>CAS</fc>, and <fc #800000>SWP</fc> perform the entire update in a single instruction. They belong to the Large System Extensions (LSE) introduced with ARMv8.1. All these operations use the simple [<fc #008000>Xn</fc>] addressing form to keep behaviour predictable. This model supports locks, counters, and other shared data structures without relying on heavier synchronisation mechanisms.
 +
 +''<fc #800000>LDADD</fc> <fc #008000>W0</fc>, <fc #008000>W1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ atomically load the value at [X2] into W1, add W0 to it and write the sum back to memory</fc>''\\
 +This instruction can be described with the following pseudocode:\\
 +<codeblock code_label>
 +<caption>LDADD instruction pseudocode</caption>
 +<code>
 +oldValue = [X2]
 +[X2] = oldValue + W0
 +W1 = oldValue
 +</code>
 +</codeblock>
 +
 +Another example is the CAS instruction, an atomic compare-and-swap.
 +
 +''<fc #800000>CAS</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ atomically compare the value of [X2] with X0. If [X2]==X0 then [X2]=X1, otherwise the data in memory are left unchanged. In both cases X0 receives the value that was read from [X2] </fc>''
 +
 +Similarly, the <fc #800000>SWP</fc> instruction atomically swaps a register value with a value in memory.
 +
 +<codeblock code_label>
 +<caption>SWP instruction pseudocode</caption>
 +<code>
 +SWP X3, X4, [X5]
 +oldValue = [X5]
 +[X5] = X3
 +X4 = oldValue
 +</code>
 +</codeblock>
 +
 +Atomic instructions do not support immediate offsets, register offsets, or pre-indexed/post-indexed forms. They use only the simple base register form [Xn].
en/multiasm/paarm/chapter_5_5.1727470787.txt.gz · Last modified: 2024/09/27 20:59 by pczekalski
CC Attribution-Share Alike 4.0 International