Differences

This shows you the differences between two versions of the page.

en:multiasm:paarm:chapter_5_5 [2025/12/03 17:55] – [Other addressing modes] eriks.klavins
en:multiasm:paarm:chapter_5_5 [2025/12/03 20:03] (current) – [Complex addressing modes] eriks.klavins
Line 60:
 {{ :en:multiasm:paarm:ldrbw0.svg |}}
 ====Complex addressing modes ====
-This is not a real addressing mode, but some instructions give possibility to make the addressing a bit complex.
 +These are not separate addressing modes in the strict sense, but some instructions allow more complex forms of addressing. One example is loading data from memory into vector registers (or storing it back): the LD1, LD2, LD3 and LD4 instructions load vector registers:
  
 +''<fc #800000>LD1</fc> {<fc #008000>V1</fc>.<fc #808000>16b</fc>}, [<fc #008000>X1</fc>] <fc #6495ed>@ Load 16 bytes (128 bits) from memory address X1 into the vector register V1.</fc>''
 +
 +''<fc #800000>LD1</fc> {<fc #008000>V0</fc>.<fc #808000>4s</fc>, <fc #008000>V1</fc>.<fc #808000>4s</fc>}, [<fc #008000>X1</fc>], <fc #ffa500>#32</fc> <fc #6495ed>@ Load the first 16 bytes into the V0 register and the next 16 bytes into the V1 register (32 bytes in total). X1 is incremented by 32 after the load.</fc>''
 +
 +''<fc #800000>LD1</fc> {<fc #008000>V0</fc>.<fc #808000>4s</fc>}, [<fc #008000>X1</fc>], <fc #008000>X2</fc> <fc #6495ed>@ Load 16 bytes into the V0 register from the address in X1, then increment X1 by the value of X2 (post-index by register). LD1 does not accept a shifted register offset inside the brackets.</fc>''
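 +
 +The post-indexed form can be modelled in plain C to make the address update explicit. The sketch below is only an illustration of ''LD1 {V0.4s, V1.4s}, [X1], #32''; the type ''vreg4s'' and the function names are hypothetical and do not come from the text or from any library.
 +

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector register viewed as 4 x 32-bit lanes. */
typedef struct {
    uint32_t lane[4];
} vreg4s;

/* Models LD1 {V0.4s, V1.4s}, [X1], #32:
   fills v0 with the first 16 bytes, v1 with the next 16 bytes,
   and returns the post-incremented address (X1 + 32). */
const uint8_t *ld1_pair_post_index(const uint8_t *x1, vreg4s *v0, vreg4s *v1)
{
    memcpy(v0, x1, 16);
    memcpy(v1, x1 + 16, 16);
    return x1 + 32;
}

/* Small self-check of the model above. */
void ld1_post_index_demo(void)
{
    uint32_t src[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    vreg4s v0, v1;
    const uint8_t *x1 = (const uint8_t *)src;
    const uint8_t *next = ld1_pair_post_index(x1, &v0, &v1);

    assert(v0.lane[0] == 0 && v0.lane[3] == 3);
    assert(v1.lane[0] == 4 && v1.lane[3] == 7);
    assert(next == x1 + 32);
}
```

 +
 +Returning the updated pointer mirrors the write-back to X1, so repeated calls walk through a buffer 32 bytes at a time.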
 +
 +**Unprivileged addressing mode**
 +
 +The unprivileged addressing mode performs the memory access with EL0 permissions even when the CPU is running at the EL1 exception level. Privileged code, such as an operating system kernel, typically uses it to copy data to or from memory that belongs to a lower exception level.
 +
 +''<fc #800000>LDTR  </fc> <fc #008000>W0</fc>, [<fc #008000>X1</fc>]''
 +
 +''<fc #800000>STTR   </fc> <fc #008000>W2</fc>, [<fc #008000>X1</fc>]''
 +
 +These two instructions load and store a 32-bit word from/to memory at the address in X1, but the access is checked against EL0 permissions even if the CPU is currently running at EL1. If the <fc #008000>X1</fc> register points to memory that EL0 is not allowed to access, the instruction fails with a fault.
 +
 +**Atomic/exclusive addressing**
 +
 +Exclusive addressing allows the processor to update memory that is shared between processes (or cores) without data races. Exclusive operations use a load–reserve/store–conditional technique. The ''<fc #800000>LDXR</fc>'' instruction reads a value from memory and marks the address in the exclusive monitor so that the core can detect interference from other processes or cores that write to the same memory location. A matching ''<fc #800000>STXR</fc>'' commits the new value only if no conflicting write has occurred. If another process (or core) has changed the location in the meantime, the store fails.
 +
 +''<fc #800000>LDXR</fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ Load a 64-bit value from the memory address pointed to by X1 and mark the address in the CPU exclusive monitor.</fc>''
 +
 +Such instructions can be used as part of an atomic read-modify-write operation.
 +
 +''<fc #800000>STXR</fc> <fc #008000>W0</fc>, <fc #008000>X1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ Try to store X1 to memory at the address in X2. If no other process or CPU has written to this address since the LDXR instruction, the store succeeds and W0 = 0. Otherwise the store fails and W0 = 1. The status register must be a 32-bit W register.</fc>''
 +
 +A complete example of the atomic read-modify-write technique:
 +
 +''rmw:''\\
 + ''<fc #800000>LDXR </fc> <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ load the current value and set exclusive monitor</fc>''\\
 + ''<fc #800000>ADD </fc> <fc #008000>X0</fc>, <fc #008000>X0</fc>, <fc #ffa500>#1</fc> <fc #6495ed>@ compute new value</fc>''\\
 + ''<fc #800000>STXR </fc> <fc #008000>W2</fc>, <fc #008000>X0</fc>, [<fc #008000>X1</fc>] <fc #6495ed>@ try to store the new value atomically</fc>''\\
 + ''<fc #800000>CBNZ</fc> <fc #008000>W2</fc>, rmw <fc #6495ed>@ if store failed, retry</fc>''
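 +
 +The same retry pattern can be sketched in C11 with ''<stdatomic.h>''. This is a hedged illustration rather than a translation of the listing: the function names are hypothetical, and whether a compiler turns the loop into an LDXR/STXR sequence depends on the toolchain and the target options, which is assumed here and not guaranteed.
 +

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper (not from the text): atomically adds 1 to *counter
   and returns the new value. */
uint64_t rmw_increment(_Atomic uint64_t *counter)
{
    /* Plays the role of LDXR: read the current value. */
    uint64_t old = atomic_load(counter);

    /* Plays the role of ADD + STXR + CBNZ: try to store old + 1.
       If memory no longer holds 'old', the call fails, refreshes 'old'
       with the value it found, and the loop retries. */
    while (!atomic_compare_exchange_weak(counter, &old, old + 1)) {
        /* nothing to do: 'old' already holds the fresh value */
    }
    return old + 1;
}

/* Small single-threaded self-check of the helper above. */
void rmw_increment_demo(void)
{
    _Atomic uint64_t counter = 41;

    assert(rmw_increment(&counter) == 42);
    assert(atomic_load(&counter) == 42);
}
```

 +
 +The weak compare-exchange may fail spuriously, just as a store-exclusive may fail without a real conflict, which is why both versions sit inside a retry loop.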
 +
 +Atomic read-modify-write instructions such as <fc #800000>LDADD</fc>, <fc #800000>CAS</fc>, and <fc #800000>SWP</fc> perform the entire update in a single instruction, so no retry loop is needed. They are part of the Large System Extensions (FEAT_LSE), introduced in Armv8.1-A. All these operations use the simple [<fc #008000>Xn</fc>] addressing form to keep behaviour predictable. This model supports locks, counters, and other shared data structures without relying on heavier synchronisation mechanisms.
 +
 +''<fc #800000>LDADD</fc> <fc #008000>W0</fc>, <fc #008000>W1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ Atomically load the value at [X2], add W0 to it and write the sum back to memory. The original value is returned in W1.</fc>''\\
 +Equivalently, this instruction can be described with pseudocode:\\
 +<codeblock code_label>
 +<caption>LDADD instruction pseudocode</caption>
 +<code>
 +oldValue = [X2]
 +[X2] = oldValue + W0
 +W1 = oldValue
 +</code>
 +</codeblock>
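 +
 +In C11 the closest counterpart is ''atomic_fetch_add'', which also returns the value held before the addition. The sketch below is an illustration under that assumption; ''ldadd_model'' is a hypothetical name, and the mapping to a single LDADD instruction depends on the compiler and on the target supporting these atomic instructions.
 +

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of LDADD W0, W1, [X2]:
   'mem' plays the role of [X2], 'addend' of W0,
   and the return value of W1 (the old value). */
uint32_t ldadd_model(_Atomic uint32_t *mem, uint32_t addend)
{
    return atomic_fetch_add(mem, addend);
}

/* Small self-check of the model above. */
void ldadd_model_demo(void)
{
    _Atomic uint32_t mem = 10;
    uint32_t old = ldadd_model(&mem, 5);

    assert(old == 10);               /* W1 = oldValue            */
    assert(atomic_load(&mem) == 15); /* [X2] = oldValue + W0     */
}
```
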
 +
 +Another example is the CAS instruction, an atomic compare-and-swap.
 +
 +''<fc #800000>CAS</fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>, [<fc #008000>X2</fc>] <fc #6495ed>@ Atomically compare the value at [X2] with X0. If [X2] == X0 then [X2] = X1; otherwise the data in memory is left unchanged. In both cases X0 receives the value that was read from [X2].</fc>''
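 +
 +The compare-and-swap behaviour can be tried out in C11 with ''atomic_compare_exchange_strong''. This is a sketch, not the instruction itself: ''cas_model'' is a hypothetical name, and the boolean result is a convenience of the C interface that the instruction does not provide directly.
 +

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of CAS X0, X1, [X2]:
   '*expected' plays the role of X0, 'desired' of X1, 'mem' of [X2].
   Returns true if the swap happened. Afterwards '*expected' holds the
   value that was found in memory, whether or not the swap happened. */
bool cas_model(_Atomic uint64_t *mem, uint64_t *expected, uint64_t desired)
{
    return atomic_compare_exchange_strong(mem, expected, desired);
}

/* Small self-check of the model above. */
void cas_model_demo(void)
{
    _Atomic uint64_t mem = 7;
    uint64_t expected = 7;

    /* Values match: memory is updated to 9. */
    assert(cas_model(&mem, &expected, 9));
    assert(atomic_load(&mem) == 9);
    assert(expected == 7);

    /* Values differ (memory holds 9, we expect 7): memory is unchanged
       and 'expected' is overwritten with the value found in memory. */
    assert(!cas_model(&mem, &expected, 11));
    assert(atomic_load(&mem) == 9);
    assert(expected == 9);
}
```
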
 +
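 +Similarly, the <fc #800000>SWP</fc> instruction performs an atomic data swap: it stores the first register to memory and returns the previous memory value in the second register.
 +
 +<codeblock code_label>
 +<caption>SWP instruction pseudocode</caption>
 +<code>
 +SWP X3, X4, [X5]
 +oldValue = [X5]
 +[X5] = X3
 +X4 = oldValue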
 +</code>
 +</codeblock>
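 +
 +An atomic swap is enough to build a simple lock, one of the uses mentioned above. The sketch below uses C11 ''atomic_exchange_explicit'' as a stand-in for the swap; the type and function names are hypothetical, and the acquire/release orderings are a design assumption of this example rather than something stated in the text.
 +

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: false = free, true = held. */
typedef struct {
    atomic_bool held;
} demo_spinlock;

/* Swap 'true' into the lock and inspect the previous value:
   the lock was acquired only if it was free before the swap. */
bool demo_try_lock(demo_spinlock *lock)
{
    return !atomic_exchange_explicit(&lock->held, true, memory_order_acquire);
}

/* Release the lock with a plain atomic store. */
void demo_unlock(demo_spinlock *lock)
{
    atomic_store_explicit(&lock->held, false, memory_order_release);
}

/* Small single-threaded self-check of the lock above. */
void demo_spinlock_check(void)
{
    demo_spinlock lock = { false };

    assert(demo_try_lock(&lock));   /* free -> acquired        */
    assert(!demo_try_lock(&lock));  /* already held -> refused */
    demo_unlock(&lock);
    assert(demo_try_lock(&lock));   /* free again -> acquired  */
}
```

 +
 +A caller that must wait would repeat ''demo_try_lock'' until it returns true.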
 +
 +Atomic addressing has no immediate-offset, register-offset, pre-indexed or post-indexed forms. The address is always taken from a single base register, [Xn].
en/multiasm/paarm/chapter_5_5.1764784527.txt.gz · Last modified: 2025/12/03 17:55 by eriks.klavins
CC Attribution-Share Alike 4.0 International