en:multiasm:paarm:chapter_5_15 [2025/12/04 22:53] – [Barriers(instruction synchronization / data memory / data synchronization / one way BARRIER)] eriks.klavins
en:multiasm:paarm:chapter_5_15 [2025/12/04 23:12] (current) – [Power saving] eriks.klavins
 ===== Conditional instructions =====
  
 +Speculative instruction execution also consumes power. If the speculation turns out to be correct, that power is not wasted; otherwise, it is. Power consumption must therefore be taken into account when designing program code, and so must data safety. The Cortex-A76 in the Raspberry Pi 5 has a very advanced branch predictor, but mispredictions still cause wasted instructions, pipeline flushes, and energy spikes. It is better to avoid unpredictable branches inside loops, although in practice this is often not possible.
 +
 +The use of the conditional select (''<fc #800000>CSEL</fc>'') instruction may help with small, unpredictable branches.\\
 +''<fc #800000>CMP    </fc> <fc #008000>X0</fc>, <fc #008000>X1</fc>''\\
 +''<fc #800000>CSEL   </fc> <fc #008000>X2</fc>, <fc #008000>X3</fc>, <fc #008000>X4</fc>, <fc #9400d3>LT  </fc> <fc #6495ed>/* no branch to mispredict, so no pipeline flush */</fc>''
 +
 +This conditional instruction writes the value of the first source register to the destination register if the condition is TRUE. If the condition is FALSE, it writes the value of the second source register to the destination register. So, if the ''<fc #008000>X0</fc>'' value is less than the value in the ''<fc #008000>X1</fc>'' register (''<fc #9400d3>LT</fc>'' is a signed comparison), the ''<fc #800000>CSEL</fc>'' instruction writes the ''<fc #008000>X3</fc>'' register value into the ''<fc #008000>X2</fc>'' register; otherwise, the ''<fc #008000>X4</fc>'' register value is written into the ''<fc #008000>X2</fc>'' register.
 +
 +Other conditional instructions can be used similarly:
 +
 +{{:en:multiasm:paarm:conditionainst.jpg|}}
 +
 +These conditional instructions are helpful for branchless conditional checks. They can still be executed speculatively, but because no branch has to be predicted, there is no misprediction and therefore no pipeline flush or discarded work, unlike a branch whose prediction fails.
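 +
 +As a minimal sketch of this branchless style, the fragment below computes a signed maximum, an absolute value, a conditional increment, and a 0/1 flag without a single branch. The label ''max_abs_count'' and the register roles are assumptions chosen for illustration. ''CNEG'', ''CINC'' and ''CSET'' are standard A64 aliases of the conditional select family.
 +<codeblock code_label>
 +<caption>Branchless selection (sketch)</caption>
 +<code>
 +// Assumed inputs: x0 = a, x1 = b (signed 64-bit), x4 = running counter
 +max_abs_count:
 +    cmp     x0, x1
 +    csel    x2, x0, x1, gt      // x2 = (a > b) ? a : b        signed maximum
 +    cinc    x4, x4, eq          // x4 = (a == b) ? x4 + 1 : x4
 +    cset    x5, ne              // x5 = (a != b) ? 1 : 0
 +    cmp     x0, #0
 +    cneg    x3, x0, lt          // x3 = (a < 0) ? -a : a       absolute value
 +    ret
 +</code>
 +</codeblock>
 +None of these instructions changes the condition flags, so one ''CMP'' can feed several conditional instructions in a row.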
 ===== Power saving =====
 +
 +Some special instructions put the processor into a sleep mode in which it waits for an event to occur. The processor can be woken up by an interrupt or by an event. To use these modes, the code must explicitly initialise the interrupts and events and provide handlers for them. After that, the processor can be put to sleep, where it remains until an event or interrupt occurs. The following code example can be used only in bare-metal mode, that is, without an OS.
 +<codeblock code_label>
 +<caption>IDLE loop</caption>
 +<code>
 +.global idle_loop
 +idle_loop:
 +1:  WFI             // Wait For Interrupt: the core enters a low-power state
 +    B   1b          // After the interrupt is handled, go back to sleep
 +</code>
 +</codeblock>
 +<note>Note that interrupt handling and initialisation must also be implemented in the code; otherwise, the CPU may encounter an error that may force a reboot. </note>
 +The example only waits for interrupts to occur. To wait for events and interrupts, the ''<fc #800000>WFI</fc>'' instruction must be replaced with the ''<fc #800000>WFE</fc>'' instruction. Another CPU core may execute an ''<fc #800000>SEV</fc>'' instruction that signals an event to all cores.
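 +
 +A minimal bare-metal sketch of this ''WFE''/''SEV'' pairing is shown below: one core sleeps until a shared flag becomes non-zero, and another core sets the flag and signals the event. The names ''wait_for_flag'', ''signal_flag'' and ''flag'' are hypothetical, and the code assumes that the flag lies in normal, cache-coherent memory shared by both cores.
 +<codeblock code_label>
 +<caption>Waiting for an event (sketch)</caption>
 +<code>
 +.section .text
 +.global wait_for_flag
 +wait_for_flag:
 +    ldr     x1, =flag
 +1:  ldar    w0, [x1]        // load-acquire the shared flag
 +    cbnz    w0, 2f          // flag is set: stop waiting
 +    wfe                     // sleep until an event or interrupt arrives
 +    b       1b              // re-check: a wake-up does not prove the flag is set
 +2:  ret
 +
 +.global signal_flag
 +signal_flag:
 +    ldr     x1, =flag
 +    mov     w0, #1
 +    stlr    w0, [x1]        // store-release: publish the flag
 +    dsb     ish             // complete the store before signalling
 +    sev                     // send an event to all cores
 +    ret
 +
 +.section .data
 +.balign 4
 +flag:
 +    .word 0
 +</code>
 +</codeblock>
 +The waiting loop always re-reads the flag after ''WFE'' returns, because the core can also be woken by unrelated events or interrupts.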
 +
 +On a Raspberry Pi 5 running Linux, it is not observable whether the CPU enters these modes, because the OS generates many events between CPU cores and also handles many interrupts from communication interfaces and other Raspberry Pi components.
 +Another way to save energy while running an OS on the Raspberry Pi is to reduce the CPU clock frequency. Dynamic voltage and frequency scaling (DVFS), the same technique used in laptops, lowers power consumption and thereby extends battery life. A paper titled “Cooling a Raspberry Pi Device” includes a chapter explaining how to reduce the CPU clock frequency. Linux exposes CPU frequency scaling through sysfs, e.g.:
 +  * ''/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor''
 +  * ''/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq''
 +
 +It is possible to use syscalls in assembler to open these files and write specific values into them. Writing to them normally requires root privileges.
 +<codeblock code_label>
 +<caption>Power saving</caption>
 +<code>
 +.global _start
 +.section .text
 +_start:
 +    // openat(AT_FDCWD, path, O_WRONLY, 0)
 +    mov     x0, #-100               // AT_FDCWD
 +    ldr     x1, =gov_path           // const char *pathname
 +    mov     x2, #1                  // O_WRONLY
 +    mov     x3, #0                  // mode (unused)
 +    mov     x8, #56                 // sys_openat
 +    svc     #0
 +    cmp     x0, #0                  // a negative return value is -errno
 +    b.lt    fail                    // e.g. insufficient privileges
 +    mov     x19, x0                 // save fd
 +
 +    // write(fd, "powersave\n", 10)
 +    mov     x0, x19
 +    ldr     x1, =gov_value
 +    mov     x2, #10                 // length of "powersave\n"
 +    mov     x8, #64                 // sys_write
 +    svc     #0
 +
 +    // close(fd)
 +    mov     x0, x19
 +    mov     x8, #57                 // sys_close
 +    svc     #0
 +
 +    // exit(0)
 +    mov     x0, #0
 +exit:
 +    mov     x8, #93                 // sys_exit
 +    svc     #0
 +
 +fail:
 +    mov     x0, #1                  // exit(1) if the file could not be opened
 +    b       exit
 +
 +.section .rodata
 +gov_path:
 +    .asciz "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"
 +gov_value:
 +    .asciz "powersave\n"
 +</code>
 +</codeblock>
 +Similar things can be done with CPU frequencies, or even by turning off a separate core. This is just one example template that can be used to put the processor into a specific power mode: change the path stored in the //gov_path// variable, the value stored in //gov_value//, and the length passed to the write call. The main idea is to use the OS's system call functions. The OS will do the rest.
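 +
 +As a sketch of such a substitution, only the data section and the length passed to the write call change. The frequency ''1500000'' (in kHz) and the choice of ''cpu3'' are assumptions for illustration. Whether a particular frequency is accepted or a core can be taken offline depends on the kernel and board configuration.
 +<codeblock code_label>
 +<caption>Alternative paths and values (sketch)</caption>
 +<code>
 +.section .rodata
 +// Variant A: cap the maximum CPU frequency (value in kHz, assumed)
 +gov_path:
 +    .asciz "/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq"
 +gov_value:
 +    .asciz "1500000\n"      // write length becomes 8:  mov x2, #8
 +
 +// Variant B: take core 3 offline (use instead of variant A)
 +// gov_path:  .asciz "/sys/devices/system/cpu/cpu3/online"
 +// gov_value: .asciz "0\n"  // write length becomes 2:  mov x2, #2
 +</code>
 +</codeblock>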
 +
en/multiasm/paarm/chapter_5_15.1764888815.txt.gz · Last modified: 2025/12/04 22:53 by eriks.klavins
CC Attribution-Share Alike 4.0 International