Integrating assembler code with applications written in high-level languages pays off in particular scenarios, such as implementing complex mathematical algorithms and real-time tasks that require efficient, compact code. Hardly anyone uses an assembler to implement a graphical user interface (GUI) anymore, as there is no reason to do so. Modern desktop operating systems are designed to provide a rich user experience and support languages such as C#, C++, and Python for implementing user interfaces (UIs) through libraries. While those UI functions can be called from the assembler level, there is virtually no reason to do it. A more effective approach is to write the main application in a high-level language and call assembly code as needed to perform backend operations efficiently.
In the case of multi-tier web applications, assembly code is usually hidden in the backend and oriented towards efficient computation. It is wrapped in a high-level API library, e.g. ASP.NET Core Web API or one of many other REST libraries.
Windows OS has historically supported unmanaged code written primarily in C++. This kind of code runs directly on the CPU, but divergence in hardware platforms, such as the introduction of ARM-core-based platforms running Windows, causes incompatibility issues. Since the introduction of the .NET framework, Windows has provided developers with a safer way to execute their code, called “managed code”. The difference is that managed code, typically written in C#, is compiled to an intermediate language and executed under the control of the .NET runtime, which translates it to machine code at run time (just-in-time compilation), rather than being compiled ahead of time directly into machine code, as unmanaged code is. The use of managed code brings multiple advantages for developers, including automated memory management and code isolation from the operating system. This, however, raises several challenges when integrating managed code and assembly code.
In any case, the integration model is common: the assembler implements functions (usually stateless) that are later called from the high-level language and return data to it. There are significant differences between x86 (32-bit) and x64 (64-bit) code, mostly in the integration methods. As we are at a very low level of programming, there are no shortcuts: every transfer of control from higher-level code to assembler code, and back, must follow strict calling conventions. Details are presented in chapter Procedures, Functions and Calls in Windows and Linux.
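The integration model described above can be sketched from the high-level side. The following C++ fragment is only an illustration under stated assumptions: the names add_scaled and compute_total are hypothetical, and the routine body is written in C++ purely as a stand-in so that the sketch links without an assembler. In a real project, only the extern "C" declaration would stay on the C++ side, and the body would be a PROC in a separate .asm file.

```cpp
#include <cassert>
#include <cstdint>

// Declaration as the C++ side would see it. extern "C" disables C++ name
// mangling, so the linker looks for the plain symbol name "add_scaled",
// which is the name an assembler PROC would export.
extern "C" std::int64_t add_scaled(std::int64_t a, std::int64_t b,
                                   std::int64_t scale);

// Stand-in body (hypothetical), written in C++ only to keep the sketch
// self-contained. Under the Windows x64 calling convention the three
// arguments would arrive in RCX, RDX and R8, and the result would be
// returned in RAX.
extern "C" std::int64_t add_scaled(std::int64_t a, std::int64_t b,
                                   std::int64_t scale) {
    assert(scale >= 0);          // illustrative precondition only
    return (a + b) * scale;      // stateless: depends only on its arguments
}

// High-level caller: it does not need to know which language implements
// the routine, only its signature and calling convention.
std::int64_t compute_total(std::int64_t a, std::int64_t b) {
    return add_scaled(a, b, 10);
}
```

Because the routine keeps no state between calls, the caller can treat it like any other library function; the calling convention is the only contract between the two sides.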
It is possible to write an application for Windows solely in assembler. While the practical value of doing so is doubtful, some of the hints presented below, such as calling system functions, may be helpful.
Calls to the Windows system functions are made with an ordinary call instruction. They require explicit declaration of the functions as external and linking with kernel32.lib and user32.lib. Use of legacy_stdio_definitions.lib and legacy_stdio_wide_specifiers.lib may be helpful when using advanced stdio functions and enumerations.
A common approach to development is to start with a stub command-line C++ application and manually adapt it to the needs of an assembler project. Visual Studio Community (https://visualstudio.microsoft.com/vs/community/) is a free edition and the first choice for developing applications written in pure assembler for Windows. A great feature of such an IDE is that, properly configured, it enables debugging of assembler code, even in high-level integration scenarios.
A template of the typical pure assembler, command-line application for Windows is as follows:
    ...
    .code
    hello_world_asm PROC
        push rbp                ; save frame pointer
        mov rbp, rsp            ; fix stack pointer
        sub rsp, 8 * (4 + 2)
        ....                    ; here comes your code
        mov rsp, rbp
        pop rbp
        ret
    hello_world_asm ENDP
    END
The name hello_world_asm must be specified to the linker as the so-called entry point (for example, with the /ENTRY linker option).
Calling system functions, such as the system message box, requires understanding the arguments passed to them. As there is no assembler-specific documentation, the C++ documentation of the Windows system API is helpful. The code below presents the necessary components of the assembler app to call system functions (library includes are configured on the project level):
    .data
    STD_INPUT_HANDLE  = -10
    STD_OUTPUT_HANDLE = -11
    STD_ERROR_HANDLE  = -12
    handler   dq 0
    hello_msg db "Hello world", 0
    info_msg  db "Info", 0
    ...
    includelib kernel32.lib
    includelib user32.lib
    EXTERN MessageBoxA: PROC
    ...
    ; WINUSERAPI int WINAPI MessageBoxA(
    ;   RCX => _In_opt_ HWND hWnd,
    ;   RDX => _In_opt_ LPCSTR lpText,
    ;   R8  => _In_opt_ LPCSTR lpCaption,
    ;   R9  => _In_ UINT uType);
    mov rcx, handler
    mov rdx, offset hello_msg
    mov r8, offset info_msg
    mov r9, 0               ; 0 is MB_OK
    and rsp, not 8
    call MessageBoxA
    ...
The majority of standard library functions accept ASCII strings that must be terminated with a 0 byte (the numeric value 0, not the character '0'), so they do not require passing the string length.
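The role of the terminating 0 byte can be illustrated with a short C++ sketch. It is a hypothetical illustration, not code from the Windows API: the helper zero_terminated_length is an assumed name, and the array mirrors the hello_msg definition from the listing above.

```cpp
#include <cassert>
#include <cstddef>

// Counts bytes up to, but not including, the terminating 0 byte.
// A function that receives only a pointer finds the end of the text
// in the same way: it scans until it meets the value 0.
std::size_t zero_terminated_length(const char* text) {
    assert(text != nullptr);
    std::size_t n = 0;
    while (text[n] != 0) {
        ++n;
    }
    return n;
}

// Equivalent of the assembler definition: hello_msg db "Hello world", 0
const char hello_msg[] = {'H', 'e', 'l', 'l', 'o', ' ',
                          'w', 'o', 'r', 'l', 'd', 0};

// The array occupies one byte more than the visible text.
static_assert(sizeof(hello_msg) == 12, "11 characters plus the terminator");
```

If the final 0 were omitted from the db directive, the scan would continue into whatever bytes follow the string in memory, which is why the terminator must be written explicitly in assembler.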
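The and rsp, not 8 instruction clears bit 3 of RSP, which aligns the stack pointer to a 16-byte boundary, assuming RSP is already a multiple of 8, as is normally the case in x64 code. This alignment is required before leaving the application and before any subsequent system function call.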
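The bit arithmetic behind this instruction can be checked on ordinary integers. The C++ sketch below is an illustration only: the function names are hypothetical and the sample addresses are made up. It also shows and rsp, -16, a common alternative that is not used in the listing above.

```cpp
#include <cassert>
#include <cstdint>

// Same bit operation as `and rsp, not 8`: clear bit 3 of the value.
// Assumes the input is already a multiple of 8.
std::uint64_t align_like_and_not_8(std::uint64_t rsp) {
    assert((rsp & 0x7) == 0);
    return rsp & ~std::uint64_t{8};
}

// Alternative form, `and rsp, -16`: clears the four lowest bits,
// so it works for any input value.
std::uint64_t align_down_16(std::uint64_t rsp) {
    return rsp & ~std::uint64_t{15};
}

// True when the address is a multiple of 16.
bool is_16_byte_aligned(std::uint64_t rsp) {
    return (rsp & 0xF) == 0;
}
```

Both operations can only lower the value or leave it unchanged. Since the stack grows towards lower addresses, the adjustment never exposes data that is already on the stack.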
Using dynamic memory management at the level of the assembler code is troublesome: allocating and releasing memory require calls to the hosting operating system. It is possible, but complex. Moreover, there is no automated memory management of the kind found in .NET, Java, and Python, so the developer is on their own, much as when programming in C++. For this reason, it is common to allocate adequate memory resources in the high-level code, e.g., the GUI front-end, and pass them to the assembler code as pointers. Note, however, that for some higher-level languages, such as C#, it is necessary to follow a strict pattern to ensure correct and persistent memory allocation, as described in the following chapters.
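The pattern of allocating memory on the high-level side and passing it down as a pointer might look as follows in C++. This is a minimal sketch under stated assumptions: scale_in_place and make_scaled are hypothetical names, and the routine body is written in C++ only as a stand-in for what would be a PROC in an .asm file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for an assembler routine (hypothetical). It receives a pointer
// and an element count, works on the caller's memory in place, and never
// allocates or releases anything itself.
extern "C" void scale_in_place(std::int32_t* data, std::size_t count,
                               std::int32_t factor) {
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        data[i] *= factor;
    }
}

// High-level side: owns the buffer, fills it with 1, 2, 3, ... and hands
// the raw pointer and length to the low-level routine.
std::vector<std::int32_t> make_scaled(std::size_t count, std::int32_t factor) {
    std::vector<std::int32_t> buffer(count);
    for (std::size_t i = 0; i < count; ++i) {
        buffer[i] = static_cast<std::int32_t>(i + 1);
    }
    scale_in_place(buffer.data(), buffer.size(), factor);
    return buffer;   // memory is released by the vector, not by the routine
}
```

Keeping ownership on the C++ side means the low-level routine needs no calls to the operating system for memory, and the buffer is guaranteed to stay at the same address for the duration of the call.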
Programming for applications written in unmanaged code
Continue here
Programming for applications written in managed code
In the case of managed code, things get more complex. The .NET framework features automated memory management, which automatically releases unused memory (e.g., objects to which there are no more references) and optimises variable locations for improved performance. It is known as the .NET Garbage Collector (GC). The GC constantly traces references and, in the event of an object relocation in memory, updates all references accordingly. It also releases objects that are no longer referenced. This automated mechanism, however, applies only within managed code.
The problem arises when developers integrate a front-end application written in managed code with assembler libraries written in unmanaged code. Pointers and references passed to the assembler code are not automatically traced by the GC. Using dynamically allocated variables on the .NET side and accessing them from the assembler code is a very common scenario. The GC cannot “see” any reference to the object (variable, memory) held in the unmanaged code; thus, it may release the memory or relocate it without updating the address held there. This causes serious, hard-to-debug errors that occur randomly (e.g. access violations or null reference exceptions). Luckily, there is a strict set of rules to follow when integrating managed and unmanaged code. We discuss them below.