Exploring the WebAssembly Text Format
Learn the basic syntax of WebAssembly text format.
Machines execute binary: sequences of 1s and 0s. We optimize that binary so that it runs faster; the more concise the instructions, the more efficiently the machine executes them. For people, though, a huge blob of 1s and 0s is difficult to read and analyze in context. That’s the very reason we started abstracting and creating high-level programming languages.
In the WebAssembly world, we convert human-readable programming languages (such as Rust, Go, and C/C++) into binary code. These binaries are sequences of instructions made up of opcodes and operands. They are efficient for the machine to execute but difficult for us to read.
Why should we worry about the readability of the generated binary? Because being able to read the code helps us understand it, which in turn helps when debugging.
WebAssembly provides a text format, known as WAST or WAT. It is a human-readable rendering of the WebAssembly binary. The JavaScript engine (both in the browser and Node.js), when loading the WebAssembly file, can convert the binary into WebAssembly text format. This helps in understanding what’s in the code and in debugging it. Text editors can also show the binary in WebAssembly text format, which is much more readable than its binary counterpart.
The most basic WASM module in binary format is as follows:
00 61 73 6d 01 00 00 00
This translates to the following:
00 61 73 6d 01 00 00 00
\0 a  s  m  1  0  0  0   (ascii value of the character)
|           |
|           version (little endian format)
Magic Header
This basic module has a magic header (\0asm) followed by the version of WebAssembly (01).
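The header check described above can be sketched in a few lines of Python. This is a minimal illustration rather than a full decoder; the function name `parse_wasm_header` and the constant `WASM_MAGIC` are made up for this example.

```python
import struct

WASM_MAGIC = b"\0asm"  # the bytes 00 61 73 6d


def parse_wasm_header(data: bytes) -> int:
    """Return the version number stored in the 8-byte module header."""
    if len(data) < 8:
        raise ValueError("a WebAssembly module needs at least 8 bytes")
    if data[:4] != WASM_MAGIC:
        raise ValueError("missing magic header \\0asm")
    # The version is a 4-byte unsigned integer in little-endian order.
    (version,) = struct.unpack("<I", data[4:8])
    return version


minimal_module = bytes.fromhex("0061736d01000000")
print(parse_wasm_header(minimal_module))  # prints 1
```

Reading the version with `"<I"` (little-endian, unsigned 32-bit) is what turns the bytes 01 00 00 00 into the number 1.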
The textual format is written as s-expressions. Every instruction or expression in s-expression syntax lives within a pair of parentheses, (). S-expressions are commonly used to define nested lists or structured trees. Many research papers on tree-based data structures use this notation to showcase their code. The s-expression removes all the unnecessary ceremony from XML, providing a concise format.
Note: Does this style of expression (everything defined within parentheses) look familiar? Have you ever worked with LISP (or languages built on or inspired by LISP)?
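To see why s-expressions map so naturally onto trees, here is a minimal Python sketch that turns parenthesized text into nested lists. It is not a real WebAssembly text parser: it assumes whitespace-separated atoms and ignores strings and comments, and the helper names `tokenize` and `parse` as well as the sample module text are chosen only for illustration.

```python
def tokenize(text: str) -> list[str]:
    """Split an s-expression into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: list[str]):
    """Consume tokens from the front and return an atom or a nested list."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    if token == ")":
        raise SyntaxError("unexpected )")
    return token


tree = parse(tokenize("(module (func (result i32) (i32.const 42)))"))
print(tree)
# ['module', ['func', ['result', 'i32'], ['i32.const', '42']]]
```

Each pair of parentheses becomes one list, so the nesting of the text is exactly the nesting of the tree.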
Module structure and function definition
Modules are the basic building blocks in WASM. Here is the textual representation of the most basic WASM module:
(module)
A WASM module is made up of a header and zero or more sections. The header consists of the magic header and the version of WASM. Following the header, the module may have zero or more of the following sections:
- Types
- Functions
- Tables
- Memories
- Globals
- Element
- Data
- Start function
- Exports
- Imports
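As a rough sketch of how a header followed by "zero or more sections" looks in the binary, the Python snippet below walks a module and lists the sections it finds. In the binary encoding, each section starts with a one-byte id followed by its payload size as an unsigned LEB128 integer. The ids come from the WebAssembly binary format specification rather than from this lesson, and they include two entries not in the list above: custom sections and the code section, which holds function bodies. The helper names are hypothetical.

```python
# Section ids as defined by the WebAssembly binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "data count",
}


def read_u32_leb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return value, pos
        shift += 7


def list_sections(module: bytes) -> list[tuple[str, int]]:
    """Return (section name, payload size) pairs without decoding payloads."""
    if module[:8] != bytes.fromhex("0061736d01000000"):
        raise ValueError("unexpected magic header or version")
    pos = 8
    found = []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_u32_leb128(module, pos + 1)
        found.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # jump over the section payload
    return found


header = bytes.fromhex("0061736d01000000")
print(list_sections(header))  # prints []
# Header plus a type section holding one function type with no params or results.
print(list_sections(header + bytes.fromhex("010401600000")))  # prints [('type', 4)]
```

The first call shows that a header with no sections at all is still a complete module.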
All these sections are optional in WASM. The structure of WASM appears as follows:
module ::= {
  types vec<funcType>,
  funcs vec<func>,
  tables vec<table>,
  mems vec<mem>,
  globals vec<global>,
  elem vec<elem>,
  data vec<data>,
  start start,
  imports vec<import>,
  exports vec<export>
}
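One way to picture the structure above is as a record whose fields are lists. The Python sketch below mirrors the field names from the grammar but simplifies every element type to a plain list, and it models start as an optional integer index. That last choice is an assumption made for illustration, based on every section being optional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Module:
    """Simplified mirror of the abstract module structure."""
    types: list = field(default_factory=list)
    funcs: list = field(default_factory=list)
    tables: list = field(default_factory=list)
    mems: list = field(default_factory=list)
    globals: list = field(default_factory=list)
    elem: list = field(default_factory=list)
    data: list = field(default_factory=list)
    start: Optional[int] = None  # a single index, not a vector (assumed optional)
    imports: list = field(default_factory=list)
    exports: list = field(default_factory=list)


empty = Module()  # plays the role of the text "(module)"
print(empty.funcs, empty.start)  # prints [] None
```

With every field left at its default, the record corresponds to the empty module shown earlier: each vector has zero entries and no start function is set.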
Every section inside the WASM is a vector (array) that contains zero or more values of the respective types, except for start. We’ll explore the start section later in this course. For now, start holds an index that references a function in the funcs ...