Exploring the WebAssembly Text Format

Learn the basic syntax of WebAssembly text format.

Machines understand only 1s and 0s. We optimize binaries so that they run faster and more efficiently: the more concise and optimal the instructions, the better the machine performs. But for people, it’s difficult to analyze and understand a huge blob of 1s and 0s. That’s the very reason why we started abstracting and creating high-level programming languages.

In the WebAssembly world, we compile human-readable programming languages (such as Rust, Go, and C/C++) into binary code. These binaries are sequences of instructions made up of opcodes and operands. They are efficient for the machine to execute but difficult for us to read.

Why should we care about the readability of the generated binary? Because being able to read it helps us understand what the code does, which in turn helps when debugging.

WebAssembly provides the WebAssembly text format, usually called WAT (the name WAST also appears, mainly for the test-script superset of the format). WAT is a human-readable representation of the WebAssembly binary. JavaScript environments (both the browser and Node.js), through their developer tools, can display a loaded WebAssembly binary in the text format. This helps in understanding what’s in the code and in debugging it. Text editors can also show the binary in WebAssembly text format, which is much more readable than its binary counterpart.

The most basic WASM module in binary format is as follows:

00 61 73 6d 01 00 00 00

This translates to the following:

00 61 73 6d   01 00 00 00
\0  a  s  m    1  0  0  0
-----------   -----------
     |             |
Magic header    Version (little-endian format)

(The first four bytes are the ASCII values of the characters \0, a, s, and m.)

This basic module consists of a magic header (\0asm) followed by the WebAssembly version, 1, stored as a 4-byte little-endian integer (01 00 00 00).
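The header layout described above can be checked in a few lines. The following is a minimal Python sketch, not part of the original lesson: the helper name parse_wasm_header is hypothetical, and it only looks at the eight header bytes, ignoring any sections that follow.

```python
import struct

WASM_MAGIC = b"\x00asm"  # the four magic bytes: 00 61 73 6d


def parse_wasm_header(data):
    """Return the version number stored in a WebAssembly binary header.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    if len(data) < 8:
        raise ValueError("a WebAssembly header is 8 bytes long")
    # "<4sI": 4 raw bytes, then an unsigned 32-bit little-endian integer
    magic, version = struct.unpack("<4sI", data[:8])
    if magic != WASM_MAGIC:
        raise ValueError("magic header \\0asm not found")
    return version


header = bytes.fromhex("00 61 73 6d 01 00 00 00")
print(parse_wasm_header(header))  # prints 1
```

Reading the version with the little-endian format code is what turns the bytes 01 00 00 00 into the integer 1 rather than 16777216.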

The text format is written as s-expressions: every expression is enclosed in a pair of parentheses, (). S-expressions are commonly used to define nested lists or structured trees. Many research papers on tree-based data structures use this notation to present their code. S-expressions drop the ceremony of XML's opening and closing tags, giving a more concise format.

Note: Does this notation (defining everything within parentheses) look familiar? Have you ever worked with Lisp (or the languages that are built on or inspired by Lisp)?
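To see why s-expressions map so naturally onto trees, here is a small Python sketch that turns parenthesized text into nested lists. It is only an illustration under simplifying assumptions: the function names tokenize and parse are hypothetical, the sample WAT snippet is our own, and string literals and comments (which the real text format supports) are not handled.

```python
def tokenize(text):
    """Split s-expression text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested Python lists (consumes the tokens)."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens and tokens[0] != ")":
            node.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # drop the matching ")"
        return node
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token  # an atom such as "module" or "i32"


source = "(module (func (param i32) (result i32) local.get 0))"
print(parse(tokenize(source)))
# prints ['module', ['func', ['param', 'i32'], ['result', 'i32'], 'local.get', '0']]
```

Each pair of parentheses becomes one list, so the nesting of the text is exactly the nesting of the tree.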

Module structure and function definition

Modules are the basic building blocks in WASM. Here is the textual representation of the most basic WASM module:

(module)

A WASM module is made up of a header and zero or more sections. The header consists of the magic header and the version of WASM. Following the header, the module may have zero or more of the following sections:

  • Types

  • Functions

  • Tables

  • Memories

  • Globals

  • Element

  • Data

  • Start function

  • Exports

  • Imports

All these sections are optional. The structure of a WASM module appears as follows:

module ::= {
    types   vec<funcType>,
    funcs   vec<func>,
    tables  vec<table>,
    mems    vec<mem>,
    globals vec<global>,
    elem    vec<elem>,
    data    vec<data>,
    start   start,
    imports vec<import>,
    exports vec<export>
}
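As a rough mental model, the structure above can be mirrored with a Python dataclass. This is a sketch under stated assumptions rather than how any WebAssembly tool represents modules: the class name Module and the helper start_function are hypothetical, and the entries of each vector are left as plain placeholder values instead of real type, function, or table definitions.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Module:
    """Simplified mirror of the module structure; entries are placeholders."""
    types: List[Any] = field(default_factory=list)
    funcs: List[Any] = field(default_factory=list)
    tables: List[Any] = field(default_factory=list)
    mems: List[Any] = field(default_factory=list)
    globals: List[Any] = field(default_factory=list)
    elem: List[Any] = field(default_factory=list)
    data: List[Any] = field(default_factory=list)
    start: Optional[int] = None  # index into funcs, or None when absent
    imports: List[Any] = field(default_factory=list)
    exports: List[Any] = field(default_factory=list)


def start_function(module):
    """Return the funcs entry that start refers to, or None without a start."""
    if module.start is None:
        return None
    return module.funcs[module.start]


empty = Module()  # every vector is empty, like the text form (module)
print(empty.funcs, empty.start)  # prints [] None

with_start = Module(funcs=["init", "main"], start=1)
print(start_function(with_start))  # prints main
```

Every field except start defaults to an empty list, which reflects that each section is an optional vector, while start is a single optional index.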

Every section inside a WASM module is a vector (array) that contains zero or more values of the respective type, except for start. We’ll explore the start section later in this course. For now, start holds an index that references a function in the funcs ...