WebAssembly Advanced Data Types & Structures: Strings, Arrays, and the Serialization Nightmare

Imagine you’ve just shipped your slick new C++ image processing engine, hand-tuned and neatly compiled to WebAssembly. It crunches numbers at breakneck speed. But then it hits a wall: a JavaScript wall.

Your browser UI wants to send an array of objects. Maybe you need to pass a handful of nested structs, or just a plain old string. But WebAssembly doesn’t “speak” JavaScript arrays, objects, or Unicode strings. Arrays become flat memory blobs. Strings turn into byte puzzles. Complex data is suddenly a puzzle box with no instructions. And if you get the translation wrong? Bugs. Garbage data. Maybe even mysterious browser crashes.

Why is moving data between JavaScript and WebAssembly so hard—and why does it matter so much? Because without mastering this translation, all that C++ or Rust performance sits locked behind a brittle, bottlenecked interface. No fast image filters. No high-speed physics. No real-time game logic. It’s not just about knowing your i32 from your f64 anymore; it’s about building the bridges between two worlds.

And the reality is, most blogs only scratch the surface. They wave at simple numbers or “hello world” strings. But when you need to pass arrays of structs, build custom objects, or serialize trees and graphs? You’re on your own.

Let’s fix that.

The Strange World of Wasm Data Types

You know the basics: WebAssembly speaks just four number types (i32, i64, f32, f64), plus a 128-bit vector type (v128) and a few opaque reference types where your engine and toolchain support them. Everything else (arrays, structs, strings) must be encoded into one or more of these primitives, or written into linear memory, before crossing the JS-Wasm boundary.

That’s the first paradox: The richer your data in JS or C++, the flatter and dumber it becomes in Wasm memory. Imagine squeezing a bookshelf into an envelope.

Let’s see how you can do this without tearing your hair out.

String Handling: Taming the Byte Beast

Strings are where many developers crash first. Why so tricky? WebAssembly only knows sequences of bytes—it doesn’t have a built-in notion of “string.” In C and C++, you’re used to char*, maybe UTF-8. In JS, strings are more like arrays of UTF-16 code units.

So, how do you send "Hello, 世界" from JS to WebAssembly—and back—without garbling the data?

Marshaling a String from JS to Wasm (C Example)

Step 1: Encode Your String in JS

Use TextEncoder to get a UTF-8 byte array.

const text = "Hello, 世界";
const encoder = new TextEncoder();
const utf8 = encoder.encode(text); // Uint8Array

Step 2: Allocate Wasm Memory for String

Typically, your C/C++ code exposes a function to allocate memory.

// C function in your Wasm module
uint8_t* alloc_buffer(size_t length);

In JS, call this function via your Wasm instance, then write the bytes:

const ptr = wasmInstance.exports.alloc_buffer(utf8.length);
const memory = new Uint8Array(wasmInstance.exports.memory.buffer);
memory.set(utf8, ptr);

Step 3: Pass Pointer and Length into Wasm Function

Call your actual C function, passing the pointer and length.

wasmInstance.exports.process_string(ptr, utf8.length);
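Steps 1 to 3 can be wrapped in one helper. The following is a minimal sketch, not a definitive implementation: the `memory` object and the `allocBuffer` bump allocator are hypothetical stand-ins so the snippet runs without a compiled module. With a real module you would use `wasmInstance.exports.memory` and `wasmInstance.exports.alloc_buffer` instead.

```javascript
// Stand-ins for the module's linear memory and allocator (assumptions for
// this sketch; a real module exports its own memory and alloc_buffer).
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
let nextFree = 16; // hypothetical bump pointer; address 0 is left unused

function allocBuffer(length) {
  const ptr = nextFree;
  nextFree += (length + 7) & ~7; // round up so allocations stay 8-byte aligned
  return ptr;
}

// Encode a JS string as UTF-8 and copy it into linear memory.
// Returns the pointer and the BYTE length to hand to the Wasm function.
function writeString(text) {
  const utf8 = new TextEncoder().encode(text);
  const ptr = allocBuffer(utf8.length);
  // Create the view after allocating: if an allocator grows the memory,
  // views created earlier point at a detached buffer.
  new Uint8Array(memory.buffer).set(utf8, ptr);
  return { ptr, len: utf8.length };
}

const written = writeString("Hello, 世界");
console.log(written.ptr, written.len);
// With a real module: wasmInstance.exports.process_string(written.ptr, written.len);
```

Returning both values together makes it harder to pass the pointer with the wrong length later on.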

C Side: Receiving the String

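#include <stdint.h>
#include <stdio.h>

void process_string(uint8_t* ptr, size_t len) {
    // ptr points to 'len' UTF-8 bytes in linear memory.
    // The buffer is NOT NUL-terminated, so always carry 'len' with it;
    // "%.*s" prints exactly 'len' bytes.
    printf("Received string: %.*s\n", (int)len, (const char*)ptr);
}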

Back to JS?

To return a string from Wasm to JS, reverse the process:

  • Write the result to memory,
  • Pass back a pointer and length,
  • Use TextDecoder to reconstruct the JS string.
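The return path can be sketched as follows. This assumes the Wasm side has already written UTF-8 bytes into linear memory and reported a pointer and a length. Here that is simulated by writing the bytes from JS at a hypothetical address, and `readString` is an illustrative helper name, not part of any toolchain.

```javascript
const linearMemory = new WebAssembly.Memory({ initial: 1 });

// Decode 'len' UTF-8 bytes starting at 'ptr' into a JS string.
function readString(memory, ptr, len) {
  const bytes = new Uint8Array(memory.buffer, ptr, len); // a view, no copy
  return new TextDecoder("utf-8").decode(bytes);
}

// Simulate the module producing a result at a hypothetical address.
const resultPtr = 64;
const resultBytes = new TextEncoder().encode("Hello, 世界");
new Uint8Array(linearMemory.buffer).set(resultBytes, resultPtr);

const decoded = readString(linearMemory, resultPtr, resultBytes.length);
console.log(decoded);
```

If the module allocated the result buffer, it also needs a way to free it once JS has decoded the string, or the memory leaks.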

Tips:

  • Always agree on encoding (usually UTF-8).
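  • Pass the UTF-8 byte length, not the JS string length; the two differ for non-ASCII text.
  • The bytes are not NUL-terminated unless you append a 0 yourself, and a string containing U+0000 embeds a zero byte that cuts it short for C functions like strlen.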
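A quick check makes the length pitfall concrete. The sketch below compares the JS string length (UTF-16 code units) with the UTF-8 byte count, and shows one way to build a NUL-terminated copy for C APIs that expect one.

```javascript
const sample = "Hello, 世界";
const sampleUtf8 = new TextEncoder().encode(sample);

console.log(sample.length);     // 9 UTF-16 code units
console.log(sampleUtf8.length); // 13 UTF-8 bytes (each CJK character takes 3)

// UTF-8 never produces a zero byte unless the string itself contains U+0000.
console.log(sampleUtf8.includes(0)); // false

// NUL-terminated copy: typed arrays are zero-filled, so the extra final
// byte is already the terminator.
const terminated = new Uint8Array(sampleUtf8.length + 1);
terminated.set(sampleUtf8);
```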

Arrays and Nested Structures: Packing Order from Chaos

Now, let’s say you want to pass an array of structs—an array of {id: number, value: float}—from JS to your Wasm module.

How do you flatten these objects into something Wasm recognizes?

1. Establish a Memory Layout

Every struct must have a fixed binary layout. For our example:

typedef struct {
    int32_t id;
    float value;
} Item;

Suppose you want to send an array of 3 Items from JS. The memory layout (assuming 4-byte alignment) will look like:

| id (4B) | value (4B) | id | value | id | value |

Total size: 3 (structs) * 8 (bytes per struct) = 24 bytes.
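The offsets and the total size can be computed rather than counted by hand. The sketch below assumes natural alignment (each field aligned to its own size, and the struct padded to its largest field). That matches what C compilers typically do for these types, but it is an assumption to confirm against `sizeof` and `offsetof` in your own build. `layoutOf` and the type names are hypothetical.

```javascript
const TYPE_SIZES = { u8: 1, i32: 4, f32: 4, f64: 8 };

// Compute field offsets and total size under natural alignment.
function layoutOf(fields) {
  let offset = 0;
  let maxAlign = 1;
  const offsets = {};
  for (const [name, type] of fields) {
    const size = TYPE_SIZES[type];
    offset = Math.ceil(offset / size) * size; // pad up to the field's alignment
    offsets[name] = offset;
    offset += size;
    maxAlign = Math.max(maxAlign, size);
  }
  // Pad the end so consecutive array elements stay aligned.
  const size = Math.ceil(offset / maxAlign) * maxAlign;
  return { offsets, size };
}

const itemLayout = layoutOf([["id", "i32"], ["value", "f32"]]);
console.log(itemLayout); // { offsets: { id: 0, value: 4 }, size: 8 }

// Padding appears as soon as field sizes differ:
const mixedLayout = layoutOf([["flag", "u8"], ["weight", "f64"]]);
console.log(mixedLayout); // { offsets: { flag: 0, weight: 8 }, size: 16 }
```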

2. Serialize Data in JS

Flatten your JS array of objects:

const items = [
    {id: 1, value: 13.37},
    {id: 2, value: 42.0},
    {id: 3, value: -7.2},
];

const buf = new ArrayBuffer(items.length * 8); // 8 bytes per Item
const view = new DataView(buf);

items.forEach((item, i) => {
    view.setInt32(i * 8, item.id, true);  // offset, value, little-endian
    view.setFloat32(i * 8 + 4, item.value, true);
});

Allocate buffer in Wasm and copy:

const ptr = wasmInstance.exports.alloc_buffer(buf.byteLength);
new Uint8Array(wasmInstance.exports.memory.buffer).set(new Uint8Array(buf), ptr);
wasmInstance.exports.process_items(ptr, items.length);

3. Deserialize in C

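#include <stdint.h>
#include <stdio.h>

// Item is the struct defined above; 'items' points at 'count' packed Items.
void process_items(Item* items, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        printf("Item %d: id=%d, value=%.2f\n", (int)i, (int)items[i].id, items[i].value);
    }
}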

Key Principle:

You control the encoding. JS and Wasm must agree exactly on the byte layout, alignment, and type sizes.
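One way to check that both sides agree is a round trip: pack the items, copy them into memory, and read them back with the same layout. The sketch below uses a standalone `WebAssembly.Memory` and a hypothetical address in place of a real module. It also shows a detail that is easy to miss: values stored as 32-bit floats come back rounded.

```javascript
const ITEM_SIZE = 8; // int32 id (4 bytes) + float value (4 bytes)

function packItems(items) {
  const buf = new ArrayBuffer(items.length * ITEM_SIZE);
  const view = new DataView(buf);
  items.forEach((item, i) => {
    view.setInt32(i * ITEM_SIZE, item.id, true);        // little-endian
    view.setFloat32(i * ITEM_SIZE + 4, item.value, true);
  });
  return buf;
}

function unpackItems(buffer, ptr, count) {
  const view = new DataView(buffer, ptr, count * ITEM_SIZE);
  const items = [];
  for (let i = 0; i < count; i++) {
    items.push({
      id: view.getInt32(i * ITEM_SIZE, true),
      value: view.getFloat32(i * ITEM_SIZE + 4, true),
    });
  }
  return items;
}

const itemMemory = new WebAssembly.Memory({ initial: 1 });
const itemsPtr = 256; // hypothetical address, 4-byte aligned
const packed = packItems([
  { id: 1, value: 13.37 },
  { id: 2, value: 42.0 },
  { id: 3, value: -7.2 },
]);
new Uint8Array(itemMemory.buffer).set(new Uint8Array(packed), itemsPtr);

const roundTrip = unpackItems(itemMemory.buffer, itemsPtr, 3);
console.log(roundTrip[0].value === 13.37);              // false: float32 rounding
console.log(roundTrip[0].value === Math.fround(13.37)); // true
```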

Struct Serialization: Handling the Monsters (Nested, Complex Structures)

What if you’ve got a struct with a child struct, or an array of arrays?

Example:

typedef struct {
    int32_t id;
    float value;
} Item;

typedef struct {
    int32_t group_id;
    Item items[2];
} Group;
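Because Group contains only fixed-size fields, it flattens to 20 contiguous bytes: group_id, then the two Items back to back. A minimal sketch of the JS side, assuming 4-byte alignment with no padding (worth confirming with `sizeof(Group)` in your build); the helper names are hypothetical.

```javascript
const NESTED_ITEM_SIZE = 8;                  // int32 id + float value
const GROUP_SIZE = 4 + 2 * NESTED_ITEM_SIZE; // group_id + items[2] = 20 bytes

function writeItem(view, offset, item) {
  view.setInt32(offset, item.id, true);
  view.setFloat32(offset + 4, item.value, true);
}

// Writing the outer struct delegates to the writer of the inner struct,
// shifting the offset for each nested element.
function writeGroup(view, offset, group) {
  view.setInt32(offset, group.groupId, true);
  group.items.forEach((item, i) => {
    writeItem(view, offset + 4 + i * NESTED_ITEM_SIZE, item);
  });
}

const groupBuf = new ArrayBuffer(GROUP_SIZE);
const groupView = new DataView(groupBuf);
writeGroup(groupView, 0, {
  groupId: 7,
  items: [{ id: 1, value: 1.5 }, { id: 2, value: -2.25 }],
});
```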

Now, your serialization gets trickier—nested objects mean you need to recursively flatten your data, keeping byte layout and padding in mind. For anything big or variable-length (like a struct with a string field or a dynamically sized array), you’ll need to:

  • Write "header" data (e.g. lengths or offsets),
  • Store variable data somewhere else in memory,
  • Track pointers/offsets carefully.
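The header-plus-payload pattern can be sketched for a record with one variable-length string field. The 12-byte header layout here (id, name offset, name length) is an illustrative choice, not a standard format. Offsets are relative to the start of the record, so the bytes can be copied to any address in linear memory without patching pointers.

```javascript
const HEADER_SIZE = 12; // int32 id | uint32 nameOffset | uint32 nameLength

function encodeRecord(record) {
  const nameBytes = new TextEncoder().encode(record.name);
  const buf = new ArrayBuffer(HEADER_SIZE + nameBytes.length);
  const view = new DataView(buf);
  view.setInt32(0, record.id, true);
  view.setUint32(4, HEADER_SIZE, true);      // payload starts right after the header
  view.setUint32(8, nameBytes.length, true); // length in bytes, not characters
  new Uint8Array(buf, HEADER_SIZE).set(nameBytes);
  return buf;
}

function decodeRecord(buf) {
  const view = new DataView(buf);
  const nameOffset = view.getUint32(4, true);
  const nameLength = view.getUint32(8, true);
  const nameBytes = new Uint8Array(buf, nameOffset, nameLength);
  return {
    id: view.getInt32(0, true),
    name: new TextDecoder().decode(nameBytes),
  };
}

const recordBuf = encodeRecord({ id: 5, name: "世界" });
const decodedRecord = decodeRecord(recordBuf);
console.log(recordBuf.byteLength, decodedRecord);
```

The C side would read the same three header fields and then treat `base + nameOffset` as the start of the string bytes.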

Serialization Libraries?

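Tools like wasm-bindgen (for Rust) and Emscripten’s Embind (for C++) generate much of this glue for you, and a schema-based format such as Protocol Buffers can define the wire layout. In plain C, or when you need exact control over the bytes, you’ll often need to roll your own.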

Debugging Gotchas

  • Always check your offsets and alignment.
  • Use Chrome’s Memory Inspector to peek into Wasm linear memory at runtime, and WABT to inspect the compiled module itself, so you can verify data layout.
  • Log memory slices on both sides (JS and Wasm) for comparison.
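For logging memory slices, a small hex dump helper is usually enough. This is a sketch with a hypothetical `hexDump` function. It runs against a standalone `WebAssembly.Memory` here, but works the same on `wasmInstance.exports.memory.buffer`.

```javascript
// Format 'length' bytes starting at 'ptr' as space-separated hex pairs.
function hexDump(buffer, ptr, length) {
  const bytes = new Uint8Array(buffer, ptr, length);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

const dumpMemory = new WebAssembly.Memory({ initial: 1 });
const dumpView = new DataView(dumpMemory.buffer);
dumpView.setInt32(32, 1, true);      // id = 1
dumpView.setFloat32(36, 42.0, true); // value = 42.0 (0x42280000)

console.log(hexDump(dumpMemory.buffer, 32, 8)); // 01 00 00 00 00 00 28 42
```

Reading the dump next to the expected layout shows at a glance whether a field landed at the wrong offset or in the wrong byte order.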

Data Marshalling in the Wild: Real-World Patterns

Why should you care about all this boilerplate? Because this is where most WebAssembly integrations break down in production.

Case: Real-Time Graphics Pipeline

A web-based simulation engine in C++ wants to render a thousand moving particles per frame. Each particle: {x, y, vx, vy, color}.

  • Bad approach: Ship each as an object through an exported function—huge overhead.
  • Good approach: Write the entire array as a packed buffer, then send one pointer and a count. Wasm reads and processes in bulk.
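The packed approach can be sketched with typed-array views created directly over linear memory, so JS writes particle data in place with no intermediate copy. Several things are assumptions here: color is stored as a 32-bit integer, giving each particle five 32-bit slots; the address is hypothetical; and `update_particles` is a made-up export name. Typed arrays use the host byte order, which is little-endian on essentially all current platforms and so matches Wasm.

```javascript
const SLOTS_PER_PARTICLE = 5; // x, y, vx, vy (f32) + color (u32) = 20 bytes
const particleCount = 1000;
const particleMemory = new WebAssembly.Memory({ initial: 1 });
const particlesPtr = 1024; // hypothetical address; must be 4-byte aligned

// Two views over the same bytes: one for the float fields, one for color.
const f32 = new Float32Array(
  particleMemory.buffer, particlesPtr, particleCount * SLOTS_PER_PARTICLE);
const u32 = new Uint32Array(
  particleMemory.buffer, particlesPtr, particleCount * SLOTS_PER_PARTICLE);

for (let i = 0; i < particleCount; i++) {
  const base = i * SLOTS_PER_PARTICLE;
  f32[base] = i;              // x
  f32[base + 1] = i * 2;      // y
  f32[base + 2] = 0.5;        // vx
  f32[base + 3] = -0.5;       // vy
  u32[base + 4] = 0xff00ff00; // color
}

// One boundary crossing per frame instead of one per particle:
// wasmInstance.exports.update_particles(particlesPtr, particleCount);
```

If the module's memory grows, these views are detached and must be recreated from the new `memory.buffer`.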

Or Machine Learning Inference

  • JS collects array of float inputs.
  • Flattens, encodes, pushes into Wasm.
  • Calls one exported function with a pointer and a length. This scales to large inputs, up to the 4 GiB ceiling of 32-bit linear memory.
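The inference flow above follows the same shape: copy inputs in, make one call, copy outputs out. In this sketch `runInference` is a JS stand-in for an exported Wasm function (it just clamps negatives to zero), and both addresses are hypothetical.

```javascript
const inferenceMemory = new WebAssembly.Memory({ initial: 1 });
const inputPtr = 1024;  // hypothetical, 4-byte aligned
const outputPtr = 2048; // hypothetical, 4-byte aligned

// Stand-in for the module's exported function: reads n floats at inPtr,
// writes n floats at outPtr.
function runInference(inPtr, outPtr, n) {
  const input = new Float32Array(inferenceMemory.buffer, inPtr, n);
  const output = new Float32Array(inferenceMemory.buffer, outPtr, n);
  for (let i = 0; i < n; i++) output[i] = Math.max(0, input[i]);
}

const inputs = [0.5, -1.25, 3, -0.75];

// Flatten and push: one bulk copy into linear memory.
new Float32Array(inferenceMemory.buffer, inputPtr, inputs.length).set(inputs);

runInference(inputPtr, outputPtr, inputs.length);

// Copy results out so they survive later writes to linear memory.
const outputs = Array.from(
  new Float32Array(inferenceMemory.buffer, outputPtr, inputs.length));
console.log(outputs); // [ 0.5, 0, 3, 0 ]
```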

The more complex your data, the more critical efficient serialization/marshalling is for performance and correctness.

Debugging and Profiling: Don’t Fly Blind

Debugging data structures in Wasm isn’t for the faint of heart. Memory bugs hide behind every pointer.

Your Allies:

  • Chrome DevTools > Memory Inspector: View and poke linear memory directly.
  • Emscripten stack traces and sanitizers (AddressSanitizer, UndefinedBehaviorSanitizer): Catch out-of-bounds accesses, use-after-free, and undefined behavior such as misaligned access.
  • WABT (wasm-objdump, wasm2wat): Reverse-engineer module layout, inspect types.
  • Custom Memory Dumps: Print slices of Wasm heap to JS console, so you can “diff” input/output or catch weird endian issues.

If something looks off—walk through your memory byte by byte. Don’t trust assumptions; verify layouts.
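A byte-by-byte walk is easy to automate. The sketch below compares the bytes you intended to write with the bytes actually present and reports the first differing offset. `firstMismatch` is a hypothetical helper, and the second buffer deliberately contains a big-endian write to mimic a common mistake.

```javascript
// Return the index of the first differing byte, or -1 if the slices match.
function firstMismatch(expected, actual) {
  const n = Math.min(expected.length, actual.length);
  for (let i = 0; i < n; i++) {
    if (expected[i] !== actual[i]) return i;
  }
  return expected.length === actual.length ? -1 : n;
}

const expectedBuf = new ArrayBuffer(8);
new DataView(expectedBuf).setInt32(0, 1, true);
new DataView(expectedBuf).setInt32(4, 2, true);

const actualBuf = new ArrayBuffer(8);
new DataView(actualBuf).setInt32(0, 1, true);
new DataView(actualBuf).setInt32(4, 2, false); // bug: big-endian write

const mismatchAt = firstMismatch(
  new Uint8Array(expectedBuf), new Uint8Array(actualBuf));
console.log(mismatchAt); // 4: the second field is where the layouts diverge
```

Dividing the reported offset by the struct size tells you which element and which field to inspect.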

Mastering Wasm Data Structures: Why All This Matters

Let’s be blunt: If your data structures are wrong, your performance will be, too. Slow encoding, misaligned buffers, or excess copying will crush the very speed that drew you to WebAssembly.

But get it right? Suddenly, advanced image filters, physics engines, and ML models run nearly as fast in the browser as on the desktop. Your code becomes a bridge, not a bottleneck.

Nail struct layout, string marshaling, array packing—and you’re not just moving data. You’re unlocking whole new classes of apps for the web.

What’s Next: From Structs to Speed—Performance Optimization

We’ve just navigated the maze of advanced data types in WebAssembly—how to handle strings, pack arrays, and marshal custom structs between JS and Wasm. You’ve learned the why and the how.

But there’s one big piece left: Performance. Do data marshaling tricks actually matter for speed? How can you avoid common bottlenecks? What are the most effective patterns for high-performance Wasm? In the next post, we’ll crack open real-world benchmarks, explore memory layout best practices, and dive into micro-optimizations that squeeze every last drop out of your data structures.

Because when your data flows—so will your performance.

Ready to make your Wasm modules run at warp speed? Stay tuned for the deep dive into WebAssembly performance optimization…