Chapter 1

A few months ago, the Wasmer team announced a Web Assembly (aka Wasm) interpreter that could be embedded into Rust programs. This is particularly exciting for anyone looking to add plugins to their project, and since Rust can compile programs directly to Wasm, it seems like a perfect option. In this series of blog posts we are going to investigate what it would take to build a plugin system using Wasmer and Rust.

The Setup

Before we really dig into the specifics, we should have a layout in mind for our project. That way if you want to follow along on your own computer you can, and if you're not, nothing will seem like magic. To do this we are going to take advantage of cargo's workspace feature, which allows us to collect a bunch of related projects in one parent project. You can also find a github repo with all of the code here; each branch will represent a different state of this series. The basic structure we are going to shoot for would look something like this.

wasmer-plugin-example
├── Cargo.toml
├── crates
│   ├── example-macro
│   │   ├── Cargo.toml
│   │   └── src
│   │       └── lib.rs
│   ├── example-plugin
│   │   ├── Cargo.toml
│   │   └── src
│   │       └── lib.rs
│   └── example-runner
│       ├── Cargo.toml
│       └── src
│           └── main.rs
└── src
    └── lib.rs

  • wasmer-plugin-example - A rust library, which we will cover in detail in one of the next parts
    • crates - The folder that will house all of our other projects
      • example-plugin - The plugin we will use to test that everything is working as expected
      • example-runner - A Binary project that will act as our plugin host
      • example-macro - A proc_macro library that we will be creating in one of the next parts

To set this up we are going to start by creating the parent project.

cargo new --lib wasmer-plugin-example
cd wasmer-plugin-example

Once that has been created and we have moved into that directory, open the Cargo.toml in your editor of choice. We need to add a [workspace] table to the configuration and point to the 3 projects in the crates folder from above.

[package]
name = "wasmer-plugin-example"
version = "0.1.0"
authors = ["freemasen <r@wiredforge.com>"]
edition = "2018"

[dependencies]


[workspace]
members = [
    "./crates/example-macro",
    "./crates/example-plugin",
    "./crates/example-runner",
]

Now we can make that crates folder and the projects that will live inside it.

mkdir ./crates
cd ./crates
cargo new --lib example-plugin
cargo new --lib example-macro
cargo new example-runner

With that we have our workspace set up. This will allow us to use cargo commands from any of the directories inside our project and target activity in any other project in our workspace. We tell cargo which project we want an action to apply to with the -p argument. If we wanted to build the example-plugin project, for instance, we would use the following command.

cargo build -p example-plugin

With our workspace all set up, we should take a moment and get our development environment in order. First and foremost we need to have the rust compiler, cargo and rustup. If you need those, head over to rustup.rs. With all that installed we are going to need the Web Assembly target from rustup.

rustup target add wasm32-unknown-unknown

In addition to our rust requirements, we will also need a few things for wasmer. The full guide is available here; for most systems you just need to make sure cmake is installed. For windows it is slightly more complicated but there are links in the dependency guide.

Our First Plugin

With that out of the way, we should talk about the elephant in the room: the Web Assembly specification only allows for the existence of numbers. Thankfully the Web Assembly target for rust can already handle this inside of a single program for us, but any function in a plugin that we want to call from our runner will need to only take numbers as arguments and only return numbers. With that in mind let's start with a very simple example. I will note that the examples in this part will not be very useful but I promise we will slowly build up the ability to do much more interesting things.


// ./crates/example-plugin/src/lib.rs
#[no_mangle]
pub fn add(one: i32, two: i32) -> i32 {
    one + two
}

The above is an extremely naive and uninteresting example of what a plugin might look like but it fits our requirement that it only deals with numbers. Now to get this to compile to Web Assembly, we need to set one more thing up in our Cargo.toml.

# ./crates/example-plugin/Cargo.toml
[package]
name = "example-plugin"
version = "0.1.0"
authors = ["freemasen <r@wiredforge.com>"]
edition = "2018"

[dependencies]


[lib]
crate-type = ["cdylib"]

The key here is crate-type = ["cdylib"], which says that we want this crate to be compiled as a C dynamic library. Now we can compile it with the following command. The -p argument makes sure cargo builds the plugin no matter which directory in the workspace we run it from.

cargo build -p example-plugin --target wasm32-unknown-unknown

At this point we should have a file in ./target/wasm32-unknown-unknown/debug/example_plugin.wasm. Now that we have that, let's build a program that will run it. First we will get our dependencies all set up.

Our First Runner

# ./crates/example-runner/Cargo.toml
[package]
name = "example-runner"
version = "0.1.0"
authors = ["freemasen <r@wiredforge.com>"]
edition = "2018"

[dependencies]
wasmer-runtime = "0.3.0"

Here we are adding the wasmer-runtime crate, which we will use to interact with our Web Assembly module.

// ./crates/example-runner/src/main.rs
use wasmer_runtime::{
    imports,
    instantiate,
};
// For now we are going to use this to read in our Wasm bytes
static WASM: &[u8] = include_bytes!("../../../target/wasm32-unknown-unknown/debug/example_plugin.wasm");

fn main() {
    // Instantiate the web assembly module
    let instance = instantiate(WASM, &imports!{}).expect("failed to instantiate Wasm module");
    // Bind the add function from the module
    let add = instance.func::<(i32, i32), i32>("add").expect("failed to bind function add");
    // execute the add function
    let three = add.call(1, 2).expect("failed to execute add");
    println!("three: {}", three); // "three: 3"
}

First, we have our use statement, where we are just grabbing 2 things: the imports macro for easily defining our import object and the instantiate function for converting bytes into a Web Assembly module instance. We are going to use the include_bytes! macro for now to read our bytes but eventually we will want to make this a little more flexible. Inside of our main we call instantiate with the Wasm bytes as the first argument and an empty imports object as the second. Next we use the func method on instance to bind the function add, giving it the argument types of two i32s and a return value of an i32. At this point we can use the call method on the function add and print the result to the terminal. When we cargo run (from the example-runner directory, or with -p example-runner from anywhere in the workspace) it should successfully print three: 3 in the terminal.

Huzzah, success! But that isn't super useful. Let's investigate what we would need to make it more useful.

Digging Deeper

Our requirements

  1. Access to the Wasm Memory before our function runs
  2. A way to insert a more complicated data structure into that memory
  3. A method to communicate where and what the data is to the Wasm module
  4. A system for extracting the update information from the Wasm memory after the plugin is executed

First we need a way to initialize some value into the Wasm module's memory before we run our function. Thankfully wasmer_runtime gives us a way to do exactly that. Let's update our example to take in a string and return the length of that string. This isn't going to be much more useful than our last example but... baby steps.

Our Second Plugin


// ./crates/example-plugin/src/lib.rs

/// This is the actual code we would
/// write if this was a pure rust
/// interaction
pub fn length(s: &str) -> u32 {
    s.len() as u32
}

/// Since it isn't we need a way to
/// translate the data from wasm
/// to rust
#[no_mangle]
pub fn _length(ptr: i32, len: u32) -> u32 {
    // Extract the string from memory.
    let value = unsafe {
        let slice = ::std::slice::from_raw_parts(ptr as _, len as _);
        String::from_utf8_lossy(slice)
    };
    // pass the value to `length` and return the result
    length(&value)
}

There is quite a bit more that we needed to do this time around, so let's go over what is happening. First we have defined a function length, which is exactly what we would want to write if we were using this library from another rust program. Since we are using this library as a Wasm module, we need to add a helper that will deal with all of the memory interactions. This may seem like an odd structure but doing it this way allows for additional flexibility which will become more clear as we move forward. The _length function is going to be that helper. First, we need the arguments and return values to match what is available when crossing the Wasm boundary (only numbers). Our arguments then will describe the shape of our string: ptr is the start of the string and len is the length. Since we are dealing with raw memory, we need to do the conversion inside of an unsafe block (I know that is a bit scary but we are going to make sure that there actually is a string there in the runner). Once we pull the string out of memory, we can pass it over to length just like normal, returning the result. Go ahead and build it just like before.

cargo build -p example-plugin --target wasm32-unknown-unknown
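
Before moving on to the runner, it can help to try the pointer-and-length decoding natively, with no Wasm involved. The following is only a sketch: length_from_parts is a hypothetical stand-in for _length that takes a real pointer and a usize instead of the i32 and u32 that cross the Wasm boundary.

```rust
use std::borrow::Cow;
use std::slice;

/// Same logic as the plugin's `length`.
pub fn length(s: &str) -> u32 {
    s.len() as u32
}

/// Hypothetical native stand-in for `_length`: rebuilds the string
/// from a raw pointer and a byte count, then forwards it to `length`.
///
/// Safety: `ptr` must point at `len` readable bytes that stay alive
/// for the duration of the call.
pub unsafe fn length_from_parts(ptr: *const u8, len: usize) -> u32 {
    let bytes: &[u8] = unsafe { slice::from_raw_parts(ptr, len) };
    // Invalid UTF-8 is replaced rather than rejected, as in the plugin
    let value: Cow<str> = String::from_utf8_lossy(bytes);
    length(&value)
}

fn main() {
    let s = "supercalifragilisticexpialidocious";
    // A pointer plus a byte length is all that crosses the boundary
    let n = unsafe { length_from_parts(s.as_ptr(), s.len()) };
    println!("{}", n); // prints 34
}
```

The caller is the one promising that the pointer and length describe real bytes, which is exactly the promise our runner has to keep when it fills the Wasm memory.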

Now let's cover how we would set this up in the runner.

// ./crates/example-runner/src/main.rs
use wasmer_runtime::{
    imports,
    instantiate,
};

// For now we are going to use this to read in our Wasm bytes
static WASM: &[u8] = include_bytes!("../../../target/wasm32-unknown-unknown/debug/example_plugin.wasm");

fn main() {
    let instance = instantiate(WASM, &imports!{}).expect("failed to instantiate Wasm module");
    // The changes start here
    // First we get the module's context
    let context = instance.context();
    // Then we get memory 0 from that context
    // web assembly only supports one memory right
    // now so this will always be 0.
    let memory = context.memory(0);
    // Now we can get a view of that memory
    let view = memory.view::<u8>();
    // This is the string we are going to pass into wasm
    let s = "supercalifragilisticexpialidocious".to_string();
    // This is the string as bytes
    let bytes = s.as_bytes();
    // Our length of bytes
    let len = bytes.len();
    // loop over the Wasm memory view's bytes
    // and also the string bytes
    for (cell, byte) in view[1..len + 1].iter().zip(bytes.iter()) {
        // set each Wasm memory byte to 
        // be the value of the string byte
        cell.set(*byte)
    }
    // Bind our helper function
    let length = instance.func::<(i32, u32), u32>("_length").expect("Failed to bind _length");
    let wasm_len = length.call(1, len as u32).expect("Failed to execute _length");
    println!("original: {}, wasm: {}", len, wasm_len); // original: 34, wasm: 34
}

Ok, there is quite a bit more going on this time around. The first few lines are going to be exactly the same: we read in the Wasm and then instantiate it. Once that is done, we are going to get a view into the Wasm memory, which we do by first getting the Ctx (context) from the module instance. Once we have the context we can pull out the memory by calling memory(0). Web assembly only has one memory currently so in the short term this will always take the value 0, but moving forward there may be more than one memory allowed. One last step to actually get the raw memory is to call the view() method, and we are finally at a stage where we can modify the module's memory. The view behaves like a slice of Cell<u8>, so we have a sequence of bytes but each of the bytes is wrapped in a Cell. A Cell, according to the documentation, is a way to allow mutating one part of an immutable value. In our case it is essentially saying "I'm not going to make this memory any longer or shorter, just change what its values are".
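
To see what the Cell wrapper buys us, here is a small sketch that uses a plain Vec<Cell<u8>> as a stand-in for the Wasm memory view. The write_bytes and read_bytes helpers are hypothetical names for illustration and are not part of wasmer_runtime.

```rust
use std::cell::Cell;

/// Copy `bytes` into `view` starting at `offset`.
/// Note that `view` is a shared reference, not a mutable one.
fn write_bytes(view: &[Cell<u8>], offset: usize, bytes: &[u8]) {
    for (cell, byte) in view[offset..offset + bytes.len()].iter().zip(bytes) {
        cell.set(*byte);
    }
}

/// Read `len` bytes back out of `view` starting at `offset`.
fn read_bytes(view: &[Cell<u8>], offset: usize, len: usize) -> Vec<u8> {
    view[offset..offset + len].iter().map(Cell::get).collect()
}

fn main() {
    // Hypothetical stand-in for the module's memory: 64 zeroed bytes
    let memory: Vec<Cell<u8>> = vec![Cell::new(0); 64];
    write_bytes(&memory, 1, b"hello");
    let copied = read_bytes(&memory, 1, 5);
    println!("{}", String::from_utf8_lossy(&copied)); // prints hello
}
```

Neither helper can grow or shrink the memory, they can only change the values that are already there.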

Now we define the string we want to pass into the Wasm memory and convert that to bytes. We also want to keep track of the byte length of that string so we capture that as len. To put the string bytes into the memory bytes we are going to use the Zip iterator, which just lets us loop over two things at one time. In each iteration of our loop, we are going to stop at both the cell and the string byte in the same index, in the loop body we are setting the value of the Wasm memory byte to the value of the string's byte. Notice that we started at index 1 in the view, that means our ptr parameter is going to be 1 and our byte length is going to be the len parameter.

cargo run
original: 34, wasm: 34

Huzzah! Success again! But alas, still pretty useless. It does however give us a good foundation to build upon for working with more complicated data. We saw how to interact with the Wasm memory on both sides of the equation which we will exploit in part 2.

Part Two

If you haven't seen it yet, you may want to check out part one, where we went over the basics of using wasmer. In this post we are going to cover how we could pass more complicated data from the Wasm module back to the runner.

Yet Another Plugin

To start we are going to create another plugin, this one will take a string as an argument and return that string doubled. Here is what that plugin would look like.


// ./crates/example-plugin/src/lib.rs

/// This is the actual code we would
/// write if this was a pure rust
/// interaction
pub fn double(s: &str) -> String {
    s.repeat(2)
}

/// Since it isn't we need a way to
/// translate the data from wasm
/// to rust
#[no_mangle]
pub fn _double(ptr: i32, len: u32) -> i32 {
    // Extract the string from memory.
    let value = unsafe {
        let slice = ::std::slice::from_raw_parts(ptr as _, len as _);
        String::from_utf8_lossy(slice)
    };
    // pass the value to `double`
    let ret = double(&value);
    // capture the start of the result as a pointer
    let start = ret.as_ptr() as i32;
    // leak the string so its bytes are not freed
    // before the runner has read them
    ::std::mem::forget(ret);
    start
}

Most of what is going on here is exactly what we did the last time. The differences are at the end: the return value is now an i32 that we get from .as_ptr(), and we call ::std::mem::forget on the string. as_ptr is a method that returns the address of a value's first byte, which inside a Wasm module is an index into its memory. Normally that would be a pretty scary thing to deal with but I promise that we are going to survive. The call to forget keeps the string from being freed when _double returns; without it the runner would be reading memory that the plugin is free to reuse. So how would we use this new plugin?

// ./crates/example-runner/src/main.rs
use wasmer_runtime::{
    imports,
    instantiate,
};

// For now we are going to use this to read in our Wasm bytes
static WASM: &[u8] = include_bytes!("../../../target/wasm32-unknown-unknown/debug/example_plugin.wasm");

fn main() {
    let instance = instantiate(WASM, &imports!{}).expect("failed to instantiate Wasm module");
    // The changes start here
    // First we get the module's context
    let context = instance.context();
    // Then we get memory 0 from that context
    // web assembly only supports one memory right
    // now so this will always be 0.
    let memory = context.memory(0);
    // Now we can get a view of that memory
    let view = memory.view::<u8>();
    // This is the string we are going to pass into wasm
    let s = "supercalifragilisticexpialidocious".to_string();
    // This is the string as bytes
    let bytes = s.as_bytes();
    // Our length of bytes
    let len = bytes.len();
    // loop over the Wasm memory view's bytes
    // and also the string bytes
    for (cell, byte) in view[1..len + 1].iter().zip(bytes.iter()) {
        // set each Wasm memory byte to 
        // be the value of the string byte
        cell.set(*byte)
    }
    // Bind our helper function
    let double = instance.func::<(i32, u32), i32>("_double").expect("Failed to bind _double");
    // Call the helper function and store the start of the returned string
    let start = double.call(1 as i32, len as u32).expect("Failed to execute _double") as usize;
    // Calculate the end as the start + twice the length
    let end = start + (len * 2);
    // Capture the string as bytes 
    // from a fresh view of the Wasm memory
    let string_buffer: Vec<u8> = memory
                                    .view()[start..end]
                                    .iter()
                                    .map(|c|c.get())
                                    .collect();
    // Convert the bytes to a string
    let wasm_string = String::from_utf8(string_buffer)
                            .expect("Failed to convert Wasm memory to string");
    println!("doubled: {}", wasm_string);
}

Again, almost all of this is going to be reused from the last example. We need to change the type arguments for func ever so slightly and the name of the function. Next we are going to call the func just like we did the last time, this time the return value is going to represent the index for the start of our new string. Since we will only ever double the string we can calculate the end by adding twice the original length plus the start, with both the start and the end we can capture the bytes as a slice. If you have the bytes as a slice you can try and convert it into a string using the String::from_utf8 method. If we were to run this we should see the following.

cargo run
doubled: supercalifragilisticexpialidocioussupercalifragilisticexpialidocious

Huzzah! Success... though the situations where you would know the size of any data after a plugin ran are going to be too few to be useful. Now the big question becomes: if Web Assembly functions can only return 1 value, how could we possibly know both the start and the length of any value coming back? One solution would be to reserve a section of memory that the Wasm module could put the length in, and then have the runner read the length from there when the function is done.

Two values from one function

Let's keep the same basic structure of our last plugin, this time though, we are going to get the length from a reserved part of memory.


pub fn double(s: &str) -> String {
    s.repeat(2)
}

#[no_mangle]
pub fn _double(ptr: i32, len: u32) -> i32 {
    // Extract the string from memory.
    let value = unsafe {
        let slice = ::std::slice::from_raw_parts(ptr as _, len as _);
        String::from_utf8_lossy(slice)
    };
    // Double it
    let ret = double(&value);
    // Capture the length
    let len = ret.len() as u32;
    // write the length to bytes 1 through 4 in memory;
    // address 1 is not 4 byte aligned so we need
    // the unaligned version of write
    unsafe {
        ::std::ptr::write_unaligned(1 as *mut u32, len);
    }
    // capture the start index
    let start = ret.as_ptr() as i32;
    // leak the string so its bytes outlive this call
    ::std::mem::forget(ret);
    start
}

This time in our plugin we have one main change, the call to ::std::ptr::write_unaligned, which will write any value you want to any place in memory you tell it to. This is a pretty dangerous thing to do, so it is important that we have all our ducks in a row or we may corrupt some existing memory. We use the unaligned version because a u32 normally wants to live at an address that is a multiple of 4 and index 1 is not. This is going to write the 4 bytes that make up the variable len into memory at index 1, 2, 3, and 4. The key to making that work is that we are going to need to leave those 4 bytes empty when we insert our value from the runner. We also call ::std::mem::forget on the string so that its bytes are not freed when _double returns and are still there when the runner goes looking for them.
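
The reserved-bytes idea can be tried out without any Wasm at all. This is a sketch under a few assumptions: a Vec<Cell<u8>> stands in for the module's memory, and store_len, load_len and LEN_OFFSET are hypothetical names. It uses little-endian byte order because that is the order Web Assembly memory uses.

```rust
use std::cell::Cell;

/// Where the length lives: bytes 1, 2, 3 and 4 of memory.
const LEN_OFFSET: usize = 1;

/// What the plugin side does: store a `u32` as 4 little-endian bytes.
fn store_len(memory: &[Cell<u8>], len: u32) {
    let raw = len.to_le_bytes();
    for (cell, byte) in memory[LEN_OFFSET..LEN_OFFSET + 4].iter().zip(raw.iter()) {
        cell.set(*byte);
    }
}

/// What the runner side does: read the 4 bytes back into a `u32`.
fn load_len(memory: &[Cell<u8>]) -> u32 {
    let mut raw = [0u8; 4];
    for (i, cell) in memory[LEN_OFFSET..LEN_OFFSET + 4].iter().enumerate() {
        raw[i] = cell.get();
    }
    u32::from_le_bytes(raw)
}

fn main() {
    // Hypothetical stand-in for the module's memory: 16 zeroed bytes
    let memory: Vec<Cell<u8>> = vec![Cell::new(0); 16];
    store_len(&memory, 68);
    println!("{}", load_len(&memory)); // prints 68
}
```

As long as both sides agree on the offset and the byte order, the function's single return value is free to carry the start index.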

Let's build that.

cargo build -p example-plugin --target wasm32-unknown-unknown

Now we can get started on the runner.

// ./crates/example-runner/src/main.rs
use wasmer_runtime::{
    imports,
    instantiate,
};

// For now we are going to use this to read in our Wasm bytes
static WASM: &[u8] = include_bytes!("../../../target/wasm32-unknown-unknown/debug/example_plugin.wasm");

fn main() {
    let instance = instantiate(WASM, &imports!{}).expect("failed to instantiate Wasm module");
    // The changes start here
    // First we get the module's context
    let context = instance.context();
    // Then we get memory 0 from that context
    // web assembly only supports one memory right
    // now so this will always be 0.
    let memory = context.memory(0);
    // Now we can get a view of that memory
    let view = memory.view::<u8>();
    // Zero out bytes 1 through 4 of memory
    for cell in view[1..5].iter() {
        cell.set(0);
    }
    // This is the string we are going to pass into wasm
    let s = "supercalifragilisticexpialidocious".to_string();
    // This is the string as bytes
    let bytes = s.as_bytes();
    // Our length of bytes
    let len = bytes.len();
    // loop over the Wasm memory view's bytes
    // and also the string bytes
    for (cell, byte) in view[5..len + 5].iter().zip(bytes.iter()) {
        // set each Wasm memory byte to 
        // be the value of the string byte
        cell.set(*byte)
    }
    // Bind our helper function
    let double = instance.func::<(i32, u32), i32>("_double").expect("Failed to bind _double");
    // Call the helper function and store the start of the returned string
    let start = double.call(5 as i32, len as u32).expect("Failed to execute _double") as usize;
    // Get an updated view of memory
    let new_view = memory.view::<u8>();
    // Setup the 4 bytes that will be converted
    // into our new length
    let mut new_len_bytes = [0u8;4];
    for i in 0..4 {
        // attempt to get i+1 from the memory view (1,2,3,4)
        // If we can, return the value it contains, otherwise
        // default back to 0
        new_len_bytes[i] = new_view.get(i + 1).map(|c| c.get()).unwrap_or(0);
    }
    // Convert the 4 bytes into a u32 and cast to usize;
    // Wasm memory is always little-endian
    let new_len = u32::from_le_bytes(new_len_bytes) as usize;
    // Calculate the end as the start + new length
    let end = start + new_len;
    // Capture the string as bytes 
    // from the new view of the Wasm memory
    let string_buffer: Vec<u8> = new_view[start..end]
                                    .iter()
                                    .map(|c|c.get())
                                    .collect();
    // Convert the bytes to a string
    let wasm_string = String::from_utf8(string_buffer)
                            .expect("Failed to convert Wasm memory to string");
    println!("doubled: {}", wasm_string);
}

Ok, a few more things are going on in this one. First we immediately set the memory's bytes 1 through 4 to 0, since this is where the plugin is going to put the new length. We continue normally until after we call _double. This time through we pull those 4 bytes out of the Wasm memory into a 4 byte array and convert that to a u32. We need to cast this u32 to a usize because we are going to be using it as an index later. We can now update our end to use this new value instead of the old one. From that point on we keep going the same way. If we were to run this we should see the following.

cargo run
doubled: supercalifragilisticexpialidocioussupercalifragilisticexpialidocious

Huzzah! Success... and it is far more robust than before. If we executed a Wasm module that exported a _double that actually tripled a string or cut the string in half, we would still know the correct length. Now that we can pass arbitrary sets of bytes from rust to Wasm and back again, we have the tools to pass more complicated data. All we need now is a way to turn any struct into bytes and then back again. For that we can use something like bincode, which is a binary serialization format used by WebRender and Servo's ipc-channel. It implements the traits defined by the serde crate, which greatly opens our options.

Since there are a bunch of serde trait implementations for a bunch of standard rust types including strings and tuples, let's leverage that to create a slightly more interesting example.
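
To make "turn a value into bytes and back again" concrete before reaching for a library, here is a hand-rolled sketch for a (u8, String) tuple. The layout is an assumption made up for this illustration and is not meant to match what bincode produces; encode and decode are hypothetical helpers.

```rust
use std::convert::TryInto;

/// Encode as: 1 byte for the number, 4 little-endian bytes for the
/// string's byte length, then the string's UTF-8 bytes.
fn encode(pair: &(u8, String)) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + 4 + pair.1.len());
    out.push(pair.0);
    out.extend_from_slice(&(pair.1.len() as u32).to_le_bytes());
    out.extend_from_slice(pair.1.as_bytes());
    out
}

/// Decode the layout written by `encode`,
/// or return `None` if the bytes are malformed.
fn decode(bytes: &[u8]) -> Option<(u8, String)> {
    let number = *bytes.get(0)?;
    let len_bytes: [u8; 4] = bytes.get(1..5)?.try_into().ok()?;
    let len = u32::from_le_bytes(len_bytes) as usize;
    let text = bytes.get(5..5usize.checked_add(len)?)?;
    Some((number, String::from_utf8(text.to_vec()).ok()?))
}

fn main() {
    let pair = (3u8, "abc".to_string());
    let bytes = encode(&pair);
    println!("{:?}", bytes); // prints [3, 3, 0, 0, 0, 97, 98, 99]
    println!("{:?}", decode(&bytes)); // prints Some((3, "abc"))
}
```

Writing this by hand for every type gets old quickly, which is the gap a serde based serializer fills for us.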

Slightly More Interesting™

First we want to update the dependencies for both our runner and plugin projects. Update the 2 Cargo.toml files to look like this.

# ./crates/example-runner/Cargo.toml
[package]
name = "example-runner"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
wasmer-runtime = "0.3.0"
bincode = "1"

# ./crates/example-plugin/Cargo.toml
[package]
name = "example-plugin"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
bincode = "1"

[lib]
crate-type = ["cdylib"]

Now we can use bincode in both of these projects. This time around, the goal is going to be to create a plugin that will take a tuple of a u8 and a string and return an updated version of that tuple.


// ./crates/example-plugin/src/lib.rs
use bincode::{deserialize, serialize};
/// This is the actual code we would
/// write if this was a pure rust
/// interaction
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    // create a repeated version of the string
    // based on the u8 provided
    let s = pair.1.repeat(pair.0 as usize);
    // Multiply the u8 by the length
    // of the new string
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}

/// Since it isn't we need a way to
/// translate the data from wasm
/// to rust
#[no_mangle]
pub fn _multiply(ptr: i32, len: u32) -> i32 {
    // Extract the serialized bytes from memory.
    let slice = unsafe {
        ::std::slice::from_raw_parts(ptr as *const u8, len as usize)
    };
    // deserialize the memory slice
    let pair = deserialize(slice).expect("Failed to deserialize tuple");
    // Get the updated version
    let updated = multiply(pair);
    // serialize the updated value
    let ret = serialize(&updated).expect("Failed to serialize tuple");
    // Capture the length
    let len = ret.len() as u32;
    // write the length to bytes 1 through 4 in memory;
    // address 1 is not 4 byte aligned so we use
    // the unaligned version of write
    unsafe {
        ::std::ptr::write_unaligned(1 as *mut u32, len);
    }
    // capture the start index
    let start = ret.as_ptr() as i32;
    // leak the bytes so they outlive this call
    ::std::mem::forget(ret);
    start
}

Just like last time we take in our ptr and len arguments and pass those along to ::std::slice::from_raw_parts, which creates a reference to our bytes. After we get those bytes we can deserialize them into a tuple of a u8 and a string. Now we can pass that tuple along to the multiply function and capture the results as updated. Next we serialize that value into a Vec<u8> and store it as the variable ret. The rest is going to be much like our string example: capture the length, write it to memory starting at index 1, leak the bytes with ::std::mem::forget so they are not freed when the function returns, and return the start index of the bytes. Let's build this.

cargo build -p example-plugin --target wasm32-unknown-unknown
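
Since multiply does its arithmetic in a u8, it is worth working through a couple of inputs by hand before wiring up the runner. The sketch below repeats the plugin's multiply function unchanged and runs it natively; nothing here touches Wasm or bincode.

```rust
/// Same logic as the plugin's `multiply`.
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    let s = pair.1.repeat(pair.0 as usize);
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}

fn main() {
    // The example string is 34 bytes long
    let word = "supercalifragilisticexpialidocious".to_string();
    // 2 copies is 68 bytes, and 2 * 68 = 136 still fits in a u8
    let (u, s) = multiply((2, word.clone()));
    println!("{} {}", u, s.len()); // prints 136 68
    // 10 copies is 340 bytes; 340 as u8 is 84, and 10 * 84 = 840 wraps to 72
    let (u, s) = multiply((10, word));
    println!("{} {}", u, s.len()); // prints 72 340
}
```

So for larger multipliers the first element of the tuple wraps around rather than overflowing, which is why the function uses wrapping_mul.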

Now for our runner.

// ./crates/example-runner/src/main.rs
use wasmer_runtime::{
    imports,
    instantiate,
};

use std::time::{
    UNIX_EPOCH,
    SystemTime,
};

use bincode::{
    deserialize,
    serialize,
};

// For now we are going to use this to read in our Wasm bytes
static WASM: &[u8] = include_bytes!("../../../target/wasm32-unknown-unknown/debug/example_plugin.wasm");

fn main() {
    let instance = instantiate(WASM, &imports!{}).expect("failed to instantiate Wasm module");
    // The changes start here
    // First we get the module's context
    let context = instance.context();
    // Then we get memory 0 from that context
    // web assembly only supports one memory right
    // now so this will always be 0.
    let memory = context.memory(0);
    // Now we can get a view of that memory
    let view = memory.view::<u8>();
    // Zero out bytes 1 through 4 of memory
    for cell in view[1..5].iter() {
        cell.set(0);
    }
    // This is the string we are going to pass into wasm
    let s = "supercalifragilisticexpialidocious".to_string();
    let now = SystemTime::now();
    let diff = now.duration_since(UNIX_EPOCH).expect("Failed to calculate timestamp");
    let u = ((diff.as_millis() % 10) + 1) as u8;
    let pair = (u, s);
    let bytes = serialize(&pair).expect("Failed to serialize tuple");
    // Our length of bytes
    let len = bytes.len();
    // loop over the Wasm memory view's bytes
    // and also the string bytes
    for (cell, byte) in view[5..len + 5].iter().zip(bytes.iter()) {
        // set each Wasm memory byte to 
        // be the value of the string byte
        cell.set(*byte)
    }
    // Bind our helper function
    let multiply = instance.func::<(i32, u32), i32>("_multiply").expect("Failed to bind _multiply");
    // Call the helper function and store the start of the returned bytes
    let start = multiply.call(5 as i32, len as u32).expect("Failed to execute _multiply") as usize;
    // Get an updated view of memory
    let new_view = memory.view::<u8>();
    // Setup the 4 bytes that will be converted
    // into our new length
    let mut new_len_bytes = [0u8;4];
    for i in 0..4 {
        // attempt to get i+1 from the memory view (1,2,3,4)
        // If we can, return the value it contains, otherwise
        // default back to 0
        new_len_bytes[i] = new_view.get(i + 1).map(|c| c.get()).unwrap_or(0);
    }
    // Convert the 4 bytes into a u32 and cast to usize;
    // Wasm memory is always little-endian
    let new_len = u32::from_le_bytes(new_len_bytes) as usize;
    // Calculate the end as the start + new length
    let end = start + new_len;
    // Capture the string as bytes 
    // from the new view of the Wasm memory
    let updated_bytes: Vec<u8> = new_view[start..end]
                                    .iter()
                                    .map(|c|c.get())
                                    .collect();
    // Convert the bytes back into a tuple
    let updated: (u8, String) = deserialize(&updated_bytes)
                            .expect("Failed to convert Wasm memory to tuple");
    println!("multiply {}: ({}, {:?})", pair.0, updated.0, updated.1);
}

First, we have updated our use statements to include some std::time items and the bincode functions for serializing and deserializing. We are going to use the same string as we did last time and calculate a pseudo random number between 1 and 10 that will serve as the parts of our tuple. Once we have constructed our tuple, we pass that off to bincode::serialize which gets us back to a Vec<u8>. We continue on just like our string example until after we get the new length back from the Wasm module. At this point we are going to build the updated_bytes the same as before and pass those along to bincode::deserialize which should get us back to a tuple.

cargo run
multiply 2: (136, "supercalifragilisticexpialidocioussupercalifragilisticexpialidocious")

Huzzah! Another success! At this point it might be a good idea to address the ergonomics of all of this. If we asked another developer to understand all of this, do you think anyone would build a plugin for our system? Probably not. In the next post we are going to cover how to ease that process by leveraging proc_macros.

Part Three

Using Wasmer for Plugins Part 3 (2019-04-22)

In the last two posts of this series we covered all of the things we would need to use Wasmer as the base for a plugin system. In part one we went over the basics of passing simple data in and out of a web assembly module, in part two we dug deeper into how you might do the same with more complicated data. In this part we are going to explore how we might ease the experience for people developing plugins for our application.

The majority of this is going to happen in a proc_macro. If you have never built one of these before it can seem intimidating, but we will go slow so don't fret. The first thing to understand is that proc_macros are meta-programming, meaning we are writing code that writes code. Currently there are 3 options to choose from when writing a proc_macro but they all follow the same basic structure: a function that will take TokenStreams as arguments and return a TokenStream. A TokenStream is a collection of rust language parts, for example a keyword like fn or punctuation like {. It is almost like we are getting the text from a source file and returning a modified version of that text, with the added benefit that rustc has already validated that it at least recognizes all of the parts in that text, and it will only let us add parts that it recognizes. To make this whole process a little easier, we are going to lean on a few crates pretty heavily: syn, proc-macro2, and quote. syn is going to parse the TokenStream into a structure that has more information; it will help answer questions like 'is this a function?' or 'is this function public?'. Many parts of that structure are provided by proc-macro2. quote is going to help us create a TokenStream by "quasi-quoting" some rust text; we'll get into what that means in just a moment.

Now that we have our dependencies outlined, let's talk about the three types of proc_macros. First we have a custom derive, if you have ever used the #[derive(Serialize)] attribute, you have used a custom derive. For these, we need to define a function that takes a single TokenStream argument and returns a new TokenStream, this return value will be appended to the one passed in. That means we can't modify the original code, only augment it with something like an impl block, which makes it great for deriving a trait. Another option is often referred to as function like macros, these look just like the macros created with macro_rules! when used but are defined using a similar system to the custom derives. The big difference between custom derives and function like macros is that the return value for the latter is going to replace the argument provided, not extend it. Lastly we have the attribute like macros, this is the one we are going to use. Attribute macros work the same as function like macros in that they will replace the code provided. The big difference is that for an attribute the function we write will take 2 arguments, the first of which will be the contents of the attribute and the second is what that attribute is sitting on top of. To use the example from the rust book:


#[route(GET, "/")]
fn index() {

}

The first argument is going to include GET and "/" and the second will contain the function index. With that basic structure defined, let's get started with our example. We are going to be making these edits in the example-macro project we added in part 1. Let's get those dependencies listed in the Cargo.toml.

# ./crates/example-macro/Cargo.toml
[package]
name = "example-macro"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
quote = "0.6"
proc-macro2 = "0.4"
syn = { version = "0.15", features = ["full"] }

[lib]
proc-macro = true

A few things to note here, first syn is pretty heavily feature gated, for this we want to add the "full" feature which will allow us to use all of the different types defined there. The next thing to point out is in the [lib] table we are going to add proc-macro = true to tell cargo that this crate will only contain a proc_macro. Currently proc_macros need to be defined in their own crates. With that out of the way we can get started editing our lib.rs.


// ./crates/example-macro/src/lib.rs
extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn plugin_helper(_attr: TokenStream, tokens: TokenStream) -> TokenStream {
    tokens
}

First, we need to declare the use of the proc_macro crate that rust provides. Next we are going to use the TokenStream that is provided there. Our exported function is going to start with the #[proc_macro_attribute] attribute which will mark this function as an attribute with the same name. This function needs to take two arguments, both with the type TokenStream and return a TokenStream, just like we went over before. In this example we are just going to return the same value we were provided. Let's use our example-plugin project to see what it does. First we need to make sure that our macro is in the dependencies.

# ./crates/example-plugin/Cargo.toml
[package]
name = "example-plugin"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
bincode = "1"
example-macro = { path = "../example-macro" }

[lib]
crate-type = ["cdylib"]

Then we can use it like this.


// ./crates/example-plugin/src/lib.rs
use bincode::{deserialize, serialize};
use example_macro::*;
/// This is the actual code we would 
/// write if this was a pure rust
/// interaction
#[plugin_helper]
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    // create a repeated version of the string
    // based on the u8 provided
    let s = pair.1.repeat(pair.0 as usize);
    // Multiply the u8 by the length
    // of the new string
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}

But... how can we see anything about this? We could cargo build to see if that works but that doesn't provide us much information. Thankfully there is a great 3rd party cargo command called cargo-expand that will help us out a ton. This utility relies on the nightly toolchain so we are going to need to get that first via rustup. To make things easier for later, let's also get the Wasm target for the nightly toolchain.

rustup toolchain add nightly
rustup target add wasm32-unknown-unknown --toolchain nightly

With that taken care of we can now install cargo-expand.

cargo install cargo-expand

If we were to run the following command it should print our expanded library to the console.

cd crates/example-plugin
cargo +nightly expand -p example-plugin

As a side note, if you have an older version of cargo-expand installed it may not have the -p flag implemented, you can upgrade your version to current by running cargo install --force cargo-expand or simply run it from crates/example-plugin.


#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::v1::*;
#[macro_use]
extern crate std as std;
use bincode::{deserialize, serialize};
use example_macro::*;
#[doc = " This is the actual code we would "]
#[doc = " write if this was a pure rust"]
#[doc = " interaction"]
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    let s = pair.1.repeat(pair.0 as usize);
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}

This is the fully expanded output of our library. Not much has changed, except that we can see a few things that rust will always do to our program, like convert our doc comments to attributes. Now let's update our proc_macro to do something a little more interesting.


// ./crates/example-macro/src/lib.rs
extern crate proc_macro;
use proc_macro::TokenStream;
use syn::{
   Item as SynItem,
};
use proc_macro2::{
   Ident,
   Span,
};
use quote::quote;

#[proc_macro_attribute]
pub fn plugin_helper(_attr: TokenStream, tokens: TokenStream) -> TokenStream {
    // convert the TokenStream into proc_macro2::TokenStream
    let tokens2 = proc_macro2::TokenStream::from(tokens);
    // parse the TokenStream into a syn::Item
    let parse2 = syn::parse2::<SynItem>(tokens2).expect("Failed to parse tokens");
    // Check if it is a function
    // if not panic
    match parse2 {
        SynItem::Fn(func) => handle_func(func),
        _ => panic!("Only functions are currently supported")
    }
}

fn handle_func(func: syn::ItemFn) -> TokenStream {
    // Copy the function's identifier
    let ident = func.ident.clone();
    // Create a new identifier with an underscore in front of 
    // the original identifier
    let shadows_ident = Ident::new(&format!("_{}", ident), Span::call_site());
    // Generate some rust with the original and new
    // shadowed function
    let ret = quote! {
        #func

        pub fn #shadows_ident() {
            #ident((2, String::from("attributed")));
        }
    };
    ret.into()
}

This time around we are first converting the TokenStream into the proc_macro2::TokenStream which will allow us to parse the tokens. The result of that is a syn::Item which is an enum of all the different types of rust Items and will allow us to determine exactly what our attribute is decorating. For us, we only want this to work on functions, so we match parse2, if it is a fn we pass the inner data off to handle_func if not, we panic with a message about only supporting fns.

Inside of handle_func we first make a copy of the original function's identifier, for our example that would be multiply. Next we are going to use that copy to create a new identifier that will have an underscore at the start: _multiply. To do this we are going to use the proc_macro2::Ident constructor which takes a &str and a Span (the location in the source that this token occupies). We are going to use the format! macro for the first argument, and thankfully proc_macro2::Span provides the call_site constructor, which will figure out the location for us. At this point we are going to use the quote::quote macro to generate a new proc_macro2::TokenStream. This is where that quasi quoting happens, we can use the #variable_name syntax to insert a variable's value into some raw text representing a rust program. First we want to put the original function as it was defined at the top, then we want to create a new function with our _multiply identifier, the body of which will just call the original function with a constant set of arguments. Let's look at the expanded output.

cargo +nightly expand -p example-plugin

#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::v1::*;
#[macro_use]
extern crate std as std;
// ./crates/example-plugin/src/lib.rs
use bincode::{deserialize, serialize};
use example_macro::*;

#[doc = " This is the actual code we would "]
#[doc = " write if this was a pure rust"]
#[doc = " interaction"]
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    // create a repeated version of the string
    // based on the u8 provided
    let s = pair.1.repeat(pair.0 as usize);
    // Multiply the u8 by the length
    // of the new string
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}
pub fn _multiply() { multiply((2, String::from("attributed"))); }


Another relatively useless transformation, but we did successfully generate some code with our macro. Now let's get back to our actual goal. If we look back at part 2's last helper function, our end goal is to replicate the following.


#[no_mangle]
pub fn _multiply(ptr: i32, len: u32) -> i32 {
    let slice = unsafe { 
        ::std::slice::from_raw_parts(ptr as _, len as _)
    };
    let pair = deserialize(slice).expect("Failed to deserialize tuple");
    let updated = multiply(pair);
    let ret = serialize(&updated).expect("Failed to serialize tuple");
    let len = ret.len() as u32;
    unsafe {
        ::std::ptr::write(1 as _, len);
    }
    ret.as_ptr() as _
}

We should be able to reproduce that function with our attribute if we just extend the last example a little.


// ./crates/example-macro/src/lib.rs
#![recursion_limit="128"]
extern crate proc_macro;
use proc_macro::TokenStream;

use syn::{Item as SynItem, ItemFn};
use quote::quote;
use proc_macro2::{Ident, Span};

#[proc_macro_attribute]
pub fn plugin_helper(_attr: TokenStream, tokens: TokenStream) -> TokenStream {
    let tokens2 = proc_macro2::TokenStream::from(tokens);
    let parse2 = syn::parse2::<SynItem>(tokens2).expect("Failed to parse tokens");
    match parse2 {
        SynItem::Fn(func) => handle_func(func),
        _ => panic!("Only functions are currently supported")
    }
}

fn handle_func(func: ItemFn) -> TokenStream {
    // Check and make sure our function takes
    // only one argument and panic if not
    if func.decl.inputs.len() != 1 {
        panic!("fns marked with plugin_helper can only take 1 argument");
    }
    // Copy this function's identifier
    let ident = func.ident.clone();
    // Create a new identifier with an underscore in front of 
    // the original identifier
    let shadows_ident = Ident::new(&format!("_{}", ident), Span::call_site());
    // Generate some code with the original and new
    // shadowed function
    let ret = quote! {
        #func

        #[no_mangle]
        pub fn #shadows_ident(ptr: i32, len: u32) -> i32 {
            let value = unsafe {
                ::std::slice::from_raw_parts(ptr as _, len as _)
            };
            let arg = deserialize(value).expect("Failed to deserialize argument");
            let ret = #ident(arg);
            let bytes = serialize(&ret).expect("Failed to serialize return value");
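            // the length is written as a u32, matching the
            // expanded output and the i32 return type
            let len = bytes.len() as u32;
            unsafe {
                ::std::ptr::write(1 as _, len);
            }
            bytes.as_ptr() as _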
        }
    };
    ret.into()
}

You may notice at the top we need to add the crate level attribute #![recursion_limit="128"], this is because quote does some seriously deep recursion to work its magic. The next change is to add a check that there is only one argument and panic if not, to simplify our plugins. We use the same scheme for generating a new identifier for our new function and then we really just ripped the code right out of the last example, replacing multiply(pair) with #ident(arg). If we run cargo expand on that we get the following.


#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::v1::*;
#[macro_use]
extern crate std as std;
// ./crates/example-plugin/src/lib.rs
use bincode::{deserialize, serialize};
use example_macro::*;
#[doc = " This is the actual code we would "]
#[doc = " write if this was a pure rust"]
#[doc = " interaction"]
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    // create a repeated version of the string
    // based on the u8 provided
    let s = pair.1.repeat(pair.0 as usize);
    // Multiply the u8 by the length
    // of the new string
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}
#[no_mangle]
pub fn _multiply(ptr: i32, len: u32) -> i32 {
    let value = unsafe { ::std::slice::from_raw_parts(ptr as _, len as _) };
    let arg = deserialize(value).expect("Failed to deserialize argument");
    let ret = multiply(arg);
    let bytes = serialize(&ret).expect("Failed to serialize return value");
    let len = bytes.len() as u32;
    unsafe { ::std::ptr::write(1 as _, len); }
    bytes.as_ptr() as _
}

Looks a lot like our last example from part 2!
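The part of that wrapper that is easiest to gloss over is how the length gets back to the host: the pointer is the return value, while the length is written into the 4 bytes starting at index 1. Here is a rough model of that handshake using a plain byte vector in place of Wasm memory. ToyMemory and its methods are hypothetical, none of this is wasmer's API, and it assumes the host reads the length back as a little-endian u32.

```rust
/// Pretend linear memory: a plain byte vector standing in for the
/// memory a Wasm instance would own. Everything in this block is a
/// hypothetical model, none of it is wasmer's API.
struct ToyMemory {
    bytes: Vec<u8>,
}

impl ToyMemory {
    fn new(size: usize) -> Self {
        ToyMemory { bytes: vec![0; size] }
    }

    /// What the plugin side does: place the payload somewhere in
    /// memory, record its length in the 4 bytes starting at index 1
    /// and hand back the payload's offset as the "pointer".
    /// `offset` needs to be 5 or more so it does not overlap the length.
    fn guest_return(&mut self, payload: &[u8], offset: usize) -> i32 {
        self.bytes[offset..offset + payload.len()].copy_from_slice(payload);
        let len = payload.len() as u32;
        // Wasm memory is little-endian
        self.bytes[1..5].copy_from_slice(&len.to_le_bytes());
        offset as i32
    }

    /// What the host side does: read the length from bytes 1..5 and
    /// then copy that many bytes starting at the returned pointer.
    fn host_read(&self, ptr: i32) -> Vec<u8> {
        let mut len_bytes = [0u8; 4];
        len_bytes.copy_from_slice(&self.bytes[1..5]);
        let len = u32::from_le_bytes(len_bytes) as usize;
        let start = ptr as usize;
        self.bytes[start..start + len].to_vec()
    }
}
```

Calling guest_return and then host_read with the returned pointer hands back the original payload, which is all the real wrapper and runner are doing with the serialized bytes.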

Let's try and compile that to Wasm and execute the runner.

cargo build -p example-plugin --target wasm32-unknown-unknown
cargo run -p example-runner
multiply 10: (72, "supercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocious")
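If the 72 in that output looks odd, it falls out of the u8 arithmetic in multiply. The sketch below copies the plugin's function and spells the steps out, assuming the runner passed 10 and the 34 character word, which is what the printed string suggests. first_element is a helper name made up for this note.

```rust
/// Same body as the plugin's `multiply`, copied here so the
/// arithmetic can be checked without any Wasm involved.
fn multiply(pair: (u8, String)) -> (u8, String) {
    let s = pair.1.repeat(pair.0 as usize);
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}

/// The two steps that produce the first element, spelled out.
fn first_element(count: u8, word: &str) -> u8 {
    // length of the repeated string, e.g. 34 * 10 = 340
    let full_len = word.len() * count as usize;
    // `as u8` keeps only the low 8 bits, e.g. 340 - 256 = 84
    let truncated = full_len as u8;
    // wrapping_mul also keeps only the low 8 bits,
    // e.g. 10 * 84 = 840 and 840 - 3 * 256 = 72
    count.wrapping_mul(truncated)
}
```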

Huzzah! It still works! We are still requiring that plugin developers know a little too much about the inner workings of our system though. Let's use the library we put in the workspace root to take care of this last little hurdle. Instead of importing the macro directly into the plugin, if we were to import it into our library, we would have a more convenient package to provide to plugin developers. We can also take care of our dependencies problem at the same time. Let's update that project to package all of our requirements for the plugin developer, starting with the dependencies.

# ./Cargo.toml
[package]
name = "wasmer-plugin-example"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
serde = "1"
bincode = "1"
example-macro = { path = "./crates/example-macro" }

[workspace]
members = [
    "./crates/example-macro",
    "./crates/example-plugin",
    "./crates/example-runner",
]

Now in that library we can use the pub use keywords to re-export our macro and also define a couple of helper functions.


// ./src/lib.rs
use serde::{Serialize, Deserialize};
use bincode::{serialize, deserialize};

pub use example_macro::plugin_helper;

pub fn convert_data<'a, D>(bytes: &'a [u8]) -> D 
where D: Deserialize<'a> {
    deserialize(bytes).expect("Failed to deserialize bytes")
}

pub fn revert_data<S>(s: S) -> Vec<u8> 
where S: Serialize {
    // bincode's serialize takes a reference
    serialize(&s).expect("Failed to serialize data")
}

We are essentially wrapping the bincode functions we are using in near-identical functions. It would probably be smarter to have these return results but for now this will do. The big win here is that our users will only need to import our library and not need to worry about having serde and bincode available. With those defined we can make a small update in the example-macro to use them.


// ./crates/example-macro/src/lib.rs
#![recursion_limit="128"]
extern crate proc_macro;
use proc_macro::TokenStream;

use syn::{Item as SynItem, ItemFn};
use quote::quote;
use proc_macro2::{Ident, Span};

#[proc_macro_attribute]
pub fn plugin_helper(_attr: TokenStream, tokens: TokenStream) -> TokenStream {
    let tokens2 = proc_macro2::TokenStream::from(tokens);
    let parse2 = syn::parse2::<SynItem>(tokens2).expect("Failed to parse tokens");
    match parse2 {
        SynItem::Fn(func) => handle_func(func),
        _ => panic!("Only functions are currently supported")
    }
}

fn handle_func(func: ItemFn) -> TokenStream {
    // Check and make sure our function takes
    // only one argument and panic if not
    if func.decl.inputs.len() != 1 {
        panic!("fns marked with plugin_helper can only take 1 argument");
    }
    // Copy this function's identifier
    let ident = func.ident.clone();
    // Create a new identifier with an underscore in front of 
    // the original identifier
    let shadows_ident = Ident::new(&format!("_{}", ident), Span::call_site());
    // Generate some rust with the original and new
    // shadowed function
    let ret = quote! {
        #func

        #[no_mangle]
        pub fn #shadows_ident(ptr: i32, len: u32) -> i32 {
            let value = unsafe {
                ::std::slice::from_raw_parts(ptr as _, len as _)
            };
            let arg = convert_data(value);
            let ret = #ident(arg);
            let bytes = revert_data(&ret);
            let len = bytes.len() as u32;
            unsafe {
                ::std::ptr::write(1 as _, len);
            }
            bytes.as_ptr() as _
        }
    };
    ret.into()
}

Now we need to point our plugin to the workspace root instead of the macro directly which means we can get rid of the bincode dependency.

# ./crates/example-plugin/Cargo.toml
[package]
name = "example-plugin"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
wasmer-plugin-example = { path = "../.." }

[lib]
crate-type = ["cdylib"]

With that updated we can now adjust the use statement to use wasmer_plugin_example::*


// ./crates/example-plugin/src/lib.rs
use wasmer_plugin_example::*;
/// This is the actual code we would 
/// write if this was a pure rust
/// interaction
#[plugin_helper]
pub fn multiply(pair: (u8, String)) -> (u8, String) {
    // create a repeated version of the string
    // based on the u8 provided
    let s = pair.1.repeat(pair.0 as usize);
    // Multiply the u8 by the length
    // of the new string
    let u = pair.0.wrapping_mul(s.len() as u8);
    (u, s)
}

Let's just double check that we haven't broken anything.

cargo build -p example-plugin --target wasm32-unknown-unknown
cargo run -p example-runner
multiply 5: (82, "supercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocioussupercalifragilisticexpialidocious")

Huzzah! It works and that looks a lot cleaner than before, now plugin developers don't need to worry about how we are doing what we do but instead can just focus on their task. In the next part, we are going to cover a real world example of how you might use this scheme to extend an application. We are going to focus on extending mdbook to allow web assembly plugins for preprocessing.

In the last three posts of this series we covered all of the things we would need to use Wasmer as the base for a plugin system. In part one we went over the basics of passing simple data in and out of a web assembly module, in part two we dug deeper into how you might do the same with more complicated data. In the last part we eased the experience of plugin developers by encapsulating all of our work into a library that exports a procedural macro. In this post we are going to explore what it would take to extend an existing plugin system to allow for Wasm plugins.

Enter MDBook

Before we get started with any code, we should first go over mdbook a little bit. If you are not familiar, mdbook is an application that enables its users to create books using markdown files and a toml file for configuration. You are probably familiar with the format because TRPL is built using it, and while HTML is probably the most popular output it has the ability to render into a few other formats. These other formats are provided through a plugin system which has two sides, preprocessors and renderers. Each side is aptly named, the preprocessors will get the information first and then the renderer will get the information last. Both types of plugins communicate with the main mdbook process via stdin and stdout. The basic workflow is that mdbook will read in the book and its contents from the file system, generate a struct that represents that book, serialize it to json and pipe it to a child process. If that child process is a preprocessor, it will deserialize, update, re-serialize and then pipe that back, if it is a renderer it will deserialize and then render that however it likes. At this point, we are going to focus on the preprocessor because Wasm isn't currently a great candidate for dealing with the file system or network and the preprocessor doesn't need any of that.
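Stripped of mdbook's types, a preprocessor is just a filter: read everything from stdin, change it, write it to stdout. A minimal sketch of that shape with only the standard library might look like the following. run_filter and update are names invented here, and the input is treated as plain text where a real preprocessor would parse json.

```rust
use std::io::{self, Read, Write};

/// Read everything the parent process piped in, hand it to `update`
/// and write the result back out. `run_filter` and `update` are
/// names made up for this sketch; a real preprocessor would parse
/// the input as json instead of treating it as plain text.
fn run_filter<R, W, F>(mut input: R, mut output: W, update: F) -> io::Result<()>
where
    R: Read,
    W: Write,
    F: Fn(&str) -> String,
{
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    output.write_all(update(&text).as_bytes())?;
    output.flush()
}
```

In a binary this would be called with io::stdin() and io::stdout(). Taking any reader and writer keeps it easy to exercise with in-memory buffers.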

In the official guide the mdbook team outlined the basic structure as being a struct that implements the trait Preprocessor, which requires two methods, name and run, and allows an optional method supports which by default returns true. The main entry point is the run method, which takes a PreprocessorContext and a Book and returns a Result<Book>. While this is a good way to explain what is needed, in actuality a preprocessor would look a little different. First, instead of a struct that implements a trait, it can just be a command line application that can support running with no arguments as well as with the supports argument. If the supports argument is provided, the application should use the exit status code to indicate if it does (0) or does not (1) support a particular renderer. If no argument was provided we would then deserialize the context and book provided from stdin (as a tuple). Once those two values are acquired, you can manipulate the book however you'd like and then serialize it and send that back out via stdout. Let's quickly look at what a preprocessor might look like if it just updates any "WASM" strings to "Wasm" (because Wasm isn't an acronym). For this example, we are going to update the runner. First we want to add a few more dependencies, namely mdBook, docopt, serde, serde_derive and serde_json.

# ./crates/example-runner/Cargo.toml
[package]
name = "mdbook-example-runner"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
wasmer-runtime = "0.3.0"
bincode = "1"
mdbook = { git = "https://github.com/rust-lang-nursery/mdBook" }
docopt = "1"
serde = "1"
serde_derive = "1"
serde_json = "1"

Two things to point out here, first is that we are updating the name of this program to have a prefix of mdbook- this is a requirement of any mdbook preprocessor, the other is that we are using mdbook as a git dependency. As of the writing of this post there is an issue with their handlebars dependency that would make the library fail to compile to Wasm. The next version of mdbook will not include this problem but for now, this example will need to work with the git repository instead of crates.io. We are going to use docopt for command line argument parsing but you could just as easily use clap, structopt or DIY it if you'd prefer.

As a note, this example is going to remove a lot of the wasmer-runtime stuff for readability (you may want to keep some of it around for later if you're typing along).

// ./crates/example-runner/src/main.rs
use docopt::Docopt;
use serde::Deserialize;
use serde_json::{
    from_reader, 
    to_writer,
};
use std::{
    process::exit,
    io::{
        stdin,
        stdout,
    }
};
use mdbook::{
    book::{
        Book,
        BookItem,
    },
    preprocess::PreprocessorContext,
};

static USAGE: &str = "
Usage:
    mdbook-wasm-preprocessor
    mdbook-wasm-preprocessor supports <supports>
";

#[derive(Deserialize)]
struct Opts {
    pub arg_supports: Option<String>,
}

fn main() {
    // Parse and deserialize command line
    // arguments
    let opts: Opts = Docopt::new(USAGE)
                    .and_then(|d| d.deserialize())
                    .unwrap_or_else(|e| e.exit());
    // If the arg supports was included
    // we need to handle that
    if let Some(_renderer_name) = opts.arg_supports {
        // This will always resolve
        // to `true` for mdbook
        exit(0);
    }
    // Parse and deserialize the context and book
    // from stdin
    let (_ctx, book): (PreprocessorContext, Book) = 
        from_reader(stdin())
        .expect("Failed to deserialize context and book");
    // Update the book's contents
    let updated = preprocess(book)
        .expect("Failed to preprocess book");
    // serialize and write the updated book
    // to stdout
    to_writer(stdout(), &updated)
        .expect("Failed to serialize/write book");
}

/// Update the book's contents so that all WASMs are
/// replaced with Wasm
fn preprocess(mut book: Book) -> Result<Book, String> {
    // Iterate over the book's sections assigning
    // the updated items to the book we were passed
    book.sections = book.sections.into_iter().map(|s| {
        // each section could be a chapter
        // or a separator
        match s {
            // if it's a chapter, we want to update that
            BookItem::Chapter(mut ch) => {
                // replace all WASMs with Wasms
                ch.content = ch.content.replace("WASM", "Wasm");
                // Wrap the contents back up into a Chapter
                BookItem::Chapter(ch)
            },
            _ => s,
        }
    }).collect();
    // Return the updated book
    Ok(book)
}

If you have never used docopt, it essentially uses command line usage text as a serialization format. To start we are going to define our usage. With that done we can declare the struct that will represent the deserialized command line arguments. Docopt uses a prefix scheme for flags vs sub-commands vs arguments, we want to have a field arg_supports that will be an optional string. Now we can actually get into the execution, first we pass the usage off to docopt and exit early if it fails to parse. Next we want to check if the caller provided the supports argument, if so we are just going to exit early with 0 which just says yes, we support this format. Once we are through that we can use the serde_json function from_reader to both read stdin and deserialize it into a tuple with a context first and the book second. Now that we have those two items we are going to pass the book along to the function preprocess.
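If you would rather skip docopt entirely, the argument handling is small enough to do by hand. The following is one possible sketch using only the standard library. Action and decide are names made up for this example, and like the runner above it claims support for every renderer.

```rust
/// What the process should do, based on its arguments.
/// The enum and function names are made up for this sketch.
#[derive(Debug, PartialEq)]
enum Action {
    /// `supports <renderer>` was passed, exit with this status code
    Exit(i32),
    /// no arguments, read the context and book from stdin
    Preprocess,
}

/// `args` is everything after the program name, for example
/// `std::env::args().skip(1).collect::<Vec<_>>()`.
fn decide(args: &[String]) -> Result<Action, String> {
    match args {
        [] => Ok(Action::Preprocess),
        [cmd, _renderer] if cmd == "supports" => {
            // like the runner above, claim support for every renderer
            Ok(Action::Exit(0))
        }
        _ => Err(format!("unexpected arguments: {:?}", args)),
    }
}
```

A main function would then exit with the code from Action::Exit, or carry on to read stdin for Action::Preprocess. A pickier preprocessor could inspect the renderer name and return Exit(1) for formats it does not handle.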

For this preprocessor, we are going to loop over all of the sections in the book and, for any chapters we find, update the contents to replace any "WASM"s with "Wasm"s, returning the updated book. We are going to use the serde_json function to_writer to serialize the returned book to json and write that to stdout. As you can see, this is a powerful system but also one that requires plugin developers to know quite a bit about how everything works. After building a preprocessor myself and then hearing about wasmer-runtime it seemed like a perfect opportunity to make this whole thing easier.
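To poke at that replacement logic without pulling in mdbook, here is the same map and match over a pair of stand-in types. ToyItem and fix_capitalization are hypothetical and only mirror the shape of BookItem, and the sketch assumes the intent is to turn the all-caps spelling into Wasm.

```rust
/// Hypothetical stand-ins for mdbook's `BookItem` and `Chapter`,
/// trimmed down to just what the replacement needs.
#[derive(Debug, PartialEq)]
enum ToyItem {
    Chapter { content: String },
    Separator,
}

/// Replace every "WASM" with "Wasm" in each chapter and pass
/// anything that is not a chapter through untouched.
fn fix_capitalization(items: Vec<ToyItem>) -> Vec<ToyItem> {
    items
        .into_iter()
        .map(|item| match item {
            ToyItem::Chapter { content } => ToyItem::Chapter {
                content: content.replace("WASM", "Wasm"),
            },
            other => other,
        })
        .collect()
}
```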

If we wanted to test our first example out we would need mdbook installed and an actual book to run it against. To install mdbook, you have a few options but for this example we will use cargo install mdbook. With that installed we can create a book with the following.

mdbook init ./example-book

Do you want a .gitignore to be created? (y/n)
n
What title would you like to give the book? 
Example Book

As an example, the repo has one defined with the contents of this series, with all the wasms capitalized. Now, we need to tell mdbook to run our preprocessor, we do that in the book.toml file.

# ./example-book/book.toml
[book]
authors = ["rfm"]
multilingual = false
src = "src"
title = "Example Book"

[preprocessor.example-runner]

We are almost there, the last thing we need to do is install our plugin, we do that with the following command.

cargo install --path ./crates/example-runner

Cargo will compile that for us and put it in our path. We can now run mdbook build ./example-book, which will generate a bunch of files in the ./example-book/book directory, any of the html files should have their WASMs updated to Wasms.

One of the really nice things about there being an existing plugin system is that we don't need to be maintainers to realize our vision. We could define our own scheme for running Wasm plugins that interfaces with mdbook via the old system. Let's say that we want our plugin developers to provide a function preprocess(mut book: Book) -> Book. Since this takes a single argument and returns a single value, we can use the same scheme to execute it as we have previously. Let's take the WASM to Wasm part from above and move that into our example plugin, to do that we need to update the dependencies.

# ./crates/example-plugin/Cargo.toml
[package]
name = "example-plugin"
version = "0.1.0"
authors = ["rfm <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
wasmer-plugin-example = { path = "../.." }

[dependencies.mdbook]
git = "https://github.com/rust-lang-nursery/mdBook"
default-features = false 

[lib]
crate-type = ["cdylib"]

Adding a dependency with a toml table like we're doing for mdbook is a nice way to make it clearer what is happening. Again we are going to point to the git repository, we also need to make sure that the default-features are turned off. The mdbook default features are primarily for the binary application, avoiding them is the other key to allowing this to compile to Wasm. With that out of the way we can update our code.


// ./crates/example-plugin/src/lib.rs
use wasmer_plugin_example::*;
use mdbook::{
    book::{
        Book,
        BookItem,
    },
    preprocess::PreprocessorContext,
};
#[plugin_helper]
pub fn preprocess(mut book: Book) -> Book {
    // Iterate over the book's sections assigning
    // the updated items to the book we were passed
    book.sections = book.sections.into_iter().map(|s| {
        // each section could be a chapter
        // or a separator
        match s {
            // if it's a chapter, we want to update that
            BookItem::Chapter(mut ch) => {
                // replace all WASMs with Wasms
                ch.content = ch.content.replace("WASM", "Wasm");
                // Wrap the contents back up into a Chapter
                BookItem::Chapter(ch)
            },
            _ => s,
        }
    }).collect();
    // Return the updated book
    book
}

Here we have updated the library to export a function called preprocess annotated with the #[plugin_helper] attribute, which means we should be able to use it just like we did before. Now we can update our runner, we are going to be passing what we have deserialized from stdin to the Wasm module.

// ./crates/example-runner/src/main.rs
use docopt::Docopt;
use serde::Deserialize;
use serde_json::{
    from_reader, 
    to_writer,
};
use std::{
    process::exit,
    io::{
        stdin,
        stdout,
    }
};
use mdbook::{
    book::Book,
    preprocess::PreprocessorContext,
};
use bincode::{
    serialize,
    deserialize,
};
use wasmer_runtime::{
    instantiate,
    imports,
};

// For now we are going to use this to read in our Wasm bytes
static Wasm: &[u8] = include_bytes!("../../../target/wasm32-unknown-unknown/debug/example_plugin.wasm");

static USAGE: &str = "
Usage:
    mdbook-wasm-preprocessor
    mdbook-wasm-preprocessor supports <supports>
";

#[derive(Deserialize)]
struct Opts {
    pub arg_supports: Option<String>,
}

fn main() {
    // Parse and deserialize command line
    // arguments
    let opts: Opts = Docopt::new(USAGE)
                    .and_then(|d| d.deserialize())
                    .unwrap_or_else(|e| e.exit());
    // If the supports argument was included
    // we need to handle that
    if let Some(_renderer_name) = opts.arg_supports {
        // This will always resolve
        // to `true` for mdbook
        exit(0);
    }
    // Parse and deserialize the context and book
    // from stdin
    let (_ctx, book): (PreprocessorContext, Book) = 
        from_reader(stdin())
        .expect("Failed to deserialize context and book");
    // Update the book's contents
    let updated = preprocess(book)
        .expect("Failed to preprocess book");
    // serialize and write the updated book
    // to stdout
    to_writer(stdout(), &updated)
        .expect("Failed to serialize/write book");
}

/// Update the book's contents so that all WASMs are
/// replaced with Wasm
fn preprocess(book: Book) -> Result<Book, String> {
    let instance = instantiate(&Wasm, &imports!{})
        .expect("failed to instantiate Wasm module");
    // The changes start here
    // First we get the module's context
    let context = instance.context();
    // Then we get memory 0 from that context
    // WebAssembly only supports one memory right
    // now so this will always be 0.
    let memory = context.memory(0);
    // Now we can get a view of that memory
    let view = memory.view::<u8>();
    // Zero out bytes 1 through 4 of memory, the slot
    // we later read the returned length from
    for cell in view[1..5].iter() {
        cell.set(0);
    }
    let bytes = serialize(&book)
        .expect("Failed to serialize book");
    // Our length of bytes
    let len = bytes.len();
    // loop over the Wasm memory view's bytes
    // and also the serialized book's bytes
    for (cell, byte) in view[5..len + 5]
                .iter()
                .zip(bytes.iter()) {
        // set each Wasm memory byte to
        // be the value of the serialized byte
        cell.set(*byte);
    }
    // Bind our helper function
    let wasm_preprocess = instance.func::<(i32, u32), i32>("_preprocess")
        .expect("Failed to bind _preprocess");
    // Call the helper function and store the start of the returned bytes
    let start = wasm_preprocess.call(5 as i32, len as u32)
        .expect("Failed to execute _preprocess") as usize;
    // Get an updated view of memory
    let new_view = memory.view::<u8>();
    // Setup the 4 bytes that will be converted
    // into our new length
    let mut new_len_bytes = [0u8;4];
    for i in 0..4 {
        // attempt to get i+1 from the memory view (1,2,3,4)
        // If we can, return the value it contains, otherwise
        // default back to 0
        new_len_bytes[i] = new_view
            .get(i + 1)
            .map(|c| c.get())
            .unwrap_or(0);
    }
    // Convert the 4 bytes into a u32 and cast to usize
    let new_len = u32::from_ne_bytes(new_len_bytes) as usize;
    // Calculate the end as the start + new length
    let end = start + new_len;
    // Capture the returned bytes
    // from the new view of the Wasm memory
    let updated_bytes: Vec<u8> = new_view[start..end]
                                    .iter()
                                    .map(|c|c.get())
                                    .collect();
    // Deserialize the bytes back into a Book
    deserialize(&updated_bytes)
        .map_err(|e| format!("Error deserializing after Wasm update\n{}", e))
}
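
The byte layout that preprocess relies on can be tried out without a Wasm runtime. The following is a minimal sketch, not part of the runner: a plain byte buffer stands in for the module's memory, and inject, extract, LEN_OFFSET and DATA_OFFSET are hypothetical names introduced here for illustration. It assumes the same convention as the code above, with the returned length in bytes 1 through 4 and the payload starting at byte 5.

```rust
/// Offset of the 4 length bytes (indices 1..5), as in the runner above.
const LEN_OFFSET: usize = 1;
/// Offset where the serialized payload is written.
const DATA_OFFSET: usize = 5;

/// Clear the length slot and copy `payload` into `memory` at DATA_OFFSET.
/// Returns the (pointer, length) pair to hand to the plugin, or None
/// if the payload does not fit in the buffer.
fn inject(memory: &mut [u8], payload: &[u8]) -> Option<(usize, usize)> {
    let end = DATA_OFFSET.checked_add(payload.len())?;
    if end > memory.len() {
        return None;
    }
    memory[LEN_OFFSET..DATA_OFFSET].fill(0);
    memory[DATA_OFFSET..end].copy_from_slice(payload);
    Some((DATA_OFFSET, payload.len()))
}

/// Read the length stored at LEN_OFFSET and return the bytes in
/// `start..start + len`, or None if that range is out of bounds.
fn extract(memory: &[u8], start: usize) -> Option<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(memory.get(LEN_OFFSET..DATA_OFFSET)?);
    let len = u32::from_ne_bytes(len_bytes) as usize;
    let end = start.checked_add(len)?;
    memory.get(start..end).map(|bytes| bytes.to_vec())
}

fn main() {
    // A small buffer standing in for the Wasm module's memory
    let mut memory = vec![0u8; 64];
    let (ptr, len) = inject(&mut memory, b"WASM").expect("payload fits");

    // Stand-in for the plugin: write a result after the input
    // and record its length in the length slot
    let result = b"Wasm";
    let start = ptr + len;
    memory[start..start + result.len()].copy_from_slice(result);
    memory[LEN_OFFSET..DATA_OFFSET]
        .copy_from_slice(&(result.len() as u32).to_ne_bytes());

    let out = extract(&memory, start).expect("result in bounds");
    assert_eq!(out, b"Wasm".to_vec());
}
```

Returning None instead of panicking on an out-of-range slice is a choice made for this sketch; the runner above indexes the view directly. The sketch also keeps from_ne_bytes to match the runner, though Wasm memory is little-endian, so from_le_bytes would be the portable choice on a big-endian host.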

Much of what we see in preprocess should look familiar from our previous runner examples. The only real changes are that pair is now just book and that the name of the function we are calling has changed. To test that this is working, we need to rebuild the plugin and then re-install the runner before we can build our book.

cargo build -p example-plugin --target wasm32-unknown-unknown
cargo install --path ./crates/example-runner --force
mdbook build example-book

When we run that, the output in example-book/book should now have the content we expect. One last thing to cover is that we are still using the include_bytes macro to get our Wasm. If this were a real plugin system, we would need a more dynamic way of getting it. Let's assume that we want our users to put any pre-compiled Wasm preprocessors into a new sub-directory of the book's root called preprocessors. For this example we can just move our last example plugin into this new folder.

mkdir ./example-book/preprocessors
cp ./target/wasm32-unknown-unknown/debug/example_plugin.wasm ./example-book/preprocessors

Now we can update our runner to look in that directory instead of compiling the bytes into the binary file.

// ./crates/example-runner/src/main.rs
use docopt::Docopt;
use serde::Deserialize;
use serde_json::{
    from_reader, 
    to_writer,
};
use std::{
    process::exit,
    io::{
        stdin,
        stdout,
        Read,
    },
    fs::File,
};
use mdbook::{
    book::Book,
    preprocess::PreprocessorContext,
};
use bincode::{
    serialize,
    deserialize,
};
use wasmer_runtime::{
    instantiate,
    imports,
};

static USAGE: &str = "
Usage:
    mdbook-wasm-preprocessor
    mdbook-wasm-preprocessor supports <supports>
";

#[derive(Deserialize)]
struct Opts {
    pub arg_supports: Option<String>,
}

fn main() {
    // Parse and deserialize command line
    // arguments
    let opts: Opts = Docopt::new(USAGE)
                    .and_then(|d| d.deserialize())
                    .unwrap_or_else(|e| e.exit());
    // If the supports argument was included
    // we need to handle that
    if let Some(_renderer_name) = opts.arg_supports {
        // This will always resolve
        // to `true` for mdbook
        exit(0);
    }
    // Parse and deserialize the context and book
    // from stdin
    let (ctx, book): (PreprocessorContext, Book) = 
        from_reader(stdin())
        .expect("Failed to deserialize context and book");
    // Update the book's contents
    let updated = run_all_preprocessors(ctx, book)
        .expect("Failed to preprocess book");
    // serialize and write the updated book
    // to stdout
    to_writer(stdout(), &updated)
        .expect("Failed to serialize/write book");
}

fn run_all_preprocessors(ctx: PreprocessorContext, mut book: Book) -> Result<Book, String> {
    // ctx.root will tell us where our book lives
    let dir = ctx.root.join("preprocessors");
    // loop over all of the preprocessors files there
    for entry in dir.read_dir()
        .map_err(|e| format!("Error reading preprocessors directory {}", e))? {
        // safely unwrap the dir entry
        let entry = entry
            .map_err(|e| format!("Error reading entry {}", e))?;
        // pull out the path we are working on
        let path = entry.path();
        // Check if the path ends with .wasm
        if let Some(ext) = path.extension() {
            if ext == "wasm" {
                // if it does we want to read all the bytes into
                // a buffer
                let mut buf = Vec::new();
                let mut f = File::open(&path)
                    .map_err(|e| format!("Error opening file {:?}, {}", path, e))?;
                f.read_to_end(&mut buf)
                    .map_err(|e| format!("Error reading file {:?}, {}", path, e))?;
                // We can now pass this off to our original preprocess
                book = preprocess(buf.as_slice(), book)?;
            }
        }
    }
    Ok(book)
}

/// Run the book through the Wasm module in `bytes`
/// and return the updated book
fn preprocess(bytes: &[u8], book: Book) -> Result<Book, String> {
    // instantiate the Wasm module with the bytes provided
    let instance = instantiate(bytes, &imports!{})
        .expect("failed to instantiate Wasm module");
    // The changes start here
    // First we get the module's context
    let context = instance.context();
    // Then we get memory 0 from that context
    // WebAssembly only supports one memory right
    // now so this will always be 0.
    let memory = context.memory(0);
    // Now we can get a view of that memory
    let view = memory.view::<u8>();
    // Zero out bytes 1 through 4 of memory, the slot
    // we later read the returned length from
    for cell in view[1..5].iter() {
        cell.set(0);
    }
    let bytes = serialize(&book)
        .expect("Failed to serialize book");
    // Our length of bytes
    let len = bytes.len();
    // loop over the Wasm memory view's bytes
    // and also the serialized book's bytes
    for (cell, byte) in view[5..len + 5]
                .iter()
                .zip(bytes.iter()) {
        // set each Wasm memory byte to
        // be the value of the serialized byte
        cell.set(*byte);
    }
    // Bind our helper function
    let wasm_preprocess = instance.func::<(i32, u32), i32>("_preprocess")
        .expect("Failed to bind _preprocess");
    // Call the helper function and store the start of the returned bytes
    let start = wasm_preprocess.call(5 as i32, len as u32)
        .expect("Failed to execute _preprocess") as usize;
    // Get an updated view of memory
    let new_view = memory.view::<u8>();
    // Setup the 4 bytes that will be converted
    // into our new length
    let mut new_len_bytes = [0u8;4];
    for i in 0..4 {
        // attempt to get i+1 from the memory view (1,2,3,4)
        // If we can, return the value it contains, otherwise
        // default back to 0
        new_len_bytes[i] = new_view
            .get(i + 1)
            .map(|c| c.get())
            .unwrap_or(0);
    }
    // Convert the 4 bytes into a u32 and cast to usize
    let new_len = u32::from_ne_bytes(new_len_bytes) as usize;
    // Calculate the end as the start + new length
    let end = start + new_len;
    // Capture the returned bytes
    // from the new view of the Wasm memory
    let updated_bytes: Vec<u8> = new_view[start..end]
                                    .iter()
                                    .map(|c|c.get())
                                    .collect();
    // Deserialize the bytes back into a Book
    deserialize(&updated_bytes)
        .map_err(|e| format!("Error deserializing after Wasm update\n{}", e))
}

The big change here is that we are passing the context and book off to run_all_preprocessors instead of just preprocess. In this new function we first construct the path that will contain our Wasm preprocessors. The context has a root field that tells us where our book lives, and we can append "preprocessors" to that with join. Now that we have our path, we loop over each of the files in that directory, and if they end in .wasm we pass those bytes off to preprocess along with the book. The result replaces our previous book, and we return the updated book after all of the Wasm files have been run. All in all we seem to have a pretty viable plugin runner. There may be a few places that could use some tweaks to increase resiliency or reduce the memory footprint, but it should be enough to get started.
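
One resiliency tweak to the directory scan could look like the following. This is only a sketch using the standard library: find_wasm_files is a hypothetical helper and the scratch directory name is made up for the demonstration. It rests on two assumptions that are not in the runner above: that a missing preprocessors directory should mean "no plugins" rather than an error, and that plugins should run in a predictable order. The order in which read_dir returns entries depends on the platform and filesystem, so the paths are sorted before use.

```rust
use std::{
    fs, io,
    path::{Path, PathBuf},
};

/// Collect every `.wasm` file directly inside `dir`, sorted by path.
/// A missing directory is treated as "no plugins" rather than an error.
fn find_wasm_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        Err(e) => return Err(e),
    };
    let mut paths = Vec::new();
    for entry in entries {
        let path = entry?.path();
        // Keep regular files whose extension is exactly "wasm"
        if path.is_file() && path.extension().map_or(false, |ext| ext == "wasm") {
            paths.push(path);
        }
    }
    // Sort so the plugins always run in the same order
    paths.sort();
    Ok(paths)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch directory used only for this demonstration
    let dir = std::env::temp_dir().join("preprocessors-sketch");
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("b.wasm"), b"")?;
    fs::write(dir.join("a.wasm"), b"")?;
    fs::write(dir.join("notes.txt"), b"")?;
    for path in find_wasm_files(&dir)? {
        // fs::read opens the file and reads it to the end in one call
        let bytes = fs::read(&path)?;
        println!("{} ({} bytes)", path.display(), bytes.len());
    }
    fs::remove_dir_all(&dir)
}
```

In the runner, the returned paths would be read and handed to preprocess one at a time, in place of the File::open and read_to_end pair.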

If you're interested, I have built a less educational version of this plugin system which you can find here. My hope is that I can add a few more niceties in the coming months that will focus on the plugin runner side of things (like extracting more data from errors in Wasm or wrapping up the instantiate/serialize/inject/execute/extract/deserialize cycle). If you have any comments, questions, suggestions or gripes, feel free to shoot me an email at r [at] robertmasen.com or find me on twitter @freemasen.