An Overview

$web-only$ To get started building development tools using the Rust programming language, we are going to use three crates. The first is ress, the Rusty ECMAScript Scanner, which converts JavaScript text into a series of Tokens. Next is ressa, the Rusty ECMAScript Syntax Analyzer, which takes that series of Tokens and builds an Abstract Syntax Tree (AST). The AST types are provided by a third crate, resast. Both ress and ressa are useful for building development tools; however, since the output of ress is essentially flat, the tools we can build with it alone are much simpler. Over the course of this book we will cover the basics of how to build a development tool with each of these crates. $web-only-end$ $slides-only$

  • What is RESS
    • Overview
    • Demo Project
  • What is RESSA
    • Overview
    • Demo Project
  • What is RESW (maybe)
    • Overview $slides-only-end$

RESS

$slides-only$

  • impl Iterator for Scanner
  • Converts text into Tokens
  • Flat Structure $slides-only-end$ $web-only$ Before we start on any examples, let's dig a little into what ress does. The job of a scanner (sometimes called a tokenizer or lexer) in the parsing process is to convert raw text or bytes into logically separated parts called tokens, and ress does just that. It reads your JavaScript text and tells you what a given word or symbol might represent. It does this through the Scanner interface. To construct a scanner, you pass it the text you would like it to tokenize.

$web-only-end$


#![allow(unused)]
fn main() {
    let js = "var i = 0;";
    let scanner = Scanner::new(js);
}

$web-only$

Now that we have prepared a scanner, how do we use it? Scanner implements Iterator, so we can use it in a for loop like so.


#![allow(unused)]
fn main() {
    for token in scanner {
        println!("{:#?}", token);
    }
}

If we were to run the above program it would print to the terminal the following. $web-only-end$

Item {
    token: Keyword(
        Var,
    ),
    span: Span {
        start: 0,
        end: 3,
    },
    location: SourceLocation {
        start: Position {
            line: 1,
            column: 1,
        },
        end: Position {
            line: 1,
            column: 4,
        },
    },
}
Item {
    token: Ident(
        Ident(
            "i",
        ),
    ),
    span: Span {
        start: 4,
        end: 5,
    },
    location: SourceLocation {
        start: Position {
            line: 1,
            column: 5,
        },
        end: Position {
            line: 1,
            column: 6,
        },
    },
}
Item {
    token: Punct(
        Equal,
    ),
    span: Span {
        start: 6,
        end: 7,
    },
    location: SourceLocation {
        start: Position {
            line: 1,
            column: 7,
        },
        end: Position {
            line: 1,
            column: 8,
        },
    },
}
Item {
    token: Number(
        Number(
            "0",
        ),
    ),
    span: Span {
        start: 8,
        end: 9,
    },
    location: SourceLocation {
        start: Position {
            line: 1,
            column: 9,
        },
        end: Position {
            line: 1,
            column: 10,
        },
    },
}
Item {
    token: Punct(
        SemiColon,
    ),
    span: Span {
        start: 9,
        end: 10,
    },
    location: SourceLocation {
        start: Position {
            line: 1,
            column: 10,
        },
        end: Position {
            line: 1,
            column: 11,
        },
    },
}
Item {
    token: EoF,
    span: Span {
        start: 10,
        end: 10,
    },
    location: SourceLocation {
        start: Position {
            line: 1,
            column: 11,
        },
        end: Position {
            line: 1,
            column: 11,
        },
    },
}

$web-only$ The scanner's next() method returns an Option<Result<Item, Error>>. The Item inside the Ok variant has 3 properties: token, span and location. The span is the pair of byte indexes where the token starts and ends, the location is the human-readable line and column of the token, and the token is one variant of the Token enum, which has the following variants.

  • Token::Boolean(BooleanLiteral) - The text true or false
  • Token::Ident(Ident) - A variable, function, or class name
  • Token::Null - The text null
  • Token::Keyword(Keyword) - One of the 42 reserved words e.g. function, var, delete, etc
  • Token::Number(Number) - A number literal, this can be an integer, a float, scientific notation, binary notation, octal notation, or hexadecimal notation e.g. 1.5e9, 0xfff, etc
  • Token::Punct(Punct) - One of the 52+ reserved symbols or combinations of symbols e.g. *, &&, =>, etc
  • Token::String(StringLit) - Either a double or single quoted string
  • Token::RegEx(RegEx) - A Regular Expression literal e.g. /.+/g
  • Token::Template(Template) - A template string literal e.g. one ${2} three
  • Token::Comment(Comment) - A single line, multi-line or html comment

For a more in-depth look at these tokens, take a look at the Appendix.
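
To make the relationship between a span (byte indexes) and a location (line and column) a little more concrete, here is a minimal sketch that derives a 1-based line/column pair from a byte index. The `LineCol` struct and `line_col` function are hypothetical helpers written for this illustration, not part of ress, and the sketch assumes only `\n` line breaks and counts columns in characters; ress itself may treat line terminators and multi-byte characters differently.

```rust
/// A 1-based line/column pair, similar in spirit to the `Position`
/// values printed above (hypothetical type, not the one from ress).
#[derive(Debug, PartialEq)]
struct LineCol {
    line: usize,
    column: usize,
}

/// Walk the text up to `byte_idx`, counting newlines and characters.
fn line_col(text: &str, byte_idx: usize) -> LineCol {
    let mut line = 1;
    let mut column = 1;
    for (idx, ch) in text.char_indices() {
        if idx >= byte_idx {
            break;
        }
        if ch == '\n' {
            // a new line starts, so the column resets
            line += 1;
            column = 1;
        } else {
            column += 1;
        }
    }
    LineCol { line, column }
}

fn main() {
    let js = "var i = 0;\nvar j = 1;";
    // byte 4 is the `i` on the first line
    println!("{:?}", line_col(js, 4));
    // byte 15 is the `j` on the second line
    println!("{:?}", line_col(js, 15));
}
```

For the first call this agrees with the scanner output above, where the identifier i has a span starting at 4 and a location starting at line 1, column 5.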

Overall, the output of our scanner doesn't provide any context for these tokens, which means that when we are building our development tools it is going to be a little harder to figure out what is going on with any given token. One way to deal with that is to build a tool that is only concerned with token-level information. Say you work on a team of JavaScript developers that needs to adhere to a strict code style because the organization needs its website to be usable in Internet Explorer 8. With that restriction a large number of APIs are off the table; looking over this list we can see how big that number really is. It could be useful to have a linter that will check for the keywords and identifiers that are not available in IE8. Let's try and build one.

$web-only-end$

Building an IE8 Linter

$web-only$ To get started we need to add ress to our dependencies. This project is also going to need serde, serde_derive and toml, because it will rely on a .toml file to make the list of unavailable tokens configurable, and atty, to detect whether stdin is a terminal.

[package]
name = "lint-ie8"
version = "0.1.0"
authors = ["Robert Masen <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
ress = "0.7"
serde = "1"
serde_derive = "1"
toml = "0.5"
atty = "0.2"

Next we want to use the Scanner and Token from ress. We can do this by importing all the contents of the prelude.


#![allow(unused)]
fn main() {
use ress::prelude::*;
}

Since we are using a .toml file to provide the list of banned tokens, let's create a struct that will represent our configuration.


#![allow(unused)]
fn main() {
#[derive(Deserialize)]
struct BannedTokens {
    idents: Vec<String>,
    keywords: Vec<String>,
    puncts: Vec<String>,
    strings: Vec<String>,
}
}

The toml file we are going to use is pretty big, but if you want to see what it looks like you can check it out here. Essentially it is a list of identifiers, strings, punctuation, and keywords that would cause an error when trying to run in IE8.
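
To give a rough idea of the shape of that file, here is a short hypothetical excerpt. The four keys mirror the fields of the BannedTokens struct above; the entries are illustrative assumptions chosen for this sketch, not a copy of the real list.

```toml
# Hypothetical excerpt, the real banned_tokens.toml is much longer
idents = ["addEventListener", "Promise"]
keywords = ["let", "const", "class"]
puncts = ["=>", "..."]
strings = []
```

Since every field of BannedTokens is a plain Vec<String>, each key needs to be present in the file, even if its list is empty.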

To start we need to deserialize that file. We can do that with the std::fs::read_to_string and toml::from_str functions.


#![allow(unused)]
fn main() {
    let config_text = ::std::fs::read_to_string("banned_tokens.toml").expect("failed to read config");
    let banned: BannedTokens = toml::from_str(&config_text).expect("Failed to deserialize banned tokens");
}

Now that we have a list of tokens that should not be included in our JavaScript, let's get the JS text. It would be useful to be able to take a path argument or read the raw JS from stdin. This function checks for an argument first and falls back to reading from stdin. It looks something like this.


#![allow(unused)]
fn main() {
use std::env::args;
use std::fs::read_to_string;
use std::io::Read;

fn get_js() -> Result<String, ::std::io::Error> {
    let mut cmd_args = args();
    let _ = cmd_args.next(); //discard bin name
    let js = if let Some(file_name) = cmd_args.next() {
        // a path was provided, so read that file
        read_to_string(file_name)?
    } else {
        let mut std_in = ::std::io::stdin();
        let mut ret = String::new();
        // stdin is a terminal, so nothing was piped in
        if atty::is(atty::Stream::Stdin) {
            return Ok(ret)
        }
        std_in.read_to_string(&mut ret)?;
        ret
    };
    Ok(js)
}

}

We will call it like this.


#![allow(unused)]
fn main() {
    let js = match get_js() {
        Ok(js) => if js.len() == 0 {
            print_usage();
            std::process::exit(1);
        } else {
            js
        },
        Err(_) => {
            print_usage();
            std::process::exit(1);
        }
    };
    let finder = BannedFinder::new(&js, banned);
}

We want to handle the failure when attempting to get the JS, so we match on the call to get_js. If everything went well we still need to check whether the text is an empty string; this means no argument was provided and no text was piped to the program. In either of these failure cases we want to print a nice message about how the command should have been written and then exit with a non-zero status code. print_usage is a pretty simple function that just prints to stdout the two ways to use the program.


#![allow(unused)]
fn main() {
fn print_usage() {
    println!("banned_tokens <infile>
cat <path/to/file> | banned_tokens");
}
}
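
The argument-then-stdin fallback can also be sketched in a form that is easy to exercise without touching the real command line. The function name `js_from` and its generic parameters are assumptions made for this illustration, and unlike get_js it leaves out the atty check, so treat it as a variation rather than the project's actual code.

```rust
use std::fs;
use std::io::{self, Read};

/// Return the JS text from the first path argument if there is one,
/// otherwise read everything from `fallback` (stdin in a real program).
fn js_from<A, R>(mut args: A, mut fallback: R) -> io::Result<String>
where
    A: Iterator<Item = String>,
    R: Read,
{
    let _ = args.next(); // discard the binary name
    match args.next() {
        Some(path) => fs::read_to_string(path),
        None => {
            let mut text = String::new();
            fallback.read_to_string(&mut text)?;
            Ok(text)
        }
    }
}

fn main() -> io::Result<()> {
    // No path argument here, so the text comes from the in-memory reader.
    let args = vec!["banned_tokens".to_string()].into_iter();
    let js = js_from(args, "var i = 0;".as_bytes())?;
    println!("{}", js);
    Ok(())
}
```

In a real binary this could be called as `js_from(std::env::args(), std::io::stdin())`, since both types satisfy the bounds used in the sketch.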

With that out of the way, we can now get into how we are going to solve the actual problem of finding these tokens in a JavaScript file. There are many ways to make this work, but for this example we are going to wrap the Scanner in another struct that implements Iterator. First, here is what that struct is going to look like.


#![allow(unused)]
fn main() {
struct BannedFinder<'a> {
    scanner: Scanner<'a>,
    banned: BannedTokens,
}

}

Before we get into the impl Iterator we should go over an Error implementation that we are going to use. It is relatively straightforward: the actual struct is a tuple struct with three items. The first item is a message that includes the token and its type; the second and third are the line and column of the banned token. We need to implement Display (Error requires it), which will just create a nice error message for us.


#![allow(unused)]
fn main() {
#[derive(Debug)]
pub struct BannedError(String, usize, usize);

impl ::std::error::Error for BannedError {

}

impl ::std::fmt::Display for BannedError {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        write!(f, "Banned {} found at {}:{}", self.0, self.1, self.2)
    }
}
}
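
One practical payoff of implementing Error is that a BannedError can be returned anywhere a boxed error is expected. The sketch below repeats the type so that it stands alone; `check_keyword` is a hypothetical function invented for this illustration and is not part of the linter.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub struct BannedError(String, usize, usize);

impl Error for BannedError {}

impl fmt::Display for BannedError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Banned {} found at {}:{}", self.0, self.1, self.2)
    }
}

/// Hypothetical check: treats `let` as the only banned keyword.
fn check_keyword(word: &str, line: usize, column: usize) -> Result<(), Box<dyn Error>> {
    if word == "let" {
        // `into` works because BannedError implements Error
        return Err(BannedError(format!("keyword {}", word), line, column).into());
    }
    Ok(())
}

fn main() {
    if let Err(e) = check_keyword("let", 3, 5) {
        // prints "Banned keyword let found at 3:5"
        println!("{}", e);
    }
}
```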

The scanner already reports a location for every Item, so BannedFinder can read the line/column pair of a banned token directly from item.location.start.

Ok, now for the exciting part; we are going to impl Iterator for BannedFinder which will look like this.


#![allow(unused)]
fn main() {
impl<'a> Iterator for BannedFinder<'a> {
    type Item = Result<(), BannedError>;
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(item) = self.scanner.next() {
            match item {
                Ok(item) => {
                    Some(match &item.token {
                        Token::Ident(ref id) => {
                            let id = id.to_string();
                            if self.banned.idents.contains(&id) {
                                Err(BannedError(format!("identifier {}", id), item.location.start.line, item.location.start.column))
                            } else {
                                Ok(())
                            }
                        },
                        Token::Keyword(ref key) => {
                            if self.banned.keywords.contains(&key.to_string()) {
                                Err(BannedError(format!("keyword {}", key.to_string()), item.location.start.line, item.location.start.column))
                            } else {
                                Ok(())
                            }
                        },
                        Token::Punct(ref punct) => {
                            if self.banned.puncts.contains(&punct.to_string()) {
                                Err(BannedError(format!("punct {}", punct.to_string()), item.location.start.line, item.location.start.column))
                            } else {
                                Ok(())
                            }
                        },
                        Token::String(ref lit) => {
                            match lit {
                                StringLit::Double(inner)
                                | StringLit::Single(inner) => {
                                    if self.banned.strings.contains(&inner.to_string()) {
                                        Err(BannedError(format!("string {}", lit.to_string()), item.location.start.line, item.location.start.column))
                                    } else {
                                        Ok(())
                                    }
                                }
                            }
                        },
                        _ => Ok(()),
                    })
                },
                Err(_) => {
                    None
                }
            }
        } else {
            None
        }
    }
}

}

First we need to define what the Item for our Iterator is. It is going to be a Result<(), BannedError>, which allows the caller to check whether an item passed inspection. Now we can add the fn next(&mut self) -> Option<Self::Item> definition. Inside that we first want to make sure that the Scanner isn't returning None; if it is, we can just return None. If the scanner returns a Result<Item, Error> we first need to check that it is Ok; in this example we simply ignore the Err case by returning None, which ends the iteration. Once we have an actual Item we want to check what kind of token it is, which we can do by matching on &item.token. We only care if the token is a Keyword, Ident, Punct or String; otherwise we can say that the token passed. For each of these tokens we check whether the actual text is included in the corresponding Vec<String> property of self.banned. If it is, we return a BannedError where the first property is a message containing the name of the token type and the text that token represents.
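
The wrap-an-iterator pattern described above does not depend on ress, so it can be tried out with nothing but the standard library. In this sketch `WordFinder` is a hypothetical stand-in for BannedFinder, whitespace-separated words stand in for tokens, and a plain String stands in for BannedError.

```rust
/// Wraps any iterator of words and checks each one against a banned list.
struct WordFinder<I> {
    words: I,
    banned: Vec<String>,
}

impl<'a, I> Iterator for WordFinder<I>
where
    I: Iterator<Item = &'a str>,
{
    type Item = Result<(), String>;

    fn next(&mut self) -> Option<Self::Item> {
        // `?` returns None as soon as the inner iterator is exhausted
        let word = self.words.next()?;
        Some(if self.banned.iter().any(|b| b.as_str() == word) {
            Err(format!("Banned word {}", word))
        } else {
            Ok(())
        })
    }
}

fn main() {
    let finder = WordFinder {
        words: "let x = 0 ;".split_whitespace(),
        banned: vec!["let".to_string()],
    };
    for result in finder {
        if let Err(msg) = result {
            println!("{}", msg);
        }
    }
}
```

Yielding a Result for every word, rather than only the failures, keeps the wrapper in lock step with the inner iterator; a caller who only wants the problems can filter for the Err values.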

Now that we have all of the underlying infrastructure set up, let's use the BannedFinder in our main.


#![allow(unused)]
fn main() {
    let finder = BannedFinder::new(&js, banned);
    for item in finder {
        match item {
            Ok(_) => (),
            Err(msg) => println!("{}", msg),
        }
    }
}

That is pretty much it. If you want to see the full project, you can find it in the lint-ie8 folder of this book's GitHub repository.

$web-only-end$ $slides-only$

Demo

$slides-only-end$

RESSA

$slides-only$

  • impl Iterator for Parser
  • Converts stream of Tokens into AST
  • Significantly more context $slides-only-end$ $web-only$ Before we get into how to use ressa, it is a good idea to briefly touch on the scope of a parser or syntax analyzer. The biggest thing to understand is that we still are not dealing with the semantic meaning of the program. That means ressa itself won't discover things like assigning to undeclared variables or attempting to call undefined functions, because that would require more context. To that end, ressa's true value isn't realized until it is embedded into another program that provides that context.

With that said, ressa does provide a larger context than ress. It achieves that by wrapping the Scanner in a struct called Parser. Essentially, Parser provides a way to keep track of what any given set of Tokens might mean. Parser also implements Iterator, with an Item of Result<ProgramPart, Error>. The Ok variant holds a ProgramPart, which has 3 cases representing the 3 different top level JavaScript constructs.

  • Decl - a variable/function/class declaration
    • Var - A top level variable declaration e.g. let x = 0;
    • Class - A named class definition at the top level
    • Func - A named function definition at the top level
    • Import - An ES Module import statement
    • Export - An ES Module export statement
  • Dir - A script directive, pretty much just 'use strict'
  • Stmt - A catch all for all other statements
    • Block - A collection of statements wrapped in curly braces
    • Break - A break statement will exit a loop or labeled statement early
    • Continue - A continue statement will short circuit a loop
    • Debugger - the literal text debugger
    • DoWhile - A do loop executes the body before testing whether to continue
    • Empty - A single semicolon
    • Expr - A catch-all for everything else
    • For - A c-style for loop e.g. for (var i = 0; i < 100; i++) ;
    • ForIn - A for loop that assigns the key of an enumerable at the top of each iteration
    • ForOf - A for loop that assigns the value of an iterable at the top of each iteration
    • If - A set of if/else if/else statements
    • Labeled - A statement that has been named by an attached identifier
    • Return - The return statement that resolves a function's value
    • Switch - A test Expression and a collection of CaseStmts
    • Throw - The throw keyword followed by an Expression
    • Try - A try/catch/finally block for catching Thrown items
    • Var - A non-top level variable declaration
    • While - A loop which continues based on a test Expression
    • With - An antiquated statement that changes the order of identifier resolution

Stmt is the real workhorse of the group. While a top level function definition would be a Decl, a non-top level function definition would be a Stmt. Both Decl and Stmt are themselves enums representing the different possible variations. Looking further into the Stmt variants, you may notice there is another catch-all in the Expr variant, which contains an Expr (expression) enum that defines an even more granular set of program parts.

  • Expr
    • Assign - Assigning a value to a variable, this includes any update & assign operations e.g. x = 1, x += 1, etc
    • Array - An array literal e.g. [1,2,3,4]
    • ArrowFunc - An arrow function expression
    • Await - Any expression preceded by the await keyword
    • Call - Calling a function or method
    • Class - A class expression is a class definition with an optional identifier that is assigned to a variable or used as an argument in a Call expression
    • Conditional - Also known as the "ternary" operator e.g. test ? consequent : alternate
    • Func - A function expression is a function definition with an optional identifier that is either self executing, assigned to a variable or used as a Call argument
    • Ident - The identifier of a variable, call argument, class, import, export or function
    • Lit - A primitive literal
    • Logical - Two expressions separated by && or ||
    • Member - Accessing a sub property on something. e.g. [0,1,2][1] or console.log
    • MetaProp - Currently the only MetaProperty is in a function body you can check new.target to see if something was called with the new keyword
    • New - A Call expression preceded by the new keyword
    • Obj - An object literal e.g. {a: 1, b: 2}
    • Seq - Any sequence of expressions separated by commas
    • Spread - the ... operator followed by an expression
    • Super - The super pseudo-keyword used for accessing properties of a super class
    • TaggedTemplate - An identifier followed by a template literal see MDN for more info
    • This - The this pseudo-keyword used for accessing instance properties
    • Unary - An operation (that is not an update) that requires one expression as an argument e.g. delete x, !true, etc
    • Update - An operation that uses the ++ or -- operator
    • Yield - the yield contextual keyword followed by an optional expression for use in generator functions

Most of the Expr, Stmt, and Decl variants have associated values, to see more information about them check out the documentation. There should be an example and description provided for each of the possible combinations.
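
The practical consequence of this nesting is that working with a parsed program mostly means matching down through the enums. Here is a minimal sketch of that idea using heavily simplified stand-in types; the real resast enums have many more variants and carry far more data than a String.

```rust
// Simplified, hypothetical stand-ins for the enums described above.
#[allow(dead_code)]
enum ProgramPart {
    Dir(String),
    Decl(Decl),
    Stmt(Stmt),
}

#[allow(dead_code)]
enum Decl {
    Var(String),
    Func(String),
}

#[allow(dead_code)]
enum Stmt {
    Empty,
    Expr(Expr),
}

#[allow(dead_code)]
enum Expr {
    Ident(String),
    Call(String),
}

/// Describe a part by drilling down through the nested enums.
fn describe(part: &ProgramPart) -> String {
    match part {
        ProgramPart::Dir(d) => format!("directive {}", d),
        ProgramPart::Decl(Decl::Var(name)) => format!("variable {}", name),
        ProgramPart::Decl(Decl::Func(name)) => format!("function {}", name),
        ProgramPart::Stmt(Stmt::Empty) => "empty statement".to_string(),
        ProgramPart::Stmt(Stmt::Expr(Expr::Ident(name))) => format!("identifier {}", name),
        ProgramPart::Stmt(Stmt::Expr(Expr::Call(name))) => format!("call to {}", name),
    }
}

fn main() {
    let program = vec![
        ProgramPart::Dir("use strict".to_string()),
        ProgramPart::Decl(Decl::Func("Thing".to_string())),
        ProgramPart::Stmt(Stmt::Expr(Expr::Call("Thing".to_string()))),
    ];
    for part in &program {
        println!("{}", describe(part));
    }
}
```

Because the match has no wildcard arm, the compiler will point out any variant that is added later but not handled.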

With that long-winded explanation of the basic structure out of the way, let's take a look at how we would use the Parser. In this example we have a JavaScript snippet that defines a function Thing, which assigns its first argument, stuff, to this.stuff. $web-only-end$

use ressa::*;

static JS: &str = "
function Thing(stuff) {
    this.stuff = stuff;
}
";

fn main() {
    let parser = Parser::new(JS).expect("Failed to create parser");
    for part in parser {
        let part = part.expect("Failed to parse part");
        println!("{:#?}", part);
    }
}

$web-only$ If we were to run the above we would get the following output. $web-only-end$

Decl(
    Func(
        Func {
            id: Some(
                Ident {
                    name: "Thing",
                },
            ),
            params: [
                Pat(
                    Ident(
                        Ident {
                            name: "stuff",
                        },
                    ),
                ),
            ],
            body: FuncBody(
                [
                    Stmt(
                        Expr(
                            Assign(
                                AssignExpr {
                                    operator: Equal,
                                    left: Expr(
                                        Member(
                                            MemberExpr {
                                                object: This,
                                                property: Ident(
                                                    Ident {
                                                        name: "stuff",
                                                    },
                                                ),
                                                computed: false,
                                            },
                                        ),
                                    ),
                                    right: Ident(
                                        Ident {
                                            name: "stuff",
                                        },
                                    ),
                                },
                            ),
                        ),
                    ),
                ],
            ),
            generator: false,
            is_async: false,
        },
    ),
)

$web-only$ If we walk through the output, we see the following.

  1. This program consists of a single part which is a ProgramPart::Decl
  2. Inside of that is a Decl::Func
  3. Inside of that is a Func
    1. It has an id, which is an optional Ident, here Some with the name "Thing"
    2. It has a one item vec in params
      1. The item is a Pat, specifically a Pat::Ident
      2. Inside of that is an Ident with the name "stuff"
    3. It has a body that is a one item vec of ProgramParts
      1. The item is a ProgramPart::Stmt
      2. Which is a Stmt::Expr
      3. Inside of that is an Expr::Assign
      4. Inside of that is an AssignExpr
        1. Which has an operator of Equal
        2. The left hand side is an Expr::Member
        3. Inside of that is a MemberExpr
          1. The object being Expr::This
          2. The property being Expr::Ident with the name of "stuff"
          3. computed is false
        4. The right hand side is an Expr::Ident with the name of "stuff"
    4. It is not a generator
    5. is_async is false

Phew! That is quite a lot of information! A big part of why we need to be that verbose is the "you can do anything" nature of JavaScript. Let's use the MemberExpr as an example. Below is a collection of ways to write a MemberExpr in JavaScript.

console.log;//member expr
console['log']; //member expr
const logVar = 'log';
console[logVar];//member expr
console[['l','o','g'].join('')];//member expr
class Log {
    toString() {
        return 'log';
    }
}
const logToString = new Log();
console[logToString];//member expr
function logFunc() {
    return 'log';
}
console[logFunc()];//member expr
function getConsole() {
    return console
}
getConsole()[logFunc()];//member expr
getConsole().log;//member expr

And with the way JavaScript has evolved, this probably isn't an exhaustive list of ways to construct a MemberExpr. With the level of information ressa provides, we have enough to truly understand the syntactic meaning of the text. This will enable us to build more powerful tools to analyze and/or manipulate any given JavaScript program. With the pervasiveness of print debugging, wouldn't it be nice if we had a tool that would automatically insert a console.log at the top of every function and method in a program? We could make it print the name of that function and also each of the arguments. Let's try and build one. $web-only-end$

Building a Debug Helper

$slides-only$

Demo

$slides-only-end$ $web-only$ To simplify things, we are just going to lift the technique for getting the JavaScript text from the ress example, so we won't be covering that again.

With that out of the way let's take a look at the Cargo.toml and use statements for our program.

[package]
name = "console_logify"
version = "0.1.0"
authors = ["Robert Masen <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
ressa = "0.7.0-beta-7"
atty = "0.2"
resw = "0.4.0-beta-1"
resast = "0.4"

#![allow(unused)]
fn main() {
use ressa::Parser;
use resw::Writer;
use resast::prelude::*;
}

This will make sure that all of the items we will need from ressa and resast are in scope. Now we can start defining our method for inserting the debug logging into any functions that we find. To start we are going to create a function that will generate a new ProgramPart::Stmt that will represent our call to console.log which might look like this.


#![allow(unused)]
fn main() {
pub fn console_log<'a>(args: Vec<Expr<'a>>) -> ProgramPart<'a> {
    ProgramPart::Stmt(Stmt::Expr(Expr::Call(
        CallExpr {
            callee: Box::new(Expr::Member(
                MemberExpr {
                    computed: false,
                    object: Box::new(Expr::ident_from("console")),
                    property: Box::new(Expr::ident_from("log")),
                }
            )),
            arguments: args,
        }
    )))
}
}

This signature might look a little intimidating with all the lifetime annotations. The reason they need to be there is that at the heart of every resast node is a Cow (Clone On Write) slice of the original JavaScript string. Putting it in a Cow makes it possible to manipulate the tree more easily without having to pay the cost of allocating a new string for every node at parse time. The lifetime annotations just tell the compiler that our argument and our return value will live for the same lifetime, since our arguments are going to be embedded in our return value. We will end up using this pattern quite often in this example. Now let's go over what is actually happening here. Our only argument, args, supplies the arguments that will be passed into console.log. From that we build the tree that represents the JavaScript, which will look like this:

  • ProgramPart
    • Stmt
      • Expr
        • CallExpr
          • callee
            • Expr
              • MemberExpr
                • computed: false
                • object
                  • Expr
                    • Ident
                      • name: "console"
                • property
                  • Expr
                    • Ident
                      • name: "log"
          • arguments
            • Vec<Expr>

It might be easier to start from the innermost structure, the MemberExpr, which represents the console.log portion of the desired output. First, we set the computed property to false, which means we are using a . instead of []. Next we define the object, which will be the identifier console, and the property, which will be the identifier log. We nest this inside of a CallExpr as the callee; this represents everything up to the opening parenthesis. The second property, arguments, will, as the name suggests, represent the arguments, so we simply assign it the args provided by the caller. Moving up the tree, we wrap the CallExpr in an Expr, a Stmt and a ProgramPart.
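
The reason a Cow helps when mixing parsed and generated nodes can be shown with the standard library alone. In this sketch the `Ident` struct and both constructor functions are hypothetical stand-ins, not the real resast types: one borrows its name from the source text, the other owns a freshly built string, and both fit in the same collection.

```rust
use std::borrow::Cow;

/// Stand-in for an identifier node (hypothetical, not the resast type).
#[derive(Debug, Clone)]
struct Ident<'a> {
    name: Cow<'a, str>,
}

/// Borrow the name straight out of the source text, no allocation needed.
fn ident_from_source<'a>(source: &'a str, start: usize, end: usize) -> Ident<'a> {
    Ident { name: Cow::Borrowed(&source[start..end]) }
}

/// Build a brand-new identifier that owns its name.
fn ident_from_owned(name: String) -> Ident<'static> {
    Ident { name: Cow::Owned(name) }
}

fn main() {
    let js = String::from("function Thing(stuff) {}");
    // bytes 9..14 of the source are the function's name
    let parsed = ident_from_source(&js, 9, 14);
    let generated = ident_from_owned(format!("{}_{}", "console", "log"));
    // Borrowed and owned names can live side by side in one Vec.
    let idents: Vec<Ident> = vec![parsed, generated];
    for ident in &idents {
        println!("{}", ident.name);
    }
}
```

Nodes that come from the parser stay cheap because they only point into the original text, while nodes we create ourselves, like the console and log identifiers, can own their strings.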

Next, let's work on a few more helper functions, first up is one that will insert a ProgramPart to the top of a FuncBody.


#![allow(unused)]
fn main() {
fn insert_expr_into_func_body<'a>(expr: ProgramPart<'a>, body: &mut FuncBody<'a>) {
    body.0.insert(0, expr);
}
}

This one is pretty straightforward: we take the part and a mutable reference to the body we are modifying. A FuncBody is a tuple struct that wraps a Vec<ProgramPart>, which means we can use the insert method on Vec to add the new item at the first position.

Another useful utility would be a way to convert an Ident into a StringLit, since it is something that we will be doing quite often.


#![allow(unused)]
fn main() {
fn ident_to_string_lit<'a>(i: &Ident<'a>) -> Expr<'a> {
    Expr::Lit(Lit::String(StringLit::Single(i.name.clone())))
}
}

This one is also pretty straightforward: we take a reference to an Ident and clone the name property into a StringLit::Single. We want to wrap that up into an Expr, and to do that we need to wrap it in a Lit::String first.

To continue that theme, let's build another function that takes in an expression and returns that expression's representation as a StringLit. To start, let's build a function that converts an Expr into a rust String. The problem is that not all Exprs can be easily converted into a rust String. This will be a good opportunity to use the Option type to filter out any of the expressions we might not want to pass into console.log.


#![allow(unused)]
fn main() {
fn expr_to_string(expr: &Expr) -> Option<String> {
    match expr {
        Expr::Ident(ref ident) => Some(ident.name.to_string()),
        Expr::This => Some("this".to_string()),
        Expr::Member(ref mem) => {
            let prefix = expr_to_string(&mem.object)?;
            let suffix = expr_to_string(&mem.property)?;
            Some(if mem.computed {
                format!("{}[{}]", prefix, suffix)
            } else {
                format!("{}.{}", prefix, suffix)
            })
        },
        Expr::Lit(lit) => {
            match lit {
                Lit::String(s) => Some(s.clone_inner().to_string()),
                Lit::Number(n) => Some(n.to_string()),
                Lit::Boolean(b) => Some(b.to_string()),
                Lit::RegEx(r) => Some(format!("/{}/{}", r.pattern, r.flags)),
                Lit::Null => Some("null".to_string()),
                _ => None,
            }
        },
        _ => None,
    }
}
}

This function is just a match expression. The first case is the Ident, for which we simply make a copy of the name property by calling to_string. Next is the This case, for which we just create a new string and return it. For a member expression, we want to return the object property converted to a string and the property property converted to a string, separated by a . (or with the property wrapped in [] when computed is true). If either of these two can't be converted to a string, we just return None. The last case that we want to attempt to convert is the literal case; for that we simply extract the inner string in most cases. For the regex case, we reconstruct it by putting the pattern between two slashes and the flags at the end. For the null case we just return that as a new string. The last literal we might handle is Template, which would be a little more complicated to reconstruct for this example, so we will just return None in that case. For any other expression we want to return None, as it would be far more complicated to convert and is pretty uncommon in our use case.
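
The way `?` short-circuits on None inside the member case is easy to see in a self-contained form. The sketch below uses a tiny hypothetical `Expr` enum as a stand-in for the resast one, with a Call variant playing the role of "anything we don't try to convert".

```rust
/// Simplified stand-in for resast's Expr (hypothetical and far smaller).
#[allow(dead_code)]
enum Expr {
    Ident(String),
    This,
    Member { object: Box<Expr>, property: Box<Expr>, computed: bool },
    Call(Box<Expr>),
}

fn expr_to_string(expr: &Expr) -> Option<String> {
    match expr {
        Expr::Ident(name) => Some(name.clone()),
        Expr::This => Some("this".to_string()),
        Expr::Member { object, property, computed } => {
            // `?` bails out with None if either side can't be rendered
            let prefix = expr_to_string(object)?;
            let suffix = expr_to_string(property)?;
            Some(if *computed {
                format!("{}[{}]", prefix, suffix)
            } else {
                format!("{}.{}", prefix, suffix)
            })
        }
        Expr::Call(_) => None,
    }
}

fn main() {
    // this.stuff
    let simple = Expr::Member {
        object: Box::new(Expr::This),
        property: Box::new(Expr::Ident("stuff".to_string())),
        computed: false,
    };
    // getConsole().log, where the call can't be rendered
    let with_call = Expr::Member {
        object: Box::new(Expr::Call(Box::new(Expr::Ident("getConsole".to_string())))),
        property: Box::new(Expr::Ident("log".to_string())),
        computed: false,
    };
    println!("{:?}", expr_to_string(&simple));
    println!("{:?}", expr_to_string(&with_call));
}
```

The first expression renders as Some("this.stuff"), while the second yields None because its object is a call, which is exactly the filtering behavior we want before handing names to console.log.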

Now, we want to wrap the result of this new function into an Expr just like we did for our identifier.


#![allow(unused)]
fn main() {
fn expr_to_string_lit<'a>(e: &Expr<'a>) -> Option<Expr<'a>> {
    let inner = expr_to_string(e)?;
    Some(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(inner)))))
}
}

Because modern JavaScript allows for patterns as function arguments, we are going to need a couple of helpers to handle these possibilities. Let's take this JS as an example.

function Thing({a, b = 0}, [c, d, e]) {

}

Our goal would be to add a call to this function that looks like this.

console.log('Thing', a, b, c, d, e);

Before we get into these pattern arguments, we want to have an easy way to clone an Expr but only when it is an Ident.


#![allow(unused)]
fn main() {
fn clone_ident_from_expr<'a>(expr: &Expr<'a>) -> Option<Expr<'a>> {
    if let Expr::Ident(_) = expr {
        Some(expr.clone())
    } else {
        None
    }
}
}

Here we are just using an if let to test for an Ident and cloning the expression if there is a match. Now let's dig into the Pat argument conversion.


#![allow(unused)]
fn main() {
fn extract_idents_from_pat<'a>(pat: &Pat<'a>) -> Vec<Option<Expr<'a>>> {
    match pat {
        Pat::Ident(i) => {
            vec![Some(Expr::Ident(i.clone()))]
        },
        Pat::Obj(obj) => {
            obj.iter().map(|part| {
                match part {
                    ObjPatPart::Rest(pat) => {
                        extract_idents_from_pat(pat)
                    },
                    ObjPatPart::Assign(prop) => {
                        match prop.key {
                            PropKey::Pat(ref pat) => {
                                extract_idents_from_pat(pat)
                            },
                            PropKey::Expr(ref expr) => {
                                vec![clone_ident_from_expr(expr)]
                            },
                            PropKey::Lit(ref lit) => {
                                vec![Some(Expr::Lit(lit.clone()))]
                            }
                        }
                    },
                }
            }).flatten().collect()
        },
        Pat::Array(arr) => {
            arr.iter().map(|p| {
                match p {
                    Some(ArrayPatPart::Expr(expr)) => {
                        vec![clone_ident_from_expr(expr)]
                    },
                    Some(ArrayPatPart::Pat(pat)) => {
                        extract_idents_from_pat(pat)
                    },
                    None => vec![],
                }
            }).flatten().collect()
        },
        Pat::RestElement(pat) => {
            extract_idents_from_pat(pat)
        },
        Pat::Assign(assign) => {
            extract_idents_from_pat(&assign.left)
        },
    }
}
}

Because patterns like the object or array pattern can contain multiple arguments (in our example a and b would be in the same pattern), we want to return a Vec of the optional identifiers. First, let's cover the simplest pattern, the Ident case. In this case we simply want to create a new Vec with a clone of the inner identifier, wrapped up in an Expr, as its only contents.

Next we get something a little more complicated, the Obj case. Inside of a Pat::Obj is a Vec of an enum called ObjPatPart, which has 2 cases: the normal Assign and the Rest (preceded by ...). The nice thing about the Rest case is that we can simply use recursion to get the idents out of the inner Pat. The Assign case has a data structure called Prop. In this situation we only really care about the key property, since that is where our identifier would live. A property key can be either a Pat, an Expr or a Lit. In the first case we can use the same recursive call to get the identifiers it contains. For the expression case we are going to use the helper function we just wrote to get the ident out if it is an ident. Finally, for the literal case, we are going to just clone the literal into a new Expr. Since we need to do this for each of the ObjPatParts in the object pattern, we are going to use the Iterator trait's map to do the first step in the process, which will convert each element into a Vec of optional Exprs. To get that back down to a single Vec we can use the flatten method. Finally we will collect the iterator back together.

Next we have the Array, which is going to look very similar. First we are going to map the inner elements into our identifiers. Each element is an optional ArrayPatPart, which gives us 3 cases: the Expr, which we can pass off to our helper just like before; the Pat, which we will use recursion for again; and finally None, for which we can just return an empty Vec. The RestElement works just like the object pattern version: we just recurse with the inner value.

Finally we have the Assign case. For this one we want to use the same recursion method but only on the left property. Whew, that one was a bit of a doozy!
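The recursion is easier to follow on a stripped-down model. The sketch below uses a hypothetical MiniPat enum (not the real Pat type) and returns plain names instead of optional Exprs. It also uses flat_map, which is simply map followed by flatten.

```rust
/// Hypothetical stand-in for a few pattern shapes; not the real `Pat` type.
enum MiniPat {
    Ident(String),
    /// `{a, b = 0}`: each entry is itself a pattern.
    Obj(Vec<MiniPat>),
    /// `[c, , e]`: `None` marks a skipped slot.
    Array(Vec<Option<MiniPat>>),
    /// `b = 0`: only the left side names a binding.
    Assign(Box<MiniPat>),
    /// `...rest`
    Rest(Box<MiniPat>),
}

fn collect_names(pat: &MiniPat) -> Vec<String> {
    match pat {
        MiniPat::Ident(name) => vec![name.clone()],
        // One Vec per entry, flattened into a single Vec.
        MiniPat::Obj(parts) => parts.iter().flat_map(collect_names).collect(),
        MiniPat::Array(parts) => parts
            .iter()
            .flatten() // drops the `None` holes
            .flat_map(collect_names)
            .collect(),
        // Both wrappers just recurse into the pattern they contain.
        MiniPat::Assign(inner) | MiniPat::Rest(inner) => collect_names(inner),
    }
}
```

Feeding it the two arguments of a function shaped like Thing({a, b = 0}, [c, , ...e]) yields the names a, b, c and e in that order.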

We are just now starting to dig into the meat of this project; getting through these complicated mappings now is going to greatly simplify things for us later. Since we are going to be primarily working with the FuncArgs in any given Func or ArrowFunc, we should have a function that maps any list of arguments to a new list of identifiers and literals.


#![allow(unused)]
fn main() {
fn extract_idents_from_args<'a>(args: &[FuncArg<'a>]) -> Vec<Expr<'a>> {
    let mut ret = vec![];
    for arg in args {
        match arg {
            FuncArg::Expr(expr) => ret.push(clone_ident_from_expr(expr)),
            FuncArg::Pat(pat) => ret.extend(extract_idents_from_pat(pat)),
        }
    }
    ret.into_iter().filter_map(|e| e).collect()
}
}

In this function we are going to liberally use the last two helpers we put together. A FuncArg can be either a Pat or an Expr. In the former we are dealing with a possible list of many new elements, but for the latter there would be only one. With that in mind we are going to use the Vec method push for one element and extend for possibly many. Once we have gone through each of the arguments provided we want to filter out any of the None cases by using filter_map, which will drop any Nones and unwrap any Somes for us automatically. We can then collect up the result to return.
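As a small illustration of push, extend and filter_map working together, here is a sketch that uses plain string slices in place of Exprs. The function names are made up for this example.

```rust
/// Drops every `None` and unwraps every `Some`.
fn keep_somes(items: Vec<Option<&str>>) -> Vec<&str> {
    items.into_iter().filter_map(|e| e).collect()
}

/// Same result, written with `flatten` instead of `filter_map`.
fn keep_somes_flatten(items: Vec<Option<&str>>) -> Vec<&str> {
    items.into_iter().flatten().collect()
}

fn gather() -> Vec<&'static str> {
    let mut ret: Vec<Option<&str>> = vec![];
    // A single argument contributes exactly one entry.
    ret.push(Some("a"));
    // A pattern argument may contribute many entries, some of them `None`.
    ret.extend(vec![Some("b"), None, Some("c")]);
    keep_somes(ret)
}
```

Because an Option can be iterated like a collection of zero or one items, flatten gives the same result as filter_map with an identity closure.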

Last in our helper functions is going to be a way to go from an AssignLeft into an Expr with a StringLit inside. For this we are going to use the expr_to_string_lit helper in the Expr case, and we are going to match on the Pat case, returning a call to the ident_to_string_lit helper when the pattern is an Ident and None otherwise.

Armed with these helpers it is time to write our first mapping function. A pattern that will be true of all of our mapping functions is that they will always take a Vec of Exprs as the first argument. This is how we are going to track the prefix of any log we want to write. We are going to start with the Class, which is primarily a collection of Funcs wrapped up in Props, so let's start at the property level.


#![allow(unused)]
fn main() {
fn map_class_prop<'a>(mut args: Vec<Expr<'a>>, mut prop: Prop<'a>) -> Prop<'a> {
    match prop.kind {
        PropKind::Ctor => {
            args.insert(args.len().saturating_sub(1), Expr::Lit(Lit::String(StringLit::single_from("new"))));
        },
        PropKind::Get => {
            args.push(
                Expr::Lit(Lit::String(StringLit::single_from("get")))
            );
        },
        PropKind::Set => {
            args.push(
                Expr::Lit(Lit::String(StringLit::single_from("set")))
            );
        },
        _ => (),
    };
    match &prop.key {
        PropKey::Expr(ref expr) => match expr {
            Expr::Ident(ref i) => {
                if i.name != "constructor" {
                    args.push(ident_to_string_lit(i));
                }
            }
            _ => (),
        },
        PropKey::Lit(ref l) => match l {
            Lit::Boolean(_)
            | Lit::Number(_)
            | Lit::RegEx(_)
            | Lit::String(_) => {
                args.push(Expr::Lit(l.clone()))
            }
            Lit::Null => {
                args.push(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(String::from("null"))))));
            }
            _ => (),
        },
        PropKey::Pat(ref p) => {
            match p {
                Pat::Ident(ref i) => args.push(ident_to_string_lit(i)),
                _ => args.extend(extract_idents_from_pat(p).into_iter().filter_map(|e| e)),
            }
        },
    }
    if let PropValue::Expr(expr) = prop.value {
        prop.value = PropValue::Expr(map_expr(args, expr));
    }
    prop
}
}

To start, we want to look at the kind property. There are 3 kinds that are important for us here. The first is Ctor (short for constructor). If we find one of those we want to put the new just before the class name, which should be the last element in the args. To make sure we don't run into any big problems later, we should use the saturating_sub method on usize to do the subtraction. Next are the Get and Set accessors. If we find one of those we just want to append the matching keyword (get or set) to the end of the current args.
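The reason for saturating_sub is easiest to see in isolation. In this sketch the args are plain Strings rather than Exprs, and insert_new is a name made up for the example.

```rust
/// Inserts `"new"` just before the last element (the class name).
fn insert_new(mut args: Vec<String>) -> Vec<String> {
    // With an empty Vec, `args.len() - 1` would underflow.
    // `saturating_sub` stops at 0 instead, so the insert is always in bounds.
    let idx = args.len().saturating_sub(1);
    args.insert(idx, "new".to_string());
    args
}
```

With a single class name in the args the result is new followed by the name, and with an empty list (an anonymous class at the top level, for example) the result is just new.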

Now that we have that, we need to start digging into the ProgramPart to identify anything we want to modify. Since Parser implements Iterator and its Item is Result<ProgramPart, Error> we first need to use filter_map to extract the ProgramPart from the result. It would probably be good to handle the error case here but for the sake of simplicity we are going to skip any errors. Now that we have an Iterator over ProgramParts we can use map to update each part.

fn main() {
    let js = get_js().expect("Unable to get JavaScript");
    let parser = Parser::new(&js).expect("Unable to construct parser");
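    // map_part expects the current log prefix first; at the top level it is empty
    for part in parser.filter_map(|p| p.ok()).map(|part| map_part(vec![], part)) {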
        //FIXME: Write updated program part to somewhere
    }
}
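If skipping errors silently is not acceptable, one alternative is to separate the successes from the failures before mapping. The sketch below is generic over any iterator of Results, so it does not depend on the parser's real item or error types; split_results is a hypothetical helper, not something the crates provide.

```rust
use std::fmt::Display;

/// Splits results into the items that succeeded and the error messages.
fn split_results<T, E: Display>(
    results: impl Iterator<Item = Result<T, E>>,
) -> (Vec<T>, Vec<String>) {
    let mut parts = Vec::new();
    let mut errors = Vec::new();
    for result in results {
        match result {
            Ok(part) => parts.push(part),
            // Keep the message so it can be reported once parsing is done.
            Err(e) => errors.push(e.to_string()),
        }
    }
    (parts, errors)
}
```

The successful items could then be mapped exactly as in the loop above, while the collected messages are printed or returned to the caller.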

With that in mind the entry point is going to be a function that takes the current log arguments and a ProgramPart and returns a new ProgramPart. It might look like this.


#![allow(unused)]
fn main() {
fn map_part<'a>(args: Vec<Expr<'a>>, part: ProgramPart<'a>) -> ProgramPart<'a> {
    match part {
        ProgramPart::Decl(decl) => ProgramPart::Decl(map_decl(args, decl)),
        ProgramPart::Stmt(stmt) => ProgramPart::Stmt(map_stmt(args, stmt)),
        ProgramPart::Dir(_) => part,
    }
}

}

We are going to match on the part provided and either return that part unchanged if it is a directive or, if it isn't, investigate further to discover whether it is a function or not. We do that in two places, map_decl and map_stmt, both of which are going to utilize a similar method for digging further into the tree.


#![allow(unused)]
fn main() {
fn map_decl<'a>(mut args: Vec<Expr<'a>>, decl: Decl<'a>) -> Decl<'a> {
    match decl {
        Decl::Func(f) => Decl::Func(map_func(args, f)),
        Decl::Class(class) => Decl::Class(map_class(args, class)),
        Decl::Var(kind, del) => {
            Decl::Var(kind, del.into_iter().map(|part| {
                if let Pat::Ident(ref ident) = part.id {
                    args.push(ident_to_string_lit(ident));
                }
                VarDecl {
                    id: part.id,
                    init: part.init.map(|e| map_expr(args.clone(), e))
                }
            }).collect())
        }
}

There are two ways for a Decl to resolve into a function or method, the Func and Class variants, while a Stmt can end up there if it is an Expr. When we include map_expr we see that there are cases for both functions and classes in the Expr enum. That means once we get past those we will be handling the rest in the exact same way.


#![allow(unused)]
fn main() {
        _ => decl.clone(),
    }
}

fn map_stmt<'a>(args: Vec<Expr<'a>>, stmt: Stmt<'a>) -> Stmt<'a> {
    match stmt {
        // Only expression statements can contain a function or class
        Stmt::Expr(expr) => Stmt::Expr(map_expr(args, expr)),
        _ => stmt.clone(),
    }
}
}

Finally we are going to start manipulating the AST in map_func.

The first thing we are going to do is to clone the func to give us a mutable version. Next we are going to check if the id is Some; if it is, we can add that name to our console.log arguments. Now, function arguments can be pretty complicated. To try and keep things simple we are going to only worry about the ones that are either Expr::Ident or Pat::Ident. To build something more robust it might be good to include destructured arguments or arguments with default values, but for this example we are just going to keep it simple.

First we are going to filter_map the func.params to only get the items that ultimately resolve to identifiers. At that point we can wrap all of these identifiers in an Expr::Ident and add them to the console.log args. Now we can simply insert the result of passing those args to console_log at the first position of the func.body. Because functions can appear in the body of other functions we also want to map all of the func.body program parts. Once that has completed we can return the updated func to the caller.

The next thing we are going to want to deal with is Class, we want to insert console.log into the top of each method on a class. This is a bit unique because we also want to provide the name of that class (if it exists) as the first argument to console.log. That might look like this.


#![allow(unused)]

fn main() {
fn map_func<'a>(mut args: Vec<Expr<'a>>, mut func: Func<'a>) -> Func<'a> {
    if let Some(ref id) = &func.id {
        args.push(ident_to_string_lit(id));
    }
    let local_args = extract_idents_from_args(&func.params);
    func.body = FuncBody(func.body.0.into_iter().map(|p| map_part(args.clone(), p)).collect());
    insert_expr_into_func_body(console_log(args.clone().into_iter().chain(local_args.into_iter()).collect()), &mut func.body);
    func
}

fn map_arrow_func<'a>(mut args: Vec<Expr<'a>>, mut f: ArrowFuncExpr<'a>) -> ArrowFuncExpr<'a> {
    args.extend(extract_idents_from_args(&f.params));
    match &mut f.body {
        ArrowFuncBody::FuncBody(ref mut body) => {
            insert_expr_into_func_body(console_log(args), body)
        },
        ArrowFuncBody::Expr(expr) => {
            f.body = ArrowFuncBody::FuncBody(FuncBody(vec![
                console_log(args),
                ProgramPart::Stmt(
                    Stmt::Return(
                        Some(*expr.clone())
                    )
                )
            ]))
        }
    }
    f
}

fn map_class<'a>(mut args: Vec<Expr<'a>>, mut class: Class<'a>) -> Class<'a> {
    if let Some(ref id) = class.id {
        args.push(ident_to_string_lit(id))
    }
    let mut new_body = vec![];
    for item in class.body.0 {
        new_body.push(map_class_prop(args.clone(), item))
    }
    class.body = ClassBody(new_body);
    class
}

fn map_class_prop<'a>(mut args: Vec<Expr<'a>>, mut prop: Prop<'a>) -> Prop<'a> {
    match prop.kind {
        PropKind::Ctor => {
            args.insert(args.len().saturating_sub(1), Expr::Lit(Lit::String(StringLit::single_from("new"))));
        },
        PropKind::Get => {
            args.push(
                Expr::Lit(Lit::String(StringLit::single_from("get")))
            );
        },
        PropKind::Set => {
            args.push(
                Expr::Lit(Lit::String(StringLit::single_from("set")))
            );
        },
        _ => (),
    };
    match &prop.key {
        PropKey::Expr(ref expr) => match expr {
            Expr::Ident(ref i) => {
                if i.name != "constructor" {
                    args.push(ident_to_string_lit(i));
                }
            }
            _ => (),
        },
        PropKey::Lit(ref l) => match l {
            Lit::Boolean(_)
            | Lit::Number(_)
            | Lit::RegEx(_)
            | Lit::String(_) => {
                args.push(Expr::Lit(l.clone()))
            }
            Lit::Null => {
                args.push(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(String::from("null"))))));
            }
            _ => (),
        },
        PropKey::Pat(ref p) => {
            match p {
                Pat::Ident(ref i) => args.push(ident_to_string_lit(i)),
                _ => args.extend(extract_idents_from_pat(p).into_iter().filter_map(|e| e)),
            }
        },
    }
    if let PropValue::Expr(expr) = prop.value {
        prop.value = PropValue::Expr(map_expr(args, expr));
    }
    prop
}

fn assign_left_to_string_lit<'a>(left: &AssignLeft<'a>) -> Option<Expr<'a>> {
    match left {
        AssignLeft::Expr(expr) => expr_to_string_lit(expr),
        AssignLeft::Pat(pat) => {
            match pat {
                Pat::Ident(ident) => Some(ident_to_string_lit(ident)),
                _ => None,
            }
        }
    }
}


}

The two functions to focus on here are map_class and map_class_prop. The first pulls the id out of the provided class, if it exists, and adds it to the args. We then just pass that off to map_class_prop, which will handle all of the different types of properties a class can have. The first thing this does is map the prefix into the right format, so a call to new Thing() would print new Thing, or a get method would print Thing get before the method name. Next we take a look at the prop.key, which will provide us with the name of our function. According to the specification a class property key can be an identifier, a literal value, or a pattern, so we need to figure out what the name of this method is by digging into that value. First, in the case that it is an ident, we want to add it to the args, unless it is the value constructor, because we already put the new keyword in for that one. Next we can pull out the literal values and add those as they appear. Lastly, in the pattern case, a Pat::Ident is converted to a string literal and any other pattern is handed to extract_idents_from_pat. Now to get the parameter names from the method definition we need to look at the prop.value, which should be a function expression. Once we match on that we pass it along to map_expr so the function inside gets the same treatment as in map_func: the args are pulled out, passed along to console_log, and that Expr is inserted at the top of the function body.

At this point we have successfully updated our AST to include a call to console.log at the top of each function and method in our code. Now the big question is how do we write that out to a file. This problem is not a small one, in the next section we are going to cover a third crate resw that we can use to finish this project. $web-only-end$

RESW

$web-only$ While ress and ressa consume text and generate data structures, resw is going to consume data structures and write out text. This means it can do the heavy lifting when solving the problem our debug logging project left us with. However, instead of just sweeping that under the rug, we are going to go over how resw works. Because of the nature of JavaScript, resw makes some style decisions that might not work for everyone. By going over the project in detail, the hope is that others will feel enabled to either contribute a configuration option to resw or even implement their own project that consumes ressa's AST and generates text.

If you are just interested in seeing how we are going to finish the project from the last chapter, feel free to move ahead.

Similar to the structure of ressa, resw exposes a struct called Writer that will keep track of the context for us. There are 2 methods for constructing a Writer: the first is the ::new method, and the second is the ::builder method, which utilizes the builder pattern to customize some options. Those options include

  • New line character (default \n)
  • Quote (defaults to the original quotation mark)
    • Setting this to any value will force all of the string literals in the provided JavaScript to be re-written with the provided quotes
  • Indent (default 4 spaces)

With either method you are going to need to provide the destination, which can be anything that implements the std::io::Write trait. For testing purposes the crate provides an implementor of Write in WriteString. We are not going to cover that here but a more detailed explanation can be found in the appendix.
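To show what "anything that implements std::io::Write" buys us, here is a minimal sketch of a writer that is generic over its destination. MiniWriter and its methods are hypothetical and only illustrate the idea; they are not resw's Writer or its API.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a writer that owns its destination.
struct MiniWriter<W: Write> {
    out: W,
    new_line: String,
}

impl<W: Write> MiniWriter<W> {
    fn new(out: W) -> Self {
        MiniWriter { out, new_line: "\n".to_string() }
    }

    /// Writes the text followed by the configured new line sequence.
    fn write_line(&mut self, text: &str) -> io::Result<()> {
        self.out.write_all(text.as_bytes())?;
        self.out.write_all(self.new_line.as_bytes())
    }

    /// Hands the destination back to the caller.
    fn into_inner(self) -> W {
        self.out
    }
}

fn write_to_memory() -> io::Result<String> {
    // `Vec<u8>` implements `Write`, so it works as an in-memory destination.
    let mut writer = MiniWriter::new(Vec::new());
    writer.write_line("let thing = new Thing('argument');")?;
    Ok(String::from_utf8(writer.into_inner()).expect("only UTF-8 was written"))
}
```

Swapping the Vec for a std::fs::File or std::io::stdout() would send the same text to a file or the terminal without changing the writer itself.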

Once a Writer is constructed, it provides an API surface that should cover most of the ressa AST. The primary entry point is going to be either write_program or write_part. For the most part, the primary role of the writer is going to be to incrementally move down the AST until we find something where we are confident in exactly what to write. Let's take the following JS as an example.

function Thing(stuff) {
    this.stuff = stuff;
}
let thing = new Thing('argument');

If we run that through the ressa::Parser, we would see the following AST.

Decl(
    Function(
        Function {
            id: Some(
                "Thing"
            ),
            params: [
                Pat(
                    Identifier(
                        "stuff"
                    )
                )
            ],
            body: [
                Stmt(
                    Expr(
                        Assignment(
                            AssignmentExpr {
                                operator: Equal,
                                left: Expr(
                                    Member(
                                        MemberExpr {
                                            object: ThisExpr,
                                            property: Ident(
                                                "stuff"
                                            ),
                                            computed: false
                                        }
                                    )
                                ),
                                right: Ident(
                                    "stuff"
                                )
                            }
                        )
                    )
                )
            ],
            generator: false,
            is_async: false
        }
    )
)
Decl(
    Variable(
        Let,
        [
            VariableDecl {
                id: Identifier(
                    "thing"
                ),
                init: Some(
                    New(
                        NewExpr {
                            callee: Ident(
                                "Thing"
                            ),
                            arguments: [
                                Literal(
                                    String(
                                        "\'argument\'"
                                    )
                                )
                            ]
                        }
                    )
                )
            }
        ]
    )
)

Using that, let's take a look at how resw would generate the text to represent our AST. First we would enter at write_part with the first ProgramPart.


#![allow(unused)]
fn main() {
pub fn write_part(&mut self, part: &ProgramPart) -> Res {
    self.at_top_level = true;
    self._write_part(part)?;
    self.write_new_line()?;
    Ok(())
}
}

Interestingly enough, write_part is really more concerned with maintaining a context flag for whether we are at the top level or not. This becomes important when trying to determine if any expression needs to be wrapped in parentheses. Almost all of the work is going to be passed off to an internal private function, _write_part.


#![allow(unused)]
fn main() {
fn _write_part(&mut self, part: &ProgramPart) -> Res {
    self.write_leading_whitespace()?;
    match part {
        ProgramPart::Decl(decl) => self.write_decl(decl)?,
        ProgramPart::Dir(dir) => self.write_directive(dir)?,
        ProgramPart::Stmt(stmt) => self.write_stmt(stmt)?,
    }
    Ok(())
}
}

The first thing we want to do is make sure that any leading whitespace is included with write_leading_whitespace.


#![allow(unused)]
fn main() {
pub fn write_leading_whitespace(&mut self) -> Res {
    self.write(&self.indent.repeat(self.current_indent))?;
    Ok(())
}
}

This is achieved by looking at the current_indent and writing the configurable property indent to the destination, repeated once for each indent level. So if our indent was \t and we were at level 2 it would write "\t\t". Internally the write method just writes a single &str to the destination. After we write our leading whitespace, we can start to descend the AST, which we do by matching on the part. You can see that there is a branch for each of the possible enum variants. Looking back at the example, we know the next step would be to head to write_decl.
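The indentation logic boils down to str::repeat from the standard library. The free function below is a made-up helper that mirrors the method above without needing a Writer.

```rust
/// Returns the leading whitespace for an indent unit and a nesting level.
fn leading_whitespace(indent: &str, level: usize) -> String {
    // Level 0 repeats the unit zero times and yields an empty string.
    indent.repeat(level)
}
```

For a tab at level 2 this produces two tabs, and for the default 4 spaces at level 3 it produces 12 spaces.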


#![allow(unused)]
fn main() {
pub fn write_decl(&mut self, decl: &Decl) -> Res {
    match decl {
        Decl::Variable(ref kind, ref decls) => self.write_variable_decls(kind, decls)?,
        Decl::Class(ref class) => {
            self.at_top_level = false;
            self.write_class(class)?;
            self.write_new_line()?;
        },
        Decl::Function(ref func) => {
            self.at_top_level = false;
            self.write_function(func)?;
            self.write_new_line()?;
        },
        Decl::Export(ref exp) => self.write_export_decl(exp)?,
        Decl::Import(ref imp) => self.write_import_decl(imp)?,
    };
    Ok(())
}
}

Moving further down, we simply match on the declaration, handling each variant as needed. For our example we would move into the Decl::Function branch. The first step in that branch is to set the context flag at_top_level to false and then move into the write_function method.


#![allow(unused)]
fn main() {
pub fn write_function(&mut self, func: &Function) -> Res {
    if func.is_async {
        self.write("async ")?;
    }
    self.write("function")?;
    if let Some(ref id) = func.id {
        self.write(" ")?;
        if func.generator {
            self.write("*")?;
        }
        self.write(id)?;
    } else if func.generator {
        self.write("*")?;
    }
    self.write_function_args(&func.params)?;
    self.write(" ")?;
    self.write_function_body(&func.body)
}
}

Here we are going to actually start writing some information out to our destination. First we check the flag on Function to see if we need to write the async keyword. Next we write the keyword function, followed by a check to see if the id is Some. If so, we need to check the flag on Function to see if that function is a generator; if it is, we need to add a * before the id. Lastly we write the id. If there is no id but the function is a generator, we still write the *.
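To see which text each combination of flags produces, here is a sketch that builds the same prefix into a String. MiniFunc is a hypothetical stand-in that carries only the three fields used here.

```rust
/// Hypothetical stand-in holding only the fields the prefix needs.
struct MiniFunc<'a> {
    id: Option<&'a str>,
    generator: bool,
    is_async: bool,
}

/// Builds everything that comes before the parameter list.
fn function_prefix(func: &MiniFunc) -> String {
    let mut out = String::new();
    if func.is_async {
        out.push_str("async ");
    }
    out.push_str("function");
    if let Some(id) = func.id {
        out.push(' ');
        if func.generator {
            out.push('*');
        }
        out.push_str(id);
    } else if func.generator {
        // Anonymous generator: the star directly follows the keyword.
        out.push('*');
    }
    out
}
```

A named generator comes out as function *gen, while an anonymous async generator comes out as async function*.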

Now that we have gotten through that we can start to look at the parameters and body. First we are going to pass off the parameters to write_function_args.


#![allow(unused)]
fn main() {
/// Write the arguments of a function or method definition
/// ```js
/// function(arg1, arg2) {
/// }
/// ```
pub fn write_function_args(&mut self, args: &[FunctionArg]) -> Res {
    self.write("(")?;
    let mut after_first = false;
    for ref arg in args {
        if after_first {
            self.write(", ")?;
        } else {
            after_first = true;
        }
        self.write_function_arg(arg)?;
    }
    self.write(")")?;
    Ok(())
}
}

The first step here is to write the open parenthesis. Next we are going to use a flag, after_first, to help with handling whether a comma should be written before the argument. This is the first place that we have seen where resw is making a style choice: function parameters will never include a trailing comma. Ideally style choices will be configurable in the future but currently this one is not. Now that we have handled the comma situation we can pass the argument off to write_function_arg.
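The after_first flag is a common way to put a separator between items but not after the last one. This sketch applies it to arguments that are already plain strings; write_args is a made-up name for the example.

```rust
/// Joins already-rendered arguments into a parenthesized list.
fn write_args(args: &[&str]) -> String {
    let mut out = String::from("(");
    let mut after_first = false;
    for arg in args {
        if after_first {
            // Every argument except the first is preceded by a separator.
            out.push_str(", ");
        } else {
            after_first = true;
        }
        out.push_str(arg);
    }
    out.push(')');
    out
}
```

For plain strings args.join(", ") would give the same result. The flag is useful in the real writer because each argument is written to the destination as it is visited rather than collected first.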


#![allow(unused)]
fn main() {
pub fn write_function_arg(&mut self, arg: &FunctionArg) -> Res {
    match arg {
        FunctionArg::Expr(ref ex) => self.write_expr(ex)?,
        FunctionArg::Pat(ref pa) => self.write_pattern(pa)?,
    }
    Ok(())
}
}

Here we see another function that simply moves us further down the AST. Function arguments can be either expressions or patterns so we need to handle both. For our example we are going to head down the Pat branch with write_pattern.


#![allow(unused)]
fn main() {
pub fn write_pattern(&mut self, pattern: &Pat) -> Res {
    match pattern {
        Pat::Identifier(ref i) => self.write(i),
        Pat::Object(ref o) => self.write_object_pattern(o),
        Pat::Array(ref a) => self.write_array_pattern(a.as_slice()),
        Pat::RestElement(ref r) => self.write_rest_element(r),
        Pat::Assignment(ref a) => self.write_assignment_pattern(a),
    }
}
}

Most of the options here are simply going to continue branching down our AST. However, for our example we are going to head down the first match arm with Pat::Identifier and just write that string out to our destination.

Moving back up we only had one parameter for our function signature so we finish out write_function_args with a closing parenthesis. That then leads us to write_function_body.


#![allow(unused)]
fn main() {
pub fn write_function_body(&mut self, body: &FunctionBody) -> Res {
    if body.len() == 0 {
        self.write("{ ")?;
    } else {
        self.write_open_brace()?;
        self.write_new_line()?;
    }
    for ref part in body {
        self._write_part(part)?;
    }
    if body.len() == 0 {
        self.write("}")?;
    } else {
        self.write_close_brace()?;
    }
    Ok(())
}
}

The first thing we need to do is take a look at the &FunctionBody, which is a type alias for Vec<ProgramPart>. We check to see if this function has any body. If not, we just write a single open curly brace followed by a space. If it does, we want to write the curly brace using write_open_brace, a convenience method for writing the character and also incrementing the current_indent, and then we write a new line. Now we loop over each of the ProgramParts in body and pass each one off to _write_part. For our example there is only going to be one part. This part is a ProgramPart::Stmt, which would be handled by write_stmt.
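The interplay between the braces and the indent level can be sketched with a tiny stand-in. BodyWriter is hypothetical and takes statements that are already rendered as text. The behavior of write_close_brace here (decrement, then indent, then write the brace) is an assumption, since the real method is not shown above.

```rust
/// Hypothetical stand-in tracking only the output and the indent level.
struct BodyWriter {
    out: String,
    indent: String,
    current_indent: usize,
}

impl BodyWriter {
    fn new() -> Self {
        BodyWriter { out: String::new(), indent: "    ".to_string(), current_indent: 0 }
    }

    fn write_open_brace(&mut self) {
        self.out.push('{');
        self.current_indent += 1;
    }

    /// Assumed behavior: step back out one level before writing the brace.
    fn write_close_brace(&mut self) {
        self.current_indent -= 1;
        self.out.push_str(&self.indent.repeat(self.current_indent));
        self.out.push('}');
    }

    fn write_body(&mut self, body: &[&str]) {
        if body.is_empty() {
            // An empty body stays on one line.
            self.out.push_str("{ }");
            return;
        }
        self.write_open_brace();
        self.out.push('\n');
        for stmt in body {
            self.out.push_str(&self.indent.repeat(self.current_indent));
            self.out.push_str(stmt);
            self.out.push('\n');
        }
        self.write_close_brace();
    }
}
```

Writing the single statement from our example produces an opening brace, the statement indented by 4 spaces on its own line, and the closing brace back at the outer level.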


#![allow(unused)]
fn main() {
pub fn write_stmt(&mut self, stmt: &Stmt) -> Res {
    let mut semi = true;
    let mut new_line = true;
    let cached_state = self.at_top_level;
    match stmt {
        Stmt::Empty => {
            new_line = false;
        },
        Stmt::Debugger => self.write_debugger_stmt()?,
        Stmt::Expr(ref stmt) => {
            let wrap = match stmt {
                Expr::Literal(_)
                | Expr::Object(_)
                | Expr::Function(_) 
                | Expr::Binary(_) => true,
                _ => false,
            };
            if wrap {
                self.write_wrapped_expr(stmt)?
            } else {
                self.write_expr(stmt)?
            }
        },
        Stmt::Block(ref stmt) => {
            self.at_top_level = false;
            self.write_block_stmt(stmt)?;
            semi = false;
            new_line = false;
            self.at_top_level = cached_state;
        }
        Stmt::With(ref stmt) => {
            self.write_with_stmt(stmt)?;
            semi = false;
        }
        Stmt::Return(ref stmt) => self.write_return_stmt(stmt)?,
        Stmt::Labeled(ref stmt) => {
            self.write_labeled_stmt(stmt)?;
            semi = false;
        }
        Stmt::Break(ref stmt) => self.write_break_stmt(stmt)?,
        Stmt::Continue(ref stmt) => self.write_continue_stmt(stmt)?,
        Stmt::If(ref stmt) => {
            self.write_if_stmt(stmt)?;
            semi = false;
        }
        Stmt::Switch(ref stmt) => {
            self.at_top_level = false;
            self.write_switch_stmt(stmt)?;
            semi = false;
        }
        Stmt::Throw(ref stmt) => self.write_throw_stmt(stmt)?,
        Stmt::Try(ref stmt) => {
            self.write_try_stmt(stmt)?;
            semi = false;
        }
        Stmt::While(ref stmt) => {
            new_line = self.write_while_stmt(stmt)?;
            semi = false;
        }
        Stmt::DoWhile(ref stmt) => self.write_do_while_stmt(stmt)?,
        Stmt::For(ref stmt) => {
            self.at_top_level = false;
            new_line = self.write_for_stmt(stmt)?;
            semi = false;
        }
        Stmt::ForIn(ref stmt) => {
            self.at_top_level = false;
            new_line = self.write_for_in_stmt(stmt)?;
            semi = false;
        }
        Stmt::ForOf(ref stmt) => {
            self.at_top_level = false;
            new_line = self.write_for_of_stmt(stmt)?;
            semi = false;
        }
        Stmt::Var(ref stmt) => self.write_var_stmt(stmt)?,
    };
    if semi {
        self.write_empty_stmt()?;
    }
    if new_line {
        self.write_new_line()?;
    }
    self.at_top_level = cached_state;
    Ok(())
}
}

That is a pretty big match statement! Before we enter it we have a couple of context flags to help us with formatting, semi and new_line, both with a default value of true. Looking at our example, we would enter the Stmt::Expr arm of the match, which handles the possible requirement that this statement be wrapped in parentheses. Primitive literals, object literals, functions, and binary operations require parentheses when they are not part of a larger statement. There is a convenience method called write_wrapped_expr that just writes parentheses around a call to write_expr.
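To make the wrapping rule concrete, here is a minimal, self-contained sketch. The MiniExpr enum and the free functions below are hypothetical stand-ins for the resast Expr type and the resw methods, trimmed down to a single case that needs wrapping (an object literal) and one that does not (a call). Without the parentheses, a statement that starts with { would be read by a JavaScript parser as a block rather than an object literal.

```rust
use std::io::{self, Write};

/// A tiny stand-in for an expression tree (hypothetical, not resast's `Expr`).
enum MiniExpr {
    Ident(String),
    Object,              // always written as `{}` in this sketch
    Call(Box<MiniExpr>), // callee followed by `()`
}

/// Write an expression with no wrapping.
fn write_expr<W: Write>(out: &mut W, expr: &MiniExpr) -> io::Result<()> {
    match expr {
        MiniExpr::Ident(name) => write!(out, "{}", name),
        MiniExpr::Object => write!(out, "{{}}"),
        MiniExpr::Call(callee) => {
            write_expr(out, callee)?;
            write!(out, "()")
        }
    }
}

/// Write parentheses around a call to `write_expr`.
fn write_wrapped_expr<W: Write>(out: &mut W, expr: &MiniExpr) -> io::Result<()> {
    write!(out, "(")?;
    write_expr(out, expr)?;
    write!(out, ")")
}

/// Write an expression statement, wrapping it only when needed.
fn write_expr_stmt<W: Write>(out: &mut W, expr: &MiniExpr) -> io::Result<()> {
    let wrap = matches!(expr, MiniExpr::Object);
    if wrap {
        write_wrapped_expr(out, expr)?;
    } else {
        write_expr(out, expr)?;
    }
    write!(out, ";")
}

/// Helper for trying the sketch out: render a statement to a `String`.
fn render(expr: &MiniExpr) -> String {
    let mut buf = Vec::new();
    write_expr_stmt(&mut buf, expr).expect("writing to a Vec cannot fail");
    String::from_utf8(buf).expect("only UTF-8 was written")
}
```

Calling render with MiniExpr::Object takes the wrapped path, while a call expression is written as-is.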


#![allow(unused)]
fn main() {
pub fn write_expr(&mut self, expr: &Expr) -> Res {
    let cached_state = self.at_top_level;
    match expr {
        Expr::Literal(ref expr) => self.write_literal(expr)?,
        Expr::This => self.write_this_expr()?,
        Expr::Super => self.write_super_expr()?,
        Expr::Array(ref expr) => self.write_array_expr(expr)?,
        Expr::Object(ref expr) => self.write_object_expr(expr)?,
        Expr::Function(ref expr) => {
            self.at_top_level = false;
            self.write_function(expr)?;
            self.at_top_level = cached_state;
        }
        Expr::Unary(ref expr) => self.write_unary_expr(expr)?,
        Expr::Update(ref expr) => self.write_update_expr(expr)?,
        Expr::Binary(ref expr) => self.write_binary_expr(expr)?,
        Expr::Assignment(ref expr) => {
            self.at_top_level = false;
            self.write_assignment_expr(expr)?
        },
        Expr::Logical(ref expr) => self.write_logical_expr(expr)?,
        Expr::Member(ref expr) => self.write_member_expr(expr)?,
        Expr::Conditional(ref expr) => self.write_conditional_expr(expr)?,
        Expr::Call(ref expr) => self.write_call_expr(expr)?,
        Expr::New(ref expr) => self.write_new_expr(expr)?,
        Expr::Sequence(ref expr) => self.write_sequence_expr(expr)?,
        Expr::Spread(ref expr) => self.write_spread_expr(expr)?,
        Expr::ArrowFunction(ref expr) => {
            self.at_top_level = false;
            self.write_arrow_function_expr(expr)?;
            self.at_top_level = cached_state;
        }
        Expr::Yield(ref expr) => self.write_yield_expr(expr)?,
        Expr::Class(ref expr) => {
            self.at_top_level = false;
            self.write_class(expr)?;
            self.at_top_level = cached_state;
        }
        Expr::MetaProperty(ref expr) => self.write_meta_property(expr)?,
        Expr::Await(ref expr) => self.write_await_expr(expr)?,
        Expr::Ident(ref expr) => self.write_ident(expr)?,
        Expr::TaggedTemplate(ref expr) => self.write_tagged_template(expr)?,
        _ => unreachable!(),
    }
    Ok(())
}
}

The first step here is to keep a copy of the current at_top_level flag so that we can revert back to it after writing, since some of the arms are going to change it. Next we enter another very large match statement. Our example would take the Expr::Assignment arm, passing further work off to write_assignment_expr.


#![allow(unused)]
fn main() {
pub fn write_assignment_expr(&mut self, assignment: &AssignmentExpr) -> Res {
    let wrap_self = match &assignment.left {
        AssignmentLeft::Expr(ref e) => match &**e {
            Expr::Object(_) 
            | Expr::Array(_) => true,
            _ => false,
        }, 
        AssignmentLeft::Pat(ref p) => match p {
            Pat::Array(_) => true,
            Pat::Object(_) => true,
            _ => false,
        }
    };
    if wrap_self {
        self.write("(")?;
    }
    match &assignment.left {
        AssignmentLeft::Expr(ref e) => self.write_expr(e)?,
        AssignmentLeft::Pat(ref p) => self.write_pattern(p)?,
    }
    self.write(" ")?;
    self.write_assignment_operator(&assignment.operator)?;
    self.write(" ")?;
    self.write_expr(&assignment.right)?;
    if wrap_self {
        self.write(")")?;
    }
    Ok(())
}
}

Here we first need to determine if the whole assignment expression needs to be wrapped in parentheses, which would only be true if the left hand side was an object or an array, either as a literal or as a pattern. Next we test the assignment.left property since it can be either an Expr or a Pat. Our example would take us back up through write_expr, but this time we would pass into the Expr::Member arm, which passes its work off to write_member_expr.


#![allow(unused)]
fn main() {
pub fn write_member_expr(&mut self, member: &MemberExpr) -> Res {
    match &*member.object {
        Expr::Assignment(_) 
        | Expr::Literal(Literal::Number(_))
        | Expr::Conditional(_)
        | Expr::Logical(_) 
        | Expr::Function(_)
        | Expr::ArrowFunction(_)
        | Expr::Object(_)
        | Expr::Binary(_) 
        | Expr::Unary(_)
        | Expr::Update(_) => self.write_wrapped_expr(&member.object)?,
        _ => self.write_expr(&member.object)?,
    }
    if member.computed {
        self.write("[")?;
    } else {
        self.write(".")?;
    }
    self.write_expr(&member.property)?;
    if member.computed {
        self.write("]")?;
    }
    Ok(())
}
}

Here we first check to see if the object property is required to be wrapped in parentheses. For us it is not, so we just pass it along to write_expr. This time we are going to end up at Expr::This, which just writes out the literal word this. Next we look at the computed flag on MemberExpr to see if this was originally written with bracket notation (this['stuff']) or dot notation (this.stuff), writing the appropriate character. Now we again pass some work back to write_expr, this time with the property field. This would end on the branch for Expr::Ident, which just writes that value to the destination. If the member expression was computed we would need to write the closing ], but for our example it is not.
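The two notations, plus one of the wrapping cases, can be sketched in isolation. The types below are hypothetical, cut-down versions of the resast ones, and only number literals are wrapped here, whereas the real method wraps a longer list of expression kinds.

```rust
use std::io::{self, Write};

/// Hypothetical stand-ins for the resast types, trimmed to what this sketch needs.
enum MiniExpr {
    This,
    Ident(String),
    Number(String),
    Str(String), // quotes included
    Member(Box<MiniMember>),
}

struct MiniMember {
    object: MiniExpr,
    property: MiniExpr,
    computed: bool,
}

fn write_expr<W: Write>(out: &mut W, expr: &MiniExpr) -> io::Result<()> {
    match expr {
        MiniExpr::This => write!(out, "this"),
        MiniExpr::Ident(s) | MiniExpr::Number(s) | MiniExpr::Str(s) => write!(out, "{}", s),
        MiniExpr::Member(m) => write_member_expr(out, m),
    }
}

fn write_member_expr<W: Write>(out: &mut W, member: &MiniMember) -> io::Result<()> {
    // A bare number followed by `.` would be read as a decimal point,
    // so number literals are wrapped in parentheses.
    if let MiniExpr::Number(_) = member.object {
        write!(out, "(")?;
        write_expr(out, &member.object)?;
        write!(out, ")")?;
    } else {
        write_expr(out, &member.object)?;
    }
    if member.computed {
        // bracket notation: object[property]
        write!(out, "[")?;
        write_expr(out, &member.property)?;
        write!(out, "]")
    } else {
        // dot notation: object.property
        write!(out, ".")?;
        write_expr(out, &member.property)
    }
}

/// Helper for trying the sketch out: render an expression to a `String`.
fn render(expr: &MiniExpr) -> String {
    let mut buf = Vec::new();
    write_expr(&mut buf, expr).expect("writing to a Vec cannot fail");
    String::from_utf8(buf).expect("only UTF-8 was written")
}
```

Building a member with object set to MiniExpr::This and computed set to false reproduces the this.stuff from our example.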

At this point we are back up at write_assignment_expr where we are going to write a single space and then pass the assignment.operator off to write_assignment_operator.


#![allow(unused)]
fn main() {
pub fn write_assignment_operator(&mut self, op: &AssignmentOperator) -> Res {
    let s = match op {
        AssignmentOperator::AndEqual => "&=",
        AssignmentOperator::DivEqual => "/=",
        AssignmentOperator::Equal => "=",
        AssignmentOperator::LeftShiftEqual => "<<=",
        AssignmentOperator::MinusEqual => "-=",
        AssignmentOperator::ModEqual => "%=",
        AssignmentOperator::OrEqual => "|=",
        AssignmentOperator::PlusEqual => "+=",
        AssignmentOperator::PowerOfEqual => "**=",
        AssignmentOperator::RightShiftEqual => ">>=",
        AssignmentOperator::TimesEqual => "*=",
        AssignmentOperator::UnsignedRightShiftEqual => ">>>=",
        AssignmentOperator::XOrEqual => "^=",
    };
    self.write(s)?;
    Ok(())
}
}

This is a relatively straightforward process of looking at which operator was provided and then writing out the text that represents that operator. For our example it would be =, after which we need to write a single space. The last step in write_assignment_expr is to handle the assignment.right, which is also an Expr, so we pass that off to write_expr. Our example will head to the Expr::Ident match arm and then just write to the destination. With that we have now reached the last step in write_function_body, which is to call write_close_brace. Similar to write_open_brace, here we are adjusting the current_indent context property, this time decrementing it. That also brings us to the end of write_function, write_decl, and _write_part. The last thing we do in write_part is to add a trailing new line, another style choice.
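The pairing of write_open_brace and write_close_brace around current_indent can be sketched on its own. IndentWriter below is a hypothetical, trimmed-down writer that only tracks indentation. The four-space indent and the write_line helper are assumptions made for this sketch and are not taken from resw.

```rust
/// A hypothetical, trimmed-down writer that only tracks indentation.
struct IndentWriter {
    out: String,
    current_indent: usize,
}

impl IndentWriter {
    fn new() -> Self {
        Self { out: String::new(), current_indent: 0 }
    }

    /// Write `{`, end the line, and indent everything that follows.
    fn write_open_brace(&mut self) {
        self.out.push_str("{\n");
        self.current_indent += 1;
    }

    /// Undo one level of indentation, then write `}` at the outer level.
    fn write_close_brace(&mut self) {
        self.current_indent = self.current_indent.saturating_sub(1);
        self.write_leading_whitespace();
        self.out.push('}');
    }

    /// Four spaces per indent level, a style choice for this sketch.
    fn write_leading_whitespace(&mut self) {
        for _ in 0..self.current_indent {
            self.out.push_str("    ");
        }
    }

    /// Write one statement on its own line at the current indent.
    fn write_line(&mut self, text: &str) {
        self.write_leading_whitespace();
        self.out.push_str(text);
        self.out.push('\n');
    }
}

/// Write the function from our example, body text included verbatim.
fn render_function() -> String {
    let mut w = IndentWriter::new();
    w.out.push_str("function Thing(stuff) ");
    w.write_open_brace();
    w.write_line("this.stuff = stuff;");
    w.write_close_brace();
    w.out.push('\n');
    w.out
}
```

Because the indent level is decremented before the closing brace is written, the } lines up with the function keyword rather than with the body.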

As our example continues we would then start again at write_part with the next part. This is going to move through _write_part the same as before, however when we get to write_decl we have a new branch to head down. This is the Decl::Variable arm which passes its work off to write_variable_decls.


#![allow(unused)]
fn main() {
pub fn write_variable_decls(&mut self, kind: &VariableKind, decls: &[VariableDecl]) -> Res {
    self.write_variable_kind(kind)?;
    let mut after_first = false;
    for decl in decls {
        if after_first {
            self.write(", ")?;
        } else {
            after_first = true;
        }
        self.write_variable_decl(decl)?;
    }
    self.write_empty_stmt()?;
    self.write_new_line()
}
}

As you might expect the first thing we want to do is to write the variable kind. We pass off the kind variable to write_variable_kind.


#![allow(unused)]
fn main() {
pub fn write_variable_kind(&mut self, kind: &VariableKind) -> Res {
    let s = match kind {
        VariableKind::Const => "const ",
        VariableKind::Let => "let ",
        VariableKind::Var => "var ",
    };
    self.write(s)
}
}

Similar to our examination of write_assignment_operator we are going to simply look at which keyword was used and then write that out, with a trailing space.

Next we need to keep track of the after_first flag, which should be familiar from write_function_args. In our loop, we pass off each of the declarations to write_variable_decl.


#![allow(unused)]
fn main() {
pub fn write_variable_decl(&mut self, decl: &VariableDecl) -> Res {
    self.write_pattern(&decl.id)?;
    if let Some(ref init) = decl.init {
        self.write(" = ")?;
        self.write_expr(init)?;
    }
    Ok(())
}
}

Here we first write out the id of this variable by passing it off to write_pattern. Thankfully our example is pretty simple so we are again going to take that first branch for Pat::Ident and write the identifier to our destination. After that we want to check if this variable is initialized, ours is, and if so we would write the " = " and then write the expression by passing that off to write_expr. For this pass through write_expr we are going to travel down the Expr::New arm which passes its work off to write_new_expr.


#![allow(unused)]
fn main() {
pub fn write_new_expr(&mut self, new: &NewExpr) -> Res {
    self.write("new ")?;
    match &*new.callee {
        Expr::Assignment(_) 
        | Expr::Call(_) => self.write_wrapped_expr(&new.callee)?,
        _ => self.write_expr(&new.callee)?,
    }
    self.write_sequence_expr(&new.arguments)?;
    Ok(())
}
}

At this point we want to first write the new keyword followed by a space. Next we want to write out what the new.callee is which would again bring us to write_expr. Our example would travel to the Expr::Ident arm which just writes that out. Next we need to write an open parenthesis followed by the provided arguments. This time we are going to use the write_sequence_expr method to do that.


#![allow(unused)]
fn main() {
pub fn write_sequence_expr(&mut self, sequence: &[Expr]) -> Res {
    let mut after_first = false;
    self.write("(")?;
    for ref e in sequence {
        if after_first {
            self.write(", ")?;
        }
        self.write_expr(e)?;
        after_first = true;
    }
    self.write(")")?;
    Ok(())
}
}

At this point the structure of this function's body should look familiar: we are going to loop over the provided expressions and write them out with a comma and space before all but the first one. For our example we are only going to hit this once so no comma, then we are going to pass that off to write_expr. This time as we pass through the match in write_expr we are going to hit the Expr::Literal arm which passes its work off to write_literal.


#![allow(unused)]
fn main() {
pub fn write_literal(&mut self, lit: &Literal) -> Res {
    match lit {
        Literal::Boolean(b) => self.write_bool(*b),
        Literal::Null => self.write("null"),
        Literal::Number(n) => self.write(&n),
        Literal::String(s) => self.write_string(s),
        Literal::RegEx(r) => self.write_regex(r),
        Literal::Template(t) => self.write_template(t),
    }
}
}

Here we see another match statement, our example will take us down the Literal::String arm which passes off work to write_string. You may be wondering why that is, since writing strings is all we have really been doing. The answer is that this is one of the few style preferences that is currently configurable as you'll see.


#![allow(unused)]
fn main() {
pub fn write_string(&mut self, s: &str) -> Res {
    if let Some(c) = self.quote {
        self.re_write_string(s, c)?;
    } else {
        self.write(s)?;
    }
    Ok(())
}
}

We first check to see if the self.quote property has been set, which would indicate that the user has a quote preference. If it is set then we want to re-write the string to use this quote. This involves removing the escapes from any internal occurrences of the old quote and escaping any occurrences of the new quote in the contents. If that property is None then we would just write it out normally as the ressa::node::Literal::String preserves the original quotation mark.
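The body of re_write_string isn't shown here, so the following is only a sketch of the idea under a few assumptions: the literal arrives with its quotes included, only single and double quotes are handled, and the function name re_quote is made up for illustration. It is not resw's implementation.

```rust
/// Re-quote a JavaScript string literal (quotes included) with `new_quote`.
/// A hypothetical stand-in for the re-writing step, not resw's implementation.
fn re_quote(literal: &str, new_quote: char) -> String {
    let mut chars = literal.chars();
    let old_quote = match chars.next() {
        Some(c) if c == '\'' || c == '"' => c,
        _ => return literal.to_string(), // not a quoted literal, leave it alone
    };
    // Drop the closing quote; what remains is the string's contents.
    let inner = chars.as_str();
    let inner = inner.strip_suffix(old_quote).unwrap_or(inner);

    let mut out = String::with_capacity(literal.len());
    out.push(new_quote);
    let mut iter = inner.chars();
    while let Some(c) = iter.next() {
        if c == '\\' {
            match iter.next() {
                // an escaped old quote no longer needs its backslash
                Some(next) if next == old_quote && old_quote != new_quote => out.push(next),
                // any other escape sequence is kept as-is
                Some(next) => {
                    out.push('\\');
                    out.push(next);
                }
                None => out.push('\\'),
            }
        } else if c == new_quote && old_quote != new_quote {
            // an unescaped new quote must now be escaped
            out.push('\\');
            out.push(c);
        } else {
            out.push(c);
        }
    }
    out.push(new_quote);
    out
}
```

For our example, re_quote("'argument'", '"') would swap the single quotes for double quotes and leave the contents untouched.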

After that we are again back at write_new_expr where the last thing to do is write the closing parenthesis, after which we are at the bottom of write_variable_decl. When we move up again to the write_variable_decls we would write a semi-colon and new line to close that out. This brings us to the bottom of write_decl, _write_part, and write_part, it also brings us to the end of our example JavaScript. While we didn't touch every part of how resw works (there is a lot of surface area to cover), hopefully it has provided enough information for you to feel confident in how it works. For more information you can check out the ressa docs and the resw docs.
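To recap the path we just walked for the second line of our example, here is a condensed sketch that writes let x = new Thing('argument'); from a tiny tree. The MiniExpr and MiniDecl types and the free functions are hypothetical simplifications that write to a String, so this mirrors the shape of the resw methods rather than reproducing them.

```rust
/// Hypothetical, trimmed-down AST for `let x = new Thing('argument');`.
enum MiniExpr {
    Ident(String),
    Str(String), // quotes included, as the original literal preserves them
    New { callee: Box<MiniExpr>, arguments: Vec<MiniExpr> },
}

struct MiniDecl {
    id: String,
    init: Option<MiniExpr>,
}

fn write_expr(out: &mut String, expr: &MiniExpr) {
    match expr {
        MiniExpr::Ident(s) | MiniExpr::Str(s) => out.push_str(s),
        MiniExpr::New { callee, arguments } => {
            out.push_str("new ");
            write_expr(out, callee);
            write_sequence(out, arguments);
        }
    }
}

/// Parenthesized, comma-separated list using the `after_first` flag.
fn write_sequence(out: &mut String, sequence: &[MiniExpr]) {
    let mut after_first = false;
    out.push('(');
    for e in sequence {
        if after_first {
            out.push_str(", ");
        }
        write_expr(out, e);
        after_first = true;
    }
    out.push(')');
}

/// Keyword, then each declaration, then a semi-colon and a new line.
fn write_variable_decls(out: &mut String, kind: &str, decls: &[MiniDecl]) {
    out.push_str(kind);
    out.push(' ');
    let mut after_first = false;
    for decl in decls {
        if after_first {
            out.push_str(", ");
        }
        after_first = true;
        out.push_str(&decl.id);
        if let Some(init) = &decl.init {
            out.push_str(" = ");
            write_expr(out, init);
        }
    }
    out.push_str(";\n");
}

/// Build the tree for our example line and write it out.
fn example() -> String {
    let decl = MiniDecl {
        id: "x".to_string(),
        init: Some(MiniExpr::New {
            callee: Box::new(MiniExpr::Ident("Thing".to_string())),
            arguments: vec![MiniExpr::Str("'argument'".to_string())],
        }),
    };
    let mut out = String::new();
    write_variable_decls(&mut out, "let", &[decl]);
    out
}
```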

Up next we are going to see how you would use resw to complete our debug log helper.

$web-only-end$

$slides-only$

  • Writer takes ProgramParts
  • Somewhat Configurable
  • Writes to impl Write $slides-only-end$

Building a Writer

$web-only$ Thankfully because of the existence of resw completing the console.log debugging tool is going to be trivial. The primary entry point for resw is the Writer struct, which has a method write_part that will take a &mut self and &ProgramPart, so we can use that in our for loop to write out the parts as they are parsed. That might look like this.

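use ressa::Parser;
use resw::Writer;
use std::{fs::read_to_string, io::stdout};

fn main() {
    let mut args = ::std::env::args();
    let _ = args.next();
    let file_name = args
        .next()
        .unwrap_or(String::from("./examples/insert_logging.js"));
    let js = read_to_string(file_name).expect("Unable to find js file");
    let mut writer = Writer::new(stdout());
    for part in Parser::new(&js).expect("Unable to create parser") {
        // (the console.log insertion step from the earlier chapter goes here)
        let part = part.expect("Unable to parse part");
        writer.write_part(&part).expect("Unable to write part");
    }
}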

With that complete we can see how well it works for us. Let's use the following example JavaScript.

function Thing(stuff) {
    this.stuff = stuff;
}
let x = new Thing('argument');

Just as a simple test we could enter the following into our terminal

$ echo "function Thing(stuff) {
    this.stuff = stuff;
}
let x = new Thing('argument');
" | console_logify
function Thing(stuff) {
    console.log('Thing', stuff);
    this.stuff = stuff;
}

let x = new Thing('argument');

That looks exactly like the output we were looking for. Let's double check that it will behave as expected by piping the output to node.

$ echo "function Thing(stuff) {
    this.stuff = stuff;
}
let x = new Thing('argument');
" | console_logify | node -
Thing argument

It worked!

$web-only-end$ $slides-only$

Demo

$slides-only-end$

Conclusion

$web-only$ Hopefully you now have all you need to get started building your JavaScript development tools using Rust. If you do create one please open an issue on this project's GitHub issues page with the project's name, a short description, and a link, and it will be added to the appendix.

If you run into any problems in any crates (including typos in this book) it would be wonderful of you to open an issue on GitHub.

If you want to get involved, there are probably a few open issues that could use some help. Each project does provide contributing guidelines.

$web-only-end$ $slides-only$

  • Annotated version of this presentation
    • https://FreeMasen.github.io/rusty-ecma-book
  • Where to find me
    • email: r.f.masen@gmail.com
    • website: https://WiredForge.com
    • twitter/github: @FreeMasen $slides-only-end$

Appendix

  1. Tokens
  2. banned_tokens.toml
  3. AST
  4. StringWriter
  5. Projects

Tokens

Here is a list of all of the possible tokens ress provides

  • Token
    • EoF
    • Boolean - enum BooleanLiteral
      • True
      • False
    • Ident - struct Ident(String)
    • Keyword - enum Keyword
      • Await
      • Break
      • Case
      • Catch
      • Class
      • Const
      • Continue
      • Debugger
      • Default
      • Delete
      • Do
      • Else
      • Enum
      • Export
      • Finally
      • For
      • Function
      • If
      • Implements
      • Import
      • In
      • InstanceOf
      • Interface
      • Let
      • New
      • Package
      • Private
      • Protected
      • Public
      • Return
      • Static
      • Super
      • Switch
      • This
      • Throw
      • Try
      • TypeOf
      • Var
      • Void
      • While
      • With
      • Yield
    • Null
    • Numeric - struct Number(String)
      • 0
      • .0
      • 0.0
      • 0.0e1
      • 0.0E1
      • .0e1
      • .0E1
      • 0xfff
      • 0Xfff
      • 0o777
      • 0O777
      • 0b111
      • 0B111
    • Punct - enum Punct
      • And - &
      • Assign - =
      • Asterisk - *
      • BitwiseNot - ~
      • Caret - ^
      • CloseBrace - }
      • CloseBracket - ]
      • CloseParen - )
      • Colon - :
      • Comma - ,
      • ForwardSlash - /
      • GreaterThan - >
      • LessThan - <
      • Minus - -
      • Modulo - %
      • Not - !
      • OpenBrace - {
      • OpenBracket - [
      • OpenParen - (
      • Period - .
      • Pipe - |
      • Plus - +
      • QuestionMark - ?
      • SemiColon - ;
      • Spread - ...
      • UnsignedRightShiftAssign - >>>=
      • StrictEquals - ===
      • StrictNotEquals - !==
      • UnsignedRightShift - >>>
      • LeftShiftAssign - <<=
      • RightShiftAssign - >>=
      • ExponentAssign - **=
      • LogicalAnd - &&
      • LogicalOr - ||
      • Equal - ==
      • NotEqual - !=
      • AddAssign - +=
      • SubtractAssign - -=
      • MultiplyAssign - *=
      • DivideAssign - /=
      • Increment - ++
      • Decrement - --
      • LeftShift - <<
      • RightShift - >>
      • BitwiseAndAssign - &=
      • BitwiseOrAssign - |=
      • BitwiseXOrAssign - ^=
      • ModuloAssign - %=
      • FatArrow - =>
      • GreaterThanEqual - >=
      • LessThanEqual - <=
      • Exponent - **
    • String - enum StringLit
      • Single(String)
      • Double(String)
    • Regex - struct Regex
      • body - String
      • flags - Option<String>
    • Template - enum Template
      • NoSub(String)
      • Head(String)
      • Middle(String)
      • Tail(String)
    • Comment - struct Comment
      • kind - enum Kind
        • Single - //comment
        • Multi - /* comment */
        • Html - <!-- comment --> trailing content
      • content - String
      • tail_content - Option<String>

banned_tokens.toml

idents = [
    "Int8Array",
    "Uint8Array",
    "Uint8ClampedArray",
    "Int16Array",
    "Uint16Array",
    "Int32Array",
    "Uint32Array",
    "Float32Array",
    "Float64Array",
    "Promise",
    "Proxy",
    "async",
    "padStart",
    "padEnd",
    "includes",
    "find",
    "getComputedStyle",
    "FontFace",
    "FontFaceSet",
    "FontFaceSetLoadEvent",
    "MediaSource",
    "sourceBuffers",
    "activeSourceBuffers",
    "readyState",
    "duration",
    "onsourceclose",
    "onsourceended",
    "addSourceBuffer",
    "removeSourceBuffer",
    "endOfStream",
    "setLiveSeekableRange",
    "clearLiveSeekableRange",
    "isTypeSupported",
    "TouchEvent",
    "Touch",
    "TouchList",
    "onpointerover",
    "onpointerenter",
    "onpointerdown",
    "onpointermove",
    "onpointerup",
    "onpointercancel",
    "onpointerout",
    "onpointerleave",
    "ongotpointercapture",
    "onlostpointercapture",
    "setPointerCapture",
    "releasePointerCapture",
    "MutationObserver",
]
keywords = [
    "let",
    "const",
    "class",
    "await",
    "import",
    "export",
    "yield",
]
puncts = [
    "=>",
    "**",
    "...",
    "`",
]
strings = [
    "use strict",
    "sourceopen",
    "touchstart",
    "touchend",
    "touchmove",
    "touchcancel",
    "pointerenter",
    "pointerdown",
    "pointermove",
    "pointerup",
    "pointercancel",
    "pointerout",
    "pointerleave",
    "gotpointercapture",
    "lostpointercapture",
    "pointerover",
]

AST

While it may be a bit of a cop-out, it seems silly to duplicate the AST docs provided by cargo-doc. In the future this page may include some more introspective information but for now please refer to the link below.

resast docs

StringWriter

When building resw it became clear that the only way to validate the output would be to write a bunch of files to disk and then read them back, which didn't seem like the correct option. Because of this resw includes a public module called write_str. In it you will find two structs, WriteString and ChildWriter. The basic idea here is that you can use this to simply write the values to a buffer that the resw::Writer hasn't taken ownership over and then read them back after the Writer is done. Below is an example of how you might use that.


#![allow(unused)]
fn main() {
fn test_round_trip() {
    let original = "let x = 0";
    let dest = WriteString::new();
    let parser = ressa::Parser::new(original).expect("Failed to create parser");
    // write_part takes &mut self and a &ProgramPart
    let mut writer = resw::Writer::new(dest.generate_child());
    for part in parser {
        let part = part.expect("failed to parse part");
        writer.write_part(&part).expect("failed to write part");
    }
    assert_eq!(dest.get_string_lossy(), original.to_string());
}
}
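If you are curious how a buffer can be written to by one owner and read back by another, the idea can be sketched with a reference-counted, interior-mutable Vec<u8>. The SharedBuffer and ChildHandle names below are hypothetical and the internals of resw's WriteString and ChildWriter may well differ. The write_js function is just a stand-in for any consumer, like Writer, that takes ownership of an impl Write.

```rust
use std::cell::RefCell;
use std::io::{self, Write};
use std::rc::Rc;

/// Owns the shared buffer and can read it back later.
/// (Hypothetical names; resw's `WriteString` and `ChildWriter` may differ internally.)
struct SharedBuffer {
    inner: Rc<RefCell<Vec<u8>>>,
}

/// A handle that implements `Write` and appends to the parent's buffer.
struct ChildHandle {
    inner: Rc<RefCell<Vec<u8>>>,
}

impl SharedBuffer {
    fn new() -> Self {
        Self { inner: Rc::new(RefCell::new(Vec::new())) }
    }

    /// Hand out a writer that shares this buffer.
    fn generate_child(&self) -> ChildHandle {
        ChildHandle { inner: Rc::clone(&self.inner) }
    }

    /// Read back everything written so far.
    fn get_string_lossy(&self) -> String {
        String::from_utf8_lossy(&self.inner.borrow()).into_owned()
    }
}

impl Write for ChildHandle {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.borrow_mut().extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing is buffered outside of the shared Vec.
        Ok(())
    }
}

/// Stand-in for any consumer that takes ownership of an `impl Write`.
fn write_js(mut dest: impl Write) -> io::Result<()> {
    dest.write_all(b"let x = 0;")
}
```

After passing generate_child() to write_js, the parent can still call get_string_lossy() because only the child handle was moved.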

Projects

name | description | website
--- | --- | ---
console_logger | A utility that will insert console.log to the top of all of your function bodies | repo
lint-ie8 | A utility that will check for any JavaScript that would fail when executed by Internet Explorer 8 | repo

RESS Scanners

In the initial implementation of the ress scanner, it was more important to get something working correctly than to have something blazing fast. To that end, the original Scanner performs a significant amount of memory allocation, which slows everything down quite a bit. To improve upon that, ress offers a second option, the RefScanner, which is a bit unfortunately named as it doesn't actually use any references. The RefScanner provides almost the same information as the Scanner but it does so without making any copies from the original JavaScript string. It then offers the option to request the String for any Item, giving control to the user. Here is an example of the two approaches.

Example JS

function things() {
    return [1,2,3,4];
}

Example Rust

use ress::{
    Scanner
};
fn main() {
    let js = include_str!("../example.js");
    let scanner = Scanner::new(js);
    for (i, item) in scanner.enumerate() {
        let item = item.unwrap();
        let prefix = if i < 10 {
            format!(" {}", i)
        } else {
            format!("{}", i)
        };
        println!("{} token: {:?}", prefix, item.token);
    }
}

#[cfg(test)]
mod test {
    use ress::*;
    #[test]
    fn chapter_1_1() {
        let js = "var i = 0;";
        let scanner = Scanner::new(js);
        for token in scanner {
            println!("{:#?}", token.unwrap());
        }
    }
    use ressa::Parser;
    #[test]
    fn ressa_ex1() {
        static JS: &str = "
function Thing(stuff) {
    this.stuff = stuff;
}
";
        let parser = Parser::new(JS).expect("Failed to create parser");
        for part in parser {
            let part = part.expect("Failed to parse part");
            println!("{:#?}", part);
        }
    }
}

Output


running 1 test
Decl(
    Func(
        Func {
            id: Some(
                Ident {
                    name: "Thing",
                },
            ),
            params: [
                Pat(
                    Ident(
                        Ident {
                            name: "stuff",
                        },
                    ),
                ),
            ],
            body: FuncBody(
                [
                    Stmt(
                        Expr(
                            Assign(
                                AssignExpr {
                                    operator: Equal,
                                    left: Expr(
                                        Member(
                                            MemberExpr {
                                                object: This,
                                                property: Ident(
                                                    Ident {
                                                        name: "stuff",
                                                    },
                                                ),
                                                computed: false,
                                            },
                                        ),
                                    ),
                                    right: Ident(
                                        Ident {
                                            name: "stuff",
                                        },
                                    ),
                                },
                            ),
                        ),
                    ),
                ],
            ),
            generator: false,
            is_async: false,
        },
    ),
)
test test::ressa_ex1 ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out

Let's look at token 7. The original token is Token::Numeric(Number(String::from("1"))) while the ref token is Token::Numeric(Number::Dec). Both give similar information, but the ref token doesn't allocate a new string for the text being represented, instead just informing the user that it is a decimal number. If you wanted to know what that string was, you could use the RefScanner::string_for method by passing it RefItem.span. This will return an Option<String>, and so long as your span doesn't overflow the length of the js provided, it will have the value you are looking for.
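The trade-off can be sketched without ress at all. Below, Span, NumberKind, and MiniScanner are hypothetical types made up for illustration. Only the Dec variant and the string_for name come from the text above, the Hex, Oct, and Bin variants are assumptions based on the numeric forms listed in the Tokens appendix.

```rust
/// Byte offsets into the original source (hypothetical, mirroring the idea of a span).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Describes which kind of number was found without storing its text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NumberKind {
    Dec,
    Hex,
    Oct,
    Bin,
}

/// Classify a numeric literal by its prefix, without allocating.
fn number_kind(text: &str) -> NumberKind {
    match text.as_bytes() {
        [b'0', b'x', ..] | [b'0', b'X', ..] => NumberKind::Hex,
        [b'0', b'o', ..] | [b'0', b'O', ..] => NumberKind::Oct,
        [b'0', b'b', ..] | [b'0', b'B', ..] => NumberKind::Bin,
        _ => NumberKind::Dec,
    }
}

/// Holds on to the source so text can be produced only when asked for.
struct MiniScanner<'a> {
    source: &'a str,
}

impl<'a> MiniScanner<'a> {
    /// Return the text for a span, or `None` if the span does not fit the source.
    fn string_for(&self, span: Span) -> Option<String> {
        self.source.get(span.start..span.end).map(str::to_string)
    }
}
```

The only allocation happens inside string_for, so a user who never asks for the text of a token never pays for it.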