Building a Debug Helper

$slides-only$

Demo

$slides-only-end$ $web-only$ To simplify things, we are just going to lift the technique for getting the JavaScript text from the ress example, so we won't be covering that again.

With that out of the way let's take a look at the Cargo.toml and use statements for our program.

[package]
name = "console_logify"
version = "0.1.0"
authors = ["Robert Masen <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
ressa = "0.7.0-beta-7"
atty = "0.2"
resw = "0.4.0-beta-1"
resast = "0.4"

#![allow(unused)]
fn main() {
use ressa::Parser;
use resw::Writer;
use resast::prelude::*;
}

This will make sure that all of the items we will need from ressa and resast are in scope. Now we can start defining our method for inserting the debug logging into any functions that we find. To start we are going to create a function that will generate a new ProgramPart::Stmt that will represent our call to console.log which might look like this.


#![allow(unused)]
fn main() {
pub fn console_log<'a>(args: Vec<Expr<'a>>) -> ProgramPart<'a> {
    ProgramPart::Stmt(Stmt::Expr(Expr::Call(
        CallExpr {
            callee: Box::new(Expr::Member(
                MemberExpr {
                    computed: false,
                    object: Box::new(Expr::ident_from("console")),
                    property: Box::new(Expr::ident_from("log")),
                }
            )),
            arguments: args,
        }
    )))
}
}

This signature might look a little intimidating with all the lifetime annotations. They need to be there because at the heart of every resast node is a Cow (clone-on-write) slice of the original JavaScript string. Putting the text in a Cow makes it easier to manipulate the tree without having to pay the cost of allocating a new string for every node at parse time. The lifetime annotations just tell the compiler that our argument and our return value share the same lifetime, since our arguments are going to be embedded in our return value. We will end up using this pattern quite often in this example.

Now let's go over what is actually happening here. We take in args as our only argument; it supplies the arguments that will be passed to console.log. From there we build the tree that represents the JavaScript, which will look like this:

  • ProgramPart
    • Stmt
      • Expr
        • CallExpr
          • callee
            • Expr
              • MemberExpr
                • computed: false
                • object
                  • Expr
                    • Ident
                      • name: "console"
                • property
                  • Expr
                    • Ident
                      • name: "log"
          • arguments
            • Vec<Expr>

It might be easier to start from the innermost structure, the MemberExpr, which represents the console.log portion of the desired output. First, we set the computed property to false, which means we are using a . instead of []. Next we define the object, which will be the identifier console, and the property, which will be the identifier log. We nest this inside of a CallExpr as the callee; this represents everything up to the opening parenthesis. The second property, arguments, will, as the name suggests, represent the arguments; we simply assign that to the args provided by the caller. Moving up the tree, we wrap the CallExpr in an Expr, a Stmt and a ProgramPart.
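As an aside on the lifetime annotations discussed earlier, here is a minimal sketch of the borrowed-versus-owned idea using only the standard library. The Name struct and the three functions are hypothetical stand-ins, not part of resast; they only illustrate why a node holding a Cow carries a lifetime, and why a node we build from scratch can share that same lifetime.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for an AST node that holds a piece of text.
/// This is NOT the resast `Ident`, just an illustration.
#[derive(Debug, Clone, PartialEq)]
struct Name<'a> {
    text: Cow<'a, str>,
}

/// Borrow a slice of the original source: no new `String` is allocated.
fn name_from_source<'a>(source: &'a str, start: usize, end: usize) -> Name<'a> {
    Name { text: Cow::Borrowed(&source[start..end]) }
}

/// Build a node that never appeared in the source, so it owns its text.
fn name_from_owned<'a>(text: String) -> Name<'a> {
    Name { text: Cow::Owned(text) }
}

/// Report whether a node still points into the source text.
fn is_borrowed(name: &Name) -> bool {
    matches!(name.text, Cow::Borrowed(_))
}
```

Both constructors return a Name<'a>, so parsed nodes and hand-built nodes can sit side by side in the same tree, which is the situation console_log is in when it mixes new identifiers with the caller's arguments.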

Next, let's work on a few more helper functions, first up is one that will insert a ProgramPart to the top of a FuncBody.


#![allow(unused)]
fn main() {
fn insert_expr_into_func_body<'a>(expr: ProgramPart<'a>, body: &mut FuncBody<'a>) {
    body.0.insert(0, expr);
}
}

This one is pretty straightforward: we take the part and a mutable reference to the body we are modifying. A FuncBody is a tuple struct that wraps a Vec<ProgramPart>, which means we can use the insert method on Vec to add the new item at the first position.
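A stripped-down sketch of the same move, using a hypothetical Body tuple struct around a Vec<String> rather than the real FuncBody. It shows the .0 field access, and that inserting at index 0 shifts the existing items back instead of overwriting them.

```rust
/// Hypothetical stand-in for `FuncBody`: a tuple struct wrapping a `Vec`.
struct Body(Vec<String>);

/// Put `stmt` in front of everything already in the body.
fn insert_at_top(stmt: String, body: &mut Body) {
    // `.0` reaches the wrapped Vec; index 0 is the first position.
    body.0.insert(0, stmt);
}
```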

Another useful utility would be a way to convert an Ident into a StringLit, since it is something that we will be doing quite often.


#![allow(unused)]
fn main() {
fn ident_to_string_lit<'a>(i: &Ident<'a>) -> Expr<'a> {
    Expr::Lit(Lit::String(StringLit::Single(i.name.clone())))
}
}

This one is also pretty straightforward: we take a reference to an Ident and clone the name property into a StringLit::Single. To wrap that up into an Expr, we first need to wrap it in a Lit::String.

To continue that theme, let's build another function that takes in an expression and returns that expression's representation as a StringLit. To start, let's build a function that converts an Expr into a Rust String. The problem is that not all Exprs can be easily converted into a Rust String. This will be a good opportunity to use the Option type to filter out any of the expressions we might not want to pass into console.log.


#![allow(unused)]
fn main() {
fn expr_to_string(expr: &Expr) -> Option<String> {
    match expr {
        Expr::Ident(ref ident) => Some(ident.name.to_string()),
        Expr::This => Some("this".to_string()),
        Expr::Member(ref mem) => {
            let prefix = expr_to_string(&mem.object)?;
            let suffix = expr_to_string(&mem.property)?;
            Some(if mem.computed {
                format!("{}[{}]", prefix, suffix)
            } else {
                format!("{}.{}", prefix, suffix)
            })
        },
        Expr::Lit(lit) => {
            match lit {
                Lit::String(s) => Some(s.clone_inner().to_string()),
                Lit::Number(n) => Some(n.to_string()),
                Lit::Boolean(b) => Some(b.to_string()),
                Lit::RegEx(r) => Some(format!("/{}/{}", r.pattern, r.flags)),
                Lit::Null => Some("null".to_string()),
                _ => None,
            }
        },
        _ => None,
    }
}
}

This function is just a match expression. The first case is the Ident, where we simply make a copy of the name property by calling to_string. Next is the This case, for which we just create a new string and return it. For a member expression, we want to return the object property converted to a string and the property property converted to a string, separated by a . (or with the property wrapped in [] when the member is computed). If either of these two can't be converted to a string, the ? operator returns None for us. The last case that we want to attempt to convert is the literal case; for that we simply extract the inner string in most cases. For the regex case, we reconstruct it by putting the pattern between two slashes and the flags at the end. For the null case we just return "null" as a new string. The last case we might handle is Template, which would be a little more complicated to reconstruct for this example, so we will just return None in that case. For any other expression we also return None, as it would be far more complicated to convert and is pretty uncommon in our use case.
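To isolate the Option and ? mechanics described above, here is a small sketch over a made-up Node enum. Node and node_to_string are assumptions for illustration only (they mirror just a few Expr variants and are not resast types); the point is that one unconvertible child makes the whole conversion come back as None.

```rust
/// Simplified, hypothetical stand-in for a handful of `Expr` variants.
enum Node {
    Ident(String),
    This,
    Member { object: Box<Node>, property: Box<Node>, computed: bool },
    /// Something we choose not to convert, like a call expression.
    Call(Box<Node>),
}

fn node_to_string(node: &Node) -> Option<String> {
    match node {
        Node::Ident(name) => Some(name.clone()),
        Node::This => Some("this".to_string()),
        Node::Member { object, property, computed } => {
            // `?` returns None from the whole function if either side fails.
            let prefix = node_to_string(object)?;
            let suffix = node_to_string(property)?;
            Some(if *computed {
                format!("{}[{}]", prefix, suffix)
            } else {
                format!("{}.{}", prefix, suffix)
            })
        }
        Node::Call(_) => None,
    }
}
```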

Now, we want to wrap the result of this new function into an Expr just like we did for our identifier.


#![allow(unused)]
fn main() {
fn expr_to_string_lit<'a>(e: &Expr<'a>) -> Option<Expr<'a>> {
    let inner = expr_to_string(e)?;
    Some(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(inner)))))
}
}

Because modern JavaScript allows for patterns as function arguments, we are going to need a couple of helpers to handle these possibilities. Let's take this JS as an example.

function Thing({a, b = 0}, [c, d, e]) {

}

Our goal would be to add a call at the top of this function that looks like this.

console.log('Thing', a, b, c, d, e);

Before we get into these pattern arguments, we want to have an easy way to clone an Expr but only when it is an Ident.


#![allow(unused)]
fn main() {
fn clone_ident_from_expr<'a>(expr: &Expr<'a>) -> Option<Expr<'a>> {
    if let Expr::Ident(_) = expr {
        Some(expr.clone())
    } else {
        None
    }
}
}

Here we are just using an if let to test for an Ident, cloning the expression if there is a match. Now let's dig into the Pat argument conversion.


#![allow(unused)]
fn main() {
fn extract_idents_from_pat<'a>(pat: &Pat<'a>) -> Vec<Option<Expr<'a>>> {
    match pat {
        Pat::Ident(i) => {
            vec![Some(Expr::Ident(i.clone()))]
        },
        Pat::Obj(obj) => {
            obj.iter().map(|part| {
                match part {
                    ObjPatPart::Rest(pat) => {
                        extract_idents_from_pat(pat)
                    },
                    ObjPatPart::Assign(prop) => {
                        match prop.key {
                            PropKey::Pat(ref pat) => {
                                extract_idents_from_pat(pat)
                            },
                            PropKey::Expr(ref expr) => {
                                vec![clone_ident_from_expr(expr)]
                            },
                            PropKey::Lit(ref lit) => {
                                vec![Some(Expr::Lit(lit.clone()))]
                            }
                        }
                    },
                }
            }).flatten().collect()
        },
        Pat::Array(arr) => {
            arr.iter().map(|p| {
                match p {
                    Some(ArrayPatPart::Expr(expr)) => {
                        vec![clone_ident_from_expr(expr)]
                    },
                    Some(ArrayPatPart::Pat(pat)) => {
                        extract_idents_from_pat(pat)
                    },
                    None => vec![],
                }
            }).flatten().collect()
        },
        Pat::RestElement(pat) => {
            extract_idents_from_pat(pat)
        },
        Pat::Assign(assign) => {
            extract_idents_from_pat(&assign.left)
        },
    }
}
}

Because patterns like the object or array pattern can contain multiple arguments (in our example a and b are in the same pattern), we want to return a Vec of optional identifiers. First, let's cover the simplest pattern, the Ident case. Here we simply create a new Vec with a clone of the inner identifier, wrapped up in an Expr, as its only contents.

Next we get something a little more complicated, the Obj case. Inside of a Pat::Obj is a Vec of an enum called ObjPatPart, which has 2 cases: the normal Assign and the Rest (preceded by ...). The nice thing about the Rest case is that we can simply use recursion to get the idents out of the inner Pat. The Assign case holds a data structure called Prop; in this situation we only really care about the key property, since that is where our identifier would live. A property key can be either a Pat, an Expr or a Lit. In the first case we can use the same recursive call to get the identifiers it contains. For the expression case we use the helper function we just wrote to get the ident out, if it is an ident. Finally, for a literal, we just clone it into a new Expr. Since we need to do this for each of the ObjPatParts in the object pattern, we use the Iterator trait's map to do the first step in the process; this converts each element into a Vec of optional Exprs. To get that back down to a single Vec we can use the flatten method, and finally we collect the iterator back together.

Next we have the Array case, which is going to look very similar. Each element is an Option<ArrayPatPart>, so there are 3 cases to handle: an Expr, which we can pass off to our helper just like before; a Pat, which we use recursion for again; and None, for which we just return an empty Vec. The RestElement case works just like the object pattern version: we just recurse with the inner value. Finally we have the Assign case, where we want to use the same recursion but only on the left property. Whew, that one was a bit of a doozy!
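The recursion plus map, flatten and collect shape can be hard to see through all the resast variants, so here is a reduced sketch. The Pattern enum and the names function are hypothetical (plain strings instead of Exprs, and far fewer cases than the real Pat); they are only meant to show how each element becomes a Vec and how those Vecs are merged back into one.

```rust
/// Simplified, hypothetical stand-in for a few `Pat` shapes.
enum Pattern {
    Ident(String),
    /// `None` is a hole, as in `[a, , b]`.
    Array(Vec<Option<Pattern>>),
    Object(Vec<Pattern>),
    /// A pattern with a default value, as in `b = 0` (the value is omitted here).
    WithDefault(Box<Pattern>),
}

fn names(pat: &Pattern) -> Vec<String> {
    match pat {
        Pattern::Ident(name) => vec![name.clone()],
        Pattern::Array(items) => items
            .iter()
            // every element becomes its own Vec...
            .map(|item| match item {
                Some(inner) => names(inner),
                None => vec![],
            })
            // ...and flatten merges them into one stream of names
            .flatten()
            .collect(),
        // flat_map is shorthand for map followed by flatten
        Pattern::Object(props) => props.iter().flat_map(names).collect(),
        Pattern::WithDefault(left) => names(left),
    }
}
```

Feeding it the two arguments from the Thing example, {a, b = 0} and [c, d, e], yields a, b and c, d, e respectively.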

We are just now starting to dig into the meat of this project; getting through these complicated mappings now is going to greatly simplify things for us later. Since we are going to be primarily working with the FuncArgs in any given Func or ArrowFunc, we should have a function that maps any list of arguments to a new list of identifiers and literals.


#![allow(unused)]
fn main() {
fn extract_idents_from_args<'a>(args: &[FuncArg<'a>]) -> Vec<Expr<'a>> {
    let mut ret = vec![];
    for arg in args {
        match arg {
            FuncArg::Expr(expr) => ret.push(clone_ident_from_expr(expr)),
            FuncArg::Pat(pat) => ret.extend(extract_idents_from_pat(pat)),
        }
    }
    ret.into_iter().filter_map(|e| e).collect()
}
}

In this function we are going to liberally use the last two helpers we put together. A FuncArg can be either a Pat or an Expr; in the former we are dealing with a possible list of many new elements, but for the latter there would be only one. With that in mind we use the Vec method push for one element and extend for possibly many. Once we have gone through each of the arguments provided, we filter out any of the None cases with filter_map, which drops any Nones and unwraps any Somes for us automatically. We can then collect up the result to return.
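The push, extend and filter_map combination can be sketched without any AST types at all. In the block below, Arg and flatten_args are hypothetical names, with string slices standing in for Exprs.

```rust
/// Hypothetical stand-in for `FuncArg`: one optional name or a group of them.
enum Arg<'a> {
    Single(Option<&'a str>),
    Group(Vec<Option<&'a str>>),
}

fn flatten_args<'a>(args: &[Arg<'a>]) -> Vec<&'a str> {
    let mut found: Vec<Option<&'a str>> = Vec::new();
    for arg in args {
        match arg {
            // exactly one candidate: push it
            Arg::Single(name) => found.push(*name),
            // possibly many candidates: extend with all of them
            Arg::Group(names) => found.extend(names.iter().copied()),
        }
    }
    // drop the Nones and unwrap the Somes in one pass
    found.into_iter().filter_map(|name| name).collect()
}
```

Calling flatten() on an iterator of Options behaves the same as filter_map(|name| name) here, so either spelling works.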

Last in our helper functions is going to be a way to go from an AssignLeft into an Expr with a StringLit inside (assign_left_to_string_lit, which appears in the longer listing further down). For this we use the expr_to_string_lit helper in the Expr case. In the Pat case we match on the pattern, returning a call to the ident_to_string_lit helper when it is a Pat::Ident and None otherwise.

Armed with these helpers it is time to write our first mapping function. A pattern that will be true of all of our mapping functions is that they always take a Vec of Exprs as the first argument. This is how we are going to track the prefix of any log we want to write. We are going to start with the Class, which is primarily a collection of Funcs wrapped up in Props, so let's start at the property level.


#![allow(unused)]
fn main() {
fn map_class_prop<'a>(mut args: Vec<Expr<'a>>, mut prop: Prop<'a>) -> Prop<'a> {
    match prop.kind {
        PropKind::Ctor => {
            args.insert(args.len().saturating_sub(1), Expr::Lit(Lit::String(StringLit::single_from("new"))));
        },
        PropKind::Get => {
            args.push(
                Expr::Lit(Lit::String(StringLit::single_from("get")))
            );
        },
        PropKind::Set => {
            args.push(
                Expr::Lit(Lit::String(StringLit::single_from("set")))
            );
        },
        _ => (),
    };
    match &prop.key {
        PropKey::Expr(ref expr) => match expr {
            Expr::Ident(ref i) => {
                if i.name != "constructor" {
                    args.push(ident_to_string_lit(i));
                }
            }
            _ => (),
        },
        PropKey::Lit(ref l) => match l {
            Lit::Boolean(_)
            | Lit::Number(_)
            | Lit::RegEx(_)
            | Lit::String(_) => {
                args.push(Expr::Lit(l.clone()))
            }
            Lit::Null => {
                args.push(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(String::from("null"))))));
            }
            _ => (),
        },
        PropKey::Pat(ref p) => {
            match p {
                Pat::Ident(ref i) => args.push(ident_to_string_lit(i)),
                _ => args.extend(extract_idents_from_pat(p).into_iter().filter_map(|e| e)),
            }
        },
    }
    if let PropValue::Expr(expr) = prop.value {
        prop.value = PropValue::Expr(map_expr(args, expr));
    }
    prop
}
}

To start, we want to look at the kind property; there are 3 kinds that are important for us here. The first is Ctor (short for constructor). If we find one of those we want to put new just before the class name, which should be the last element in the args. To make sure we don't run into trouble when args is empty, we use the saturating_sub method on usize to do the subtraction. Next are the Get and Set accessors; if we find one of those we just append the matching keyword, get or set, to the end of the current args.
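The reason for saturating_sub is easiest to see on an empty Vec, where a plain len() - 1 would underflow. A minimal sketch, with plain Strings standing in for the Expr prefix and a hypothetical function name:

```rust
/// Insert "new" just before the last element, or at the front if the
/// prefix is empty. `insert_new_before_last` is an illustrative name only.
fn insert_new_before_last(prefix: &mut Vec<String>) {
    // 0usize.saturating_sub(1) is 0, so an empty prefix is safe.
    let index = prefix.len().saturating_sub(1);
    prefix.insert(index, "new".to_string());
}
```

With a prefix of ["Thing"] the index is 0 and the result is ["new", "Thing"]; with an empty prefix the index is also 0 and the result is just ["new"].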

Now that we have that, we need to start digging into the ProgramPart to identify anything we want to modify. Since Parser implements Iterator and its Item is Result<ProgramPart, Error> we first need to use filter_map to extract the ProgramPart from the result. It would probably be good to handle the error case here but for the sake of simplicity we are going to skip any errors. Now that we have an Iterator over ProgramParts we can use map to update each part.

fn main() {
    let js = get_js().expect("Unable to get JavaScript");
    let parser = Parser::new(&js).expect("Unable to construct parser");
    // Each top-level part starts out with an empty prefix
    for part in parser.filter_map(|p| p.ok()).map(|part| map_part(vec![], part)) {
        //FIXME: Write updated program part to somewhere
    }
}
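The filter_map(|p| p.ok()) call above skips parse errors silently. As a minimal sketch of that choice and of one alternative, the block below uses integer parsing from the standard library in place of the ressa Parser, so nothing in it is ressa API and both function names are made up for the illustration.

```rust
use std::num::ParseIntError;

/// Keep only the inputs that parse, silently dropping the failures.
fn keep_ok(inputs: &[&str]) -> Vec<i32> {
    inputs
        .iter()
        .map(|s| s.parse::<i32>())
        .filter_map(|r| r.ok())
        .collect()
}

/// Alternative: stop at the first failure and hand the error back.
/// Collecting an iterator of `Result`s into a `Result<Vec<_>, _>` does this.
fn all_or_first_error(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```

The second form is one way to surface the error case mentioned above, should skipping bad input turn out to hide real problems.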

With that in mind, the entry point is going to be a function that takes the current prefix args and a ProgramPart, and returns a new ProgramPart. It might look like this.


#![allow(unused)]
fn main() {
fn map_part<'a>(args: Vec<Expr<'a>>, part: ProgramPart<'a>) -> ProgramPart<'a> {
    match part {
        ProgramPart::Decl(decl) => ProgramPart::Decl(map_decl(args, decl)),
        ProgramPart::Stmt(stmt) => ProgramPart::Stmt(map_stmt(args, stmt)),
        ProgramPart::Dir(_) => part,
    }
}

}

We match on the part provided and either return it unchanged if it is a directive or, if it isn't, investigate further to discover whether it contains a function. We do that in two places, map_decl and map_stmt, both of which use a similar method for digging further into the tree.


#![allow(unused)]
fn main() {
fn map_decl<'a>(mut args: Vec<Expr<'a>>, decl: Decl<'a>) -> Decl<'a> {
    match decl {
        Decl::Func(f) => Decl::Func(map_func(args, f)),
        Decl::Class(class) => Decl::Class(map_class(args, class)),
        Decl::Var(kind, del) => {
            Decl::Var(kind, del.into_iter().map(|part| {
                // Use the variable's name as part of the prefix
                if let Pat::Ident(ref ident) = part.id {
                    args.push(ident_to_string_lit(ident));
                }
                VarDecl {
                    id: part.id,
                    init: part.init.map(|e| map_expr(args.clone(), e))
                }
            }).collect())
        }
        // Every other declaration is passed through untouched
        other => other,
    }
}

fn map_stmt<'a>(args: Vec<Expr<'a>>, stmt: Stmt<'a>) -> Stmt<'a> {
    match stmt {
        Stmt::Expr(expr) => Stmt::Expr(map_expr(args, expr)),
        // Every other statement is passed through untouched
        other => other,
    }
}
}

There are two ways for a Decl to resolve directly into a function or method: the Func and Class variants. A Var declaration can also get there through its initializer, which is why each init is passed through map_expr. A Stmt can end up there if it is an Expr. When we include map_expr we see that there are cases for both functions and classes in the Expr enum, which means that once we get past those we will be handling the rest in exactly the same way.

Finally we are going to start manipulating the AST in map_func.

map_func takes ownership of the func and marks it mut, so we can modify it in place. The first thing we do is check if the id is Some; if it is, we add that name to our console.log arguments. Function arguments can be pretty complicated, so we hand func.params to extract_idents_from_args, which gives us back the identifiers it was able to find, including the ones inside destructured arguments and arguments with default values.

Because functions can appear in the body of other functions, we then map all of the func.body program parts, giving each one a clone of the current prefix. After that we chain the prefix and the local argument identifiers together, pass them to console_log, and insert the result at the first position of func.body. Once that has completed we can return the updated func to the caller. map_arrow_func follows the same idea; when the arrow function's body is a bare expression, it is replaced with a block that logs and then returns that expression.

The next thing we are going to want to deal with is Class, we want to insert console.log into the top of each method on a class. This is a bit unique because we also want to provide the name of that class (if it exists) as the first argument to console.log. That might look like this.


#![allow(unused)]

fn main() {
fn map_func<'a>(mut args: Vec<Expr<'a>>, mut func: Func<'a>) -> Func<'a> {
    if let Some(ref id) = &func.id {
        args.push(ident_to_string_lit(id));
    }
    let local_args = extract_idents_from_args(&func.params);
    func.body = FuncBody(func.body.0.into_iter().map(|p| map_part(args.clone(), p)).collect());
    insert_expr_into_func_body(console_log(args.clone().into_iter().chain(local_args.into_iter()).collect()), &mut func.body);
    func
}

fn map_arrow_func<'a>(mut args: Vec<Expr<'a>>, mut f: ArrowFuncExpr<'a>) -> ArrowFuncExpr<'a> {
    args.extend(extract_idents_from_args(&f.params));
    match &mut f.body {
        ArrowFuncBody::FuncBody(ref mut body) => {
            insert_expr_into_func_body(console_log(args), body)
        },
        ArrowFuncBody::Expr(expr) => {
            f.body = ArrowFuncBody::FuncBody(FuncBody(vec![
                console_log(args),
                ProgramPart::Stmt(
                    Stmt::Return(
                        Some(*expr.clone())
                    )
                )
            ]))
        }
    }
    f
}

fn map_class<'a>(mut args: Vec<Expr<'a>>, mut class: Class<'a>) -> Class<'a> {
    if let Some(ref id) = class.id {
        args.push(ident_to_string_lit(id))
    }
    let mut new_body = vec![];
    for item in class.body.0 {
        new_body.push(map_class_prop(args.clone(), item))
    }
    class.body = ClassBody(new_body);
    class
}

fn assign_left_to_string_lit<'a>(left: &AssignLeft<'a>) -> Option<Expr<'a>> {
    match left {
        AssignLeft::Expr(expr) => expr_to_string_lit(expr),
        AssignLeft::Pat(pat) => {
            match pat {
                Pat::Ident(ident) => Some(ident_to_string_lit(ident)),
                _ => None,
            }
        }
    }
}


}

The class handling is split across two functions, map_class and the map_class_prop function we saw earlier. map_class pulls the id out of the provided class, if it has one, and adds it to the prefix. We then just pass that off to map_class_prop for each item in the class body, which handles all of the different types of properties a class can have. The first thing map_class_prop does is map the prefix into the right format, so a call to new Thing() would print new Thing, and a get method would print Thing get before the method name. Next we take a look at prop.key, which provides us with the name of our function. According to the specification a class property key can be an identifier, a literal value, or a pattern, so we need to figure out the name of this method by digging into that value. First, in the case that it is an ident, we add it to the args, unless it is the value constructor, because we already put the new keyword in for that one. Next we can pull out the literal values and add those as they appear. Lastly, for the pattern case, a Pat::Ident is converted to a string literal just like an identifier; for any other pattern we fall back to extract_idents_from_pat and add whatever identifiers it finds. To get the parameter names from the method definition we need to look at prop.value, which for a method should be a PropValue::Expr holding the function expression. When it is, we hand it to map_expr along with our prefix, and from there the function goes through the same process of extracting the argument identifiers, passing them along to console_log and inserting the result at the top of the function body.
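To see how the prefix accumulates as it is handed down the tree, here is a toy model of the traversal. The Function struct and log_lines are hypothetical and work on plain strings rather than resast nodes; the sketch only mirrors the push, clone and chain steps from map_func.

```rust
/// Hypothetical, heavily simplified function node (not the resast `Func`).
struct Function {
    name: Option<String>,
    params: Vec<String>,
    children: Vec<Function>,
}

/// Collect the `console.log` line each function would receive.
fn log_lines(mut prefix: Vec<String>, func: &Function, out: &mut Vec<String>) {
    // A named function adds its name, as a string literal, to the prefix.
    if let Some(name) = &func.name {
        prefix.push(format!("'{}'", name));
    }
    // The log arguments are the prefix followed by the parameter names.
    let args: Vec<String> = prefix
        .iter()
        .cloned()
        .chain(func.params.iter().cloned())
        .collect();
    out.push(format!("console.log({});", args.join(", ")));
    // Each nested function gets its own clone of the prefix, so siblings
    // never see each other's names.
    for child in &func.children {
        log_lines(prefix.clone(), child, out);
    }
}
```

For a function outer(a) that contains inner(b), this produces console.log('outer', a); followed by console.log('outer', 'inner', b);.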

At this point we have successfully updated our AST to include a call to console.log at the top of each function and method in our code. Now the big question is how do we write that out to a file. This problem is not a small one; in the next section we are going to cover a third crate, resw, that we can use to finish this project. $web-only-end$