Building a Debug Helper
$slides-only$
Demo
$slides-only-end$ $web-only$ To simplify things, we are just going to lift the technique for getting the JavaScript text from the ress example, so we won't be covering that again.
With that out of the way let's take a look at the Cargo.toml
and use
statements for our program.
[package]
name = "console_logify"
version = "0.1.0"
authors = ["Robert Masen <r@robertmasen.pizza>"]
edition = "2018"
[dependencies]
ressa = "0.7.0-beta-7"
resw = "0.4.0-beta-1"
resast = "0.4"
#![allow(unused)] fn main() { use ressa::Parser; use resw::Writer; use resast::prelude::*; }
This will make sure that all of the items we will need from ressa
and resast
are in scope. Now we can start defining our method for inserting the debug logging into any functions that we find. To start we are going to create a function that will generate a new ProgramPart::Stmt
that will represent our call to console.log
which might look like this.
#![allow(unused)] fn main() { pub fn console_log<'a>(args: Vec<Expr<'a>>) -> ProgramPart<'a> { ProgramPart::Stmt(Stmt::Expr(Expr::Call( CallExpr { callee: Box::new(Expr::Member( MemberExpr { computed: false, object: Box::new(Expr::ident_from("console")), property: Box::new(Expr::ident_from("log")), } )), arguments: args, } ))) } }
This signature might look a little intimidating with all the lifetime annotations, the reason they need to be there is that at the heart of every resast
node is a Cow
(Clone On Write) slice of the originally javascript string. By putting it in a Cow
that makes it possible to more easily manipulate the tree without having to pay the cost of allocating a new string for every node at parse time. The lifetime annotations just tell the compiler that our argument and our return value will live the same lifetime, since our arguments are going to be embedded in our return value. We will end up using this pattern quite often in this example, now let's go over what is actually happening here. We will take in the args
to supplu the arguments passed into console.log
as our only argument.
Now we are going to build the tree that represents the javascript, which will look like this:
ProgramPart
Stmt
Expr
CallExpr
callee
Expr
MemberExpr
computed
:false
object
Expr
Ident
name
:"console"
property
Expr
Ident
name
:"log"
arguments
Vec<Expr>
It might be easier to start from the inner most structure, the MemberExpr
, this represents the console.log
portion of the desired output. First, we want to set the computed
property to false, this means we are using a .
instead of []
, next we need to define the object
which will be the identifier console
and the property
which will be the identifer log
. We nest this inside of a CallExpr
as the callee
, this represents everything up to the opening parenthesis. The second property arguments
will, as the name suggests, represent the the arguments, we'll simply assign that to the args
provided by the caller. Moving up the tree we wrap the CallExpr
in a Expr
, and a Stmt
and a ProgramPart
.
Next, let's work on a few more helper functions, first up is one that will insert a ProgramPart
to the top of a FuncBody
.
#![allow(unused)] fn main() { fn insert_expr_into_func_body<'a>(expr: ProgramPart<'a>, body: &mut FuncBody<'a>) { body.0.insert(0, expr); } }
This one is pretty straight forward, we take the part and a mutable reference to the body we are modifying. A FuncBody
is a tuple struct that wraps a Vec<ProgrgramPart>
, this means we can use the insert
method on Vec
to add the new item to the first position.
Another useful utility would be a way to convert an Ident
into a StringLit
, it is something that we will be doing quite often.
#![allow(unused)] fn main() { fn ident_to_string_lit<'a>(i: &Ident<'a>) -> Expr<'a> { Expr::Lit(Lit::String(StringLit::Single(i.name.clone()))) } }
This one is also pretty straight forwrard, we take a reference to an Ident
and clone the name
property into a StringLit::Single
, we want to wrap that up into an Expr
, to do that we need to wrap it in a Lit::String
first.
To continue that theme, let's build another function that takes in an expression and returns that expression's representation as a StringLit
. To start, let's build a function that converts an Expr
into a rust String
. The problem is that not all Expr
s can be easily converted into a rust String
. This will be a good opportunity to use the Option
type to filter out any of the expressions we might not want to pass into console.log
.
#![allow(unused)] fn main() { fn expr_to_string(expr: &Expr) -> Option<String> { match expr { Expr::Ident(ref ident) => Some(ident.name.to_string()), Expr::This => Some("this".to_string()), Expr::Member(ref mem) => { let prefix = expr_to_string(&mem.object)?; let suffix = expr_to_string(&mem.property)?; Some(if mem.computed { format!("{}[{}]", prefix, suffix) } else { format!("{}.{}", prefix, suffix) }) }, Expr::Lit(lit) => { match lit { Lit::String(s) => Some(s.clone_inner().to_string()), Lit::Number(n) => Some(n.to_string()), Lit::Boolean(b) => Some(b.to_string()), Lit::RegEx(r) => Some(format!("/{}/{}", r.pattern, r.flags)), Lit::Null => Some("null".to_string()), _ => None, } }, _ => None, } } }
This function is just a match expressions, the first case is the Ident
that we simply make a copy of the the name
property by calling to_string
. Next is the This
case, which we jsut create a new string and return that. for a member expression, we ant to return the object
property converted to a string and the property
property converted to a string seperated by a .
, if either of these two can't be converted to a string, we just return None
. The last case that we want to attempt to convert is the literal case, for that we simply extract the inner string in most cases. For the regex case, we reconstruct that by putting the pattern
between two slashes and flags
at the end. For the null
case we just return that as a new string. The last case we might handle is Template
which would be a little more complicated to re-construct for this example so we will just return None
in that case. For any other expressions we want to return None
as it would be far more complicated and pretty uncommon to come up in our use case.
Now, we want to wrap the result of this new function into an Expr
just like we did for our identifier.
#![allow(unused)] fn main() { fn expr_to_string_lit<'a>(e: &Expr<'a>) -> Option<Expr<'a>> { let inner = expr_to_string(e)?; Some(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(inner))))) } }
Because modern javascript allows for patterns as function arguments, we are going to need a couple of helper's to handle these possiblities. Let's take this js as an example.
function Thing({a, b = 0}, [c, d, e]) {
}
Our goal would be to add a call to this function that looks like this.
console.log('Thing', a, b, c, d, e);
Before we get into these pattern arguments, we want to have an easy way to clone an Expr
but only when it is an Ident
.
#![allow(unused)] fn main() { fn clone_ident_from_expr<'a>(expr: &Expr<'a>) -> Option<Expr<'a>> { if let Expr::Ident(_) = expr { Some(expr.clone()) } else { None } } }
Here we are just using an if let
to test for the an Ident
and cloning if there is a match. Now let's dig into the Pat
argument conversion.
#![allow(unused)] fn main() { fn extract_idents_from_pat<'a>(pat: &Pat<'a>) -> Vec<Option<Expr<'a>>> { match pat { Pat::Ident(i) => { vec![Some(Expr::Ident(i.clone()))] }, Pat::Obj(obj) => { obj.iter().map(|part| { match part { ObjPatPart::Rest(pat) => { extract_idents_from_pat(pat) }, ObjPatPart::Assign(prop) => { match prop.key { PropKey::Pat(ref pat) => { extract_idents_from_pat(pat) }, PropKey::Expr(ref expr) => { vec![clone_ident_from_expr(expr)] }, PropKey::Lit(ref lit) => { vec![Some(Expr::Lit(lit.clone()))] } } }, } }).flatten().collect() }, Pat::Array(arr) => { arr.iter().map(|p| { match p { Some(ArrayPatPart::Expr(expr)) => { vec![clone_ident_from_expr(expr)] }, Some(ArrayPatPart::Pat(pat)) => { extract_idents_from_pat(pat) }, None => vec![], } }).flatten().collect() }, Pat::RestElement(pat) => { extract_idents_from_pat(pat) }, Pat::Assign(assign) => { extract_idents_from_pat(&assign.left) }, } } }
Because pattern's like the object or array pattern can contain multiple arguments, in our example a and b would be in the same pattern, we want to return a Vec
of the optional identifiers. First, let's cover the simplest pattern the Ident
case. In this case we simply want to create a new Vec
with a clone of the inner wrapped up in an Expr
as its only contents. Next we get something a little more complicated the Obj
case. Inside of a Pat::Obj
is a Vec
of an enum called ObjPatPart
which has 2 cases the normal Assign
and the Rest
(preceded by ...
). The nice thing about the Rest
case is that we can simply use recursion to get the ident's out of the inner Pat
. The Assign
case has a data scructure called Prop
, in this situation we only really care about the key
property, since that is where our identifier would live. A propety key can be either a Pat
, Expr
or Lit
, in the first case we can use the same recursive call to get the identifiers it contains. For the expression case we are going to use that helper function we just wrote to get the ident out if it is an ident, finally we are going to just clone the liter into a new Expr
. Since we need to do this for each of the ObjPatPart
s in the object pattern we are going to use the Iterator
trait's map
to do the first step in the process, this will convert each element into a Vec
of optional Expr
s, to get that back down to a single Vec
we can use the flatten
method. Finally we will collect
the iterator back together. Next we have the Array
, this is going to look very similar. First we are going to map the inner ArrayPatPart
s into our identifiers, this enum has 3 cases the Expr
which we can pass off to our helper just like before, the Pat
which we will use recursion for again and finally a None
case which we can just return an empty Vec
. The RestElement
works just like the object pattern version, we just recurse with the inner value. Finally we have the Assign
case, this one we want to use the same recursion method but only on the left
property. Whew, that one was a bit of a doozy!
We are just now starting to dig into the meat of this project, getting through this complicated mappings now is going to greatly simplify things for us later. Since we arre going to be primarily working with the FuncArg
s in any given Func
or ArrowFunc
, we should have a function that maps any list of arguments to a new list of identifiers and literals.
#![allow(unused)] fn main() { fn extract_idents_from_args<'a>(args: &[FuncArg<'a>]) -> Vec<Expr<'a>> { let mut ret = vec![]; for arg in args { match arg { FuncArg::Expr(expr) => ret.push(clone_ident_from_expr(expr)), FuncArg::Pat(pat) => ret.extend(extract_idents_from_pat(pat)), } } ret.into_iter().filter_map(|e| e).collect() } }
In this function we are going to liberally use the last to helpers we put together. a FuncArg
can be either a Pat
or and Expr
, in the former we are dealing with a possible list of many new elements but for the latter there would be only one. With that in mind we are going to use the Vec
method push
for one element and extend
for possibly many. Once we have gone through each of the arguments provided we want to filter out any of the None
cases by using the filter_map
which will filter out any None
s and unwrap and Some
s for us automatically. We can then collect up the result to return.
Last in our helper functions is going to be a way to go from an AssignLeft
into an Expr
with a StringLit
inside. For this we are going to use the expr_to_string_lit
helper in the Expr
case and we are going to match on the Pat
case, returning a call to the ident_to_string_lit
helper.
Armed with these helpers it is time to write our first mapping function. A pattern that will be true of all of our mapping functions is that they will always take a Vec
of Expr
s as the first argument. This how we are going to track the prefix of any log we want to write. We are going to start with the Class
, which is primarily a collection of Func
s wrapped up in Prop
s so let's start at the property level.
#![allow(unused)] fn main() { fn map_class_prop<'a>(mut args: Vec<Expr<'a>>, mut prop: Prop<'a>) -> Prop<'a> { match prop.kind { PropKind::Ctor => { args.insert(args.len().saturating_sub(1), Expr::Lit(Lit::String(StringLit::single_from("new")))); }, PropKind::Get => { args.push( Expr::Lit(Lit::String(StringLit::single_from("get"))) ); }, PropKind::Set => { args.push( Expr::Lit(Lit::String(StringLit::single_from("set"))) ); }, _ => (), }; match &prop.key { PropKey::Expr(ref expr) => match expr { Expr::Ident(ref i) => { if i.name != "constructor" { args.push(ident_to_string_lit(i)); } } _ => (), }, PropKey::Lit(ref l) => match l { Lit::Boolean(_) | Lit::Number(_) | Lit::RegEx(_) | Lit::String(_) => { args.push(Expr::Lit(l.clone())) } Lit::Null => { args.push(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(String::from("null")))))); } _ => (), }, PropKey::Pat(ref p) => { match p { Pat::Ident(ref i) => args.push(ident_to_string_lit(i)), _ => args.extend(extract_idents_from_pat(p).into_iter().filter_map(|e| e)), } }, } if let PropValue::Expr(expr) = prop.value { prop.value = PropValue::Expr(map_expr(args, expr)); } prop } }
To start, we want to look at the kind
property, there are 3 kinds that are important for us here. The first is Ctor
(short for constructor), if we find one of those we want to put the new
just before the class name, which should be the last element in the args. To make sure we don't run into any big problems later we should use the saturation_sub
method on usize
to do the subtraction. Next are the Get
and Set
accessors, if we find one of those we just want to append this keyword to the end of the current args.
Now that we have that, we need to start digging into the ProgramPart
to identify anything we want to modify. Since Parser
implements Iterator
and its Item
is Result<ProgramPart, Error>
we first need to use filter_map
to extract the ProgramPart
from the result. It would probably be good to handle the error case here but for the sake of simplicity we are going to skip any errors. Now that we have an Iterator
over ProgramPart
s we can use map
to update each part.
fn main() { let js = get_js().expect("Unable to get JavaScript"); let parser = Parser::new(&js).expect("Unable to construct parser"); for part in parser.filter_map(|p| p.ok()).map(map_part) { //FIXME: Write updated program part to somewhere } }
With that in mind the entry point is going to be a function that takes a ProgramPart
and returns a new ProgramPart
. It might look like this
#![allow(unused)] fn main() { fn map_part<'a>(args: Vec<Expr<'a>>, part: ProgramPart<'a>) -> ProgramPart<'a> { match part { ProgramPart::Decl(decl) => ProgramPart::Decl(map_decl(args, decl)), ProgramPart::Stmt(stmt) => ProgramPart::Stmt(map_stmt(args, stmt)), ProgramPart::Dir(_) => part, } } }
We are going to match on the part provided and either return that part if it is a Directive
or if it isn't we need to investigate further to discover if it is a function or not. We do that in two places map_decl
and map_stmt
both of which are going to utilize similar method for digging further into the tree.
#![allow(unused)] fn main() { fn map_decl<'a>(mut args: Vec<Expr<'a>>, decl: Decl<'a>) -> Decl<'a> { match decl { Decl::Func(f) => Decl::Func(map_func(args, f)), Decl::Class(class) => Decl::Class(map_class(args, class)), Decl::Var(kind, del) => { Decl::Var(kind, del.into_iter().map(|part| { if let Pat::Ident(ref ident) = part.id { args.push(ident_to_string_lit(ident)); } VarDecl { id: part.id, init: part.init.map(|e| map_expr(args.clone(), e)) } }).collect()) } }
There are two ways for a Decl
to resolve into a function or method and that is with the Function
and Class
variants while a Stmt
can end up there if it is an Expr
. When we include map_expr
we see that there are cases for both Function
and Class
in the Expr
enum. That means once we get past those we will be handling the rest in the exact same way.
#![allow(unused)] fn main() { _ => decl.clone(), } } fn map_stmt<'a>(args: Vec<Expr<'a>>, stmt: Stmt<'a>) -> Stmt<'a> { match stmt { Stmt::Expr(expr) => Stmt::Expr(map_expr(args, expr)), _ => stmt.clone(), }
Finally we are going to start manipulating the AST in map_func
.
The first thing we are going to do is to clone the func
to give us a mutable version. Next we are going to check if the id
is Some
, if it is we can add that name to our console.log
arguments. Now function arguments can be pretty complicated, to try and keep things simple we are going to only worry about the ones that are either Expr::Ident
or Pat::Identifier
. To build something more robust it might be good to include destructured arguments or arguments with default values but for this example we are just going to keep it simple.
First we are going to filter_map
the func.params
to only get the items that ultimately resolve to Identifer
s, at that point we can wrap all of these identifiers in an Expr::Ident
and add them to the console.log
args. Now we can simply insert the result of passing those args to console_log
at the first position of the func.body
. Because functions can appear in the body of other functions we also want to map all of the func.body
program parts. Once that has completed we can return the updated func
to the caller.
The next thing we are going to want to deal with is Class
, we want to insert console.log into the top of each method on a class. This is a bit unique because we also want to provide the name of that class (if it exists) as the first argument to console.log. That might look like this.
#![allow(unused)] fn main() { fn map_func<'a>(mut args: Vec<Expr<'a>>, mut func: Func<'a>) -> Func<'a> { if let Some(ref id) = &func.id { args.push(ident_to_string_lit(id)); } let local_args = extract_idents_from_args(&func.params); func.body = FuncBody(func.body.0.into_iter().map(|p| map_part(args.clone(), p)).collect()); insert_expr_into_func_body(console_log(args.clone().into_iter().chain(local_args.into_iter()).collect()), &mut func.body); func } fn map_arrow_func<'a>(mut args: Vec<Expr<'a>>, mut f: ArrowFuncExpr<'a>) -> ArrowFuncExpr<'a> { args.extend(extract_idents_from_args(&f.params)); match &mut f.body { ArrowFuncBody::FuncBody(ref mut body) => { insert_expr_into_func_body(console_log(args), body) }, ArrowFuncBody::Expr(expr) => { f.body = ArrowFuncBody::FuncBody(FuncBody(vec![ console_log(args), ProgramPart::Stmt( Stmt::Return( Some(*expr.clone()) ) ) ])) } } f } fn map_class<'a>(mut args: Vec<Expr<'a>>, mut class: Class<'a>) -> Class<'a> { if let Some(ref id) = class.id { args.push(ident_to_string_lit(id)) } let mut new_body = vec![]; for item in class.body.0 { new_body.push(map_class_prop(args.clone(), item)) } class.body = ClassBody(new_body); class } fn map_class_prop<'a>(mut args: Vec<Expr<'a>>, mut prop: Prop<'a>) -> Prop<'a> { match prop.kind { PropKind::Ctor => { args.insert(args.len().saturating_sub(1), Expr::Lit(Lit::String(StringLit::single_from("new")))); }, PropKind::Get => { args.push( Expr::Lit(Lit::String(StringLit::single_from("get"))) ); }, PropKind::Set => { args.push( Expr::Lit(Lit::String(StringLit::single_from("set"))) ); }, _ => (), }; match &prop.key { PropKey::Expr(ref expr) => match expr { Expr::Ident(ref i) => { if i.name != "constructor" { args.push(ident_to_string_lit(i)); } } _ => (), }, PropKey::Lit(ref l) => match l { Lit::Boolean(_) | Lit::Number(_) | Lit::RegEx(_) | Lit::String(_) => { args.push(Expr::Lit(l.clone())) } Lit::Null => { args.push(Expr::Lit(Lit::String(StringLit::Single(::std::borrow::Cow::Owned(String::from("null")))))); } _ => (), }, PropKey::Pat(ref p) => { match p { Pat::Ident(ref i) => args.push(ident_to_string_lit(i)), _ => args.extend(extract_idents_from_pat(p).into_iter().filter_map(|e| e)), } }, } if let PropValue::Expr(expr) = prop.value { prop.value = PropValue::Expr(map_expr(args, expr)); } prop } fn assign_left_to_string_lit<'a>(left: &AssignLeft<'a>) -> Option<Expr<'a>> { match left { AssignLeft::Expr(expr) => expr_to_string_lit(expr), AssignLeft::Pat(pat) => { match pat { Pat::Ident(ident) => Some(ident_to_string_lit(ident)), _ => None, } } } } }
Here we have two functions, the first pulls out the id from the provided class or uses an empty string of it doesn't exist. We then just pass that off to map_class_prop
which will handle all of the different types of properties a class can have. The first thing this does is map the prefix
into the right format, so a call to new Thing()
would print new Thing
, or a get method would print Thing get
before the method name. Next we take a look at the property.key
, this will provide us with the name of our function, but according to the specification a class property key can be an identifier, a literal value, or a pattern, so we need to figure out what the name of this method is by digging into that value. First in the case that it is an ident we want to add it to the args, unless it is the value constructor
because we already put the new
keyword in that one. Next we can pull out the literal values and add those as they appear. Lastly we will only handle the pattern case when it is a Pat::Identifier
otherwise we will just skip it. Now to get the parameter names from the method definition we need to look at the property.value
which should always be an Expr::Function
. Once we match on that we simply repeat the process of map_function
pulling the args out but only when they are Ident
s and then passing that along to console_log
and inserting that Expr
at the top of the function body.
At this point we have successfully updated our AST to include a call to console.log
at the top of each function and method in our code. Now the big question is how do we write that out to a file. This problem is not a small one, in the next section we are going to cover a third crate resw
that we can use to finish this project.
$web-only-end$