Building an IE8 Linter

To get started we need to add ress to our dependencies. This project also needs serde, serde_derive, and toml, because the list of unavailable tokens is configurable through a .toml file, and atty, which we will use to check whether stdin is attached to a terminal.

[package]
name = "lint-ie8"
version = "0.1.0"
authors = ["Robert Masen <r@robertmasen.pizza>"]
edition = "2018"

[dependencies]
ress = "0.6"
serde = "1"
serde_derive = "1"
toml = "0.5"
atty = "0.2"

Next we want to use the Scanner and Token from ress.


#[macro_use]
extern crate serde_derive;
use ress::{
    Scanner,
    Token,
};

Since we are using a .toml file to provide the list of banned tokens, let's create a struct that will represent our configuration.


#[derive(Deserialize)]
struct BannedTokens {
    idents: Vec<String>,
    keywords: Vec<String>,
    puncts: Vec<String>,
    strings: Vec<String>,
}

The toml file we are going to use is pretty big, but if you want to see what it looks like you can find it in the full project. Essentially it is a list of identifiers, strings, punctuation, and keywords that would cause an error when trying to run in IE8.
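To make the shape concrete, here is a minimal sketch of a hand-built value like the one a very small config would deserialize into. The entries are illustrative picks (ES2015 features that IE8 lacks), not the contents of the real file, and `sample_banned`, `owned`, and `is_banned` are hypothetical helpers. The serde derive is left off so the sketch compiles with the standard library alone.

```rust
// Same fields as the struct above, without the serde derive.
struct BannedTokens {
    idents: Vec<String>,
    keywords: Vec<String>,
    puncts: Vec<String>,
    strings: Vec<String>,
}

// Turns a list of string literals into owned Strings.
fn owned(items: &[&str]) -> Vec<String> {
    items.iter().map(|s| s.to_string()).collect()
}

// Hypothetical sample; the real banned_tokens.toml is much longer.
fn sample_banned() -> BannedTokens {
    BannedTokens {
        idents: owned(&["Promise", "Symbol"]),
        keywords: owned(&["let", "const", "class"]),
        puncts: owned(&["=>"]),
        strings: owned(&[]),
    }
}

// Checks one list for a piece of token text without allocating a String.
fn is_banned(list: &[String], text: &str) -> bool {
    list.iter().any(|entry| entry == text)
}
```

Comparing with `iter().any` works on a borrowed `&str`, whereas `Vec<String>::contains` needs a `&String` and so forces a `to_string` call first. Either approach is fine for a list this small.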

To start we need to deserialize that file. We can do that with the std::fs::read_to_string and toml::from_str functions.


    let config_text = ::std::fs::read_to_string("banned_tokens.toml").expect("failed to read config");
    // from_str lives in the toml crate, so call it by its full path
    let banned: BannedTokens = toml::from_str(&config_text).expect("Failed to deserialize banned tokens");

Now that we have a list of tokens that should not be included in our JavaScript, let's get that text. It would be useful to be able to take a path argument or read the raw js from stdin. The function checks for an argument first and falls back to reading from stdin; it looks something like this.


use std::{
    env::args,
    fs::read_to_string,
    io::Read, // brings read_to_string into scope for stdin
};

fn get_js() -> Result<String, ::std::io::Error> {
    let mut cmd_args = args();
    let _ = cmd_args.next(); //discard bin name
    let js = if let Some(file_name) = cmd_args.next() {
        read_to_string(file_name)?
    } else {
        let mut std_in = ::std::io::stdin();
        let mut ret = String::new();
        // stdin is a terminal, so nothing was piped in; return the empty string
        if atty::is(atty::Stream::Stdin) {
            return Ok(ret)
        }
        std_in.read_to_string(&mut ret)?;
        ret
    };
    Ok(js)
}
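Because get_js reaches for the process arguments and stdin directly, it is awkward to exercise in a test. One possible variant, sketched here as an assumption rather than the project's actual code, takes the arguments and the fallback reader as parameters. `get_js_from` is a hypothetical name, and the atty check is left to the caller.

```rust
use std::fs;
use std::io::{self, Read};

// Hypothetical variant of get_js: the caller supplies the arguments and the
// fallback reader, so nothing here touches the real process state.
fn get_js_from<A, R>(mut cmd_args: A, mut fallback: R) -> io::Result<String>
where
    A: Iterator<Item = String>,
    R: Read,
{
    let _ = cmd_args.next(); // discard bin name
    if let Some(file_name) = cmd_args.next() {
        // a path argument wins over whatever the fallback reader holds
        return fs::read_to_string(file_name);
    }
    let mut ret = String::new();
    fallback.read_to_string(&mut ret)?;
    Ok(ret)
}
```

In main this would be called as `get_js_from(std::env::args(), std::io::stdin())`, while a test can pass a `Vec<String>` iterator and a byte slice instead.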

We will call it like this.


    let js = match get_js() {
        // an empty string means no file argument and nothing piped in
        Ok(js) => if js.is_empty() {
            print_usage();
            std::process::exit(1);
        } else {
            js
        },
        Err(_) => {
            print_usage();
            std::process::exit(1);
        }
    };

We want to handle the failure when attempting to get the js, so we match on the call to get_js. If everything went well we still need to check whether the text is an empty string; that means no argument was provided and the program was not piped any text. In either of these failure cases we want to print a nice message about how the command should have been written and then exit with a non-zero status code. print_usage is a pretty simple function that just prints to stdout the two ways to use the program.


fn print_usage() {
    println!("banned_tokens <infile>
cat <path/to/file> | banned_tokens");
}

With that out of the way, we can now get into how we are going to solve the actual problem of finding these tokens in a JavaScript file. There are many ways to make this work, but for this example we are going to wrap the Scanner in another struct that implements Iterator. First, here is what that struct is going to look like.


struct BannedFinder {
    scanner: Scanner,
    banned: BannedTokens,
}

Before we get into the impl Iterator we should go over an Error implementation that we are going to use. It is relatively straightforward: the actual struct is going to be a tuple struct with three items. The first item is a message that includes the token and its type; the second and third are the row and column of the banned token. We need to implement Display (Error requires it), which will just create a nice error message for us.


#[derive(Debug)]
pub struct BannedError(String, usize, usize);

impl ::std::error::Error for BannedError {

}

impl ::std::fmt::Display for BannedError {
    fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        write!(f, "Banned {} found at {}:{}", self.0, self.1, self.2)
    }
}
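To see what a caller ends up with, here is a small self-contained sketch that repeats the type and formats one error. `describe` is a hypothetical helper added only for illustration.

```rust
use std::fmt;

#[derive(Debug)]
pub struct BannedError(String, usize, usize);

impl std::error::Error for BannedError {}

impl fmt::Display for BannedError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "Banned {} found at {}:{}", self.0, self.1, self.2)
    }
}

// Hypothetical helper: builds an error the way the finder will and returns
// the text that Display produces for it.
fn describe(kind: &str, text: &str, row: usize, column: usize) -> String {
    BannedError(format!("{} {}", kind, text), row, column).to_string()
}
```

Calling `describe("keyword", "let", 3, 7)` yields `Banned keyword let found at 3:7`. Because the type implements Error, it can also be boxed as a `Box<dyn std::error::Error>` if main ever wants to return it.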

The last thing we need to do is create a way to map from a byte index to a row/column pair. Thankfully the Scanner exposes the original text as a property, stream, so we can use that to figure out which line and column any index refers to. The first thing we need is the ability to tell when any given character is a new line character. JavaScript allows for 5 new line sequences (\r, \n, \r\n, \u{2028}, and \u{2029}), so a function that tests for them might look like this.


fn is_js_new_line(c: char) -> bool {
    c == '\n'
    || c == '\u{2028}'
    || c == '\u{2029}'
}

Notice that we aren't testing for \r. This could come back to bite us, but for this example the \n is enough to catch \r\n, and for simplicity's sake we can just say that your team does not support the bare \r new line. Now we can add a method to BannedFinder that will take an index and return the row/column pair.


impl BannedFinder {
    fn get_position(&self, idx: usize) -> (usize, usize) {
        let (row, line_start) = self.scanner.stream[..idx]
            .char_indices()
            .fold((1, 0), |(row, line_start), (i, c)| if is_js_new_line(c) {
                (row + 1, i)
            } else {
                (row, line_start)
            });
        let col = if line_start == 0 {
            idx
        } else {
            idx.saturating_sub(line_start)
        };
        (row, col)
    }
}

We need to capture two pieces of information: the first is what row we are on, the second is the index at which that row started. We can get both by using the char_indices method on &str, which gives us an Iterator over tuples of the indices and chars in the string. We then fold that iterator into a single value; the row starts at 1 and the index starts at 0. If the current character is a new line we add one to the row and replace any previous index value, otherwise we move on. We only count the new lines from the start up to the provided index, which makes sure we don't count any extra new lines. Now that we have the row number we need to calculate the column. If line_start is 0, we didn't find any new lines, so we can assume we are on the first line and the index is already the column; otherwise we subtract line_start from the index.
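One wrinkle worth knowing about: as written, line_start is the index of the new line character itself, so columns count from 0 on the first row but from 1 on every later row. If consistent columns matter, one possible variant (a sketch, not the project's code) records the byte just after the line break and reports 1-based columns everywhere. It also treats a bare \r as a line break and counts \r\n once. `row_col` is a hypothetical free function that takes the text directly instead of reading it from the Scanner.

```rust
// Hypothetical alternative to get_position. Rows and columns are both
// 1-based, and the column is measured in bytes from the start of the row.
// Like slicing in get_position, this panics if idx is past the end of the
// text or not on a char boundary.
fn row_col(text: &str, idx: usize) -> (usize, usize) {
    let mut row = 1;
    let mut line_start = 0; // byte index of the first character on this row
    let mut prev = '\0';
    for (i, c) in text[..idx].char_indices() {
        let next_start = i + c.len_utf8();
        match c {
            // the \n of a \r\n pair belongs to the break we already counted
            '\n' if prev == '\r' => line_start = next_start,
            '\r' | '\n' | '\u{2028}' | '\u{2029}' => {
                row += 1;
                line_start = next_start;
            }
            _ => {}
        }
        prev = c;
    }
    (row, idx - line_start + 1)
}
```

For the text `"ab\ncd"`, index 0 maps to `(1, 1)` and index 3 (the `c`) maps to `(2, 1)`, so the first character of every row lands on column 1.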

Ok, now for the exciting part; we are going to impl Iterator for BannedFinder which will look like this.


impl Iterator for BannedFinder {
    type Item = Result<(), BannedError>;
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(item) = self.scanner.next() {
            Some(match &item.token {
                Token::Ident(ref id) => {
                    let id = id.to_string();
                    if self.banned.idents.contains(&id) {
                        let (row, column) = self.get_position(item.span.start);
                        Err(BannedError(format!("identifier {}", id), row, column))
                    } else {
                        Ok(())
                    }
                },
                Token::Keyword(ref key) => {
                    if self.banned.keywords.contains(&key.to_string()) {
                        let (row, column) = self.get_position(item.span.start);
                        Err(BannedError(format!("keyword {}", key.to_string()), row, column))
                    } else {
                        Ok(())
                    }
                },
                Token::Punct(ref punct) => {
                    if self.banned.puncts.contains(&punct.to_string()) {
                        let (row, column) = self.get_position(item.span.start);
                        Err(BannedError(format!("punct {}", punct.to_string()), row, column))
                    } else {
                        Ok(())
                    }
                },
                Token::String(ref lit) => {
                    if self.banned.strings.contains(&lit.no_quote()) {
                        let (row, column) = self.get_position(item.span.start);
                        Err(BannedError(format!("string {}", lit.to_string()), row, column))
                    } else {
                        Ok(())
                    }
                },
                _ => Ok(()),
            })
        } else {
            None
        }
    }
}

First we need to define what the Item for our Iterator is. It is going to be a Result<(), BannedError>, which allows the caller to check whether an item passed inspection. Now we can add the fn next(&mut self) -> Option<Self::Item> definition. Inside that we first want to make sure that the Scanner isn't returning None; if it is, we can just return None. If the scanner returns an Item, we want to check what kind of token it is, which we can do by matching on &item.token. We only care if the token is a Keyword, Ident, Punct or String; otherwise we can say that the token passed. For each of these tokens we check whether the actual text is included in the matching Vec<String> property of self.banned. If it is, we return a BannedError where the first property is a message containing the name of the token type and the text that token represents.
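The wrapping pattern itself does not depend on ress. As a minimal sketch, here a whitespace splitter stands in for the Scanner and a plain String stands in for BannedError. `WordFinder` and `find_banned` are made-up names for illustration only, and splitting on whitespace is of course not real tokenizing.

```rust
use std::str::SplitWhitespace;

// Stand-in for BannedFinder: `words` plays the role of the ress Scanner.
struct WordFinder<'a> {
    words: SplitWhitespace<'a>,
    banned: Vec<String>,
}

impl<'a> Iterator for WordFinder<'a> {
    // one verdict per word, mirroring Result<(), BannedError>
    type Item = Result<(), String>;
    fn next(&mut self) -> Option<Self::Item> {
        // `?` returns None as soon as the inner iterator is exhausted
        let word = self.words.next()?;
        Some(if self.banned.iter().any(|b| b == word) {
            Err(format!("Banned word {}", word))
        } else {
            Ok(())
        })
    }
}

// Hypothetical helper that runs the finder and keeps only the failures.
fn find_banned(text: &str, banned: &[&str]) -> Vec<String> {
    let finder = WordFinder {
        words: text.split_whitespace(),
        banned: banned.iter().map(|b| b.to_string()).collect(),
    };
    finder.filter_map(Result::err).collect()
}
```

Yielding a Result for every token, rather than only the failures, leaves the decision to the caller: it can print each error as it arrives, stop at the first one, or collect them as `find_banned` does.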

Now that we have all of the underlying infrastructure set up, let's use the BannedFinder in our main.


    // js and banned are the values we built earlier in main
    let finder = BannedFinder {
        scanner: Scanner::new(js),
        banned,
    };
    for item in finder {
        match item {
            Ok(_) => (),
            Err(msg) => println!("{}", msg),
        }
    }
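If the linter is going to run in a script or a build step, it may also be handy to know how many tokens failed. The following is a sketch under that assumption, not part of the original project; `report` is a hypothetical helper that works with any iterator of results whose error type implements Display.

```rust
use std::fmt::Display;

// Hypothetical helper: prints every failure and returns how many there were.
fn report<I, E>(results: I) -> usize
where
    I: Iterator<Item = Result<(), E>>,
    E: Display,
{
    let mut failures = 0;
    for item in results {
        if let Err(msg) = item {
            println!("{}", msg);
            failures += 1;
        }
    }
    failures
}
```

In main, the loop could then be replaced with `if report(finder) > 0 { std::process::exit(1); }` so that a file containing banned tokens produces a non-zero exit status.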

That is pretty much it. If you want to see the full project you can find it in the lint-ie8 folder of this book's github repository.

Demo