This post is a follow-up on the previous post on the topic of writing a lexer and is part of a series of posts in which I try to design and implement a compiler written in Rust. Intermediate code for expressions is the main topic of this post but there has also been some updates on the lexer code and the updates will be display at the end. This post as well as the future posts about the parser are heavily inspired by the legendary dragon book and its example code. The source code for the compiler frontend discussed in this series of posts is available on GitHub (link to the repo).

Intermediate code is an important part when writing a parser. It implements the nodes used in the abstract syntax tree (AST). The nodes can be divided into statements and expressions. Before we get into the implementation of expression nodes let us define a grammar for our language:

program     ->      block
block       ->      { stmts }
stmts       ->      stmts stmt | ε
stmt        ->      id = bool;
            |       if (bool) stmt
            |       if (bool) stmt else stmt
            |       while (bool) stmt
            |       break;
            |       block
bool        ->      bool || join | join
join        ->      join && equality | equality
equality    ->      equality == rel | equality != rel | rel
rel         ->      expr < expr | expr <= expr | expr >= expr |
                    expr > expr | expr
expr        ->      expr + term | expr - term | term
term        ->      term * unary | term / unary | unary
unary       ->      ! unary | - unary | factor
factor      ->      ( bool ) | id | num | true | false

From the grammar above one can detect the expression nodes by looking for the operators that are associated with expressions. In our language the expression nodes are Id (variables), Arith (arithmetic operators), Temp (helper node to generate three address code), Unary (unary operator -), Constant (numbers and boolean values), And, Or, Rel (relational operators) and Not (unary operator !). These nodes contain functions that allow the parser to turn the token stream generated by the lexer into intermediate representation (IR). Let us start by implementing some IR generating functions shared by all the expression nodes. In inter.rs:

use super::lexer::Token;
use std::rc::Rc;
use std::cell::RefCell;

pub trait ExprNode {
    fn emit_label(&self, i: u32) {
        println!("L{}:", i);
    }

    fn emit(&self, s: String) {
        println!("\t{}", s);
    }

    fn emit_jumps(&self, test: String, t: u32, f: u32) {
        if t != 0 && f != 0 {
            self.emit(format!("if {} goto L{}", test, t));
            self.emit(format!("goto L{}", f));
        } else if t != 0 {
            self.emit(format!("if {} goto L{}", test, t));
        } else if f != 0 {
            self.emit(format!("iffalse {} goto L{}", test, f));
        }
    }

    fn jumping(&self, t: u32, f: u32) {
        self.emit_jumps(self.to_string(), t, f);
    }

    fn to_string(&self) -> String;
}

The Rc and RefCell imports are for later use. The emit_label function emits a label with with given label numer i, emit emits a string s, emit_jumps generates IR code for logical expressions. It is given a string s, which often contains a logical expression, and label numbers t and f for a situation where condition is true and where it is false respectively. jumping calls emit_jumps for expression node itself. to_string gives a string representation of the expression node.

Now we will define a union of all the expression node types. This makes it easier to link the child nodes to the parent node. Instead of the union type I found it easier to work with enum. Let us also implement some helper functions for the union. In inter.rs:

//..
#[derive(Debug, Clone)]
pub enum ExprUnion {
    Id(Box<Id>),
    Arith(Box<Arith>),
    Temp(Box<Temp>),
    Unary(Box<Unary>),
    Constant(Box<Constant>),
    Or(Box<Or>),
    And(Box<And>),
    Not(Box<Not>),
    Rel(Box<Rel>),
}

impl ExprUnion {
    fn gen_expr_string(&self) -> String {
        match self {
            ExprUnion::Id(id) => id.to_string(),
            ExprUnion::Arith(arith) => arith.gen().to_string(),
            ExprUnion::Temp(temp) => temp.to_string(),
            ExprUnion::Unary(unary) => unary.gen().to_string(),
            ExprUnion::Constant(constant) => constant.to_string(),
            ExprUnion::Or(or) => or.gen().to_string(),
            ExprUnion::And(and) => and.gen().to_string(),
            ExprUnion::Not(not) => not.gen().to_string(),
            ExprUnion::Rel(rel) => rel.gen().to_string(),
        }
    }

    fn jumping(&self, t: u32, f: u32) {
        match self {
            ExprUnion::Id(id) => id.jumping(t, f),
            ExprUnion::Arith(arith) => arith.jumping(t, f),
            ExprUnion::Temp(temp) => temp.jumping(t, f),
            ExprUnion::Unary(unary) => unary.jumping(t, f),
            ExprUnion::Constant(constant) => constant.jumping(t, f),
            ExprUnion::Or(or) => or.jumping(t, f),
            ExprUnion::And(and) => and.jumping(t, f),
            ExprUnion::Not(not) => not.jumping(t, f),
            ExprUnion::Rel(rel) => rel.jumping(t, f),
        }
    }
}

The gen_expr_string function generates a three address code string representation of the desired node. The jumping function helps us call the jumping function of the desired expression node. Now we can implement the expression nodes.

Let us start with Id. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Id {
    token: Token,
}

impl Id {
    pub fn new(token: Token) -> Self {
        Id { token }
    }
}

impl ExprNode for Id {
    fn to_string(&self) -> String {
        self.token.clone().value_to_string()
    }
}

The Id node simply contains a token Id. Id nodes are used to represent variables in the AST.

Arith node is a bit different. It has two whild nodes which can be a direct value or a new arithmetic operation. When generating the IR we need to make sure that each arithmetic operation is represented in three address code. For example, an operation represented like

z = x + y + 2;

is not three address code but

t1 = x + y;
z = t1 + 2; 

is. This is where the Temp node mentioned earlier comes in. It’s purpose is to generate temporary values and it makes three address code IR possible. We will implement it after Arith. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Arith {
    op: Token,
    temp_count: Rc<RefCell<u32>>,
    expr1: ExprUnion,
    expr2: ExprUnion,
}

impl Arith {
    pub fn gen(&self) -> Self {
        let mut e1 = self.expr1.clone();
        let mut e2 = self.expr2.clone();

        match e1 {
            ExprUnion::Arith(arith) => {
                e1 = ExprUnion::Temp(Box::new(arith.reduce()));
            }
            ExprUnion::Unary(unary) => {
                e1 = ExprUnion::Temp(Box::new(unary.reduce()));
            }
            _ => (),
        }

        match e2 {
            ExprUnion::Arith(arith) => {
                e2 = ExprUnion::Temp(Box::new(arith.reduce()));
            }
            ExprUnion::Unary(unary) => {
                e2 = ExprUnion::Temp(Box::new(unary.reduce()));
            }
            _ => (),
        }

        Arith::new(self.op.clone(), Rc::clone(&self.temp_count), e1, e2)
    }

    pub fn new(op: Token, temp_count: Rc<RefCell<u32>>, expr1: ExprUnion, expr2: ExprUnion) -> Self {
        Arith {
            op,
            temp_count,
            expr1,
            expr2,
        }
    }

    fn reduce(&self) -> Temp {
        let temp = Temp::new(Rc::clone(&self.temp_count));
        self.emit(format!("{} = {}", temp.to_string(), self.gen().to_string()));
        temp
    }
}

impl ExprNode for Arith {
    fn jumping(&self, t: u32, f: u32) {
        self.emit_jumps(self.to_string(), t, f);
    }
    fn to_string(&self) -> String {
        let e1 = self.expr1.gen_expr_string();
        let e2 = self.expr2.gen_expr_string();
        format!("{} {} {}", e1, self.op.clone().value_to_string(), e2)
    }
}

The gen function helps us generate a three address code representation of the node by reducing the child nodes in case they are unary or arithmetic operators. The reduce function reduces the node itself into a Temp node that is in the form of three address code. jumping creates jumping code for the node. The temp_count field keeps track of the temporary value numbers (t1, t2, t3, ...) and is shared between all nodes that make use of it. It will eventually be initialzed in the Parser struct. Arith uses it when it initializes a new Temp node.

The next node we will implement is Temp. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Temp {
    number: u32,
}

impl Temp {
    pub fn new(temp_count: Rc<RefCell<u32>>) -> Self {
        let mut c = temp_count.borrow_mut();
        *c += 1;
        let number = *c;
        drop(c);
        Temp { number }
    }
}

impl ExprNode for Temp {
    fn to_string(&self) -> String {
        format!("t{}", self.number)
    }
}

It is pretty straicht forward. The constructor gets the temp_count, borrows the value, increments it and assigns the incremented value to its field number. The Unary node is quite similar to Arith except that there is only one child node. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Unary {
    op: Token,
    temp_count: Rc<RefCell<u32>>,
    expr: ExprUnion,
}

impl Unary {
    pub fn gen(&self) -> Self {
        let mut e = self.expr.clone();
        match e {
            ExprUnion::Arith(arith) => {
                e = ExprUnion::Temp(Box::new(arith.reduce()));
            }
            ExprUnion::Unary(unary) => {
                e = ExprUnion::Temp(Box::new(unary.reduce()));
            }
            _ => (),
        }
        Unary::new(self.op.clone(), Rc::clone(&self.temp_count), e)
    }
    pub fn new(op: Token, temp_count: Rc<RefCell<u32>>, expr: ExprUnion) -> Self {
        Unary {
            op,
            temp_count,
            expr,
        }
    }

    fn reduce(&self) -> Temp {
        let temp = Temp::new(Rc::clone(&self.temp_count));
        self.emit(format!("{} = {}", temp.to_string(), self.gen().to_string()));
        temp
    }
}

impl ExprNode for Unary {
    fn jumping(&self, t: u32, f: u32) {
        self.emit_jumps(self.to_string(), t, f);
    }

    fn to_string(&self) -> String {
        let e = self.expr.gen_expr_string();
        format!("{} {}", self.op.clone().value_to_string(), e)
    }
}

The Constant node is quite similar to Id. The difference is in the jumping function, which needs to handle situations where the constant is boolean. If the constant is true it should emit jump into the label with label number t and to label number f in case it is false. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Constant {
    constant: Token,
}

impl Constant {
    pub fn new(constant: Token) -> Self {
        Constant { constant }
    }
}

impl ExprNode for Constant {
    fn jumping(&self, t: u32, f: u32) {
        match self.constant {
            Token::True(_) => {
                if t != 0 {
                    self.emit(format!("goto L{}", t));
                }
            }
            Token::False(_) => {
                if f != 0 {
                    self.emit(format!("goto L{}", f));
                }
            }
            _ => ()
        }
    }

    fn to_string(&self) -> String {
        self.constant.clone().value_to_string()
    }
}

To generate three address code IR the Or and And nodes sometimes need to generate a lot of jumping code. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Or {
    label: Rc<RefCell<u32>>,
    temp_count: Rc<RefCell<u32>>,
    op: Token,
    expr1: ExprUnion,
    expr2: ExprUnion,
}

impl Or {
    fn gen(&self) -> Temp {
        let mut l = self.label.borrow_mut();
        *l += 1;
        let f = *l;
        *l += 1;
        let a = *l;
        drop(l);

        let temp = Temp::new(Rc::clone(&self.temp_count));
        self.jumping(0, f);
        self.emit(format!("{} = true", temp.to_string()));
        self.emit(format!("goto L{}", a));
        self.emit_label(f);
        self.emit(format!("{} = false", temp.to_string()));
        self.emit_label(a);
        temp
    }
    pub fn new(
        label: Rc<RefCell<u32>>,
        temp_count: Rc<RefCell<u32>>,
        op: Token,
        expr1: ExprUnion,
        expr2: ExprUnion,
    ) -> Self {
        Or {
            label,
            temp_count,
            op,
            expr1,
            expr2,
        }
    }
}

impl ExprNode for Or {
    fn jumping(&self, t: u32, f: u32) {
        let mut new_label = t;
        if t == 0 {
            let mut l = self.label.borrow_mut();
            *l += 1;
            new_label = *l;
            drop(l);
        }
        self.expr1.jumping(new_label, 0);
        self.expr2.jumping(t, f);
        if t == 0 {
            self.emit_label(new_label);
        }
    }

    fn to_string(&self) -> String {
        let e1 = self.expr1.gen_expr_string();
        let e2 = self.expr2.gen_expr_string();
        format!("{} {} {}", e1, self.op.clone().value_to_string(), e2)
    }
}

#[derive(Debug, Clone)]
pub struct And {
    label: Rc<RefCell<u32>>,
    temp_count: Rc<RefCell<u32>>,
    op: Token,
    expr1: ExprUnion,
    expr2: ExprUnion,
}

impl And {
    fn gen(&self) -> Temp {
        let mut l = self.label.borrow_mut();
        *l += 1;
        let f = *l;
        *l += 1;
        let a = *l;
        drop(l);

        let temp = Temp::new(Rc::clone(&self.temp_count));
        self.jumping(0, f);
        self.emit(format!("{} = true", temp.to_string()));
        self.emit(format!("goto L{}", a));
        self.emit_label(f);
        self.emit(format!("{} = false", temp.to_string()));
        self.emit_label(a);
        temp
    }
    pub fn new(
        label: Rc<RefCell<u32>>,
        temp_count: Rc<RefCell<u32>>,
        op: Token,
        expr1: ExprUnion,
        expr2: ExprUnion,
    ) -> Self {
        And {
            label,
            temp_count,
            op,
            expr1,
            expr2,
        }
    }
}

impl ExprNode for And {
    fn jumping(&self, t: u32, f: u32) {
        let mut new_label = f;
        if f == 0 {
            let mut l = self.label.borrow_mut();
            *l += 1;
            new_label = *l;
            drop(l);
        }
        self.expr1.jumping(0, new_label);
        self.expr2.jumping(t, f);
        if f == 0 {
            self.emit_label(new_label);
        }
    }

    fn to_string(&self) -> String {
        let e1 = self.expr1.gen_expr_string();
        let e2 = self.expr2.gen_expr_string();
        format!("{} {} {}", e1, self.op.clone().value_to_string(), e2)
    }
}

Both nodes have a label field which is a bit similar to temp_count. Number of labels is a single u32 value that is shared among all the nodes that need it. The gen functions of the nodes generate labels to jump to depending on the outcome of the operator on the expressions. It then assigns a temporary variable to true or false depending on which label the jump was made. The jumping functions are responsible for generating the jumping code of the child nodes.

The Not node is simple. It shares the same gen function with Or and And. For jumping it just calls the jumping of the child node. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Not {
    op: Token,
    label: Rc<RefCell<u32>>,
    temp_count: Rc<RefCell<u32>>,
    expr: ExprUnion,
}

impl Not {
    fn gen(&self) -> Temp {
        let mut l = self.label.borrow_mut();
        *l += 1;
        let f = *l;
        *l += 1;
        let a = *l;
        drop(l);

        let temp = Temp::new(Rc::clone(&self.temp_count));
        self.jumping(0, f);
        self.emit(format!("{} = true", temp.to_string()));
        self.emit(format!("goto L{}", a));
        self.emit_label(f);
        self.emit(format!("{} = false", temp.to_string()));
        self.emit_label(a);
        temp
    }
    pub fn new(
        op: Token,
        label: Rc<RefCell<u32>>,
        temp_count: Rc<RefCell<u32>>,
        expr: ExprUnion,
    ) -> Self {
        Not {
            op,
            label,
            temp_count,
            expr,
        }
    }
}

impl ExprNode for Not {
    fn jumping(&self, t: u32, f: u32) {
        self.expr.jumping(f, t);
    }

    fn to_string(&self) -> String {
        let e = self.expr.gen_expr_string();
        format!("{} {}", self.op.clone().value_to_string(), e)
    }
}

Last but not least we have the Rel node. It shares the same gen functions with the rest of the logical expression nodes. For jumping it needs to reduce the its child nodes if they are Arith or Unary. Then it just emits jumps to t or f based on how it evaluates. In inter.rs:

//...
#[derive(Debug, Clone)]
pub struct Rel {
    op: Token,
    label: Rc<RefCell<u32>>,
    temp_count: Rc<RefCell<u32>>,
    expr1: ExprUnion,
    expr2: ExprUnion,
}

impl Rel {
    fn gen(&self) -> Temp {
        let mut l = self.label.borrow_mut();
        *l += 1;
        let f = *l;
        *l += 1;
        let a = *l;
        drop(l);

        let temp = Temp::new(Rc::clone(&self.temp_count));
        self.jumping(0, f);
        self.emit(format!("{} = true", temp.to_string()));
        self.emit(format!("goto L{}", a));
        self.emit_label(f);
        self.emit(format!("{} = false", temp.to_string()));
        self.emit_label(a);
        temp
    }
    pub fn new(
        op: Token,
        label: Rc<RefCell<u32>>,
        temp_count: Rc<RefCell<u32>>,
        expr1: ExprUnion,
        expr2: ExprUnion,
    ) -> Self {
        Rel {
            op,
            label,
            temp_count,
            expr1,
            expr2,
        }
    }
}

impl ExprNode for Rel {
    fn jumping(&self, t: u32, f: u32) {
        let mut e1 = self.expr1.clone();
        let mut e2 = self.expr2.clone();

        match e1 {
            ExprUnion::Arith(arith) => {
                e1 = ExprUnion::Temp(Box::new(arith.reduce()));
            }
            ExprUnion::Unary(unary) => {
                e1 = ExprUnion::Temp(Box::new(unary.reduce()));
            }
            _ => (),
        }

        match e2 {
            ExprUnion::Arith(arith) => {
                e2 = ExprUnion::Temp(Box::new(arith.reduce()));
            }
            ExprUnion::Unary(unary) => {
                e2 = ExprUnion::Temp(Box::new(unary.reduce()));
            }
            _ => (),
        }

        let test = format!(
            "{} {} {}",
            e1.gen_expr_string(),
            self.op.clone().value_to_string(),
            e2.gen_expr_string()
        );
        self.emit_jumps(test, t, f);
    }

    fn to_string(&self) -> String {
        let e1 = self.expr1.gen_expr_string();
        let e2 = self.expr2.gen_expr_string();
        format!("{} {} {}", e1, self.op.clone().value_to_string(), e2)
    }
}

Phew! That conclued all the intermediate expression code in our rusty compiler so far. I hope this post was not too long and difficult to read. If you find any flaws in the code or have suggestions for improvements please contact me!

Updated lexer code

Updated lexer code in lexer.rs:

use std::collections::{HashMap, LinkedList};

#[derive(Debug, Clone)]
pub enum Token {
    Num(f64), // numbers are currently float only, maybe splitting into Num(u64) and Real(f64) later...
    Id(String),
    True(String),
    False(String),
    If(String),
    Else(String),
    While(String),
    And(String),
    Or(String),
    Eql(String),
    Ne(String),
    Le(String),
    Ge(String),
    Lt(String),
    Gt(String),
    Asgn(String),
    Not(String),
    Add(String),
    Sub(String),
    Mul(String),
    Div(String),
    Lcb(String),
    Rcb(String),
    Lrb(String),
    Rrb(String),
    Scol(String),
}

impl Token {
    pub fn value_to_string(self) -> String {
        match self {
            Token::Num(i) => format!("{}", i),
            Token::Id(s) => s,
            Token::True(s) => s,
            Token::False(s) => s,
            Token::If(s) => s,
            Token::Else(s) => s,
            Token::While(s) => s,
            Token::And(s) => s,
            Token::Or(s) => s,
            Token::Eql(s) => s,
            Token::Ne(s) => s,
            Token::Le(s) => s,
            Token::Ge(s) => s,
            Token::Lt(s) => s,
            Token::Gt(s) => s,
            Token::Asgn(s) => s,
            Token::Not(s) => s,
            Token::Add(s) => s,
            Token::Sub(s) => s,
            Token::Mul(s) => s,
            Token::Div(s) => s,
            Token::Lcb(s) => s,
            Token::Rcb(s) => s,
            Token::Lrb(s) => s,
            Token::Rrb(s) => s,
            Token::Scol(s) => s,
        }
    }
}

#[derive(Debug, Clone)]
pub struct Lexer {
    pub tokens: LinkedList<Token>,
    words: HashMap<String, Token>,
    lineno: u32,
}

impl Lexer {
    pub fn new() -> Lexer {
        Lexer {
            tokens: LinkedList::new(),
            words: HashMap::from([
                (String::from("true"), Token::True(String::from("true"))),
                (String::from("false"), Token::False(String::from("false"))),
                (String::from("if"), Token::If(String::from("if"))),
                (String::from("else"), Token::Else(String::from("else"))),
                (String::from("while"), Token::While(String::from("while"))),
            ]),
            lineno: 1,
        }
    }
    pub fn lex(&mut self, input: &str) {
        let mut it = input.chars().peekable();

        while let Some(&c) = it.peek() {
            match c {
                ' ' | '\t' => {
                    it.next();
                }
                '\n' => {
                    self.lineno += 1;
                    it.next();
                }
                '&' => {
                    it.next();
                    let ch = it.peek();
                    if let Some('&') = ch {
                        self.tokens.push_back(Token::And(String::from("&&")));
                        it.next();
                    } else {
                        self.tokens.push_back(Token::Id(String::from("&")));
                    };
                }
                '|' => {
                    it.next();
                    let ch = it.peek();
                    if let Some('|') = ch {
                        self.tokens.push_back(Token::Or(String::from("||")));
                        it.next();
                    } else {
                        self.tokens.push_back(Token::Id(String::from("|")));
                    };
                }
                '=' => {
                    it.next();
                    let ch = it.peek();
                    if let Some('=') = ch {
                        self.tokens.push_back(Token::Eql(String::from("==")));
                        it.next();
                    } else {
                        self.tokens.push_back(Token::Asgn(String::from("=")));
                    };
                }
                '!' => {
                    it.next();
                    let ch = it.peek();
                    if let Some('=') = ch {
                        self.tokens.push_back(Token::Ne(String::from("!=")));
                        it.next();
                    } else {
                        self.tokens.push_back(Token::Not(String::from("!")));
                    };
                }
                '<' => {
                    it.next();
                    let ch = it.peek();
                    if let Some('=') = ch {
                        self.tokens.push_back(Token::Le(String::from("<=")));
                        it.next();
                    } else {
                        self.tokens.push_back(Token::Lt(String::from("<")));
                    };
                }
                '>' => {
                    it.next();
                    let ch = it.peek();
                    if let Some('=') = ch {
                        self.tokens.push_back(Token::Ge(String::from(">=")));
                        it.next();
                    } else {
                        self.tokens.push_back(Token::Gt(String::from(">")));
                    };
                }
                '+' => {
                    self.tokens.push_back(Token::Add(String::from("+")));
                    it.next();
                }
                '-' => {
                    self.tokens.push_back(Token::Sub(String::from("-")));
                    it.next();
                }
                '*' => {
                    self.tokens.push_back(Token::Mul(String::from("*")));
                    it.next();
                }
                '/' => {
                    self.tokens.push_back(Token::Div(String::from("/")));
                    it.next();
                }
                '{' => {
                    self.tokens.push_back(Token::Lcb(String::from("{")));
                    it.next();
                }
                '}' => {
                    self.tokens.push_back(Token::Rcb(String::from("}")));
                    it.next();
                }
                '(' => {
                    self.tokens.push_back(Token::Lrb(String::from("(")));
                    it.next();
                }
                ')' => {
                    self.tokens.push_back(Token::Rrb(String::from(")")));
                    it.next();
                }
                ';' => {
                    self.tokens.push_back(Token::Scol(String::from(";")));
                    it.next();
                }
                '0'..='9' => {
                    let mut n = c
                        .to_string()
                        .parse::<f64>()
                        .expect("Character not a digit.");

                    it.next();
                    let mut digitch = it.peek();

                    while let Some(&i) = digitch {
                        if !i.is_ascii_digit() {
                            if i == '.' {
                                let mut d = 10.0;
                                it.next();
                                digitch = it.peek();

                                while let Some(&j) = digitch {
                                    if !j.is_ascii_digit() {
                                        digitch = None;
                                    } else {
                                        let f = j
                                            .to_string()
                                            .parse::<f64>()
                                            .expect("Character not a digit.");
                                        n += f / d;
                                        d *= 10.0;
                                        it.next();
                                        digitch = it.peek();
                                    }
                                }
                            } else {
                                digitch = None;
                            }
                        } else {
                            let digit = i
                                .to_string()
                                .parse::<f64>()
                                .expect("Character not a digit.");
                            n = n * 10.0 + digit;
                            it.next();
                            digitch = it.peek();
                        }
                    }
                    self.tokens.push_back(Token::Num(n));
                }
                'A'..='Z' | 'a'..='z' => {
                    let mut s = String::new();
                    s.push(c);

                    it.next();
                    let mut ch = it.peek();
                    while let Some(&i) = ch {
                        if !i.is_ascii_digit() && !i.is_alphabetic() {
                            ch = None;
                        } else {
                            s.push(i);
                            it.next();
                            ch = it.peek();
                        }
                    }
                    match self.words.get(&s) {
                        Some(t) => self.tokens.push_back(Token::clone(t)),
                        None => {
                            self.tokens.push_back(Token::Id(s.clone()));
                            self.words.insert(s.clone(), Token::Id(s));
                        }
                    }
                }
                _ => {
                    self.tokens.push_back(Token::Id(String::from(c)));
                    it.next();
                }
            }
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn correct_amount_of_tokens() {
        let input = String::from("1 _ != && =ok 3.4 1.0=_");
        let mut lexer = Lexer::new();
        lexer.lex(&input);
        assert_eq!(10, lexer.tokens.len())
    }

    #[test]
    fn correct_token_types() {
        let input = String::from("1 _ while { != && =ok 3.4 1.0=_ true false if else true1");
        let mut lexer = Lexer::new();
        lexer.lex(&input);
        let output = format!("{:?}", lexer.tokens);
        assert_eq!(
            r#"[Num(1.0), Id("_"), While("while"), Lcb("{"), Ne("!="), And("&&"), Asgn("="), Id("ok"), Num(3.4), Num(1.0), Asgn("="), Id("_"), True("true"), False("false"), If("if"), Else("else"), Id("true1")]"#,
            output
        )
    }

    #[test]
    fn correct_block_handling() {
        let input = String::from("while {(*/;)}");
        let mut lexer = Lexer::new();
        lexer.lex(&input);
        let output = format!("{:?}", lexer.tokens);
        assert_eq!(
            r#"[While("while"), Lcb("{"), Lrb("("), Mul("*"), Div("/"), Scol(";"), Rrb(")"), Rcb("}")]"#,
            output
        )
    }
}