Refactoring: from strategy to interpreter pattern with Jest testing tool
Through my previous article, we explored how to extract conditionals into functions using simple and convenient use cases. However, more complicated situations won’t be addressed by simple extractions.
Along this exercice, we will replace a switch statement with a well known concept — Polymorphism.
Problem statement
First, as an example, we’ll focus on evaluating simple Algebraic Expressions such as “1 + (6 + 5) * 8” or in a more general manner by respecting associativity and Operator Precedence:
a OP b OP c
(a OP b) OP c
a OP (b OP c)
- OP stands for operator and a, b, c for operands
As with most of my examples, we have to think of this particular problem as a stand-in for something even more complicated.
First solution with switch statement
The obvious solution that comes to mind is an Expression Class that deals with an operator and left/right operands:
The whole expression is split into three parts where left/right operands could also be split in order to respect associativity.
Jest Unit Tests
Before we start implementing any solutions, we install Jest and configure it to be run with Typescript classes:
npm i -D jest @types/jest ts-jest typescript
Next, we add “jest.config.js” to the root of our project:
We assume we have a package.json setup, all typescript classes in src folder and expectations in tests folder.
Source code with all configuration and setup instructions could be found at the master branch of: https://github.com/elie29/expression
Now, we can simply call jest on the command line.
Writing our first expectations
Let’s start by creating some expectations in order to expose and clarify our problem statement:
Running this code with Jest would fail as no solution is yet provided:
Time to implement expression evaluation
As explained above, the evaluate function determines the result recursively by interpreting the operators. Following the same logic, a switch statement would be written as follows:
We run Jest after the modification, all tests should pass :
Indeed, this simple demonstration reveals a weak solution that violates OCP principle. Each time, we want to add or exploit a new operator, we need to modify the switch case.
Switch to polymorphism
When we analyse our expression carefully, we notice that an operand is either a value or an expression. Another generic solution is to move “Expression” to an abstract class where :
- ValueExpression returns simply the value
- OperatorExpression operates on left and right operands
The following steps expand the solution with a polymorphism concept:
Step 1: Expression as abstract
Expression abstract class is restricted to handle only the evaluate function signature:
Step 2: Operator Expression as abstract
We create another abstract class to handle Addition/Multiplication operators where each operator needs left and right expressions:
Step 3: Value Expression as a concrete class
Value expression returns simply the value:
Final step: Concrete Operators
Each operator acts as a concrete class by implementing its own logic evaluation:
Adding another operator becomes straightforward by creating a new class without any changes among existing classes.
Apply changes to expectations
Our unit tests need to be changed to fit the new design:
With polymorphism though we have enhanced the application as follows:
- No more operators required in the constructor
- No need to have a switch statement to handle the operators
Even so, this solution is not quite practical nor attractive. What about resolving the following expression 😅:
((-5 + 8) * (-3 + -2) * ((4 + 8) / (5 - -1) - 3) * (-2 - -4) * (5 * (8 + 6)) + 4) * 8
Thankfully, interpreter pattern comes to the rescue with its Lexer/Parser concepts.
Interpreter Pattern
Interpreter pattern provides a way to evaluate expression grammar. It is a behavioral design pattern that aims to parse any expression according to grammar rules, build abstract syntax tree and then run the interpretation to return the expected result.
Defining Grammar Rules
In order to parse any language or expression, we need to define some rules. In our case, our expression grammar is relatively simple:
- One space should separates operands from operators
- Excessive parenthesis invalidates the expression
Usually for more complex cases, we would use a notation technique such as BNF.
Lexer Class
A lexer is just an expression recognizer that transforms input of characters to an array of tokens. This step is also called — tokenization.
Following our grammar above, we have one step to accomplish tokenization:
- Splitting the expression into an array
That said, the following method apply a regular expression and some trimming/filtering techniques:
Given the expression: 4 * (8 – 2), split function would return a list of tokens: [‘4’, ‘*’, ‘8’, ‘-’, ‘2’]
This is a simple lexer implementation. A more complex tokenization process could take place but this goes beyond the scope.
Parser Class
Parser depends on the Lexer class. Once tokenization is done, we loop over the array of tokens in order to create an AST by applying a variation of NPN.
We could apply a strict NPN or even RPN algorithm, but I prefer to demonstrate a different approach.
In order to rearrange tokens to respect operator precedence, I’ve decided to implement merge sort algorithm with O(NLogN) complexity.
The idea behind a merge sort algorithm is to divide the array into several sub-arrays until each one consists of a single element and merging those sub-arrays in a manner that results into a sorted list.
The same process could apply to our array of tokens. However, we divide the array, not according to its length, but depending on the index of the low precedence operator. Finally, the merge appends the operator at the left:
To apply this algorithm, two main methods are useful:
- Finding the index — pivot — to split the array into two halves:
- Applying the merge with the operator at the current index:
Indeed, some others methods are required to do some cleaning tasks and to fully respect precedence operator concept.
For further details, parser class code is available here.
Interpreter final step
Based on the parser class, the interpreter is just an aggregator that accepts an expression as input and return its result as output:
The core of the interpreter is the expression evaluation that iterates over the array of tokens by using a stack, with the expression processed as follows:
- If the current token is an operator, we check if the next two tokens are operands. If so, we evaluate the expression and push the result onto the stack;
- Otherwise, we push back the current token onto the stack.
The interpreter implementation did not eliminate the strategy pattern, on the contrary evaluateOperands method still uses it:
The new data structure — Map introduced in ES6 — let us map each operator to its class definition:
Wrapping Up
From a simple expression, we demonstrated how to get rid of a switch statement in favor of polymorphism. We extended the strategy pattern by introducing a simple lexer/parser implementation.
Moreover, expectations were very helpful to finalize refactoring. Each single modification requires to verify that tests were passed successfully — for that purpose, I have started this article by adding and configuring Jest.
The complete source code is available at this address: https://github.com/elie29/expression
Each step has its own dedicated branch.
Don’t hesitate to give me your feedback, or contact me on Twitter @elie_nehme