Old problems, new ways of solving it?
If all you have is a hammer, everything looks like a nail.
This sentence summarizes pretty much how I felt about the idiomatic polymorphism approach in Java to most problems until recently. Polymorphism is just a tool and as for any tool, it excels at some tasks and performs poorly at others.
So what are the tasks where polymorphism does a good job at?
I would say it is a good tool when you have a few functions operating on a potentially large or unknown number of data types. Or when the number of types will vary over time. This model is more behavior-centric where every data type brings its behavior to a fixed and small set of functions.
However, there is another use case where a fixed and small set of data types exists, and an unknown number of functions rely on those types as a group. A typical example is a tree, graph, or other recursive data structure where it is difficult for a function to have a meaning as part of the types themselves.
This second use case is usually covered in Java using the Visitor pattern. This is, unfortunately, not the most transparent or straightforward design pattern.
But with the recent improvements and preview features in the JDK 17, can we do better?
Example: Simplifying a boolean expression
Before going into a code example, let’s first agree on the problem to tackle.
We will implement a boolean expression NOT
simplifier.
Why choosing a boolean expression for this example? A boolean expression is something very simple to model and understand. However, there are very few core operations on a boolean expression. Evaluating is one of those, but simplifying an expression is not part of it as there are too many moving parts or ways to do so. Hence the regular approach would use a visitor pattern.
The data model
We will have an Expression
interface with a few default methods and static factories and four records types implementing it:
public interface Expression {
default Expression not() { return new Not(this); }
default Expression and(Expression other) { return new And(this, other); }
default Expression or(Expression other) { return new Or(this, other); }
static Expression not(Expression expression) { return expression.not(); }
}
/**
* A simple boolean variable having a name
*/
public record Variable(String name) implements Expression {}
/**
* Negates an expression
*/
public record Not(Expression expression) implements Expression {}
/**
* Combines two expressions using a logical <strong>and</strong>
*/
public record And(Expression left, Expression right) implements Expression {}
/**
* Combines two expressions using a logical <strong>or</strong>
*/
public record Or(Expression left, Expression right) implements Expression {}
Records were introduced in JDK 16 via JEP 395: Records after two previews in JDK 14 and JDK 15. I encourage you to read the article Why Java’s Records Are Better* Than Lombok’s @Data and Kotlin’s Data Classes from Nicolai Parlog. It is a very detailed dive into what records are and what they’re not.
The goal
The goal is to implement a method with the following signature:
/**
* <p>Takes an {@link Expression} and builds an other {@link Expression} where the {@link Not}
* are pushed to {@link Variable} leaves.
* For example:
* <ul>
* <li>{@code NOT(A AND B) => (NOT A) OR (NOT B)}</li>
* <li>{@code NOT(A OR B) => (NOT A) AND (NOT B)}</li>
* </ul>
* </p>
* <p>When two {@link Not} are consecutive, they are eliminated.
* For example: {@code NOT(NOT(A)) => A}</p>
*
* @param e the {@link Expression} to simplify
* @return the simplified {@link Expression}.
*/
public static Expression simplify(Expression e) {
throw new UnsupportedOperationException();
}
The visitor pattern
I have some respect for you and I will therefore spare you the full implementation of the visitor pattern. If you’re still interested in the implementation, it is available on Github.
The naive non-visitor pattern implementation with if statements
In Java 15 and before, without using preview features, we would have had to write something like this:
public static Expression simplify(Expression e) {
if (e instanceof Variable) {
return e;
}
if (e instanceof And) {
And and = (And) e;
return simplify(and.left()).and(simplify(and.right()));
}
if (e instanceof Or) {
Or or = (Or) e;
return simplify(or.left()).or(simplify(or.right()));
}
if (e instanceof Not) {
return simplifyNot((Not)e);
}
throw new UnsupportedOperationException("Expression type not supported " + e.getClass());
}
public static Expression simplifyNot(Not not) {
Expression e = not.expression();
if (e instanceof Variable) {
// nothing to simplify
return not;
}
if (e instanceof And) {
// NOT(A AND B) => NOT(A) OR NOT(B)
And and = (And) e;
return simplify(not(and.left())).or(simplify(not(and.right())));
}
if (e instanceof Or) {
// NOT(A OR B) => NOT(A) AND NOT(B)
Or or = (Or) e;
return simplify(not(or.left())).and(simplify(not(or.right())));
}
if (e instanceof Not) {
// NOT(NOT(A)) => A
return simplify(((Not)e).expression());
}
throw new UnsupportedOperationException("Expression type not supported " + e.getClass());
}
This approach has two issues:
It is lengthy as we do redundant
instanceof
/castsIt is not proven complete at compilation time, hence the need to throw
UnsupportedOperationException
if no condition is met. In case a sub-type toExpression
is added, the compiler will not tell us, and we might only discover it only at runtime.
Since Java 15, we can improve the first point by leveraging on JEP 394: Pattern Matching for instanceof. It avoids the cumbersome casts and allows to rewrite code:
if (obj instanceof String) {
String s = (String) obj; // mandatory cast
}
into this one:
if (obj instanceof String s) {
// s is a String
}
Applying it to our example, it would look like this:
public static Expression simplify(Expression e) {
if (e instanceof Variable v) {
return v;
}
if (e instanceof And and) {
return simplify(and.left()).and(simplify(and.right()));
}
if (e instanceof Or or) {
return simplify(or.left()).or(simplify(or.right()));
}
if (e instanceof Not not) {
return simplifyNot(not);
}
throw new UnsupportedOperationException("Expression type not supported " + e.getClass());
}
public static Expression simplifyNot(Not not) {
Expression e = not.expression();
if (e instanceof Variable v) {
return not;
}
if (e instanceof And and) {
return simplify(not(and.left())).or(simplify(not(and.right())));
}
if (e instanceof Or or) {
return simplify(not(or.left())).and(simplify(not(or.right())));
}
if (e instanceof Not notnot) {
return simplify(notnot.expression());
}
throw new UnsupportedOperationException("Expression type not supported " + e.getClass());
}
That’s how far we can get with the if
/else
approach.
We cannot unfortunately have completeness checked by the compiler at this point.
The switch-based approach
Say hello to the switch expressions
Using a switch-based approach, we would improve readability and could potentially tackle the completeness issue. But for this, we need JEP 361: Switch Expressions which was introduced in JDK 14 after two previews.
This allows to rewrite the following code in a much nicer way:
int numLetters;
switch (day) {
case MONDAY:
case FRIDAY:
case SUNDAY:
numLetters = 6;
break;
case TUESDAY:
numLetters = 7;
break;
case THURSDAY:
case SATURDAY:
numLetters = 8;
break;
case WEDNESDAY:
numLetters = 9;
break;
default:
throw new IllegalStateException("Wat: " + day);
}
int numLetters = switch (day) {
case MONDAY, FRIDAY, SUNDAY -> 6;
case TUESDAY -> 7;
case THURSDAY, SATURDAY -> 8;
case WEDNESDAY -> 9;
};
As you can see, the throw
clause disappeared because a completeness check is performed.
A compilation error is thrown if not all cases are covered.
Assuming we remove the case WEDNESDAY
, the following compilation error is raised:
| Error:
| the switch expression does not cover all possible input values
| int numLetters = switch (day) {
| ^-------------...
This is a great addition that makes switch
more robust than if
/else
as you’ll be warned at compilation time if not all branches are covered.
The JEP also states the following to cover potential runtimes issues:
in the case of an enum switch expression that covers all known constants, a default clause is inserted by the compiler to indicate that the enum definition has changed between compile-time and runtime.
Now, this is not enough for us as we would like to perform our switch on the object type.
Also greet the pattern matching for switch
To achieve this, we need to rely on JEP 406: Pattern Matching for switch (Preview) from JDK 17.
To give a rough idea of what this is about, you can think of it as if JEP 394: Pattern Matching for instanceof and JEP 361: Switch Expressions had a baby.
So, let’s take a look at our code with this amazing feature:
public static Expression simplify(Expression e) {
return switch (e) {
case Variable v -> v;
case And and -> simplify(and.left()).and(simplify(and.right()));
case Or or -> simplify(or.left()).or(simplify(or.right()));
case Not not -> simplifyNot(not);
default -> throw new UnsupportedOperationException("Expression type not supported " + e.getClass());
};
}
public static Expression simplifyNot(Not not) {
return switch (not.expression()) {
case Variable v -> not;
case And and -> simplify(and.left().not()).or(simplify(and.right().not()));
case Or or -> simplify(or.left().not()).and(simplify(or.right().not()));
case Not nonot -> simplify(nonot.expression());
default -> throw new UnsupportedOperationException("Expression type not supported " + not.getClass());
};
}
Wait, why do we have to include this default
clause?
Remember when we said that switch expressions must be complete?
Not doing so would result in a compiler error: the switch expression does not cover all possible input values
.
That’s not very nice to have.
Especially as we know, that they are only four implementations of the Expression
interface.
However, the compiler does not know it.
But there is a way to let the compiler know and to enforce it. This is thanks to the JEP 409: Sealed Classes from JDK 17. So much goodness in this JDK 17.
We can now explicitly say that Expression
is one of Variable
, And
, Or
or Not
and nothing else.
Let’s update the interface declaration to represent it:
public sealed interface Expression permits Variable, And, Or, Not {
// ...
}
If we now try to declare a Xor
expression which can also be expressed as: (A OR B) AND (NOT(A) OR NOT(B))
we will get the following compilation error:
class is not allowed to extend sealed class: fr.loicrouchon.novisits.Expression (as it is not listed in its permits clause)
Going back to the switch we can now express it without the need for the default
clause:
public static Expression simplify(Expression e) {
return switch (e) {
case Variable v -> v;
case And and -> simplify(and.left()).and(simplify(and.right()));
case Or or -> simplify(or.left()).or(simplify(or.right()));
case Not not -> simplifyNot(not);
};
}
public static Expression simplifyNot(Not not) {
return switch (not.expression()) {
case Variable v -> not;
case And and -> simplify(and.left().not()).or(simplify(and.right().not()));
case Or or -> simplify(or.left().not()).and(simplify(or.right().not()));
case Not nonot -> simplify(nonot.expression());
};
}
This is it, we reached our goal thanks to JEP 406: Pattern Matching for switch (Preview).
The guard pattern
We did not need this feature in our little example, but this JEP also defines the notion of guard patterns. With guard patterns, an additional boolean expression can be added next to the type pattern.
One can replace the following code:
static void test(Object o) {
switch (o) {
case String s:
if (s.length() == 1) { ... }
else { ... }
break;
...
}
}
by:
static void test(Object o) {
switch (o) {
case String s && (s.length() == 1) -> ...
case String s -> ...
...
}
}
About instanceof
/casting and switch
on types
It used to be strongly discouraged to perform instanceof
/cast
operations in if
statements and conceptually out of reach for switch
statements.
Why do we start to do so now?
I haven’t played enough with instanceof
and casting in my career to have an answer to this part.
I was most of the time creative enough to avoid being in such a situation and I feel I would need to unlearn first to be able to provide a decent answer here.
However, when it comes to the switch
, it feels like a very handy replacement for the visitor design pattern.
They were two major complaints against the switch
:
implementation far from the type and scattered across the codebase
The visitor pattern has the same issue, so wherever the visitor pattern was good for, the pattern matching for switch is also good enough.
completeness: When a new type/constant is added, the
if
/else
/switch
will not complain at compilation time and issues will arise at runtime.
Here, the switch
expression solves the issue by checking for completeness.
If a new type/case is added, a compilation error will arise.
Unless a manual default
clause has been written.
So when writing switch-expressions, be aware of the hidden cost of the default
clause and try to favor the usage of sealed types.
Looking at the Java pattern future
This article covered the yet-to-be-released JEP 406: Pattern Matching for switch (Preview) and how it enables the Java language to express more, more clearly.
The code can be found on Github.
But can we look even further? In JDK 18 and beyond for example? As of June 20th, 2021, the list of JEP that will land in JDK 18 is not yet known. But we do know about one candidate which is JEP 405: Record Patterns & Array Patterns (Preview).
This will bring deconstruction patterns for both Arrays and Records.
We could then imagine writing things like:
public static Expression simplify(Expression e) {
return switch (e) {
case Variable v -> v;
case And(var left, var right) -> simplify(left).and(simplify(right));
case Or(var left, var right) -> simplify(left).or(simplify(right));
case Not(var expression) -> simplifyNot(expression);
};
}
public static Expression simplifyNot(Expression e) {
return switch (e) {
case Variable v -> not(e);
case And(var left, var right) -> simplify(not(left)).or(simplify(not(right)));
case Or(var left, var right) -> simplify(not(left)).and(simplify(not(right)));
case Not(var expression) -> simplify(expression);
};
}
Or even, all in one switch:
public static Expression simplify(Expression e) {
return switch (e) {
case Variable v -> e;
case And(var left, var right) -> simplify(left).and(simplify(right));
case Or(var left, var right) -> simplify(left).or(simplify(right));
case Not(Variable v) -> e;
case Not(And(var left, var right) -> simplify(not(left)).or(simplify(not(right)));
case Not(Or(var left, var right) -> simplify(not(left)).and(simplify(not(right)));
case Not(Not(var expression)) -> simplify(expression);
};
}
The JEP even says the following:
Today, to express ad-hoc polymorphic calculations like this we would use the cumbersome visitor pattern. In the future, using pattern matching will lead to code that is transparent and straightforward
Keep in mind, this last part regarding JEP 405: Record Patterns & Array Patterns (Preview) is based on my current understanding of the JEP and that the JEP will evolve before making it into the JDK.
To conclude I’d like to say this:
What seemed to be unrelated JEPs in the beginning, switch expressions, records, pattern matching (instance of, switch), deconstruction patterns, is slowly but surely converging toward a highly consistent and well-thought design which I can’t stop being amazed at.