Equals and hashCode
The methods equals(Object) and hashCode() are used to denote equivalence and have existed since before the Java Collections Framework. So they should be well understood, right?
Unfortunately, there is still plenty of room for confusion, especially in relation to database entities (whether modelled as old EJB 2 entity beans or as JPA @Entity classes). The problems occur for two reasons:
- It's easy to violate the equals() contract.
- While legal, a changing return value from hashCode() or equals() causes undefined behaviour in Map and Set implementations. This causes bugs.
For the initial part of this essay, the hashCode() method will be all but ignored. The reason is that equals() is much more difficult to get right. Once we have that correct, hashCode() follows easily.
The contract
According to its Javadoc, the equals method implements an equivalence relation on non-null object references:
- It is reflexive: for any non-null reference value x, x.equals(x) should return true.
- It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
- It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
- It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
- For any non-null reference value x, x.equals(null) should return false.
The tricky part about this contract is the combination of symmetry, transitivity and inheritance. The reason becomes clear when you have a class structure with non-trivial extensions. Many (relatively) simple implementations fail to provide either symmetry or transitivity. This is detailed by Angelika Langer & Klaus Kreft in Secrets of equals(). And although they show in a second installment that it's possible to maintain both symmetry and transitivity by using a Visitor pattern to navigate the class hierarchy, this is much too convoluted for most (if not all) cases.
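To illustrate how a seemingly simple implementation loses symmetry, here is a minimal sketch. The classes Point and ColorPoint are hypothetical and not taken from the referenced articles; the subclass adds a field to the comparison, which is what makes it a non-trivial extension.

```java
// Hypothetical base class: compares coordinates, accepts any Point (including subclasses).
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Point)) {
            return false;
        }
        Point other = (Point) obj;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Hypothetical non-trivial extension: also compares the colour.
// hashCode() is inherited, which stays consistent with this equals().
class ColorPoint extends Point {
    final String color;

    ColorPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ColorPoint)) {
            return false; // a plain Point is never equal to a ColorPoint...
        }
        return super.equals(obj) && color.equals(((ColorPoint) obj).color);
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        ColorPoint cp = new ColorPoint(1, 2, "red");
        System.out.println(p.equals(cp)); // true:  Point ignores the colour
        System.out.println(cp.equals(p)); // false: symmetry is broken
    }
}
```

Relaxing ColorPoint.equals() so that it ignores the colour when compared with a plain Point restores symmetry, but then two ColorPoint instances with different colours can both equal the same Point without equalling each other, which breaks transitivity.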
Considerations for mere mortals
We've seen that if you're really smart, it's possible to create a complex class hierarchy where equals() is implemented fully correctly. Usually though, this is not needed.
To see why, let us first look at our classes. We'll notice there are two basic types: behavioural classes and value classes. Note that the difference is quite subtle: value classes do have some behaviour, as guided by the Tell, don't ask principle. Behavioural classes, however, never need to have data. Important here is that, because of the Single Responsibility Principle, a behavioural class cannot become a value class, nor vice versa.
The easiest to handle are behavioural classes. Since they are all about what they do, you'll usually not have many instances, and when you do there'll be no point in comparing them. So you'll use the equals() and hashCode() implementations of java.lang.Object.
For value classes, the class hierarchy becomes important. The choice you need to make is whether you'll allow subclass instances to be equal to superclass instances and, if you do, whether you'll allow non-trivial extensions. Let's summarize the options briefly:
- No subclass/superclass instance equality
- The easiest to get right, but a bit limited: it's unusable with EJBs, as they often use proxies. Also, this implementation violates the Liskov Substitution Principle: when you substitute an instance with a subclass instance, the result of equals is different. If you don't mind that, a typical implementation looks like this:

```java
@Override
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    if (obj != null && getClass() == obj.getClass()) {
        // Cast and compare all relevant fields.
        // Return the result.
    }
    return false; // obj is null or of the wrong class
}

@Override
public int hashCode() {
    final int modifier = 31; // Any odd prime will do.
    // (look at multiple powers of a number in binary to find out why)
    int hashCode = 17; // Any non-zero number will do.
    // Repeat the following line for each relevant field,
    // replacing 0 with the hash code of that field.
    hashCode = hashCode * modifier + 0;
    /*
     * Hash codes for primitives:
     * boolean -> value ? 1 : 0
     * byte, char, short, int -> (int) value
     * long -> (int) (value ^ (value >>> 32))
     * float -> Float.floatToIntBits(value)
     * double -> (int) (Double.doubleToLongBits(value) ^ (Double.doubleToLongBits(value) >>> 32))
     */
    return hashCode;
}
```
- Allow only trivial extensions to be equal
- A little bit trickier, because you need a final modifier. The downside is severely limited subclassing, but it's usable with EJBs, and when you have a data model in a database, subclasses are usually (!) limited anyway. A typical implementation looks like this:

```java
@Override
public final boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    if (obj instanceof MyClass) { // Replace with your actual class name
        // Cast and compare all relevant fields.
        // Return the result.
    }
    return false; // obj is null or of the wrong class
}

@Override
public final int hashCode() {
    // Implement as above.
}
```
- Allow complex hierarchies
- The most difficult by far. Angelika Langer & Klaus Kreft have detailed a solution for this though.
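To make the first two options concrete, here is a sketch of both, filled in for a small value class. All class names (StrictMoney, LenientMoney and their subclasses) are hypothetical. The empty subclasses stand in for a trivial extension such as a generated proxy.

```java
import java.util.Objects;

// Option 1: getClass() comparison. Subclass instances never equal superclass instances.
class StrictMoney {
    private final String currency;
    private final long cents;

    StrictMoney(String currency, long cents) {
        this.currency = Objects.requireNonNull(currency);
        this.cents = cents;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj != null && getClass() == obj.getClass()) {
            StrictMoney other = (StrictMoney) obj;
            return cents == other.cents && currency.equals(other.currency);
        }
        return false;
    }

    @Override
    public int hashCode() {
        int hashCode = 17;
        hashCode = hashCode * 31 + currency.hashCode();
        hashCode = hashCode * 31 + (int) (cents ^ (cents >>> 32));
        return hashCode;
    }
}

class StrictMoneySub extends StrictMoney {
    StrictMoneySub(String currency, long cents) {
        super(currency, cents);
    }
}

// Option 2: instanceof comparison with final methods. Trivial extensions are equal.
class LenientMoney {
    private final String currency;
    private final long cents;

    LenientMoney(String currency, long cents) {
        this.currency = Objects.requireNonNull(currency);
        this.cents = cents;
    }

    @Override
    public final boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj instanceof LenientMoney) {
            LenientMoney other = (LenientMoney) obj;
            return cents == other.cents && currency.equals(other.currency);
        }
        return false;
    }

    @Override
    public final int hashCode() {
        int hashCode = 17;
        hashCode = hashCode * 31 + currency.hashCode();
        hashCode = hashCode * 31 + (int) (cents ^ (cents >>> 32));
        return hashCode;
    }
}

// A subclass cannot override equals()/hashCode(), so it cannot break the contract.
class LenientMoneySub extends LenientMoney {
    LenientMoneySub(String currency, long cents) {
        super(currency, cents);
    }
}
```

With these definitions, a StrictMoney and a StrictMoneySub holding the same values are unequal in both directions, while a LenientMoney and a LenientMoneySub holding the same values are equal in both directions and share a hash code.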
Entity beans, equality and collections
You might be tempted to use your database's primary key for equals comparisons and hashCode calculations. Don't. You'll get subtle bugs when you create an entity, add it to a collection, and then persist it. The reason is that the primary key will initially be null, and only after persisting will the database / persistence provider have given it a value. Thus, your Map or Set will give undefined results.
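The following sketch reproduces the problem without any persistence framework. The Order class is hypothetical, and the call to setId() is an assumption standing in for what a persistence provider does when it assigns the generated key.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity that (wrongly) bases equality on its primary key.
class Order {
    private Long id; // null until the entity has been persisted

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof Order && Objects.equals(id, ((Order) obj).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null
    }
}

class PrimaryKeyDemo {
    // Returns whether the set still finds the entity after its key was assigned.
    static boolean foundAfterPersist() {
        Set<Order> orders = new HashSet<>();
        Order order = new Order();
        orders.add(order);   // stored under the hash code of a null id
        order.setId(42L);    // simulates the key being assigned on persist
        return orders.contains(order);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterPersist());
    }
}
```

With OpenJDK's HashSet, foundAfterPersist() returns false: the set still holds the order, but looks for it under the new hash code and misses it.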
The correct way of implementing equals and hashCode is by using an alternate, natural candidate key. Or, in programming terms: final fields that uniquely identify your objects. Such fields do not change during the lifetime of the object, and as such are perfect for use in equals() and hashCode().
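As a sketch of this advice, the hypothetical Book entity below uses its ISBN as the natural key. The surrogate id and the setId() method are assumptions that mimic a key assigned on persist; they take no part in equals() or hashCode().

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality is based on a final natural key.
class Book {
    private Long id;            // surrogate key, assigned later; not used for equality
    private final String isbn;  // natural key, fixed for the lifetime of the object

    Book(String isbn) {
        this.isbn = Objects.requireNonNull(isbn);
    }

    void setId(Long id) {
        this.id = id;
    }

    Long getId() {
        return id;
    }

    @Override
    public final boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        return obj instanceof Book && isbn.equals(((Book) obj).isbn);
    }

    @Override
    public final int hashCode() {
        return isbn.hashCode();
    }
}

class NaturalKeyDemo {
    // Returns whether the set still finds the entity after its surrogate key was assigned.
    static boolean foundAfterPersist() {
        Set<Book> books = new HashSet<>();
        Book book = new Book("978-0-13-468599-1");
        books.add(book);
        book.setId(7L); // simulates persisting; the hash code is unaffected
        return books.contains(book);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterPersist());
    }
}
```

Here foundAfterPersist() returns true, because nothing that equals() or hashCode() depends on changes when the id is assigned.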