Guest post by Bahul Jain, a colleague of mine, and a very talented software engineer. Bahul is a Scala enthusiast and a passionate advocate for functional programming. He is our in-house shapeless guru.
In the previous post, we discovered that defining types for all kinds of data we work with is a great idea. It makes code more readable, expressive, safe, clean, reusable, and maintainable. But does it come at a cost?
Creating classes is fine, but creating objects does have some runtime implications. Object creation takes up some memory and processing time. Furthermore, for simpler data such as height, weight, name, etc., the number of objects created is generally much larger than for complex structures. The larger the number of instances, the more work the garbage collector has to do. This is probably the main reason why developers don’t bother creating extra types and instead use primitive types such as Int, Long, Boolean, etc.
In this post we will talk about addressing the performance implications of creating types everywhere. One solution we will discuss is provided by the Scala 2 compiler and the other one is a light-weight library that I just love.
AnyVal
In the previous post you may have noticed that every example type extended AnyVal in its class definition:
final case class Height(value: Long) extends AnyVal
Here’s the documentation on AnyVal in Scala’s library:
Beginning with Scala 2.10, it is possible to define a subclass of AnyVal called user-defined value class which is treated specially by the compiler. Properly-defined user value classes provide a way to improve performance on user-defined types by avoiding object allocation at runtime, and by replacing virtual method invocations with static method invocations.
Compiler Constraints
AnyVal is a neat solution the Scala compiler provides to address this performance issue, but it has some limitations too, since the JVM does not natively support the concept of value classes.
Again from the Scala documentation:
User-defined value classes which avoid object allocation…
- can only have a single val parameter.
- can define defs, but no vals, vars, or nested traits, classes or objects.
- can typically extend no other trait apart from AnyVal.
- cannot be used in type tests or pattern matching.
- may not override equals or hashCode methods.
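To make these rules concrete, here is a minimal sketch of a value class that stays within them. The method names and the assumption that the value is in centimetres are mine, purely for illustration; the commented-out definitions show shapes the compiler would reject.

```scala
object ValueClassRules {
  // Allowed: exactly one val parameter, plus any number of defs.
  final case class Height(value: Long) extends AnyVal {
    // Assumption for this sketch: the value is stored in centimetres.
    def toMeters: Double = value / 100.0
    def +(other: Height): Height = Height(value + other.value)
  }

  // Rejected by the compiler if uncommented:
  // final case class Size(w: Long, h: Long) extends AnyVal        // two parameters
  // final case class Age(value: Int) extends AnyVal {
  //   val doubled: Int = value * 2                                // val member
  // }

  val total: Height = Height(120) + Height(60)
  val meters: Double = total.toMeters
}
```

Everything the type needs has to be expressed as a def over the single wrapped value.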
Inaccessible Underlying Value Type in Companion Object
Another major shortcoming of value classes is that within the companion object of the value class there is no generic way to refer to the type of the underlying value. This is almost a deal-breaker for me. In the Height type, let’s say we want to define Ordering for this type. Here’s how we would have done it traditionally:
final case class Height(value: Int) extends AnyVal with Ordering[Height] {
  override def compare(x: Height, y: Height): Int = x.value.compareTo(y.value)
}
But Ordering is not a universal trait, so mixing Ordering[Height] into a value class violates the requirements for AnyVal: we would have to drop AnyVal and lose the compiler optimizations it offers us. Another option is to define an implicit Ordering in the companion object:
final case class Height(value: Int) extends AnyVal
object Height {
  implicit val ordering: Ordering[Height] = Ordering.by(_.value)
}
This works for the most part, but now I need to duplicate this same line of code for all value classes whose underlying value type is Int. For example, if I had to define Ordering for Distance, Age or Weight, all of which have an underlying Int value, I would have to define the same logic everywhere. In other words, Ordering cannot be generically defined for any value class (in the above example, Height) which relies on the Ordering of the underlying value type (in the above example, Int), because we don’t know the type of the underlying value in the scope of the companion object.
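A small sketch of that duplication, using only the standard library. Weight is a hypothetical second type added for illustration; the only thing that differs between the two companions is the type name.

```scala
object OrderingDuplication {
  final case class Height(value: Int) extends AnyVal
  object Height {
    implicit val ordering: Ordering[Height] = Ordering.by(_.value)
  }

  // Hypothetical second type: the ordering line is repeated verbatim,
  // with only the type name changed.
  final case class Weight(value: Int) extends AnyVal
  object Weight {
    implicit val ordering: Ordering[Weight] = Ordering.by(_.value)
  }

  val heights: List[Height] = List(Height(180), Height(150), Height(165)).sorted
  val weights: List[Weight] = List(Weight(80), Weight(55)).sorted
}
```

Each new Int-backed type would need its own copy of that one line.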
Here are some more examples where the behavior of the value class type depends on the behavior of the underlying value type:
// alleycats.Empty
implicit def empty(implicit E: Empty[Int]): Empty[Height] = Empty(E.empty)
// cats.Show
implicit def show(implicit S: Show[Int]): Show[Height] = Show.show(height => S.show(height.value))
// cats.Hash
implicit def hash(implicit H: Hash[Int]): Hash[Height] = Hash.by(height => H.hash(height.value))
Inability to Access Methods of Underlying Value Type
final case class Email(value: String) extends AnyVal
val email = Email("[email protected]")
email.toLowerCase // does not compile: Email has no toLowerCase method
email.value.toLowerCase // compiles
Here you cannot access any of the standard library methods defined on String directly on an Email. It’s not the worst thing possible, but it is slightly inconvenient.
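One workaround, sketched here under my own assumptions rather than taken from any library, is to hand-write forwarding methods on the value class for each String method you need. The forwarders below are illustrative only.

```scala
object EmailForwarding {
  final case class Email(value: String) extends AnyVal {
    // Hand-written forwarders to the underlying String (illustrative only).
    def toLowerCase: Email = Email(value.toLowerCase)
    def length: Int = value.length
  }

  val normalized: Email = Email("Alice@Example.COM").toLowerCase
  val size: Int = normalized.length
}
```

This keeps call sites tidy, but every forwarded method has to be written and maintained by hand, which is the inconvenience in question.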
Supertagged
This extremely lightweight library offers TaggedTypes and NewTypes, which are a much friendlier alternative to AnyVals. At runtime these types are erased and only the unboxed Raw type remains, so no additional memory or performance costs are incurred when using them.
Note: This is by no means comprehensive documentation of the library and everything it offers. The idea is just to highlight its salient features that help us avoid a world without types. For full documentation, see the library’s own documentation.
Creating Custom Types
Here’s how you would define a custom type (TaggedType or NewType) using this library:
object HeightCms extends TaggedType[Int] /* or extends NewType[Int] */ {
  // all implicits defined here will be found for HeightCms.Type
  implicit final class Ops(private val value: Type) extends AnyVal {
    // member methods for HeightCms can be defined here
    def toMeters: Double = value.toDouble / 100.0
  }
}
type HeightCms = HeightCms.Type
val height: HeightCms = HeightCms(100)
height.toMeters // 1.0
val height1: HeightCms = 100 @@ HeightCms // alternate way to tag a value with a `TaggedType`
Type Relationship
A type created using TaggedType is a subtype of the underlying Raw type. With this relationship defined, the TaggedType gets access to all the methods of the Raw type.
object Height extends TaggedType[Int]
type Height = Height.Type
val height = Height(10)
height.toDouble // .toDouble is a method defined on Int
TaggedTypes are particularly useful when strict new typing is not required, but you just need separate semantics (for example, if you don’t want to mix up Height and Weight). Since a TaggedType is a subtype of the Raw type, it can be used wherever the Raw type is expected.
object Height extends TaggedType[Double]
type Height = Height.Type
object Weight extends TaggedType[Double]
type Weight = Weight.Type
def calculateBmi(height: Double, weight: Double): Double = ??? // implementation omitted
calculateBmi(Height(100.0), Weight(79.0)) // this compiles fine because Height is a subtype of Double
A type created using NewType, as the name suggests, is a completely new type with no relationship to the Raw type at all. It’s recommended to use NewType when strict typing is required.
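The difference between the two relationships can be sketched in plain Scala with abstract type members. This is not supertagged's actual encoding: TaggedLike, NewLike, unwrap and half are hypothetical names invented for this illustration.

```scala
object TypeRelationSketch {
  // Hypothetical stand-in for a tagged type: Type is a subtype of the raw type R.
  trait TaggedLike[R] {
    type Type <: R
    def apply(raw: R): Type = raw.asInstanceOf[Type]
  }

  // Hypothetical stand-in for a new type: Type has no relationship to R.
  trait NewLike[R] {
    type Type
    def apply(raw: R): Type = raw.asInstanceOf[Type]
    def unwrap(t: Type): R = t.asInstanceOf[R]
  }

  object Height extends TaggedLike[Double]
  object Name extends NewLike[String]

  def half(d: Double): Double = d / 2

  // Accepted: Height.Type is a subtype of Double.
  val halved: Double = half(Height(180.0))

  // Name.Type is opaque: name.length or half(name) would not compile.
  val name: Name.Type = Name("Ada")
  val rawName: String = Name.unwrap(name)
}
```

In this sketch the tagged value flows into any function that expects the raw type, while the new type has to be unwrapped explicitly before the raw value can be used.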
Multi-Tagging
Using this library, one can tag multiple TaggedTypes to a value. A multi-tagged value benefits from the member methods of all the tags associated with it. This does not work with NewType.
object PositiveNumber extends TaggedType[Int] {
  implicit final class Ops(private val value: Type) extends AnyVal {
    def double: Int @@ PositiveNumber.Tag & EvenNumber.Tag = (value * 2) @@@ PositiveNumber @@@ EvenNumber
  }
}
type PositiveNumber = PositiveNumber.Type
object EvenNumber extends TaggedType[Int] {
  implicit final class Ops(private val value: Type) extends AnyVal {
    def half: PositiveNumber = PositiveNumber(value / 2)
  }
}
type EvenNumber = EvenNumber.Type
val positiveEvenNumber = 26 @@@ PositiveNumber @@@ EvenNumber
// positiveEvenNumber: Int @@ PositiveNumber.Tag & EvenNumber.Tag = 26
positiveEvenNumber.double // 52
positiveEvenNumber.half // 13
Type Definitions
In both TaggedType and NewType you have the ability to access the custom type and underlying value type from within the companion object and elsewhere.
object Height extends TaggedType[Int] { // or extends NewType[Int]
  // here Type is type of Height
  implicit final class Ops(private val value: Type) extends AnyVal {
    // and here Raw is the type of underlying value which is Int in this case
    def getRaw: Raw = this.untag(value)
  }
}
type Height = Height.Type
type HeightRaw = Height.Raw
val height: Height = Height(10)
val heightRaw: HeightRaw = 10
Lifted Behavior
Now that we have access to all the type definitions pertaining to the custom type, we can easily create lifted behavior for the custom type based on the behavior of the underlying type. To make this concept reusable, we can define the lifted behavior logic in a trait and mix that trait into the custom type we create. Let’s take the Ordering example again:
trait LiftedOrdering {
  type Raw
  type Type
  implicit def ordering(implicit O: Ordering[Raw]): Ordering[Type] = Ordering.by(_.asInstanceOf[Raw])
}
object Height extends TaggedType[Int] with LiftedOrdering // Ordering of Height will be same as ordering of Int
type Height = Height.Type
object Weight extends TaggedType[Int] with LiftedOrdering // Ordering of Weight will be same as ordering of Int
type Weight = Weight.Type
object Name extends NewType[String] with LiftedOrdering // Ordering of Name will be same as ordering of String
type Name = Name.Type
Note: LiftedOrdering is defined for all TaggedTypes by default in the supertagged library.
Pattern Matching
Both TaggedTypes and NewTypes can be pattern-matched upon without sacrificing performance, unlike AnyVals.
val height = Height(5)
height match {
  case Height(5) => //...
}
Conclusion
Through this post we learnt the various ways in which all low-level data can be modeled as types, without incurring any performance or memory penalties. More languages and compilers should offer this ability so that developers can model any kind of data as a type without worrying about runtime optimizations.
Next, we explored the supertagged library, which provides great flexibility and versatility in defining custom types. Thanks to the structure of TaggedType and NewType, we can easily define behavior for them in a clean and reusable way.
In the next post, we’ll understand the theory of Algebraic Data Types and how it can help the compiler define a super strong type system and offer developers capabilities that one cannot imagine.