A world without types - Part 2

Bahul Jain

Guest post by Bahul Jain, a colleague of mine, and a very talented software engineer. Bahul is a Scala enthusiast and a passionate advocate for functional programming. He is our in-house shapeless guru.

In the previous post, we discovered that defining types for all kinds of data we work with is a great idea. It makes code more readable, expressive, safe, clean, reusable, and maintainable. But does it come at a cost?

Creating classes is fine, but creating objects does have some runtime implications. Object creation takes up some memory and processing time. Furthermore, for simpler data such as height, weight, name, etc., the number of objects created is generally much larger than the number created for complex structures. The larger the number of instances, the more work the garbage collector has to do. This is probably the main reason why developers don’t bother creating extra types and instead use primitive types such as Int, Long, Boolean, etc.

In this post we will talk about addressing the performance implications of creating types everywhere. One solution we will discuss is provided by the Scala 2 compiler and the other one is a light-weight library that I just love.

AnyVal

In the previous post you may have noticed that every example type extended AnyVal in its class definition.

final case class Height(value: Long) extends AnyVal

Here’s the documentation on AnyVal in Scala’s library:

Beginning with Scala 2.10, it is possible to define a subclass of AnyVal called user-defined value class which is treated specially by the compiler. Properly-defined user value classes provide a way to improve performance on user-defined types by avoiding object allocation at runtime, and by replacing virtual method invocations with static method invocations.

Compiler Constraints

AnyVal is a neat solution the Scala compiler provides to address this performance issue, but it has some limitations too, since the JVM does not natively support the concept of value classes.

Again from the Scala documentation:

User-defined value classes which avoid object allocation…

  • can only have a single val parameter.
  • can define defs, but no vals, vars, or nested traits, classes or objects.
  • can typically extend no other trait apart from AnyVal.
  • cannot be used in type tests or pattern matching.
  • may not override equals or hashCode methods.
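A minimal sketch of what these constraints look like in practice, assuming Scala 2. The names `ValueClassSketch` and `Meters` are hypothetical and only serve as illustration. The commented-out line shows a member the compiler would reject, and the last definition shows one of the situations (use as a generic type argument) in which a real wrapper object is allocated after all.

```scala
object ValueClassSketch {
  // A well-formed value class: exactly one val parameter, and only defs in the body.
  final case class Meters(value: Double) extends AnyVal {
    def +(other: Meters): Meters = Meters(value + other.value)
    def toCentimeters: Double = value * 100.0
    // val cached = value * 100.0 // would not compile: value classes may not define vals
  }

  // List[Meters] uses Meters as a generic type argument, so each element
  // is a real Meters instance here rather than a bare Double.
  val total: Meters = List(Meters(1.5), Meters(2.0)).foldLeft(Meters(0.0))(_ + _)
}
```

In other words, the allocation-free behavior is a best-effort optimization that depends on how the value is used, not a guarantee attached to the type.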

Inaccessible Underlying Value Type in Companion Object

Another major shortcoming of value classes is that within the companion object of the value class there is no way to know the type of the underlying value. This is almost a deal-breaker for me. In the Height type, let’s say we want to define Ordering for this type. Here’s how we would have done it traditionally:

final case class Height(value: Int) extends AnyVal with Ordering[Height] {
  override def compare(x: Height, y: Height): Int = x.value.compareTo(y.value)
}

But Ordering is not a universal trait, so extending Ordering[Height] violates the requirements for value classes: the compiler rejects this definition, and we do not get the optimizations that AnyVal offers. Another option is to define an implicit Ordering in the companion object:

final case class Height(value: Int) extends AnyVal
object Height {
  implicit val ordering: Ordering[Height] = Ordering.by(_.value)
}

This works for the most part, but now I need to duplicate this same line of code for every value class whose underlying value type is Int. For example, if I had to define Ordering for Distance, Age, or Weight, all of which have an underlying Int value, I would have to repeat the same logic in each companion object. In other words, Ordering cannot be generically defined for any value class (in the above example, Height) which relies on the Ordering of the underlying value type (in the above example, Int), because we don’t know the type of the underlying value in the scope of the companion object.
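To make the duplication concrete, here is a small self-contained sketch using only the standard library. The wrapper object `OrderingDuplication` and the sample values are made up for illustration; the two companion objects differ only in the type name.

```scala
object OrderingDuplication {
  final case class Height(value: Int) extends AnyVal
  object Height {
    implicit val ordering: Ordering[Height] = Ordering.by(_.value)
  }

  final case class Weight(value: Int) extends AnyVal
  object Weight {
    // The same line again; only the type name has changed.
    implicit val ordering: Ordering[Weight] = Ordering.by(_.value)
  }

  // Both orderings are picked up from the respective companion objects.
  val sortedHeights: List[Height] = List(Height(180), Height(165), Height(172)).sorted
  val heaviest: Weight = List(Weight(70), Weight(82)).max
}
```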

Here are some more examples where the behavior of the value class type depends on the behavior of the underlying value type:

// alleycats.Empty
implicit def empty(implicit E: Empty[Int]): Empty[Height] = Empty(Height(E.empty))

// cats.Show
implicit def show(implicit S: Show[Int]): Show[Height] = Show.show(height => S.show(height.value))

// cats.Hash
implicit def hash(implicit H: Hash[Int]): Hash[Height] = Hash.by(height => H.hash(height.value))

Inability to Access Methods of Underlying Value Type

final case class Email(value: String) extends AnyVal
val email = Email("[email protected]")

email.toLowerCase       // fails compilation
email.value.toLowerCase // compiles

Here you cannot access any of the standard library methods defined on String directly on the Email value; you have to go through .value every time. It’s not the worst thing possible, but it is slightly inconvenient.
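One possible workaround, which is not part of the original example, is to forward the handful of String operations you actually need as defs on the value class. The sketch below assumes Scala 2; the names `EmailSketch`, `lowercased`, and `domain`, as well as the sample address, are hypothetical.

```scala
import java.util.Locale

object EmailSketch {
  final case class Email(value: String) extends AnyVal {
    // Forward only the String operations that make sense for an email address.
    def lowercased: Email = Email(value.toLowerCase(Locale.ROOT))
    // Everything after the first '@'; assumes the address contains one.
    def domain: String = value.substring(value.indexOf('@') + 1)
  }

  val normalized: Email = Email("Jane.Doe@Example.COM").lowercased
}
```

This keeps the type strict, but every forwarded method has to be written by hand, which is exactly the kind of boilerplate the next section tries to avoid.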

Supertagged

This extremely light-weight library offers TaggedTypes and NewTypes, which are a much friendlier alternative to AnyVals. At runtime these types are erased and only the unboxed Raw type remains, so no additional memory or performance costs are incurred when using them.

Note: This is by no means comprehensive documentation of the library and everything it offers. The idea is just to highlight its salient features that help us avoid a world without types. For full documentation, see the library’s own documentation.
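To give a rough idea of how a type can exist only at compile time, here is a minimal sketch of the general tagging technique in Scala 2. This is not supertagged’s actual implementation; `TaggingSketch`, `Tagged`, `tag`, `HeightTag`, and `WeightTag` are all hypothetical names chosen for illustration.

```scala
object TaggingSketch {
  // Phantom marker: no value ever implements it; it exists only for the type checker.
  trait Tagged[U] extends Any
  type @@[T, U] = T with Tagged[U]

  // Tagging is an unchecked cast, so the runtime value is still the raw T.
  def tag[T, U](raw: T): T @@ U = raw.asInstanceOf[T @@ U]

  sealed trait HeightTag
  sealed trait WeightTag
  type Height = Int @@ HeightTag
  type Weight = Int @@ WeightTag

  def describe(height: Height): String = s"$height cm"

  val height: Height = tag[Int, HeightTag](180)
  val weight: Weight = tag[Int, WeightTag](75)

  // A tagged Int is still an Int, so Int methods remain available.
  val doubled: Int = height * 2

  // describe(weight) // does not compile: a Weight is not a Height
  // describe(180)    // does not compile: a plain Int is not a Height
}
```

The compiler keeps Height and Weight apart, while the value that flows through the program is an ordinary Int.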

Creating Custom Types

Here’s how you would define a custom type (TaggedType or NewType) using this library:

object HeightCms extends TaggedType[Int] /* or extends NewType[Int] */ {
  // all implicits defined here will be found for HeightCms.Type

  implicit final class Ops(private val value: Type) extends AnyVal {
    // member methods for HeightCms can be defined here

    def toMeters: Double = value.toDouble / 100.0
  }
}
type HeightCms = HeightCms.Type

val height: HeightCms = HeightCms(100)
height.toMeters  // 1.0

val height1: HeightCms = 100 @@ HeightCms // alternate way to tag a value with a `TaggedType`

Type Relationship

A type created using TaggedType is a subtype of the underlying Raw type. With this relationship defined, the TaggedType gets access to all the methods of the Raw type.

object Height extends TaggedType[Int]
type Height = Height.Type

val height = Height(10)
height.toDouble // .toDouble is a method defined on Int

TaggedTypes are particularly useful when strict new typing is not required and you just need separate semantics (for example, if you don’t want to mix up Height and Weight). Since a TaggedType is a subtype of the Raw type, it can be used wherever the Raw type is expected.

object Height extends TaggedType[Double]
type Height = Height.Type

object Weight extends TaggedType[Double]
type Weight = Weight.Type

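// illustrative body; assumes height in centimeters and weight in kilograms
def calculateBmi(height: Double, weight: Double): Double = weight / math.pow(height / 100.0, 2)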

calculateBmi(Height(100.0), Weight(79.0)) // this compiles fine because Height is a subtype of Double

A type created using NewType, as the name suggests, is a completely new type with no relationship to the Raw type at all. It’s recommended to use NewType when strict typing is required.
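The difference can be illustrated without the library. The following is a minimal sketch of one well-known way to encode a new type with no visible relationship to its raw type, by hiding a type alias behind an abstract type member. It is not how supertagged implements NewType, and `NewTypeSketch`, `NewTypeOf`, `newTypeOf`, and `greet` are hypothetical names.

```scala
object NewTypeSketch {
  // The interface exposes Type as abstract, so callers cannot see that it equals Raw.
  trait NewTypeOf[R] {
    type Raw = R
    type Type
    def apply(raw: Raw): Type
    def raw(value: Type): Raw
  }

  // The only implementation defines Type = R, so apply and raw simply return their argument.
  def newTypeOf[R]: NewTypeOf[R] = new NewTypeOf[R] {
    type Type = R
    def apply(raw: R): R = raw
    def raw(value: R): R = value
  }

  val Name: NewTypeOf[String] = newTypeOf[String]
  type Name = Name.Type

  def greet(name: Name): String = "Hello, " + Name.raw(name)

  val name: Name = Name("Ada")

  // name.length  // does not compile: Name.Type exposes no String methods
  // greet("Ada") // does not compile: a String is not a Name
}
```

Unlike the tagged encoding, nothing here lets a Name stand in for a String or the other way around; every crossing of that boundary is explicit.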

Multi-Tagging

Using this library, one can apply multiple TaggedTypes to a single value. A multi-tagged value benefits from the member methods of all the tags associated with it. This does not work with NewType.

object PositiveNumber extends TaggedType[Int] {
  implicit final class Ops(private val value: Type) extends AnyVal {
    def double: Int @@ PositiveNumber.Tag & EvenNumber.Tag = (value * 2) @@@ PositiveNumber @@@ EvenNumber
  }
}
type PositiveNumber = PositiveNumber.Type

object EvenNumber extends TaggedType[Int] {
  implicit final class Ops(private val value: Type) extends AnyVal {
    def half: PositiveNumber = PositiveNumber(value / 2)
  }
}
type EvenNumber = EvenNumber.Type

val positiveEvenNumber = 26 @@@ PositiveNumber @@@ EvenNumber
// positiveEvenNumber: Int @@ PositiveNumber.Tag & EvenNumber.Tag = 26

positiveEvenNumber.double // 52
positiveEvenNumber.half   // 13

Type Definitions

In both TaggedType and NewType you have the ability to access the custom type and underlying value type from within the companion object and elsewhere.

object Height extends TaggedType[Int] { // or extends NewType[Int]
  // here Type is type of Height
  implicit final class Ops(private val value: Type) extends AnyVal {
    // and here Raw is the type of underlying value which is Int in this case
    def getRaw: Raw = untag(value) // `this` here would refer to Ops, not to the Height object
  }
}

type Height = Height.Type
type HeightRaw = Height.Raw

val height: Height = Height(10)
val heightRaw: HeightRaw = 10

Lifted Behavior

Now that we have access to all the type definitions pertaining to the custom type we can easily create lifted behavior for the custom type based on the behavior of the underlying type. To make this concept reusable we can define the lifted behavior logic in a trait and mix-in that trait with the custom type we create. Let’s take the Ordering example again:

trait LiftedOrdering {
  type Raw
  type Type
  implicit def ordering(implicit O: Ordering[Raw]): Ordering[Type] = Ordering.by(_.asInstanceOf[Raw])
}

object Height extends TaggedType[Int] with LiftedOrdering // Ordering of Height will be same as ordering of Int
type Height = Height.Type

object Weight extends TaggedType[Int] with LiftedOrdering // Ordering of Weight will be same as ordering of Int
type Weight = Weight.Type

object Name extends NewType[String] with LiftedOrdering // Ordering of Name will be same as ordering of String
type Name = Name.Type

Note: LiftedOrdering is defined for all TaggedTypes by default in the supertagged library.

Pattern Matching

Unlike AnyVals, both TaggedTypes and NewTypes can be used in pattern matching without sacrificing performance.

val height = Height(5)

height match {
  case Height(5) => //...
}

Conclusion

Through this post we learnt the various ways in which all low-level data can be modeled as types, without incurring any performance or memory penalties. More languages and compilers should offer this ability so that developers can model any kind of data as a type without worrying about runtime optimizations.

Next, we explored the supertagged library, which provides great flexibility and versatility in defining custom types. Thanks to the structure of TaggedType and NewType, we can easily define behavior for them in a clean and reusable way.

In the next post, we’ll understand the theory of Algebraic Data Types and how it can help the compiler define a super strong type system and offer developers capabilities that one cannot imagine.
