A world without types - Part 1

Bahul Jain

Guest post by Bahul Jain, a colleague of mine, and a very talented software engineer. Bahul is a Scala enthusiast and a passionate advocate for functional programming. He is our in-house shapeless guru.

Types are one of the cornerstones of many programming languages. They define the kind of data we work with, and help the compiler understand what operations are safe to perform on that data. Scala is one such language whose strong, static type system is particularly useful, as it allows us to express and enforce constraints that improve code clarity, safety, and reusability.

As developers, we define types to model data all the time, using classes, abstract classes, traits and so on. More often than not, we only define types for complicated data such as Person, Car or Shape. But do we ever consider creating types for simpler data such as height, weight, age or name? The answer is almost always no.

In this post, we’ll explore why types are essential, why every piece of data our code operates on should have a specific type, and why a world without types is not one I want to live in. The examples and explanations use Scala 2.

Semantic Sense: Making Code More Meaningful

Types are a powerful tool in providing semantic meaning to your code and data model. They help clarify the purpose of data, making it easier for developers to understand, reason about, and work with.

Long vs. Height

Imagine you’re working with a value that represents the height of a person. Using the Long type for this can be ambiguous. The Long type represents a 64-bit integer, but it doesn’t convey what the value actually represents. On the other hand, using a more specific type like Height gives meaning to the data:

final case class Height(cms: Long) extends AnyVal

val personHeight: Height = Height(180)

In this case, the Height type clearly communicates the intention of the value, improving readability and reducing ambiguity. If you had simply used Long, the value could represent anything: a timestamp, a distance, or any other quantity stored as a large integer.

By defining a custom type like Height, you ensure that the compiler, as well as human readers, understand the context of the data.

Methods Defined in Appropriate Context

In most languages, when methods are defined for a specific type, they become semantically linked to that type. Consider the method .toCentimeters. It makes sense on types that represent a length or distance, but not on an arbitrary Long, which only sometimes holds a length. You also cannot add methods to the Long type itself; the closest Scala offers is an implicit extension, and that would apply to every Long regardless of what it represents.

final case class Length(meters: Long) extends AnyVal {
  def toCentimeters: Double = meters * 100.0
}

val width = Length(2)
println(width.toCentimeters) // 200.0 cm

The method .toCentimeters is directly tied to the Length type, and it expresses the conversion clearly, making the code more readable and easier to understand.
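For contrast, here is a minimal sketch of what attaching the same method to Long through a Scala 2 implicit class might look like. The names ExtensionSketch, LongAsMeters, widthInMeters and epochMillis are made up for illustration. The point is only that the extension cannot tell a length from any other Long:

```scala
object ExtensionSketch {
  // Hypothetical extension: adds toCentimeters to every Long,
  // silently assuming that the value is a length in meters.
  implicit class LongAsMeters(private val value: Long) extends AnyVal {
    def toCentimeters: Double = value * 100.0
  }

  val widthInMeters: Long = 2L
  val epochMillis: Long = 1700000000000L // a timestamp, not a length

  // Intended use: 2 meters is 200.0 centimeters.
  val fine: Double = widthInMeters.toCentimeters

  // Also compiles, although "centimeters of a timestamp" is meaningless.
  val nonsense: Double = epochMillis.toCentimeters
}
```

With a dedicated Length type, the second call would simply not compile, because a timestamp is not a Length.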

Defining Functions on Types

Rather than using utility libraries or scattered functions, you can define methods directly on the types they apply to. This keeps your code well-organized and reduces the cognitive load needed to understand where functionality belongs.

case class Length(meters: Long) extends AnyVal {
  def toKilometers: Double = meters / 1000.0
}

val distance = Length(1500)
println(distance.toKilometers) // 1.5 kilometers

Here, toKilometers is directly associated with the Length type, making it obvious where the functionality belongs. This organization eliminates the need to search for this function through a myriad of utility libraries.

Defining Behavior on Types

Let’s define a Person class with 3 fields:

final case class Person(
  firstName: String,
  cholesterolLevel: Double,
  twitterHandle: String
)

Now think about it for a second. Is a person’s first name personally identifiable information (PII)? Are the person’s cholesterol levels protected health information (PHI) or PII? Is the person’s Twitter handle PII, PHI or public information? The answer is different for each field: first name is PII, cholesterol level is PHI and Twitter handle is public information.

But how do we associate the privacy rule with the respective data? It’s impossible to do that now because their types are String, Double and String respectively. Not all strings are PII or PHI, and not all of them are public information either. The same applies to all primitive types.

Using custom types, resolving this is a piece of cake:

sealed trait PrivacyRule
// Companion object, so the rules can be referenced as PrivacyRule.PII etc.
object PrivacyRule {
  case object PII extends PrivacyRule
  case object PHI extends PrivacyRule
  case object Public extends PrivacyRule
}

// Value classes cannot declare val members, so privacyRule is a def.
final case class FirstName(value: String) extends AnyVal {
  def privacyRule: PrivacyRule = PrivacyRule.PII
}

final case class CholesterolLevel(value: Double) extends AnyVal {
  def privacyRule: PrivacyRule = PrivacyRule.PHI
}

final case class TwitterHandle(value: String) extends AnyVal {
  def privacyRule: PrivacyRule = PrivacyRule.Public
}

final case class Person(
  firstName: FirstName,
  cholesterolLevel: CholesterolLevel,
  twitterHandle: TwitterHandle
)

One can associate as many behaviors as required to custom types in order to drive any kind of domain decisions and build complex domain logic. It’s easy to define, centralized and clean.
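As a rough sketch of how such a rule could drive a domain decision, consider redacting values before they are logged. The forLogging helper, the PrivacySketch wrapper and the sample values are hypothetical and not part of the model above. privacyRule is written as a def because Scala 2 value classes cannot declare val members:

```scala
object PrivacySketch {
  sealed trait PrivacyRule
  object PrivacyRule {
    case object PII extends PrivacyRule
    case object PHI extends PrivacyRule
    case object Public extends PrivacyRule
  }

  final case class FirstName(value: String) extends AnyVal {
    def privacyRule: PrivacyRule = PrivacyRule.PII
  }

  final case class TwitterHandle(value: String) extends AnyVal {
    def privacyRule: PrivacyRule = PrivacyRule.Public
  }

  // Hypothetical helper: only public data is written to logs verbatim.
  def forLogging(rule: PrivacyRule, raw: String): String = rule match {
    case PrivacyRule.Public                => raw
    case PrivacyRule.PII | PrivacyRule.PHI => "<redacted>"
  }

  val name = FirstName("Ada")
  val handle = TwitterHandle("@ada")

  val loggedName: String = forLogging(name.privacyRule, name.value)
  val loggedHandle: String = forLogging(handle.privacyRule, handle.value)
}
```

Because PrivacyRule is sealed, the compiler can warn when a new rule is added and the match in forLogging is not updated.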

Increased Reusability: Code Once, Use Everywhere

Once a type is defined, it can be used throughout your codebase, promoting reusability, reducing redundancy and increasing maintainability.

Types Are Reusable

Let’s use the Height and Weight types. These types can be used across the entire application to model any data that requires a height and/or weight field:

final case class Height(cms: Long) extends AnyVal
final case class Weight(kgs: Double) extends AnyVal

// Not value classes: in Scala 2 a value class may not wrap
// another user-defined value class such as Height or Weight.
final case class Building(height: Height)

trait Person {
  val height: Height
  val weight: Weight
}

final case class Tree(height: Height)

trait Car {
  val height: Height
  val weight: Weight
}

final case class Dumbbell(weight: Weight)

By defining Height and Weight as types, you ensure consistency across your codebase. You avoid the need to repeatedly define height and weight in different contexts.

Methods Defined on Types Are Reusable

Methods defined on types are reusable across all instances of that type. For instance, a method on the Length type that returns the length in kilometers is available wherever a Length is used, without being redefined.

case class Length(meters: Long) extends AnyVal {
  def toKilometers: Double = meters / 1000.0
}

final case class River(length: Length)
final case class Bridge(span: Length)

To obtain the length of the River in kilometers or to get the span of the Bridge in kilometers the function invoked will be the same.

val hudson: River = River(length = Length(150401))
println(s"${hudson.length.toKilometers} kms") // 150.401 kms

val goldenGate: Bridge = Bridge(span = Length(2349))
println(s"${goldenGate.span.toKilometers} kms") // 2.349 kms

Without the Length type, you would have to define a val lengthKilometers: Double in River and a val spanKilometers: Double in Bridge, both of which would perform the same computation. With types this redundancy is eliminated, making the codebase cleaner and more maintainable.

Behavior Defined on Types Is Reusable

This one is quite self-explanatory and similar to methods defined on types. Once defined on a type, the behavior is shared and consistent across all instances of that type, making your code reusable and maintainable.

Safety: Reducing Bugs and Increasing Developer Productivity

A strong, static type system ensures that many errors are caught at compile time rather than at runtime. This helps developers catch mistakes early and avoid bugs in production.

Note: This applies only to statically typed languages such as Java and Scala, not to dynamically typed languages such as Python.

Ensuring Compile-time Safety

Types are checked at compile-time, which means many errors are detected before your program even runs. Let’s take a look at the calculateBmi function:

// BMI = weight in kilograms / (height in meters)^2
def calculateBmi(heightCm: Double, weightKgs: Double): Double =
  weightKgs / math.pow(heightCm / 100.0, 2)

val heightPerson: Double = 170.2
val weightPerson: Double = 79.4

calculateBmi(heightPerson, weightPerson) // no compile-time error
calculateBmi(weightPerson, heightPerson) // no compile-time error!!!

Well, why have code that can allow for such mistakes? With types defined for height and weight this kind of problem is much less likely to happen.

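// BMI = weight in kilograms / (height in meters)^2
def calculateBmi(height: Height, weight: Weight): Double =
  weight.kgs / math.pow(height.cms / 100.0, 2)

val heightPerson: Height = Height(170) // Height wraps a Long number of centimeters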
val weightPerson: Weight = Weight(79.4)

calculateBmi(heightPerson, weightPerson) // no compile-time error
calculateBmi(weightPerson, heightPerson) // Compile-time error: Type mismatch, expected: Height, actual: Weight

Without types, you could try to perform operations that don’t make sense, and these issues might only appear at runtime, making debugging that much harder.
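For completeness, here is a self-contained sketch of the typed version that can be compiled as is. The BmiSketch wrapper and the formula body are my own additions for illustration, using the standard definition of BMI as weight in kilograms divided by the square of the height in meters:

```scala
object BmiSketch {
  final case class Height(cms: Long) extends AnyVal
  final case class Weight(kgs: Double) extends AnyVal

  // BMI = weight in kilograms / (height in meters)^2
  def calculateBmi(height: Height, weight: Weight): Double = {
    val meters = height.cms / 100.0
    weight.kgs / (meters * meters)
  }

  val bmi: Double = calculateBmi(Height(170), Weight(79.4))

  // Uncommenting the next line fails compilation with a type mismatch:
  // calculateBmi(Weight(79.4), Height(170))
}
```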

Reducing Bugs

More than once I have discovered code in production systems where incorrect data was being passed to methods because two fields had the same primitive data type, such as String or Int, and the developer mistook the order in which the values were supposed to be passed. If unit tests are missing or poorly written, such bugs can easily enter production codebases.

Using types for all kinds of data ensures that you don’t accidentally misuse data in ways that could cause bugs. The compiler catches the error early in the development process, before the code ever runs.
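The swapped-argument bug described above can be sketched as follows. Email, Username and both describe functions are hypothetical names chosen for illustration, not code from a real system:

```scala
object ArgumentOrderSketch {
  final case class Email(value: String) extends AnyVal
  final case class Username(value: String) extends AnyVal

  // Primitive version: both parameters are Strings,
  // so passing them in the wrong order still compiles.
  def describeUnsafe(email: String, username: String): String =
    s"$username <$email>"

  // Typed version: swapping the arguments is a compile-time type mismatch.
  def describe(email: Email, username: Username): String =
    s"${username.value} <${email.value}>"

  // Arguments are swapped by mistake; the compiler cannot notice.
  val swapped: String = describeUnsafe("ada", "ada@example.com")

  val correct: String = describe(Email("ada@example.com"), Username("ada"))

  // Uncommenting the next line fails compilation:
  // describe(Username("ada"), Email("ada@example.com"))
}
```

In the primitive version only a test or a confused user would reveal the mistake; in the typed version the mistake never gets past the compiler.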

Increased Developer Productivity

By catching many issues during compilation, developers can focus on building features rather than dealing with hard-to-diagnose runtime errors. It gives developers more confidence in the correctness of their code which is a huge boost to morale. All of this in turn improves development speed and overall productivity. The added semantic sense improves code readability, maintainability and makes it much easier to build complex systems.

Conclusion

In this post we have essentially re-explained the benefits of classes and object-oriented programming in general, but we have extended that principle to the simplest kinds of data we work with. Defining types at the lowest levels of data makes code more meaningful and readable, encourages reuse, ensures safety at compile time, and allows us to build complex systems on simpler yet solid foundations. As your codebase grows and evolves, leveraging the power of types becomes crucial to building a robust, scalable, maintainable, less error-prone and easy-to-understand application.

In future posts, we’ll explore the performance aspects of types, and leverage libraries such as supertagged and shapeless that unleash the full potential of types in ways you cannot imagine.
