Guest post by Bahul Jain, a colleague of mine, and a very talented software engineer. Bahul is a Scala enthusiast and a passionate advocate for functional programming. He is our in-house shapeless guru.
Types are one of the cornerstones of many programming languages. They define the kind of data we work with, and help the compiler understand what operations are safe to perform on that data. Scala is one such language whose strong, static type system is particularly useful, as it allows us to express and enforce constraints that improve code clarity, safety, and reusability.
As developers, we define types to model all kinds of data all the time, using classes, abstract classes, traits, and so on. More often than not, we only define types for complicated data such as Person, Car, or Shape. But do we ever consider creating types for simpler data such as height, weight, age, or name? The answer is almost always no.
In this post, we’ll explore why types are essential, why every piece of data our code operates on should be given a specific type, and why a world without types is not one I want to live in. The examples and explanations use Scala 2.
Semantic Sense: Making Code More Meaningful
Types are a powerful tool in providing semantic meaning to your code and data model. They help clarify the purpose of data, making it easier for developers to understand, reason about, and work with.
Long vs. Height
Imagine you’re working with a value that represents the height of a person. Using the Long type for this can be ambiguous. The Long type represents a 64-bit integer, but it doesn’t convey what the value actually represents. On the other hand, using a more specific type like Height gives meaning to the data:
final case class Height(cm: Long) extends AnyVal
val personHeight: Height = Height(180)
In this case, the Height type clearly communicates the intention of the value, improving readability and reducing ambiguity. If you had simply used Long, the value could represent anything: a timestamp, a distance, or anything that uses a large integer.
By defining a custom type like Height, you ensure that the compiler, as well as human readers, understand the context of the data.
Methods Defined in Appropriate Context
In most languages, when methods are defined for a specific type, they become semantically linked to that type. Consider the method .toCentimeters. It makes sense on types representing length or distance, but not on an arbitrary Long, which only sometimes holds a length. Besides, you cannot add methods to the built-in Long type itself; the closest Scala 2 offers is an extension method via an implicit class, which would then apply to every Long.
final case class Length(meters: Long) extends AnyVal {
def toCentimeters: Double = meters * 100.0
}
val width = Length(2)
println(width.toCentimeters) // 200.0 cm
The method .toCentimeters is directly tied to the Length type, and it expresses the conversion clearly, making the code more readable and easier to understand.
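For contrast, Scala 2 does let you bolt a method onto Long through an implicit class, but the method then shows up on every Long, whatever it represents. The sketch below is only an illustration: LongLengthOps and timestampMillis are hypothetical names that do not appear in the post, and extends AnyVal is left out so the snippet also runs as a script.

```scala
// Hypothetical extension: adds toCentimeters to *every* Long.
implicit class LongLengthOps(val meters: Long) {
  def toCentimeters: Double = meters * 100.0
}

// A dedicated type: toCentimeters exists only where it makes sense.
final case class Length(meters: Long) {
  def toCentimeters: Double = meters * 100.0
}

val timestampMillis: Long = 1700000000000L
val nonsense = timestampMillis.toCentimeters // compiles, but is meaningless

val width = Length(2)
val sensible = width.toCentimeters // 200.0
```

The extension compiles for the timestamp just as happily as for a real length, which is exactly the ambiguity a dedicated type removes.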
Defining Functions on Types
Rather than using utility libraries or scattered functions, you can define methods directly on the types they apply to. This keeps your code well-organized and reduces the cognitive load needed to understand where functionality belongs.
case class Length(meters: Long) extends AnyVal {
def toKilometers: Double = meters / 1000.0
}
val distance = Length(1500)
println(distance.toKilometers) // 1.5 kilometers
Here, toKilometers is directly associated with the Length type, making it obvious where the functionality belongs. This organization eliminates the need to search for this function through a myriad of utility libraries.
Defining Behavior on Types
Let’s define a Person class with 3 fields:
final case class Person(
firstName: String,
cholesterolLevel: Double,
twitterHandle: String
)
Now think about it for a second. Is the first name of a person personally identifiable information (PII)? Is the person’s cholesterol level protected health information (PHI) or PII? Is the person’s Twitter handle PII, PHI, or public information? The answer is different for each field: first name is PII, cholesterol level is PHI, and Twitter handle is public information.
But how do we associate the privacy rule with the respective data? It’s impossible to do that now because their types are String, Double and String respectively. Not all strings are PII or PHI, and not all of them are public information either. The same applies to all primitive types.
With custom types, resolving this is a piece of cake:
sealed trait PrivacyRule
// Companion object, so the rules are referenced as PrivacyRule.PII, PrivacyRule.PHI, etc.
object PrivacyRule {
  case object PII extends PrivacyRule
  case object PHI extends PrivacyRule
  case object Public extends PrivacyRule
}
// Value classes cannot declare vals, so privacyRule is a def.
final case class FirstName(value: String) extends AnyVal {
  def privacyRule: PrivacyRule = PrivacyRule.PII
}
final case class CholesterolLevel(value: Double) extends AnyVal {
  def privacyRule: PrivacyRule = PrivacyRule.PHI
}
final case class TwitterHandle(value: String) extends AnyVal {
  def privacyRule: PrivacyRule = PrivacyRule.Public
}
final case class Person(
firstName: FirstName,
cholesterolLevel: CholesterolLevel,
twitterHandle: TwitterHandle
)
One can associate as many behaviors as required to custom types in order to drive any kind of domain decisions and build complex domain logic. It’s easy to define, centralized and clean.
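As one way to put such a rule to work, the sketch below redacts fields for logging based on each field's privacy rule. The Classified trait and the redact helper are assumptions made for this illustration and are not part of the post; extends AnyVal is omitted so the snippet also runs as a script.

```scala
sealed trait PrivacyRule
object PrivacyRule {
  case object PII extends PrivacyRule
  case object PHI extends PrivacyRule
  case object Public extends PrivacyRule
}

// Hypothetical trait so one helper can handle every classified field.
trait Classified {
  def privacyRule: PrivacyRule
}

final case class FirstName(value: String) extends Classified {
  def privacyRule: PrivacyRule = PrivacyRule.PII
}
final case class CholesterolLevel(value: Double) extends Classified {
  def privacyRule: PrivacyRule = PrivacyRule.PHI
}
final case class TwitterHandle(value: String) extends Classified {
  def privacyRule: PrivacyRule = PrivacyRule.Public
}

// Hypothetical helper: only public data is rendered as-is.
def redact(field: Classified, rendered: String): String =
  field.privacyRule match {
    case PrivacyRule.Public => rendered
    case PrivacyRule.PII    => "<PII redacted>"
    case PrivacyRule.PHI    => "<PHI redacted>"
  }

val name = FirstName("Ada")
val cholesterol = CholesterolLevel(182.5)
val handle = TwitterHandle("@ada")

val logLine = List(
  redact(name, name.value),
  redact(cholesterol, cholesterol.value.toString),
  redact(handle, handle.value)
).mkString(", ")
```

Because the rule lives on the type, the logging code never has to know which fields are sensitive; it only asks each value.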
Increased Re-usability: Code Once, Use Everywhere
Once a type is defined, it can be used throughout your codebase, promoting reusability, reducing redundancy and increasing maintainability.
Types Are Reusable
Let’s use the Height and Weight types. These types can be used across the entire application for modelling any data that requires a height and/or weight field:
final case class Height(cm: Long) extends AnyVal
final case class Weight(kgs: Double) extends AnyVal
// Scala 2 value classes cannot wrap other user-defined value classes,
// so the containers below are plain case classes.
final case class Building(height: Height)
trait Person {
  val height: Height
  val weight: Weight
}
final case class Tree(height: Height)
trait Car {
  val height: Height
  val weight: Weight
}
final case class Dumbbell(weight: Weight)
By defining Height and Weight as types, you ensure consistency across your codebase. You avoid the need to repeatedly define height and weight in different contexts.
Methods Defined on Types Are Reusable
Methods defined on types are reusable across instances of that type. For instance, a method that returns a length in kilometers, defined once on the Length type, can be called wherever a Length is used without being redefined.
case class Length(meters: Long) extends AnyVal {
def toKilometers: Double = meters / 1000.0
}
final case class River(length: Length)
final case class Bridge(span: Length)
To obtain the length of the River in kilometers or to get the span of the Bridge in kilometers, the function invoked will be the same.
val hudson: River = River(length = Length(150401))
println(s"${hudson.length.toKilometers} kms") // 150.401 kms
val goldenGate: Bridge = Bridge(span = Length(2349))
println(s"${goldenGate.span.toKilometers} kms") // 2.349 kms
Without the Length type, you would have to define a val lengthKilometers: Double in River and a val spanKilometers: Double in Bridge, both of which would perform the same operation. With types this redundancy is eliminated, making the codebase more maintainable and clean.
Behavior Defined on Types Is Reusable
This one is quite self-explanatory and similar to methods defined on types. Once defined on a type, the behavior is shared and consistent across all instances of that type, making your code reusable and maintainable.
Safety: Reducing Bugs and Increasing Developer Productivity
A strong, statically-typed system ensures that many errors are caught at compile time rather than runtime. This helps developers catch mistakes early and avoid bugs in production.
Note: This applies only to statically-typed languages such as Java and Scala, not to dynamically-typed languages such as Python.
Ensuring Compile-time Safety
Types are checked at compile-time, which means many errors are detected before your program even runs. Let’s take a look at the calculateBmi function:
// BMI = weight in kilograms / (height in meters)^2
def calculateBmi(heightCm: Double, weightKgs: Double): Double = {
  val heightM = heightCm / 100.0
  weightKgs / (heightM * heightM)
}
val heightPerson: Double = 170.2
val weightPerson: Double = 79.4
calculateBmi(heightPerson, weightPerson) // no compile-time error
calculateBmi(weightPerson, heightPerson) // no compile-time error!!!
Well, why have code that can allow for such mistakes? With types defined for height and weight, this kind of problem is much less likely to happen.
def calculateBmi(height: Height, weight: Weight): Double = {
  val heightM = height.cm / 100.0
  weight.kgs / (heightM * heightM)
}
val heightPerson: Height = Height(170) // Height wraps a Long number of centimeters
val weightPerson: Weight = Weight(79.4)
calculateBmi(heightPerson, weightPerson) // no compile-time error
calculateBmi(weightPerson, heightPerson) // Compile-time error: type mismatch, expected: Height, actual: Weight
Without types, you could try to perform operations that don’t make sense, and these issues might only appear at runtime, making debugging that much harder.
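To see what the swapped call costs at runtime, here is a minimal sketch of the untyped function. It assumes the standard BMI formula (kilograms divided by meters squared), which the post does not spell out, and the name bmiUntyped is made up for this illustration.

```scala
// Assumed formula: BMI = weight (kg) / height (m)^2.
def bmiUntyped(heightCm: Double, weightKgs: Double): Double = {
  val heightM = heightCm / 100.0
  weightKgs / (heightM * heightM)
}

val heightCm: Double = 170.2
val weightKgs: Double = 79.4

val correct = bmiUntyped(heightCm, weightKgs) // roughly 27.4
val swapped = bmiUntyped(weightKgs, heightCm) // roughly 270: silently wrong, no error anywhere
```

Nothing fails here; the program simply carries on with a value about ten times too large, which is the kind of bug that only shows up far from its cause.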
Reducing Bugs
More than once I have discovered code in production systems where incorrect data was being passed to methods because two fields had the same primitive data type, such as String or Int, and the developer mistook the order in which the values were supposed to be passed. If unit tests are missing or poorly written, such bugs can easily enter production codebases.
Using types for all kinds of data ensures that you don’t accidentally misuse data in ways that could cause bugs. The compiler catches the error early in the development process.
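The same-primitive-type mix-up described above can be sketched with two String identifiers. UserId, OrderId, and both orderKey functions are hypothetical names chosen for this example, and the wrappers are plain case classes so the snippet also runs as a script.

```scala
// Hypothetical wrapper types for two String-based identifiers.
final case class UserId(value: String)
final case class OrderId(value: String)

// Untyped: both parameters are Strings, so swapped arguments compile.
def orderKeyUntyped(userId: String, orderId: String): String =
  s"user/$userId/order/$orderId"

// Typed: the same mistake is rejected by the compiler.
def orderKey(userId: UserId, orderId: OrderId): String =
  s"user/${userId.value}/order/${orderId.value}"

val user = UserId("u-42")
val order = OrderId("o-7")

val wrong = orderKeyUntyped(order.value, user.value) // compiles: "user/o-7/order/u-42"
val right = orderKey(user, order)                    // "user/u-42/order/o-7"
// orderKey(order, user)                             // does not compile: type mismatch
```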
Increased Developer Productivity
By catching many issues during compilation, developers can focus on building features rather than dealing with hard-to-diagnose runtime errors. It gives developers more confidence in the correctness of their code which is a huge boost to morale. All of this in turn improves development speed and overall productivity. The added semantic sense improves code readability, maintainability and makes it much easier to build complex systems.
Conclusion
Through this post we have essentially just re-explained the benefits of using classes and object oriented programming in general, but we have extended that principle to the simplest kinds of data we work with. Defining types at the lowest levels of data helps make code more meaningful and readable, encourages reuse, ensures safety at compile time, and allows us to build complex systems upon simpler yet solid foundations. As your codebase grows and evolves, leveraging the power of types becomes crucial to building a robust, scalable, maintainable, less error-prone and easy-to-understand application.
In future posts, we’ll explore the performance aspects of types, and leverage libraries such as supertagged and shapeless that unleash the full potential of types in ways you cannot imagine.