
Using Case Classes to Structure Your Data


By Daniela Sfregola

This article, taken from the pages of Get Programming with Scala, discusses an important tool in your Scala toolbox, called “case class.”


When coding, dealing with data is an essential and recurring task. A case class provides a convenient and efficient way to represent your data immutably, which makes it safe to share between multiple threads. Being able to represent data easily is essential to make sure your program works correctly. You’ll also see case objects and how they can be useful when serialization is involved. Finally, you’ll learn about tuples, a structure for grouping data quickly, although with less expressive power.

Consider this
Think of the languages that you’ve encountered in your coding experience. How do you represent data? What functions do you usually define? Do you use getters? Do you use setters? Do you have support from either the language itself or your IDE to reduce boilerplate?

Case class

Representing data is a key part of writing programs, but it’s also mechanical: you need to define your fields, setters, getters, and other details. When coding in more verbose languages, such as Java, you often end up relying on tools, such as your IDE, to generate some of this code for you. What if, instead of seeking support from your IDE, the compiler could do this for you?

A case class is a class with an arbitrary number of parameters for which the compiler automatically adds ad-hoc code. Case classes are ideal data containers because this generated functionality encourages the use of immutability. Listing 1 shows an example of a case class representing a person with a name and age:

Listing 1: The Person case class

case class Person(name: String, age: Int)

You can also think of a case class as a class that is characterized by its parameters. For example, an instance of the class Person is identified by a name and an age. For this reason, you can also refer to a case class as a “product type.”

As a result of the keyword case, the compiler automatically adds a few functionalities to it. Let’s have an overview of what these are by considering the following instance of the class Person:

val p = new Person("Tom", 25)

Getters

For each parameter, the compiler adds a getter function with the same name as the parameter it refers to, allowing easy access to its value. For example, you can access the name and age as follows:

p.name // returns "Tom"
p.age // returns 25

copy function

Because a case class represents data in an immutable way, you don’t have setter functions for its parameters. When you need to change a value, use the copy function to create a new data representation. Suppose you want to change the age of your instance p to be 35:

val anotherP: Person = p.copy(age = 35)
p.age // returns 25
anotherP.age // returns 35

You can also change more parameters at the same time. For example, if you want to change both its name and age you can provide more parameters to the copy function:

val mark = p.copy(name = "Mark", age = p.age + 1)
mark.name // returns "Mark"
mark.age // returns 26
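
To see how copy replaces setters in practice, here is a minimal sketch. The object CopyDemo and the helper birthday are hypothetical names introduced only for this illustration; the point is that a "change" is modelled as a function returning a new instance.

```scala
case class Person(name: String, age: Int)

object CopyDemo {
  // Hypothetical helper: "changing" the age means returning a new Person
  def birthday(person: Person): Person = person.copy(age = person.age + 1)

  val tom: Person = Person("Tom", 25)
  val olderTom: Person = birthday(tom)

  // tom.age = 26 // would not compile: case class parameters are immutable vals

  def main(args: Array[String]): Unit = {
    println(tom)      // prints Person(Tom,25)
    println(olderTom) // prints Person(Tom,26)
  }
}
```

The original tom is untouched, so any other part of the program holding a reference to it keeps seeing the same data.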

toString, hashCode, equals functions

Every class has the functions toString, hashCode and equals and their implementation comes directly from the Java world. In Java, java.lang.Object is the superclass of all the classes and it provides an implementation for several functions, which are inherited by all the other classes. A case class redefines the implementation for some of these functions inherited from java.lang.Object. Let’s see how:

    • toString – By default, a toString function returns a string representing the name of the class followed by the hexadecimal representation of the hash code associated with the class instance: for example, this may look similar to “Person@1e04fa0a.” A case class redefines this method to return a string that describes the data it contains:
p.toString() // returns "Person(Tom,25)"
    • hashCode – The hashCode function returns an integer that represents an instance of a class. The JVM uses this number in hash-based data structures, such as hash tables, to store and retrieve objects efficiently. By default, the hash code of an instance is tied to that particular object rather than to its contents, so two separately created instances usually have different hash codes. A case class overrides hashCode so that it depends on its internal structure only. This implies that two instances of the same case class with the same structure have the same hash code.
class C(x: Int)
new C(5).hashCode == new C(5).hashCode // returns false

case class A(x: Int)
case class B(n: Int)
new A(5).hashCode == new A(5).hashCode // returns true
new A(5).hashCode == new B(5).hashCode // returns true on Scala 2.12; newer versions also hash the class name, so this may return false
    • equals – According to the implementation of equals defined in java.lang.Object, two instances are equal only if they point to the same memory allocation. When working with case classes, the compiler provides a different implementation of equals, in which instances that belong to the same case class and have the same structure are considered equal.
class C(x: Int)
new C(5).equals(new C(5)) // returns false

case class A(x: Int)
case class B(n: Int)
new A(5).equals(new B(5)) // returns false
new A(5).equals(new A(5)) // returns true
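
The generated equals and hashCode matter most when case classes are stored in collections. The following is a small sketch under that assumption; EqualityDemo and the sample data are made up for illustration.

```scala
case class Person(name: String, age: Int)

object EqualityDemo {
  // Set relies on equals and hashCode, so the duplicate Tom collapses into one element
  val people: Set[Person] =
    Set(Person("Tom", 25), Person("Tom", 25), Person("Ann", 30))

  // A structurally equal key finds the entry, even though it is a different instance
  val cities: Map[Person, String] = Map(Person("Tom", 25) -> "London")
  val tomsCity: Option[String] = cities.get(Person("Tom", 25))

  def main(args: Array[String]): Unit = {
    println(people.size)                            // prints 2
    println(tomsCity)                               // prints Some(London)
    println(Person("Tom", 25) == Person("Tom", 25)) // prints true: == delegates to equals
  }
}
```

With a regular class that keeps the default equals and hashCode, the set would hold three elements and the lookup would return None.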

Companion object: apply and unapply functions

When declaring a case class, the compiler also creates a companion object for it, in which it generates implementations of the apply and unapply methods to construct and deconstruct an instance of the case class, respectively. Let’s see how they work:

  • apply – thanks to the generated apply method, you can create an instance of your case class by providing its parameters, without the new keyword. For example, to create an instance of Person, you can use the apply method instead of using its constructor directly. All the following statements are equivalent:
new Person("Tom", 25)
Person.apply("Tom", 25)
Person("Tom", 25)
Person(age = 25, name = "Tom")
  • unapply – the unapply method is a way to decompose a class. For a case class, unapply returns the parameters of an instance. For example, you can decompose an instance of Person to obtain an optional grouping containing a name and an age:
Person.unapply(Person("Tom", 25))
// returns Some((Tom,25)) with type Option[(String, Int)]

You might be a bit confused right now if you haven’t encountered options and tuples yet, but don’t worry, we’ll get there soon. For the moment, all you need to remember is that thanks to the unapply method you can pattern match on the parameters of a case class. Have a look at listing 2 for a few examples of what you can achieve using pattern matching when an unapply function is defined:

Listing 2: Pattern Matching of a Person

def welcome(person: Person): String = person match {
   case Person("Tom", _) => "Good to see you Tom!" //#A
   case Person(name, age) if age > 18 => s"Hello Mr/Mrs $name" //#B
   case Person(_, _) => "Hello"  //#C
}
//#A It matches a person with name "Tom"
//#B It matches a person with age greater than 18
//#C It matches a person with any name and any age
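
The same patterns also work outside a match expression. The sketch below, with the hypothetical object ExtractDemo and invented sample data, shows a case class pattern on the left-hand side of a val and inside a partial function passed to collect.

```scala
case class Person(name: String, age: Int)

object ExtractDemo {
  val people: List[Person] =
    List(Person("Tom", 25), Person("Ann", 17), Person("Bob", 40))

  // A case class pattern on the left-hand side of a val binds both parameters at once
  val Person(firstName, firstAge) = people.head

  // collect keeps only the elements matched by the pattern and its guard
  val adultNames: List[String] =
    people.collect { case Person(name, age) if age > 18 => name }

  def main(args: Array[String]): Unit = {
    println(s"$firstName is $firstAge") // prints Tom is 25
    println(adultNames)                 // prints List(Tom, Bob)
  }
}
```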

As you’ve seen, the compiler saves us a lot of code (and potentially buggy code) through the use of case classes! Listing 3 shows the amount of code that you’d need to write to create a class equivalent to a case class:

Listing 3: Class vs Case Class

class Person(n: String, a: Int) {
   val name: String = n
   val age: Int = a

   def copy(name: String = this.name, age: Int = this.age): Person = new Person(name, age)
   override def toString(): String = s"Person($n,$a)"
   override def hashCode(): Int = ??? //#A
   override def equals(obj: Any): Boolean = ??? //#A
}
//#A Implementation omitted
object Person {
   def apply(name: String, age: Int): Person = new Person(name, age)
   def unapply(p: Person): Option[(String, Int)] = Some((p.name, p.age))
}
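
Listing 3 leaves hashCode and equals unimplemented. One possible way to write them by hand is sketched below; this is an illustration, not the code the compiler actually generates. The class name ManualPerson is hypothetical and is used to avoid clashing with the Person case class.

```scala
// Hypothetical hand-written class with structural equality
class ManualPerson(val name: String, val age: Int) {

  // Equal when the other object is also a ManualPerson with the same field values
  override def equals(obj: Any): Boolean = obj match {
    case other: ManualPerson => name == other.name && age == other.age
    case _                   => false
  }

  // Derive the hash from the same fields used by equals,
  // so that equal instances always share a hash code
  override def hashCode(): Int = (name, age).hashCode()
}

object ManualPersonDemo {
  def main(args: Array[String]): Unit = {
    println(new ManualPerson("Tom", 25) == new ManualPerson("Tom", 25)) // prints true
    println(new ManualPerson("Tom", 25) == new ManualPerson("Tom", 26)) // prints false
  }
}
```

Keeping these two methods consistent by hand is exactly the kind of detail that is easy to get wrong, which is why letting the compiler generate them is attractive.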

Figure 1 shows a summary of how to declare a case class and the functionalities the compiler generates for it.
Figure 1: A syntax diagram for a case class. A case class is equivalent to a class with companion object that the compiler has enriched with a set of useful functions.

Quick Check 1
Define case classes to express the following relations:

  • a brewery has a name
  • a beer is identified by a name and a brewery

Case object

Now that you understand what case classes are, you may wonder if an equivalent exists for objects. It does: they’re called “case objects.” Have a look at listing 4 for an example of a case object for the currency USD:

Listing 4: The USD currency

case object USD

A case object is a regular object for which the compiler automatically adds some useful code: toString is redefined to make the object easier to “serialize,” that is, to represent it with a string. For a regular object, toString returns the name of the object’s class followed by the hexadecimal encoding of its hash code: on the JVM this looks similar to “USD$@7b36aa0c.” When dealing with a case object, the compiler changes the definition of toString to return only the name of the object without its hash code:

USD.toString // returns "USD"
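
The difference is easy to observe side by side. In this sketch, PlainUSD and CaseObjectDemo are hypothetical names added for comparison; only USD comes from listing 4.

```scala
object PlainUSD // regular object: inherits toString from java.lang.Object
case object USD // case object: toString returns the object's name

object CaseObjectDemo {
  // String interpolation calls toString, so the case object prints cleanly
  val label: String = s"Price: 10 $USD"

  // The default toString contains '@' followed by the hash code
  val plainHasHash: Boolean = PlainUSD.toString.contains("@")

  def main(args: Array[String]): Unit = {
    println(label)        // prints Price: 10 USD
    println(plainHasHash) // prints true
  }
}
```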

Have a look at figure 2 for a summary of the syntax for case objects.
Figure 2: A syntax diagram for a case object in Scala. A case object is equivalent to an object with a redefined toString function.

Quick Check 2
Define three more currencies using case objects: GBP, CAD and EUR. Use a sealed trait to group them as currency.

Tuples

We’ve seen case classes as a way to structure data. Suppose you want to temporarily group data together, without the need of creating an explicit case class for it. For example, suppose you want to pattern match over more than one value. Listing 5 shows an example of pattern matching over a string and an integer:

Listing 5: Tuples and pattern matching

def hello(text: String, n: Int): String =
  (text, n) match {
     case ("Hello", i) => s"hi $i times"
     case (_, i) if i < 0 => "negative"
     case (_, 5) => "Hi five!"
     case (s, i) => s"Received $s and $i"
  }

In Scala, a tuple is a tool that allows you to quickly group data together. A tuple can have a minimum of two items and, in Scala 2, a maximum of twenty-two (Scala 3 removes this upper limit). You create one using round brackets, with the items separated by commas. Following are a few examples of tuples:

scala> (1,2,3) // tuple with 3 items of type Int
res0: (Int, Int, Int) = (1,2,3)

scala> ("hello", 1) // tuple with 2 items of type String and Int respectively
res1: (String, Int) = (hello,1)

Using pattern matching isn’t the only way to deconstruct a tuple. For example, you can use the getters for each of its items, which are named after their position: the getter for the first item is the function _1, for the second one you can use the function _2, and so on.

scala> val t = ("hello", "scala", "!")
t: (String, String, String) = (hello,scala,!)

scala> t._1
res2: String = hello

scala> t._2
res3: String = scala

scala> t._3
res4: String = !

Using the getter functions defined for tuples is one of the options, but it isn’t the most elegant one: the position of an item in the tuple usually isn’t indicative of what it represents. A more readable option is to assign the items of a tuple to values:

scala> val (a, b, c) = ("hello", "scala", "!")
a: String = hello
b: String = scala
c: String = !

If you’re interested in only extracting some items, all you need to do is use underscore to tell the compiler to discard the corresponding value and not to bind it to any value:

scala> val (a, b, _) = ("hello", "scala", "!")
a: String = hello
b: String = scala
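
Tuples often appear as short-lived intermediate values inside a single expression. As a sketch of that style, the example below pairs two lists with the standard library method zip and then names the items of each tuple with a case pattern; TupleLoopDemo and the sample data are invented for illustration.

```scala
object TupleLoopDemo {
  val names: List[String] = List("Tom", "Ann")
  val ages: List[Int] = List(25, 30)

  // zip pairs the lists element by element, giving a List[(String, Int)]
  val pairs: List[(String, Int)] = names.zip(ages)

  // The case pattern gives a readable name to each item of every tuple
  val descriptions: List[String] =
    pairs.map { case (name, age) => s"$name is $age" }

  def main(args: Array[String]): Unit = {
    println(pairs)        // prints List((Tom,25), (Ann,30))
    println(descriptions) // prints List(Tom is 25, Ann is 30)
  }
}
```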

Think in Scala: When to use Tuples?

Tuples are a great way to group data together in a quick way, but they have several problems related to their lack of expressiveness. Use tuples in functions for temporary grouping of data, but try to avoid using them outside the context of a single function.

Every time you find yourself using tuples as return types of a function, you should ask yourself: is this function doing too many things? Try to refactor it! Maybe you need to represent the returned data with a more expressive data structure, such as a case class.

Try to use tuples only in short and concise fragments of your program: your code will be more readable and easier to maintain.
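
As an illustration of this advice, the sketch below shows the same function written twice: once returning a tuple, and once returning a small case class. All names (TupleRefactoring, Bounds, boundsAsTuple, bounds) are hypothetical, and both functions assume a non-empty list.

```scala
object TupleRefactoring {
  // Tuple version: nothing in the type says which Int is the lowest value
  def boundsAsTuple(numbers: List[Int]): (Int, Int) =
    (numbers.min, numbers.max)

  // Case class version: the field names document the result
  case class Bounds(lowest: Int, highest: Int)

  def bounds(numbers: List[Int]): Bounds =
    Bounds(numbers.min, numbers.max)

  def main(args: Array[String]): Unit = {
    val (low, high) = boundsAsTuple(List(3, 1, 2))
    println(s"$low to $high") // prints 1 to 3

    val result = bounds(List(3, 1, 2))
    println(s"${result.lowest} to ${result.highest}") // prints 1 to 3
  }
}
```

With the case class, callers can no longer mix up the two values by position, and the result prints as Bounds(1,3) rather than an anonymous (1,3).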

Quick Check 3
Define a tuple with three items: the number 5, the string “John,” the number 3. Extract the second and third items and multiply them together using the multiplication operator *.

Summary

In this article, my objective was to teach you about the use of case classes, how they differ from regular classes and why they’re extremely useful when representing data. We’ve also discussed case objects and why they’re ideal when needing to serialize objects. You also learned that tuples are useful to quickly group data together. Let’s see if you got this!

Try this
Use case classes and case objects to represent the following data:

  • An author has a forename and a surname.
  • A genre has only three possible values: Drama, Horror, Romantic.
  • A book has a title, an author and a genre.

Answers to Quick Checks

Quick Check 1
The following code describes the relations between brewery and beer:

case class Brewery(name: String)
case class Beer(name: String, brewery: Brewery)

Quick Check 2
A representation of the requested currencies, using case objects and a sealed trait should look like the following:

sealed trait Currency

case object USD extends Currency
case object GBP extends Currency
case object CAD extends Currency
case object EUR extends Currency

Quick Check 3
The following code multiplies the second and third items of a tuple:

val (_, name, n) = (5, "John", 3)
name * n // returns "JohnJohnJohn"

