Static Duck Typing in Scala

Structural types are a neat feature of Scala that not many developers know about or use. This article provides a brief introduction to what they are and how to use them.

What are Structural Types?

Types can be really powerful in Scala: they can define classes, abstract classes, objects, traits, functions…and a lot more!
What if we don’t really care what our instance is as long as it has a particular structure?

This problem is also called Duck Typing. The duck test by Mr Riley asserts that

When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.

A Structural Type allows you to specify a desired set of properties and to define a type for them: the compiler guarantees that the given type is compatible with the defined structure, and reflection is used at runtime on the instance to call the method. Reflective calls are considerably more expensive than regular method calls, so structural types should be used with care when performance is a big concern.

How to use them

Let’s simplify the duck test and let’s see how the problem can be easily solved using structural types.

For the sake of this tutorial we consider a duck any entity that can quack.

This requirement can be translated into the following structural type:

scala> type Duck = { def quack(): String }
defined type alias Duck

Every time we use the type Duck, the compiler guarantees that the instance associated with it is compatible with the given structure:

scala> class Bird { def quack() = "quack"}
defined class Bird

scala> val d1: Duck = new Bird
d1: Duck = Bird@541a9917
// it works!

scala> class Bro { def quack() = "YOOOO QUAAAACKKK" }
defined class Bro

scala> val d2: Duck = new Bro
d2: Duck = Bro@7db1592b
// it works!

scala> class Person
defined class Person

scala> val d3: Duck = new Person
<console>:9: error: type mismatch;
 found   : Person
 required: Duck
    (which expands to)  AnyRef{def quack(): String}
       val d3: Duck = new Person
// it doesn't work because Person doesn't quack
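A structural type is most useful as a parameter type. The following is a minimal sketch, assuming Scala 2; the names DuckDemo, Robot and makeItQuack are made up for illustration. The scala.language.reflectiveCalls import acknowledges that the call to quack() goes through reflection at runtime.

```scala
import scala.language.reflectiveCalls

object DuckDemo {
  // Anything that exposes `quack(): String` is a Duck
  type Duck = { def quack(): String }

  class Bird { def quack(): String = "quack" }
  // Hypothetical class, unrelated to Bird by inheritance
  class Robot { def quack(): String = "beep quack" }

  // Accepts any instance with the right structure, whatever its class
  def makeItQuack(duck: Duck): String = duck.quack()
}
```

Both DuckDemo.makeItQuack(new DuckDemo.Bird) and DuckDemo.makeItQuack(new DuckDemo.Robot) compile, while passing an instance without a quack method is rejected at compile time.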

Summary

In this article we have described how Scala implements Static Duck Typing and how it can be used to validate instances on their structure rather than type.

Performance Comparison between immutable Seq, List, Vector

I have recently attended the Advanced Scala Training Course by Typesafe during the Scala Days Conference in Amsterdam. During the course we discussed at length how to write cleaner and more performant Scala code: one of the factors that can greatly influence performance is the type of collection used. Which type of collection shall we use? In this article we try to answer this question by comparing the runtime performance of three immutable Scala collections: Seq, List, Vector.

The Experiment

Our experiment consists of creating increasingly bigger collections of random integers and measuring the average execution time of a specific operation. The (quick and dirty!) script used and the generated data can be found here. The script was run on a standard MacBook Pro with a 2.8 GHz Intel Core i5 processor. After each iteration, the number of allocated elements was doubled (sizes grow as powers of 2). Each operation was performed 10 times and the average execution time was used for the purposes of this experiment.
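The original script is not reproduced here. As a rough sketch of the measuring approach just described (and not the author's script), something along these lines could be used; every name in it is hypothetical, and it does no JVM warm-up, so the numbers it produces are only indicative.

```scala
import scala.util.Random

object CollectionBenchmarkSketch {
  /** Runs `op` `runs` times and returns the average duration in nanoseconds. */
  def averageNanos(runs: Int)(op: => Any): Long = {
    val timings = (1 to runs).map { _ =>
      val start = System.nanoTime()
      op
      System.nanoTime() - start
    }
    timings.sum / runs
  }

  /** Collection sizes grow as powers of 2: 1, 2, 4, ..., 2^maxExponent. */
  def sizes(maxExponent: Int): Seq[Int] = (0 to maxExponent).map(1 << _)

  def randomVector(size: Int): Vector[Int] = Vector.fill(size)(Random.nextInt())

  /** Average time of an `apply` at a random index, for each collection size. */
  def applyTimings(maxExponent: Int, runs: Int = 10): Seq[(Int, Long)] =
    sizes(maxExponent).map { size =>
      val data = randomVector(size)
      size -> averageNanos(runs)(data(Random.nextInt(size)))
    }
}
```

Swapping randomVector for a List or Seq builder, and the measured expression for an append or prepend, gives the other measurements.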

My poor Mac managed to analyse collections of up to 2^27 elements before it started screaming in pain, and that is when I decided to stop!

Although this test cannot claim any statistical significance, due to the limited number of runs and the fact that our collections only contained elements of type Int, I believe that the results of our experiment are interesting enough to provide some guidance on which type of immutable Scala collection to use according to the needs of our system.

Apply

The analysed operation is the apply operation, used to access an element at a random index. By looking at the graphs, we can see that List didn’t perform well. Lists don’t support random access: every time, we need to walk through the list until we reach the element at the index we are looking for. Although both Seq and Vector behaved quite well, our winner for this round is Seq. Note that the default implementation of Seq in Scala 2.11 is based on linked lists, while Vectors are implemented as a bit-mapped trie with a branching factor of 32 (a weird structure that can be seen as a tree with up to 32 child nodes).

[Figures: Seq-apply, List-apply, Vector-apply, All-apply]

Append

For the append operation, the clear winner is Vector: its tree structure makes appending elements really efficient. On the other hand, List and Seq have a linked structure that makes this operation quite expensive.

[Figures: Seq-append, List-append, Vector-append, All-append]

Prepend

List is unbeatable when prepending an element to the collection: all it has to do is add a new pointer and connect the new element with the head of the existing list…easy! A Vector performs quite well, very similarly to the append case, thanks to its tree-ish structure. On the other hand, Seq has disastrous results due to the fact that all the indexes need to be updated when a new element is added at the beginning of the collection.

[Figures: Seq-prepend, List-prepend, Vector-prepend, All-prepend]
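For reference, the three operations compared in this experiment look like this on a List and on a Vector. This is only an illustration of the syntax (the object name is made up); it measures nothing.

```scala
object CollectionOperations {
  val list: List[Int] = List(1, 2, 3)
  val vector: Vector[Int] = Vector(1, 2, 3)

  // apply: access by index (List follows the links, Vector walks down its tree)
  val secondOfList: Int = list(1)
  val secondOfVector: Int = vector(1)

  // append: add an element at the end with :+
  val appendedList: List[Int] = list :+ 4
  val appendedVector: Vector[Int] = vector :+ 4

  // prepend: add an element at the front with +: (for a List, same as 0 :: list)
  val prependedList: List[Int] = 0 +: list
  val prependedVector: Vector[Int] = 0 +: vector
}
```

All the collections involved are immutable: each operation returns a new collection and leaves the original untouched.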

Who’s the winner?

The results of our test suggest that, unless our system makes intensive use of a specific operation like append or prepend, we should avoid lists and sequences in favour of vectors, as they have better overall performance. Note that this is particularly true when performing operations on big collections: when dealing with small ones (e.g. fewer than 10 elements), there is no significant performance difference between one collection type and another. This is consistent with the Collection Performance Characteristics described in the Scala Documentation.

[Figure: Seq, List, Vector]

Summary

Choosing the correct type of Scala collection can have a big impact on the performance of our code. This article has analysed the results of an experiment comparing the performance of Seq, List and Vector when accessing, appending and prepending an element. Our results are consistent with the Scala Documentation and suggest that Vector is the collection with the best overall performance.

Pimp My Library

Methods are an efficient way of reducing code duplication and making our code cleaner. What happens if a class that you don’t own (e.g. any class of the Standard Scala Library) doesn’t have a particular method that could make your life a lot easier, and your code a lot more readable? This article will describe how we can efficiently pimp an existing Scala library and how to seamlessly use it in our code.

How to Extend an Existing Class

Let’s assume that in our application we often need to complete a text with the string “Yo”.
We could write a nice method for it and import it when needed, but its usage would not look the same as the standard methods of the String class. Instead, by “pimping” our class, we will be able to use our method as if it were actually part of the standard implementation of String.

Creating the following class will do the trick:

// in file com/daniela/sfregola/tutorial/package.scala
package com.daniela.sfregola

package object tutorial {

  implicit class ExtendedString(val text: String) extends AnyVal {
    def yofy = s"Yo $text"
  }
}

We can now use our yofy method for any String used in a class in the package com.daniela.sfregola.tutorial:

package com.daniela.sfregola.tutorial

object Main extends App {
	println("bro".yofy)
	//"Yo bro"
}

These are just a few lines of code, but they use some quite interesting and powerful tools of the Scala Language.
First of all, ExtendedString is inside a package object, called tutorial: the class will be automatically imported in all the files that belong to package com.daniela.sfregola.tutorial. For more information on package objects, have a look at this article.
Also, the class is implicit: this allows the compiler to seamlessly wrap an instance of a String inside ExtendedString.
Finally, we can see that our class is a subclass of AnyVal: this is a functionality introduced in Scala 2.10, called Custom Value Class. In practical terms, it makes our code a lot faster thanks to some optimisations performed by the compiler.
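The same pattern applies to any type we do not own, and it does not require a package object. The sketch below uses made-up names (TextSyntax, TextOps, wordCount) and places the implicit class in a plain object that is imported explicitly. The extends AnyVal clause is left out only so that the snippet can be pasted anywhere, since value classes have placement restrictions.

```scala
object TextSyntax {
  // Hypothetical enrichment: adds `wordCount` to every String
  implicit class TextOps(val text: String) {
    def wordCount: Int = text.split("\\s+").count(_.nonEmpty)
  }
}

object TextSyntaxDemo {
  import TextSyntax._ // brings the implicit class into scope

  // The compiler rewrites this call to new TextOps("Yo bro how are you").wordCount
  val words: Int = "Yo bro how are you".wordCount
}
```

Compared with a package object, an explicit import makes it more visible where the extra method comes from, at the cost of one extra line per file.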

Custom Value Classes

If we play a bit with the :javap command in the Scala console, we can see how the compiler disassembles our code. If we do this for a Custom Value Class (i.e. a class that extends AnyVal), we notice that, instead of allocating an instance of that class, the compiler just allocates a java.lang.String: this little trick makes our code a lot more performant, as it avoids allocating wrapper objects at runtime…magic indeed!

Ok, so why don’t we use extends AnyVal everywhere?

The compiler translates our instance into a java.lang.String, so serialising or deserialising it can be problematic. This approach is usually suggested only when pimping libraries: when extending an existing class, we are usually not creating new classes but just adding methods by wrapping an instance of the original class.

Summary

Pimping libraries is a powerful tool to enrich existing libraries. In this article we have described how to efficiently add methods to existing classes. Also, we have briefly described the principle used by the compiler to avoid runtime allocations using Custom Value Classes.

Loading Configurations in Scala

The separation of configuration from code is a good practice that makes our system customisable as we can load different configurations according to the environment we are running it in. In this article we will describe different approaches to load configurations in Scala and how they can be combined together: loading configurations from a file, from command line parameters or from environment variables.

Configurations from a file

Let’s start with the basic case scenario: given a file, we want to read it and parse its values to use them in our code.

First, we need to define our configuration file, let’s call it application.conf.

// application.conf
my {
	secret {
		value = "super-secret"	
	}
}

We can now parse the file and use the obtained configuration in our script:

// config-tutorial.scala
import com.typesafe.config.ConfigFactory

val value = ConfigFactory.load().getString("my.secret.value")
println(s"My secret value is $value")
>> scala config-tutorial.scala 
My secret value is super-secret

By default, the ConfigFactory looks for a configuration file called application.conf. If we want to use a different configuration file (e.g. another.conf), we just need to indicate a different file name and path to load (e.g. ConfigFactory.load("another")). The Typesafe Config library provides several methods to make sure that the parsed value is compatible with the expected type: have a look at the Typesafe Config Documentation for methods to parse integers, longs, floats, etc.

Configurations from command line parameters

Another approach is to allow our users to redefine settings through command line parameters rather than changing the configuration file directly. All we have to do is change our configuration file as follows, where the optional substitution ${?VALUE} replaces the default only when VALUE is defined:

// application.conf
my {
	secret {
		value = "super-secret"
		value = ${?VALUE}	
	}
}

The output of our script will now change according to the command line parameters provided.

>> scala config-tutorial.scala 
My secret value is super-secret

>> scala config-tutorial.scala -Dmy.secret.value=another-secret
My secret value is another-secret

Configurations from environment variables

Redefining configurations as part of the command line parameters works in most cases, but it can be tedious when we have a lot of parameters to change. Also, putting sensitive information, such as passwords or tokens, in clear text in a configuration file or a run script may not be safe enough. Another option to load configurations is to inject our parameters from predefined environment variables.

In order to achieve this, we can just write a simple method that looks for a specific environment variable before loading the configuration with the previously described approach.

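import com.typesafe.config.ConfigFactory
import scala.util.Properties

// Configuration loaded as described in the previous sections
val config = ConfigFactory.load()

// Looks for MY_SECRET_VALUE in the environment before falling back to my.secret.value in the configuration
def envOrElseConfig(name: String): String = {
  Properties.envOrElse(
    name.toUpperCase.replaceAll("""\.""", "_"),
    config.getString(name)
  )
}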

Before loading our my.secret.value configuration, this simple method will first check if an environment variable called MY_SECRET_VALUE exists.
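The naming convention can be isolated and tested without touching the real environment or the configuration library. The following is a small sketch under that assumption: EnvNames, envVarName and the env parameter are hypothetical names, and the fallback is passed by name so that it is only evaluated when the variable is missing.

```scala
object EnvNames {
  /** Derives the environment variable name, e.g. "my.secret.value" becomes "MY_SECRET_VALUE". */
  def envVarName(configPath: String): String =
    configPath.toUpperCase.replace('.', '_')

  /** Reads the variable from `env` (the real environment by default), else uses `fallback`. */
  def envOrElse(
      configPath: String,
      fallback: => String,
      env: Map[String, String] = sys.env
  ): String =
    env.getOrElse(envVarName(configPath), fallback)
}
```

Passing a Map instead of reading sys.env directly makes it possible to exercise both branches in a unit test.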

We can now put it all together and create a script (gist available here) that will inject configurations in the following order:
1) From properly named environment variables
2) From command line parameters
3) From a configuration file

// application.conf
my {
	secret {
		value = "super-secret"
		value = ${?VALUE}	
	}
}

import com.typesafe.config.ConfigFactory
import scala.util.Properties

class MyConfig(fileNameOption: Option[String] = None) {
    
  val config = fileNameOption.fold(
                  ifEmpty = ConfigFactory.load() )(
                  file => ConfigFactory.load(file) )

  def envOrElseConfig(name: String): String = {
    Properties.envOrElse(
      name.toUpperCase.replaceAll("""\.""", "_"),
      config.getString(name)
    )
  }
}

The script can be used as follows:

val myConfig = new MyConfig()
val value = myConfig.envOrElseConfig("my.secret.value")
println(s"My secret value is $value")
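The priority order itself (environment variables, then command line parameters, then the configuration file) can be modelled with plain Maps, independently of any library. This is only a sketch of the lookup logic with hypothetical names, not a replacement for the class shown earlier.

```scala
object LayeredLookup {
  /** Returns the first defined value, checking the sources in priority order. */
  def resolve(
      name: String,
      env: Map[String, String],
      commandLine: Map[String, String],
      file: Map[String, String]
  ): Option[String] =
    env.get(name.toUpperCase.replace('.', '_')) // 1) environment variable, e.g. MY_SECRET_VALUE
      .orElse(commandLine.get(name))            // 2) command line parameter
      .orElse(file.get(name))                   // 3) configuration file
}
```

Returning an Option leaves the decision about a missing setting (fail fast or use a default) to the caller.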

Summary

Having a clear separation between configuration and code allows us to customise the execution of our system according to the environment it runs in. In this article we have described different approaches to defining specific settings. In particular, we have presented a simple script that combines all these approaches in one: the script loads configurations first from environment variables, then from command line parameters and finally from a configuration file.

How to Compose Futures

Futures are a powerful tool that has been developed by the Akka team and then adopted as a standard Scala library from version 2.10.
A Future is a placeholder for a value that will be available in the future: thanks to it, it is possible to run operations in parallel and to worry about what to do with the value only once it is available, making our applications more scalable and performant. A lot can be achieved with it: have a look at the official Scala documentation for Futures and Promises. Each future can be seen as an isolated parallel operation, so combining them can be challenging: in this article we will describe how Futures can be composed together.

How to Select the Fastest Future

Let’s assume that in our application we have several services that perform the same operation and that these services have different response times according to their traffic load. Because our application doesn’t have any information on the load of each service, or simply because we don’t want to rely on it, we want to call all the services and use the first reply we get back: let’s see how this can be achieved using futures.

First of all, let’s simplify our life a bit: for the purposes of this tutorial, we will simulate the behaviour of our services with a method that will wait a period of time before returning a String wrapped in a Future:

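import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.language.postfixOps // enables the `1 second` syntax used in the examples
def reply(timeout: Duration, msg: String): Future[String] = Future {
  Thread.sleep(timeout.toMillis)
  msg
}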

Future.firstCompletedOf is the function that we are looking for: it will get a sequence of futures and return the first one that completes:

val futureSlowReply = reply(1 second, "Hello from a slow fella")
val futureFastReply = reply(100 milliseconds, "I am a super fast fella!")

val futureReplies = Seq(futureSlowReply, futureFastReply)
val futureFastestReply = Future.firstCompletedOf(futureReplies)

Await.result(futureFastestReply, 100 milliseconds)
// res0: String = I am a super fast fella!

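Note that waiting about 100 milliseconds for the future to complete is enough: all the futures run in parallel from the moment they are created, so the fastest one will have completed by then. Outside the REPL, where hardly any time passes between creating the futures and awaiting them, it is safer to allow a small margin on top of the 100 milliseconds.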
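A self-contained variant of this example, with its imports and a deliberately generous timeout, could look like the sketch below. The object and method names are made up; the blocking wrapper is an addition that tells the execution context the thread is about to sleep, so that both simulated services can run concurrently even on a small thread pool.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FastestReply {
  // Simulated service: replies with `msg` after `delay`
  def reply(delay: FiniteDuration, msg: String): Future[String] = Future {
    blocking(Thread.sleep(delay.toMillis))
    msg
  }

  def fastest(): String = {
    // Both futures start running as soon as they are created
    val fast = reply(50.millis, "I am a super fast fella!")
    val slow = reply(500.millis, "Hello from a slow fella")
    // The timeout only has to cover the fastest reply, with some margin
    Await.result(Future.firstCompletedOf(Seq(slow, fast)), 2.seconds)
  }
}
```

The slower future is not cancelled: it keeps running in the background and its result is simply ignored.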

How to Combine Futures in Parallel

What if we have different services that process the same information differently? For example, given a customer id we have a service to retrieve the account information, another to retrieve the payment details, another to retrieve product suggestions based on previous selections. We could do it the old Java-style way and retrieve all the information sequentially…or we could retrieve all the information in parallel and be really efficient! 😀

Let’s see how this can be achieved using the zip method of the Future class:

val futureSlowReply = reply(1 second, "Hello from a slow fella")
val futureFastReply = reply(100 milliseconds, "I am a super fast fella!")

val futureAllParallelReplies = futureSlowReply.zip(futureFastReply)
Await.result(futureAllParallelReplies, 1 second)
// res1: (String, String) = (Hello from a slow fella,I am a super fast fella!)

Note that waiting for the combined future value, called futureAllParallelReplies, for less than 1 second would generate a java.util.concurrent.TimeoutException: the zip function needs all the futures to be completed before returning a composition of all the futures!
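The three-service scenario mentioned earlier (account, payment details, suggestions) can be sketched by chaining zip and flattening the nested pairs with map. The service names and delays below are invented for illustration, and the blocking wrapper is again an addition for the benefit of small thread pools.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelReplies {
  // Simulated service: replies with `msg` after `delay`
  def reply(delay: FiniteDuration, msg: String): Future[String] = Future {
    blocking(Thread.sleep(delay.toMillis))
    msg
  }

  def customerView(): (String, String, String) = {
    // The three calls are started before any of them is awaited, so they overlap
    val account = reply(100.millis, "account")
    val payment = reply(50.millis, "payment")
    val suggestions = reply(20.millis, "suggestions")

    // zip pairs the results; map flattens ((a, p), s) into a triple
    val combined = account.zip(payment).zip(suggestions).map {
      case ((a, p), s) => (a, p, s)
    }
    Await.result(combined, 2.seconds)
  }
}
```

The combined future completes once the slowest of the three has completed, and fails if any of them fails.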

How to Concatenate Futures

In order to combine futures in parallel they need to be independent from each other. What if this is not possible and we need to run them sequentially?

All we need to do is use a for-comprehension to force the futures to run sequentially:

def futureAllSequentialReplies(msg: String) = for {
  firstReply <- reply(100 milliseconds, msg)
  nextMsg = if (msg.length < 3) msg.reverse else msg.toUpperCase 
  secondReply <- reply(200 milliseconds, nextMsg)
} yield (firstReply, secondReply)

Await.result(futureAllSequentialReplies("Hi"), 400 milliseconds)
// res2: (String, String) = (Hi,iH)
Await.result(futureAllSequentialReplies("Hello"), 400 milliseconds)
// res3: (String, String) = (Hello,HELLO)

Note that waiting for 300 milliseconds is not enough: not only do the futures run sequentially (100 + 200 milliseconds), but we also spend some time computing the nextMsg String.
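The reason a for-comprehension runs the futures one after the other is that the compiler rewrites it into nested flatMap and map calls: the second future is only created inside the callback of the first. The sketch below shows both forms side by side; names are hypothetical, and here the second message is derived from the first reply to make the dependency explicit.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequentialReplies {
  def reply(delay: FiniteDuration, msg: String): Future[String] = Future {
    blocking(Thread.sleep(delay.toMillis))
    msg
  }

  def nextMessage(msg: String): String =
    if (msg.length < 3) msg.reverse else msg.toUpperCase

  // For-comprehension form
  def withFor(msg: String): Future[(String, String)] = for {
    first <- reply(20.millis, msg)
    second <- reply(20.millis, nextMessage(first))
  } yield (first, second)

  // Equivalent form: the second reply is requested only when the first has completed
  def withFlatMap(msg: String): Future[(String, String)] =
    reply(20.millis, msg).flatMap { first =>
      reply(20.millis, nextMessage(first)).map(second => (first, second))
    }

  // Helper for the examples: blocks until the future completes
  def await[T](future: Future[T]): T = Await.result(future, 2.seconds)
}
```

Both forms produce the same result; the for-comprehension is usually easier to read once more than two steps are chained.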

Summary

Future is a powerful tool to perform operations in parallel. However, combining several parallel operations can be challenging. This article has described how easily we can compose Scala Futures: how to select the fastest one, how to combine them in parallel and, when needed, how to force them to run sequentially.

How to use Case Classes in Scala

Scala has adopted many concepts from other functional programming languages: higher-order functions from Haskell, the actor model from Erlang, futures from Clojure, etc. However, Scala has also introduced new tools in the functional programming world: case classes are one of them. Case classes are a special kind of class: the compiler automatically adds some useful methods to them, which makes them particularly interesting for pattern matching and data definition. This article describes how to use them.

Defining a Case Class

A case class is identified by the case keyword. The following is an example of a case class:

scala> case class Person(name: String, age: Int)
defined class Person

Note that no body is needed — you can omit curly braces when empty!

Every time the Scala compiler finds a case class, it automatically adds some convenience methods.

The first one that the compiler generates is a factory method: it allows us to create instances of the case class without using the new keyword. The following are three equivalent ways of creating a Person instance:

scala> new Person("Mr Dude", 18)
res0: Person = Person(Mr Dude,18)

scala> Person.apply("Mr Dude", 18)
res1: Person = Person(Mr Dude,18)

scala> Person("Mr Dude", 18) 	
res2: Person = Person(Mr Dude,18)
// The most concise and elegant one!

Having a factory method may not seem a big advantage, but it will make your code easier to read, in particular when dealing with creation of nested objects.

Moreover, all the parameters are converted into immutable values that can be easily accessed without worrying about any side effects:

scala> val dude = Person("Mr Dude", 18)
dude: Person = Person(Mr Dude,18)

scala> dude.name
res3: String = Mr Dude

scala> dude.age
res4: Int = 18

scala> dude.age = 19 
<console>:10: error: reassignment to val
       dude.age = 19
                ^
// Immutability is guaranteed!

Last but not least, the Scala compiler adds intuitive implementations of toString, hashCode and equals that exploit the structure of the class and the values of its parameters. For example, two instances are considered equal if they have the same class and their parameters are recursively equal:

scala> val dude = Person("Mr Dude", 18)
dude: Person = Person(Mr Dude,18)

scala> val sameDude = Person("Mr Dude", 18)
sameDude: Person = Person(Mr Dude,18)

scala> val anotherDude = Person("Another Dude", 18)
anotherDude: Person = Person(Another Dude,18)

scala> dude equals sameDude
res3: Boolean = true

scala> dude equals anotherDude
res4: Boolean = false

scala> dude equals List()
res5: Boolean = false
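A practical consequence of the generated equals and hashCode is that case class instances behave well in sets and as map keys. The sketch below illustrates this, together with copy, another method that the compiler generates for case classes and that returns a modified copy instead of mutating the instance. The enclosing object name is made up.

```scala
object CaseClassBasics {
  final case class Person(name: String, age: Int)

  val dude = Person("Mr Dude", 18)
  val sameDude = Person("Mr Dude", 18)

  // Equal instances have equal hash codes, so the set keeps a single element
  val distinctPeople: Set[Person] = Set(dude, sameDude)

  // The generated toString shows the class name and the parameter values
  val description: String = dude.toString

  // copy creates a new instance, leaving the original untouched
  val olderDude: Person = dude.copy(age = 19)
}
```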

Using a Case Class

Case classes are perfect data containers: not only do they make the data easy to access and compare, but they also guarantee immutability, a really important aspect to keep in mind when dealing with concurrency!

Also, thanks to the extractor method (unapply) that the compiler generates alongside the factory method, they naturally support pattern matching:

scala> def sayHello(entity: Any) = entity match {
	| case Person("Mr Dude", _) => "Yo Dude!"
	| case Person(name, _) => s"Hello $name"
	| case _ => "You are not even a person! Go away!"
	| }
sayHello: (entity: Any)String


scala> sayHello(Person("Mr Dude", 18))
res6: String = Yo Dude!

scala> sayHello(Person("James Bond", 50))
res7: String = Hello James Bond

scala> sayHello(List())
res8: String = You are not even a person! Go away!
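Patterns can also be nested and combined with guards, which is where case classes really shine as data containers. The following sketch uses a made-up Address case class and a three-parameter Person that differs from the one defined above.

```scala
object Greeter {
  final case class Address(city: String)
  final case class Person(name: String, age: Int, address: Address)

  def describe(entity: Any): String = entity match {
    // Nested pattern: matches on a field of a field
    case Person(name, _, Address("London")) => s"$name is a Londoner"
    // Guard: an extra condition on an extracted value
    case Person(name, age, _) if age < 18 => s"$name is a minor"
    case Person(name, _, _) => s"$name is a person"
    case _ => "not a person"
  }
}
```

Cases are tried from top to bottom, so the most specific patterns should come first.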

Summary

This article has discussed the use of case classes in Scala. Case classes are a novel functional programming feature in Scala, particularly useful when used as data containers and for pattern matching.

Type Parameterisation

The following article provides a list of all the different type parameterisations available in Scala and how to use them.

Variance

A class or trait can be parameterised by a type. The following is an example trait that depends on a generic type T:

trait MyExample[T] {...}

The above declaration implies that MyExample[T] is compatible only with MyExample[T] itself, regardless of any subtype or supertype of T: this is called invariance (or nonvariance).

Let’s assume we would like our trait to be compatible with other types, as long as these types are subclasses of T. This can be expressed as follows:

trait MyExample[+T] {...}

This concept is called covariance: Given a type U subtype of T, a class MyClass is called covariant if MyClass[U] is a subtype of MyClass[T].

What if we want to achieve the opposite? In other words, we would like our trait to be compatible with any type as long as it is a superclass of type T:

trait MyExample[-T] {...}

This concept is called contravariance: Given a type S supertype of T, a class MyClass is called contravariant if MyClass[S] is a subtype of MyClass[T].
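A concrete sketch may help to see both definitions at work. Animal, Dog, Box and Printer are invented for this example: a covariant container can be used where a container of a supertype is expected, while a contravariant consumer can be used where a consumer of a subtype is expected.

```scala
object VarianceDemo {
  class Animal { def name: String = "animal" }
  class Dog extends Animal { override def name: String = "dog" }

  // Covariant: Box[Dog] is a subtype of Box[Animal]
  final case class Box[+T](value: T)

  // Contravariant: Printer[Animal] is a subtype of Printer[Dog]
  trait Printer[-T] { def print(value: T): String }

  val dogBox: Box[Dog] = Box(new Dog)
  val animalBox: Box[Animal] = dogBox // allowed by covariance

  val animalPrinter: Printer[Animal] = new Printer[Animal] {
    def print(value: Animal): String = s"I am a ${value.name}"
  }
  val dogPrinter: Printer[Dog] = animalPrinter // allowed by contravariance
}
```

Removing the + or - annotation makes the corresponding assignment fail to compile, which is exactly the invariant behaviour described at the start of this section.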

Bounds

Compatibility between classes is not the only thing we can achieve. For example, we would like our trait MyExample to accept all the types that are superclasses of T. This can be expressed using the lower bound annotation:

trait MyExample[U >: T] {...}

The upper bound annotation can be used in a similar way to let our MyExample trait accept all the types that are subclasses of T:

trait MyExample[S <: T] {...}
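To keep the two directions apart: an upper bound (<:) restricts a type parameter to a given type or its subtypes, while a lower bound (>:) restricts it to a given type or its supertypes. The sketch below shows one typical use of each, with invented names; the lower bound is what allows a covariant Stack of dogs to become a Stack of animals when an Animal is pushed.

```scala
object BoundsDemo {
  class Animal { def name: String = "animal" }
  class Dog extends Animal { override def name: String = "dog" }

  // Upper bound: A must be Animal or one of its subtypes, so `name` is available
  def describe[A <: Animal](a: A): String = s"a ${a.name}"

  final case class Stack[+A](items: List[A]) {
    // Lower bound: B must be A or one of its supertypes
    def push[B >: A](item: B): Stack[B] = Stack(item :: items)
  }

  val dogs: Stack[Dog] = Stack(List(new Dog))
  val animals: Stack[Animal] = dogs.push(new Animal) // B is inferred as Animal
}
```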

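Last but not least, let’s assume we want to accept any type as long as it is convertible to T. This can be achieved using the view bound annotation (<%). A view bound desugars to an implicit parameter, so it can be declared on a class or a method but not on a trait:

class MyExample[V <% T] {...}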
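A view bound is shorthand for requiring an implicit conversion from the type parameter to T. The sketch below spells that requirement out as an explicit implicit parameter, a form that also works in Scala 3, where the <% syntax is no longer supported. Meters, intToMeters and total are hypothetical names.

```scala
object ViewBoundDemo {
  final case class Meters(value: Double)

  // Hypothetical conversion from Int to Meters, available implicitly
  implicit val intToMeters: Int => Meters = i => Meters(i.toDouble)

  // Equivalent of a view bound on V: any V is accepted as long as
  // an implicit conversion V => Meters is in scope
  def total[V](values: List[V])(implicit toMeters: V => Meters): Double =
    values.map(v => toMeters(v).value).sum

  val threeInts: Double = total(List(1, 2, 3))
}
```

Calling total with a List[String] does not compile, because no conversion from String to Meters is available.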

Thank you M. Odersky for implicit conversions!