Static Duck Typing in Scala

Structural Types are a neat feature of Scala that not many developers know about or use. This article provides a brief introduction to what they are and how to use them.

What are Structural Types?

Types can be really powerful in Scala: they can define classes, abstract classes, objects, traits, functions…and a lot more!
What if we don’t really care what our instance is as long as it has a particular structure?

This problem is also known as Duck Typing. The duck test, attributed to the poet James Whitcomb Riley, asserts that

When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.

A Structural Type allows you to specify a desired set of members and to define a type for them: the compiler guarantees at compile time that any instance assigned to the type is compatible with the defined structure, while reflection is used at runtime to invoke the members on the instance. It is important to mention that reflection is quite expensive, so be careful when using structural types if performance is a big concern.

How to use them

Let’s simplify the duck test and see how the problem can be easily solved using structural types.

For the sake of this tutorial, we consider a duck any entity that can quack.

This requirement translates into the following structural type:

scala> type Duck = { def quack(): String }
defined type alias Duck

Every time we use the type Duck, the compiler guarantees that the instance assigned to it is compatible with the given structure:

scala> class Bird { def quack() = "quack"}
defined class Bird

scala> val d1: Duck = new Bird
d1: Duck = Bird@541a9917
// it works!

scala> class Bro { def quack() = "YOOOO QUAAAACKKK" }
defined class Bro

scala> val d2: Duck = new Bro
d2: Duck = Bro@7db1592b
// it works!

scala> class Person
defined class Person

scala> val d3: Duck = new Person
<console>:9: error: type mismatch;
 found   : Person
 required: Duck
    (which expands to)  AnyRef{def quack(): String}
       val d3: Duck = new Person
// it doesn't work because Person doesn't quack
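
To see a structural type in action end to end, here is a minimal Scala 2 sketch; makeItQuack is our own helper, not part of the snippets above, and the reflectiveCalls language feature must be imported to acknowledge the runtime reflection involved:

```scala
import scala.language.reflectiveCalls // acknowledges the runtime reflection cost

type Duck = { def quack(): String }

class Bird { def quack() = "quack" }

// Any instance compatible with the structure can be passed in;
// the call to quack() is dispatched via reflection at runtime.
def makeItQuack(duck: Duck): String = duck.quack()

println(makeItQuack(new Bird)) // prints "quack"
```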

Summary

In this article we have described how Scala implements Static Duck Typing and how it can be used to validate instances based on their structure rather than their type.

Performance Comparison between immutable Seq, List, Vector

I recently attended the Advanced Scala Training Course by Typesafe during the Scala Days conference in Amsterdam. During the course we discussed at length how to write cleaner and more performant Scala code: one of the factors that can greatly influence performance is the type of collection used. Which type of collection should we use? In this article we try to answer this question by comparing the runtime performance of three immutable Scala collections: Seq, List and Vector.

The Experiment

Our experiment consists of creating increasingly bigger collections of random integers and measuring the average execution time of a specific operation. The (quick and dirty!) script used and the generated data can be found here. The script was run on a standard MacBook Pro with a 2.8 GHz Intel Core i5 processor. After each iteration, the number of allocated elements was increased exponentially with base 2. Each operation was performed 10 times and the average execution time was used for the purposes of this experiment.
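
The measurement loop can be sketched roughly as follows; this is a simplified approximation of the methodology described above, not the actual script used for the experiment:

```scala
import scala.util.Random

// Run an operation several times and return its average duration in milliseconds.
def averageTimeMs[A](runs: Int)(op: => A): Double = {
  val timings = (1 to runs).map { _ =>
    val start = System.nanoTime()
    op
    (System.nanoTime() - start) / 1e6
  }
  timings.sum / runs
}

// Example: average time of a single random access on a 2^16-element Vector.
val elements = Vector.fill(1 << 16)(Random.nextInt())
println(f"average apply time: ${averageTimeMs(10)(elements(elements.size / 2))}%.4f ms")
```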

My poor Mac managed to analyse collections of up to 2^27 elements before starting to scream in pain, and that is when I decided to stop!

Although this test has no statistical significance, due to the limited number of retries and the fact that our collections were limited to Int elements, I believe that the results of our experiment are interesting enough to provide some guidance on which type of immutable Scala collection to use according to the features of our system.

Apply

The analysed operation is the apply operation, used to access an element at a random index. Looking at the graphs, we can see that List didn’t perform well. Lists don’t support random element access: every time, we need to traverse the list until we reach the element at the index we are looking for. Although both Seq and Vector behaved quite well, our winner for this round is Seq. Note that the default implementation of Seq in Scala 2.11 is based on linked lists, while Vector is implemented using a bit-mapped trie with a branching factor of 32 (a structure that can be seen as a tree where each node has up to 32 children).

[Charts: apply performance for Seq, List, Vector, and all three combined]

Append

For the append operation, the clear winner is Vector: its tree structure makes it really efficient to append elements. On the other hand, List and Seq have a linked structure that makes this operation quite expensive.

[Charts: append performance for Seq, List, Vector, and all three combined]

Prepend

List is unbeatable when prepending an element to the collection: all it has to do is create a new node and point it to the head of the existing list…easy! Vector also performs quite well, very similarly to the append case, thanks to its tree-ish structure. On the other hand, Seq performs disastrously because all the indices need to be updated when a new element is added at the beginning of the collection.
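
Taken together, the operations compared in these three rounds map onto the following collection methods; a quick sketch using the standard immutable collections:

```scala
val list   = List(1, 2, 3)
val vector = Vector(1, 2, 3)

// apply: O(n) traversal for List, effectively constant for Vector
val second = vector(1)

// append: effectively constant for Vector, O(n) for List (the whole list is rebuilt)
val appended = vector :+ 4

// prepend: O(1) for List (just a new head node), effectively constant for Vector
val prepended = 0 :: list

println(s"$second, $appended, $prepended")
```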

[Charts: prepend performance for Seq, List, Vector, and all three combined]

Who’s the winner?

The results of our test suggest that, unless our system requires intensive use of specific operations like append or prepend, we should avoid lists and sequences in favour of vectors, as they have better overall performance. Note that this is particularly true when operating on big collections: when dealing with small ones (i.e. fewer than 10 elements), there is no significant performance difference between one collection type and another. This is consistent with the Collection Performance Characteristics described in the Scala Documentation.

[Chart: overall performance comparison of Seq, List, Vector]

Summary

Choosing the correct type of Scala collection can have a big impact on the performance of our code. This article has analysed the results of an experiment in which the performance of Seq, List and Vector has been compared when accessing, appending and prepending an element. Our experiments are consistent with the Scala Documentation and suggest that Vectors are the collection with the best overall performance.

Pimp My Library

Methods are an efficient way of reducing code duplication and making our code cleaner. What happens if a class that you don’t own (e.g. any class of the standard Scala library) doesn’t have a particular method that could make your life a lot easier and your code a lot more readable? This article will describe how we can efficiently pimp an existing Scala library and how to seamlessly use it in our code.

How to Extend an Existing Class

Let’s assume that in our application we often need to complete a text with the string “Yo”.
We could write a nice method for it and import it when needed, but that would not make its usage look exactly the same as the other standard methods of the String class. Instead, by “pimping” the class, we will be able to use our method as if it were actually part of the standard implementation of String.

Creating the following class will do the trick:

// in file com/daniela/sfregola/tutorial/package.scala
package com.daniela.sfregola

package object tutorial {

  implicit class ExtendedString(val text: String) extends AnyVal {
    def yofy = s"Yo $text"
  }
}

We can now use our yofy method for any String used in a class in the package com.daniela.sfregola.tutorial:

package com.daniela.sfregola.tutorial

object Main extends App {
	println("bro".yofy)
	//"Yo bro"
}

These are just a few lines of code, but they use some quite interesting and powerful features of the Scala language.
First of all, ExtendedString is inside a package object, called tutorial: the class will be automatically imported in all the files that belong to the package com.daniela.sfregola.tutorial. For more information on package objects, have a look at this article.
Also, the class is implicit: this allows the compiler to seamlessly wrap an instance of String inside ExtendedString.
Finally, we can see that our class is a subclass of AnyVal: this is a feature introduced in Scala 2.10, called Custom Value Classes; in practical terms, it makes our code faster thanks to some compiler optimisations.

Custom Value Classes

If we play a bit with the :javap command in the Scala console, we can see how the compiler disassembles our code. If we do this for a Custom Value Class (i.e. a class that extends AnyVal), we notice that, instead of allocating an instance of that class, the compiler just allocates a java.lang.String: this little trick makes our code more performant as it avoids the allocation of runtime objects…magic indeed!

Ok, so why don’t we use extends AnyVal everywhere?

The compiler translates our instance into a java.lang.String, so it could struggle when serialising/deserialising it. This approach is usually suggested only when pimping libraries: when extending an existing class, we are not really creating a new class, just adding methods that wrap an instance of the original class.
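
A minimal sketch of the idea with a hypothetical Meters wrapper (not from the code above): in most usages the compiler works with a raw Double and avoids allocating a Meters instance, while generic usage forces it to box the value after all.

```scala
// A hypothetical value class: most usages compile down to a raw Double.
case class Meters(value: Double) extends AnyVal {
  def +(other: Meters): Meters = Meters(value + other.value)
}

val distance = Meters(1.5) + Meters(2.0) // no Meters instance allocated here

// Boxing kicks in when the value class is used generically,
// e.g. as an element type of a collection:
val distances: List[Meters] = List(Meters(1.0), Meters(2.0)) // elements are boxed

println(distance.value)
```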

Summary

Pimping is a powerful technique for enriching existing libraries. In this article we have described how to efficiently add methods to existing classes. We have also briefly described how the compiler performs runtime optimisations using Custom Value Classes.

Loading Configurations in Scala

The separation of configuration from code is a good practice that makes our system customisable, as we can load different configurations according to the environment we run it in. In this article we will describe different approaches to loading configurations in Scala and how they can be combined: loading configurations from a file, from command line parameters or from environment variables.

Configurations from a file

Let’s start with the basic case scenario: given a file, we want to read it and parse its values to use them in our code.

First, we need to define our configuration file, let’s call it application.conf.

// application.conf
my {
	secret {
		value = "super-secret"	
	}
}

We can now parse the file and use the obtained configuration in our script:

// config-tutorial.scala
import com.typesafe.config.ConfigFactory

val value = ConfigFactory.load().getString("my.secret.value")
println(s"My secret value is $value")
>> scala config-tutorial.scala 
My secret value is super-secret

By default, the ConfigFactory looks for a configuration file called application.conf. If we want to use a different configuration file (e.g. another.conf), we just need to indicate a different file name and path to load (e.g. ConfigFactory.load("another")). The Typesafe Config library provides several methods to make sure that the parsed value is compatible with the expected type: have a look at the Typesafe Config documentation for methods to parse integers, longs, floats, etc.
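
For example, assuming some made-up keys, a single file can mix value types, and each one is read with the matching typed accessor:

```hocon
// application.conf (sketch, keys invented for illustration)
my {
	retries = 3           // read with config.getInt("my.retries")
	enabled = true        // read with config.getBoolean("my.enabled")
	timeout = 5 seconds   // read with config.getDuration("my.timeout", TimeUnit.SECONDS)
}
```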

Configurations from command line parameters

Another approach is to allow our users to redefine settings through command line parameters rather than changing the configuration file directly. All we have to do is change our configuration file as follows:

// application.conf
my {
	secret {
		value = "super-secret"
		value = ${?VALUE}	
	}
}

The output of our script will now change according to the command line parameters provided:

>> scala config-tutorial.scala 
My secret value is super-secret

>> scala config-tutorial.scala -Dmy.secret.value=another-secret
My secret value is another-secret

Configurations from environment variables

Redefining configurations through command line parameters works in most cases, but it can be tedious when we have a lot of parameters to change. Also, putting sensitive information, such as passwords or tokens, in clear text in a configuration file or a run script may not be safe enough. Another option is to inject our parameters from predefined environment variables.

In order to achieve this, we can write a simple method that looks for a specific environment variable before falling back to the configuration loaded with the previously described approach.

import scala.util.Properties

def envOrElseConfig(name: String): String = {
  Properties.envOrElse(
    name.toUpperCase.replaceAll("""\.""", "_"),
    config.getString(name)
  )
}

Before loading our my.secret.value configuration, this simple method will first check if an environment variable called MY_SECRET_VALUE exists.
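
To make the name mangling concrete, here is a self-contained sketch of the same lookup, with the config fallback replaced by a plain default value (envOrElse is a hypothetical helper name):

```scala
import scala.util.Properties

// Hypothetical helper mirroring envOrElseConfig, with the config lookup
// replaced by a plain default so the sketch is self-contained.
def envOrElse(name: String, default: String): String = {
  // my.secret.value -> MY_SECRET_VALUE
  val envName = name.toUpperCase.replaceAll("""\.""", "_")
  Properties.envOrElse(envName, default)
}

// Falls back to the default unless MY_SECRET_VALUE is set in the environment.
println(envOrElse("my.secret.value", "super-secret"))
```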

We can now put all together and create a script (gist available here) that will inject configurations in the following order:
1) From properly named environment variables
2) From command line parameters
3) From a configuration file

// application.conf
my {
	secret {
		value = "super-secret"
		value = ${?VALUE}	
	}
}

import com.typesafe.config.ConfigFactory
import scala.util.Properties

class MyConfig(fileNameOption: Option[String] = None) {
    
  val config = fileNameOption.fold(
                  ifEmpty = ConfigFactory.load() )(
                  file => ConfigFactory.load(file) )

  def envOrElseConfig(name: String): String = {
    Properties.envOrElse(
      name.toUpperCase.replaceAll("""\.""", "_"),
      config.getString(name)
    )
  }
}

The script can be used as follows:

val myConfig = new MyConfig()
val value = myConfig.envOrElseConfig("my.secret.value")
println(s"My secret value is $value")

Summary

Having a clear separation between configuration and code allows us to customise the execution of our system according to the environment it runs in. In this article we have described different approaches to defining specific settings. In particular, we have presented a simple script that combines all these approaches into one: it loads configurations first from environment variables, then from command line parameters and finally from a configuration file.

Scala Italy 2015 Highlights

A few days ago I had the opportunity to attend and speak at Scala Italy 2015, hosted in Milan. Scala Italy is the only Italian conference on Scala: its aim is to spread the use of Scala in the Italian developer community and to contribute to the wider Scala Community.

The conference was hosted by UniCredit and DataBiz in the beautiful skyscraper of the UniCredit Town Hall and it was divided into the morning and afternoon sessions.

The first talk of the morning session was given by Martin Odersky (@odersky) on the evolution of Scala: where it started and where he would like it to be in the near future (video). My favourite part was when he said that Scala is a library-driven language: it provides the tools for people to create libraries rather than imposing them, “because we trust developers”…how cool is that? 😀 Another unforgettable moment was when he showed a slide with the sentence “Scala is a gateway drug to Haskell” and an awesome proposal for a new name for Scala: HASCALATOR.

The second talk was by Phil Calcado (@pcalcado), Director of Core Engineering at SoundCloud (video). He talked about the design and architecture of their microservice system and how they managed to apply principles of functional programming to it. It was a truly inspirational talk; I was so excited that I almost messaged my CTO during the talk itself to tell him that we should ABSOLUTELY do the same! He also stated that functional principles correlate with good design and good architecture. I agree with Phil: we should explore this concept a lot more than we have done so far.

Last but not least for the morning session, Jon Pretty (@propensive) gave a talk on type inference in Scala (video). His talk gave the audience a demonstration of how mysterious and complex this topic can be, but also of how some of these problems can be overcome. I do remember my “Holy Crap!” exclamation when he showed us an inferred type a few pages long.

After a nice and chatty lunch break, the afternoon session started…together with the panic for my talk.

My talk was the first of the afternoon session. I gave an introduction to Akka and the Actor-Based Model (video). Yes, I did say that “Actors are like monkeys sending emails”. No, I don’t regret it. I also showed how you can interact with the Actor System using its four core operations (i.e. create, send, become, supervise) and talked a bit about best design practices.
Overall, I think it went quite well: I didn’t trip over or stab myself with a pen, I didn’t spill water all over myself or the microphone, and there were no people snoring (at least that I could hear/see) or throwing rotten tomatoes at me. I also received a couple of interesting questions about the interdependencies between Actors. If interested, the slides of my talk are available here, while the video can be found here.

At the same time as my talk, Mirco Dotta (@mircodotta), Software Engineer at Typesafe, was talking about Reactive Streams (video). I cannot wait to see his talk! Reactive Streams has been on my “TO-PLAY-WITH” list for a while now…

Finally, the last talk I’d like to mention was the one by Alessandro Abbruzzetti from Ericsson (video). His talk was so overwhelming and entertaining that there was a standing ovation at the end of it: he built a Commodore 64 emulator in Scala that runs on the same hardware resources as an original Commodore 64; he even analysed the circuit diagram!
Why on earth would you do that? If Scala is as performant as they claim, then an emulator running on hardware equivalent to a Commodore 64 should be at least as performant as the original one we had in the 80s.
It took him a year and a half of his spare time, but he did it…and the result is simply amazing. Can you imagine doing crazy cool graphics, processing and I/O with only one thread?? He proved it can be done. THIS IS THE COOLEST THING EVER! 😀
During the talk you could see all his passion and happiness, all the hard work dedicated to this crazy project. Every few minutes he reminded us, just to make sure we didn’t forget: “Remember…one thread…just one thread…”. His project, called Kernal64, is available on GitHub at abbruzze/kernal64.

Finally, I’d like to mention the great work that the organizers of this conference did, in particular Marco Firrincelli (@mfirry), who took good care of me and made sure I made it to the event without getting lost in the middle of nowhere in Milan…well done guys! 🙂

All the videos of this conference can be found here.

Akka Dead Letters Channel

Akka doesn’t guarantee the delivery of a message. What happens when a message cannot be delivered? In this article we will describe how the Dead Letters Channel works and how it can be used to spot issues in our system.

How it works

In a previous article we described the use of Event Streams in Akka. The Dead Letters Channel is nothing more than a special Event Stream that the system uses internally every time a message fails to reach its destination: either because the message cannot be processed or because it cannot be delivered.

When Akka redirects a failed message to the Dead Letters actor, it wraps the message in a case class called DeadLetter that provides the message together with the original sender and recipient:

case class DeadLetter(message: Any, sender: ActorRef, recipient: ActorRef)

Unless specified differently, dead letters are logged at the INFO level; more information on how to tweak your logging settings can be found here.
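
For example, that logging behaviour can be tuned through the settings akka.log-dead-letters and akka.log-dead-letters-during-shutdown; a sketch of an application.conf fragment:

```hocon
akka {
	# log at most 10 dead letters, then stop reporting them
	log-dead-letters = 10
	# do not log dead letters while the system is shutting down
	log-dead-letters-during-shutdown = off
}
```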

How to use it

Because the Dead Letter Channel is an Event Stream, we can subscribe to it and listen to all the messages it publishes.

The code used for this tutorial is available here.

First of all, let’s create a dummy actor, called EchoActor, that prints all the messages it receives:

class EchoActor extends Actor {
  
  def receive = {
    case msg => println(s"New msg received: $msg")
  }
  
}

The second step is to create our actor system: we will have two instances of EchoActor, one called deadLettersSubscriber, which will listen for DeadLetters, and the other, called echoActor, which will simply wait for and receive messages.

  implicit val system = ActorSystem("dead-letters-usage-example")

  val deadLettersSubscriber = system.actorOf(Props[EchoActor], name = "dead-letters-subscriber")
  val echoActor = system.actorOf(Props[EchoActor], name = "generic-echo-actor")

  system.eventStream.subscribe(deadLettersSubscriber, classOf[DeadLetter])

When successfully sending a message, no dead letter is generated.

  echoActor ! "First Message"
  // generic-echo-actor - New msg received: First Message

However, when we try to send a message to an actor that has been killed, the message is transformed into a DeadLetter.

  echoActor ! PoisonPill
  echoActor ! "Second Message"
  // dead-letters-subscriber - New msg received: DeadLetter(Second Message,Actor[akka://dead-letters-usage-example/deadLetters],Actor[akka://dead-letters-usage-example/user/generic-echo-actor#317003256])
  // INFO  [RepointableActorRef]: Message [java.lang.String] from Actor[akka://dead-letters-usage-example/deadLetters] to Actor[akka://dead-letters-usage-example/user/generic-echo-actor#317003256] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

Finally, we can also send messages directly to the Dead Letters actor: this is usually not advised, as the Dead Letters Channel should be reserved for the system to redirect failed messages.

  system.deadLetters ! "Dead Message"
  // dead-letters-subscriber - New msg received: DeadLetter(Dead Message,Actor[akka://dead-letters-usage-example/deadLetters],Actor[akka://dead-letters-usage-example/deadLetters])
  // INFO  [DeadLetterActorRef]: Message [java.lang.String] from Actor[akka://dead-letters-usage-example/deadLetters] to Actor[akka://dead-letters-usage-example/deadLetters] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

Summary

Akka redirects all the messages that couldn’t be delivered or processed to the Dead Letters Channel. In this article we have discussed how Akka uses it and how we can exploit it when testing our system and investigating delivery issues.

Peer-to-Many Communication in Akka

The most common communication channel in Akka is Peer-to-Peer, where an individual actor sends a message directly to another individual actor. However, sometimes this is not enough: we may need a Peer-to-Many type of communication, where an individual actor sends a message to a group of actors. This is particularly useful when modelling a system with the Publisher-Subscriber Pattern. This article provides a quick tutorial on how to use and customise an Event Bus, the Akka way of implementing a Peer-to-Many communication channel.

Event Stream

The Event Stream is the simplest and most common implementation of an Event Bus. It follows the classic Publisher-Subscriber Pattern: one actor publishes a message and all the actors that have subscribed to that specific message type receive it.

Let’s see with a simple tutorial how easily this can be achieved (gist available here).

In this tutorial, we want to model the following scenario: every time someone publishes a book, all the subscribers need to receive it.

First of all, we need to define what a book is:

case class Book(title: String, authors: List[String])

Then, we need to specify an Actor that acts as book publisher: every time it receives a book, it publishes it on the System Event Stream.

class BookPublisher extends Actor {
  
  def receive = {
    case book: Book => {
      println(s"Yeah! Publishing a new book: $book")
      context.system.eventStream.publish(book)
    }
  }
  
}

Finally, let’s create an Actor that will subscribe to the System Event Stream for all messages of type Book. Note that the preStart function will be executed by Akka right after the creation of the Actor.

class BookSubscriber extends Actor {
  
  override def preStart = context.system.eventStream.subscribe(self, classOf[Book])
  
  def receive = {
    case book: Book => println(s"My name is ${self.path.name} and I have received a new book: $book")
  }
}

Done! See, it wasn’t that bad… 🙂

Now we just need to play with our system to make sure it works as expected:

object Main extends App {
  
  implicit val system = ActorSystem("publisher-subscribers-example")
 
  val author = "Author"
  
  val bookPublisher = system.actorOf(Props[BookPublisher], name = "book-publisher")
  
  val subscriber1 = system.actorOf(Props[BookSubscriber], name = "subscriber-1") 
  val subscriber2 = system.actorOf(Props[BookSubscriber], name = "subscriber-2")
  
  bookPublisher ! Book(title = "A book title", authors = List(author, "Another author"))
  // Yeah! Publishing a new book: Book(A book title,List(Author, Another author))
  // My name is subscriber-1 and I have received a new book: Book(A book title,List(Author, Another author))
  // My name is subscriber-2 and I have received a new book: Book(A book title,List(Author, Another author))
  
  system.eventStream.unsubscribe(subscriber2, classOf[Book])
 
  bookPublisher ! Book(title = "Another book title", authors = List("Another author"))
  // Yeah! Publishing a new book: Book(Another book title,List(Another author))
  // My name is subscriber-1 and I have received a new book: Book(Another book title,List(Another author))
} 

Note that instead of unsubscribing subscriber2 from all the messages of type Book, we could also have unsubscribed it from every message type by using system.eventStream.unsubscribe(subscriber2).

Custom Event Bus

Event Streams are really easy to use. However, they may not be that useful if we want to perform some filtering on the published messages. One solution could be to do some filtering before publishing the message, but what if this cannot be done, or we simply want to do it in a more elegant way?

An Event Stream is a specific implementation of the EventBus trait: Akka gives us the opportunity to reuse this trait to create custom Event Buses.

Assume that now a subscriber wants to receive books for a specific author — gist of the following code can be found here.

First of all, the BookSubscriber actor doesn’t need to automatically subscribe to the System Event Stream. Moreover, the BookPublisher actor now needs to publish on a given Event Bus, rather than the System Event Stream.

class BookPublisher(bus: AuthorBookBus) extends Actor {
  
  def receive = {
    case book: Book => {
      println(s"Yeah! Publishing a new book: $book")
      bus.publish(book)
    }
  }
  
}
 
class BookSubscriber extends Actor {
  
  def receive = {
    case book: Book => println(s"My name is ${self.path.name} and I have received a new book: $book")
  }
}

Finally, let’s define our AuthorBookBus that will filter books according to their authors.

class AuthorBookBus(author: String) extends EventBus
  with LookupClassification
  with ActorEventBus {
  
  type Event = Book
  type Classifier = Boolean
  
  override def mapSize = 2
  
  override def classify(book: Book) = book.authors.contains(author)
 
  override protected def publish(book: Book, subscriber: Subscriber)= subscriber ! book
} 

Our Event Bus accepts events of type Book and classifies each book with one of two possible values: true if author is one of the authors of the book, false otherwise (and this is why override def mapSize = 2!). The classify function is used to categorise a book according to the Classifier type. Akka provides different types of classifiers (see here); in our case LookupClassification was enough, as we wanted a classification based on the event itself.

The following app shows an example of how our actor system could be used:

object Main extends App {
  
  implicit val system = ActorSystem("publisher-subscribers-example")
  
  val author = "Author"
 
  val authorBookBus = new AuthorBookBus(author)
  val bookPublisher = system.actorOf(Props(new BookPublisher(authorBookBus)), name = "book-publisher")
 
  val subscriber1 = system.actorOf(Props[BookSubscriber], name = "subscriber-1")
  val subscriber2 = system.actorOf(Props[BookSubscriber], name = "subscriber-2")
 
  authorBookBus.subscribe(subscriber1, true)
  // i.e.: subscriber1 will receive all the books
  // where one of the authors is "Author"

  authorBookBus.subscribe(subscriber2, false)
  // i.e.: subscriber2 will receive all the books 
  // where "Author" is not an author

  bookPublisher ! Book(title = "A book title", authors = List(author, "Another Author"))
  // Yeah! Publishing a new book: Book(A book title,List(Author, Another Author))
  // My name is subscriber-1 and I have received a new book: Book(A book title,List(Author, Another Author))
 
  bookPublisher ! Book(title = "Another book title", authors = List("Another Author"))
  // Yeah! Publishing a new book: Book(Another book title,List(Another Author))
  // My name is subscriber-2 and I have received a new book: Book(Another book title,List(Another Author))
}

Summary

Akka allows not only Peer-to-Peer communication, but also Peer-to-Many, which is useful to implement a Publisher-Subscriber Pattern. This article has described how this can be achieved using Event Streams for the simplest scenarios and Custom Event Buses when some classification of the events is needed.