Let it crash

Feb 19

A sample application showcasing play-mini and Akka

In Akka 2.0 we have decided to replace the Mist based HTTP module with a module called play-mini. With play-mini it’s possible not only to create a REST based API in an easy fashion but also to use the power of the Play framework.

In this post we will describe the steps needed to build a simple REST service and have it interact with actors in an ActorSystem. It will also show the power of Akka’s futures and how to apply them in a real world example.

If you are not familiar with the infinite monkey theorem you can study the background of the application here: http://en.wikipedia.org/wiki/Infinite_monkey_theorem
(Disclaimer: the sample application used is not a complete implementation, or a proof, of the theorem in any way.)

Prerequisites

Download the Application Code

Start by downloading the code to your local machine. Open a terminal and type the following:
> git clone git@github.com:henrikengstrom/PlayMiniSample.git
> cd PlayMiniSample
> sbt
sbt> compile

That last command will print a lot of traces but it should end with something similar to:
[info] Compiling 2 Scala sources to /<your_structure>/PlayMiniSample/target/scala-2.9.1/classes…
[success] Total time: 5 s, completed Feb 19, 2012 10:31:41 AM

Okay, that’s it. Now you have the code downloaded and compiled.
Before giving the application a spin it’s a good idea to go through the different parts so you get a better understanding of it all.

Application Dissection

Play-mini gives you the tools to create a REST API and also easily have it integrate with your ActorSystem. You need to do 2 things in order to create a play-mini application:

  1. Create a Global class to indicate to play-mini what class to run
  2. Have your play-mini class extend com.typesafe.play.mini.Application:

Routing

When extending com.typesafe.play.mini.Application you have to provide a route implementation. This is where the routing logic should go, i.e. where you add all URL paths the application should handle.

The sample application handles two paths: /ping (HTTP GET) and /write (HTTP POST).

The anatomy of the route method is to first state the HTTP method and thereafter what path the method should be applied to, e.g.:
GET(Path("/ping")) or
POST(Path("/write"))

The result returned is an “Action”. For more information about this see:
https://github.com/playframework/Play20/wiki/ScalaActions
  
Let’s dissect the HTTP POST in the sample.

Interesting lines are:
Line 1: Declares HTTP method and path. Also adds an “implicit request”. The latter part is required by line 3, i.e. the form handling.
Line 3: Extracts parameters posted in the request with help of the Form extractor (line 17).
Line 4: AsyncResult instructs the code to park the HTTP request and release the thread.
Line 5: This is where the call to Akka is done with “shakespeare ? numberMonkeys”. This returns an Akka Future instance and that Future is then mapped to the case class “Result”. (The rest of the code, i.e. “.asPromise.map” is merely a way of converting from an Akka Future into Play’s own Future implementation.)
Line 7-11: Builds the reply, in this case a simple string.
Line 12: Returns the result. There is a set of predefined replies in Play (for more info see: https://github.com/playframework/Play20/wiki/ScalaActions)

The ActorSystem

Our infinite monkey theorem implementation is quite simple. It consists of two actor types: Shakespeare and MonkeyWorker.

The former receives information about how many monkey workers to use, creates the actors and sends a generated random number which instructs the monkey worker how many words to type.

A nice feature that should be described in a little more detail is the use of Akka’s Future implementation. Shakespeare seems to have been quite an astute chap and the same can be said about the Futures implementation in Akka!

Line 3-5: This is the part where we create the actors and send them information about how many words to type. Since we use the ask notation (“?”) we get a Future back, or as in this case a list of Futures. Noteworthy is also that the result of each Future will contain a set of strings, and this is interestingly enough also what the monkey workers send back to the “sender” as a message.
Line 7: Extracts each result from the list of futures
Line 8: Merges all results into one result (Set[String])
Line 9: Divides the result into two subsets; Shakespearian and non-Shakespearian words 
Line 10: Creates an instance of the case class Result
Line 11: Pipes the result back to the “sender” which in this case is the HTTP POST part in the route method.
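
The same ask, collect, merge and partition steps can be sketched without Akka. The following is only a rough illustration using the JDK's CompletableFuture in place of Akka's Future; the class name MonkeyFutures, the tiny word list, the Result stand-in and the fixed worker output are all assumptions made up for this sketch and are not taken from the sample application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;

// Illustrative only: JDK futures standing in for Akka futures.
class MonkeyFutures {
    // Hypothetical dictionary; the real sample has its own word list.
    static final Set<String> SHAKESPEARE = Set.of("to", "be", "or", "not");

    // Stand-in for the Result case class mentioned in the post.
    static final class Result {
        final Set<String> shakespeare;
        final Set<String> unworthy;

        Result(Set<String> shakespeare, Set<String> unworthy) {
            this.shakespeare = shakespeare;
            this.unworthy = unworthy;
        }
    }

    // Hypothetical worker: returns fixed words so the sketch is repeatable.
    static CompletableFuture<Set<String>> askWorker(int id) {
        return CompletableFuture.supplyAsync(
            () -> Set.of("word" + id, id % 2 == 0 ? "to" : "be"));
    }

    // Ask every worker, wait for all replies, then merge and partition the words.
    static CompletableFuture<Result> write(int monkeys) {
        List<CompletableFuture<Set<String>>> replies = new ArrayList<>();
        for (int i = 0; i < monkeys; i++) {
            replies.add(askWorker(i));
        }
        return CompletableFuture
            .allOf(replies.toArray(new CompletableFuture[0]))
            .thenApply(ignored -> {
                Set<String> good = new TreeSet<>();
                Set<String> bad = new TreeSet<>();
                for (CompletableFuture<Set<String>> reply : replies) {
                    // join() does not block here: allOf guarantees completion.
                    for (String word : reply.join()) {
                        (SHAKESPEARE.contains(word) ? good : bad).add(word);
                    }
                }
                return new Result(good, bad);
            });
    }

    public static void main(String[] args) {
        Result result = write(4).join();
        System.out.println("SHAKESPEARE WORDS: " + result.shakespeare);
        System.out.println("UNWORTHY WORDS CREATED: " + result.unworthy.size());
    }
}
```

Nothing blocks until the final join() in main; in the sample the equivalent last step is piping the result back to the sender instead.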


For more information about the Akka Future implementation see: http://akka.io/docs/akka/2.0-RC1/scala/futures.html

We leave it to the reader to investigate how the MonkeyWorker has been implemented.

Running the Application

Want to give the application a test ride?
Just type the following in the terminal window:
>sbt
sbt> run

You should now see something like:
[info] Running play.core.server.NettyServer
Play server process ID is 30095
[info] play - Application started (Prod)
[info] play - Listening for HTTP on port 9000…

Fire up another terminal window and try it out:
> curl http://localhost:9000/ping
Pong @ 1329654200089

That went fine so the application seems to be alive and kicking. Now let’s get them monkeys busy:
> curl -d "number=1" http://localhost:9000/write
SHAKESPEARE WORDS:
UNWORTHY WORDS CREATED: 56
In 40376us

Okay, so using one monkey is probably not enough to create Shakespearian words. Let’s try with 100 monkeys:
> curl -d "number=100" http://localhost:9000/write
SHAKESPEARE WORDS:
or
UNWORTHY WORDS CREATED: 4691
In 133460us

Yeah, they typed an “or”. Let’s use one thousand monkeys and see what words they can come up with:
> curl -d "number=1000" http://localhost:9000/write
SHAKESPEARE WORDS:
or
to
not
be
UNWORTHY WORDS CREATED: 45773
In 219808us

Who said that writing Shakespeare couldn’t be done by monkeys? :-)

Further reading

Happy hAKKing
@h3nk3

Feb 16

Why no mailboxSize in Akka 2?

Akka 1.x exposed a method to query the mailbox size, which was available both from inside an actor and from outside. We removed this method in 2.0 for a number of reasons, and since this topic came up on the mailing list multiple times already, here is my attempt at making the rationale accessible to a broader audience and disburden the akka-user group.

What are the problems with querying the mailbox size?

So, no mailboxSize in the default implementation. (There are even more reasons as soon as you consider remote actor references and the asynchronous nature of everything within Akka, just to name a few.)

Why would you want to use it anyway?

Assuming there would be a method to query the mailbox size from within an actor (doing it from without is really impossible in a distributed setting), and assuming that the actor detects that messages are piling up faster than it can process them, what would its plan of action be? Slowing down the senders—by way of a bounded queue—is basically the only thing it can do itself. Other than that only the supervisor can do something, but your application is likely not prepared to handle that, since all the supervisor can do is terminate the poor guy, invalidating all the references the senders have.

But I want to check periodically and interrupt my long-running task when new messages arrive …

That is not a good idea, because this way you block the processing of internal messages (system messages) which are used in the implementation of supervision, actor selections and other things. It is a much better approach to break up long-running tasks into smaller packages and have the actor send these to itself continuously until the job is done; that way it stays fully reactive and does not hog resources uncontrollably. Or you hand off the big pieces of work to Futures, compose those with the awesome Future API and feed the result back to the target actor using pipeTo (or callbacks as per onSuccess, etc.).
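
To make the "send the rest to yourself" idea concrete, here is a minimal sketch. It deliberately does not use Akka: the mailbox is a plain queue drained by a loop, and ChunkedSummer, Work and the chunk size of 1000 are hypothetical names and values chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative only: a hand-rolled mailbox loop, not Akka's actor API.
class ChunkedSummer {
    private static final int CHUNK = 1000;  // numbers summed per message (arbitrary)
    private final Queue<Object> mailbox = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();
    long total = 0;

    // Hypothetical message: sum the numbers from next (inclusive) to end (exclusive).
    static final class Work {
        final long next;
        final long end;

        Work(long next, long end) {
            this.next = next;
            this.end = end;
        }
    }

    void tell(Object message) {
        mailbox.add(message);
    }

    // Handle one message; a big job only advances by one chunk per message.
    private void receive(Object message) {
        if (message instanceof Work) {
            Work work = (Work) message;
            long stop = Math.min(work.next + CHUNK, work.end);
            for (long i = work.next; i < stop; i++) {
                total += i;
            }
            if (stop < work.end) {
                tell(new Work(stop, work.end));  // send the remainder to ourselves
            } else {
                log.add("done");
            }
        } else {
            log.add("handled " + message);
        }
    }

    // Drain the mailbox in arrival order.
    void run() {
        Object message;
        while ((message = mailbox.poll()) != null) {
            receive(message);
        }
    }

    public static void main(String[] args) {
        ChunkedSummer actor = new ChunkedSummer();
        actor.tell(new Work(0, 5000));
        actor.tell("ping");  // queued behind the first chunk only
        actor.run();
        System.out.println(actor.total + " " + actor.log);  // 12497500 [handled ping, done]
    }
}
```

The "ping" is handled after the first chunk rather than after the whole job, which is the point: other messages get their turn between chunks without anyone having to inspect the mailbox.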

So, what is the recommended way?

When designing an application, you will immediately spot (most of) the hot spots and create these actors using Routers. And if you forget one, don’t worry, it is quite painless to insert “.withRouterConfig(RoundRobinRouter(10))” when testing reveals a bottleneck. This gets even better when using Resizers, which add elasticity to the scaling.
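
For readers who have not met the idea before, round-robin routing simply hands each new message to the next worker in turn. The sketch below shows only that selection logic; it is not Akka's RoundRobinRouter, and the RoundRobin class and its next() method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: the selection logic behind round-robin routing.
class RoundRobin<T> {
    private final List<T> routees;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobin(List<T> routees) {
        this.routees = new ArrayList<>(routees);  // defensive copy
    }

    // floorMod keeps the index valid even after the counter overflows.
    T next() {
        return routees.get(Math.floorMod(counter.getAndIncrement(), routees.size()));
    }

    public static void main(String[] args) {
        RoundRobin<String> router = new RoundRobin<>(List.of("worker-1", "worker-2", "worker-3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(router.next());  // worker-1, worker-2, worker-3, worker-1
        }
    }
}
```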

Okay, I considered all this and I still want mailbox metrics.

In case you cannot do without, it is quite easy to write your own mailbox implementation, building on the traits in the akka.dispatch package and inserting book-keeping code into enqueue() and dequeue(). Then you could either use down-casting (evil) or keep track of your mailboxes in an akka.actor.Extension (recommended) to access the stats from within your actor and do whatever is necessary.
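
The book-keeping itself is small. The sketch below shows the counting idea on a plain JDK queue; it does not implement the akka.dispatch traits, and CountingMailbox and currentSize() are hypothetical names used for illustration only.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: NOT the akka.dispatch mailbox traits, just the book-keeping idea.
class CountingMailbox<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicLong size = new AtomicLong();

    void enqueue(T message) {
        // Count first so a concurrent dequeue can never drive the counter below zero.
        size.incrementAndGet();
        queue.add(message);
    }

    // Returns null when the mailbox is empty.
    T dequeue() {
        T message = queue.poll();
        if (message != null) {
            size.decrementAndGet();
        }
        return message;
    }

    // A snapshot only: it may already be stale by the time the caller looks at it.
    long currentSize() {
        return size.get();
    }

    public static void main(String[] args) {
        CountingMailbox<String> mailbox = new CountingMailbox<>();
        mailbox.enqueue("a");
        mailbox.enqueue("b");
        mailbox.dequeue();
        System.out.println(mailbox.currentSize());  // 1
    }
}
```

A separate counter is used because ConcurrentLinkedQueue.size() has to traverse the queue, which is the kind of cost you do not want on every metrics read.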

But wait: did I mention that it might even be easier to tag latency-critical (but not too high frequency) messages with timestamps and react on the age of a message when processing it?
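
A sketch of that timestamp approach, under a few assumptions: the names LatencyAwareHandler and Timestamped, the 50 ms budget, and the decision to drop stale messages are all invented for this example; a real actor might instead shed load, reply with an error, or alert a supervisor.

```java
// Illustrative only: names and the 50 ms budget are assumptions.
class LatencyAwareHandler {
    static final long MAX_AGE_MILLIS = 50;

    // A message tagged with the time it was created (e.g. System.nanoTime() at the sender).
    static final class Timestamped<T> {
        final T payload;
        final long createdNanos;

        Timestamped(T payload, long createdNanos) {
            this.payload = payload;
            this.createdNanos = createdNanos;
        }

        long ageMillis(long nowNanos) {
            return (nowNanos - createdNanos) / 1_000_000;
        }
    }

    // React to the age of the message instead of asking how long the queue is.
    static String handle(Timestamped<String> message, long nowNanos) {
        if (message.ageMillis(nowNanos) > MAX_AGE_MILLIS) {
            return "overloaded: dropped " + message.payload;
        }
        return "processed " + message.payload;
    }

    public static void main(String[] args) {
        Timestamped<String> message = new Timestamped<>("order-1", System.nanoTime());
        System.out.println(handle(message, System.nanoTime()));
    }
}
```

Note that System.nanoTime() is only comparable within one JVM, so this sketch assumes sender and receiver share a process.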

So, in summary: while there still is a way to get the mailbox size, you will probably never actually need it.

PS: In case you need mailbox metrics for monitoring and in general operating a deployed system, you might want to have a look at the Typesafe Console.

Feb 14

Scalability of Fork Join Pool

Akka 2.0 message passing throughput scales way better on multi-core hardware than in previous versions, thanks to the new fork join executor developed by Doug Lea. One micro benchmark illustrates a 1100% increase in throughput!

The new 48 core server had arrived and we were excited to run the benchmarks on the new hardware, but it was sad to see the initial results. It didn’t scale! What was wrong?

The purpose of the micro benchmark is to see how the throughput of message send and receive is affected by an increasing number of concurrent, active actors sharing the same dispatcher. Pairs of actors send messages to each other, classical ping-pong. Load is increased by adding more pairs of actors that are processing messages in parallel with other actors.

Full source code of the benchmark: TellThroughputPerformanceSpec.scala

Hardware and configuration:

When using the thread pool executor (java.util.concurrent.ThreadPoolExecutor) the benchmark didn’t scale beyond 12 parallel actors. Throughput was stuck at 1.4 million messages per second and didn’t increase with added load, even though the 48 core box at first glance was not heavily loaded: less than 10% of total CPU capacity was used.

First we thought the BIOS configuration was wrong, since the machine was new, but after going through all settings and turning off all power savings options the result was still as bad. We also bought more memory, to use 4 DIMMs per CPU for maximum memory bandwidth.

We noticed that the number of context switches was abnormal, above 70000 per second.

That must be the problem, but what is causing it? Viktor came up with the qualified guess that it must be the task queue of the thread pool executor, since that is shared and the locks in the LinkedBlockingQueue could potentially generate the context switches when there is contention.

Discussions with Doug Lea resulted in an improved implementation of the fork join pool, which we have embedded in akka-actor and which is used by default. The task queue is striped using randomized queuing and stealing. Read more about it here.
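
To see the two executor styles side by side, the sketch below pushes the same tiny tasks through a ThreadPoolExecutor backed by one shared LinkedBlockingQueue and through the stock java.util.concurrent.ForkJoinPool. It is a toy, not the benchmark from this post: it uses the JDK's pool rather than the version embedded in akka-actor, the class and method names are invented, and the timings it prints depend entirely on your machine.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: a toy comparison, not the TellThroughputPerformanceSpec benchmark.
class ExecutorComparison {
    // Submit `tasks` tiny tasks, wait for the pool to drain, return how many ran.
    static long runTasks(ExecutorService pool, int tasks) {
        AtomicLong executed = new AtomicLong();
        for (int i = 0; i < tasks; i++) {
            pool.execute(executed::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executed.get();
    }

    // All worker threads take tasks from one shared, lock-based queue.
    static ExecutorService sharedQueuePool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>());
    }

    // Each worker has its own queue and steals from others when idle.
    static ExecutorService forkJoinPool(int threads) {
        return new ForkJoinPool(threads);
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        int tasks = 1_000_000;

        long start = System.nanoTime();
        runTasks(sharedQueuePool(threads), tasks);
        System.out.println("thread-pool-executor: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        runTasks(forkJoinPool(threads), tasks);
        System.out.println("fork-join-executor:   " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```

Because all tasks here are submitted from a single thread, this will not reproduce the contention described above; in the real benchmark the actors themselves submit work from many threads at once.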

When running the same benchmark with the fork-join-executor the context switches were normal, around 1300 per second.

The results of the benchmark illustrate that the throughput scales with the number of actors up to the number of cores (48) and saturates at around 20 million messages per second.

There are several things that can be tuned to achieve even higher throughput, which we will describe in follow up blog posts.

Jan 31

Akka 2.0 Remoting with Java

There have been a lot of requests for a Java example of how to use the remoting capabilities in Akka 2.0 and now there is one. It can be found here:

https://github.com/jboner/akka/tree/master/akka-samples/akka-sample-remote

Some interesting highlights are:

Looking up a remote actor on a remote node:

Creating an actor on a remote node:

As you can see there isn’t much to the code itself and this is because everything is configuration driven.

Here is what the configuration files look like:

and

That’s it! Pretty slick, right?

Happy hAKKing

//h3nk3

Dec 26

Benchmark: Akka vs Erlang

Franz Bettag has written up a small post about a benchmark between Akka and Erlang.

Interesting results. On his hardware and specific benchmark Erlang R14B04 did 1 million messages per second while Akka 2.0-SNAPSHOT did 2.1 million per second.

Read more here: http://uberblo.gs/2011/12/scala-akka-and-erlang-actor-benchmarks

…and we still have a lot more performance to squeeze out of Akka.

Dec 22

Location Transparency: Remoting in Akka 2.0

The remoting capabilities of Akka 2.0 are really powerful. Something that has not been as powerful is the documentation of the Akka remoting. We are constantly striving to improve it and this blog post will, hopefully, shed some light on the topic.

The remoting contains functionality not only to look up a remote actor and send messages to it but also to deploy actors on remote nodes. These two types of interaction are referred to as the lookup approach and the creation approach.

In the section below the two different approaches will be explained.
(It may be worth pointing out that a combination of the two ways is, of course, also feasible)

The Setup

In order to run the example code below the following jars must be available on your classpath:

The Configuration File

Firstly we’ll start off by examining the configuration file as it plays a pivotal part in the remoting. The remote section holds a lot of configuration possibilities but the bare minimum to get started is described in this section:


If you want to run multiple actor systems on the same machine you can group your settings like this:


Okay, so basically what you need to do to get started are four things:

The above settings are enough for the lookup approach. In that case we only need to make sure the involved actor systems each run on a unique combination of host name and port.

Let’s say we are interested in testing the other approach, i.e. the creation functionality of remoting. This means that we have to add some information to the configuration file:


The configuration above instructs Akka to react specially when an actor at path /actorName is created, i.e. using system.actorOf(Props(...), "actorName"). This specific actor will not be instantiated directly; instead, the remote daemon of the remote system, which is at actorSystem@127.0.0.1:2553 in this sample, will be asked to create the actor.

Lookup Approach in code

Let’s say you want to look up an actor on a remote node and use it. It’s really simple! Just do the following:
context.actorFor("akka://ACTOR_SYSTEM@HOST:PORT/user/ACTOR_NAME")

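where ACTOR_SYSTEM is the name of the remote actor system, HOST and PORT are the host name and port it listens on, and ACTOR_NAME is the name of the actor you want to look up.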

The actor you want to use has to exist in the actor system and have the name associated with it. Here’s some example code showing how to set it up and try it out:


Creation Approach in code

In this section we will see how to create and use an actor on a remote node:


A fully fledged example of the two approaches can also be found here:
https://github.com/jboner/akka/tree/master/akka-samples/akka-sample-remote

Happy hAKKing
@h3nk3

Dec 21

Running Akka on ARM7 with 100 requests per second

@honzam399 writes:

“#Akka on ARM7 board, processing 100 simple REST requests/s at 5 V, 500 mA!”

Dec 20

Fun running Akka tests in parallel on my 24 core Linux box. 
sbt parallel testing FTW. 

Dec 19

Akka Team Blog Is Live

Stay tuned for more soon… :-P