In this post I’m going to show a pattern that can be used to discover facts about an actor system while it is running. It can be used to understand how messages flow through the actors in the system. The main reason why I built this pattern is to understand what is going on in a running actor system that is distributed across many machines. If I can’t picture it, I can’t understand it (and I’m in good company with that quote :)
Building actor systems is fun, but debugging them can be difficult: you mostly end up browsing through many log files on several machines to find out what’s going on. I’m sure you have browsed through logs and thought, “Hey, where did that message go?”, “Why did this message cause that effect?” or “Why did this actor never get a message?”
This is where the Spider pattern comes in.
We recently needed to build a caching system in front of a slow backend system with the following requirements:
- The data in the backend system is constantly being updated so the caches need to be updated every N minutes.
- Requests to the backend system need to be throttled.
The caching system we built used Akka actors and Scala’s support for functions as first class objects.
Graphs have always been an interesting structure to study in both mathematics and computer science (among other fields), and have become even more interesting in the context of online social networks such as Facebook and Twitter, whose underlying network structures are nicely represented by graphs.
These graphs are typically “big”, even when sub-graphed by things such as location or school. With “big” graphs comes the desire to extract meaningful information from them. In the age of multi-core CPUs and distributed computing, concurrent processing of graphs proves to be an important topic.
I’ve seen a question pop up a number of times on the Akka Mailing List that looks something like: "How do you tell Akka to shut down the ActorSystem when everything’s finished?" It turns out that there’s no magical flag for this, no configuration setting, no special callback you can register for, and neither will the illustrious shutdown fairy grace your application with her glorious presence at that perfect moment. She’s just plain mean.
In this post, we’ll discuss why this is the case and provide you with a simple option for shutting down “at the right time”, as well as a not-so-simple option for doing the exact same thing.
Simple distributed load-balancing
“AMQP proxies” is a simple way of integrating AMQP with Akka to distribute jobs across a network of computing nodes. You still write “local” code, have very little to configure, and end up with a distributed, elastic, fault-tolerant grid where computing nodes can be written in nearly every programming language.
When an Actor stops, its children stop in an undefined order. Child termination is asynchronous and thus non-deterministic.
If an Actor has children that have order dependencies, then you might need to ensure a particular shutdown order of those children so that their postStop() methods get called in the right order. This pattern approaches that problem by introducing an Actor type that we’ll call a Terminator.
In Akka 2, there is a nifty little thing called the BalancingDispatcher, which will magically distribute work to a collection of Actors in the most efficient way possible (i.e. it’s a work-stealing dispatcher). The SmallestMailboxRouter has this kind of feel as well. However, the BalancingDispatcher and the SmallestMailboxRouter differ in how they choose which Actor receives the incoming message, and when. The BalancingDispatcher dispatches a message to an Actor only when that Actor would otherwise be idle. The SmallestMailboxRouter dispatches the message to the Actor with the fewest messages in its Mailbox, even while that Actor is currently working on other things.
Oftentimes, people want the functionality of the BalancingDispatcher with the stipulation that the Actors doing the work have distinct Mailboxes on remote nodes. In this post we’ll explore the implementation of such a concept.
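To make the difference between the two delivery policies concrete, here is a toy model in plain Scala. It is not Akka code and does not reproduce Akka's actual selection logic. The names DispatchChoice, Worker, smallestMailbox and balancing are made up for illustration, and the model only captures the two rules as described above.

```scala
// Toy model only: these types and functions are hypothetical, not part of Akka.
object DispatchChoice {
  // A worker is described by how many messages wait in its own mailbox
  // and whether it is currently processing one.
  final case class Worker(name: String, queued: Int, busy: Boolean)

  // SmallestMailboxRouter-like rule: a message is always handed to someone,
  // namely the worker with the fewest queued messages, busy or not.
  def smallestMailbox(workers: Seq[Worker]): Worker =
    workers.minBy(_.queued)

  // BalancingDispatcher-like rule: a message is handed over only to an idle
  // worker; None means it stays in the shared queue until someone frees up.
  def balancing(workers: Seq[Worker]): Option[Worker] =
    workers.find(w => !w.busy)
}
```

With two busy workers, smallestMailbox still picks the one with the shorter queue, while balancing returns None and the message waits. That waiting is what keeps work from piling up behind a slow worker.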
Our goal is to implement a message throttler, a piece of code that ensures that messages are not sent out at too high a rate.
Let’s see a concrete example. Suppose you are writing an application that makes HTTP requests to an external web service and that this web service has a restriction in place: you may not make more than 10 requests in 1 minute. You will get blocked or need to pay if you don’t stay under this limit. With a throttler, you can ensure that the requests you make do not overstep the threshold rate.
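As a rough illustration of the idea, here is a minimal sliding-window sketch in plain Scala. It is an assumption-laden stand-in rather than the throttler this post builds: the class name SlidingWindowThrottler and its tryAcquire method are hypothetical, it only decides whether a message may go out right now, and it does not queue or delay anything. The caller passes in the current time, which keeps the logic easy to test.

```scala
import scala.collection.mutable

// Hypothetical helper: permits at most `maxCalls` sends in any window of `windowMillis`.
final class SlidingWindowThrottler(maxCalls: Int, windowMillis: Long) {
  // Timestamps of the sends still inside the current window, oldest first.
  private val sentAt = mutable.Queue.empty[Long]

  /** Returns true and records the send if a message may go out at `nowMillis`. */
  def tryAcquire(nowMillis: Long): Boolean = {
    // Forget sends that have fallen out of the window.
    while (sentAt.nonEmpty && nowMillis - sentAt.head >= windowMillis) sentAt.dequeue()
    if (sentAt.size < maxCalls) {
      sentAt.enqueue(nowMillis)
      true
    } else false
  }
}
```

For the web service above you would create it as new SlidingWindowThrottler(10, 60000L) and call tryAcquire(System.currentTimeMillis()) before each request. A real throttler would more likely hold the message back and send it later instead of just answering "no".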