Akka :: Using messages with different priorities over the event stream in an ActorSystem

I need to publish messages of different types to the event stream, and those messages should have different priorities. For example, if 10 messages of type A have been posted, and one message of type B is posted after them, and the priority of B is higher than the priority of A, then message B should be picked up by the next actor even if there are 10 messages of type A in the queue.
I have read about prioritized messages here and created my simple implementation of that mailbox:
class PrioritizedMailbox(settings: Settings, cfg: Config) extends UnboundedPriorityMailbox(
  PriorityGenerator {
    case ServerPermanentlyDead => println("Priority:0"); 0
    case ServerDead            => println("Priority:1"); 1
    case _                     => println("Default priority"); 10
  }
)
Then I configured it in application.conf:
akka {
  actor {
    prio-dispatcher {
      type = "Dispatcher"
      mailbox-type = "mailbox.PrioritizedMailbox"
    }
  }
}
and wired it into my actor:
private val myActor = actors.actorOf(
  Props[MyEventHandler[T]].
    withRouter(RoundRobinRouter(HIVE)).
    withDispatcher("akka.actor.prio-dispatcher").
    withCreator(
      new Creator[Actor] {
        def create() = new MyEventHandler(storage)
      }), name = "eventHandler")
I'm using ActorSystem.eventStream.publish to send messages, and my actor is subscribed to it (I can see in the logs that messages are processed, but in FIFO order).
However, this does not seem to be enough, because in the logs/console I have never seen messages like "Default priority". Am I missing something here? Does the described approach work with event streams, or only when sending messages to an actor directly? And how do I get prioritized messages with the eventStream?

Your problem is that your actors are insanely fast, so messages get processed before they have time to queue up, which means there is no prioritization for the mailbox to do. The example below proves the point:
trait Foo
case object X extends Foo
case object Y extends Foo
case object Z extends Foo
class PrioritizedMailbox(settings: ActorSystem.Settings, cfg: Config)
  extends UnboundedPriorityMailbox(
    PriorityGenerator {
      case X ⇒ 0
      case Y ⇒ 1
      case Z ⇒ 2
      case _ ⇒ 10
    })
val s = ActorSystem("prio", com.typesafe.config.ConfigFactory.parseString(
  """ prio-dispatcher {
        type = "Dispatcher"
        mailbox-type = "%s"
      }""".format(classOf[PrioritizedMailbox].getName)))
val latch = new java.util.concurrent.CountDownLatch(1)
val a = s.actorOf(Props(new akka.actor.Actor {
  latch.await // just wait here so that the messages are queued up inside the mailbox
  def receive = {
    case any ⇒ /*println("Processing: " + any);*/ sender ! any
  }
}).withDispatcher("prio-dispatcher"))
implicit val sender = testActor
a ! "pig"
a ! Y
a ! Z
a ! Y
a ! X
a ! Z
a ! X
a ! "dog"
latch.countDown()
Seq(X, X, Y, Y, Z, Z, "pig", "dog") foreach { x => expectMsg(x) }
s.shutdown()
This test passes with flying colors
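For what it's worth, the same prioritization applies when messages arrive via the event stream, because the priority ordering lives in the subscribing actor's mailbox, not in the publishing mechanism. A minimal sketch (assuming the prio-dispatcher configuration from the question and the Foo/X/Y types above; the Handler name is purely illustrative):
import akka.actor._

// Illustrative: an actor on the prioritized dispatcher, subscribed to the event stream.
class Handler extends Actor {
  def receive = {
    case msg => println("Handling: " + msg)
  }
}

val system = ActorSystem("prio-stream")
val handler = system.actorOf(
  Props[Handler].withDispatcher("akka.actor.prio-dispatcher"),
  name = "eventHandler")

// Subscribe the handler to the event classes we care about...
system.eventStream.subscribe(handler, classOf[Foo])

// ...and publish. Reordering by priority only becomes visible once messages
// pile up in the handler's mailbox faster than they are processed.
system.eventStream.publish(Y)
system.eventStream.publish(X)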

Related

Event Sourcing: How to combine divergent states?

Suppose:
The events are A perceived, B perceived or Ping perceived.
A possible sequence of events could be A,A,A,B,Ping.
The states are InA, InB, PingMissing.
The rules are
No Ping in all events -> PingMissing.
A -> InA
B -> InB
(Only Ping events -> InA)
I would like to have one recommended action/state.
I see three possibilities for the transition function f(s,e)->s:
1. Create a pseudo event like PingMissing perceived. Hence everything is in one function.
2. Two separate transition functions and combining the result.
3. One transition function with two states as a tuple and combining the result.
Any thoughts? Best practices?
Implementation of 2. in F# (language doesn't really matter):
type Event =
    | A
    | B
    | Ping

type State1 =
    | InA
    | InB

type State2 =
    | PingReceived
    | PingMissing

type StateCombined =
    | InA'
    | InB'
    | PingMissing'

let f1 s e : State1 =
    match s, e with
    | _, A -> InA
    | _, B -> InB
    | _, _ -> s

let f2 s e : State2 =
    match s, e with
    | _, Ping -> PingReceived
    | _, _ -> s

let fCombined events =
    let finalState1 = events |> Seq.fold f1 InA
    let finalState2 = events |> Seq.fold f2 PingMissing
    match finalState1, finalState2 with
    | _, PingMissing -> PingMissing'
    | InA, _ -> InA'
    | InB, _ -> InB'

fCombined [A;A;A;B]
// PingMissing'
fCombined [A;A;A;B;Ping]
// InB'
I would tend to model the unified state as a tuple of the two substates (broadly in this case: "has a ping been received" and "if a ping has been received, was the last perception an A or a B"). A convenience function can then distill that into a recommendation.
This has the advantage of not traversing the sequence of events more than once, so it is a bit more compatible with a view of the events as a stream: at the very least this means not having to refetch the events from an event store or keep the entire sequence of events in memory.
For example, in Scala (and explicitly modeling the situation where no A nor B has been perceived yet):
sealed trait Event
case object A extends Event
case object B extends Event
case object Ping extends Event

sealed trait PingState
case object PingReceived extends PingState // Don't strictly need...
case object PingMissing extends PingState

sealed trait LastPerceivedState
case object InA extends LastPerceivedState
case object InB extends LastPerceivedState

// ... could just as well be (Option[PingMissing], Option[LastPerceivedState])...
type State = (PingState, Option[LastPerceivedState])

// ... in which case, this is (Some(PingMissing), None)
val InitialState: State = PingMissing -> None

def distilledState(state: State): Either[PingMissing.type, Option[LastPerceivedState]] =
  state match {
    case (PingMissing, _) => Left(PingMissing)
    case (_, lpsOpt)      => Right(lpsOpt)
  }
The transition function could then be written directly (taking advantage of the fact that the events can be partitioned into events which affect PingState or LastPerceivedState but never both):
val transitionF = { (state: State, evt: Event) =>
  val (ps, lpsOpt) = state
  evt match {
    case A    => ps -> Some(InA)
    case B    => ps -> Some(InB)
    case Ping => PingReceived -> lpsOpt
  }
}
In the event that there are events which affect both, then decomposing into subhandlers might simplify the code (at the expense of some possibly redundant invocations):
val pingStateTransition = { (ps: PingState, evt: Event) =>
  if (ps == PingReceived) PingReceived
  else if (evt == Ping) PingReceived
  else ps
}

val lastPerceivedStateTransition = { (lpsOpt: Option[LastPerceivedState], evt: Event) =>
  evt match {
    case A => Some(InA)
    case B => Some(InB)
    case _ => lpsOpt
  }
}

val transitionF = { (state: State, evt: Event) =>
  pingStateTransition(state._1, evt) -> lastPerceivedStateTransition(state._2, evt)
}
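A quick usage sketch (using only the definitions above): fold an event list into the combined tuple state, then distill a recommendation:
// Fold the events into the tuple state, then distill the single recommendation.
val events: List[Event] = List(A, A, A, B)

distilledState(events.foldLeft(InitialState)(transitionF))
// Left(PingMissing): no Ping was ever perceived

distilledState((events :+ Ping).foldLeft(InitialState)(transitionF))
// Right(Some(InB)): a Ping arrived and the last perception was B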

Using par map to increase performance

The code below compares users pairwise and writes the results to a file. I've removed some code to keep it as concise as possible, but speed is an issue in this code as well:
import scala.collection.JavaConversions._

object writedata {
  def getDistance(str1: String, str2: String) = {
    val zipped = str1.zip(str2)
    val numberOfEqualSequences = zipped.count(_ == ('1', '1')) * 2
    val p = zipped.count(_ == ('1', '1')).toFloat * 2
    val q = zipped.count(_ == ('1', '0')).toFloat * 2
    val r = zipped.count(_ == ('0', '1')).toFloat * 2
    val s = zipped.count(_ == ('0', '0')).toFloat * 2
    (q + r) / (p + q + r)
  }                                              //> getDistance: (str1: String, str2: String)Float

  case class UserObj(id: String, nCoordinate: String)

  val userList = new java.util.ArrayList[UserObj] //> userList : java.util.ArrayList[writedata.UserObj] = []
  for (a <- 1 to 100) {
    userList.add(new UserObj("2", "101010"))
  }

  def using[A <: { def close(): Unit }, B](param: A)(f: A => B): B =
    try { f(param) } finally { param.close() }   //> using: [A <: AnyRef{def close(): Unit}, B](param: A)(f: A => B)B

  def appendToFile(fileName: String, textData: String) =
    using(new java.io.FileWriter(fileName, true)) { fileWriter =>
      using(new java.io.PrintWriter(fileWriter)) { printWriter =>
        printWriter.println(textData)
      }
    }                                            //> appendToFile: (fileName: String, textData: String)Unit

  var counter = 0;                               //> counter : Int = 0

  for (xUser <- userList.par) {
    userList.par.map(yUser => {
      if (!xUser.id.isEmpty && !yUser.id.isEmpty)
        synchronized {
          appendToFile("c:\\data-files\\test.txt",
            getDistance(xUser.nCoordinate, yUser.nCoordinate).toString)
        }
    })
  }
}
The above code was previously an imperative solution, so the .par functionality was within an inner and an outer loop. I'm attempting to convert it to a more functional implementation while also taking advantage of Scala's parallel collections framework.
In this example the data set size is 100, but in the code I'm working on the size is 8000, which translates to 64,000,000 comparisons. I'm using a synchronized block so that multiple threads are not writing to the same file at the same time. A performance improvement I'm considering is populating a separate collection within the inner loop (userList.par.map(yUser => {) and then writing that collection out to a separate file.
Are there other methods I can use to improve performance, so that I can handle a list that contains 8000 items instead of the 100 in the example above?
I'm not sure if you removed too much code for clarity, but from what I can see, there is absolutely nothing that can run in parallel since the only thing you are doing is writing to a file.
EDIT:
One thing that you should do is to move the getDistance(...) computation before the synchronized call to appendToFile, otherwise your parallelized code ends up being sequential.
Instead of calling a synchronized appendToFile, I would call appendToFile in a non-synchronized way, but have each call to that method add the new line to some synchronized queue. Then I would have another thread that flushes that queue to disk periodically. But then you would also need to add something to make sure that the queue is also flushed when all computations are done. So that could get complicated...
Alternatively, you could also keep your code and simply drop the synchronization around the call to appendToFile. It seems that println itself is synchronized. However, that would be risky since println is not officially synchronized and it could change in future versions.
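Putting the first suggestion together with the "collect, then write" idea from the question, a rough sketch would be to do all the getDistance work in parallel, materialize the results, and only then write the file from a single thread. This reuses the question's userList, getDistance, using helpers and file path; note that for 8000 users it still keeps all 64 million result strings in memory, so batching per xUser (or the queue approach above) may be preferable:
// Compute the distances in parallel, then write sequentially - no synchronized needed.
val distances =
  (for {
    xUser <- userList.par
    yUser <- userList
    if !xUser.id.isEmpty && !yUser.id.isEmpty
  } yield getDistance(xUser.nCoordinate, yUser.nCoordinate).toString).seq

// Single-threaded write of the collected results.
using(new java.io.PrintWriter(new java.io.FileWriter("c:\\data-files\\test.txt", true))) {
  printWriter => distances.foreach(line => printWriter.println(line))
}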

Class set method in Haskell using State-Monad

I've recently had a look at Haskell's State monad. I've been able to create functions that operate with this monad, but I'm trying to encapsulate the behavior in a class; basically, I'm trying to replicate in Haskell something like this:
class A {
public:
    int x;
    void setX(int newX) {
        x = newX;
    }
    int getX() {
        return x;
    }
};
I would be very grateful if anyone can help with this. Thanks!
I would start off by noting that Haskell, to say the least, does not encourage traditional OO-style development; instead, it has features and characteristics that lend themselves well to the sort of 'pure functional' manipulation that you won't really find in many other languages. The short of it is that trying to 'bring over' concepts from other (traditional) languages can often be a very bad idea.
but I'm trying to encapsulate the behavior into a class
Hence, my first major question that comes to mind is why? Surely you must want to do something with this (traditional OO concept of a) class?
If an approximate answer to this question is: "I'd like to model some sort of data construct", then you'd be better off working with something like
data A = A { xval :: Int }
> let obj1 = A 5
> xval obj1
5
> let obj2 = obj1 { xval = 10 }
> xval obj2
10
Which demonstrates pure, immutable data structures, along with 'getter' functions and non-destructive updates (utilizing record update syntax). This way, you'd do whatever work you need to do as some combination of functions mapping these 'data constructs' to new data constructs, as appropriate.
Now, if you absolutely needed some sort of model of State (and indeed, answering this question requires a bit of experience in knowing exactly what local versus global state is), only then would you delve into using the State Monad, with something like:
module StateGame where

import Control.Monad.State

-- Example use of State monad
-- Passes a string of dictionary {a,b,c}
-- Game is to produce a number from the string.
-- By default the game is off, a 'c' toggles the
-- game on and off. An 'a' gives +1 and a 'b' gives -1.
-- E.g.
-- 'ab'    = 0
-- 'ca'    = 1
-- 'cabca' = 0

-- State = game is on or off & current score
--       = (Bool, Int)

type GameValue = Int
type GameState = (Bool, Int)

playGame :: String -> State GameState GameValue
playGame [] = do
    (_, score) <- get
    return score
playGame (x:xs) = do
    (on, score) <- get
    case x of
         'a' | on -> put (on, score + 1)
         'b' | on -> put (on, score - 1)
         'c'      -> put (not on, score)
         _        -> put (on, score)
    playGame xs

startState = (False, 0)

main = print $ evalState (playGame "abcaaacbbcabbab") startState
(shamelessly lifted from this tutorial). Note the use of the analogous 'pure immutable data structures' within the context of a state monad, in addition to 'put' and 'get' monadic functions, which facilitate access to the state contained within the State Monad.
Ultimately, I'd suggest you ask yourself: what is it that you really want to accomplish with this model of an (OO) class? Haskell is not your typical OO-language, and trying to map concepts over 1-to-1 will only frustrate you in the short (and possibly long) term. This should be a standard mantra, but I'd highly recommend learning from the book Real World Haskell, where the authors are able to delve into far more detailed 'motivation' for picking any one tool or abstraction over another. If you were adamant, you could model traditional OO constructs in Haskell, but I wouldn't suggest going about doing this - unless you have a really good reason for doing so.
It takes a bit of permuting to transform imperative code into a purely functional context.
A setter mutates an object. We're not allowed to do that directly in Haskell because of laziness and purity.
Perhaps, if we transcribe the State monad to another language, it'll be more apparent. Your code is in C++, but because I at least want garbage collection, I'll use Java here.
Since Java never got around to defining anonymous functions, we'll first define an interface for pure functions.
public interface Function<A, B> {
    B apply(A a);
}
Then we can make up a pure immutable pair type.
public final class Pair<A, B> {
    private final A a;
    private final B b;

    public Pair(A a, B b) {
        this.a = a;
        this.b = b;
    }

    public A getFst() { return a; }
    public B getSnd() { return b; }

    public static <A, B> Pair<A, B> make(A a, B b) {
        return new Pair<A, B>(a, b);
    }

    public String toString() {
        return "(" + a + ", " + b + ")";
    }
}
With those in hand, we can actually define the State monad:
public abstract class State<S, A> implements Function<S, Pair<A, S>> {

    // pure takes a value a and yields a state action that takes a state s,
    // leaves it untouched, and returns your a paired with it.
    public static <S, A> State<S, A> pure(final A a) {
        return new State<S, A>() {
            public Pair<A, S> apply(S s) {
                return new Pair<A, S>(a, s);
            }
        };
    }

    // we can also read the state as a state action.
    public static <S> State<S, S> get() {
        return new State<S, S>() {
            public Pair<S, S> apply(S s) {
                return new Pair<S, S>(s, s);
            }
        };
    }

    // we can compose state actions
    public <B> State<S, B> bind(final Function<A, State<S, B>> f) {
        return new State<S, B>() {
            public Pair<B, S> apply(S s0) {
                Pair<A, S> p = State.this.apply(s0);
                return f.apply(p.getFst()).apply(p.getSnd());
            }
        };
    }

    // we can also replace the state as a state action.
    public static <S> State<S, S> put(final S newS) {
        return new State<S, S>() {
            public Pair<S, S> apply(S s) {
                return new Pair<S, S>(s, newS);
            }
        };
    }
}
Now, there does exist a notion of a getter and a setter inside of the state monad. These are called lenses. The basic presentation in Java would look like:
public abstract class Lens<A, B> {
    public abstract B get(A a);
    public abstract A set(B b, A a);
    // ... followed by a whole bunch of concrete methods.
}
The idea is that a lens provides access to a getter that knows how to extract a B from an A, and a setter that knows how to take a B and some old A, and replace part of the A, yielding a new A. It can't mutate the old one, but it can construct a new one with one of the fields replaced.
I gave a talk on these at a recent Boston Area Scala Enthusiasts meeting. You can watch the presentation here.
To come back to Haskell, rather than talking about things in an imperative setting, we can import
import Data.Lens.Lazy
from comonad-transformers or one of the other lens libraries mentioned here. That link provides the laws that must be satisfied to be a valid lens.
And then what you are looking for is some data type like:
data A = A { x_ :: Int }
with a lens
x :: Lens A Int
x = lens x_ (\b a -> a { x_ = b })
Now you can write code that looks like
postIncrement :: State A Int
postIncrement = do
    old_x <- access x
    x ~= (old_x + 1)
    return old_x
using the combinators from Data.Lens.Lazy.
The other lens libraries mentioned above provide similar combinators.
First of all, I agree with Raeez that this is probably the wrong way to go, unless you really know why! If you want to increase some value by 42 (say), why not write a function that does that for you?
It's quite a change from the traditional OO mindset where you have boxes with values in them and you take them out, manipulate them and put them back in. I would say that until you start noticing the pattern "Hey! I always take some value as an argument, and at the end return it slightly modified, tupled with some other value and all the plumbing is getting messy!" you don't really need the State monad. Part of the fun of (learning) Haskell is finding new ways to get around the stateful OO thinking!
That said, if you absolutely want a box with an x of type Int in it, you could try making your own get and put variants, something like this:
import Control.Monad.State

data Point = Point { x :: Int, y :: Int } deriving Show

getX :: State Point Int
getX = get >>= return . x

putX :: Int -> State Point ()
putX newVal = do
    pt <- get
    put (pt { x = newVal })

increaseX :: State Point ()
increaseX = do
    x <- getX
    putX (x + 1)

test :: State Point Int
test = do
    x1 <- getX
    increaseX
    x2 <- getX
    putX 7
    x3 <- getX
    return $ x1 + x2 + x3
Now, if you evaluate runState test (Point 2 9) in ghci, you'll get back (12,Point {x = 7, y = 9}) (since 12 = 2 + (2+1) + 7 and the x in the state gets set to 7 at the end). If you don't care about the returned point, you can use evalState and you'll get just the 12.
There are also more advanced things to do here, like abstracting out Point with a typeclass in case you have multiple data structures which have something that behaves like x, but that's better left for another question, in my opinion!

Scala stateful actor, recursive calling faster than using vars?

Sample code below. I'm a little curious why MyActor is faster than MyActor2. MyActor recursively calls process/react and keeps state in the function parameters whereas MyActor2 keeps state in vars. MyActor even has the extra overhead of tupling the state but still runs faster. I'm wondering if there is a good explanation for this or if maybe I'm doing something "wrong".
I realize the performance difference is not significant but the fact that it is there and consistent makes me curious what's going on here.
Ignoring the first two runs as warmup, I get:
MyActor:
559
511
544
529
vs.
MyActor2:
647
613
654
610
import scala.actors._

object Const {
  val NUM = 100000
  val NM1 = NUM - 1
}

trait Send[MessageType] {
  def send(msg: MessageType)
}

// Test 1 using recursive calls to maintain state
abstract class StatefulTypedActor[MessageType, StateType](val initialState: StateType) extends Actor with Send[MessageType] {
  def process(state: StateType, message: MessageType): StateType

  def act = proc(initialState)

  def send(message: MessageType) = {
    this ! message
  }

  private def proc(state: StateType) {
    react {
      case msg: MessageType => proc(process(state, msg))
    }
  }
}

object MyActor extends StatefulTypedActor[Int, (Int, Long)]((0, 0)) {
  override def process(state: (Int, Long), input: Int) = input match {
    case 0 =>
      (1, System.currentTimeMillis())
    case input: Int =>
      state match {
        case (Const.NM1, start) =>
          println((System.currentTimeMillis() - start))
          (Const.NUM, start)
        case (s, start) =>
          (s + 1, start)
      }
  }
}

// Test 2 using vars to maintain state
object MyActor2 extends Actor with Send[Int] {
  private var state = 0
  private var strt = 0: Long

  def send(message: Int) = {
    this ! message
  }

  def act =
    loop {
      react {
        case 0 =>
          state = 1
          strt = System.currentTimeMillis()
        case input: Int =>
          state match {
            case Const.NM1 =>
              println((System.currentTimeMillis() - strt))
              state += 1
            case s =>
              state += 1
          }
      }
    }
}

// main: Run testing
object TestActors {
  def main(args: Array[String]): Unit = {
    val a = MyActor
    // val a = MyActor2
    a.start()
    testIt(a)
  }

  def testIt(a: Send[Int]) {
    for (_ <- 0 to 5) {
      for (i <- 0 to Const.NUM) {
        a send i
      }
    }
  }
}
EDIT: Based on Vasil's response, I removed the loop and tried it again. And then MyActor2 based on vars leapfrogged and now might be around 10% or so faster. So... the lesson is: if you are confident that you won't end up with a stack-overflowing backlog of messages, and you want to squeeze out every little bit of performance, don't use loop and just call the act() method recursively.
Change for MyActor2:
override def act() =
  react {
    case 0 =>
      state = 1
      strt = System.currentTimeMillis()
      act()
    case input: Int =>
      state match {
        case Const.NM1 =>
          println((System.currentTimeMillis() - strt))
          state += 1
        case s =>
          state += 1
      }
      act()
  }
Such results are caused by the specifics of your benchmark (a lot of small messages that fill the actor's mailbox more quickly than it can handle them).
Generally, the workflow of react is as follows:
1. The actor scans the mailbox;
2. If it finds a message, it schedules the execution;
3. When the scheduling completes, or when there are no messages in the mailbox, the actor suspends (Actor.suspendException is thrown).
In the first case, when the handler finishes processing the message, execution proceeds straight to the react method, and, as long as there are lots of messages in the mailbox, the actor immediately schedules the next message to execute, and only after that suspends.
In the second case, loop schedules the execution of react in order to prevent a stack overflow (which might be your case with Actor #1, because tail recursion in process is not optimized), and thus execution doesn't proceed to react immediately, as in the first case. That's where the millis are lost.
UPDATE (taken from here):
Using loop instead of recursive react effectively doubles the number of tasks that the thread pool has to execute in order to accomplish the same amount of work, which in turn makes it so any overhead in the scheduler is far more pronounced when using loop.
Just a wild stab in the dark: it might be due to the exception thrown by react in order to escape the loop. Exception creation is quite heavy. However, I don't know how often it does that, but that should be possible to check with a catch and a counter.
The overhead on your test depends heavily on the number of threads that are present (try using only one thread with scala -Dactors.corePoolSize=1!). I'm finding it difficult to figure out exactly where the difference arises; the only real difference is that in one case you use loop and in the other you do not. loop does a fair bit of work, since it repeatedly creates function objects using "andThen" rather than iterating. I'm not sure whether this is enough to explain the difference, especially in light of the heavy usage by scala.actors.Scheduler$.impl and ExceptionBlob.

Slow Scala assert

We've been profiling our code recently and we've come across a few annoying hotspots. They're in the form
assert(a == b, a + " is not equal to " + b)
Because some of these asserts can be in code that is called a huge number of times, the string concatenation starts to add up. assert is defined as:
def assert(assumption : Boolean, message : Any) = ....
why isn't it defined as:
def assert(assumption : Boolean, message : => Any) = ....
That way it would evaluate lazily. Given that it's not defined that way, is there an inline way of calling assert with a message param that is evaluated lazily?
Thanks
Lazy evaluation also has some overhead for the function object that gets created. If your message object is already fully constructed (a static message), this overhead is unnecessary.
The appropriate method for your use case would be sprintf-style:
assert(a == b, "%s is not equal to %s", a, b)
As long as there is a specialized function
assert(Boolean, String, Any, Any)
this implementation has neither the laziness overhead nor the cost of the varargs array of
assert(Boolean, String, Any*)
which handles the general case.
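For illustration, such a specialized overload (plus a varargs fallback) could look like the following; this is not part of Predef, just a sketch of the signatures described above:
// Hypothetical helpers: message.format only runs when the assertion fails.
def assert(assumption: Boolean, message: String, arg1: Any, arg2: Any) {
  if (!assumption)
    throw new java.lang.AssertionError("assertion failed: " + message.format(arg1, arg2))
}

def assert(assumption: Boolean, message: String, args: Any*) {
  if (!assumption)
    throw new java.lang.AssertionError("assertion failed: " + message.format(args: _*))
}

// Usage, formatting nothing unless a != b:
// assert(a == b, "%s is not equal to %s", a, b)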
Overriding toString is evaluated lazily, but it is not readable:
assert(a == b, new { override def toString = a + " is not equal to " + b })
It is by-name; I changed it over a year ago.
http://www.scala-lang.org/node/825
Current Predef:
@elidable(ASSERTION)
def assert(assertion: Boolean, message: => Any) {
  if (!assertion)
    throw new java.lang.AssertionError("assertion failed: " + message)
}
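A quick way to convince yourself that the by-name parameter really skips the concatenation when the assertion holds (a throwaway check, not part of the original answer):
// The message expression is only forced when the assertion fails.
var built = false
assert(1 + 1 == 2, { built = true; "2 is not equal to 2" })
println(built) // false: the message was never constructed

try assert(1 + 1 == 3, { built = true; "2 is not equal to 3" })
catch { case _: AssertionError => () }
println(built) // true: the message was built only on failure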
Thomas' answer is great, but just in case you like the idea of the last answer but dislike the unreadability, you can get around it:
object LazyS {
  def apply(f: => String): AnyRef = new {
    override def toString = f
  }
}
Example:
object KnightSpeak {
  override def toString = { println("Turned into a string"); "Ni" }
}
scala> assert(true != false , LazyS("I say " + KnightSpeak))
scala> println( LazyS("I say " + KnightSpeak) )
Turned into a string
I say Ni
Try: assert(a == b, "%s is not equal to %s".format(a, b))
The format call should only happen when the assert needs the string. format is added to RichString via an implicit conversion.
