Chronicle Queue: How to read excerpts/documents with different WireKeys?

Assume a Chronicle Queue and a producer that writes two types of messages into the queue.
Each type of message is written with a different "WireKey".
// Writes: {key1: TestMessage}
appender.writeDocument(w -> w.write("key1").text("TestMessage"));
// Writes: {key2: AnotherTextMessage}
appender.writeDocument(w -> w.write("key2").text("AnotherTextMessage"));
Question:
How can I write a single-threaded consumer that can read both types of messages and handle them differently?
What I've tried:
// This can read both types of messages, but cannot
// tell which type a message belongs to.
tailer.readDocument(wire -> {
    wire.read().text();
});
// This only reads type "key1" messages, skips all "key2" messages.
tailer.readDocument(wire -> {
    wire.read("key1").text();
});
// This crashes. (because it advances the read position illegally?)
tailer.readDocument(wire -> {
    wire.read("key1").text();
    wire.read("key2").text();
});
I was hoping I could do something like wire.readKey() to get the WireKey of a document, and then proceed to read the document and handle it dynamically. How can I do this?
Note: I'm aware this can be accomplished using methodReader and methodWriter, and the documentation/demo seems to recommend that approach. But I'd prefer not to use that API and instead be explicit about reading and writing messages. I assume there has to be a way to accomplish this use case.
Thank you.

You are correct that, for example, MethodReader accomplishes this.
You can do it in two ways.
// a reused StringBuilder
StringBuilder sb = new StringBuilder();
wire.read(sb); // populate the StringBuilder
or a more convenient method is
String name = wire.readEvent(String.class);
switch (name) {
    case "key1":
        String text1 = wire.getValueIn().text();
        // do something with text1
        break;
    case "key2":
        String text2 = wire.getValueIn().text();
        // do something with text2
        break;
    default:
        // log the unexpected key
}
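Putting the two pieces together, a single-threaded consumer can look something like this (a minimal sketch; handleKey1/handleKey2 are hypothetical handlers, and readDocument returns false once no further document is present):
// keep polling until no more documents are available
while (tailer.readDocument(wire -> {
    String name = wire.readEvent(String.class);
    switch (name) {
        case "key1":
            handleKey1(wire.getValueIn().text()); // hypothetical handler for key1 messages
            break;
        case "key2":
            handleKey2(wire.getValueIn().text()); // hypothetical handler for key2 messages
            break;
        default:
            // log the unexpected key
            break;
    }
})) {
    // keep reading
}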
For other readers who don't know about MethodReader, the same messages can be written and read with
interface MyEvents {
    void key1(String text1);
    void key2(String text2);
}
MyEvents me = wire.methodWriter(MyEvents.class);
me.key1("text1");
me.key2("text2");
MyEvents me2 = new MyEvents() {
    public void key1(String text1) {
        // handle text1
    }
    public void key2(String text2) {
        // handle text2
    }
};
MethodReader reader = wire.methodReader(me2);
do {
} while (reader.readOne());
NOTE: The content is the same, so you can mix and match the two options
You can use a Chronicle Queue instead of a Wire to persist this information.
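For example, a minimal sketch against a queue, assuming a queue directory "queue-dir" and the MyEvents interface and me2 handler defined above:
try (ChronicleQueue queue = ChronicleQueue.singleBuilder("queue-dir").build()) {
    // write messages as method calls
    MyEvents writer = queue.acquireAppender().methodWriter(MyEvents.class);
    writer.key1("text1");
    writer.key2("text2");

    // read messages back; each readOne() dispatches one message to me2
    MethodReader reader = queue.createTailer().methodReader(me2);
    while (reader.readOne()) {
    }
}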

Related

Chronicle Queue reading any kind of message with readDocument

In the Chronicle Queue I have two types of messages written. I want to read these messages using the same tailer and, if possible, with the same method, for example using tailer.readDocument().
Does anyone know if this is possible? The message types are different kinds of objects and have no relationship to each other.
In my current reading logic I need to read all the entries of the queue, and the order is important. For example:
Queue
MessageA
MessageA
MessageB
I need to read MessageB only after the MessageA entries in this example, which is why I am looking for a method that reads all the entries independent of the message type.
The simplest approach is to write messages using a MethodWriter/MethodReader: https://github.com/OpenHFT/Chronicle-Queue#high-level-interface
You start by defining an asynchronous interface, where all methods have:
arguments which are only inputs
no return value or exceptions expected.
A simple asynchronous interface
import net.openhft.chronicle.wire.SelfDescribingMarshallable;
interface MessageListener {
    void method1(Message1 message);
    void method2(Message2 message);
}
static class Message1 extends SelfDescribingMarshallable {
    String text;
    public Message1(String text) {
        this.text = text;
    }
}
static class Message2 extends SelfDescribingMarshallable {
    long number;
    public Message2(long number) {
        this.number = number;
    }
}
To write to the queue you can call a proxy that implements this interface.
SingleChronicleQueue queue1 = ChronicleQueue.singleBuilder(path).build();
MessageListener writer1 = queue1.acquireAppender().methodWriter(MessageListener.class);
// call method on the interface to send messages
writer1.method1(new Message1("hello"));
writer1.method2(new Message2(234));
These calls produce messages which can be dumped as follows.
# position: 262568, header: 0
--- !!data #binary
method1: {
  text: hello
}
# position: 262597, header: 1
--- !!data #binary
method2: {
  number: !int 234
}
To read the messages, you can provide a reader which calls your implementation with the same calls that you made.
// a proxy which print each method called on it
MessageListener processor = ObjectUtils.printAll(MessageListener.class);
// a queue reader which turns messages into method calls.
MethodReader reader1 = queue1.createTailer().methodReader(processor);
assertTrue(reader1.readOne());
assertTrue(reader1.readOne());
assertFalse(reader1.readOne());
Running this example prints:
method1 [!Message1 {
  text: hello
}
]
method2 [!Message2 {
  number: 234
}
]
Nice, @PeterLawrey. Is there a different way to build the processor? In your example you print the objects, but I want to populate the two different types of objects. I haven't found a way so far to do this using the methods in the same listener.
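For reference, a minimal sketch of a listener implementation that populates your own objects instead of printing them (the list names are just illustrative; the message objects arrive already deserialized):
List<Message1> received1 = new ArrayList<>();
List<Message2> received2 = new ArrayList<>();

MessageListener processor = new MessageListener() {
    @Override
    public void method1(Message1 message) {
        received1.add(message); // message is already populated from the queue
    }

    @Override
    public void method2(Message2 message) {
        received2.add(message);
    }
};

MethodReader reader = queue1.createTailer().methodReader(processor);
while (reader.readOne()) {
    // each readOne() dispatches one queued message to processor
}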

Is it possible to make a map-only task execute in parallel in Apache Flink?

I'm using Flink to process some JSON-format streaming data:
{"uuid":"903493290432934", "bin": "68.3"}
{"uuid":"324938722984237", "bin": "56.8"}
...
My job is quite simple:
get stream from the Data Source ---> deserialize data into String ---> transform String to JSON object myJsonObj ---> double res = myJsonObj.get("bin") ---> do some heavy calculation with res.
Here is my code:
FlinkPravegaReader<String> source = ... // init source
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// transform String to MyJson
DataStream<MyJson> jsonStream = env.addSource(source).name("Pravega Stream")
        .map(new MapFunction<String, MyJson>() {
            @Override
            public MyJson map(String s) throws Exception {
                MyJson myJson = JSON.parseObject(s, MyJson.class);
                return myJson;
            }
        });
// do the heavy process
DataStream<String> heavyResult = jsonStream
        .map(new MapFunction<MyJson, String>() {
            @Override
            public String map(MyJson myJson) throws Exception {
                double res = myJson.get("bin");
                // do some very heavy calculation
                return myJson.get("uuid").asText() + " done.";
            }
        });
heavyResult.print();
As I understand it, I haven't used any keyBy/window, so I think windowAll is used by default. Am I right?
If so, the Flink docs tell me that windowAll cannot be run in parallel. Does that mean I have to do the heavy calculation one element at a time? I'm wondering whether it is possible to do the heavy calculation in parallel.
As you can see, in my case it doesn't seem that using keyBy/window makes any sense. So how can I make this case execute in parallel? Is it possible to have two jobs running together with the same Data Source, as below?
/----windowAll ---- do the heavy calculation
/
Data Source-
\
\----windowAll ---- do the heavy calculation
Is this design possible? Say the Data Source generates two elements, A and B. With this design, I'm expecting one windowAll to process A while the other windowAll processes B.
A keyed stream is used to create partitions in your data, so all the traffic for the same key is sent to the same taskmanager.
A window is used when you want to aggregate elements from the stream and compute them as a set for a given reason.
If your case does not fit either of those, you don't need them; a plain map without keyBy or window does not use windowAll, and it runs with whatever parallelism you configure.
To provide parallelism to the whole stream just use
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(3); // Notice you'll need 3 task manager slots available.
To define parallelism for a single operator (the heavy calculation) use:
DataStream<String> heavyResult = jsonStream
        .map(new MapFunction<MyJson, String>() {
            @Override
            public String map(MyJson myJson) throws Exception {
                double res = myJson.get("bin");
                // do some very heavy calculation
                return myJson.get("uuid").asText() + " done.";
            }
        }).setParallelism(3); // Notice you'll need 3 task manager slots available.
More info at https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/parallel.html

Copy data from one Chronicle to another

For backup purposes, I need to copy data from one Chronicle Queue to another.
Would it be safe to directly copy the whole Bytes object from the wire of one queue into another?
Something like
documentContext().wire().bytes().read(byte_buffer)
and then wrapping this byte_buffer into a byte_store and writing it as
documentContext().wire().bytes().write(byte_Store).
The reason I'm doing it this way is to avoid any conversion back and forth into custom objects.
You can, but a simpler approach is to copy directly from one to the other.
ChronicleQueue inQ = SingleChronicleQueueBuilder.binary("in").build();
ExcerptTailer tailer = inQ.createTailer();
ChronicleQueue outQ = SingleChronicleQueueBuilder.binary("out").build();
ExcerptAppender appender = outQ.acquireAppender();
while (true) {
    try (DocumentContext inDC = tailer.readingDocument()) {
        if (!inDC.isPresent()) {
            // no message available
            break; // or pause or do something else.
        }
        try (DocumentContext outDC = appender.writingDocument()) {
            outDC.wire().write(inDC.wire().bytes());
        }
    }
}

How to close Java Formatter, in finally or not?

I know that streams and formatters (particularly java.util.Formatter) in Java should normally be closed in a finally block to avoid resource leaks. But here I am a little confused, because I see a lot of examples where people just close them without any finally block, especially the formatters. This question may seem pointless to some people, but I want to be sure about what I am asking.
Some examples from java2s.com and from tutorialspoint.com where the formatters are just closed without any block.
Please consider that my question is only about Java 6 and lower versions, because I know about try-with-resources.
Example:
public static void main(String[] args) {
    StringBuffer buffer = new StringBuffer();
    Formatter formatter = new Formatter(buffer, Locale.US);
    // format a new string
    String name = "from java2s.com";
    formatter.format("Hello %s !", name);
    // print the formatted string
    System.out.println(formatter);
    // close the formatter
    formatter.close();
    // attempting to access the formatter now results in an exception
    System.out.println(formatter);
}
In this specific example, it is not necessary to call close(). You only need to close the formatter if the underlying appendable is Closeable. In this case you are using a StringBuffer, which is not Closeable, so the call to close() does nothing. If you were to use a Writer or PrintStream, those are Closeable and the call to close() would be necessary to avoid leaving the stream open.
If you are ever unsure whether it is Closeable, it is best to just call close() anyway. There is no harm in doing so.
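For contrast, a minimal sketch where close() does matter because the underlying target is a Writer (the file name out.txt is just an example):
// The target is a FileWriter, which is Closeable, so close() flushes
// and releases the underlying file handle.
Formatter fileFormatter = null;
try {
    fileFormatter = new Formatter(new FileWriter("out.txt"));
    fileFormatter.format("Hello %s !%n", "world");
} catch (IOException e) {
    // handle the error
} finally {
    if (fileFormatter != null) {
        fileFormatter.close(); // also closes the FileWriter
    }
}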
How about this, without further comments:
public static void main(String[] args) {
    StringBuffer buffer = new StringBuffer();
    Formatter formatter = null;
    try {
        formatter = new Formatter(buffer, Locale.US);
        String name = "from java2s.com";
        formatter.format("Hello %s !", name);
        System.out.println(formatter);
    } finally {
        if (formatter != null) {
            formatter.close();
        }
    }
}

FIFO queue synchronization

Should FIFO queue be synchronized if there is only one reader and one writer?
What do you mean by "synchronized"? If your reader & writer are in separate threads, you want the FIFO to handle the concurrency "correctly", including such details as:
proper use of FIFO API should never cause data structures to be corrupted
proper use of FIFO API should not cause deadlock (although there should be a mechanism for a reader to wait until there is something to read)
the objects read from the FIFO should be the same objects, in the same order, written to the FIFO (there shouldn't be missing objects or rearranged order)
there should be a bounded time (one would hope!) between when the writer puts something into the FIFO, and when it is available to the reader.
In the Java world there's a good book on this, Java Concurrency In Practice. There are multiple ways to implement a FIFO that handles concurrency correctly. The simplest implementations are blocking, more complex ones use non-blocking algorithms based on compare-and-swap instructions found on most processors these days.
Yes, if the reader and writer interact with the FIFO queue from different threads.
Depending on the implementation, but most likely yes. You don't want the reader to read partially written data.
Yes, unless its documentation explicitly says otherwise.
(It is possible to implement a specialized FIFO that doesn't need synchronization if there is only one reader and one writer thread, e.g. on Windows using InterlockedXXX functions.)
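In Java, a minimal sketch of a correct single-reader/single-writer FIFO using the standard library (the capacity and messages are just illustrative, and blocking semantics are assumed to be acceptable):
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SpscExample {
    public static void main(String[] args) throws InterruptedException {
        // A bounded, thread-safe FIFO; correct for one writer and one reader thread.
        BlockingQueue<String> fifo = new ArrayBlockingQueue<>(1024);

        Thread writer = new Thread(() -> {
            try {
                fifo.put("hello"); // blocks if the queue is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread reader = new Thread(() -> {
            try {
                String msg = fifo.take(); // blocks until something is available
                System.out.println(msg);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
    }
}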
Try this code for concurrent fifo usage:
import java.util.LinkedList;
import java.util.NoSuchElementException;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock.ReadLock;
import java.util.concurrent.locks.ReentrantReadWriteLock.WriteLock;

public class MyObjectQueue {
    private static final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private static final ReadLock readLock;
    private static final WriteLock writeLock;
    private static final LinkedList<MyObject> objects;

    static {
        readLock = lock.readLock();
        writeLock = lock.writeLock();
        objects = new LinkedList<MyObject>();
    }

    public static boolean put(MyObject p) {
        writeLock.lock();
        try {
            objects.push(p); // adds to the head of the list
            return objects.contains(p);
        } finally {
            writeLock.unlock();
        }
    }

    public static boolean remove(MyObject p) {
        writeLock.lock();
        try {
            return objects.remove(p);
        } finally {
            writeLock.unlock();
        }
    }

    public static boolean contains(MyObject p) {
        readLock.lock();
        try {
            return objects.contains(p);
        } finally {
            readLock.unlock();
        }
    }

    public static MyObject get() {
        MyObject o = null;
        writeLock.lock();
        try {
            o = objects.removeLast(); // take and remove the oldest element (FIFO order)
        } catch (NoSuchElementException nse) {
            // list is empty
        } finally {
            writeLock.unlock();
        }
        return o;
    }
}
