How to safely write to one file from many verticle instances in vert.x 3.2?

Instead of using a logger or a database server, I'd like to append information to one file from possibly many verticle instances.
There are versions of the file-writing methods that work asynchronously.
Can I assume that Vert.x handles the synchronisation between writes, so that they don't interfere with each other, when using those versions of the methods marked as "async"?
There seems to be a rule that one can rely on Vert.x to provide all isolation between concurrent processing out of the box. But is that true in the case of concurrent file access?
Could you please include a code snippet in the answer that shows how to open and write to one file from many verticle instances with the finest possible granularity, e.g. for logging requests?

I wouldn't recommend writing to a single file with many different "writers". For concurrent logging I would stick to the Single Writer principle.
Create a verticle which subscribes to the event bus and listens for messages to be logged. Let's call this verticle Logger; it listens to the address system.logger.
EventBus eb = vertx.eventBus();
eb.consumer("system.logger", message -> {
// write to file
});
Verticles that want to log something need to send a message to the Logger verticle:
eventBus.send("system.logger", "foobar");
Appending to an existing file works something like this (untested; in practice you would open the file once, e.g. in start(), and reuse the AsyncFile):
vertx.fileSystem().open("file.log", new OpenOptions().setCreate(true), result -> {
    if (result.succeeded()) {
        AsyncFile file = result.result();
        Buffer buff = Buffer.buffer(message); // message from the consumer
        file.write(buff, writePos, ar -> {    // writePos: a long field tracking the append offset
            if (ar.succeeded()) {
                System.out.println("done");
            } else {
                System.err.println("write failed: " + ar.cause());
            }
        });
        writePos += buff.length(); // safe: all handlers run on the same event loop
    } else {
        System.err.println("open file failed " + result.cause());
    }
});
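Putting the pieces together, a minimal Logger verticle could look like the sketch below (untested; the file name, the address system.logger, and the trailing newline are assumptions):

import io.vertx.core.AbstractVerticle;
import io.vertx.core.buffer.Buffer;
import io.vertx.core.file.AsyncFile;
import io.vertx.core.file.OpenOptions;

public class LoggerVerticle extends AbstractVerticle {

    private AsyncFile file;
    private long writePos = 0; // append offset; only touched on this verticle's event loop

    @Override
    public void start() {
        vertx.fileSystem().open("file.log", new OpenOptions().setCreate(true), result -> {
            if (result.succeeded()) {
                file = result.result();
                vertx.eventBus().consumer("system.logger", message -> {
                    Buffer buff = Buffer.buffer(message.body().toString() + "\n");
                    file.write(buff, writePos, ar -> {
                        if (ar.failed()) {
                            System.err.println("write failed: " + ar.cause());
                        }
                    });
                    writePos += buff.length();
                });
            } else {
                System.err.println("open file failed " + result.cause());
            }
        });
    }
}

Deploy exactly one instance of this verticle; any number of other verticle instances can then log concurrently via eventBus.send("system.logger", ...), and the writes are serialised by the Logger's event loop.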

Related

MassTransit - How to fault messages in a batch

I am trying to utilize the MassTransit batching technique to process multiple messages at once and reduce the individual queries to the database (read and write).
If there is an exception while processing one or more of the messages, the expectation is to fault only the required messages and still be able to process the rest.
This is a common scenario in my use case; what I am trying to establish here is a way to perform batch processing that caters for poisoned messages.
For example, if I have a batch size of 10 messages, 10 in the queue and 1 persistently fails, I still need a means of ensuring the other 9 can be processed successfully. It is fine if all 10 need to be returned to the queue and a subset re-consumed, but the poisoned message needs to be eliminated somehow. Does this requirement rule out the use of batching?
I have tried the following, however it did not solve my use case:
catching the exception and raising NotifyFaulted for that specific message;
modifying the sample-twitch application to throw an exception, something like below, based on the file https://github.com/MassTransit/Sample-Twitch/blob/master/src/Sample.Components/BatchConsumers/RoutingSlipBatchEventConsumer.cs
public Task Consume(ConsumeContext<Batch<RoutingSlipCompleted>> context)
{
    if (_logger.IsEnabled(LogLevel.Information))
    {
        _logger.Log(LogLevel.Information, "Routing Slips Completed: {TrackingNumbers}",
            string.Join(", ", context.Message.Select(x => x.Message.TrackingNumber)));
    }
    for (int i = 0; i < context.Message.Length; i++)
    {
        try
        {
            if (i % 2 != 0)
                throw new System.Exception("business error - message failed");
        }
        catch (System.Exception ex)
        {
            context.Message[i].NotifyFaulted(TimeSpan.Zero, "batch routing slip faulted", ex);
        }
    }
    return Task.CompletedTask;
}
I have dug into a few more threads that look similar to this issue; for reference:
Masstransit error handling for batch consumer
If you want to use batch, and have a message in that batch that cannot be processed, you should catch the exception and do something else with the poison message. You could write it someplace else, publish some type of event, or whatever else. But MassTransit does not allow you to partially complete/fault messages of a batch.

Is there a step that will not cause an error? bufferTimeout: "Could not emit buffer due to lack of requests"

API: Exception - reactor-core
The example works as follows:
the flow subscribes to a Rabbit queue; the incoming data is processed, which can take a relatively long time, and the result is sent to another Rabbit queue.
Because of the bulk processing I buffer 10 elements. If 10 elements do not arrive in time, I use a timeout (on the buffer) to release them for processing.
Problem: if the processing or the Rabbit publisher is slow, bufferTimeout receives no "request"; when the timeout runs out, bufferTimeout wants to emit anyway, and I get the following exception: "Could not emit buffer due to lack of requests".
Since I need all the data, I have ruled out onBackpressureDrop and onBackpressureLatest. Plain onBackpressureBuffer is no good either, because it does not forward the received request count upstream (it uses request(unbounded) instead of request(n)).
Example Kotlin code:
import java.time.Duration
import java.util.function.Function
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import reactor.core.publisher.Flux
import reactor.core.publisher.Mono

@Configuration
class SpringCloudStreamRabbitProcessor {
    @Bean
    fun rabbitFunc() = Function<Flux<Int>, Flux<Int>> {
        it.bufferTimeout(10, Duration.ofMinutes(1))
            .concatMap { intList ->
                // process the batch, then pass it on
                Mono.just(intList)
            }
            .flatMapIterable { intList ->
                intList
            }
    }
}
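For reference, the failure mode can be reproduced outside Spring Cloud Stream. The following is a minimal sketch (not from the question; plain reactor-core in Java) where a slow downstream stage with a prefetch of 1 starves bufferTimeout of demand, so a timeout-triggered emission fails with the quoted exception:

import java.time.Duration;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class BufferTimeoutDemo {
    public static void main(String[] args) throws InterruptedException {
        Flux.range(1, 100)
            .delayElements(Duration.ofMillis(50))      // steady trickle of input
            .bufferTimeout(10, Duration.ofMillis(200)) // buffers close on size or timeout
            // prefetch of 1 plus a 2 s inner delay simulates slow processing/publishing,
            // so no request is outstanding when the next timeout fires
            .concatMap(list -> Mono.just(list).delayElement(Duration.ofSeconds(2)), 1)
            .subscribe(
                list -> System.out.println("got " + list),
                err -> System.err.println(err)); // "Could not emit buffer due to lack of requests"
        Thread.sleep(10_000); // keep the JVM alive for the demo
    }
}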

How to read messages in batches with Spring's JmsTemplate

I get an OutOfMemoryError while reading messages from a queue with 2M messages,
and I am trying to find a way to read the messages 1000 at a time, for example.
Here is my code:
List<TextMessage> messages = jmsTemplate.browse(JndiQueues.BACKOUT, (session, browser) -> {
    Enumeration<?> browserEnumeration = browser.getEnumeration();
    List<TextMessage> messageList = new ArrayList<TextMessage>();
    while (browserEnumeration.hasMoreElements()) {
        messageList.add((TextMessage) browserEnumeration.nextElement());
    }
    return messageList;
});
thanks
Perform the browse on a different thread and pass result subsets to the main thread via a BlockingQueue<List<TextMessage>>, e.g. a LinkedBlockingQueue with a small capacity.
The browsing thread will block when the queue is full. When the main thread removes an entry from the queue, the browser can add a new one.
It probably makes sense to have a capacity of at least 2 so that the next list is available immediately.
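A rough sketch of that pattern (untested; the chunk size of 1000 and the empty list as an end-of-stream marker are assumptions, jmsTemplate and JndiQueues.BACKOUT as in the question):

import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.jms.TextMessage;

// capacity 2: the browser stays at most two chunks ahead of the consumer
BlockingQueue<List<TextMessage>> chunks = new LinkedBlockingQueue<>(2);
List<TextMessage> end = Collections.emptyList(); // end-of-stream marker

new Thread(() -> jmsTemplate.browse(JndiQueues.BACKOUT, (session, browser) -> {
    try {
        Enumeration<?> e = browser.getEnumeration();
        List<TextMessage> chunk = new ArrayList<>(1000);
        while (e.hasMoreElements()) {
            chunk.add((TextMessage) e.nextElement());
            if (chunk.size() == 1000) {
                chunks.put(chunk); // blocks while the main thread lags behind
                chunk = new ArrayList<>(1000);
            }
        }
        if (!chunk.isEmpty()) chunks.put(chunk);
        chunks.put(end);
    } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
    }
    return null;
})).start();

// main thread: consume one chunk at a time
// (take() throws InterruptedException; handle or declare it)
List<TextMessage> chunk;
while (!(chunk = chunks.take()).isEmpty()) {
    // process up to 1000 messages here
}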

http_listener cpprestsdk how to handle multiple POST requests

I have developed a client-server application with Casablanca (cpprestsdk).
Every 5 minutes each client sends information from its task manager (processes, CPU usage, etc.) to the server via a POST request.
The project should be able to manage about 100 clients.
Every time the server receives a POST request it opens an output file stream ("uploaded.txt"), extracts some initial info from the client (login, password), processes it, saves all the info in a file named after the client (for example client1.txt, client2.txt) in append mode, and finally replies to the client with a status code.
This is basically my POST handler code on the server side:
void Server::handle_post(http_request request)
{
    auto fileBuffer =
        std::make_shared<Concurrency::streams::basic_ostream<uint8_t>>();
    try
    {
        auto stream = concurrency::streams::fstream::open_ostream(
            U("uploaded.txt"),
            std::ios_base::out | std::ios_base::binary).then([request, fileBuffer](pplx::task<Concurrency::streams::basic_ostream<unsigned char>> Previous_task)
        {
            *fileBuffer = Previous_task.get();
            try
            {
                request.body().read_to_end(fileBuffer->streambuf()).get();
            }
            catch (const exception&)
            {
                wcout << L"<exception>" << std::endl;
                //return pplx::task_from_result();
            }
            //Previous_task.get().close();
        }).then([=](pplx::task<void> Previous_task)
        {
            fileBuffer->close();
            //Previous_task.get();
        }).then([](task<void> previousTask)
        {
            // This continuation is run because it is value-based.
            try
            {
                // The call to task::get rethrows the exception.
                previousTask.get();
            }
            catch (const exception& e)
            {
                wcout << e.what() << endl;
            }
        });
        //stream.get().close();
    }
    catch (const exception& e)
    {
        wcout << e.what() << endl;
    }
    ManageClient();
    request.reply(status_codes::OK, U("Hello, World!")).then([](pplx::task<void> t) { handle_error(t); });
    return;
}
Basically it works, but if I try to send info from two clients at the same time, sometimes it works and sometimes it doesn't.
Obviously the problem is where I open the "uploaded.txt" file stream.
Questions:
1) Is the Casablanca http_listener really multitasking? How many tasks is it able to handle?
2) I didn't find an example similar to mine in the documentation; the closest one is the "Casalence120" project, but it uses the Concurrency::reader_writer_lock class (which looks like a mutex approach).
What can I do in order to manage multiple POSTs?
3) Is it possible to read some client info before opening uploaded.txt?
I could then open an output file stream named directly after the client.
4) If I lock access to the uploaded.txt file via a mutex, the server becomes sequential, and I think that is not a good way to use cpprestsdk.
I'm still getting to grips with cpprestsdk, so any suggestions would be helpful.
Yes, the REST SDK processes every request on a different thread.
I confirm there are not many examples using the listener.
The official sample using the listener can be found here:
https://github.com/Microsoft/cpprestsdk/blob/master/Release/samples/CasaLens/casalens.cpp
I see you are working with VS. I would strongly suggest moving to VC++ 2015, or better VC++ 2017, because the most recent compilers support coroutines.
Using co_await dramatically simplifies the readability of the code.
Essentially, every time you co_await a function, the compiler refactors the code into a "continuation", avoiding the penalty of freezing the thread that executes the function. This way you get rid of the ".then" statements.
The file problem is a different story from the REST SDK. Accessing the file system concurrently is something you should test in a separate project. You could probably cache the first read and share the content with the other threads instead of accessing the disk every time.

routing files with zeromq (jeromq)

I'm trying to implement a "file dispatcher" on zmq (actually jeromq, I'd rather avoid jni).
What I need is to load balance incoming files to processors:
each file is handled only by one processor
files are potentially large so I need to manage the file transfer
Ideally I would like something like https://github.com/zeromq/filemq but
with a push/pull behaviour rather than publish/subscribe
being able to handle the received file rather than writing it to disk
My idea is to use a mix of taskvent/tasksink and asyncsrv samples.
Client side:
one PULL socket to be notified of a file to be processed
one DEALER socket to handle the (async) file transfer chunk by chunk
Server side:
one PUSH socket to dispatch incoming file (names)
one ROUTER socket to handle file requests
a few DEALER workers managing the file transfers for clients and connected to the router via an inproc proxy
My first question is: does this seem like the right way to go? Anything simpler maybe?
My second question is: my current implementation gets stuck on sending out the actual file data.
Clients are notified by the server and issue a request.
The server worker gets the request and writes the response back to the inproc queue, but the response never seems to leave the server (I can't see it in Wireshark) and the client is stuck on poller.poll awaiting the response.
It's not a matter of sockets being full and dropping data; I'm starting with very small files sent in one go.
Any insight?
Thanks!
==================
Following raffian's advice I simplified my code, removing the extra push/pull sockets (it does make sense now that you say it).
I'm left with the "non-working" sockets!
Here's my current code. It has many flaws that are out of scope for now (client ID, next chunk, etc.).
For now, I'm just trying to have the two sides talking to each other, roughly in this sequence:
Server
object FileDispatcher extends App
{
  val context = ZMQ.context(1)

  // server is the frontend that pushes filenames to clients and receives requests
  val server = context.socket(ZMQ.ROUTER)
  server.bind("tcp://*:5565")

  // backend handles clients requests
  val backend = context.socket(ZMQ.DEALER)
  backend.bind("inproc://backend")

  // files to dispatch given in arguments
  args.toList.foreach { filepath =>
    println(s"publish $filepath")
    server.send("newfile".getBytes(), ZMQ.SNDMORE)
    server.send(filepath.getBytes(), 0)
  }

  // multithreaded server: router hands out requests to DEALER workers via an inproc queue
  val NB_WORKERS = 1
  val workers = List.fill(NB_WORKERS)(new Thread(new ServerWorker(context)))
  workers foreach (_.start)
  ZMQ.proxy(server, backend, null)
}

class ServerWorker(ctx: ZMQ.Context) extends Runnable
{
  override def run()
  {
    val worker = ctx.socket(ZMQ.DEALER)
    worker.connect("inproc://backend")
    while (true)
    {
      val zmsg = ZMsg.recvMsg(worker)
      zmsg.pop // drop inner queue envelope (?)
      val cmd = zmsg.pop // cmd is used to continue/stop
      cmd.toString match {
        case "get" =>
          val file = zmsg.pop.toString
          println(s"clientReq: cmd: $cmd , file:$file")
          // 1- brute force: ignore cmd and send full file in one go!
          worker.send("eof".getBytes, ZMQ.SNDMORE) // header indicates this is the last chunk
          val bytes = io.Source.fromFile(file).mkString("").getBytes // dirty read, for testing only!
          worker.send(bytes, 0)
          println(s"${bytes.size} bytes sent for $file: " + new String(bytes))
        case x => println("cmd " + x + " not implemented!")
      }
    }
  }
}
Client
object FileHandler extends App
{
  val context = ZMQ.context(1)

  // client is notified of new files then fetches file from server
  val client = context.socket(ZMQ.DEALER)
  client.connect("tcp://*:5565")

  val poller = new ZMQ.Poller(1) // "poll" responses
  poller.register(client, ZMQ.Poller.POLLIN)

  while (true)
  {
    poller.poll
    val zmsg = ZMsg.recvMsg(client)
    val cmd = zmsg.pop
    val data = zmsg.pop
    // header is the command/action
    cmd.toString match {
      case "newfile" => startDownload(data.toString) // message content is the filename to fetch
      case "chunk" => gotChunk(data.toString, zmsg.pop.getData) // filename, chunk
      case "eof" => endDownload(data.toString, zmsg.pop.getData) // filename, last chunk
    }
  }

  def startDownload(filename: String)
  {
    println("got notification: start download for " + filename)
    client.send("get".getBytes, ZMQ.SNDMORE) // command header
    client.send(filename.getBytes, 0)
  }

  def gotChunk(filename: String, bytes: Array[Byte])
  {
    println("got chunk for " + filename + ": " + new String(bytes)) // callback the user here
    client.send("next".getBytes, ZMQ.SNDMORE)
    client.send(filename.getBytes, 0)
  }

  def endDownload(filename: String, bytes: Array[Byte])
  {
    println("got eof for " + filename + ": " + new String(bytes)) // callback the user here
  }
}
On the client, you don't need PULL with DEALER.
DEALER is PUSH and PULL combined, so use DEALER only; your code will be simpler.
The same goes for the server: unless you're doing something special, you don't need PUSH with ROUTER, because ROUTER is bidirectional.
the server worker gets the request, and writes the response back to
the inproc queue but the response never seems to go out of the server
(can't see it in wireshark) and the client is stuck on the poller.poll
awaiting the response.
Code Problems
In the server, you're dispatching files with args.toList.foreach before starting the proxy; this is probably why nothing is leaving the server. Start the proxy first, then use it. Also, once you call ZMQ.proxy(..), the call blocks indefinitely, so you'll need a separate thread to send the filepaths.
The client may have an issue with the poller. The typical pattern for polling is:
ZMQ.Poller items = new ZMQ.Poller(1);
items.register(receiver, ZMQ.Poller.POLLIN);
while (true) {
    items.poll(TIMEOUT);
    if (items.pollin(0)) {
        message = receiver.recv(0);
    }
}
In the above code we 1) poll until the timeout, 2) then check for messages, and 3) if one is available, fetch it with receiver.recv(0). In your code, you poll and then drop straight into recv() without checking. You need to check whether the poller has messages for the polled socket before calling recv(); otherwise the receiver will hang if there are no messages.
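Applied to the client above, the receive loop could look like the following sketch (plain JeroMQ in Java; TIMEOUT and the dispatch body are placeholders, not part of the original code):

import org.zeromq.ZMQ;
import org.zeromq.ZMsg;

public class FileHandlerLoop {
    public static void main(String[] args) {
        final int TIMEOUT = 1000; // ms, arbitrary for this sketch
        ZMQ.Context context = ZMQ.context(1);
        ZMQ.Socket client = context.socket(ZMQ.DEALER);
        client.connect("tcp://localhost:5565");

        ZMQ.Poller poller = new ZMQ.Poller(1);
        poller.register(client, ZMQ.Poller.POLLIN);

        while (!Thread.currentThread().isInterrupted()) {
            poller.poll(TIMEOUT);
            if (poller.pollin(0)) { // only receive when data is actually pending
                ZMsg zmsg = ZMsg.recvMsg(client);
                String cmd = zmsg.popString();
                String data = zmsg.popString();
                System.out.println(cmd + ": " + data);
                // dispatch on cmd ("newfile", "chunk", "eof") as in the original client
            }
            // no pending message: loop and poll again instead of blocking in recv
        }
    }
}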
