routing files with zeromq (jeromq)

I'm trying to implement a "file dispatcher" on ZMQ (actually jeromq, I'd rather avoid JNI).
What I need is to load-balance incoming files to processors:
- each file is handled by only one processor
- files are potentially large, so I need to manage the file transfer
Ideally I would like something like https://github.com/zeromq/filemq, but:
- with a push/pull behaviour rather than publish/subscribe
- being able to handle the received file rather than writing it to disk
My idea is to use a mix of the taskvent/tasksink and asyncsrv samples.
Client side:
- one PULL socket to be notified of a file to be processed
- one DEALER socket to handle the (async) file transfer chunk by chunk
Server side:
- one PUSH socket to dispatch incoming file names
- one ROUTER socket to handle file requests
- a few DEALER workers managing the file transfers for clients, connected to the router via an inproc proxy
My first question is: does this seem like the right way to go? Anything simpler maybe?
My second question is: my current implementation gets stuck on sending out the actual file data.
Clients are notified by the server and issue a request.
The server worker gets the request and writes the response back to the inproc queue, but the response never seems to leave the server (I can't see it in Wireshark) and the client is stuck on poller.poll awaiting the response.
It's not a matter of sockets being full and dropping data; I'm starting with very small files sent in one go.
Any insight?
Thanks!
==================
Following raffian's advice I simplified my code, removing the extra PUSH/PULL socket (it does make sense now that you say it).
I'm left with the "non-working" socket!
Here's my current code. It has many flaws that are out of scope for now (client ID, next chunk, etc.).
For now, I'm just trying to get the two sides talking to each other roughly in that sequence.
Server
object FileDispatcher extends App
{
  val context = ZMQ.context(1)
  // server is the frontend that pushes filenames to clients and receives requests
  val server = context.socket(ZMQ.ROUTER)
  server.bind("tcp://*:5565")
  // backend handles client requests
  val backend = context.socket(ZMQ.DEALER)
  backend.bind("inproc://backend")
  // files to dispatch, given as arguments
  args.toList.foreach { filepath =>
    println(s"publish $filepath")
    server.send("newfile".getBytes(), ZMQ.SNDMORE)
    server.send(filepath.getBytes(), 0)
  }
  // multithreaded server: the router hands out requests to DEALER workers via an inproc queue
  val NB_WORKERS = 1
  val workers = List.fill(NB_WORKERS)(new Thread(new ServerWorker(context)))
  workers foreach (_.start)
  ZMQ.proxy(server, backend, null)
}
class ServerWorker(ctx: ZMQ.Context) extends Runnable
{
  override def run()
  {
    val worker = ctx.socket(ZMQ.DEALER)
    worker.connect("inproc://backend")
    while (true)
    {
      val zmsg = ZMsg.recvMsg(worker)
      zmsg.pop // drop inner queue envelope (?)
      val cmd = zmsg.pop // cmd is used to continue/stop
      cmd.toString match {
        case "get" =>
          val file = zmsg.pop.toString
          println(s"clientReq: cmd: $cmd, file: $file")
          // 1- brute force: ignore cmd and send the full file in one go!
          worker.send("eof".getBytes, ZMQ.SNDMORE) // header indicates this is the last chunk
          val bytes = io.Source.fromFile(file).mkString("").getBytes // dirty read, for testing only!
          worker.send(bytes, 0)
          println(s"${bytes.size} bytes sent for $file: " + new String(bytes))
        case x => println("cmd " + x + " not implemented!")
      }
    }
  }
}
Client
object FileHandler extends App
{
  val context = ZMQ.context(1)
  // client is notified of new files, then fetches each file from the server
  val client = context.socket(ZMQ.DEALER)
  client.connect("tcp://*:5565")
  val poller = new ZMQ.Poller(1) // "poll" responses
  poller.register(client, ZMQ.Poller.POLLIN)
  while (true)
  {
    poller.poll
    val zmsg = ZMsg.recvMsg(client)
    val cmd = zmsg.pop
    val data = zmsg.pop
    // header is the command/action
    cmd.toString match {
      case "newfile" => startDownload(data.toString) // message content is the filename to fetch
      case "chunk" => gotChunk(data.toString, zmsg.pop.getData) // filename, chunk
      case "eof" => endDownload(data.toString, zmsg.pop.getData) // filename, last chunk
    }
  }
  def startDownload(filename: String)
  {
    println("got notification: start download for " + filename)
    client.send("get".getBytes, ZMQ.SNDMORE) // command header
    client.send(filename.getBytes, 0)
  }
  def gotChunk(filename: String, bytes: Array[Byte])
  {
    println("got chunk for " + filename + ": " + new String(bytes)) // callback the user here
    client.send("next".getBytes, ZMQ.SNDMORE)
    client.send(filename.getBytes, 0)
  }
  def endDownload(filename: String, bytes: Array[Byte])
  {
    println("got eof for " + filename + ": " + new String(bytes)) // callback the user here
  }
}

On the client, you don't need PULL with DEALER.
DEALER is PUSH and PULL combined, so use DEALER only; your code will be simpler.
The same goes for the server: unless you're doing something special, you don't need PUSH with ROUTER, since ROUTER is bidirectional.
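For reference, a minimal DEALER-only client in plain jeromq might look like this (a sketch following the question's endpoint and frame layout; the class name is illustrative):
import org.zeromq.ZMQ;

// Sketch: one DEALER socket both receives the "newfile" notification
// and sends the "get" request, so no separate PULL socket is needed.
public class DealerOnlyClient {
    public static void main(String[] args) {
        ZMQ.Context ctx = ZMQ.context(1);
        ZMQ.Socket client = ctx.socket(ZMQ.DEALER);
        client.connect("tcp://localhost:5565");

        byte[] cmd = client.recv(0);      // e.g. "newfile"
        byte[] filename = client.recv(0); // the file to fetch
        client.send("get".getBytes(), ZMQ.SNDMORE); // reply on the same socket
        client.send(filename, 0);
    }
}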
the server worker gets the request, and writes the response back to
the inproc queue but the response never seems to leave the server
(can't see it in Wireshark) and the client is stuck on poller.poll
awaiting the response.
Code Problems
In the server, you're dispatching files with args.toList.foreach before starting the proxy; this is probably why nothing is leaving the server. Start the proxy first, then use it. Also, once you call ZMQ.proxy(..), the call blocks indefinitely, so you'll need a separate thread to send the filepaths.
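A minimal sketch of that ordering in jeromq: the proxy gets its own thread, and each socket stays on the thread that created it, since ZMQ sockets are not thread-safe. (Note that a ROUTER can only route a message to a client once it has seen that client's identity, so clients still need to announce themselves first.)
import org.zeromq.ZMQ;

// Sketch: ZMQ.proxy() never returns, so run it in its own thread and
// keep the main thread free for dispatching work.
public class ProxyFirst {
    public static void main(String[] args) {
        final ZMQ.Context context = ZMQ.context(1);
        new Thread(() -> {
            ZMQ.Socket frontend = context.socket(ZMQ.ROUTER);
            frontend.bind("tcp://*:5565");
            ZMQ.Socket backend = context.socket(ZMQ.DEALER);
            backend.bind("inproc://backend");
            ZMQ.proxy(frontend, backend, null); // blocks forever
        }).start();
        // ... the main thread can now create its own socket and send the
        // filepaths, e.g. through a worker connected to inproc://backend
    }
}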
The client may have an issue with the poller. The typical pattern for polling is:
ZMQ.Poller items = new ZMQ.Poller(1);
items.register(receiver, ZMQ.Poller.POLLIN);
while (true) {
    items.poll(TIMEOUT);
    if (items.pollin(0)) {
        message = receiver.recv(0);
    }
}
In the above code: 1) poll until the timeout, 2) then check for messages, and if any are available, 3) get them with receiver.recv(0). But in your code, you poll and then drop straight into recv() without checking. You need to check whether the poller has messages for that polled socket before calling recv(); otherwise the receiver will hang when there are no messages.

Related

Pushing data to websocket browser client in Lua

I want to use a NodeMCU device (Lua-based top level) to act as a websocket server for one or more browser clients.
Luckily, there is code to do this here: NodeMCU Websocket Server
(courtesy of #creationix and/or #moononournation)
This works as described, and I am able to send a message from the client to the NodeMCU server, which then responds based on the received message. Great.
My questions are:
How can I send messages to the client without them being sent as a response to a client request (standalone sending of data)? When I try to call socket.send(), socket is not found as a variable, which I understand, but I cannot work out how to do it! :(
Why does the decode() function output the extra variable? What is it for? I'm assuming it's for packet overflow, but I can never seem to make it return anything, regardless of my message length.
In the listen method, why has the author added a queuing system? Is this essential, or is it for applications that may receive multiple simultaneous messages? Ideally, I'd like to remove it.
I have simplified the code as below:
(excluding the decode() and encode() functions - please see the link above for the full script)
net.createServer(net.TCP):listen(80, function(conn)
  local buffer = false
  local socket = {}
  local queue = {}
  local waiting = false

  local function onSend()
    if queue[1] then
      local data = table.remove(queue, 1)
      return conn:send(data, onSend)
    end
    waiting = false
  end

  function socket.send(...)
    local data = encode(...)
    if not waiting then
      waiting = true
      conn:send(data, onSend)
    else
      queue[#queue + 1] = data
    end
  end

  conn:on("receive", function(_, chunk)
    if buffer then
      buffer = buffer .. chunk
      while true do
        local extra, payload, opcode = decode(buffer)
        if opcode == 8 then
          print("Websocket client disconnected")
        end
        --print(type(extra), payload, opcode)
        if not extra then return end
        buffer = extra
        socket.onmessage(payload, opcode)
      end
    end
    local _, e, method = string.find(chunk, "([A-Z]+) /[^\r]* HTTP/%d%.%d\r\n")
    local key, name, value
    for name, value in string.gmatch(chunk, "([^ ]+): *([^\r]+)\r\n") do
      if string.lower(name) == "sec-websocket-key" then
        key = value
        break
      end
    end
    if method == "GET" and key then
      acceptkey = crypto.toBase64(crypto.hash("sha1", key .. "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"))
      conn:send(
        "HTTP/1.1 101 Switching Protocols\r\n" ..
        "Upgrade: websocket\r\nConnection: Upgrade\r\n" ..
        "Sec-WebSocket-Accept: " .. acceptkey .. "\r\n\r\n",
        function ()
          print("New websocket client connected")
          function socket.onmessage(payload, opcode)
            socket.send("GOT YOUR DATA", 1)
            print("PAYLOAD = " .. payload)
            --print("OPCODE = " .. opcode)
          end
        end)
      buffer = ""
    else
      conn:send(
        "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 12\r\n\r\nHello World!",
        conn.close)
    end
  end)
end)
I can only answer one question; the others may be better suited for the library author. Besides, SO is a format where you normally ask one question at a time.
How can I send messages to the client without it having to be sent as a response to a client request (standalone sending of data)?
You can't. Without the client contacting the server first and establishing a socket connection, the server wouldn't know where to send the messages. Even with SSE (server-sent events), it's the client that first initiates a connection to the server.

How to safely write to one file from many verticle instances in vert.x 3.2?

Instead of using a logger or a database server, I'd like to append information to one file from possibly many verticle instances.
There are versions of methods for writing asynchronously to a file.
Can I assume that Vert.x handles the synchronisation between the writes, so that they don't interfere, when using those method versions marked as "async"?
There seems to be a rule that one can rely on Vert.x providing all isolation between concurrent processing out of the box. But is that true in the case of file write access?
Could you please include a code snippet in the answer that shows how to open and write to one file from many verticle instances with the finest possible granularity, e.g. for logging requests?
I wouldn't recommend writing to a single file with many different writers. For concurrent logging I would stick to the single-writer principle.
Create a verticle which subscribes to the event bus and listens for messages to be logged. Let's call this verticle Logger, listening on system.logger:
EventBus eb = vertx.eventBus();
eb.consumer("system.logger", message -> {
    // write to file
});
Verticles that want to log something send a message to the Logger verticle:
eventBus.send("system.logger", "foobar");
Appending to an existing file works something like this (didn't test); note you have to track the write position yourself:
vertx.fileSystem().open("file.log", new OpenOptions(), result -> {
    if (result.succeeded()) {
        Buffer buff = Buffer.buffer(message); // message from the consumer
        AsyncFile file = result.result();
        file.write(buff, position, ar -> { // position: running byte offset, advanced by buff.length() after each write
            if (ar.succeeded()) {
                System.out.println("done");
            } else {
                System.err.println("write failed: " + ar.cause());
            }
        });
    } else {
        System.err.println("open file failed " + result.cause());
    }
});
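Putting the two pieces together, a single-writer Logger verticle could look roughly like this (an untested sketch; class and file names are illustrative). Because one verticle instance handles its event-bus messages sequentially on its event loop, tracking the write position in a field is safe:
import io.vertx.core.AbstractVerticle;
import io.vertx.core.buffer.Buffer;
import io.vertx.core.file.AsyncFile;
import io.vertx.core.file.OpenOptions;

// Single writer: only this verticle ever touches the log file.
public class LoggerVerticle extends AbstractVerticle {
    private AsyncFile file;
    private long position = 0; // starts at 0; to truly append, initialise
                               // this from the existing file size

    @Override
    public void start() {
        vertx.fileSystem().open("file.log", new OpenOptions(), res -> {
            if (res.succeeded()) {
                file = res.result();
                vertx.eventBus().<String>consumer("system.logger", msg -> {
                    Buffer buff = Buffer.buffer(msg.body() + "\n");
                    file.write(buff, position, ar -> {
                        if (ar.failed()) {
                            System.err.println("write failed: " + ar.cause());
                        }
                    });
                    position += buff.length();
                });
            } else {
                System.err.println("open file failed " + res.cause());
            }
        });
    }
}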

NetMQ PUSH socket blocks indefinitely when it reaches HWM

I'm using NetMQ (NuGet 3.3.2.2) on .NET 4.5, and I have a single fast generator process with a PUSH socket and a single slow consumer process using a PULL socket. If I send enough messages to hit the sending HWM, the sending process blocks the thread indefinitely.
Some contrived (generator) code which illustrates the problem:
using (var ctx = NetMQContext.Create())
using (var pushSocket = ctx.CreatePushSocket())
{
    pushSocket.Connect("tcp://127.0.0.1:42404");
    for (int i = 1; i <= 100000; i++)
    {
        var template = GenerateMessageBody(i); // moved inside the loop so i is in scope
        pushSocket.SendMoreFrame("SampleMessage").SendFrame(Messages.SerializeToByteArray(template));
        if (i % 1000 == 0)
            Console.WriteLine("Sent " + i + " messages");
    }
    Console.WriteLine("All finished");
    Console.ReadKey();
}
On my configuration, this will usually report that it has sent about 5000 or 6000 messages and will then simply block. If I set the send HWM to a large value (or 0), it sends all of the messages as expected.
It looks like it's waiting to receive another command before it tries again, here: (SocketBase.TrySend)
// Oops, we couldn't send the message. Wait for the next
// command, process it and try to send the message again.
// If timeout is reached in the meantime, return EAGAIN.
while (true)
{
    ProcessCommands(timeoutMillis, false);
From what I've read in the 0MQ guide, blocking on a full PUSH socket is the correct behaviour (and is what I want it to do); however, I would expect it to recover once the consumer has cleared its queue.
Short of using some sort of TrySend pattern and dealing with the blocking myself, is there some option I can set, or some other facility I can use, to have the PUSH socket attempt to resend blocked messages periodically?
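For reference, the underlying libzmq option behind such a pattern is ZMQ_SNDTIMEO: with a send timeout set, a send returns false instead of blocking forever once the HWM is hit. A rough sketch of the same idea in jeromq (the library from the first question above; endpoint illustrative), which I'd expect NetMQ to mirror:
import org.zeromq.ZMQ;

// Sketch: with a send timeout set, send() gives up after the timeout
// and returns false rather than blocking indefinitely on a full pipe,
// so the caller can retry, drop, or log.
public class BoundedPush {
    public static void main(String[] args) {
        ZMQ.Context ctx = ZMQ.context(1);
        ZMQ.Socket push = ctx.socket(ZMQ.PUSH);
        push.setSendTimeOut(1000); // milliseconds; -1 blocks, 0 is non-blocking
        push.connect("tcp://127.0.0.1:42404");

        byte[] payload = "SampleMessage".getBytes();
        while (!push.send(payload, 0)) {
            // HWM reached and timeout expired: back off, then retry
            System.out.println("send timed out, retrying...");
        }
    }
}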

How to synchronize data between multiple workers

I have the following problem that is begging for a zmq solution. I have time-series data:
A, B, C, D, E, ...
I need to perform an operation, Func, on each point.
It makes good sense to parallelize the task using multiple workers via zmq. However, what is tripping me up is how to synchronize the results, i.e., the results should be time-ordered exactly the way the input data came in. So the end result should look like:
Func(A), Func(B), Func(C), Func(D), ...
I should also point out that the time to complete, say, Func(A) will be slightly different from that of Func(B). This may require me to block for a while.
Any suggestions would be greatly appreciated.
You will always need to block for a while in order to synchronize things. You can send requests to a pool of workers and, when a response is received, buffer it if it is not the next one in sequence. One simple workflow could be described in pseudo-language as follows:
socket receiver;  # zmq.PULL
socket workers;   # zmq.DEALER; the worker thread socket is started as zmq.DEALER too
poller = poller(receiver, workers)
next_id_req = incr()
out_queue = queue
out_queue.last_id = next_id_req
buffer = sorted_queue
while true:
    sock = poller.poll()
    if sock is receiver:
        packet_N = receiver.recv()
        # send N for processing, tagged with a sequence id
        workers.send(packet_N, ++next_id_req)
    else if sock is workers:
        # get a processed response Func(N)
        func_N_response, id = workers.recv()
        if out_queue.last_id != id - 1:
            # not the subsequent id, buffer it
            buffer.push(id, func_N_response)
        else:
            # in order, push to the out queue
            out_queue.push(id, func_N_response)
            # also consume all buffered subsequent items
            while out_queue.last_id == buffer.min_id() - 1:
                id, buffered_N_resp = buffer.pop()
                out_queue.push(id, buffered_N_resp)
But here comes the problem of what happens if a packet is lost in the processing thread (the worker pool). You can either skip it after a certain timeout (flush the buffer into the out queue) and continue filling the out queue, reordering when the packet arrives later, if it ever comes.
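The reordering logic itself is independent of the transport; a minimal sketch in Java (a hypothetical class, using a TreeMap as the sorted buffer). Workers call accept(id, result) in any order, and the output consumer sees results strictly in id order:
import java.util.TreeMap;
import java.util.function.Consumer;

// Reorder buffer: emits results strictly in sequence-id order,
// holding back any result that arrives ahead of its predecessors.
class ReorderBuffer<T> {
    private final TreeMap<Long, T> pending = new TreeMap<>();
    private final Consumer<T> output;
    private long nextId = 0;

    ReorderBuffer(Consumer<T> output) { this.output = output; }

    synchronized void accept(long id, T result) {
        pending.put(id, result);
        // flush every buffered result that is now in order
        while (!pending.isEmpty() && pending.firstKey() == nextId) {
            output.accept(pending.pollFirstEntry().getValue());
            nextId++;
        }
    }
}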

HornetQ JMSBridge acknowledge

I've been learning the HornetQ code recently, and I have a doubt about the JMS bridge.
As you can see, there is a function named sendMessages() in JMSBridgeImpl.java. That function sends messages to the remote JMS server, but without acknowledging them.
But in the function named sendBatchNonTransacted(), only the last message is acknowledged, as in messages.getLast().acknowledge();
So the question is: why isn't each message acked individually in sendMessages()?
Apologies, my English is poor.
I'm online waiting for your help, thank you!
--------------------------------------------
Oh, thanks "Moj Far" very much for the first question, I got it.
But I have another question: I modified the HornetQ source code.
I want to use a ClientConsumer (successfully initialized) to get messages from the local HornetQ server,
and use a JMS producer to send those messages to the remote JMS server.
In the run() function of the SourceReceiver class, I modified it like this:
if (bridgeType == JMSBridgeImpl.ALL_NETTY_MODE) {
    msg = sourceConsumer.receive(1000);
} else { /* core client receives the message */
    cmsg = localClientConsumer.receive(1000);
    if (cmsg != null) {
        hq_msg = HornetQMessage.createMessage(cmsg, localClientSession);
        //hq_msg = HornetQMessage.createMessage(cmsg, null);
        hq_msg.doBeforeReceive();
        //cmsg.acknowledge(); do ack after send
        msg = hq_msg;
    }
}
But after 2 hours of running, the VM memory overflows.
I also tried not using localClientSession to create the HornetQMessage, but the memory still overflows.
Is there something wrong with my code?
We have two ways to guarantee that messages are delivered:
- Transactional (used in sendBatchLocalTx() and sendBatchXA())
- Non-transactional with acknowledgement (used in sendBatchNonTransacted())
All of sendBatchLocalTx(), sendBatchXA(), and sendBatchNonTransacted() use sendMessages() to send the messages, and then guarantee delivery in their own way (committing a transaction or acking messages)!
sendBatchNonTransacted() uses the client acknowledgement mode; in this mode, acking one message (here, the last one) acks all messages consumed so far in the current session!
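To illustrate the client-acknowledge semantics with plain JMS (a sketch; connection and queue setup are omitted, and the names are illustrative):
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;

// Sketch: in CLIENT_ACKNOWLEDGE mode, acknowledging one message acks
// every message consumed so far in the same session, which is why
// acking only the last message of the batch is enough.
public class ClientAckBatch {
    static void consumeBatch(Connection conn, Queue queue, int batchSize) throws Exception {
        Session session = conn.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);
        Message last = null;
        for (int i = 0; i < batchSize; i++) {
            Message m = consumer.receive(1000); // ms
            if (m == null) break;               // no more messages
            last = m;
        }
        if (last != null) {
            last.acknowledge(); // acknowledges the whole batch at once
        }
        session.close();
    }
}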
