I am using HornetQ for email sending.
File attachments are transmitted out-of-band (not as part of the message) using an object storage system. This adds some overhead that I want to avoid for small files by putting them into message properties directly.
I know that I can send huge message bodies, but object storage works well for large files; this question is about small files, where delivery as a property value would be very convenient if it works.
What are the considerations for message property values? Can I make them a 100K byte array? Will this slow things down (or even break)?
The headers, properties and body buffer are all combined into the overall message buffer in a relatively straightforward process, so there should not be significant performance issues from that perspective. You can see the core implementation here:
https://github.com/hornetq/hornetq/blob/master/hornetq-core-client/src/main/java/org/hornetq/core/message/impl/MessageImpl.java
One consideration is your consumer window size, which by default is only 1MB. This is the amount of data that will be buffered on the consumer, so if you are sending messages near this size, your read performance may be much slower as you wait for data at the consumer. This can be changed with the consumer-window-size parameter. See http://docs.jboss.org/hornetq/2.4.0.Final/docs/user-manual/html/flow-control.html#d0e4023 for more information.
Pulling from the comments: you'll probably also want to increase your journal size and journal buffer size, since you'd be close to the limits. Size the journal buffer larger for sure to get some headroom, and probably size up the journal itself as well. See http://hornetq.sourceforge.net/docs/hornetq-2.1.1.Final/user-manual/en/html/persistence.html#configuring.message.journal.journal-buffer-size and https://developer.jboss.org/thread/154423 for details.
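To make the property approach concrete, here is a minimal sketch using the HornetQ core client from Scala. The Netty connector, the queue name jms.queue.mail, the file name and the 10MB window size are all assumptions for illustration, not taken from your setup:

import java.nio.file.{Files, Paths}
import org.hornetq.api.core.TransportConfiguration
import org.hornetq.api.core.client.HornetQClient
import org.hornetq.core.remoting.impl.netty.NettyConnectorFactory

object SmallAttachmentSender extends App {
  val locator = HornetQClient.createServerLocatorWithoutHA(
    new TransportConfiguration(classOf[NettyConnectorFactory].getName))
  // Raise the consumer window above the 1MB default if property-carrying
  // messages get anywhere near that size.
  locator.setConsumerWindowSize(10 * 1024 * 1024)

  val factory  = locator.createSessionFactory()
  val session  = factory.createSession(true, true) // auto-commit sends and acks
  val producer = session.createProducer("jms.queue.mail") // assumed queue name

  // A small (~100K) attachment carried as a bytes property; the body stays free for the mail itself.
  val bytes = Files.readAllBytes(Paths.get("attachment.pdf")) // assumed file
  val msg   = session.createMessage(true) // durable
  msg.putStringProperty("filename", "attachment.pdf")
  msg.putBytesProperty("attachment", bytes)
  producer.send(msg)

  session.close()
  factory.close()
  locator.close()
}

The same consumer window size can also be configured on the connection factory, as described in the flow-control link above.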
Related
I am trying to implement stream demuxing/decoding in memory, using the producer/consumer idiom.
The producer reads data from the network and pushes it into a buffer. The consumer reads data from this buffer and forwards it to avformat.
According to the ffmpeg docs, the only way to implement in-memory processing is to use an AVIOContext and provide a read function for it.
The problem is that if there is too little data in the buffer, the read function returns 0 and avformat treats that as EOF, producing an error.
Specifically this happens on the avformat_open_input/avformat_find_stream_info calls, where avformat tries to probe the given data. Sometimes it works, sometimes it doesn't, depending on the input buffer size.
I've tried returning EAGAIN from the read function (which seemed reasonable), but it doesn't work.
As a workaround I've increased the input buffer size, accumulating more data before passing it to avformat, but this wastes memory (there could be hundreds or thousands of streams) and isn't a general solution, since we don't know in advance how much data will be needed.
Is there any way to tell avformat that there will be more data, but later?
I've implemented a Scala Akka application that streams 4 different types of data from a biomodule sensor (ECG, EEG, breath and general data). These data (timestamp and value) are typically stored in 4 different CSV files. However, sometimes I have to store each sample in two different files with different timestamps, so the application is writing to 8 different CSV files at the same time.
Initially I implemented one Akka actor responsible for persisting data, which receives the path of the file to write to, a timestamp and a value. However, this was a bottleneck, since the number of samples that I need to store is large (e.g. one ECG sample is received every 4 ms). As a result, even for a very short experiment, this actor finished recording 1-2 minutes after the experiment was over.
I've also tried 4 actors for the 4 different message types, with the idea of distributing the work, but I didn't notice a significant improvement in performance.
I'm wondering if someone has an idea of how to improve the performance. Is it better to use one actor for storing files, a few actors, or one actor per file? Or maybe it doesn't make any difference? Could I improve my code for storing data?
This is my method responsible for storing data:
import java.io.{BufferedWriter, FileWriter, PrintWriter}

def processValue(sample: WaveformValue): Unit = {
  // Opens, appends to, flushes and closes the file for every single sample.
  val csvfilewriter = new PrintWriter(new BufferedWriter(new FileWriter(sample.filepath, true)))
  csvfilewriter.append(sample.timestamp.toString)
  csvfilewriter.append(",")
  csvfilewriter.append(sample.value.toString)
  csvfilewriter.append("\r\n")
  csvfilewriter.flush()
  csvfilewriter.close()
}
It seems to me that your bottleneck is I/O -- disk access. You are opening, writing to, and closing a file for each sample, which is very expensive. I would suggest the following (a minimal sketch follows this list):
Open each file just once, and close it at the end of all processing. You might need to store the writer in a member variable, or if you have an arbitrary collection of files, store the writers in a map held in a member variable.
Don't flush after every sample write.
Use buffered writes for each file writer. This avoids flushing data to the filesystem with every write, which involves a system call and waiting for the data to be written to disk. I see that you're already doing this, but the benefit is lost since you are flushing/closing the file after each sample anyway.
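To make that concrete, here is a minimal sketch of a persisting actor that keeps one buffered writer per file path, writes without flushing, and only flushes and closes the writers when it stops. It assumes classic Akka actors and a WaveformValue shaped like the one in your question:

import java.io.{BufferedWriter, FileWriter}
import akka.actor.Actor

// Assumed to mirror the question's WaveformValue.
case class WaveformValue(filepath: String, timestamp: Long, value: Double)

class PersistenceActor extends Actor {
  // One writer per file, opened lazily on first use and kept open across samples.
  private var writers = Map.empty[String, BufferedWriter]

  private def writerFor(path: String): BufferedWriter =
    writers.getOrElse(path, {
      val w = new BufferedWriter(new FileWriter(path, true))
      writers += path -> w
      w
    })

  def receive: Receive = {
    case sample: WaveformValue =>
      // Buffered write: no flush per sample; the writer flushes when its buffer fills.
      writerFor(sample.filepath).write(s"${sample.timestamp},${sample.value}\r\n")
  }

  // Flush and close every writer once, when the actor stops.
  override def postStop(): Unit =
    writers.valuesIterator.foreach { w => w.flush(); w.close() }
}

If losing buffered data on a crash is a concern, you could also flush on a periodic timer rather than per sample, which is still far cheaper than flushing on every write.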
I am trying to gauge the performance of RabbitMQ when my message size increases to a few MB. However, even when I send a 32KB message, I get a "Resource temporarily unavailable" message from the server. There's no error in the log files, and there are no errors about reaching memory limits... How do I go about debugging this issue?
If it's of any help, I'm running this on an EC2 t1.micro instance, so 592MB of RAM.
According to the bug you linked, someone recently (apparently after you posted the link to the bug) left a comment saying that they can reliably reproduce the bug when the message size is >= 15821 bytes.
I would recommend that you check whether that also holds true for you, i.e. whether you can reproduce the problem at that threshold, and then evaluate whether staying under that size (thus avoiding the bug documented in the issue above) is sufficient for your needs. If not, you may want to try pika (https://github.com/pika/pika) and see if it works better with larger messages (one of the other comments on that bug suggests that pika did work for them with larger message sizes).
Another option that may work, depending on your exact use case, would be to include in the RabbitMQ message payload a key of sorts that allows you to fetch the large blob of data from wherever it's stored (Postgres, MongoDB, etc.) when you consume the message, and thereby avoid the bug. Perhaps not ideal if you really want to encapsulate everything inside the payload, but it may be a feasible workaround.
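To illustrate that idea only (your code uses rabbitpy; this sketch uses the RabbitMQ Java client from Scala, and the host, queue name and key format are assumptions): store the blob elsewhere and publish just a small key that the consumer uses to look it up.

import java.nio.charset.StandardCharsets
import com.rabbitmq.client.ConnectionFactory

object ClaimCheckPublisher extends App {
  // Assumed: the large blob has already been written to external storage
  // (Postgres, MongoDB, etc.) under this key; only the key travels through RabbitMQ.
  val blobKey = "attachments/report-123"

  val factory = new ConnectionFactory()
  factory.setHost("localhost") // assumed broker location
  val connection = factory.newConnection()
  val channel    = connection.createChannel()

  channel.queueDeclare("work", true, false, false, null)
  channel.basicPublish("", "work", null, blobKey.getBytes(StandardCharsets.UTF_8))

  channel.close()
  connection.close()
}

The consumer then reads the key from the message body and fetches the blob from the external store before processing.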
In terms of debugging, since it appears that this is a bug with rabbitpy itself, I think you would need to debug the actual rabbitpy library if you wanted to proceed on that front. Doable, but perhaps not feasible due to time, etc.
I fetch images with open-uri from a remote website and persist them on my local server within my Ruby on Rails application. Most of the images were shown without a problem, but some images just didn't show up.
After a very long debugging session I finally found out (thanks to this blog post) that the reason for this is that the Buffer class in the open-uri library treats files smaller than 10kB as StringIO objects instead of Tempfiles.
I managed to get around this problem by following the answer from Micah Winkelspecht to this StackOverflow question, where I put the following code within a file in my initializers:
require 'open-uri'
# Don't allow downloaded files to be created as StringIO. Force a tempfile to be created.
OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax')
OpenURI::Buffer.const_set 'StringMax', 0
This works as expected so far, but I keep wondering why they put this behaviour into the library in the first place. Does anybody know a specific reason why files under 10kB in size get treated as StringIO?
Since the above code effectively overrides this behaviour globally for my entire application, I just want to make sure that I am not breaking anything else.
When you do network programming, you allocate a buffer of a reasonably large size and send and read units of data which will fit in that buffer. However, when dealing with files (or what are sometimes called BLOBs) you cannot assume that the data will fit into your buffer, so you need special handling for these large streams of data.
(Sometimes the units of data which fit into the buffer are called packets. However, packets are really a layer 3 thing, like frames are at layer 2. Since this is happening at layer 7, they might better be called messages.)
For replies larger than 10K, the open-uri library sets up the extra overhead of writing to a stream object (a Tempfile). When the reply is under the StringMax size, it just keeps the content as a string in memory, since it knows it can fit in the buffer.
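For what it's worth, the trade-off is easy to see in isolation. Here is a small sketch (in Scala rather than Ruby, purely as an illustration; the materialize helper is made up) of the same decision open-uri makes with StringMax: keep small payloads in memory, spill larger ones to a temp file.

import java.io.ByteArrayInputStream
import java.nio.file.{Files, Path, StandardCopyOption}

object BufferOrTempfile {
  // Mirrors open-uri's default StringMax threshold of 10240 bytes.
  val StringMax = 10240

  // Small payloads stay in memory (like StringIO); larger ones go to disk (like Tempfile).
  def materialize(body: Array[Byte]): Either[Array[Byte], Path] =
    if (body.length <= StringMax) Left(body)
    else {
      val tmp = Files.createTempFile("download", ".bin")
      Files.copy(new ByteArrayInputStream(body), tmp, StandardCopyOption.REPLACE_EXISTING)
      Right(tmp)
    }
}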
I have built a simple project which uses the "Winsock" control.
When I receive any data I put it in a variable, because I can't put it in a textbox: it is a file, not text.
But if I send a big file I get an "Overflow" error.
Is there any way to fix this problem?
A VB variable-length string can in theory be up to 2GB in size; its actual maximum size depends on available virtual memory, which is also limited to 2GB for the entire application. And since VB stores strings in Unicode format, a string can only hold about 1GB of text.
(maximum length for string in VB6)
If this is your problem, try splitting the incoming data across several strings.
Are you handling the SendComplete event properly before sending more data?
Otherwise you will get a buffer overflow from the WinSock control.
You need to split your data into smaller packets (around 2-5k each should do it) and send each packet individually, then reconstruct the data at the other end. You could add a unique character, say Chr(0), at the end of the data so that the receiving end knows that all the data for that transmission has been received.
This is quite a simplified solution to the problem - a better method would be to devise a simple protocol with data handshaking so you know each packet has been received.
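Not VB6, but to show the shape of the splitting idea in a compact form, here is a rough sketch in Scala. The chunk size and the Chr(0) terminator are just the suggestions above; a real protocol would frame or acknowledge each chunk, since TCP does not preserve chunk boundaries.

object Chunking {
  val ChunkSize = 4096        // within the 2-5k range suggested above
  val Terminator: Byte = 0    // the Chr(0) end-of-transmission marker

  // Sender side: split the payload into chunks, then append the terminator chunk.
  def toChunks(data: Array[Byte]): Iterator[Array[Byte]] =
    data.grouped(ChunkSize) ++ Iterator(Array(Terminator))

  // Receiver side: accumulate chunks until the terminator chunk arrives.
  def reassemble(chunks: Iterator[Array[Byte]]): Array[Byte] = {
    val buf = Array.newBuilder[Byte]
    chunks.takeWhile(c => !(c.length == 1 && c(0) == Terminator)).foreach(buf ++= _)
    buf.result()
  }
}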