Dozens of small messages lead to 1 GB chronicle queue file - chronicle

I write utf8 strings to chronicle queue with daily rolling. The default queue file size is 81920 KB. After I write dozens of messages (1 KB each), the file becomes over 1 GB quickly. How can I control the file size?

Chronicle Queue records every message. It is designed on the assumption that disk space is cheap; you can now buy a 1,000 GB enterprise SSD for a reasonable price, and some users retain over 100 TB in queues.
You can increase the roll rate to hourly and delete files you don't need. There is a store file listener so you can determine when files roll over.
The file shouldn't be much larger than the data you are storing. If it is, can you provide a test case which reproduces the problem?
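For reference, here is a minimal sketch (assuming a recent Chronicle Queue release; the directory name and the delete-on-release policy are illustrative, not a recommendation) combining an hourly roll cycle with a store file listener that removes rolled files once the queue releases them:

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptAppender;
import net.openhft.chronicle.queue.RollCycles;

public class HourlyRollSketch {
    public static void main(String[] args) {
        // Sketch only: roll hourly and delete store files as soon as they are released.
        try (ChronicleQueue queue = ChronicleQueue.singleBuilder("queue-dir") // illustrative path
                .rollCycle(RollCycles.HOURLY)
                .storeFileListener((cycle, file) -> {
                    // called when a rolled file is no longer in use; keep or delete as needed
                    if (file.delete()) {
                        System.out.println("deleted " + file);
                    }
                })
                .build()) {
            ExcerptAppender appender = queue.acquireAppender();
            appender.writeText("a ~1 KB utf8 message");
        }
    }
}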

Related

nifi splitRecord hung

I'm testing NiFi SplitRecord with a small file of only 11 records.
However, SplitRecord hangs for a long time and I don't get a clue what it is doing.
[screenshot: processor hung]
[screenshot: SplitRecord properties and additional properties]
Is Records Per Split controlling the maximum, the minimum, or the exact number of records per split?
If the total number of records is less than Records Per Split, what's the behavior of SplitRecord? Does it wait until a time-out and then put all on-hold records into a single split?
After about 10 minutes, or some random number of start/stop/terminate/restart cycles, the processor may be triggered to split the data sooner.
Records Per Split controls the maximum; see SplitRecord.java for the code. If there are fewer records than the RECORDS_PER_SPLIT value, it will immediately push them all out.
However, it does look like it creates a new FlowFile even when the total record count is less than the RECORDS_PER_SPLIT value, meaning it is writing to disk regardless of whether a split really occurred.
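To illustrate that behaviour (a simplified sketch, not the actual NiFi source): records are grouped into chunks of at most Records Per Split, and a smaller remainder is emitted straight away rather than held back.

import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Records Per Split acts as a maximum: a final chunk smaller than the limit
    // is still emitted immediately as its own "FlowFile".
    static <T> List<List<T>> split(List<T> records, int recordsPerSplit) {
        List<List<T>> splits = new ArrayList<>();
        for (int i = 0; i < records.size(); i += recordsPerSplit) {
            splits.add(records.subList(i, Math.min(i + recordsPerSplit, records.size())));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 11; i++) records.add(i);   // the 11 records from the question
        // With Records Per Split = 100 this yields a single split holding all 11 records.
        System.out.println(split(records, 100));
    }
}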
I would probably investigate two things:
Host memory - how much memory does the host have? How much is configured as NiFi max heap? How much total system memory is in use/free? NiFi performs best when plenty of system memory is left for file cache.
Host's disks, specifically the disk that has the Content Repository on it. Capacity? IO? Is it shared with other services? FlowFile content is written to the Content Repository, if the disk is shared with the OS, or other busy services (or other NiFi repos) it can really slow content modification down.
Note: your NiFi version is over 3 years old; please consider upgrading.

Elasticsearch Ram recommendation

I'm deploying an Elasticsearch cluster that will ingest roughly 40 GB a day with a time-to-live of 365 days. Write speed would be around 50 msgs/sec. Reads would be mostly driven by user dashboards, so the read frequency won't be high. What would be the best hardware requirements for this amount of data? How many master and data nodes would be required in this situation?
Obviously you should choose the hardware based on the search/index rate, and 50 msg/sec is very low for Elasticsearch. You have about 14.6 TB of data in total, which should be at most 85 percent of your total disk (based on the 85% watermark); this means you need roughly 17 TB of disk. I think you can use one server with 128 GB RAM, at least 10 CPU cores, and 17 TB of disk, or two servers with half of this configuration each: one server as both master and data node, and the other as a data node only.
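The arithmetic behind those figures, as a small sketch (decimal TB, matching the numbers above):

public class EsDiskSizing {
    public static void main(String[] args) {
        double dailyGb = 40.0;        // ingest per day, from the question
        int retentionDays = 365;      // time-to-live
        double watermark = 0.85;      // stay below the 85% disk watermark

        double totalTb = dailyGb * retentionDays / 1000.0;   // ~14.6 TB of retained data
        double requiredTb = totalTb / watermark;             // ~17 TB of provisioned disk
        System.out.printf("retained data: %.1f TB, disk needed: %.1f TB%n", totalTb, requiredTb);
    }
}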

How to calculate my applications iops utilization

I'm trying to figure out how to determine the IOPS my application is driving so I can properly size our cloud infrastructure components. I understand what IOPS are between a database and the storage layer, but I'd like to understand how to calculate what my application drives. Here are some of my application's characteristics:
1) 90% write and 10% read
2) We have a Java-based application that ultimately inserts into an HBase database
3) Process about 50 msg/sec where each message results in probably 2 HBase inserts
Here is what I'm not sure about:
1) Is the only way to calculate the IOPS by running iostat or something similar on the actual server during load?
2) Is there a general way I can calculate what's needed from the data volume/size coming in, rather than measuring on the actual storage unit?
3) Is there any relationship between the # of transactions and the # of bytes in each transaction? (I read somewhere an IO is usually 3K; most inserts don't contain that much info, so it may not matter.)
Any help would be greatly appreciated.
I'm not very familiar with HBase, but from the documentation it uses a log-structured design, which means the writes will be sequential. It also has compactions, which cause both sequential reads and writes of multiple MB. Read queries will cause random reads on the storage layer.
So here are the answers to your questions:
As far as I know, yes. The only way to get IOPS is running iostat. You can probably get some compaction stats at the application level, but it is hard to extract IOPS-level details from them.
Compaction will cause more storage to be used than the raw data size, and if your application is write-heavy (compaction might not catch up with the rate of inserts), the actual volume on disk will be much larger. Given the 50 msg/sec in your question, this should not be the case. I would provision disks at double the size of the expected data volume per instance.
As mentioned above, HBase is log-structured: writes are accumulated in memory and flushed to disk together, so the size of each individual transaction doesn't matter much.
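As a rough back-of-the-envelope sketch using the numbers in the question (the 1 KB message size is an assumption for illustration), the logical write rate is easy to estimate, even though device-level IOPS still have to be measured with iostat:

public class WriteRateEstimate {
    public static void main(String[] args) {
        int messagesPerSecond = 50;        // from the question
        int insertsPerMessage = 2;         // each message -> ~2 HBase inserts
        int assumedBytesPerMessage = 1024; // assumption, adjust to your payload

        int insertsPerSecond = messagesPerSecond * insertsPerMessage;      // 100 inserts/sec
        int bytesPerSecond = messagesPerSecond * assumedBytesPerMessage;   // ~50 KB/sec ingest
        System.out.printf("logical inserts/sec: %d, ingest: %d KB/sec%n",
                insertsPerSecond, bytesPerSecond / 1024);
        // Device IOPS will differ: HBase buffers writes in the memstore and flushes them
        // sequentially, and compaction adds further sequential I/O on top of this.
    }
}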

How much load can cassandra handle on m1.xlarge instance?

I set up a 3-node Cassandra (1.2.10) cluster on 3 EC2 m1.xlarge instances.
It is based on the default configuration with several guidelines applied, like:
datastax_clustering_ami_2.4
not using EBS, raided 0 xfs on ephemerals instead,
commit logs on separate disk,
RF=3,
6GB heap, 200MB new size (also tested with greater new size/heap values),
enhanced limits.conf.
With 500 writes per second, the cluster works for only a couple of hours. After that it seems unable to respond because of CPU overload (mainly GC + compactions).
Nodes remain Up, but their load is huge and the logs are full of GC info and messages like:
ERROR [Native-Transport-Requests:186] 2013-12-10 18:38:12,412 ErrorMessage.java (line 210) Unexpected exception during request java.io.IOException: Broken pipe
nodetool shows many dropped mutations on each node:
Message type        Dropped
RANGE_SLICE               0
READ_REPAIR               7
BINARY                    0
READ                      2
MUTATION            4072827
_TRACE                    0
REQUEST_RESPONSE       1769
Is 500 wps too much for a 3-node cluster of m1.xlarge instances, meaning I should add nodes? Or is it possible to further tune GC somehow? What load are you able to serve with 3 nodes of m1.xlarge? What are your GC configs?
Cassandra is perfectly able to handle tens of thousands small writes per second on a single node. I just checked on my laptop and got about 29000 writes/second from cassandra-stress on Cassandra 1.2. So 500 writes per second is not really an impressive number even for a single node.
However, beware that there is also a limit on how fast data can be flushed to disk, and you definitely don't want your incoming data rate to be close to the physical capabilities of your HDDs. Therefore 500 writes per second can be too much if those writes are big enough.
So first: what is the average size of a write? What is your replication factor? Multiply the number of writes by the replication factor and by the average write size; then you'll know approximately what write throughput is required of the cluster. You should also keep some safety margin for other I/O-related tasks like compaction. There are various benchmarks on the Internet suggesting a single m1.xlarge instance should be able to write anywhere between 20 MB/s and 100 MB/s...
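As a worked example of that calculation (the 10 KB write size is an assumption; plug in your own figure):

public class WriteThroughputEstimate {
    public static void main(String[] args) {
        int writesPerSecond = 500;           // from the question
        int replicationFactor = 3;           // RF=3 from the question
        int assumedWriteBytes = 10 * 1024;   // assumption: ~10 KB per write

        double clusterMbPerSec = (double) writesPerSecond * replicationFactor
                * assumedWriteBytes / (1024 * 1024);   // ~14.6 MB/s across the cluster
        double perNodeMbPerSec = clusterMbPerSec / 3;  // 3 nodes -> ~4.9 MB/s each
        // Compare this (plus a ~3x safety margin for compaction) with the
        // 20-100 MB/s per-instance figures mentioned above.
        System.out.printf("cluster: %.1f MB/s, per node: %.1f MB/s%n",
                clusterMbPerSec, perNodeMbPerSec);
    }
}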
If your cluster has sufficient I/O throughput (e.g. 3x more than needed), yet you observe OOM problems, you should try to:
reduce memtable_total_space_mb (this will cause C* to flush smaller memtables, more often, freeing heap earlier)
lower write_request_timeout to e.g. 2 seconds instead of 10 (if you have big writes, you don't want to keep too many of them in the incoming queues, which reside on the heap)
turn off row_cache (if you ever enabled it)
lower size of the key_cache
consider upgrading to Cassandra 2.0, which moved quite a lot of things off-heap (e.g. bloom filters and index-summaries); this is especially important if you just store lots of data per node
add more HDDs and set multiple data directories, to improve flush performance
set a larger new generation size; I usually set it to about 800M for a 6 GB heap, to avoid pressure on the tenured gen.
if you're sure memtable flushing lags behind, make sure sstable compression is enabled - this will reduce the amount of data physically written to disk, at the cost of additional CPU cycles

WebSphere MQ Transactional Log file system full

The transactional log file system (/var/mqm/log) became full and I am getting an MQRC 2102 (resource problem) from the queue manager when attempting a client connection to it. What course of action can we take to resolve this?
LogPrimaryFiles=2
LogSecondaryFiles=8
LogFilePages=16384
LogType=CIRCULAR
LogBufferPages=0
LogPath=/var/mqm/log/QMGRA/
LogWriteIntegrity=TripleWrite
Is adding additional disk space to /var/mqm/log the only solution?
I have a few queues that were full, but the queue storage file system was only 60% used.
Please give me some ideas on this.
Log file pages are 4096 bytes each so a setting of LogFilePages=16384 results in log files extents of 64MB each. With a setting of LogPrimaryFiles=2 and LogSecondaryFiles=8 there can be up to 10 log files for a total of 640MB. If the file system that the circular logs resides on is less than this amount, it may fill up.
The optimum solution here is to increase the size of the log file disk allocation to something a little larger than the log file extents require. If that is not possible or you need a temporary fix then it is necessary to change the size of the log file requirement by reducing the number of extents and restarting the QMgr. Note that you can adjust the number of log extents but not the size of the extents. If it becomes necessary to change the LogFilePages=16384 parameter then it is necessary to rebuild the QMgr.
The number and size of the extents represents the total amount of data that can be under syncpoint at once, but 640MB is generous in most cases. In terms of time, it also limits the longest possible duration of a unit of work on an active QMgr, because an outstanding transaction will be rolled back if the head pointer in the log file ever overtakes the tail pointer. For example, suppose a channel goes into retry. This holds a batch of messages under syncpoint and keeps that log extent active. As applications and other channels perform their normal operations, additional transactions drive the head pointer forward. Eventually all extents will be used, and although there may be very few outstanding transactions, the oldest one will be rolled back to free up that extent and advance the tail pointer. If the error log shows many transactions being rolled back to free log space, then you really would need to allocate more space to the log file partition and bump up the number of extents.
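The sizing arithmetic from the qm.ini stanza above, as a small sketch:

public class MqLogSizing {
    public static void main(String[] args) {
        int logFilePages = 16384;   // LogFilePages from qm.ini
        int pageSizeBytes = 4096;   // each log file page is 4096 bytes
        int primary = 2;            // LogPrimaryFiles
        int secondary = 8;          // LogSecondaryFiles

        long extentBytes = (long) logFilePages * pageSizeBytes;    // 64 MB per extent
        long maxLogBytes = extentBytes * (primary + secondary);    // 640 MB maximum log space
        System.out.printf("extent: %d MB, maximum log space: %d MB%n",
                extentBytes >> 20, maxLogBytes >> 20);
        // The /var/mqm/log filesystem therefore needs to be somewhat larger than 640 MB.
    }
}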
