I'm using Chronicle Queue v5.17.0 to process messages, and my understanding is that the queue does not lose messages even if the Java process dies (because the queue uses a memory-mapped file which is flushed by the OS).
Will some messages be lost if the VM dies or OS crashes before it flushes memory content to file?
Is there a way to control messages flush to disk?
Thank you!
Yes. If the data has not been flushed to disk, it will be lost. There are also no guarantees that the disk you are writing to has not become corrupted.
Even a forced flush to disk cannot be relied on. As such, if you wish to guarantee that no messages are lost, we recommend that you use chronicle-queue-enterprise to replicate your queue's data to another host. Once the acknowledgement has been received (for each message), you have a safe copy of each message. For more information on Chronicle Queue Enterprise, please contact sales@chronicle.software.
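For reference, here is a minimal sketch of writing and reading a message with Chronicle Queue v5 (the directory name and message text are placeholders, not from the question). The write lands in the memory-mapped file and the OS decides when that page reaches disk, so this alone does not guarantee durability:

```java
import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptAppender;
import net.openhft.chronicle.queue.ExcerptTailer;

public class QueueDemo {
    public static void main(String[] args) {
        // "queue-data" is an example directory; the queue is backed by memory-mapped files in it
        try (ChronicleQueue queue = ChronicleQueue.singleBuilder("queue-data").build()) {
            ExcerptAppender appender = queue.acquireAppender();
            appender.writeText("hello");           // written to the mapped file; the OS flushes it later

            ExcerptTailer tailer = queue.createTailer();
            System.out.println(tailer.readText()); // prints "hello"
        }
    }
}
```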
Is it okay to hold large state in RocksDB when using Kafka Streams? We are planning to use RocksDB as an event store to hold billions of events for an indefinite amount of time.
Yes, you can store a lot of state there but there are some considerations:
The entire state will also be replicated on the changelog topics, which means your broker will need to have enough disk space for it. Note that this will NOT be mitigated by KIP-405 (Tiered Storage) as tiered storage does not apply for compacted topics.
As @OneCricketeer mentioned, rebuilding the state can take a long time if there's a crash. However, you can mitigate that in several ways:
Use a persistent store and re-start the application on a node with access to the same disk (StatefulSet + PersistentVolume in K8s works).
With exactly-once semantics, the state will still be rebuilt from scratch after an unclean shutdown until KIP-844 is implemented. Once that PR is merged, only a small amount of data will have to be replayed.
Have standby replicas. They enable failover as soon as the consumer session timeout expires after the Kafka Streams instance crashes.
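As an illustration of the persistent-store and standby-replica points above, here is a hedged sketch of the relevant Kafka Streams settings (the application id, bootstrap server and state directory are placeholder values):

```java
import org.apache.kafka.streams.StreamsConfig;
import java.util.Properties;

public class StreamsProps {
    public static Properties build() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-store-app");  // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder
        // Point the persistent RocksDB stores at a PersistentVolume mount so a restarted
        // pod can reuse its local state instead of replaying the changelog from scratch.
        props.put(StreamsConfig.STATE_DIR_CONFIG, "/data/kafka-streams");
        // Keep one warm standby copy of each store on another instance for fast failover.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        return props;
    }
}
```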
The main limitation would be disk space, so sure, it can be done, but if the app crashes for any reason, you might be waiting for a while for the app to rebuild its state.
We are using IBM MQ and recently we faced an issue where some messages that were declared as sent to the MQ server by our client application were not consumed by our MQ consumer.
We lacked logging of produced/consumed messages, so we tried to check the messages in the MQ server's log/data files.
We found that messages are stored under /var/mqm/qmgrs/MQ_MANAGER/queues/, but we did not find all of the messages in the queue file (older messages were missing).
What is the rollover policy of IBM MQ, and where do old queue files go?
That's not how the queue files work. They are not rollover logs. The same space is continually overwritten as needed to store messages, but messages may not be written there at all if they can be processed through memory caches etc.
PERSISTENT messages are usually logged in files under /var/mqm/log, but there are circumstances where even that can be avoided. Your qmgr's recovery logfile configuration (circular/linear etc) will determine whether historic information about PERSISTENT messages remains available.
NONPERSISTENT messages are never logged in those files.
In IBM MQ messages can be either persistent or non-persistent.
If a message is persistent it will normally be written to the transactional logs (usually under /var/mqm/log/MQ_MANAGER/active) before a commit completes or before the PUT completes if not done under a unit of work.
If a message is non-persistent it will not be written to the transactional logs.
At this point either type of message may reside only in memory, and it will only be written to the queue file (usually under /var/mqm/qmgrs/MQ_MANAGER/queues) if the queue manager needs to offload memory or if the message is persistent and a checkpoint is taken.
If the message is consumed in a timely manner it may never be written to the queue file.
The queue file will shrink in size when space taken up by messages that are no longer needed can be reclaimed. This happens automatically and is not configurable or, as far as I know, documented by IBM.
Non-persistent messages generally do not survive a queue manager restart.
Transactional logs can be configured as circular or linear. If circular the logs will be reused once they are no longer needed. If linear with automatic log management (introduced in 9.0.2) they will work similarly to circular. If linear without automatic log management, what happens to logs that are no longer needed would be based on your own log management.
If the message is still in the transactional log you may be able to view it as described in "Where's my message? Tool and instructions to use the MQ recovery log to find out what happened to your persistent MQ messages on distributed platforms".
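To make the persistent/non-persistent distinction concrete, here is a hedged JMS 2.0 sketch (the queue name and connection factory wiring are assumptions, not taken from the question). Only the PERSISTENT send is hardened to the recovery log:

```java
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.JMSContext;
import javax.jms.JMSProducer;
import javax.jms.Queue;

public class PersistenceDemo {
    // cf is assumed to be an IBM MQ JMS ConnectionFactory already configured for the queue manager
    static void send(ConnectionFactory cf) {
        try (JMSContext ctx = cf.createContext()) {
            Queue queue = ctx.createQueue("queue:///APP.QUEUE");   // example queue name
            JMSProducer producer = ctx.createProducer();

            // Persistent: written to the transactional (recovery) log before the send completes
            producer.setDeliveryMode(DeliveryMode.PERSISTENT)
                    .send(queue, "survives a queue manager restart");

            // Non-persistent: never written to the recovery log; may only ever exist in memory
            producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT)
                    .send(queue, "may be lost if the queue manager fails");
        }
    }
}
```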
I am using the following pipeline to forward data
Auditbeat ---> logstash ---> ES
Suppose the Logstash machine goes down; I want to know how Auditbeat handles the situation.
I would like to know the specifics like
is there a retry mechanism?
how long will it retry?
what happens to the audit logs, will they be lost?
The reason I ask question 3 is that we enabled Auditbeat by disabling the auditd service (which was generating the audit logs under /var/log/audit/audit.log). So:
if Logstash goes down, no data forwarding happens and hence there is a chance of data loss. Please clarify.
if Auditbeat is storing the data while Logstash is down, where is it doing so, and how much memory (disk space) is allocated to this buffering?
Thanks in advance
Auditbeat has an internal queue which stores the events before sending them to the configured output; by default this is a memory queue that stores up to 4096 events.
If the queue is full, no more events will be stored until the output comes back and starts receiving data from Auditbeat again, so there is a risk of data loss here.
You can change the number of events that the memory queue stores.
There is also the option to use a file queue, which saves the events to disk before sending them to the configured output, but this feature is still in beta.
You can read about the internal queue in the documentation.
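For illustration, a sketch of the relevant settings in auditbeat.yml (the values shown are examples based on the defaults, not a recommendation; the spool queue was still beta at the time of writing):

```yaml
# In-memory queue (the default): holds up to this many events while the output is down
queue.mem:
  events: 4096

# Beta file spool queue: buffers events on disk instead of in memory
#queue.spool:
#  file:
#    path: "${path.data}/spool.dat"
#    size: 512MiB
```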
I am using circular logging. Because of human intervention, one of the queue files is corrupted.
Since circular logging cannot recover corrupted queue files, what will happen next?
Will the queue manager create an empty queue file for that queue and start writing messages to it? Or will it just show the pending messages on the queue but not allow applications to process them?
As you correctly note, MQ cannot recover from a damaged queue file when it is configured for circular logging.
Will the queue manager create an empty queue file for that queue and start writing messages to it? Or will it just show the pending messages on the queue but not allow applications to process them?
None of the above. The queue manager will return an error to any process attempting to access that queue.
When a queue file is damaged, it may or may not have had messages in it. There is no automatic recovery possible that would correctly reconcile the state of any messages that may have been enqueued, therefore no further processing is done on that queue and any access returns an error. Human intervention is required in that case and the fix is to delete and redefine the queue using runmqsc.
If additional queue recovery is required to make sure messages are not lost in such cases, linear logging is mandatory.
The queue manager is not going to create a new queue file automatically. If you truly have a corrupted queue then you may have to delete and recreate it. It would be helpful if you could provide more info about the error you see indicating the queue is corrupt. Also, what version of MQ are you using?
According to the IBM MQ Information Center, we can back up queue manager data in order to back up and restore the QMGR. One of the steps is to take a copy of the qmgr data and log file directories.
My question is: what do the data and log file directories specifically refer to? Is my understanding below correct?
data directory ---- /var/mqm/qmgrs/QMGR01/
log directory ---- /var/mqm/log/QMGR01/
Another question: MQ has non-persistent and persistent message types. As for non-persistent messages, are they stored only in memory, so that they cannot be recovered after any kind of crash, right? Persistent messages, however, can survive such a crash. Where are persistent messages normally stored?
Please help me out. Thanks very much
Yes, you have the directories correct. Just make sure that if you take a filesystem backup, the QMgr is shut down at the time.
Be aware that point-in-time backups are usually not a good strategy for backing up a QMgr. Whatever messages are on the QMgr at the time will be redelivered when the QMgr is restored, unless you take measures to stop that from happening. If the QMgr is in a cluster, it will be out of synch with the cluster when restored.
Generally the approach to backing up a QMgr is to save the object definitions, the access control lists, any exits and their parm files. Restoring the QMgr is a matter of using crtmqm to create a new instance and running in all the definitions.
Non-persistent messages are stored in memory until they overflow memory and then they are stored to the queue file on disk. If the queue is marked as NPMCLASS(HIGH) then the QMgr will attempt to save and restore non-persistent messages through an orderly shutdown and restart but will discard them if the QMgr crashes.
Persistent messages are hardened to both the queue and log files before control is returned to the calling program if written out of syncpoint. If persistent messages are written under syncpoint, WMQ allows lazy cached writes of the messages but ensures they are all flushed before returning control from the COMMIT command.
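To illustrate that syncpoint behaviour, here is a hedged sketch using the JMS API rather than the MQI calls described above (the connection factory wiring and queue name are assumptions): a persistent message sent in a transacted context is only guaranteed to be on the recovery log once commit() returns.

```java
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.JMSContext;
import javax.jms.Queue;

public class SyncpointDemo {
    // cf is assumed to be an IBM MQ JMS ConnectionFactory for the target queue manager
    static void sendUnderSyncpoint(ConnectionFactory cf) {
        try (JMSContext ctx = cf.createContext(JMSContext.SESSION_TRANSACTED)) {
            Queue queue = ctx.createQueue("queue:///APP.QUEUE");   // example queue name
            ctx.createProducer()
               .setDeliveryMode(DeliveryMode.PERSISTENT)
               .send(queue, "written under syncpoint");
            // Lazy cached log writes are allowed up to this point; the commit forces the
            // persistent message to be flushed to the recovery log before it returns.
            ctx.commit();
        }
    }
}
```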