Can someone please explain Max Jobs and Flow Limit in TIBCO BW5 with a detailed example? How are Max Jobs and Flow Limit related? How can we improve performance using these parameters?
I'm indexing docs in batches and trying to figure out which to prefer: reducing the batch size to fit the existing max_content_length, or enlarging the limit and indexing as many documents as possible per request.
What is the recommended strategy for setting max_content_length for Elasticsearch? Is it OK to have a 1GB limit, for example?
As the famous saying goes: It depends... :-)
There's no right answer to your question because the maximum size that you can send depends on what your cluster can handle based on the software/hardware specs it is running on.
The empirical way of figuring this out is to test different sizes and see which one offers the best throughput, while still allowing the cluster to serve user requests during peak times.
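If you want to measure that empirically, one way is to send _bulk requests of increasing size and compare how long they take. Below is a minimal sketch with the low-level Java REST client; the host, index name, document shape, and candidate batch sizes are made-up placeholders, and a real test should use your actual documents and watch cluster metrics, not just response time:

    import org.apache.http.HttpHost;
    import org.apache.http.entity.ContentType;
    import org.apache.http.nio.entity.NStringEntity;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.Response;
    import org.elasticsearch.client.RestClient;

    public class BulkSizeProbe {
        public static void main(String[] args) throws Exception {
            RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
            int[] batchSizes = {500, 1000, 5000, 10000};   // candidate docs per bulk request
            for (int batch : batchSizes) {
                StringBuilder body = new StringBuilder();
                for (int i = 0; i < batch; i++) {
                    // _bulk expects newline-delimited JSON: an action line, then a source line
                    body.append("{\"index\":{\"_index\":\"bulk-test\"}}\n");
                    body.append("{\"message\":\"sample document ").append(i).append("\"}\n");
                }
                Request request = new Request("POST", "/_bulk");
                request.setEntity(new NStringEntity(body.toString(),
                        ContentType.create("application/x-ndjson")));
                long start = System.nanoTime();
                Response response = client.performRequest(request);
                long millis = (System.nanoTime() - start) / 1_000_000;
                System.out.printf("batch=%d docs -> %d ms (%s)%n",
                        batch, millis, response.getStatusLine());
            }
            client.close();
        }
    }

Plotting documents-per-second against batch size usually shows throughput flattening out at some point; going beyond that point only adds memory pressure on the cluster.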
I have just started learning the Elastic Stack and I already have to diagnose a production issue. Our setup from time to time has problems pulling messages from ActiveMQ into Elasticsearch using Logstash. There is a lag which can be 1-3 hours.
One suspicion is that load went up after the latest release of our application.
Is there a way to find out the total size of the messages stored, grouped by month? Not only their number but their total size. Maybe the documents' size went up rather than the number of documents.
Start by setting up a production monitoring instance to provide detailed statistics on your cluster: https://www.elastic.co/guide/en/elastic-stack-overview/7.1/monitoring-production.html
This will give you metrics like messages per month, average document size, indexing performance, buffer load, etc. A bit more detail on internal JVM performance is available with https://visualvm.github.io/
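If your indices are time-based (the logstash-YYYY.MM.dd naming and the client code below are assumptions; adjust the pattern to whatever your setup uses), you can also spot-check document counts and on-disk size per index with the _cat/indices API and derive a rough average document size from them:

    import org.apache.http.HttpHost;
    import org.apache.http.util.EntityUtils;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.Response;
    import org.elasticsearch.client.RestClient;

    public class MonthlyIndexSizes {
        public static void main(String[] args) throws Exception {
            RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
            // One month's worth of daily indices; summing docs.count and store.size across the rows
            // gives that month's totals, and store.size / docs.count approximates the average doc size.
            Request request = new Request("GET", "/_cat/indices/logstash-2019.06.*");
            request.addParameter("v", "true");
            request.addParameter("h", "index,docs.count,store.size");
            request.addParameter("s", "index");
            Response response = client.performRequest(request);
            System.out.println(EntityUtils.toString(response.getEntity()));
            client.close();
        }
    }

Comparing the averages for the months before and after your release should tell you whether document size, document count, or both went up.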
While putting that piece together, you can also tweak Logstash performance, e.g.:
Tune Logstash worker settings:
Begin by scaling up the number of pipeline workers by using the -w flag. This will increase the number of threads available for filters and outputs. It is safe to scale this up to a multiple of CPU cores, if need be, as the threads can become idle on I/O.
You may also tune the output batch size. For many outputs, such as the Elasticsearch output, this setting corresponds to the size of I/O operations; in the case of the Elasticsearch output, it determines how many events are sent in each bulk request.
From https://www.elastic.co/guide/en/logstash/current/performance-troubleshooting.html
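The same knobs can also be set persistently in logstash.yml rather than on the command line. A sketch with illustrative values only, to be tuned against your own hardware and message sizes:

    pipeline.workers: 8        # number of filter/output threads; start near the CPU core count
    pipeline.batch.size: 250   # events each worker collects before flushing; drives bulk request size
    pipeline.batch.delay: 50   # ms to wait for a batch to fill before flushing anyway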
I have a process with a max flow limit enabled. The value is set at 10. It's an async process and it receives thousands of messages daily. We noticed that at peak time, as the number of messages queued on the EMS server increases, the performance of the TIBCO process declines. Is there any dependency between slowness in TIBCO and an increased inflow of EMS messages? How do we calculate the exact flow limit for a process? Is there a standard procedure?
The FlowLimit configuration setting is a BusinessWorks setting, so I am assuming that you have BusinessWorks engines that are consuming messages from an EMS queue.
The concept of flow control exists in order to ensure that the number of incoming events to a BusinessWorks engine does not cause the JVM to exceed its available memory resources. BusinessWorks implements flow control by temporarily disabling the process starter until the number of jobs in memory falls below a threshold. In the case of EMS-based process starters this entails closing the MessageConsumer, which causes EMS to stop delivering messages to the process. In high-volume messaging scenarios this will cause a backlog of messages on the EMS server. Additionally, it will cause any messages in the prefetch cache on the client side to be re-prioritized for re-delivery on the EMS server side. When this happens you will notice that your outbound message count is greater than your inbound message count in your EMS statistics.
You are best off avoiding flow-controlled scenarios altogether. Is your current FlowLimit parameter realistic for the heap size you are allotting to your JVM and the message payload sizes you are working with? Can you increase your JVM heap size and also your FlowLimit? Are you able to run multiple instances of the BusinessWorks application dispatching off the same queue in order to increase scalability? These approaches may help you scale and avoid message backlogs.
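There is no standard formula for an "exact" FlowLimit, but a back-of-the-envelope sizing like the sketch below can give you a starting point to load-test against. All the numbers in it are assumptions to be replaced with measurements from your own engine:

    public class FlowLimitEstimate {
        public static void main(String[] args) {
            // Assumed figures -- replace with values measured from your own BW engine and JVM.
            double heapMb = 1024;            // -Xmx of the BusinessWorks engine JVM
            double headroomFraction = 0.5;   // keep half the heap for the engine itself, caches, GC headroom
            double perJobMb = 8;             // average message payload plus process data per in-flight job

            double usableMb = heapMb * headroomFraction;
            int flowLimitEstimate = (int) (usableMb / perJobMb);

            // 1024 * 0.5 / 8 = 64 concurrent jobs as a first guess; load-test around that value
            // while watching heap usage and the EMS backlog before settling on a number.
            System.out.println("Rough FlowLimit starting point: " + flowLimitEstimate + " jobs");
        }
    }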
I am a newbie to Storm. When I try Trident with the tutorial example, there is usually a very small number of tuples in one batch (usually no more than 10).
Trident aims to provide high throughput, say millions of messages per second.
So I want to ask: how many tuples per batch is reasonable in the real world?
There is no straightforward answer to that question. It all depends on your workload and what kind of topology you are running. Once you have a desired topology, you can look at the overall throughput metrics and keep bumping up the batch size until you start seeing performance issues, then debug those. If they are just due to the way your processing is structured and you cannot improve it any further, then you can settle for a batch size smaller than that.
From:
https://groups.google.com/forum/?fromgroups=#!topic/storm-user/IfMR-kHvkBg
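As a concrete illustration of where the batch size lives, here is a minimal Trident sketch based on the stock word-count style example; the spout, field names, and numbers are illustrative, and real spouts (e.g. the Kafka spout) expose their own batch/fetch settings instead of a constructor argument:

    import org.apache.storm.Config;
    import org.apache.storm.LocalCluster;
    import org.apache.storm.trident.TridentTopology;
    import org.apache.storm.trident.operation.builtin.Count;
    import org.apache.storm.trident.testing.FixedBatchSpout;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Values;

    public class BatchSizeSketch {
        public static void main(String[] args) throws Exception {
            // The second constructor argument is the maximum batch size for this test spout.
            FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 1000,
                    new Values("the cow jumped over the moon"));
            spout.setCycle(true);

            TridentTopology topology = new TridentTopology();
            topology.newStream("spout1", spout)
                    .groupBy(new Fields("sentence"))
                    .aggregate(new Count(), new Fields("count"));

            Config conf = new Config();
            // How many batches may be in flight at once; raise this together with the batch
            // size while watching throughput and latency, as the answer above suggests.
            conf.setMaxSpoutPending(20);
            new LocalCluster().submitTopology("batch-size-test", conf, topology.build());
        }
    }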
I am having a hard time understanding what is happening in our WebSphere 7 on AIX environment. We have a JDBC Datasource that has a connection pool with a Min/Max of 1/10.
We are running a Performance Test with HP LoadRunner and when our test finishes we gather the data for the JDBC connection pool.
The max pool size shows as 10, the average pool size shows as 9, and the percent used is 12%. With just this info, would you make any changes or keep things the same? The pool size is growing from 1 to 9 during our test, but it says it's only 12% used overall. The final question: every time our test is in its last 15 minutes before stopping, we see an average wait time of 1.8 seconds and an average thread wait of 0.5 seconds, but the percent used is still 10%. FYI, the last 15 minutes of our test do not add additional users or load; it's steady.
Can anyone provide any clarity or recommendations on if we should make any changes? thx!
First, I'm not an expert in this, so take this for whatever it's worth.
You're looking at WebSphere's PMI data, correct? PercentUsed is "Average percent of the pool that is in use." The pool size includes connections that were created, but not all of those will be in use at any point in time. See FreePoolSize, "The number of free connections in the pool".
Based on just that, I'd say your pool is large enough for the load you gave it: with an average pool size of 9 and about 12% utilization, only roughly one connection is in use at any given moment on average.
Your decreasing performance at the end of the test, though, does seem to indicate a performance bottleneck of some sort. Have you isolated it enough to know for certain that it's in database access? If so, can you tell if your database server, for instance, may be limiting things?