Is there a way to get the average CPU utilization over a specific time range from a Kibana graph, instead of calculating it myself?
TL;DR
You can monitor your cluster with Metricbeat.
Although it will only give you the system load, with no further details.
Then again, there are modules, such as the Linux one, that offer more information on CPU usage.
If none of those offers what you need, you might want to roll your own solution ^^
Or wait for an answer from someone with better knowledge.
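If the CPU metrics are already in Elasticsearch, the average over a time window can also be requested directly as an aggregation rather than read off the graph. Here is a minimal sketch of such a search body. It assumes Metricbeat's system module is shipping data and that the field is called `system.cpu.total.norm.pct`, so check the field names in your own index first. The helper name and the time range are made up for illustration.

```python
import json


def avg_cpu_query(start, end, field="system.cpu.total.norm.pct"):
    """Build a search body that averages `field` between two timestamps.

    The default field name is an assumption; use whatever your index holds.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {"range": {"@timestamp": {"gte": start, "lte": end}}},
        "aggs": {"avg_cpu": {"avg": {"field": field}}},
    }


if __name__ == "__main__":
    body = avg_cpu_query("now-1h", "now")
    print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of the index holding the metrics, and the average should come back under `aggregations.avg_cpu.value`.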
I'm indexing docs in batches and trying to work out which to prefer: shrinking the batches to fit the existing max_content_length, or raising the limit and indexing as many documents as possible per request.
What is the recommended strategy for setting max_content_length in Elasticsearch? Is it OK to have a 1GB limit, for example?
As the famous saying goes: It depends... :-)
There's no right answer to your question because the maximum size that you can send depends on what your cluster can handle based on the software/hardware specs it is running on.
The empirical way of figuring this out is to test different sizes and see which one offers the best throughput, while still allowing the cluster to serve user requests during peak times.
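To make the "test different sizes" idea concrete, here is a rough sketch of such a measurement loop. `send_batch` is a hypothetical callable that indexes one list of documents (for example a thin wrapper around a bulk request); it is not part of any Elasticsearch client, and the sizes tried are arbitrary.

```python
import time


def best_batch_size(docs, send_batch, sizes):
    """Return (size, docs_per_second) for the fastest batch size tried.

    `send_batch` is a hypothetical callable that indexes one list of docs.
    """
    results = {}
    for size in sizes:
        started = time.perf_counter()
        # Send the whole document set in chunks of `size`.
        for i in range(0, len(docs), size):
            send_batch(docs[i:i + size])
        elapsed = time.perf_counter() - started
        results[size] = len(docs) / elapsed if elapsed > 0 else float("inf")
    best = max(results, key=results.get)
    return best, results[best]


if __name__ == "__main__":
    # Stand-in sender with a fixed per-request cost, just to exercise the loop.
    def fake_send(batch):
        time.sleep(0.001)

    docs = [{"id": n} for n in range(2000)]
    size, rate = best_batch_size(docs, fake_send, sizes=[100, 500, 1000])
    print(f"best batch size: {size} ({rate:.0f} docs/s)")
```

Note that this sends the full document set once per size, so it is best pointed at a scratch index, and the numbers only mean something if the cluster is also serving its usual traffic while the test runs.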
I have just gotten into Kubernetes and really like its ability to orchestrate containers. I had assumed that when the app starts to grow, I can simply increase the replicas to handle the demand. However, now that I have run some benchmarks, the results confuse me.
I am running Laravel 6.2 w/ Apache on GKE with a single g1-small machine as the node. I'm only using NodePort service to expose the app since LoadBalancer seems expensive.
The benchmarking tools used are wrk and ab. When the replica count is increased to 2, requests/s somehow drops. I would expect requests/s to increase, since there are 2 pods available to serve the requests. Is there a bottleneck occurring somewhere, or is my understanding flawed? I hope someone can point out what I'm missing.
A g1-small instance is really tiny: you get 50% utilization of a single core and 1.7 GB of RAM. You don't describe what your application does or how you've profiled it, but if it's CPU-bound, then adding more replicas of the process won't help you at all; you're still limited by the amount of CPU that GCP gives you. If you're hitting the memory limit of the instance that will dramatically reduce your performance, whether you swap or one of the replicas gets OOM-killed.
The other thing that can affect this benchmark is that, sometimes, for a limited time, you can be allowed to burst up to 100% CPU utilization. So if you got an instance and ran the first benchmark, it might have used a burst period and seen higher performance, but then re-running the second benchmark on the same instance might not get to do that.
In short, you can't just crank up the replica count on a Deployment and expect better performance. You need to identify where in the system the actual bottleneck is. Monitoring tools like Prometheus that can report high-level statistics on per-pod CPU utilization can help. In a typical database-backed Web application the database itself is the bottleneck, and there's nothing you can do about that at the Kubernetes level.
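As a back-of-the-envelope illustration of the CPU-bound case, here is a deliberately simplistic model. It assumes CPU is the only limit, and the 50 ms of CPU per request is an invented figure, not something measured on your app.

```python
def max_requests_per_second(node_cpu_cores, cpu_seconds_per_request, replicas):
    """Upper bound on throughput when the node's CPU is the only limit.

    `replicas` is accepted to make the point explicit: it does not appear
    in the result, because all pods share the same CPU allowance.
    """
    if replicas < 1:
        raise ValueError("need at least one replica")
    return node_cpu_cores / cpu_seconds_per_request


if __name__ == "__main__":
    # Assumed numbers: half a core (g1-small), 50 ms of CPU per request.
    for replicas in (1, 2, 4):
        print(replicas, max_requests_per_second(0.5, 0.05, replicas))
```

The ceiling is the same for every replica count. In practice a second pod can even lower the measured figure, since it adds memory pressure and scheduling overhead on the same half core.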
Hope you can help me with this!
What is the best approach for working out and setting resource requests and limits per pod?
I was thinking of setting an expected amount of traffic and coding some load tests, then starting a single pod with some "low limits" and running the load test until it gets OOM-killed, then tuning memory again (something like overclocking) until I find a bottleneck, then attacking CPU until everything is "stable", and so on. Then I would use that "limit" as the "request value" and use double the "request value" as the "limit" (or a safe value based on the results). Finally, I would scale out for average traffic (a fixed number of pods) and set pod autoscaling rules for peak production values.
Is this a good approach? What tools and metrics do you recommend? I'm using prometheus-operator for monitoring and vegeta for load testing.
What about vertical pod autoscaling? Have you used it? Is it production ready?
BTW: I'm using AWS managed solution deployed w/ terraform module
Thanks for reading
I usually start my pods with no requests or limits set. Then I leave them running for a bit under normal load to collect metrics on resource consumption.
I then set memory and CPU requests to +10% of the max consumption I got in the test period and limits to +25% of the requests.
This is just an example strategy, as there is no one-size-fits-all approach for this.
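The arithmetic behind that rule of thumb, as a small sketch. The function name, the units (millicores and MiB) and the sample peaks are my own choices for illustration; only the +10% and +25% come from the strategy above.

```python
def suggest_resources(peak_cpu_millicores, peak_memory_mib,
                      request_headroom=0.10, limit_headroom=0.25):
    """Requests = observed peak + 10%; limits = requests + 25%."""
    cpu_request = peak_cpu_millicores * (1 + request_headroom)
    mem_request = peak_memory_mib * (1 + request_headroom)
    return {
        "requests": {
            "cpu_m": round(cpu_request),
            "memory_mib": round(mem_request),
        },
        "limits": {
            "cpu_m": round(cpu_request * (1 + limit_headroom)),
            "memory_mib": round(mem_request * (1 + limit_headroom)),
        },
    }


if __name__ == "__main__":
    # Hypothetical peaks observed during the test period.
    print(suggest_resources(peak_cpu_millicores=200, peak_memory_mib=400))
```

With peaks of 200m CPU and 400 MiB of memory this gives requests of 220m / 440 MiB and limits of 275m / 550 MiB.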
The VerticalPodAutoscaler is more about making sure that a Pod can run. So it starts it low and doubles memory each time it gets OOMKilled. This can potentially lead to a Pod hogging resources. It is also limited in that it doesn't take under-performance into account. If your app is under-resourced it might still respond, but not within a timeframe you consider acceptable.
I think you are taking a good approach, as you are looking at the application under load and assessing what it needs to perform as you want it to. I doubt I can suggest any tools you aren't already aware of, but if it helps there is some more discussion in "How to set the right cpu millicores for a container?" and the threads that link from it.
Please give me instructions for JMeter: how can I test the performance of my site, to check that its response is good and that it can bear 500 to 1000 users at the same time? Also, please suggest scenarios I can run to test the performance of my site.
I have tested my site using JMeter, but I cannot understand what the results mean. Kindly tell me the results (response time, throughput, mean time, etc.) of some sites that perform well, so that if I get similar results I can be satisfied that I am on the right track.
What should the average response time, throughput, deviation, median, mean, etc. normally be for a website?
Thanks
While load testing, you should also use a tool that monitors resource usage.
For Java, for example, there is jvisualvm.
You can find it at Program Files\Java\jdk1.6.0_38\bin\jvisualvm.exe
You can use it to see your CPU utilization and memory consumption.
Hope this helps.
I have a requirement to insert 10,000 docs into MarkLogic in less than 10 seconds.
I tested on a single-node MarkLogic server in the following way:
use xdmp:spawn to pass the doc insertion task to the task server;
use xdmp:document-insert without specifying a forest explicitly;
the task server has 8 threads to process tasks;
We have enabled CPF.
The performance is very bad: it took 2 minutes to finish the 10,000 doc creation.
I'm sure the performance will be better if I tested it in a cluster environment, but I'm not sure whether it can finish in less than 10 seconds.
Please advise the way of improving the performance.
I would start by gathering more information. What version of MarkLogic is this? What OS is it running on? What's the CPU? RAM? What's the storage subsystem? How many forests are attached to the database?
Then gather OS-level metrics, to see if one of the subsystems is an obvious bottleneck. For now I won't speculate beyond that.
If you need a fast load, I wouldn't use xdmp:spawn for each individual document, nor would I use CPF. But 2 minutes for 10k docs doesn't necessarily sound slow. On the other hand, I have reached up to 3k docs/sec, but without any range indexes or transforms, and with a very fast disk (e.g. SSD).
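To put the numbers from the question side by side, here is a quick calculation. The helper names are just for illustration; the figures are the ones given in the question.

```python
def docs_per_second(doc_count, seconds):
    """Average ingest rate for a load of `doc_count` docs in `seconds`."""
    return doc_count / seconds


def speedup_needed(target_seconds, measured_seconds):
    """How many times faster the load must get to meet the target."""
    return measured_seconds / target_seconds


if __name__ == "__main__":
    print("required:", docs_per_second(10_000, 10), "docs/s")
    print("measured:", round(docs_per_second(10_000, 120), 1), "docs/s")
    print("speedup needed:", speedup_needed(10, 120), "x")
```

So the target is 1,000 docs/sec against roughly 83 docs/sec measured, i.e. about a 12x improvement.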
HTH!
Assuming a 2-socket server, 128GB-256GB of RAM, and fast IO (400-800MB/sec sustained):
Appropriate number of forests (12 primary or 6 primary/6 secondary)
More than 8 threads assuming enough cores
CPF off
Turn on perf history, look in metrics, and you will see where the bottleneck is.
SSD is not required, just IO throughput, which multiple spinning disks provide without issue.