Let's consider dozens or hundreds of slaves on one machine. Is there any point in setting up multiple Jenkins masters on the same host to get better performance? Or does one big master do it better?
Thanks,
Michael
One big master is always better #M.Stefanczuk
I'm new to the ELK stack and I'm working with 10 TB of data stored on physical servers. Are there recommendations on how many data nodes and master nodes I need, on best practices for configuring the cluster so it runs smoothly in production, and on other tools or technologies used with Elasticsearch to improve performance?
#ameur you can refer to these pages:
https://www.elastic.co/guide/en/elasticsearch/reference/current/general-recommendations.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-search-speed.html
Regarding master nodes, you should have a minimum of 3 nodes (go for 5 if possible).
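To illustrate why 3 and 5 are the usual numbers, here is a minimal sketch. It assumes master election needs a strict majority of the master-eligible nodes; the function name is made up for illustration.

```python
def tolerated_master_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a strict majority remains."""
    quorum = master_eligible // 2 + 1  # assumption: election needs a strict majority
    return master_eligible - quorum


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} master-eligible node(s): tolerates {tolerated_master_failures(n)} failure(s)")
```

Under that assumption, 2 nodes tolerate no more failures than 1, while 3 tolerate one failure and 5 tolerate two.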
For data nodes, there are multiple factors involved, for example:
resources like RAM, CPU, and disk
throughput like qps, wps, etc.
So there is no straightforward answer; you will need to run some performance tests to find the right number.
Don't forget to read about sharding strategy: https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html
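As a rough starting point before any performance test, a back-of-the-envelope shard estimate for the 10 TB in the question could look like the sketch below. The 50 GB target shard size and the single replica are assumptions chosen for illustration, not values from the question; adjust them after reading the sharding page.

```python
import math


def estimate_primary_shards(data_gb: float, target_shard_gb: float = 50.0) -> int:
    """Primary shards needed so that no shard exceeds the target size."""
    return math.ceil(data_gb / target_shard_gb)


if __name__ == "__main__":
    data_gb = 10 * 1024  # 10 TB from the question, taking 1 TB = 1024 GB
    replicas = 1         # assumption: one replica per primary
    primaries = estimate_primary_shards(data_gb)
    total_shards = primaries * (1 + replicas)
    print(f"{primaries} primary shards, {total_shards} shards including replicas")
```

This only bounds shard size. It says nothing about how many shards a single data node can serve comfortably, which is what the performance tests are for.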
I would like to deploy Elasticsearch, Logstash and Kibana in 3 different flavors; the characteristics of each one:
o 32 vCPU, 384 GB RAM and 900 GB HDD
I would like to supervise 100 servers, so approximately 33 servers in each flavor.
Do you think it's a good idea to use this configuration? And is it a problem to use this huge amount of memory?
Another question how many nodes should I use?
Without details it's hard to give you general advice, but Elastic recommends never going above roughly 31 GB for the JVM heap. Here are the reasons why
You should read the whole page; it explains why it is generally far better to have a lot of small/medium hosts instead of a few big ones.
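To make the memory point concrete, here is a small sketch. It assumes the commonly cited guidance of giving the heap at most half of the RAM and capping it at about 31 GB; the exact cap depends on the JVM, so treat both numbers as assumptions.

```python
def suggested_heap_gb(ram_gb: float, heap_cap_gb: float = 31.0) -> float:
    """Heap size under the 'half of RAM, capped' rule of thumb (both values are assumptions)."""
    return min(ram_gb / 2, heap_cap_gb)


if __name__ == "__main__":
    for ram in (16, 64, 384):
        heap = suggested_heap_gb(ram)
        print(f"{ram} GB RAM: heap {heap:g} GB, {ram - heap:g} GB left outside the heap")
```

On a 384 GB host a single node would use only a small fraction of the memory for its heap, which is one reason several medium hosts tend to be a better fit than one huge one.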
I also recommend reading this post; it will give you some insight into how to design an Elastic cluster, especially the distinction between roles in a cluster and the difference in hardware each role needs.
For your question:
Another question how many nodes should I use?
There is no good answer without knowing the volume of data, the read/write load, etc.
And lastly, I strongly doubt that using the same configuration for the Kibana, Logstash and Elasticsearch hosts is a good idea. They just don't do the same sort of processing. You should start with a small configuration and grow it incrementally once you have real data.
I want to use one computer as a data node in two different Hadoop clusters. I tried changing the ports, but that did not work. Please tell me if there are any port changes needed.
If you have one 'box' that you want to use as 2 computers, you could consider running 2 virtual machines on it.
This is probably not the only way, but I am quite confident that it will work because I have seen 2 VMs in a cluster before.
I have been working on a project where the ZooKeepers are colocated on the same servers as my Accumulo cluster / HDFS. Everything works as far as communication between them goes, but I am about to rework some of the other infrastructure and might look into this more.
I am wondering whether this is best practice, because it occurred to me that maintenance might be easier if things were split up. I know HDFS/Accumulo need to be together, but should the ZooKeepers remain on the same machines, be moved to another one, or each get a separate machine (probably no reason to do this)? Are there any benefits, for example for autoscaling, in that HDFS/Accumulo might perform better on their own, 'uninterrupted' by the ZooKeepers?
I assume you're talking about the master nodes (Namenode, AccumuloMaster, etc). If so, then there's no problem (with 2 caveats). If you're talking about datanodes, then it's pretty bad practice and ZooKeeper should be moved to (at least) the master nodes.
There are two things that absolutely kill ZooKeeper performance: swapping and seeking. So, as long as there's enough memory and a dedicated device (not mount) for ZooKeeper you should be fine.
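A quick sanity check for the second caveat could look like the sketch below. It only compares filesystem device IDs, so it catches the worst case (ZooKeeper and HDFS data on the very same filesystem); two partitions of one physical disk would still pass, so it is not proof of a dedicated device. Both paths are hypothetical placeholders.

```python
import os


def shares_device(path_a: str, path_b: str) -> bool:
    """True if both paths live on the same filesystem (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    zk_data_dir = "/var/lib/zookeeper"  # hypothetical ZooKeeper dataDir
    hdfs_data_dir = "/data/hdfs"        # hypothetical HDFS data directory
    if os.path.exists(zk_data_dir) and os.path.exists(hdfs_data_dir):
        if shares_device(zk_data_dir, hdfs_data_dir):
            print("ZooKeeper and HDFS share a filesystem: expect seek contention")
        else:
            print("Different filesystems; still check they are on different physical disks")
    else:
        print("Adjust the paths to your own layout before running the check")
```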
Is it a recommended practice to run multiple Elasticsearch nodes on one physical (virtual) machine? I'm talking about a production environment.
I currently have three virtual machines that unicast each other. Setup:
node.name:"VM1"
master:true
data:true
node.name:"VM2"
master:true
data:true
node.name:"VM3"
master:false
data:true
There's a request to have a dedicated master node on the first virtual machine (next to VM1). I'm trying to avoid that and I'm looking for strong arguments why I shouldn't do this.
Please advise.
To me, having a dedicated master makes sense in a larger environment. If your nodes are not that busy, having a data node also act as a master would not be the end of the world. I would be more comfortable having 3 data nodes for high availability.
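One way to weigh the request is to check which single machine failure each layout survives. The sketch below assumes master election needs a strict majority of the master-eligible nodes; the layouts mirror the setup in the question, and the "spread" layout is a hypothetical alternative added for comparison.

```python
from typing import Dict


def survives_host_loss(masters_per_host: Dict[str, int]) -> Dict[str, bool]:
    """For each host, report whether a master majority remains if that host goes down."""
    total = sum(masters_per_host.values())
    quorum = total // 2 + 1  # assumption: strict majority of master-eligible nodes
    return {host: total - count >= quorum for host, count in masters_per_host.items()}


if __name__ == "__main__":
    layouts = {
        "current": {"VM1": 1, "VM2": 1, "VM3": 0},   # masters on VM1 and VM2
        "proposed": {"VM1": 2, "VM2": 1, "VM3": 0},  # extra dedicated master next to VM1
        "spread": {"VM1": 1, "VM2": 1, "VM3": 1},    # hypothetical: VM3 made master-eligible
    }
    for name, layout in layouts.items():
        print(name, survives_host_loss(layout))
```

Under that assumption, the proposed layout still loses its majority whenever the first machine goes down, whereas one master-eligible node per machine survives the loss of any single machine.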