Increasing Spark workers and Cassandra nodes takes more time - performance

So, I have a small cluster with 3 Spark workers (2 executors each), and I have also installed Cassandra on the same nodes in order to achieve data locality. To evaluate the speed (times taken from the Spark UI), I ran the same code first with one Spark-Cassandra node, then with two, and then with three, three times in each case. The results are below, but I do not understand why it takes more time with 3 nodes than with 2.
I am not sure what to check. For the times above, spark.sql.shuffle.partitions was 96, but I also tried the "3 / 3" case with 18 partitions and the times were still about the same (3 min 13 s, 3 min 5 s, 3 min 19 s).
What could be happening, and why? Please let me know if you need more information.
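For reference, the setting mentioned above is a per-session Spark SQL configuration. A minimal sketch of how it can be set (Spark's Java API; the app name, and the assumption of 3 cores per executor, are illustrative, not from the question):

```java
import org.apache.spark.sql.SparkSession;

public class ShufflePartitionsDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("shuffle-partitions-demo") // placeholder name
                // 96 in the original runs; 18 in the retried "3 / 3" case
                .config("spark.sql.shuffle.partitions", "18")
                .getOrCreate();

        // With 3 workers x 2 executors, 18 partitions gives each of the
        // 6 executors 3 shuffle tasks per stage (one per core, if each
        // executor has 3 cores -- an assumption).
        System.out.println(spark.conf().get("spark.sql.shuffle.partitions"));
        spark.stop();
    }
}
```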
Edit1
The only difference between the first two cases and the third is the replication factor of the Cassandra keyspace: it is 1 for the first two cases and 3 for the third. Could that be the reason? Network traffic and latencies?
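One way to confirm what the keyspace is actually configured with is to read its replication map from the schema tables. A minimal sketch (assuming the DataStax Java Driver 3.x, Cassandra 3.x's system_schema tables, and placeholder contact point and keyspace name):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CheckReplication {
    public static void main(String[] args) {
        // Contact point and keyspace name are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            Row row = session.execute(
                    "SELECT replication FROM system_schema.keyspaces "
                    + "WHERE keyspace_name = 'mykeyspace'").one();
            // The replication map holds the strategy class and the
            // replication factor(s), e.g. {class=..., replication_factor=3}.
            System.out.println(row.getMap("replication", String.class, String.class));
        }
    }
}
```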
Edit2
Below are some pictures from the Stages tab of the Spark UI with 3 Spark-Cassandra nodes (the 3rd case).

Related

Multiple datacenter replication and local quorum?

I created a cluster of 6 nodes:
3 nodes in EU west1 and 3 nodes in EU west2.
I set the locality for each group of nodes like: --locality=region=europe,datacenter=west1
I also set the replication factor to 6, so that every node holds all ranges and all the data.
What happens if the connection between the datacenters is lost? Does the whole cluster go down?
I tried killing the 3 nodes in one of the datacenters, and the cluster is not operational, because a majority of the replicas are down and the quorum of 4 cannot be reached.
Is it possible to make the 2 datacenters work with their own local quorum of 2/3?
I also played a bit with the replication settings. Sometimes the cluster stays healthy when I kill 3 of the 6 nodes, and I am able to write to it; sometimes I can only read from it. The cluster keeps working with a replication factor of 5 and 3 of the 6 nodes killed. I am still playing with this, but if someone can give me more information it would be very helpful.
Being able to replicate across datacenters is a very cool feature, but losing the whole cluster when one of the datacenters goes down ruins the whole good idea, at least for me.
CockroachDB requires a majority of replicas to be fully operational, which means > half, not >= half. In order to survive the loss of a full datacenter or region, you must have three DCs/regions, not two. Try running two nodes in each of three regions instead of three nodes in two regions.
Is it possible to make the 2 datacenters work with their own local quorum of 2/3?
Not for a single table (because it would be impossible to guarantee consistency if each datacenter were able to act in isolation from the other). You've configured the data to be replicated across all six replicas, which means four replicas are required to make a quorum. If you want each datacenter to be able to operate independently of the other, you would need two separate tables, with each one configured to be located within one of the datacenters.
Thanks for the answer; just to clear up a few things. It looks like you got my point and what I want to accomplish.
As far as I understand, if I have 2x3 nodes in 2 different DCs and one DC goes down, I have 3 live nodes, while for a quorum I need at least 4 (N/2 + 1).
So if I have 3x3, I can lose one DC, because with 2 DCs still live I will have a quorum.
And one last question: if I don't set the replication factor to 9 and I lose 3 nodes in one DC, some ranges will not be available, right?
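A quick way to sanity-check the quorum arithmetic used in this exchange: a majority means strictly more than half of the replicas, i.e. N/2 + 1 with integer division. A minimal sketch (plain Java, no CockroachDB APIs; the numbers mirror the scenarios above):

```java
public class QuorumMath {
    // Majority quorum: strictly more than half of the replicas.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    public static void main(String[] args) {
        // 6 replicas across 2 DCs: quorum is 4, so losing a 3-node DC
        // leaves only 3 survivors and ranges become unavailable.
        System.out.println(quorum(6)); // 4

        // 9 replicas across 3 DCs: quorum is 5, so losing one 3-node DC
        // still leaves 6 survivors and ranges stay available.
        System.out.println(quorum(9)); // 5
    }
}
```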

Hadoop: The Definitive Guide - why is a block in HDFS so large?

I came across the following paragraph in the Definitive Guide (HDFS Concepts - Blocks) and could not understand it.
Map tasks in MapReduce normally operate on one block at a time, so if you have too few tasks (fewer than nodes in the cluster), your jobs will run slower than they could otherwise.
I am wondering how jobs would be slower when the number of tasks is small compared to the total number of nodes in the cluster. Say there are 1000 nodes in the cluster and 3 tasks (by tasks I mean blocks, since each block is sent to a node as a single task); won't the time it takes to get the result always be less than in a scenario with, say, 1000 nodes and 1000 tasks?
I am not convinced by the paragraph given in the Definitive Guide.
The paragraph you quoted from the book basically says "utilize as many nodes as you can." If you have 1000 nodes and only 3 blocks or tasks, only 3 nodes are working on your tasks, and the other 997 nodes do nothing for you. If you have 1000 nodes and 1000 tasks, and each of those 1000 nodes holds some part of your data, all 1000 nodes will be utilized. Note that the comparison is for the same input: splitting it into 1000 blocks instead of 3 means each task processes roughly a thousandth of the data instead of a third, so the job as a whole finishes much sooner. You also take advantage of data locality, since each node will first work on its local data.
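To make the arithmetic concrete, here is a small sketch (plain Java; the file and block sizes are illustrative, not from the book) of how the block size determines the number of map tasks for a fixed input:

```java
public class BlockMath {
    // One map task per block: ceil(fileSize / blockSize).
    static long mapTasks(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40;

        // With 128 MiB blocks, a 1 TiB file yields 8192 map tasks:
        // plenty to keep all 1000 nodes busy.
        System.out.println(mapTasks(oneTiB, 128L << 20)); // 8192

        // Split the same file into only 3 huge blocks and only 3 nodes
        // work, each grinding through ~341 GiB alone while 997 idle.
        System.out.println(mapTasks(oneTiB, 342L << 30)); // 3
    }
}
```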

Elasticsearch replicas and nodes

I am now testing clustering with Elasticsearch and have a question about replicas between the nodes.
As you can see in the screenshot from the Head plugin, I have 2 indexes:
movies has 5 shards and 2 replicas
students has 5 shards and 1 replica
Which one is better and which one is faster with 3 active nodes and why?
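For context, the shard and replica counts in question are per-index settings chosen at index creation. A minimal sketch of creating an index like movies over HTTP (assuming Java 11+'s built-in HTTP client and a node at localhost:9200, which is a placeholder address):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateIndex {
    public static void main(String[] args) throws Exception {
        // 5 primary shards with 2 replicas each: 15 shard copies in total,
        // i.e. 3 full copies of the data spread over the 3 nodes.
        String settings =
            "{\"settings\": {\"number_of_shards\": 5, \"number_of_replicas\": 2}}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/movies")) // placeholder host
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settings))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```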
The costs of having more replicas would be:
more storage space required (obviously)
lower indexing performance
while the advantages would be:
better search performance
better resiliency
Note that having 2 replicas does not mean your cluster can endure 2 nodes going down: all indexing requests would fail if only one of the 3 copies of a shard is available (because of the indexing quorum).
For a detailed explanation, please refer to this official document.
"Better" is subjective.
With two replicas, you can handle two of the three machines in your cluster going down, though at the price of writing all the data to every machine. Read performance should also be higher as the cluster has more nodes from which to request the data.
With one replica, you can only survive the outage of one machine in your cluster, but you'll get a performance boost by writing 2 copies of the data across 3 servers (less IO on each server).
So it comes down to risk and performance. Hope that helps.

What is the relevance of mentioning the number of tasks in Storm?

I just wanted to know the actual relevance of tasks in Storm with respect to the output or performance, since they have nothing to do with parallelism. Does choosing more than 1 task for a component change the output in any way? And what will the flow be then? Or, if I choose number of tasks > number of executors, how does that make a difference in the flow or the output? (Here I am just taking the basic word-count example.)
It would be very helpful if anybody could explain this to me, with or without an example.
For example, say:
I have a topology with 1 spout and 3 bolts, and I have configured only 2 worker ports, which means all 4 components (1 spout and 3 bolts) will run on these workers only. Now I have set 2 executors for the 1st bolt, which means 2 threads of that bolt will run in parallel. If I now set the number of tasks to 3, what difference will this make to the output or performance?
And if I have specified fields grouping, will the grouping be preserved across the different executors? (Please correct me if I'm wrong.)
Did you read this article? https://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html
To pick up your example: if you set #tasks=3 and specify 2 executors with fieldsGrouping, the data will be partitioned into 3 substreams (= #tasks); 2 substreams go to one executor and the third to the other. However, using 3 tasks with 2 executors allows you to increase the number of executors to 3 later via the rebalance command.
As long as you do not want to increase the number of executors during execution, #tasks should be equal to #executors (i.e., just don't specify #tasks).
For your example (if you don't want to change the parallelism at runtime), you will most likely get an imbalanced workload across the two executors (one executor processes 33% of the data, the other 66%). However, this is a problem only in this special case and not in general: if you had 4 tasks instead, each executor would process 2 substreams and no imbalance would occur. A sketch of this configuration follows below.
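Here is a minimal sketch of the setup discussed above (Storm's Java API; WordSpout, SplitBolt, and CountBolt are placeholders for the word-count components from the question, so this shows only the parallelism configuration, not a complete topology):

```java
import org.apache.storm.Config;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class WordCountParallelism {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();

        // Placeholder spout and bolts standing in for the word-count example.
        builder.setSpout("spout", new WordSpout(), 1);
        builder.setBolt("split", new SplitBolt(), 2)
               .shuffleGrouping("spout");

        // 2 executors (threads) but 3 tasks: the stream keyed on "word" is
        // partitioned into 3 substreams, so one executor runs 2 tasks and
        // the other runs 1 -- the 66%/33% imbalance described above. It also
        // leaves headroom to later rebalance up to 3 executors.
        builder.setBolt("count", new CountBolt(), 2)
               .setNumTasks(3)
               .fieldsGrouping("split", new Fields("word"));

        Config conf = new Config();
        conf.setNumWorkers(2); // the 2 worker slots from the question
    }
}
```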

Cassandra: 6 node cluster, RF=2: What to do when 2 nodes crash?

Good Day
We have a 6-node Cassandra cluster with a replication factor of 3 on our keyspaces. Our applications use QUORUM, so we can survive the loss of a single node without it affecting the application.
Let's assume I lose 2 nodes at the same time. If my application were using a consistency level of ONE, it would have been fine and would have run without any issues, but we would like to keep the level at QUORUM.
My question is: if 2 nodes crash at the same time and I run nodetool removenode for each of the crashed nodes, will the cluster then rebalance the data over the remaining 4 nodes (getting back to 3 replicas), and once that is done, should my application be able to work again using QUORUM?
In the title you write RF=2, but in the text RF=3. You did not specify the Cassandra version, or whether you are using single-token nodes or vnodes. A QUORUM consistency level with RF=3 means that 2 replicas must acknowledge a write/read before returning. It is possible that you face minimal or no issues even if 2 nodes die; it depends on how many ranges (partitions) the nodes share.
Have a look at this distribution example, which is exactly like the one you describe: RF=3, 6 nodes.
Using single tokens:
If you lose pairs like (1,4), (2,5), or (3,6), your cluster should allow all writes and reads with no issues. A good client will recognize the downed nodes and stop using them as coordinators. Other situations, for example the loss of nodes (1,6), might lead to a situation in which any read/write of the F and E tokens will fail (assuming an equal distribution, about 33% of read/write operations will fail).
Using vnodes:
Here the situation is slightly different and also depends on which pairs you lose. If you repeat the worst scenario above and lose a pair of nodes like (1,6), only the B tokens will be affected in read/write operations, since that is the only token range shared between them.
That said, just to clarify the possible scenarios, here's your answer: nodetool removenode should be used as explained in this document. Use removenode IF AND ONLY IF you want to reduce the cluster size (here is what to do if you want to replace a dead node instead). Once you have done that, your application will start working again using QUORUM, since other nodes will become responsible for the partitions previously assigned to the dead nodes.
If you are using the official DataStax Java Driver, you might want to let the driver temporarily fight your monsters by specifying a DowngradingConsistencyRetryPolicy, as sketched below.
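A minimal sketch of wiring in that policy (assuming Java Driver 3.x; the contact point and keyspace name are placeholders):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DowngradingConsistencyRetryPolicy;

public class DowngradingClient {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // placeholder contact point
                // Retries a failed QUORUM request at a lower, still
                // achievable consistency level instead of failing outright.
                .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
                .build();
        Session session = cluster.connect("mykeyspace"); // placeholder keyspace
        // ... run queries as usual; requests that cannot reach QUORUM
        // are transparently downgraded by the retry policy ...
        cluster.close();
    }
}
```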
HTH,
Carlo
