All TMs on the same NodeManager, leading to high pressure on HDFS

We have a 100-node Hadoop cluster. I wrote a Flink app that writes many files to HDFS via BucketingSink. When I run the app on YARN, all TaskManagers are scheduled on the same NodeManager, which means every subtask runs on that one node. This opens many file descriptors on that node's DataNode (I think Flink's filesystem connector prefers the local DataNode). The resulting pressure on that single node easily fails the job.
Any good idea to solve this problem? Thank you very much!

This sounds like a YARN scheduling problem. Take a look at YARN's capacity scheduler, which lets you schedule containers on nodes based on their available capacity. Moreover, you can tell YARN to also consider virtual cores when scheduling. That gives you a resource dimension beyond memory alone.
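For example, a minimal sketch (assuming the Capacity Scheduler is active; the calculator class is the one documented for Hadoop 2) of making the scheduler count vcores as well as memory, in capacity-scheduler.xml:

    <!-- capacity-scheduler.xml: the default calculator looks at memory only;
         DominantResourceCalculator also takes vcores into account, so
         CPU-heavy containers spread out instead of stacking on one node -->
    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>

On the Flink side you could then request more vcores per TaskManager container (the yarn.containers.vcores option in Flink's configuration) so that a single NodeManager cannot host all of them.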

Related

How to allocate physical resources for a big data cluster?

I have three servers and I want to deploy a Spark standalone cluster or Spark on YARN on them.
Now I have some questions about how to allocate physical resources for a big data cluster. For example, I want to know whether I can deploy the Spark master process and a Spark worker process on the same node, and why.
Server Details:
CPU Cores: 24
Memory: 128GB
I need your help. Thanks.
Of course you can; just put the host with the master in the slaves file. On my test server I have exactly that configuration: the master machine is also a worker node, and there is one worker-only node. Everything is fine.
However, be aware that if that worker fails and causes a major problem (i.e. a system restart), then you will have trouble, because the master will be affected as well.
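As a sketch (hostnames are placeholders), the conf/slaves file for such a setup simply lists the master's own hostname alongside the other workers:

    # conf/slaves — Spark starts one worker per host listed here;
    # including the master's hostname runs a worker on the master machine too
    master-node
    worker-node-1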
Edit:
Some more info after the question edit :) If you are using YARN (as suggested), you can use Dynamic Resource Allocation. There are some slides about it and an article from MapR. How to configure memory properly for a given case is a very long topic, but I think those resources will teach you a lot about it.
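A minimal sketch of turning it on in spark-defaults.conf (property names are from the Spark docs; the executor bounds are arbitrary examples):

    # dynamic allocation on YARN requires the external shuffle service
    spark.dynamicAllocation.enabled       true
    spark.shuffle.service.enabled         true
    spark.dynamicAllocation.minExecutors  1
    spark.dynamicAllocation.maxExecutors  10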
BTW, if you already have a Hadoop cluster installed, maybe try YARN mode ;) But that's outside the scope of the question.

How does Apache Spark handle system failure when deployed on YARN?

Preconditions
Let's assume Apache Spark is deployed on a Hadoop cluster using YARN, and a Spark job is running. How does Spark handle the situations listed below?
Cases & Questions
1. One node of the Hadoop cluster fails due to a disk error, but replication is high enough and no data is lost.
   What will happen to the tasks that were running on that node?
2. One node of the Hadoop cluster fails due to a disk error, and replication was not high enough, so data was lost. Spark simply cannot find a file anymore that was pre-configured as a resource for the workflow.
   How will it handle this situation?
3. During execution, the primary NameNode fails over.
   Does Spark automatically use the failover NameNode?
   What happens if the standby NameNode fails as well?
4. For some reason, the cluster is totally shut down during a workflow.
   Will Spark restart with the cluster automatically?
   Will it resume from the last "save" point in the workflow?
I know some of these questions might sound odd. Anyway, I hope you can answer some or all of them.
Thanks in advance. :)
Here are the answers given on the mailing list to these questions (the answers were provided by Sandy Ryza of Cloudera):
"Spark will rerun those tasks on a different node."
"After a number of failed task attempts trying to read the block, Spark would pass up whatever error HDFS is returning and fail the job."
"Spark accesses HDFS through the normal HDFS client APIs. Under an HA configuration, these will automatically fail over to the new namenode. If no namenodes are left, the Spark job will fail."
Restarting is an administrative matter, and "Spark has support for checkpointing to HDFS, so you would be able to go back to the last time checkpoint was called that HDFS was available."
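To illustrate that last point, here is a hedged Java sketch using Spark's core RDD API (paths are placeholders): checkpointed RDDs are written out to HDFS, so a restarted job can pick up saved data instead of recomputing everything.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class CheckpointSketch {
        public static void main(String[] args) {
            JavaSparkContext sc =
                    new JavaSparkContext(new SparkConf().setAppName("checkpoint-sketch"));

            // Checkpoint data is written to HDFS, so it survives executor loss
            sc.setCheckpointDir("hdfs:///user/spark/checkpoints"); // placeholder path

            JavaRDD<String> lines = sc.textFile("hdfs:///data/input.txt"); // placeholder path
            lines.checkpoint();   // marked now, materialized on the next action
            System.out.println("lines: " + lines.count());

            sc.stop();
        }
    }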

Differences between existing MapReduce and YARN (MRv2)

Would anyone tell me what the differences are between the existing MapReduce and YARN? I cannot find a clear comparison between the two.
P.S.: I'm asking for something like a side-by-side comparison.
Thanks!
MRv1 uses the JobTracker to create and assign tasks to TaskTrackers, which can become a resource bottleneck when the cluster scales out far enough (usually around 4,000 nodes).
MRv2 (aka YARN, "Yet Another Resource Negotiator") has one ResourceManager per cluster, and each node runs a NodeManager. For each job, one container on a slave node acts as the ApplicationMaster, monitoring resources, tasks, etc.
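For example (assuming a Hadoop 2 installation), a single property in mapred-site.xml directs MapReduce jobs at YARN instead of the old JobTracker:

    <!-- mapred-site.xml: submit MapReduce jobs to YARN rather than MRv1 -->
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>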
MRv1, also called Hadoop 1, tightly couples resource management and scheduling with the MapReduce programming framework (both live in the JobTracker).
Because of this, non-batch applications cannot run on Hadoop 1.
It also has a single NameNode, so it does not provide high availability, and scalability is limited.
In MRv2 (aka Hadoop 2), resource management and scheduling are separated from MapReduce and handled by YARN (Yet Another Resource Negotiator).
The resource management and scheduling layer lies beneath the MapReduce layer.
Hadoop 2 also provides higher availability and scalability, since we can run redundant NameNodes.
It adds filesystem snapshots as well, which let us back up a filesystem and help with disaster recovery; see the commands sketched below.
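Snapshots are driven by a couple of shell commands (the path and snapshot name below are placeholders):

    # an administrator first allows snapshots on a directory...
    hdfs dfsadmin -allowSnapshot /user/data
    # ...then point-in-time snapshots can be created and browsed
    hdfs dfs -createSnapshot /user/data backup-2014-01-01
    hdfs dfs -ls /user/data/.snapshot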

Is it possible to add nodes automatically while a Hadoop application is running?

I'm a beginner programmer and Hadoop learner.
I'm testing Hadoop's fully distributed mode using 5 PCs (each with a dual-core CPU and 2 GB of RAM).
Before starting the map tasks and HDFS, I knew I had to configure the relevant files (etc/hosts with the IPs and hostnames, plus the masters and slaves files under the Hadoop conf/ directory), so I finished configuring them.
But during a seminar debate at my company, my boss and the chief insisted that even while a Hadoop application is running, if Hadoop needs more nodes, it will add them automatically.
Is that possible? When I studied Hadoop clustering, many Hadoop books and community sites insisted that once the cluster is configured and an application is running, you cannot add more nodes.
But my boss told me that Amazon says adding nodes while an application is running is possible.
Is that really true?
Hadoop masters of the Stack Overflow community, please tell me the truth in detail.
Yes, it indeed is possible.
Here is the explanation in Hadoop's wiki.
Also, Amazon's EMR enables one to add hundreds of nodes on the fly to an already running cluster, and as soon as the machines are up, the master delegates tasks (unstarted mapper and/or reducer tasks) to them.
So yes, it is very much possible, and it is in actual use, not just in theory.
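On a self-managed Hadoop 1.x cluster, the wiki's approach boils down to starting the daemons on the new machine so they register with the master (a sketch, assuming the new node already carries the same Hadoop configuration as the rest of the cluster):

    # run on the new node; the daemons announce themselves to the
    # NameNode and JobTracker named in the existing configuration
    bin/hadoop-daemon.sh start datanode
    bin/hadoop-daemon.sh start tasktracker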

When will HDFS be unavailable?

The NameNode is the single point of failure for HDFS. Is this correct?
Then what about the JobTracker? If the JobTracker fails, is HDFS still available?
HDFS is completely independent of the Jobtracker. As long as at least the NN is up, HDFS is nominally usable, with overall degradation dependent on the number of Datanodes that are down.
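You can see that degradation directly (Hadoop 1.x command):

    hadoop dfsadmin -report   # lists capacity plus live and dead Datanodes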
As Ambar mentioned, HDFS as in the file system does not depend on the JobTracker. The current released version of Hadoop does not support NameNode high availability out of the box, but you can work around it (e.g. deploy the NameNode using a traditional active/passive clustering solution with shared storage).
The next release (2.0/0.23) does fix the NameNode availability issue.
You can read more about it in Aaron Myers' blog post "High Availability for the Hadoop Distributed File System (HDFS)".
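For reference, the 2.0 HA setup in hdfs-site.xml looks roughly like this (abridged sketch; the nameservice and host names are placeholders, property names per the HDFS HA docs):

    <!-- hdfs-site.xml (abridged): two NameNodes behind one logical nameservice -->
    <property><name>dfs.nameservices</name><value>mycluster</value></property>
    <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
    <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1-host:8020</value></property>
    <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2-host:8020</value></property>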
If the JobTracker is not available, you cannot execute MapReduce jobs.
