The RAM requirements for modifying the topology are invalid in Heron - mesos

I modified the RAM requirement in the Heron example topology named WordCountTopology.java and rebuild the file using mvn assembly:assembly command. When I submitted the modified WordCountTopology to Heron cluster, I found the RAM requirement of Heron Instance did not changed.
The process of building .jar is succeed. The default RAM requirement of the WordCountTopology as following:
// configure component resources
conf.setComponentRam("word",
ByteAmount.fromMegabytes(ExampleResources.COMPONENT_RAM_MB * 2));
conf.setComponentRam("consumer",
ByteAmount.fromMegabytes(ExampleResources.COMPONENT_RAM_MB * 2));
// configure container resources
conf.setContainerDiskRequested(
ExampleResources.getContainerDisk(2 * parallelism, parallelism));
conf.setContainerRamRequested(
ExampleResources.getContainerRam(2 * parallelism, parallelism));
conf.setContainerCpuRequested(2);
In the above code. ExampleResources.COMPONENT_RAM_MB = 512mb. The default value of parallelism is 1.
The content about ExampleResources as following:
static ByteAmount getContainerDisk(int components, int containers) {
return ByteAmount.fromGigabytes(Math.max(components / containers, 1));
}
static ByteAmount getContainerRam(int components, int containers) {
final int componentsPerContainer = Math.max(components / containers, 1);
return ByteAmount.fromMegabytes(COMPONENT_RAM_MB * componentsPerContainer);
}
My changed the value of ExampleResources.COMPONENT_RAM_MB=512mb to 256mb.
However, the requirement of the topology showed in the Aurora scheduler as following:
And all instances in the aurora is FAILED:
My Questions: What should I do to effectively change the RAM requirement in the topology?And I don't know why tasks failed running in the mesos and aurora. Thanks for your help.

Do you know which version of Heron are you using? We recently cut over to a new packing algorithm called Resource Compliant Round Robin scheduling. Eventually, the resource allocation will be automatic.

Related

Apache Nutch 2.3.1, increase reducer memory

I have setup a small size cluster if Hadoop with Hbase for Nutch 2.3.1. The hadoop version is 2.7.7 and Hbase is 0.98. I have customized a hadoop job and now I have to set memory for reducer task in driver class. I have come to know, in simple hadoop MR jobs, you can use JobConf method setMemoryForReducer. But there isn't any option available in Nutch. In my case , currently, reducer memory is set to 4 GB via mapred-site.xml (Hadoop configuration). But for Nutch, I have to double it.
Is it possible without changing hadoop conf files, either via driver class or nutch-site.xml
Finally, I was able to found the solution. NutchJob does the objective. Following is the code snippet
NutchJob job = NutchJob.getInstance(getConf(), "rankDomain-update");
int reducer_mem = 8192;
String memory = "-Xmx" + (int) (reducer_mem * 0.8)+ "m";
job.getConfiguration().setInt("mapreduce.reduce.memory.mb", reducer_mem);
job.getConfiguration().set("mapreduce.reduce.java.opts", memory );
// rest of code below

Spark job just hangs with large data

I am trying to query from s3 (15 days of data). I tried querying them separately (each day) it works fine. It works fine for 14 days as well. But when I query 15 days the job keeps running forever (hangs) and the task # is not updating.
My settings :
I am using 51 node cluster r3.4x large with dynamic allocation and maximum resource turned on.
All I am doing is =
val startTime="2017-11-21T08:00:00Z"
val endTime="2017-12-05T08:00:00Z"
val start = DateUtils.getLocalTimeStamp( startTime )
val end = DateUtils.getLocalTimeStamp( endTime )
val days: Int = Days.daysBetween( start, end ).getDays
val files: Seq[String] = (0 to days)
.map( start.plusDays )
.map( d => s"$input_path${DateTimeFormat.forPattern( "yyyy/MM/dd" ).print( d )}/*/*" )
sqlSession.sparkContext.textFile( files.mkString( "," ) ).count
When I run the same with 14 days, I got 197337380 (count) and I ran the 15th day separately and got 27676788. But when I query 15 days total the job hangs
Update :
The job works fine with :
var df = sqlSession.createDataFrame(sc.emptyRDD[Row], schema)
for(n <- files ){
val tempDF = sqlSession.read.schema( schema ).json(n)
df = df(tempDF)
}
df.count
But can some one explain why it works now but not before ?
UPDATE : After setting mapreduce.input.fileinputformat.split.minsize to 256 GB it works fine now.
Dynamic allocation and maximize resource allocation are both different settings, one would be disabled when other is active. With Maximize resource allocation in EMR, 1 executor per node is launched, and it allocates all the cores and memory to that executor.
I would recommend taking a different route. You seem to have a pretty big cluster with 51 nodes, not sure if it is even required. However, follow this rule of thumb to begin with, and you will get a hang of how to tune these configurations.
Cluster memory - minimum of 2X the data you are dealing with.
Now assuming 51 nodes is what you require, try below:
r3.4x has 16 CPUs - so you can put all of them to use by leaving one for the OS and other processes.
Set your number of executors to 150 - this will allocate 3 executors per node.
Set number of cores per executor to 5 (3 executors per node)
Set your executor memory to roughly total host memory/3 = 35G
You got to control the parallelism (default partitions), set this to number of total cores you have ~ 800
Adjust shuffle partitions - make this twice of number of cores - 1600
Above configurations have been working like a charm for me. You can monitor the resource utilization on Spark UI.
Also, in your yarn config /etc/hadoop/conf/capacity-scheduler.xml file, set yarn.scheduler.capacity.resource-calculator to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator - which will allow Spark to really go full throttle with those CPUs. Restart yarn service after change.
You should be increasing the executor memory and # executors, If the data is huge try increasing the Driver memory.
My suggestion is to not use the dynamic resource allocation and let it run and see if it still hangs or not (Please note that spark job can consume entire cluster resources and make other applications starve for resources try this approach when no jobs are running). if it doesn't hang that means you should play with the resource allocation, then start hardcoding the resources and keep increasing resources so that you can find the best resource allocation you can possibly use.
Below links can help you understand the resource allocation and optimization of resources.
http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
https://community.hortonworks.com/articles/42803/spark-on-yarn-executor-resource-allocation-optimiz.html

How to make Hadoop/EMR use more containers per node

I'm in the process of moving our application from Hadoop 1.0.3 to 2.7, on EMR v5.1.0. I got it running, but I'm still having problems getting my head around the resource-allocation system in Yarn. With the default settings provided by EMR, Hadoop only allocates one container per node, even if I select a larger instance type for the nodes. This is a problem, since we'll now be using twice as many nodes to do the same amount of work.
I want to squeeze more containers into one node, and ensure that we're using all the available resources. I assume that I shouldn't touch yarn.nodemanager.resource.memory-mb or yarn.nodemanager.resource.cpu-vcores, since those are set by EMR to reflect the actual available resources. Which settings do I have to change?
Your container sizes are defined by setting the memory (default criteria for a container) and vcores. The following can be configured:
yarn-scheduler.minimum-allocation-mb
yarn-scheduler.maximum-allocation-mb
yarn-scheduler.increment-allocation-mb
yarn-scheduler.minimum-allocation-vcores
yarn-scheduler.maximum-allocation-vcores
yarn-scheduler.increment-allocation-vcores
All the following criteria must be satified (they are per container, except for yarn.nodemanager.resource.cpu-vcores and yarn.nodemanager.resource.memory-mb which are per NodeManager hence per DataNode):
1 <= yarn-scheduler.minimum-allocation-vcores <= yarn-scheduler.maximum-allocation-vcores
yarn-scheduler.maximum-allocation-vcores <= yarn.nodemanager.resource.cpu-vcores
yarn-scheduler.increment-allocation-vcores = 1
1024 <= yarn-scheduler.minimum-allocation-mb <= yarn-scheduler.maximum-allocation-mb
yarn-scheduler.maximum-allocation-mb <= yarn.nodemanager.resource.memory-mb
yarn-scheduler.increment-allocation-mb = 512
You can also see this helpful link https://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_ig_yarn_tuning.html

Get Hbase region size via API

I am trying to write a balancer tool for Hbase which could balance regions across regionServers for a table by region count and/or region size (sum of storeFile sizes). I could not find any Hbase API class which returns the regions size or related info. I have already checked a few of the classes which could be used to get other table/region info, e.g. org.apache.hadoop.hbase.client.HTable and HBaseAdmin.
I am thinking, another way this could be implemented is by using one of the Hadoop classes which returns the size of the directories in the fileSystem, for e.g. org.apache.hadoop.fs.FileSystem lists the files under a particular HDFS path.
Any suggestions ?
I use this to do managed splits of regions, but, you could leverage it to load-balance on your own. I also load-balance myself to spread the regions ( of a given table ) evenly across our nodes so that MR jobs are evenly distributed.
Perhaps the code-snippet below is useful?
final HBaseAdmin admin = new HBaseAdmin(conf);
final ClusterStatus clusterStatus = admin.getClusterStatus();
for (ServerName serverName : clusterStatus.getServers()) {
final HServerLoad serverLoad = clusterStatus.getLoad(serverName);
for (Map.Entry<byte[], HServerLoad.RegionLoad> entry : serverLoad.getRegionsLoad().entrySet()) {
final String region = Bytes.toString(entry.getKey());
final HServerLoad.RegionLoad regionLoad = entry.getValue();
long storeFileSize = regionLoad.getStorefileSizeMB();
// other useful thing in regionLoad if you like
}
}
What's wrong with the default Load Balancer?
From the Wiki:
The balancer is a periodic operation which is run on the master to redistribute regions on the cluster. It is configured via hbase.balancer.period and defaults to 300000 (5 minutes).
If you really want to do it yourself you could indeed use the Hadoop API and more specifally, the FileStatus class. This class acts as an interface to represent the client side information for a file.

Run #Scheduled task only on one WebLogic cluster node?

We are running a Spring 3.0.x web application (.war) with a nightly #Scheduled job in a clustered WebLogic 10.3.4 environment. However, as the application is deployed to each node (using the deployment wizard in the AdminServer's web console), the job is started on each node every night thus running multiple times concurrently.
How can we prevent this from happening?
I know that libraries like Quartz allow coordinating jobs inside clustered environment by means of a database lock table or I could even implement something like this myself. But since this seems to be a fairly common scenario I wonder if Spring does not already come with an option how to easily circumvent this problem without having to add new libraries to my project or putting in manual workarounds.
We are not able to upgrade to Spring 3.1 with configuration profiles, as mentioned here
Please let me know if there are any open questions. I also asked this question on the Spring Community forums. Thanks a lot for your help.
We only have one task that send a daily summary email. To avoid extra dependencies, we simply check whether the hostname of each node corresponds with a configured system property.
private boolean isTriggerNode() {
String triggerHostmame = System.getProperty("trigger.hostname");;
String hostName = InetAddress.getLocalHost().getHostName();
return hostName.equals(triggerHostmame);
}
public void execute() {
if (isTriggerNode()) {
//send email
}
}
We are implementing our own synchronization logic using a shared lock table inside the application database. This allows all cluster nodes to check if a job is already running before actually starting it itself.
Be careful, since in the solution of implementing your own synchronization logic using a shared lock table, you always have the concurrency issue where the two cluster nodes are reading/writing from the table at the same time.
Best is to perform the following steps in one db transaction:
- read the value in the shared lock table
- if no other node is having the lock, take the lock
- update the table indicating you take the lock
I solved this problem by making one of the box as master.
basically set an environment variable on one of the box like master=true.
and read it in your java code through system.getenv("master").
if its present and its true then run your code.
basic snippet
#schedule()
void process(){
boolean master=Boolean.parseBoolean(system.getenv("master"));
if(master)
{
//your logic
}
}
you can try using TimerManager (Job Scheduler in a clustered environment) from WebLogic as TaskScheduler implementation (TimerManagerTaskScheduler). It should work in a clustered environment.
Andrea
I've recently implemented a simple annotation library, dlock, to execute a scheduled task only once over multiple nodes. You can simply do something like below.
#Scheduled(cron = "59 59 8 * * *" /* Every day at 8:59:59am */)
#TryLock(name = "emailLock", owner = NODE_NAME, lockFor = TEN_MINUTE)
public void sendEmails() {
List<Email> emails = emailDAO.getEmails();
emails.forEach(email -> sendEmail(email));
}
See my blog post about using it.
You don't neeed to synchronize your job start using a DB.
On a weblogic application you can get the instanze name where the application is running:
String serverName = System.getProperty("weblogic.Name");
Simply put a condition two execute the job:
if (serverName.equals(".....")) {
execute my job;
}
If you want to bounce your job from one machine to the other, you can get the current day in the year, and if it is odd you execute on a machine, if it is even you execute the job on the other one.
This way you load a different machine every day.
We can make other machines on cluster not run the batch job by using the following cron string. It will not run till 2099.
0 0 0 1 1 ? 2099

Resources