I'm using several matrix-jobs which usually contain the following steps:
Build, Install, Test
The Build step is set as a touchstone build; the other steps use the binaries created by Build.
I recently added another node to my system which should build these matrix jobs too. My problem now is that Jenkins distributes the steps of my job across these nodes.
Example:
1. Slave A runs the `Build` step and succeeds.
2. Slave B runs the `Install` step and fails, because it depends on the `Build` results, which are not on that node.
3. Slave A runs the `Test` step and succeeds, because the dependencies exist there.
The execution of the matrix job fails because its steps are distributed across nodes.
My question is whether there is any way to bind the execution of a matrix job to a single node. It's no problem if different executions run on different nodes, but all the steps of one particular execution should run on the same node.
Binding the matrix job to just one node is not a solution; it should still be bound to a group of nodes.
Since you have these steps as individual jobs, in your "label" axis:
make sure you choose individual nodes, not labels, for each of your steps.
This will make sure that each of your steps runs on the same individual slave, and therefore each step will have its predecessor's workspace.
See:
http://imagebin.org/163627
==========================================================================
Based on comments:
At this point, you have two options:
1. Use the Copy Artifact Plugin: https://wiki.jenkins-ci.org/display/JENKINS/Copy+Artifact+Plugin. You can archive everything required as an artifact of your "Build" step, and have your "Install" step copy it over using the plugin. Do the same for "Install" => "Test".
2. Combine your steps into a single job (a sketch follows below), since there is no guarantee that the same node will be "least used" for each step if they are separate jobs. The only way to force all jobs to use the same node is to select an individual node rather than a label.
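If you go the single-job route, the phases simply become sequential build steps of one job, so they can never be split across nodes. A rough sketch of what a single "Execute shell" build step might look like, assuming a make-based project (the targets and paths are placeholders for whatever your real Build/Install/Test commands are):

```bash
#!/bin/bash
set -e  # abort the whole job as soon as any phase fails

# Phase 1: Build
make clean all

# Phase 2: Install into a workspace-local prefix so the tests can find the binaries
make install DESTDIR="$WORKSPACE/install"

# Phase 3: Test against the binaries produced above, on the same node
PATH="$WORKSPACE/install/usr/bin:$PATH" make test
```

Because everything happens inside one build on one executor, the Build output is guaranteed to be present for Install and Test.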
Hope that helps...
Related
I have a fully-distributed Hadoop cluster with 4 nodes. When I submit my job to the JobTracker, which decides that 12 map tasks are appropriate for my job, something strange happens: the 12 map tasks always run on a single node instead of being spread across the entire cluster. Before asking this question, I had already done the following:
Try a different job
Run start-balancer.sh to rebalance the cluster
But it did not work, so I hope someone can tell me why and how to fix it.
If all the blocks of the input data files are on that node, the scheduler will prioritize that same node.
Apparently the source data files are on one data node right now. It can't be the balancer's fault. From what I can see, your HDFS must have a replication factor of one, or you are not really running a fully-distributed Hadoop cluster.
Check how your input is being split. You may only have one input split, meaning that only one node will be used to process the data. You can test this by adding more input files to your system and placing them on different nodes, then checking which nodes are doing the work.
If that doesn't work, check to make sure that your cluster is configured correctly. Specifically, check that your name node has paths to your other nodes set in its slaves file, and that each slave node has your name node set in its masters file.
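A quick way to see both how many blocks the input consists of and where those blocks live is fsck (the input path below is just a placeholder):

```bash
# List the input files and their sizes
hadoop fs -ls /user/me/input

# Show every block of the input and the datanodes holding its replicas;
# if all blocks report the same host, the scheduler will favour that node
hadoop fsck /user/me/input -files -blocks -locations
```

If the -locations output shows a single datanode everywhere, the data placement or replication factor is the problem rather than the scheduler.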
I have bound TerracottaJobStore to the Quartz Scheduler.
How does TerracottaJobStore determine which job runs next, and on which node it executes?
Which algorithm does TerracottaJobStore use for node selection? Any ideas?
If the Quartz Scheduler is used with TerracottaJobStore and there is a job waiting to execute next, the selection of the node for that job will be random.
Using 'Quartz Where' it is possible to run a job based on criteria.
That means if you want a job that must run on a node with at least 2 cores, or
a job that must run on a node meeting a CPU load-average criterion (e.g. 70%), or
a job that must run on a node with at least 330 MB of free Java heap memory,
then in such cases 'Quartz Where' is useful.
It is only predictable on which node a job will execute in the case of 'Quartz Where'.
With open-source Terracotta's JobStore you don't get to decide which node the job will be executed on. It doesn't really happen randomly either; rather, the scheduler behaves as in non-clustered mode. Basically, every node will, at a regular interval and based on when the next trigger is due to fire, try to acquire the next trigger(s). Since all the nodes in the cluster behave the same way, the first to acquire the lock will also be the first to acquire triggers.
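For reference, clustering with the open-source TerracottaJobStore is just a matter of pointing every node's quartz.properties at the same Terracotta server; a minimal sketch (the host and port are placeholders for your Terracotta server):

```bash
# Run this on each scheduler node; all nodes must reference the same Terracotta
# server so that they share a single job store and compete for the same triggers.
cat > quartz.properties <<'EOF'
org.quartz.scheduler.instanceName = MyClusteredScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.jobStore.class = org.terracotta.quartz.TerracottaJobStore
org.quartz.jobStore.tcConfigUrl = terracotta-host:9510
EOF
```

With this in place, whichever node grabs the lock first fires the trigger, exactly as described above.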
Terracotta EE comes with the Quartz Where feature that lets you describe where jobs should be fired. You can learn more about Quartz Where by watching this short screencast I did: http://www.codespot.net/blog/2011/03/quartz-where-screencast/
Hope that'll help.
In the figure below, I want each level of jobs to run in parallel (as many as can run simultaneously on the available executors), and if one arbitrary job fails, then after fixing the problem I want things to run normally again (as if the job had not failed). That is, if the failed job is rebuilt successfully after the fix, I want the jobs at the lower levels to start automatically.
I have found that the Build Flow Plugin cannot do this. I hope someone has some brilliant ideas to share.
Thanks for your time.
For further clarification:
All the jobs at level x must succeed before any job at level x+1 starts. If some job at level x fails, I do not want any job at level x+1 to start. After fixing the problem, re-run the failed job, and if it succeeds (and all the other jobs at level x have also succeeded), then I want level x+1 to start building.
Referencing your diagram, I'll restate the requirements of your question (to make sure I understand it).
At Level 1, you want all of the jobs to run in parallel (if possible)
At Level 2, you want all of the jobs to run in parallel
At Level 3, you want all of the jobs to run in parallel
Any successful build of a Level 1 job should cause all Level 2 jobs to build
Any successful build of a Level 2 job should cause all Level 3 jobs to build
My answer should work if you do not require "Any failure at Level 1 will prevent all Level 2 jobs from running."
I don't believe this will require any other plugins. It just uses what is built into Jenkins.
For each Level 1 job, configure a Post Build action of "Build other projects"
The projects to build should be all of your Level 2 jobs. (The list should be comma-separated.)
Check "Trigger only if build succeeds"
For each Level 2 job, configure a Post Build action of "Build other projects"
The projects to build should be all of your Level 3 jobs.
I run some batch jobs with data inputs that are constantly changing, and I'm having problems provisioning capacity. I am using Whirr to do the initial setup, but once I start, for example, 5 machines, I don't know how to add new machines while the cluster is running. I don't know in advance how complex or how large the data will be, so I was wondering if there is a way to add new machines to a cluster and have it take effect right away (or with some delay, but without having to bring down the cluster and bring it back up with the new nodes).
There is an exact explanation of how to add a node here:
http://wiki.apache.org/hadoop/FAQ#I_have_a_new_node_I_want_to_add_to_a_running_Hadoop_cluster.3B_how_do_I_start_services_on_just_one_node.3F
At the same time, I am not sure that already-running jobs will take advantage of these nodes, since the planning of where to run each task happens at job start time (as far as I understand).
I also think that it is more practical to run only TaskTrackers on these transient nodes.
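For reference, the FAQ entry linked above boils down to starting the daemons on the new machine itself, roughly like this (assuming the stock Hadoop 0.20/0.21 layout; the hostname is a placeholder):

```bash
# On the freshly provisioned node: start the HDFS and MapReduce daemons.
# They register themselves with the NameNode/JobTracker configured in the
# node's own conf/core-site.xml and conf/mapred-site.xml.
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker

# Optional: add the host to conf/slaves on the master so that future
# start-all.sh / stop-all.sh invocations include the new node.
echo "new-node.example.com" >> $HADOOP_HOME/conf/slaves
```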
Check the files referred to by the parameters below:
dfs.hosts => dfs.include
dfs.hosts.exclude
mapreduce.jobtracker.hosts.filename => mapred.include
mapreduce.jobtracker.hosts.exclude.filename
You can add the list of hosts to the files dfs.include and mapred.include and then run
hadoop mradmin -refreshNodes ;
hadoop dfsadmin -refreshNodes ;
That's all.
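Concretely, the include-file workflow looks something like this (the file paths and the hostname are placeholders for whatever your dfs.hosts / mapreduce.jobtracker.hosts.filename properties point to):

```bash
# Add the new host to both include files...
echo "new-node.example.com" >> /etc/hadoop/conf/dfs.include
echo "new-node.example.com" >> /etc/hadoop/conf/mapred.include

# ...and tell the NameNode and JobTracker to re-read them without a restart.
hadoop dfsadmin -refreshNodes
hadoop mradmin -refreshNodes
```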
BTW, the 'mradmin -refreshNodes' facility was added in 0.21.
I have separated a big Hudson job into smaller jobs. Job A does the main build and job B does another build with a different configuration. I have configured Hudson so that A triggers B, and that works fine; the problem is that job A has the original build number while B has started counting from 1.
My question is: Is it possible to pass the BUILD_NUMBER environment variable somehow from Job A to Job B? The build number is used in the build artifact names, hence it would be nice to have the numbers match between artifacts.
Thanks.
Use the Parameterized Trigger Plugin, which will allow you to pass the build number from A to B. You will not be able to actually set the build number in job B, but you will have the build number from A available to generate your version number.
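As a sketch: in job A's trigger configuration, add a "Predefined parameters" entry such as `UPSTREAM_BUILD_NUMBER=$BUILD_NUMBER` (the parameter name is arbitrary and assumed here), declare a matching string parameter in job B, and then use it wherever the artifact name is assembled, e.g. in a shell build step:

```bash
# In job B: stamp the artifact with job A's build number instead of B's own.
# UPSTREAM_BUILD_NUMBER is the parameter passed in by the Parameterized Trigger.
tar czf "myapp-${UPSTREAM_BUILD_NUMBER}.tar.gz" dist/
```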
If you want to synchronize the build numbers, you can edit the file nextBuildNumber in job B's job directory to match the number from job A. Be aware that the numbers will drift apart over time, since B will not be started when A fails.
EDIT: I just stumbled across the Next Build Number Plugin. Have a look and see if it helps you.