I have installed HTCondor on my cluster of Dell Optiplex 390s, all running CentOS 8, but I am not able to run condor_status. I get the following error: Error: can't find collector
I am new to using condor and all I want to be able to do is have a master node that can manage jobs and execute them and for the rest to just execute the jobs. I have opened port 9618/tcp on all the nodes to run the daemon.
Ok, well there are two possibilities: One, the collector isn't running, and two, it is running, but condor_status can't find it.
Let's start with potential problem number one. If you run
ps auxww | grep condor_collector
on the machine that should be the central manager, is there a collector process running?
If so, that's good.
The fix for problem two is to set the condor_config variable COLLECTOR_HOST to point to this machine, e.g.
COLLECTOR_HOST = my_central_manager
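If it is unclear which of the two problems applies, a quick check from a worker node can help narrow it down. The following is a minimal sketch, not an HTCondor tool: the function name `diagnose_collector` is made up for illustration, and it only tests whether the configured host name resolves and whether something accepts TCP connections on port 9618 (the port mentioned in the question).

```python
import socket


def diagnose_collector(host: str, port: int = 9618, timeout: float = 3.0) -> str:
    """Return 'unresolvable', 'unreachable' or 'ok' for the given collector host.

    Hypothetical helper: it only checks name resolution and a plain TCP
    connect, it does not speak the HTCondor protocol.
    """
    try:
        # Step 1: can this node resolve the name used in COLLECTOR_HOST?
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        return "unresolvable"
    try:
        # Step 2: is anything listening on the collector port?
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        return "unreachable"
```

For example, `diagnose_collector("my_central_manager")` returning "unreachable" would suggest that the collector is not running or that a firewall still blocks the port, while "unresolvable" points at the host name in COLLECTOR_HOST.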
I have got an assignment. The assignment is "Write a shell script to install and configure docker swarm(one master/leader and one node) and automate the process using Jenkins." I am new to this technology and finding it difficult to proceed. Can anyone explain, step by step, how to proceed?
#Rajnish Kumar Singh, have you tried checking resources online? I understand you are very new to this technology, but googling some keywords like
what is docker swarm
what is jenkins
would definitely help.
Having said that, you basically need to follow the steps below to complete your assignment.
Pre-requisites
Two or more Ubuntu 20.04 servers
(You can use any Linux distro, such as Ubuntu or Red Hat, but the install commands will change accordingly.
Two nodes are needed here so that one can be configured as the manager and the other as the worker of the cluster.)
E.g.:
manager --- 132.92.41.4
worker --- 132.92.41.5
You can create these nodes with any public cloud provider, for example as AWS EC2 instances or GCP VMs.
Next, you need to do the following steps:
Configure Hosts
Install Docker-ce
Docker Swarm Initialization
You can refer to this article for more info: https://www.howtoforge.com/tutorial/ubuntu-docker-swarm-cluster/
This completes the first part of your assignment.
Next, you can create a small shell script and include all of those install and configuration commands in it. A shell script is basically a collection of Linux commands. Instead of running each command separately, you run the script once and the whole setup is done for you.
You can create the script file using the touch command
touch docker-swarm-install.sh
Give the script execute permission
chmod +x docker-swarm-install.sh
Next, add to the script all the install and configure commands that you used earlier to set up docker swarm (you can refer to the link shared above).
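As a rough idea of what the swarm-specific lines in that script look like, here is a small Python sketch that only builds and prints them. The helper names are hypothetical, the IP address is the example manager address from above, and 2377 is Docker's default swarm management port. The Docker installation commands are deliberately left out, since they depend on your distro (see the linked article).

```python
import shlex
from typing import List

MANAGER_IP = "132.92.41.4"  # example manager address from the text above


def swarm_init_command(manager_ip: str) -> List[str]:
    """Command to run on the manager node to create the swarm."""
    return ["docker", "swarm", "init", "--advertise-addr", manager_ip]


def worker_token_command() -> List[str]:
    """Command to run on the manager to print the worker join token."""
    return ["docker", "swarm", "join-token", "-q", "worker"]


def swarm_join_command(token: str, manager_ip: str, port: int = 2377) -> List[str]:
    """Command to run on the worker node to join the swarm."""
    return ["docker", "swarm", "join", "--token", token, f"{manager_ip}:{port}"]


def as_script_line(cmd: List[str]) -> str:
    """Render a command as one safely quoted shell line."""
    return shlex.join(cmd)
```

Calling `print(as_script_line(swarm_init_command(MANAGER_IP)))` gives the line to place in the manager part of docker-swarm-install.sh. The token passed to `swarm_join_command` has to come from the output of the join-token command on the manager.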
Once your script is ready, you can configure it in a Jenkins job. Whenever the Jenkins job runs, the script will be executed and the docker swarm cluster will be created.
You need a Jenkins server. Jenkins is open source software, and you can install it on any public cloud instance (e.g. AWS EC2).
Reference : https://devopsarticle.com/how-to-install-jenkins-on-aws-ec2-ubuntu-20-04/
Next, once the installation is complete, you need to configure a job in Jenkins.
Reference : https://www.toolsqa.com/jenkins/jenkins-build-jobs/
Add your 'docker-swarm-install.sh' as a build step in the job you created.
Reference : https://faun.pub/jenkins-jobs-hands-on-for-the-different-use-cases-devops-b153efb483c7
If the whole setup is successful, your docker swarm cluster should be created when you run your Jenkins job.
I just installed Jenkins 2.46.2 on a Windows 2012 Server \o/. It runs as a system service.
I created a job that executes a Windows batch (.bat) script to build a code project. This batch file runs 2 mingw32-make.exe commands to clean and then build a full binary from source code.
Executing the batch manually on the machine, located on the same filesystem (same workspace as used by the Jenkins' job, local disk - not network disk), the clean-build takes ~50 seconds.
But when executed by Jenkins, the job takes more than 20 times longer (~19 minutes)! It terminates successfully, with the same behavior as when executed manually in cmd.exe.
I changed the launch arguments for the jvm in the jenkins.xml file with "-Xmx1024m -XX:MaxPermSize=512m" options as I have read in the documentation to improve performance. But it does not fix anything :-(
Also, when I monitor CPU/disk/RAM usage, they all stay very low while building, so I deduce that the raw performance of the machine is not the cause.
Whether or not I invoke the batch file with a call statement in the Jenkins job build step does not change anything: the job always lasts 19 minutes.
Can anybody help me investigate why it is so slow?
Thanks in advance :)
I had a similar problem. I noticed that .bat files with echo Hello World ran fast and with no problem.
But once I tried to launch any grep.exe from a batch script, it took 24 seconds (in my case) to run even with no input files. If launched manually it finishes in no time.
I used grep.exe version 2.5.4 from MSys 1.0 distribution.
The solution in my case was rather unexpected: I updated grep to version 2.24, and now, when launched from Jenkins, it takes less than one second to process a log file of over 1 MB.
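To find out which command in a build is the slow one, it can help to time individual commands once from a manual shell and once from a Jenkins build step, and compare the numbers. Below is a minimal sketch of such a timing helper. The names `time_command` and `BASELINE` are invented for this example, and the baseline simply starts the Python interpreter and exits, playing the same role as the echo Hello World test above.

```python
import subprocess
import sys
import time

# Trivial command used as a baseline: start the interpreter and exit.
BASELINE = [sys.executable, "-c", "pass"]


def time_command(cmd):
    """Run cmd and return (exit_code, elapsed_seconds).

    stdin is redirected to the null device so that tools which would
    otherwise wait for input (e.g. grep without files) return at once.
    """
    start = time.perf_counter()
    completed = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode, time.perf_counter() - start
```

Running `time_command(BASELINE)` and then, say, `time_command(["grep.exe", "--version"])` in both environments should show whether a particular executable is responsible for the delay.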
After a couple of days of investigation, I finally found the cause.
In my case, the cause was the Jenkins agent.
When I installed the Jenkins agent as a Windows service on the slave machine, the build took a huge amount of time, but when I started the Jenkins agent from the Windows command line, the build took about as long as executing the batch file manually.
My env:
master: CentOS7
slave agent: win 7
I also tested this case on a Windows 10 slave agent for comparison.
There, the execution time via Jenkins was approximately the same as executing the batch file manually on the agent machine.
So I guess this is a compatibility issue between Windows 7 and Jenkins.
But since Jenkins officially no longer supports Windows 7 (Microsoft does not support Windows 7 either), we have put the issue aside for now.
Anyway, we found a way around this. I hope it helps in similar scenarios.
Bamboo JMeter task: should there be a time gap before starting the JMeter master/slave? We have created Bamboo tasks (SSH task 1 with the first slave host, SSH task 2 with the second slave host, and SSH task 3 with the master host, each running commands). On the first run, the task fails with an error saying the remote engine could not be configured, even though we are able to telnet to the hosts and jmeter-server is already started.
However, when we disable SSH task 1 and task 2 for the second run, it runs successfully and produces results.
Should the JMeter master start one or two minutes after the server has started? Please suggest.
I was able to overcome the problem by adding a sleep of 120 seconds after starting jmeter-server and before starting the JMeter master.
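A possible refinement of the fixed sleep, which is not what was used above, is to poll the slave's port until it accepts connections and give up after the same 120 seconds. This is a sketch under assumptions: the function name `wait_for_port` is made up, and the port has to match your jmeter-server configuration (1099 is assumed here as the usual default RMI port). An open port does not guarantee that the remote engine is fully initialised, so a short extra delay may still be needed.

```python
import socket
import time


def wait_for_port(host: str, port: int = 1099,
                  timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll host:port until a TCP connect succeeds or timeout expires.

    Returns True as soon as the port accepts a connection, False if the
    deadline passes first. The 1099 default is an assumption.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

The master task would then call `wait_for_port` once per slave host and start the test only if every call returned True.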
I'm trying to reset-and-launch a Windows VM (in vsphere) during a Jenkins job. I successfully installed the vSphere Cloud Plugin. I've followed instructions to set up the Windows machine as a jenkins-mvn-slave, and have it set up to run as a service.
If I click on the button in Jenkins for Launch Slave Agent, I can see (in vsphere) that the VM does a revert snapshot, and then it does a power on virtual machine. If I attach to the machine, I can see that the Jenkins service starts automatically. However, back in Jenkins, it tells me that the Slave did not come online in allowed time.
Some key settings for my slave:
Force VM launch: Checked
Wait for VMTools: Not checked
Delay between launch and boot complete: 120
Secondary launch method: Launch slave agents via Java Web Start
Versions:
Jenkins: 1.596.2
vSphere: 5.5.0
Windows: Server 2012 R2 Standard, Build 9600
vSphere plugin: 2.7
What am I missing?
I've done a lot of messing around since I posted, but I think the following is what I was doing wrong. I first got the VM working as a normal slave agent. Once I had that working, I tried to set up the same machine as a vsphere-cloud-slave-agent. I didn't realize that setting up a host as a slave agent is "agent-name specific".
So, I uninstalled the Jenkins service, launched the "vsphere cloud slave agent", logged into the machine, and ran javaws (as specified in the previously mentioned instructions).
A couple of other gotchas that I encountered (not relevant to the initial post, but maybe relevant to someone who reads this):
I originally installed git with a password manager. Unfortunately, since jenkins jobs aren't interactive, it was hanging on the git clone command. I tried uninstalling and re-installing git, but it didn't fix the problem for whatever user the jenkins slave was running as. I ended up having to revert to a previous slave image and install git from there. (I probably could have also figured out what user was running the jenkins slave, and entered the desired password there.)
I wanted to run a clean VM for each job. I never figured out this one. If I set Availability to Take this slave on-line when in demand and off-line when idle, that was a good start. However, if I set the times to 0 and 0, then the machine was constantly rebooting. If I set the times to 1 and 1, then the machine does mostly what I want, unless there are back-to-back jobs queued to run.
I am trying to get Jenkins to start a virtual machine on a Jenkins slave. The VM itself will then act as a Jenkins slave.
In order to do so I need to boot the VM and keep it running, even after the Jenkins job terminates. I have tried to create a freestyle project which runs a batch script on the slave and checks if the VM is running:
"C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe" -T ws start "D:\VM\MyVM.vmx"
"C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe" list
The second command shows me that the VM is actually up and running, but apparently it directly shuts down again since I can't see the node that corresponds to the VM as online.
The Jenkins Slave agent is installed as a Windows service on the VM's host and logs in as a domain user.
If I switch the first command to
"C:\Program Files (x86)\VMware\VMware Workstation\vmware.exe" -x "D:\VM\MyVM.vmx"
the VM powers on and the node gets connected to Jenkins. This is because the batch script somehow gets stuck after this command and does not terminate, so the VM remains powered on. However, if I log on to the host with the same user the Jenkins service uses, I cannot see the VM running.
Ironically, I can in fact power OFF any virtual machine that I have started locally on the host from Jenkins by creating a project with the batch command
"C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe" -T ws stop "D:\VM\MyVM.vmx" soft
So, to summarize:
I want to create a Jenkins job that powers on a VM so I can use it as a slave agent. The VM has to remain powered on even after the job is done, I will shut it down with a different job as needed.
But only the shutdown job is working as intended.
Try starting your VM with the START command (the empty "" is the window title, which START expects as its first quoted argument):
START "" "C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe" -T ws start "D:\VM\MyVM.vmx"
After playing around with VMs and Jenkins today I learned that vmrun works perfectly if the Jenkins slave does not run as a Windows service but is launched via the Java Webstart application.
Besides, one can prevent processes from getting killed by altering the BUILD_ID environment variable, since Jenkins uses this variable to track the processes the build launched. By changing the value of BUILD_ID before spawning processes, they won't get killed after the job finishes.
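The BUILD_ID trick can be sketched as follows for a build step that launches the process from Python. This is only an illustration: the function names are hypothetical, and "dontKillMe" is just a conventional placeholder; the point is that the value differs from the one Jenkins assigned to the build.

```python
import os
import subprocess


def env_with_build_id(build_id: str = "dontKillMe") -> dict:
    """Copy the current environment, overriding BUILD_ID for child processes."""
    return dict(os.environ, BUILD_ID=build_id)


def spawn_untracked(cmd, build_id: str = "dontKillMe") -> subprocess.Popen:
    """Start cmd with a changed BUILD_ID so that Jenkins' process tracking,
    as described above, does not match it to the build that launched it."""
    return subprocess.Popen(
        cmd,
        env=env_with_build_id(build_id),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

With the paths from the question, the call would look like `spawn_untracked([r"C:\Program Files (x86)\VMware\VMware Workstation\vmrun.exe", "-T", "ws", "start", r"D:\VM\MyVM.vmx"])`.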