I'm trying to set up a local Hadoop cluster using Docker and Ambari. The problem I'm having is that the Ambari install check reports that NTP is not running, and NTP is needed to know whether the services installed with Ambari are working. I checked ntpd in the containers and tried to start it, but it failed:
[root@97ea7075ca78 ~]# service ntpd start
Starting ntpd: [ OK ]
[root@97ea7075ca78 ~]# service ntpd status
ntpd dead but pid file exists
Is there a way to start ntp daemon in those containers?
In Docker you don't use the service command, as there is no init system. Just run the ntpd command directly and it should work.
ntpd goes to the background by default. If that were not the case, you would need to run ntpd &
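Since the question shows "ntpd dead but pid file exists", here is a minimal sketch of starting ntpd directly while clearing a stale pid file first. The pid file path and the function names pid_alive and start_ntpd are my own assumptions, not something from the question; -g and -p are standard ntpd options.

```shell
#!/bin/sh
# Sketch: start ntpd directly in a container that has no init system.
# Assumption: the pid file path below matches your image; adjust if it differs.
PIDFILE="${PIDFILE:-/var/run/ntpd.pid}"

# Succeed only if the pid file exists and names a live process.
pid_alive() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Remove a stale pid file, then launch ntpd (it backgrounds itself).
start_ntpd() {
    if pid_alive "$PIDFILE"; then
        echo "ntpd is already running"
        return 0
    fi
    rm -f "$PIDFILE"
    ntpd -g -p "$PIDFILE"   # -g: allow one large initial time step
}
```

Call start_ntpd as root inside the container. If ntpd still exits right away, one thing worth checking is whether the container was started with the SYS_TIME capability (docker run --cap-add SYS_TIME), since by default a container is not allowed to adjust the clock.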
I have an ICP installation on some bare metal to educate myself with, so I don't need to keep it running all the time. What is the proper way to shut it down while I am not using it? I have two physical nodes: master and worker. Currently I just ssh into each and issue a sudo shutdown now command.
When I bring the cluster back online later, I can't get to the admin UI. It responds with a 502 Bad Gateway error. When I load https://master:9443 I get the Welcome to Liberty page (indicating that at least the web server is running).
If you stop docker containers or the docker runtime, then the kubelet will attempt to restart them.
If you want to shut down the system, you must stop the kubelet on each node first, and then Docker. On Ubuntu, you would use systemctl:
sudo systemctl stop kubelet
sudo systemctl stop docker
Confirm that all processes are shut down:
top
And that all related network ports are no longer in use:
netstat -antp
(Note that netstat's "-p" option requires root privileges to inspect the pid holding onto the port).
To restart the cluster, start Docker and then the kubelet. Again for Ubuntu, with systemctl:
sudo systemctl start docker
sudo systemctl start kubelet
And of course you can follow the logs for the kubelet:
sudo journalctl -e -u kubelet
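The ordering described in this answer (kubelet before Docker when stopping, Docker before kubelet when starting) can be wrapped in a small helper so it is the same on every node. This is only a sketch: the function names ordered_units and cluster_ctl are made up for illustration, and it assumes both services are systemd units named docker and kubelet.

```shell
#!/bin/sh
# Units in start order: Docker first, then the kubelet.
UNITS="docker kubelet"

# Print the units in the order they should be handled for an action.
ordered_units() {
    case "$1" in
        start)
            echo "$UNITS"
            ;;
        stop)
            # Reverse the start order so the kubelet stops before Docker.
            rev=""
            for u in $UNITS; do rev="$u${rev:+ $rev}"; done
            echo "$rev"
            ;;
        *)
            return 1
            ;;
    esac
}

# Run "systemctl start|stop" on each unit in the right order.
cluster_ctl() {
    units=$(ordered_units "$1") || return 1
    for unit in $units; do
        sudo systemctl "$1" "$unit" || return 1
    done
}
```

Run cluster_ctl stop on each node before powering it off, and cluster_ctl start after it boots.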
Stop Docker to shut it down. I hope this helps.
systemctl stop docker
I am using Apache Hadoop 2.7.1 on CentOS 7.
My cluster is an HA cluster and I am using a ZooKeeper quorum for automatic failover.
I want to automate the ZooKeeper start process, and of course the shell script has to stop the firewall first so that the other quorum members can contact the current ZooKeeper node.
I have written the following script in /etc/rc.d/rc.local:
hostname jn1
systemctl stop firewalld
ZOOKEEPER='/usr/local/zookeeper-3.4.9/'
source /etc/rc.d/init.d/functions
source $ZOOKEEPER/bin/zkEnv.sh
daemon --user root $ZOOKEEPER/bin/zkServer.sh start
The problem is that when I run
systemctl stop firewalld
in rc.local and then run zkServer status after the host boots, I get this error:
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.9/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
But if I run the same commands without a script, that is, manually after the host has booted normally,
systemctl status firewalld
zkServer start
there is no problem and zkServer status shows its mode.
I noticed a difference in the zookeeper.out log between running the rc.local script and running the commands manually after boot: the server environment is only read when the commands are run manually.
What effect could stopping the firewall in the rc.local script have on the server environment, and how do I handle it?
I had a big headache with the firewall stop and restart scenarios, and I discovered that stopping the firewall in rc.local does not really stop it.
Since I don't want the firewall to run at all, I ended up with the following solution:
systemctl disable firewalld
https://www.rootusers.com/how-to-disable-the-firewall-in-centos-7-linux/
so the firewall is not going to start again at any boot.
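For completeness, a small sketch that stops firewalld for the current boot, disables it for later boots, and then checks both states. The function name disable_firewalld is just for illustration; is-active and is-enabled are regular systemctl subcommands that exit non-zero when the unit is inactive or disabled.

```shell
#!/bin/sh
# Stop firewalld now and keep it from starting at boot.
# Returns 0 only if the unit is afterwards both inactive and disabled.
disable_firewalld() {
    systemctl stop firewalld
    systemctl disable firewalld
    ! systemctl is-active --quiet firewalld &&
        ! systemctl is-enabled --quiet firewalld
}
```

Run it once as root (not from rc.local); after that, rc.local only needs the ZooKeeper start lines.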
I have followed the steps at https://coreos.com/kubernetes/docs/latest/kubernetes-on-vagrant.html to launch a multi-node Kubernetes cluster using Vagrant and CoreOS.
But I could not find a way to set an insecure Docker registry for that environment.
To be more specific, when I run
kubectl run api4docker --image=myhost:5000/api4docker:latest --replicas=2 --port=8080
on this setup, it tries to pull the image as if it came from a secure registry, but it is an insecure one.
I appreciate any suggestions.
This is how I solved the issue for now. I will add to this later if I can automate it in the Vagrantfile.
cd ./coreos-kubernetes/multi-node/vagrant
vagrant ssh w1 (and repeat these steps for w2, w3, etc.)
cd /etc/systemd/system/docker.service.d
sudo vi 50-insecure-registry.conf
Add the lines below to this file:
[Service]
Environment=DOCKER_OPTS='--insecure-registry="<your-registry-host>/24"'
After adding this file, we need to restart the Docker service on this worker.
sudo systemctl stop docker
sudo systemctl daemon-reload
sudo systemctl start docker
sudo systemctl status docker
Now, docker pull should work on this worker.
docker pull <your-registry-host>:5000/api4docker
Let's try to deploy our application on Kubernetes cluster one more time.
Logout from the workers and come back to your host.
$ kubectl run api4docker --image=<your-registry-host>:5000/api4docker:latest --replicas=2 --port=8080 --env="SPRING_PROFILES_ACTIVE=production"
When you get the pods, you should see the status Running.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api4docker-2839975483-9muv5 1/1 Running 0 8s
api4docker-2839975483-lbiny 1/1 Running 0 8s
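Since the same file has to be created on every worker, the manual vi step can be replaced by a small function. This is a sketch only: the function name write_insecure_registry_dropin and its second argument (the target directory, defaulting to the path used above) are my own additions, and the registry value is whatever you would have typed into the file by hand.

```shell
#!/bin/sh
# Write the Docker drop-in that marks a registry as insecure.
#   $1: registry value to put into --insecure-registry (placeholder, yours will differ)
#   $2: drop-in directory (optional, defaults to the path from the steps above)
write_insecure_registry_dropin() {
    registry="$1"
    dir="${2:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/50-insecure-registry.conf" <<EOF
[Service]
Environment=DOCKER_OPTS='--insecure-registry="$registry"'
EOF
}
```

Run it as root on each worker, then reload systemd and restart Docker with the same systemctl commands as in the manual steps.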
I have a setup with Mesos and Aurora, and I have dockerized the application I need to deploy. Now I have to start the Mesos slave with Docker support, but I'm not able to. I'm trying the following:
sudo service mesos-slave --containerizers=docker,mesos start
this gives me
mesos-slave: unrecognized service
but if I try :
sudo service mesos-slave start
the slave gets activated.
Can anyone let me know how to solve this issue?
You should also inform people about what OS you're using, otherwise it's mostly guesswork.
Normally, your /etc/mesos-slave/containerizers should contain the following to enable Docker support:
docker,mesos
Then, you'd have to restart the service:
sudo service mesos-slave restart
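As a sketch, writing that file can be scripted like this. The function name enable_docker_containerizer is made up for illustration, and the default path assumes the Mesosphere packages, which read one option per file under /etc/mesos-slave.

```shell
#!/bin/sh
# Write the containerizers setting read by the mesos-slave service.
#   $1: path of the containerizers file (optional)
enable_docker_containerizer() {
    conf="${1:-/etc/mesos-slave/containerizers}"
    echo 'docker,mesos' > "$conf"
}
```

Run it as root, then restart the mesos-slave service so the new setting is picked up.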
References:
https://open.mesosphere.com/getting-started/install/#slave-setup
https://mesosphere.github.io/marathon/docs/native-docker.html
https://open.mesosphere.com/advanced-course/deploying-a-web-app-using-docker/
I stopped ntpd and restarted it, then ran ntpdate pool.ntp.org. The error went away once and the hosts were healthy, but after some time I got the clock offset error again.
I also observed that after running ntpdate, the Cloudera web interface stopped working. It says "potential mismatch configuration, fix and restart hue".
I have the Cloudera QuickStart VM with CentOS set up on VMware.
Check that the /etc/ntp.conf file is the same across all nodes/masters
Restart ntp
Add the daemon with chkconfig and set it to on
You can fix it by restarting the NTP service, which synchronizes the time with a central source.
You can do this by logging in as root from the command line and running service ntpd restart.
After about a minute the error in CM should go away.
Host Terminal
sudo su
service ntpd restart
A clock offset error occurs in Cloudera Manager if the host's NTP service could not be located or did not respond to a request for the clock offset.
Solution:
1) Identify the NTP server IP for your Hadoop cluster
2) On your Hadoop cluster nodes, edit /etc/ntp.conf
3) Add entries to ntp.conf
server [NTP Server IP]
server xxx.xx.xx.x
4) Restart the service. Execute
service ntpd restart
5) Restart the cluster from Cloudera Manager
Note: If the problem still persists, reboot your Hadoop nodes and check the process.
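Step 3 can be scripted so that the same entry ends up on every node without duplicates. This is a rough sketch: the function name add_ntp_server is made up, and it only detects an existing line that matches "server <address>" exactly (a line with extra options such as iburst would not count as a match).

```shell
#!/bin/sh
# Append a "server" entry to ntp.conf unless the exact line is already there.
#   $1: NTP server address (use your own)
#   $2: config file (optional, defaults to /etc/ntp.conf)
add_ntp_server() {
    server="$1"
    conf="${2:-/etc/ntp.conf}"
    grep -qxF "server $server" "$conf" 2>/dev/null ||
        echo "server $server" >> "$conf"
}
```

Run it as root on each node with your NTP server address, then continue with step 4.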
Check /etc/ntp.conf and make sure the configuration file is the same as on the other nodes:
$ cat /etc/ntp.conf
$ systemctl restart ntpd
$ ntpdc -np
$ ntpdate -u 0.centos.pool.ntp.org
$ hwclock --systohc
$ systemctl restart cloudera-scm-agent
After that, wait a few seconds to let it auto-configure.
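To see whether the offset has actually settled before going back to Cloudera Manager, you can read it from the peer list. The sketch below uses ntpq -pn rather than the ntpdc call above (that swap is my assumption), and the function name selected_offset is made up. In ntpq's peer table, the peer ntpd is currently synced to is marked with a leading '*' and the offset in milliseconds is the ninth column.

```shell
#!/bin/sh
# Read "ntpq -pn" output on stdin and print the offset (ms) of the
# currently selected peer. Exits non-zero if no peer is selected yet.
selected_offset() {
    awk '/^\*/ { print $9; found = 1 } END { exit !found }'
}
```

Usage: ntpq -pn | selected_offset. If it prints nothing and returns non-zero, ntpd has not selected a peer yet, so wait a bit and try again.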