I am trying to configure Ambari on a single node. I am at the Confirm Hosts phase. It has been showing "Installing" for more than 30 minutes. I am running Ubuntu 14.04 64-bit on VirtualBox (RAM = 4 GB). Is this normal?
It shouldn't take 30 minutes to register a single host. If you click the Installing link under the Status column, it will drill down into the log of what registration is doing. This may provide more details on what's going wrong with the registration process.
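If the UI log doesn't show anything useful, you can also check the agent side directly on the VM. A minimal sketch, assuming the default Ambari agent log location (your hostname will differ):

# The host must resolve its own fully qualified hostname for registration
hostname -f

# Check that the ambari-agent is running and watch its log while registration retries
sudo ambari-agent status
sudo tail -f /var/log/ambari-agent/ambari-agent.log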
Related
I have installed a DSX 3-node cluster on RHEL 7.4; all notebooks and R-Studio code work fine. However, model creation gives this error:
Load Data
Error: The provided kernel id was not found. Verify the input spark service credentials
All kubernetes pods seem to be up and running. Any ideas on how to fix this?
If you are on the Sept release, I suggest stopping the kernels and restarting them. There was a limit of 10 kernels in that release. You will see the active green button across notebooks/models with an option to stop them.
I have set up a Hadoop cluster with Hortonworks Data Platform 2.5. I'm using 1 master and 5 slave (worker) nodes.
Every few days one (or more) of my worker nodes gets a high load and seems to restart the whole CentOS operating system automatically. After the restart the Hadoop components don't run anymore and have to be restarted manually via the Ambari management UI.
Here is a screenshot of the "crashed" node (reboot after the high load value ~4 hours ago):
Here is a screenshot of one of the other "healthy" worker nodes (all other workers have similar values):
The node crashes alternate between the 5 worker nodes; the master node seems to run without problems.
What could cause this problem? Where are these high load values coming from?
This seems to be a kernel problem, as the log file (e.g. /var/spool/abrt/vmcore-127.0.0.1-2017-06-26-12:27:34/backtrace) says something like:
Version: 3.10.0-327.el7.x86_64
BUG: unable to handle kernel NULL pointer dereference at 00000000000001a0
After running sudo yum update I had the following kernel version:
[root@myhost ~]# uname -r
3.10.0-514.26.2.el7.x86_64
Since the operating system update the problem hasn't occurred anymore. I will keep observing the issue and give feedback if necessary.
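For anyone hitting the same backtrace, a sketch of the check-and-update sequence (assuming CentOS/RHEL 7 with abrt collecting crash dumps under the path above):

# List the kernel crash dumps collected by abrt and inspect a backtrace
ls /var/spool/abrt/
cat /var/spool/abrt/vmcore-*/backtrace

# Check the running kernel, then update and reboot into the new one
uname -r
sudo yum update
sudo reboot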
I set up a new Hadoop cluster with Hortonworks Data Platform 2.5. In the "old" cluster (installed with HDP 2.4) I was able to see the information about running Spark jobs via the History Server UI by clicking the link show incomplete applications:
Within the new installation this link opens the page, but it always says No incomplete applications found! (even when there's still an application running).
I just saw that the YARN ResourceManager UI shows two different kinds of links in the "Tracking UI" column, depending on the status of the Spark application:
application running: Application Master
this link opens http://master_url:8088/proxy/application_1480327991583_0010/
application finished: History
this link opens http://master_url:18080/history/application_1480327991583_0009/jobs/
Via the YARN RM link I can see the info about the running Spark app, but why can't I access it via the Spark History Server UI? Was something changed from HDP 2.4 to 2.5?
I solved it; it was a network problem: some of the cluster hosts (Spark slaves) couldn't reach each other due to an incorrect switch configuration. I found this out by trying to ping each host from every other host.
Since all hosts can ping each other, the problem is gone and I can see active and finished jobs in my Spark History Server UI again!
I didn't notice the problem at first, because the ambari-agents worked on each host and the ambari-server was also reachable from each cluster host. However, once ALL hosts could reach each other, the problem was solved!
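A quick way to run the same mutual reachability check is a small loop over the cluster hosts, executed on each node. A minimal sketch; the names master and node1..node5 are placeholders for your actual hosts:

#!/bin/bash
# Run on every cluster host: check that it can reach all the others
hosts="master node1 node2 node3 node4 node5"
for h in $hosts; do
    if ping -c 1 -W 2 "$h" > /dev/null 2>&1; then
        echo "$h reachable"
    else
        echo "$h NOT reachable"
    fi
done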
I am installing a new Hadoop cluster (5 nodes in total) using the Ambari dashboard. While deploying the cluster it fails with warnings about disk space issues and error messages like "'/' needs at least 2GB of disk space for mount". But I have allocated a total of 50GB of disk to each node. Upon googling for a solution I found that I need to set diskspacecheck=0 in the /etc/yum.conf file, as suggested in the link below (point 3.6):
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_ambari_troubleshooting/content/_resolving_ambari_installer_problems.html
But I am using an Ubuntu image on the nodes and there is no yum.conf file, and I didn't find any file with a "diskspacecheck" parameter. Can anybody tell me how to solve this issue and successfully deploy my cluster?
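Independent of that Ambari/yum setting, it may be worth confirming how much space is actually free on '/' on each node, since the allocated 50GB and the usable space on the root mount can differ. Standard commands, nothing Ambari-specific assumed:

# Free space on the root filesystem of this node
df -h /

# How the 50GB disk is partitioned and mounted
lsblk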
I have installed HDP Ambari with three nodes in a VM. I restarted one of the three nodes, i.e. datanode2, and after that I lost the heartbeat from that node in Ambari. I restarted the ambari-agent on all three nodes, but it is still not working. Kindly find me a solution.
Well, the provided information is not sufficient; anyway, I will try to tell you the normal approach I take to debug this.
First, check if all the ambari-agents are running, using the command ambari-agent status.
Check the logs of both ambari-agent and ambari-server. Normally the logs are available at /var/log/ambari-agent and /var/log/ambari-server. The logs should tell you the exact reason for the lost heartbeat.
The most common reasons for agent failure are connection issues between the machines, a version mismatch, or a corrupt database entry.
I think the log files should help you; a sketch of these steps is below.
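Put together, the steps above look roughly like this on the node that lost its heartbeat (a sketch assuming the default log locations mentioned above):

# 1. Check whether the agent is running
sudo ambari-agent status

# 2. Look at the agent log (and the server log on the Ambari server host)
sudo tail -n 100 /var/log/ambari-agent/ambari-agent.log
sudo tail -n 100 /var/log/ambari-server/ambari-server.log

# 3. If the agent was down or stuck, restart it
sudo ambari-agent restart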