spark-cassandra thrift server on ec2 throws SparkException on query from beeline - amazon-ec2

I installed cassandra spark-hadoop cluster on 3 ec2 nodes. Yesterday, I was able to start the spark thrift server on node0, and actually executed a simple sql statement in beeline. Today, after a schema change, I restarted the thrift server, now I get a
SparkException java.lang.IllegalArgumentException: ip-172-30-4-140
at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:212)
the ip-172-30-4-140 is simply the private ip of that node
I tried running the same sequence from the other two cassandra nodes, and for those, the sql statement gets stuck and never returns.
What is this error? any one knows?

This is not ip . you have to put either private ip or public dns .

ok, I found the problem.
The default value for the host parameter points to the internal ip DNS of the ec2, which causes the exception. It needs to be explicitly declared
sudo dse spark-sql-thriftserver start hive.server2.thrift.bind.host=your-ec2-private-ip

Related

Cluster configuration on MonetDB: cannot discover other nodes

I have installed and configured a 3 monetdb nodes cluster on 3 virtual machines on my MacBook (Using Oracle Virtual Box). I use MonetDB 5 server 11.37.7
I have followed the Cluster Management documentation of MonetDB, but the monetdb discover command only returns the dbfarm of the local instance. Each node still isn't aware of other nodes.
I can connect to any nodes from any other node using monetdb -h [host] -P [passphrase], I can also discover the remote farms of a specific host by using monetdb -h host -P passphrase discover
The answer to this question monetdb cluster management can't setup helped me in setting the listenaddr property to 0.0.0.0, but still, the discover command only returns the local monetdb farm.
EDIT
Thanks to Jennie suggestion below, I noticed that the monetdb log file contains error while sending broadcast message: Network is unreachable.
I used netcat utility to brodcast UDP message from one node to the other 2 and it worked, I can ping, ssh and the 3 nodes are part of the same network configured with virtualbox, but the error is still there.
All your VMs must be in the same LAN environment. monetdb discover basically goes over all IP addresses under the same subnet.
Can you some how verify that's the case?
I got it working, thanks to #Jennie's post. For anyone using VirtualBox:
Use the first network adapter of each configured node with Bridge access instead of NAT
Configure the following property of your dbfarm: listenaddr=0.0.0.0
For testing purpose, it may be worth reducing the property discoveryttl to less than the default 10mns

Why can't standalone slaves connect to master on separate Mac OS boxes?

I have two Macs (both OS X EI Caption) at home, both are connected to same wifi. I want to install an spark cluster (with two workers) on this two computers.
Mac1 (192.168.1.2) is my master, with Spark 1.5.2, it is up and working well, and I can see the Spark UI at http://localhost:8080/ (also I see spark://Mac1:7077)
I also have run one slave on this machine (Mac1), and I see it under workers in the Spark UI.
Then, I have copied the Spark on the second machine (Mac2), and I am trying to run another Slave on Mac2 (192.168.2.9) by this command:
./sbin/start-slave.sh spark://Mac1:7077
But, it does not work: Looking at log it shows:
Failed to connect to master Mac1:7077
Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster#Mac1:7077/),Path(/User/Master)]
Networking-wise, at Mac1, I can SSH to Mac2, and vice versa, but I cannot telnet to Mac1:7077.
I will appreciate it if you help me to solve this problem.
tl;dr Use -h option for ./sbin/start-master.sh, i.e. ./sbin/start-master.sh -h Mac1
Optionally, you could do ./sbin/start-slave.sh spark://192.168.1.2:7077 instead.
The reason is that binding to ports in Spark is very sensitive to what names and IPs are used. So, in your case, 192.168.1.2 != Mac1. They're different "names" in Spark, and that's why you can use ssh successfully as it uses name resolver on OS while it does not work at Spark level where the above condition holds, i.e. the "names" are not equal.
Likely a networking/firewall issue on the mac.
Also, your error message you copy/pasted reference port 7070. is this the issue?
using IP addresses in conf/slaves works somehow, but I have to use IP everywhere to address the cluster instead of host name.
SPARK + Standalone Cluster: Cannot start worker from another machine

How does one install etcd in a cluster?

Newbie w/ etcd/zookeeper type services ...
I'm not quite sure how to handle cluster installation for etcd. Should the service be installed on each client or a group of independent servers? I ask because if I'm on a client, how would I query the cluster? Every tutorial I've read shows a curl command running against localhost.
For etcd cluster installation, you can install the service on independent servers and form a cluster. The cluster information can be queried by logging onto one of the machines and running curl or remotely by specifying the IP address of one of the cluster member node.
For more information on how to set it up, follow this article

How to restart single node hadoop cluster on ec2

I have installed a single node haodoop cluster on using Hortonworks/Ambari on Amazon's ec2 host.
Since I don't want this cluster running 24/7, I stop the instance when done. When I reboot the instance later, I get a new IP address and then ambari no longer is able to start the Hadoop related services.
Is there a way other than completely redeploying to reconfigure the cluster so the services will start?
It looks like the IP address lives in various xml files under /etc, in the postgres database table ambari, and possibly other places I haven't found yet.
I tried updating the xml files and postgres database with updated versions of the ip address, internal and external dns names as I could find them, but to no avail. I have not been able to restart the services.
The reason I am doing this is to possibly save the deployment time and data configuration on hdfs and other project specific setup each time I restart the host.
Any suggestions?
Thanks!
Elastic IP can be used. Also, since you mentioned it being a single node cluster - you can use localhost or private IP.
If you use elastic IP, your UIs will always be on the same public IP. However, if you use private IP or localhost and do not associate your instance with an elastic IP you will have to look for public IP everytime you start the instance and then connect to the web UI using the IP.
Thanks for the help, both Harman and TJ are correct. I haven't used an elastic IP because I might have more than one of these running and a time, and for now at least, I don't mind looking up the public ip address.
Harman's suggestion of using "localhost" as the fqdn when setting up ambari in the first place is a really good idea in retrospect. Unless I go through the whole setup again, that's water under the bridge for me, but I recommend this to others who might read this post.
In my case, I figured this out on my own before coming back to the page. The specific step I took was insanely simple after all, thanks to Occam's Razor.
I added the following line in /etc/hosts:
<new internal IP> <old internal dns name>
and then did
ambari-server restart. from the command line. Then I am able to restart all services after logging into ambari.

Apache Spark error : Could not connect to akka.tcp://sparkMaster#

This is our first steps using big data stuff like apache spark and hadoop.
We have a installed Cloudera CDH 5.3. From the cloudera manager we choose to install spark. Spark is up and running very well in one of the nodes in the cluster.
From my machine I made a little application that connects to read a text file stored on hadoop HDFS.
I am trying to run the application from Eclipse and it displays these messages
15/02/11 14:44:01 INFO client.AppClient$ClientActor: Connecting to master spark://10.62.82.21:7077...
15/02/11 14:44:02 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#10.62.82.21:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#10.62.82.21:7077
15/02/11 14:44:02 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#10.62.82.21:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /10.62.82.21:7077
The application is has one class the create a context using the following line
JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("Spark Count").setMaster("spark://10.62.82.21:7077"));
where this IP is the IP of the machine spark working on.
Then I try to read a file from HDFS using the following line
sc.textFile("hdfs://10.62.82.21/tmp/words.txt")
When I run the application I got the
Check your Spark master logs, you should see something like:
15/02/11 13:37:14 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster#mymaster:7077]
15/02/11 13:37:14 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkMaster#mymaster:7077]
15/02/11 13:37:14 INFO Master: Starting Spark master at spark://mymaster:7077
Then when your connecting to the master, be sure to use exactly the same hostname as found in the logs above (do not use the IP address):
.setMaster("spark://mymaster:7077"));
Spark standalone is a bit picky with this hostname/IP stuff.
When you create your Spark master using the shell command "sbin/start-master.sh". go the the address http://localhost:8080 and check the "URL" row.
I notice no accepted answer, just for info I thought I'd mention a couple things.
First, in the spark-env.sh file in the conf directory, the SPARK_MASTER_IP and SPARK_LOCAL_IP settings can be hostnames. You don't want them to be, but they can be.
As noted in another answer, Spark can be a little picky about hostname vs. IP address, because of this resolved bug/feature: See bug here. The problem is, it's not clear if they "resolved" is simply by telling us to use IP instead of hostname?
Well I am having this same problem right now, and the first thing you do is check the basics.
Can you ping the box where the Spark master is running? Can you ping the worker from the master? More importantly, can you password-less ssh to the worker from the master box? Per 1.5.2 docs you need to be able to do that with a private key AND have the worker entered in the conf/slaves file. I copied the relevant paragraph at the end.
You can get a situation where the worker can contact the master but the master can't get back to the worker so it looks like no connection is being made. Check both directions.
Finally of all the combinations of settings, in a limited experiment just now I only found one that mattered: On the master, in spark-env.sh, set the SPARK_MASTER_IP to the IP address, not hostname. Then connect from the worker with spark://192.168.0.10:7077 and voila it connects! Seemingly none of the other config parameters are needed here.
Here's the paragraph from the docs about ssh and slaves file in conf:
To launch a Spark standalone cluster with the launch scripts, you
should create a file called conf/slaves in your Spark directory, which
must contain the hostnames of all the machines where you intend to
start Spark workers, one per line. If conf/slaves does not exist, the
launch scripts defaults to a single machine (localhost), which is
useful for testing. Note, the master machine accesses each of the
worker machines via ssh. By default, ssh is run in parallel and
requires password-less (using a private key) access to be setup. If
you do not have a password-less setup, you can set the environment
variable SPARK_SSH_FOREGROUND and serially provide a password for each
worker.
Once you have done that, using the IP address should work in your code. Let us know! This can be an annoying problem, and learning that most of the config params don't matter was nice.

Resources