JanusGraph query exception: org.apache.tinkerpop.gremlin.groovy.plugin.RemoteException: Could not find type for id

I can log in and connect to the TinkerPop server successfully, but when I execute a Gremlin query I get a strange exception: org.apache.tinkerpop.gremlin.groovy.plugin.RemoteException: Could not find type for id: 137481. Running g.V(137481) also throws an exception, but g.V(137481).valueMap(true) returns a node. Here is the Gremlin execution result:
[root@docker9 janusgraph-0.2.0-hadoop2]# bin/gremlin.sh
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
plugin activated: janusgraph.imports
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/janusgraph-0.2.0-hadoop2/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/janusgraph-0.2.0-hadoop2/lib/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
09:51:06 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
==>Configured cdh-slave1/192.168.66.149:8182-[f699751a-c046-472f-8d84-a22a7897b241]
gremlin> :remote console
==>All scripts will now be sent to Gremlin Server - [cdh-slave1/192.168.66.149:8182]-[f699751a-c046-472f-8d84-a22a7897b241] - type ':remote console' to return to local mode
gremlin> g
==>graphtraversalsource[standardjanusgraph[cassandrathrift:[192.168.66.149]], standard]
gremlin> g.V(137481).valueMap()
==>{}
gremlin> g.V(137481)
Server could not serialize the result requested. Server error - Error during serialization: Could not find type for id: 137481. Note that the class must be serializable by the client and server for proper operation.
Type ':help' or ':h' for help.
Display stack trace? [yN]
gremlin>
gremlin> g.V(137481).valueMap(true)
==>{id=137481, label=vertex}
I'm sure that the vertex whose 'uri' is '/0/85' already exists:
gremlin> g.V().has('uri','/0/85').valueMap()
Could not find type for id: 137481
Type ':help' or ':h' for help.
Display stack trace? [yN]y
org.apache.tinkerpop.gremlin.groovy.plugin.RemoteException: Could not find type for id: 137481
at org.apache.tinkerpop.gremlin.console.groovy.plugin.DriverRemoteAcceptor.submit(DriverRemoteAcceptor.java:175)
at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:99)
at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:122)
at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:95)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:124)
at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:59)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:132)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:152)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:83)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:166)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:478)

This happens when multiple instances of Gremlin Server are registered because a Gremlin Server was not shut down or killed properly, for example after the VM hosting Gremlin Server restarted.
The solution is to log in to the Gremlin Console and open the graph with the commands that match your backend; in my case that is Cassandra and Elasticsearch.
So I ran:
Method 1
:remote connect tinkerpop.server conf/remote.yaml session
:remote console session
or
graph=JanusGraphFactory.open('conf/janusgraph-cql-es.properties');
g=graph.traversal()
If you are running in containers, the command should look similar to this:
graph=JanusGraphFactory.open('/etc/opt/janusgraph/janusgraph.properties');
g=graph.traversal()
Now, after running those you can run
mgmt = graph.openManagement()
mgmt.getOpenInstances()
This displays all open instances, e.g.:
ac12000231-a9ffbcbb0e921
ac12000230-a9ffbcbb0e921(current)
Close every instance except the current one:
mgmt.forceCloseInstance('ac12000231-a9ffbcbb0e921')
After closing the stale instances, commit the changes:
mgmt.commit()
Now restart your Gremlin Server and run your query; it should work.
Method 2
If the problem persists, kill your Gremlin Server and start it again a few times; it should then work.
Another reason this can happen is that the data was not restored properly. If you are using a cluster, take the backup on all nodes and then restore it on your destination node or nodes.
I used nodetool for the backup and sstableloader for restoring the data.
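The clean-up in Method 1 can be scripted when many stale instances are listed. A minimal sketch, assuming the instance ids have been copied out of the mgmt.getOpenInstances() output as plain strings and that the running instance carries the "(current)" suffix as in the listing above. The helper name force_close_commands is hypothetical, and the generated lines are meant to be pasted into the Gremlin Console, not executed from Python:

```python
def force_close_commands(open_instances):
    """Build Gremlin Console commands that close every instance except the current one."""
    commands = []
    for instance in open_instances:
        instance = instance.strip()
        # The instance opened by this console is marked "(current)" and must stay open.
        if not instance or instance.endswith("(current)"):
            continue
        commands.append("mgmt.forceCloseInstance('%s')" % instance)
    # Commit once, after all stale instances have been closed.
    if commands:
        commands.append("mgmt.commit()")
    return commands


if __name__ == "__main__":
    listed = ["ac12000231-a9ffbcbb0e921", "ac12000230-a9ffbcbb0e921(current)"]
    print("\n".join(force_close_commands(listed)))
```

Generating the commands rather than running them keeps the final decision about which instances to close with the operator.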

Related

Hive remote postgres metastore

I was doing a multi-node setup using the Apache distribution. I was able to complete the Hadoop installation successfully (Hadoop 2.7.3).
When I tried Hive (Hive 2.3), it worked without issues with the default metastore (Derby). Then I changed hive-site.xml to point to my external PostgreSQL database.
I set the host, username, and password as per the tutorial. But when I ran the schema initialization, it failed as shown below: it still shows the Derby details, and the initialization fails.
Has anybody faced the same issue?
bash-4.2$ /data/hive/bin/schematool -initSchema -dbType postgres --verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.postgres.sql
Connecting to jdbc:derby:;databaseName=metastore_db;create=true
Connected to: Apache Derby (version 10.10.2.0 - (1582446))
Driver: Apache Derby Embedded JDBC Driver (version 10.10.2.0 - (1582446))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:derby:> !autocommit on
Autocommit status: true
0: jdbc:derby:> SET statement_timeout = 0
Error: Syntax error: Encountered "statement_timeout" at line 1, column 5. (state=42X01,code=30000)
Closing: 0: jdbc:derby:;databaseName=metastore_db;create=true
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:590)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:563)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1145)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:980)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:959)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:586)
... 8 more
*** schemaTool failed ***
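The output above reports a Derby connection URL even though hive-site.xml was edited, so one thing worth checking is what the hive-site.xml on that machine actually contains. A minimal sketch, assuming the usual metastore property names that start with javax.jdo.option. (for example javax.jdo.option.ConnectionURL); the function name and the sample values are hypothetical, and this check does not by itself explain the failure:

```python
import xml.etree.ElementTree as ET


def metastore_settings(hive_site_xml):
    """Return the javax.jdo.option.* properties found in a hive-site.xml document."""
    root = ET.fromstring(hive_site_xml)
    settings = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith("javax.jdo.option."):
            settings[name] = value
    return settings


if __name__ == "__main__":
    # Hypothetical sample; replace with the contents of your own hive-site.xml.
    sample = """<configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:postgresql://dbhost:5432/metastore</value>
      </property>
    </configuration>"""
    for key, val in metastore_settings(sample).items():
        print(key, "=", val)
```

If the file lists a PostgreSQL URL but schematool still prints the Derby one, the edited file is probably not the one schematool reads.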

Apache Drill (Embedded): Failure setting up ZK for client

I am new to Apache Drill and am currently following the "Drill in 10 Minutes" tutorial to learn about it.
However, after checking that I had the prerequisites, I hit an error when executing the steps in the 'Start Drill on Windows' section:
Open Command Prompt.
Open the apache-drill- folder.
Go to the bin directory. For example: cd bin
Type the following command on the command line: sqlline.bat -u "jdbc:drill:zk=local"
Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. (state= ,code=0)
java.sql.SQLException: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:167)
at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:72)
at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:143)
at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167)
at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213)
at sqlline.Commands.connect(Commands.java:1083)
at sqlline.Commands.connect(Commands.java:1015)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
at sqlline.SqlLine.dispatch(SqlLine.java:742)
at sqlline.SqlLine.initArgs(SqlLine.java:528)
at sqlline.SqlLine.begin(SqlLine.java:596)
at sqlline.SqlLine.start(SqlLine.java:375)
at sqlline.SqlLine.main(SqlLine.java:268)
Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:329)
at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:158)
... 18 more
Caused by: java.io.IOException: Failure to connect to the zookeeper cluster service within the allotted time of 10000 milliseconds.
at org.apache.drill.exec.coord.zk.ZKClusterCoordinator.start(ZKClusterCoordinator.java:123)
at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:327)
... 19 more
local (The system cannot find the file specified)
I am using Apache Drill 1.11.0.
Where is the 'local' file, and where can I get it?

Try drillbit instead of zk in the connection string, because ZooKeeper is not involved when you run Drill in embedded mode:
"jdbc:drill:drillbit=local"

I had this issue, but I was using PowerShell rather than Command Prompt.
Try running cmd /r 'sqlline.bat -u "jdbc:drill:zk=local"'

Unable to start Hive CLI Hadoop(MapR)

I am trying to access hive CLI. However, it is failing to start with the following AccessControl issue.
Strangely enough, I am able to query Hive data from Hue without the AccessControl issue. However, Hive CLI is not working.
I am on a MapR cluster.
Any help is much appreciated.
[<user_name>@<edge_node> ~]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/hive/hive-2.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/opt/mapr/hive/hive-2.1/conf/hive-log4j2.properties Async: true
2017-09-23 23:52:08,988 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-api-jdo-4.2.4.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-api-jdo-4.2.1.jar."
2017-09-23 23:52:08,993 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-core-4.1.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-core-4.1.6.jar."
2017-09-23 23:52:09,004 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-rdbms-4.1.19.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-rdbms-4.1.7.jar."
2017-09-23 23:52:09,038 INFO [main] DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
2017-09-23 23:52:09,039 INFO [main] DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-09-23 23:52:14,2251 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:2172 Thread: 20235 mkdirs failed for /user/<user_name>, error 13
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: User <user_name>(user id 50005586) has been denied access to create <user_name>
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:617)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:531)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:646)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.security.AccessControlException: User <user_name>(user id 50005586) has been denied access to create <user_name>
at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1256)
at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1276)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1913)
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getDefaultDestDir(DagUtils.java:823)
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getHiveJarDirectory(DagUtils.java:917)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createJarLocalResource(TezSessionState.java:616)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:256)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.beginOpen(TezSessionState.java:220)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:614)
... 10 more

The error says you were denied access to create a directory in the file system. This is likely /user/<user name>, which will need to be created by the HDFS / MapR FS superuser.
As for being able to query Hive data from Hue without the AccessControl issue: Hue communicates via Thrift and HiveServer2.
Hive CLI bypasses HiveServer2 and is deprecated.
You should use Beeline instead:
beeline -n $(whoami) -u jdbc:hive2://hiveserver:10000/default
And if you're in a kerberized cluster, then you'll need some extra options there.
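For the Kerberos case, the extra option usually goes into the JDBC URL as a principal parameter naming the HiveServer2 service principal. A minimal sketch that assembles the Beeline arguments; the helper name, the host, and the principal hive/_HOST@EXAMPLE.COM are placeholders, not values from this cluster:

```python
import getpass
import shlex


def beeline_args(host, port=10000, database="default", principal=None, user=None):
    """Assemble a beeline command line for HiveServer2."""
    url = "jdbc:hive2://%s:%d/%s" % (host, port, database)
    if principal:
        # Kerberized cluster: authentication comes from the ticket cache (kinit),
        # so no -n user name is passed.
        return ["beeline", "-u", url + ";principal=" + principal]
    return ["beeline", "-n", user or getpass.getuser(), "-u", url]


if __name__ == "__main__":
    # shlex.join quotes the URL, which matters because ';' is special to the shell.
    print(shlex.join(beeline_args("hiveserver", user="someuser")))
    print(shlex.join(beeline_args("hiveserver", principal="hive/_HOST@EXAMPLE.COM")))
```

On a kerberized cluster a valid ticket (obtained with kinit) is needed before the second form will connect.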

Hive does not start: Error creating path /hive/cluster/delegation/METASTORE/keys

I ran into a problem on a kerberized cluster where hive would not start.
Symptoms:
Services start successfully (and did not stop)
In Ambari an alert appeared which mentioned that the Hive metastore failed
Starting hive on the command line did not succeed (it just kept hanging)
Via beeline I was able to see metadata, but not get actual data
I found the following error in /var/log/hive/hivemetastore.log
2016-08-29 10:12:49,047 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5934)) - Metastore Thrift Server threw an exception...
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hive/cluster/delegation/METASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:469)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server.startDelegationTokenSecretManager(HadoopThriftAuthBridge.java:444)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6015)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5930)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hive/cluster/delegation/METASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more

Note that I actually tried several things, so I am not sure whether this is the full solution, but here is the final step, which I believe to be the critical one:
After a long search I indirectly found this site: https://community.hortonworks.com/articles/49040/hive-metastore-crashes-on-nullpointerexception-wit.html
Here is the relevant fragment that helped me resolve the issue:
This is a known issue being tracked in the following Hortonworks bug:
https://hortonworks.jira.com/browse/BUG-42602
WORKAROUND:
Set the hive.cluster.delegation.token.store.class to the following:
hive.cluster.delegation.token.store.class=org.apache.hadoop.hive.thrift.DBTokenStore
If using Ambari, this setting can be changed by clicking on the Hive
service on the Ambari Dashboard, navigating to the "Configs" tab, and
modifying the parameter in the "Advanced Hive-site" section of the
Hive configs. Save the changes and restart Hive from the Ambari User
Interface when prompted.
If not using ambari, this setting can be located in the
/etc/hive/conf/hive-site.xml file. Make sure this change is made on
all applicable nodes on the cluster. Once the changes are made, the
Hive services must be restarted.
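For the non-Ambari case, the workaround amounts to adding or updating one property in hive-site.xml. A minimal sketch using Python's standard library, assuming the usual configuration/property/name/value layout of the file; the function name is hypothetical, and the file should be backed up before it is rewritten:

```python
import xml.etree.ElementTree as ET

TOKEN_STORE_KEY = "hive.cluster.delegation.token.store.class"
DB_TOKEN_STORE = "org.apache.hadoop.hive.thrift.DBTokenStore"


def set_property(hive_site_xml, name, value):
    """Return hive-site.xml text with property `name` set to `value`."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            node = prop.find("value")
            if node is None:
                node = ET.SubElement(prop, "value")
            node.text = value
            break
    else:
        # Property not present yet: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    original = "<configuration></configuration>"
    print(set_property(original, TOKEN_STORE_KEY, DB_TOKEN_STORE))
```

Note that the default ElementTree parser drops XML comments, so comments in the original file are not preserved in the returned text.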

Why do the Spark examples fail to spark-submit on EC2 with spark-ec2 scripts?

I downloaded spark-1.5.2 and set up a cluster on EC2 following the spark-ec2 documentation.
After that I went to examples/, ran mvn package, and packaged the examples into a jar.
Finally, I ran the submit with:
bin/spark-submit --class org.apache.spark.examples.JavaTC --master spark://url_here.eu-west-1.compute.amazonaws.com:7077 --deploy-mode cluster /home/aki/Projects/spark-1.5.2/examples/target/spark-examples_2.10-1.5.2.jar
Instead of running, it fails with this error:
WARN RestSubmissionClient: Unable to connect to server spark://url_here.eu-west-1.compute.amazonaws.com:7077.
Warning: Master endpoint spark://url_here.eu-west-1.compute.amazonaws.com:7077 was not a REST server. Falling back to legacy submission gateway instead.
15/12/22 17:36:07 WARN Utils: Your hostname, aki-linux resolves to a loopback address: 127.0.1.1; using 192.168.10.63 instead (on interface wlp4s0)
15/12/22 17:36:07 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/12/22 17:36:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:116)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at org.apache.spark.deploy.Client$$anonfun$7.apply(Client.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.deploy.Client$.main(Client.scala:233)
at org.apache.spark.deploy.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
... 21 more

Are you sure the master URL really contains "url_here"?
spark://url_here.eu-west-1.compute.amazonaws.com:7077
Or maybe you obfuscated it for this post.
If you can connect to the Spark UI at http://url_here.eu-west-1.compute.amazonaws.com:4040 or, depending on your Spark version, http://url_here.eu-west-1.compute.amazonaws.com:8080, make sure the spark://...:7077 command-line argument matches the URL shown on the Spark UI.
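Since the submission timed out after 120 seconds rather than being rejected, it may also be worth checking that the master's ports are reachable from the submitting machine at all. A minimal sketch using only the standard library; the function name is hypothetical, and an unreachable port (for example because of a firewall or EC2 security group) is only one possible cause, not a confirmed diagnosis:

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host names that do not resolve.
        return False
```

Calling port_open with the master host name and port 7077 (the port used in the spark:// URL of the question) or 8080 (the standalone master web UI) shows whether each one can be reached from the machine running spark-submit.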
