Getting error in Talend job with tHiveConnection component - hadoop

I am getting the error below while executing a Talend job that uses the tHiveConnection component.
I am using Java 1.7, Hadoop 2.2, and Talend Open Studio for Big Data 6.0.
Please help me identify the cause of this error.
The error details are as follows:
Starting job CH04_01_HIVE_PROCESSING_HASH_TAGS at 09:15 09/08/2015.
[statistics] connecting to socket on port 3662
[statistics] connected
Exception in component tHiveConnection_1
java.lang.ClassNotFoundException: org.apache.hadoop.hive.jdbc.HiveDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at packt_big_data.ch04_01_hive_processing_hash_tags_0_1.CH04_01_HIVE_PROCESSING_HASH_TAGS.tHiveConnection_1Process(CH04_01_HIVE_PROCESSING_HASH_TAGS.java:689)
at packt_big_data.ch04_01_hive_processing_hash_tags_0_1.CH04_01_HIVE_PROCESSING_HASH_TAGS.runJobInTOS(CH04_01_HIVE_PROCESSING_HASH_TAGS.java:2084)
at packt_big_data.ch04_01_hive_processing_hash_tags_0_1.CH04_01_HIVE_PROCESSING_HASH_TAGS.main(CH04_01_HIVE_PROCESSING_HASH_TAGS.java:1833)
[statistics] disconnected
Job CH04_01_HIVE_PROCESSING_HASH_TAGS ended at 09:15 09/08/2015. [exit code=1]

org.apache.hadoop.hive.jdbc.HiveDriver was used to connect to the original HiveServer.
With HiveServer2, use org.apache.hive.jdbc.HiveDriver together with a jdbc:hive2:// URL. Reading the Hive JDBC documentation could do no harm either, especially the comment that reads "we need the following jars in the classpath".
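As a rough illustration of the point above, here is a minimal Java sketch that maps a Hive JDBC URL to the driver class that normally handles it and reports whether that class is visible on the current classpath. The class name HiveDriverChoice, its method names, and the default URL are made up for this example; only the two driver class names and the jdbc:hive:// and jdbc:hive2:// prefixes come from Hive itself.

```java
/** Sketch: pick the Hive JDBC driver class that matches a URL and check the classpath. */
final class HiveDriverChoice {
    // Driver for the original HiveServer (jdbc:hive:// URLs).
    static final String HS1_DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";
    // Driver for HiveServer2 (jdbc:hive2:// URLs).
    static final String HS2_DRIVER = "org.apache.hive.jdbc.HiveDriver";

    /** Returns the driver class name for the URL scheme, or null if it is not a Hive URL. */
    static String driverFor(String jdbcUrl) {
        if (jdbcUrl.startsWith("jdbc:hive2://")) return HS2_DRIVER;
        if (jdbcUrl.startsWith("jdbc:hive://")) return HS1_DRIVER;
        return null;
    }

    /** True if the class can be found on the current classpath (it is not initialized). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, HiveDriverChoice.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; pass the real one as the first argument.
        String url = args.length > 0 ? args[0] : "jdbc:hive2://localhost:10000/default";
        String driver = driverFor(url);
        if (driver == null) {
            System.out.println("Not a Hive JDBC URL: " + url);
            return;
        }
        String status = isOnClasspath(driver) ? "found" : "missing from classpath";
        System.out.println(url + " -> " + driver + " (" + status + ")");
    }
}
```

Run it with the same classpath as the failing job. If it reports the driver as missing, the jar that contains that class is not being picked up.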

Fiware Cosmos Hive Authorization Issue

I'm using a shared instance of Fiware Cosmos (meaning I don't have root privileges). Until today I could successfully access and manage tables in Hive, both remotely using JDBC and through the Hive CLI.
But now I'm getting this error when starting Hive CLI:
log4j:ERROR Could not instantiate class [org.apache.hadoop.hive.shims.HiveEventCounter].
java.lang.RuntimeException: Could not load shims in class org.apache.hadoop.log.metrics.EventCounter
at org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:123)
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:115)
at org.apache.hadoop.hive.shims.ShimLoader.getEventCounter(ShimLoader.java:98)
at org.apache.hadoop.hive.shims.HiveEventCounter.<init>(HiveEventCounter.java:34)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at java.lang.Class.newInstance0(Class.java:357)
at java.lang.Class.newInstance(Class.java:310)
at org.apache.log4j.helpers.OptionConverter.instantiateByClassName(OptionConverter.java:330)
at org.apache.log4j.helpers.OptionConverter.instantiateByKey(OptionConverter.java:121)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:664)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:647)
at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:544)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:440)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:476)
at org.apache.log4j.PropertyConfigurator.configure(PropertyConfigurator.java:354)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:127)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:77)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:58)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:641)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.log.metrics.EventCounter
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:171)
at org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:120)
... 27 more
log4j:ERROR Could not instantiate appender named "EventCounter".
Logging initialized using configuration in jar:file:/usr/local/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/hive-log4j.properties
I can however perform select and create in the Hive CLI.
If I then try to access Hive remotely, I get this:
Connecting to jdbc:hive://x.x.x.x:10000/default?user=user&password=XXXXXXXXXX
Could not establish connection: java.net.ConnectException: Connection refused
I didn't make any changes to code or commands before the errors appeared, and after googling around I haven't found any working solutions.
If anyone can guide me to where the problem is, or how to find it, or even better how to solve it, I'd be grateful.
Thanks in advance!
HiveServer2 (the Hive JDBC service) is a very unstable piece of software. In our Prod cluster we have a cron job that restarts each instance every day, and even then it sometimes throws OutOfMemory errors and then just hangs, answering "Connection refused" as in your output. Open a ticket with your Hadoop admin so that he/she restarts the damn service.
On the other hand, the org.apache.hadoop.log.metrics.EventCounter message smells like someone changed a shared config somewhere (or tried to upgrade some JARs), and now Hive believes that it is running on a very, very old version of Hadoop; see, e.g., the comments in HIVE-4133 or that MapR support post.
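To tell a stopped or hung service apart from a client-side problem before opening the ticket, a plain TCP probe is often enough. The following is a minimal sketch, not part of Hive. The class name PortProbe, the method isListening, and the 3-second timeout are assumptions for this example, and the defaults (localhost, port 10000) should be replaced with the real host and port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Sketch: check whether anything accepts TCP connections on host:port. */
final class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // "Connection refused", timeouts, and unknown hosts all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10000;
        boolean up = isListening(host, port, 3000);
        System.out.println(host + ":" + port + (up ? " accepts connections" : " is not reachable"));
    }
}
```

A successful probe only shows that something is listening on the port. A service that accepts the connection but never answers queries would still pass this check.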
The cause of these issues was the Hive upgrades in Cosmos. A more thorough explanation and a solution can be found here:
My Hive client stopped working with Cosmos instance

HiveServer Class Not Found Exception

Running hive from the command prompt works absolutely fine. But when I try to start hiveserver using the "hive --service hiveserver" command, I get the following exception.
Starting Hive Thrift Server
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.hive.service.HiveServer
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
I then tried the command "hive --service hiveserver2", but that did not solve the problem either.
Can anybody please suggest a solution for this problem?
Maybe another process (another hiveserver) is already listening on port 10000.
You can check with:
netstat -ntulp | grep ':10000'
If a process shows up, kill it. Otherwise, start the server on another port.
By the way, which version are you using?
I got this error when Hadoop could not find hive-service-*.jar on its classpath. Either copy hive-service-*.jar to your Hadoop lib folder or export the classpath in hadoop-env.sh, as shown below.
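Add this line to hadoop-env.sh (a Java classpath entry may end in a bare * to include every jar in a directory, but a partial pattern such as hive-*.jar is not expanded):
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/local/hive/lib/*"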
I used /usr/local/hive as the Hive path because that is where I have Hive installed. Change it to point to your own Hive installation.
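Before touching the classpath, it can help to confirm which jar (if any) in the Hive lib folder actually contains the missing class. The sketch below scans a directory of jars for a class entry. The class name JarClassFinder and the method jarsContaining are made up for this example, and the default directory is simply the /usr/local/hive/lib path used above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarFile;

/** Sketch: find the jars in a directory that contain a given class. */
final class JarClassFinder {
    /** Lists the names of the jars directly under libDir that contain className. */
    static List<String> jarsContaining(File libDir, String className) {
        // Classes are stored in jars as path-like entries ending in ".class".
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return hits; // libDir is not a directory or cannot be read
        }
        Arrays.sort(jars); // stable, alphabetical output
        for (File jar : jars) {
            try (JarFile jarFile = new JarFile(jar)) {
                if (jarFile.getEntry(entry) != null) {
                    hits.add(jar.getName());
                }
            } catch (IOException e) {
                // Skip files that are not readable jars.
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        File libDir = new File(args.length > 0 ? args[0] : "/usr/local/hive/lib");
        String cls = args.length > 1 ? args[1] : "org.apache.hadoop.hive.service.HiveServer";
        System.out.println(cls + " found in: " + jarsContaining(libDir, cls));
    }
}
```

An empty list means that no jar in that folder ships the class, so adding the folder to the classpath would not help.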

HiveServer: ClassNotFound (HiveServer)

First of all: I am new to Hive.
I just installed Hive and when I run "hive" the server starts up and brings me into the CLI. But when I try to start it as a service/server with "hive --service hiveserver" I get:
Starting Hive Thrift Server
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.hive.service.HiveServer
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Two questions:
Is that the right way to start up a Hive server? (My understanding is that HCat and WebHCat are included automatically!?)
Why does this problem show up and how can I resolve it?
Thanks and regards!
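Try running bin/hive --service hiveserver2 instead of hive --service hiveserver for this version of Apache Hive. The original HiveServer was removed in Hive 1.0.0, which would explain why the class cannot be found.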

Unable to find partitioner class - Cassandra

Can someone help me fix the issue below, which I am facing with Cassandra when I run my application on Hadoop?
When I run the application, I get the following error about the partitioner class we specified in the application.
Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: Unable to find partitioner class 'org.apache.cassandra.dht.RandomPartitioner'
at org.apache.cassandra.hadoop.ConfigHelper.getInputPartitioner(ConfigHelper.java:426)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.validateConfiguration(AbstractColumnFamilyInputFormat.java:85)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.validateConfiguration(ColumnFamilyInputFormat.java:74)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:122)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
at com.test.cassandratest.WcJob.run(WcJob.java:96)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.test.cassandratest.WcJob.main(WcJob.java:104)
... 10 more
Caused by: org.apache.cassandra.exceptions.ConfigurationException: Unable to find partitioner class 'org.apache.cassandra.dht.RandomPartitioner'
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:458)
at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:470)
at org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:416)
at org.apache.cassandra.hadoop.ConfigHelper.getInputPartitioner(ConfigHelper.java:422)
... 26 more
Caused by: java.lang.NoClassDefFoundError: org/github/jamm/MemoryMeter$Guess
at org.apache.cassandra.utils.ObjectSizes.<clinit>(ObjectSizes.java:34)
at org.apache.cassandra.dht.RandomPartitioner.<clinit>(RandomPartitioner.java:45)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.cassandra.utils.FBUtilities.classForName(FBUtilities.java:450)
... 29 more
Caused by: java.lang.ClassNotFoundException: org.github.jamm.MemoryMeter$Guess
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 34 more
I hit the same problem when we upgraded Cassandra to 2.1 in our system, and the root cause was as follows.
Cassandra 2.1 uses jamm 0.3.0, while older Cassandra versions used 0.2.5. So please update the jamm version you use, and your problem may be fixed.
http://mvnrepository.com/artifact/com.github.jbellis/jamm/0.3.0
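To check which jamm jar the job really picks up, you can ask the JVM where a class was loaded from. This is a minimal sketch. The class name ClassOrigin and the method locate are made up for this example, and the two jamm class names are taken from the stack trace above.

```java
import java.security.CodeSource;

/** Sketch: report the jar or directory a class is loaded from. */
final class ClassOrigin {
    /** Returns the code source location of className, or a short note if it is unknown. */
    static String locate(String className) {
        try {
            // Load without initializing, so static initializers cannot fail here.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "(no code source, typically a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String[] names = args.length > 0
                ? args
                : new String[] {"org.github.jamm.MemoryMeter", "org.github.jamm.MemoryMeter$Guess"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same classpath as the Hadoop job. If MemoryMeter resolves to a jar but MemoryMeter$Guess is reported as not on the classpath, an older jamm jar is probably shadowing the newer one.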

Unable to load Hive-JDBC driver when accessed through MapReduce program on Amazon's Elastic MapReduce

I have written a MapReduce program that stores part of its output data in a Hive table.
I use the Hive JDBC driver to access the Hive table from the MapReduce code.
The program compiled successfully on my local machine.
After this, I created a JAR file and uploaded it to S3. Then I created an Elastic MapReduce cluster and started it.
However, the job fails with the following errors:
java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
attempt_201407161054_0001_m_000001_0: java.lang.ClassNotFoundException: org.apache.hadoop.hive.jdbc.HiveDriver
attempt_201407161054_0001_m_000001_0: at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
attempt_201407161054_0001_m_000001_0: at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
attempt_201407161054_0001_m_000001_0: at java.security.AccessController.doPrivileged(Native Method)
attempt_201407161054_0001_m_000001_0: at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
attempt_201407161054_0001_m_000001_0: at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
attempt_201407161054_0001_m_000001_0: at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
attempt_201407161054_0001_m_000001_0: at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
attempt_201407161054_0001_m_000001_0: at java.lang.Class.forName0(Native Method)
attempt_201407161054_0001_m_000001_0: at java.lang.Class.forName(Class.java:190)
attempt_201407161054_0001_m_000001_0: at HubAndAuthority.InputHubMapper.configure(InputHubMapper.java:38)
attempt_201407161054_0001_m_000001_0: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
attempt_201407161054_0001_m_000001_0: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
attempt_201407161054_0001_m_000001_0: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
attempt_201407161054_0001_m_000001_0: at java.lang.reflect.Method.invoke(Method.java:606)
It appears that the Hive JDBC driver is missing, and adding it to the classpath should resolve the issue. However, I do not know the exact steps for doing this on Amazon's EMR.
Could you please let me know what I am missing and how to resolve it?
Thanks and Regards,
Prafulla
I'm not entirely sure, but you could try this:
"Note
If you want your custom classpath to override the original class path, you should set the environment variable, HADOOP_USER_CLASSPATH_FIRST to true so that the HADOOP_CLASSPATH value specified in hadoop-user-env.sh is first."
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-config.html
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-config_hadoop-user-env.sh.html
Regards,
revet
