I am trying to run a simple Pig job using Oozie 4.0. I am running into the following error. I have created the sharelib folder on the Oozie server at the location /usr/lib/oozie/share/.
I have also pushed the same folder to HDFS at the following locations:
/user/hdfs
/user/oozie
/user/mapred
I have restarted the Hadoop services on the NameNode, JobTracker and the DataNodes. I did this with the Oozie service stopped and then brought the Oozie service back up using the command:
sudo -u oozie /usr/lib/oozie/bin/oozie-start.sh
I am still getting this error.
cluster_conf.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
<property>
<name>nameNode</name>
<value>hdfs://localhost:8020</value>
</property>
<property>
<name>resourceManager</name>
<value>localhost:8050</value>
</property>
<property>
<name>oozie.wf.application.path</name>
<value>/tmp/workflow.xml</value>
</property>
</configuration>
OozieTestScript.pig:
A = load '/tmp/passwd' using PigStorage(':');
B = foreach A generate $0 as id;
dump B;
store B into '/tmp/id5.out';
The Pig action is simple, as shown below. Note that I have masked the private IPs in the code copied here.
<pig xmlns="uri:oozie:workflow:0.2">
<job-tracker>172.xx.xx.xx:8050</job-tracker>
<name-node>hdfs://172.xx.xx.xx:8020</name-node>
<script>OozieTestScript.pig</script>
</pig>
Logs:
2014-05-15 19:47:47,872 WARN PigActionExecutor:542 - USER[mapred] GROUP[-] TOKEN[] APP[Oozie_workflow_test] JOB[0000000-140515194709114-oozie-oozi-W] ACTION[0000000-140515194709114-oozie-oozi-W#pigAction] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], main() threw exception, org/apache/pig/Main
2014-05-15 19:47:47,873 WARN PigActionExecutor:542 - USER[mapred] GROUP[-] TOKEN[] APP[Oozie_workflow_test] JOB[0000000-140515194709114-oozie-oozi-W] ACTION[0000000-140515194709114-oozie-oozi-W#pigAction] Launcher exception: org/apache/pig/Main
java.lang.NoClassDefFoundError: org/apache/pig/Main
at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:324)
at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:219)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.ClassNotFoundException: org.apache.pig.Main
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 17 more
2014-05-15 19:47:47,916 DEBUG ActionCheckXCommand:545 - USER[mapred] GROUP[-] TOKEN[] APP[Oozie_workflow_test] JOB[0000000-140515194709114-oozie-oozi-W] ACTION[0000000-140515194709114-oozie-oozi-W#pigAction] ENDED ActionCheckXCommand for wf actionId=0000000-140515194709114-oozie-oozi-W#pigAction, jobId=0000000-140515194709114-oozie-oozi-W
I am trying to do a Sqoop export. The sqoop command works just fine on the local servers, but when I try to use the same command as an Oozie action, I get the following error. Any help would be appreciated.
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception, org.apache.hadoop.hive.ql.io.AcidUtils.isTablePropertyTransactional(Ljava/util/Map;)Z
java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.io.AcidUtils.isTablePropertyTransactional(Ljava/util/Map;)Z
at org.apache.hive.hcatalog.mapreduce.FosterStorageHandler.configureInputJobProperties(FosterStorageHandler.java:134)
at org.apache.hive.hcatalog.common.HCatUtil.getInputJobProperties(HCatUtil.java:458)
at org.apache.hive.hcatalog.mapreduce.InitializeInput.extractPartInfo(InitializeInput.java:161)
at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:137)
at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)
at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:349)
at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:433)
at org.apache.sqoop.manager.SQLServerManager.exportTable(SQLServerManager.java:192)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:81)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:179)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:240)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://EAP-DR/user/sps_hpe_ibproetl/oozie-oozi/0005868-200326034412660-oozie-oozi-W/sqoop_2--sqoop/action-data.seq
7378 [main] INFO org.apache.hadoop.io.compress.zlib.ZlibFactory - Successfully loaded & initialized native-zlib library
7379 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate]
Successfully reset security manager from org.apache.oozie.action.hadoop.LauncherSecurityManager#4440750 to null
Oozie Launcher ends
When you run the sqoop command locally, it uses the jars from /usr/lib/sqoop/lib.
When you use the Oozie Sqoop action, it uses the jars from hdfs:///user/oozie/share/lib/lib_*/sqoop/
Looking at your error, one of the following applies:
The class org.apache.hadoop.hive.ql.io.AcidUtils is missing from the classpath.
The wrong version of org.apache.hadoop.hive.ql.io.AcidUtils is on the classpath.
Multiple versions of org.apache.hadoop.hive.ql.io.AcidUtils are on the classpath.
Since the error is a NoSuchMethodError rather than a NoClassDefFoundError, the second or third case is the more likely one.
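To tell these cases apart, it can help to list every jar that ships the class. Below is a minimal Python sketch, assuming you have first copied the sharelib sqoop directory to a local folder (for example with hdfs dfs -get). The function name find_class_in_jars is made up for this illustration and is not part of Oozie or Sqoop.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(lib_dir, class_name):
    """Return the sorted paths of all jars under lib_dir that contain class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that carry a .jar extension but are not valid archives.
            continue
    return hits
```

Calling find_class_in_jars on the local copy of the sharelib and on /usr/lib/sqoop/lib with "org.apache.hadoop.hive.ql.io.AcidUtils" shows which jars provide the class in each environment. An empty list corresponds to the first case, and more than one hit (or a jar whose version differs from the local one) corresponds to the other two.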
Spark version 1.6.2
Hadoop version 2.7.3
I am running Spark in standalone cluster mode.
Command (word count example):
spark-submit --class org.apache.spark.examples.JavaWordCount --master spark://IP:7077 spark-examples-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar file.txt output
I am getting the following error:
INFO cluster.SparkDeploySchedulerBackend: Executor app-20161125052710-0012/10 removed: java.io.IOException: Failed to create directory /usr/hdp/2.5.0.0-1245/spark/work/app-20161125052710-0012/10
ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.
This stopped SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
org.apache.spark.examples.JavaWordCount.main(JavaWordCount.java:44)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:606)
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The currently active SparkContext was created at:
(No active SparkContext.)
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:106)
at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1602)
at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2203)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:579)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at org.apache.spark.examples.JavaWordCount.main(JavaWordCount.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/11/25 04:24:48 INFO spark.SparkContext: SparkContext already stopped.
In the Spark master web UI, I see two workers in the ALIVE state.
It seems that "Failed to create directory /usr/hdp/2.5.0.0-1245/spark/work" was the root cause. After granting write permission on the /usr/hdp/2.5.0.0-1245/spark/work path, the job ran fine.
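A quick way to confirm this kind of problem before resubmitting is to probe the work directory directly. The following Python sketch is only an illustration: the helper name can_create_subdir is made up, and it assumes you run it on each worker node as the same user that runs the Spark worker process.

```python
import os
import tempfile


def can_create_subdir(parent):
    """Return True if the current user can create a directory under parent."""
    try:
        # mkdtemp creates a uniquely named directory, so nothing existing is touched.
        probe = tempfile.mkdtemp(prefix="probe-", dir=parent)
    except OSError:
        # Covers a missing parent as well as missing write permission.
        return False
    os.rmdir(probe)
    return True
```

For the case above, the call would be can_create_subdir("/usr/hdp/2.5.0.0-1245/spark/work"). A False result on any worker suggests the executor directories cannot be created there.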
I have a problem running a Sqoop job in Oozie using Hue. I have a 4-node cluster based on Hortonworks HDP.
My Sqoop job looks like this:
import
--options-file ./dss_conn_parms.txt
--table BD.BD_TABLE
--target-dir /user/user_1/DMS
--m 1
--hive-import
--hive-table BD.BD_TABLE_HIVE
The data from the Oracle database was successfully downloaded and written to HDFS. Unfortunately, the Hive import doesn't work. The error is related to permissions:
73167 [main] INFO org.apache.sqoop.hive.HiveImport - Loading uploaded data into Hive
2016-10-17 09:42:55,203 INFO [main] hive.HiveImport (HiveImport.java:importTable(195)) - Loading uploaded data into Hive
73180 [main] ERROR org.apache.sqoop.tool.ImportTool - Encountered IOException running import job: java.io.IOException: Cannot run program "hive": error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at java.lang.Runtime.exec(Runtime.java:620)
at java.lang.Runtime.exec(Runtime.java:528)
at org.apache.sqoop.util.Executor.exec(Executor.java:76)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:391)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:344)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:245)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: error=13, Permission denied
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 31 more
2016-10-17 09:42:55,216 ERROR [main] tool.ImportTool (ImportTool.java:run(613)) - Encountered IOException running import job: java.io.IOException: Cannot run program "hive": error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at java.lang.Runtime.exec(Runtime.java:620)
at java.lang.Runtime.exec(Runtime.java:528)
at org.apache.sqoop.util.Executor.exec(Executor.java:76)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:391)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:344)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:245)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: error=13, Permission denied
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 31 more
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
Do you have any idea why the Sqoop job cannot run the Hive import command?
UPDATE
I executed the Sqoop job with the Hive import options from the command line, and I think I know what the problem is. On the command line I can see this message:
Logging initialized using configuration in jar:file:/usr/hdp/2.4.2.0-258 /hive/lib/hive-common-1.2.1000.2.4.2.0-jar!/hive-log4j.properties
OK
The problem is with access to hive-common-1.2.1000.2.4.2.0-jar, which is located on the local file system. Any idea what I should do?
1) Try adding
<env-var>HADOOP_USER_NAME=YOUR_USERNAME</env-var>
within the <shell xmlns="uri:oozie:shell-action:0.2"> section of your job definition in workflow.xml.
2) In core-site.xml, add:
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.YOUR_USERNAME.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.YOUR_USERNAME.groups</name>
<value>*</value>
</property>
Then restart all services and try again.
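As a complementary check for the error=13 in the question: on Linux, error 13 is EACCES, which typically means a file named hive was found but could not be executed. Below is a rough Python sketch that lists every file with a given name on the search path and whether it has any execute bit set. The helper name find_on_path is made up, and the check is coarse: an execute bit being set does not prove that the job's user is the one allowed to use it, so it is best run on the NodeManager hosts as the user the launcher runs as.

```python
import os
import stat


def find_on_path(program, path=None):
    """Return (candidate, has_execute_bit) for each regular file named program on the path."""
    if path is None:
        path = os.environ.get("PATH", "")
    results = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, program)
        try:
            mode = os.stat(candidate).st_mode
        except OSError:
            # Not present in this directory, or the directory cannot be searched.
            continue
        if stat.S_ISREG(mode):
            # True if any execute bit (user, group or other) is set.
            results.append((candidate, bool(mode & 0o111)))
    return results
```

Calling find_on_path("hive") shows which hive file would be picked up first and whether it is marked executable at all.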
I installed HDFS and YARN through Ambari and am trying to deploy Spark on YARN, but Spark fails with an error when I run the command below.
How do I deploy Spark on YARN? Would you mind explaining it step by step?
I set HADOOP_CONF_DIR and YARN_CONF_DIR in spark-env.sh and spark.master in spark-defaults.conf.
Command:
./bin/spark-shell --master yarn-client
Error:
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.security.SimpleUserGroupsMapping not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2106)
at org.apache.hadoop.security.Groups.<init>(Groups.java:70)
at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2136)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2136)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2136)
at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:214)
at org.apache.spark.repl.SparkIMain.<init>(SparkIMain.scala:118)
at org.apache.spark.repl.SparkILoop$SparkILoopInterpreter.<init>(SparkILoop.scala:187)
at org.apache.spark.repl.SparkILoop.createInterpreter(SparkILoop.scala:217)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:949)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.security.SimpleUserGroupsMapping not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2098)
... 33 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.security.SimpleUserGroupsMapping not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
... 34 more
16/02/19 22:07:20 INFO util.ShutdownHookManager: Shutdown hook called
16/02/19 22:07:20 INFO util.ShutdownHookManager: Deleting directory
Check whether the class is present on your Hadoop classpath (inside a jar, class names are stored as slash-separated paths):
find "$HADOOP_HOME" -name '*.jar' -print0 | xargs -0 grep -l "org/apache/hadoop/security/SimpleUserGroupsMapping"
If it is present, check whether the class is also present in the Spark distribution:
grep -l "org/apache/hadoop/security/SimpleUserGroupsMapping" "$SPARK_HOME"/lib/*
If the jar is present in the Hadoop distribution but not in Spark, try copying it to $SPARK_HOME/lib/.
If none of the above works, try setting
hadoop.security.group.mapping to org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback
in core-site.xml, then restart Hadoop and Spark.
I'm using sqoop-1.4.6 to import data from MSSQL into hadoop-2.7.1.
Using Sqoop itself, I can successfully list the tables in MSSQL, which means the connection works fine. But when I tried to import into Hadoop, the following error was raised:
ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/libjars/opencsv-2.3.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3034)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:723)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1407)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
So I checked the DataNode log file, which gave the following information:
org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client. Perhaps the client is running an older version of Hadoop which does not support encryption.
Any idea how to change the configuration or how to deal with this problem?
Update:
It turns out that the problem began after I changed some configuration files. The problem is not limited to Sqoop; Hive has the same problem.
The configuration that I changed:
core-site.xml
<property>
<name>hadoop.rpc.protection</name>
<value>privacy</value>
</property>
hdfs-site.xml
<property>
<name>dfs.encrypt.data.transfer</name>
<value>true</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.cipher.suites</name>
<value>AES/CTR/NoPadding</value>
</property>
<property>
<name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
<value>256</value>
</property>
Thanks
Try setting dfs.block.access.token.enable to true in hdfs-site.xml.
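The reasoning behind this suggestion, as far as I understand it, is that HDFS hands out the data transfer encryption keys through the block access token mechanism, so turning on dfs.encrypt.data.transfer without block tokens leaves clients without a key. Treat that as an assumption to verify against the documentation for your Hadoop version. A small Python sketch that flags this combination in an hdfs-site.xml file (the function names are made up for this example):

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def encryption_without_block_tokens(hdfs_site_xml):
    """Return True if data transfer encryption is on but block access tokens are off."""
    props = load_properties(hdfs_site_xml)
    # Both properties default to false when they are absent.
    encrypt = props.get("dfs.encrypt.data.transfer", "false").lower() == "true"
    tokens = props.get("dfs.block.access.token.enable", "false").lower() == "true"
    return encrypt and not tokens
```

Passing the contents of the hdfs-site.xml shown in the question would return True, since dfs.encrypt.data.transfer is enabled there and dfs.block.access.token.enable is not set.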