Set up HADOOP_HOME variable in Windows

I am trying to use Spark along with Hadoop on my Windows 8 machine. However, no matter what my code is, I receive this error:
15/08/25 19:29:58 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2162)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:301)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Unknown Source)
As you can see:
null\bin\winutils.exe
The Hadoop home path is null. I tried to set HADOOP_HOME as an environment variable, but that did not resolve the issue.

I managed to resolve this problem by putting the following code at the beginning of my script:
import sys
import os
# Tell Hadoop where to find bin/winutils.exe; this must run before the SparkContext (and its JVM) is created
os.environ['HADOOP_HOME'] = "C:/Mine/Spark/hadoop-2.6.0"
sys.path.append("C:/Mine/Spark/hadoop-2.6.0/bin")
Hope this helps someone, and if anyone has a better idea, I would definitely appreciate it.
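For what it's worth, a variant that may be more robust is to prepend the bin folder to PATH rather than sys.path, since sys.path only affects Python imports while Windows locates winutils.exe and the native Hadoop DLLs through PATH. This is just a sketch, assuming the same install location, and it must also run before the SparkContext is created:
import os
hadoop_home = "C:/Mine/Spark/hadoop-2.6.0"  # adjust to your Hadoop folder
os.environ['HADOOP_HOME'] = hadoop_home
# Windows resolves winutils.exe and hadoop.dll via PATH, not via sys.path
os.environ['PATH'] = os.path.join(hadoop_home, 'bin') + os.pathsep + os.environ.get('PATH', '')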

Related

Java3D: JNI Error - java.lang.NoClassDefFoundError: javax/media/j3d/Canvas3D: Windows 7

OS: Windows 7 64-bit
Java: C:\Program Files\Java\jdk1.8.0_202 [Installed jdk-8u202-windows-x64.exe]
Java 3d: C:\Program Files\Java\Java3D\1.5.1 [Installed java3d-1_5_1-windows-amd64.exe]
Environment Variables:
JAVA_HOME C:\Program Files\Java\jdk1.8.0_202
JRE_HOME C:\Program Files\Java\jdk1.8.0_202\jre
PATH=%PATH%:C:\Program Files\Java\jdk1.8.0_202\bin;C:\Program Files\Java\Java3D\1.5.1\bin;C:\Program Files\Java\Java3D\1.5.1\lib;
I also copied the Java3D jars [j3dcore.jar, j3dutils.jar, and vecmath.jar] to C:\Program Files\Java\jdk1.8.0_202\jre\lib\ext, and j3dcore-ogl.dll to C:\Program Files\Java\jdk1.8.0_202\bin.
I am running the classic HelloJava3D program and getting the error below. I understand it is looking for the Canvas3D class, but that class is already present in one of the Java3D jar files. I have come across a similar thread on Stack Overflow where they mentioned checking the classpath. I've shared the paths I set above. Did I miss anything?
Java3D - Some classes not found but classpath is set correctly
But in my case it's showing a JNI error. Is it because of some architecture mismatch?
D:\JavaSample>java HelloJava3D
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: javax/media/j3d/Canvas3D
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
at java.lang.Class.privateGetMethodRecursive(Unknown Source)
at java.lang.Class.getMethod0(Unknown Source)
at java.lang.Class.getMethod(Unknown Source)
at sun.launcher.LauncherHelper.validateMainClass(Unknown Source)
at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source)
Caused by: java.lang.ClassNotFoundException: javax.media.j3d.Canvas3D
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 7 more
One of the tools in my project depends on Java3D; I come from a C++ background and am new to this project. I couldn't figure out what I am missing. Please suggest.
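(For reference: one way to take the extension directory out of the equation is to put the copied jars on the classpath explicitly. The paths below just reuse the jre\lib\ext locations mentioned above, so treat this as a diagnostic sketch rather than a known fix:)
D:\JavaSample>java -cp ".;C:\Program Files\Java\jdk1.8.0_202\jre\lib\ext\j3dcore.jar;C:\Program Files\Java\jdk1.8.0_202\jre\lib\ext\j3dutils.jar;C:\Program Files\Java\jdk1.8.0_202\jre\lib\ext\vecmath.jar" HelloJava3D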

Apache Spark installation on Windows 7 32-bit

I have just begun studying Apache Spark. The first thing I did was try to install Spark on my machine. I downloaded the pre-built Spark 1.5.2 with Hadoop 2.6. When I ran the Spark shell, I got the following errors:
java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.ClientWrapper.<init> (ClientWrapper.scala:171)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala :163)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:168)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
at $iwC$$iwC.<init>(<console>:9)
at $iwC.<init>(<console>:18)
at <init>(<console>:20)
at .<init>(<console>:24)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:132)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
I searched for this error and found that I have to download winutils.exe, which I did. I set HADOOP_HOME = "c:\Hadoop" and then ran the command:
C:\Hadoop\bin\winutils.exe chmod 777 /tmp/hive
but I got the following error:
This version of C:\Hadoop\bin\winutils.exe is not compatible with the version of Windows you're running. Check your computer's system information to see whether you need a x86 (32-bit) or x64 (64-bit) version of the program, and then contact the software publisher.
I tried to search for a 32-bit version of winutils.exe but couldn't find one. Please help me with this installation.
Thank you in advance.
The following links may be helpful.
https://issues.apache.org/jira/browse/HADOOP-9922
https://issues.apache.org/jira/browse/HADOOP-11784
Not able to find winutils.exe for hadoop 2.6.0 for 32 bit windows
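The prebuilt winutils.exe binaries that circulate are typically 64-bit; per the JIRA issues above, a 32-bit binary generally has to be built from the Hadoop source tree yourself. Assuming you end up with a 32-bit winutils.exe under C:\Hadoop\bin, the setup from the question should then work, for example:
set HADOOP_HOME=c:\Hadoop
set PATH=%PATH%;%HADOOP_HOME%\bin
%HADOOP_HOME%\bin\winutils.exe chmod 777 /tmp/hive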

Hadoop distcp exception

We are using distcp to copy data from CDH4 to CDH5. When we run the command on the CDH5 destination namenode, we get the following exception. Please let me know if you have already encountered this problem and know the solution. Thanks.
5/01/05 18:15:47 ERROR tools.DistCp: Exception encountered
org.apache.hadoop.ipc.RemoteException(java.lang.NoSuchMethodError): org.apache.hadoop.net.NetworkTopology.pseudoSortByDistance(Lorg/apache/hadoop/net/Node;[Lorg/apache/hadoop/net/Node;)V
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:354)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1618)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:482)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:246)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1179)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1169)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1159)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:270)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:237)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:230)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1457)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:301)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:297)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:297)
at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1832)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1752)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
at org.apache.hadoop.io.SequenceFile$Sorter$SortPass.run(SequenceFile.java:2825)
at org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:2785)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2733)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2774)
at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:356)
at org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:145)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:91)
at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:353)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:160)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:401)
This seems to be caused by a protocol mismatch between the two clusters.
Since the Hadoop versions used on the source and destination are different, you cannot run distcp with hdfs://Cluster2-Namenode1:Port/ as the source or destination; you have to use webhdfs:// instead of plain hdfs, as follows:
hadoop distcp /source-directory webhdfs://Cluster2-Namenode1:50070/dir/
Note that 50070 is the default NameNode web UI port; if you have configured a different port for the HDFS NameNode web UI, replace 50070 with your port.
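If you need to address both clusters explicitly (for example, when running distcp from a host outside either cluster), both sides can use webhdfs; Cluster1-Namenode1 below is a placeholder for your source namenode:
hadoop distcp webhdfs://Cluster1-Namenode1:50070/source-directory webhdfs://Cluster2-Namenode1:50070/dir/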

Debugging an install4j installer

I've been struggling to debug an install4j installer where I'm trying to introduce a complicated condition expression that is failing for some reason.
However, when I try to use the debug_installer.sh script, I get the following error:
java.io.FileNotFoundException: /Applications/install4j/resource/MessagesDefault (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at com.install4j.runtime.util.FileResourceBundle.<init>(Unknown Source)
at com.install4j.runtime.installer.frontend.Messages.createMessagesInternal(Unknown Source)
at com.install4j.runtime.installer.frontend.Messages.createMessages(Unknown Source)
at com.install4j.runtime.installer.frontend.Messages.getMessages(Unknown Source)
at com.install4j.runtime.installer.frontend.GUIHelper.showMessageInternal(Unknown Source)
at com.install4j.runtime.installer.frontend.GUIHelper.access$100(Unknown Source)
at com.install4j.runtime.installer.frontend.GUIHelper$2.run(Unknown Source)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:199)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:682)
at java.awt.EventQueue.access$000(EventQueue.java:85)
at java.awt.EventQueue$1.run(EventQueue.java:643)
at java.awt.EventQueue$1.run(EventQueue.java:641)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:87)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:652)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:296)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:211)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:201)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:196)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:188)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:122)
The file actually doesn't exist, but I have no idea what that file should contain. My install4j version is 4.2.8.
In the debug installer start script, replace
-cp i4jruntime.jar:user.jar:user/*.jar
with
-cp 'i4jruntime.jar:user.jar:user/*'
Then it should work: user/* is a Java classpath wildcard entry (it must be exactly *, not *.jar), and the quotes keep the shell from expanding it so the JVM can. This bug was fixed in 5.0.1.

Why is WEKA not running from the command line?

I apologize; this seems like it should be really simple, but I can't seem to get it.
I've downloaded the Windows version of WEKA and installed it, but I can't seem to call it from the command line.
I've added a WEKAHOME environment variable pointing to the directory containing weka.jar, and added that to my path with /weka.jar appended to it.
I am trying this command: java weka.classifiers.j48.J48 -t %WEKAHOME%/data/iris.arff
I then get the following error output.
Exception in thread "main" java.lang.NoClassDefFoundError: weka/classifiers/j48/J48
Caused by: java.lang.ClassNotFoundException: weka.classifiers.j48.J48
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: weka.classifiers.j48.J48. Program will exit.
What can I do to fix this?
It's because of this: weka.classifiers.j48.J48 is an error in the Weka documentation; the class should be weka.classifiers.trees.J48.
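With the corrected class name and weka.jar on the classpath, the command from the question becomes something like:
java -cp %WEKAHOME%/weka.jar weka.classifiers.trees.J48 -t %WEKAHOME%/data/iris.arff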
(Note: the comments below are no longer relevant. The answer here works, and remember to set the classpath as Thomas Jungblut says below.)
