Apache Spark cannot run on Windows - hadoop

I downloaded spark-2.0.1-bin-hadoop2.7 and installed it. I installed Java and set JAVA_HOME in the system variables.
But when I run it, I get this error:
How can it be fixed?

I think the problem is the whitespace in your path.
Try placing the downloaded Spark in a path without spaces, for example F:\Msc\BigData\BigDataSeminar\Spark\
Also check that SPARK_HOME, JAVA_HOME, and HADOOP_HOME point to paths without whitespace.
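A minimal sketch of that setup from cmd.exe (the Spark path is the one suggested above; the JDK and Hadoop locations are assumptions):
rem None of these directories contains a space.
set SPARK_HOME=F:\Msc\BigData\BigDataSeminar\Spark
set JAVA_HOME=C:\Java\jdk1.8.0_202
set HADOOP_HOME=C:\Hadoop\hadoop-2.7.1
set PATH=%SPARK_HOME%\bin;%JAVA_HOME%\bin;%HADOOP_HOME%\bin;%PATH%
rem set only affects the current console; use setx or the System Properties dialog to persist the values.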

Related

Configuration of Hadoop and Spark

I installed Hadoop and Spark on my Windows 11 machine.
I put these paths in my environment variables:
"C:\BigData\spark-3.1.2-bin-hadoop3.2"
"C:\BigData\spark-3.1.2-bin-hadoop3.2\sbin"
"C:\BigData\spark-3.1.2-bin-hadoop3.2\bin"
"C:\BigData\hadoop-3.2.2\bin"
I also installed JDK 1.8 and set JAVA_HOME.
And when I try to execute Spark with this command
"spark-shell"
I get this error:
"The system cannot find the path specified."
What is the solution?
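A quick way to see which lookup is failing (a sketch, assuming cmd.exe; where is a standard Windows tool that resolves a command against PATH):
rem Each command should print a real location; an error points at the broken entry.
echo %JAVA_HOME%
where java
where spark-shell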

JAVA_HOME points to an invalid installation during ElasticSearch installation

I'm trying to run the following command through PowerShell to install Elasticsearch, but the service batch file can't find the JAVA_HOME path.
I've added JAVA_HOME to my system variables, and I can see the path when I echo JAVA_HOME from the command line. Not sure why the batch file is pointing to an empty path.
Invoke-Expression -command "C:\elasticsearch-6.5.4\elasticsearch-6.5.4\bin\elasticsearch-service install"
Installing service : "elasticsearch-service-x64"
Using JAVA_HOME (64-bit): ""
JAVA_HOME points to an invalid Java installation (no jvm.dll found in "").
Exiting...
Your JAVA_HOME system variable is evidently pointing to the wrong location. Make sure the path you provide for JAVA_HOME is the actual JDK directory, e.g. .\Java\jdk1.8.0
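A sketch of setting it machine-wide from PowerShell (the JDK path is an assumption; adjust it to your actual install):
# Run from an elevated PowerShell; "Machine" targets the system environment,
# which is what the service installer reads.
[Environment]::SetEnvironmentVariable("JAVA_HOME", "C:\Program Files\Java\jdk1.8.0_202", "Machine")
# Open a new shell afterwards; the current session keeps the stale value.
Then re-run the elasticsearch-service install command.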

Hadoop installation in Windows 10 path error

I am trying to install Hadoop 2.6 on Windows 10, and while doing that I'm getting the error below:
C:\hadoop-2.6.2\bin>hadoop -version
The system cannot find the path specified.
Error: JAVA_HOME is incorrectly set.
Please update C:\hadoop-2.6.2\conf\hadoop-env.cmd
'-Xmx512m' is not recognized as an internal or external command, operable program or batch file.
I had the same issue; it can be fixed in two ways.
Check your environment variables: whether JAVA_HOME is set up for the user or not, and whether the path is set correctly.
Remove the system-level JAVA_HOME and set it up as a user variable instead.
Or go to the command line and set JAVA_HOME and PATH there. Note that cmd does not allow spaces around = in set, and PATH should be extended rather than overwritten:
set JAVA_HOME=your-jdk-home-directory
set PATH=%JAVA_HOME%\bin;%PATH%
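If your JDK lives under C:\Program Files, the space in that path is what typically triggers the '-Xmx512m' error; a sketch of the hadoop-env.cmd fix using the 8.3 short name (the JDK folder name is an assumption):
rem In hadoop-env.cmd; PROGRA~1 is the short name for "Program Files",
rem so the value contains no space.
set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_202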

Apache Spark with Hadoop distribution failing to run on Windows

I tried running the spark-1.5.1-bin-hadoop2.6 distribution (and newer versions of Spark, with the same results) on Windows using Cygwin.
When trying to execute the spark-shell script in the bin folder, I get the output below:
Error: Could not find or load main class org.apache.spark.launcher.Main
I tried setting CLASSPATH to the location of lib/spark-assembly-1.5.1-hadoop2.6.0.jar, but to no avail.
(FYI: I am able to run the same distribution fine on my Mac with no extra setup steps required.)
Please assist in finding a resolution for Cygwin execution on Windows.
I ran into and solved a similar problem with Cygwin on Windows 10 and spark-1.6.0.
1) Build with Maven (maybe you're past this step):
$ mvn -DskipTests package
2) Make sure JAVA_HOME is set to a JDK:
$ export JAVA_HOME="C:\Program Files\Java\jdk1.8.0_60"
$ ls "$JAVA_HOME"
bin include LICENSE THIRDPARTYLICENSEREADME.txt ....
3) Use the Windows batch file. Launch it from PowerShell or Command Prompt if you have terminal problems with Cygwin:
$ chmod a+x bin/spark-shell.cmd
$ ./bin/spark-shell.cmd
My solution to the problem was to move the Spark installation into a path without spaces in it. Under Program Files I got the above error, but after moving it directly under C:\ and running the spark-shell.bat file, the error cleared up.
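A sketch of that move (cmd.exe; the folder name is assumed from the distribution mentioned above, and the launcher is the spark-shell.cmd file shipped in bin):
rem Run from an elevated prompt; moving out of Program Files usually needs admin rights.
move "C:\Program Files\spark-1.5.1-bin-hadoop2.6" C:\spark-1.5.1-bin-hadoop2.6
cd /d C:\spark-1.5.1-bin-hadoop2.6\bin
spark-shell.cmd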

Problems running Mahout and Hadoop

I'm new at Mahout and Hadoop.
I've successfully installed a Hadoop cluster with 3 machines, and the cluster is running fine. I just installed Mahout on the main namenode for "testing purposes", followed the installation instructions, and set JAVA_HOME. But when I try to run classify-20newsgroups.sh, it downloads the dataset, and after that I get the following error:
Error: JAVA_HOME is not set
I've reviewed .bashrc and confirmed that JAVA_HOME is set correctly, but it doesn't help.
Also, how do I verify that Mahout is configured to run on Hadoop correctly? Do you know of any example that can verify this configuration or environment?
The .bashrc file is only read by non-login shells; a login shell reads .bash_profile instead.
So you could source .bashrc from .bash_profile (see What's the difference between .bashrc, .bash_profile, and .environment?), as sketched below, or just set JAVA_HOME in .bash_profile.
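A sketch of that sourcing line, appended to ~/.bash_profile:
# Pull .bashrc into login shells too, so JAVA_HOME is set either way.
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi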
There are several other ways to set JAVA_HOME:
1) source .bashrc from the terminal:
~$ source .bashrc
2) export JAVA_HOME in the open terminal before running classify-20newsgroups.sh (a plain assignment without export would not be passed on to the script):
~$ export JAVA_HOME=/path
~$ classify-20newsgroups.sh
3) run classify-20newsgroups.sh with JAVA_HOME, i.e.
~$ JAVA_HOME=/path classify-20newsgroups.sh
As for the question about running Mahout on Hadoop: the standard classify-20newsgroups example should work on Hadoop if HADOOP_HOME is set.
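As a sketch (the Hadoop install path is an assumption; /path is the placeholder used above):
~$ export HADOOP_HOME=/usr/local/hadoop
~$ export PATH=$HADOOP_HOME/bin:$PATH
~$ JAVA_HOME=/path classify-20newsgroups.sh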
You might need to explicitly set JAVA_HOME in hadoop-env.sh.
In hadoop-env.sh, look for the comment "# The java implementation to use" and modify the JAVA_HOME path under it.
It should look something like this:
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
Of course, fix the JAVA_HOME path to match your own installation.
