Item Recommendation mahout Java error

Item Recommendation mahout Java error - macos

I'm trying to make an item based recommendation but i got this error:
Lore:~ lorenzoferri$ mahout parallelALS --input prova1.csv --output output.csv --lambda 0.1 --implicitFeedback true --alpha 0.8 --numFeatures 2 --numIterations 5 --numThreadsPerSolver 1 --tempDir tmp
No MAHOUT_CONF_DIR found
Running on hadoop, using /usr/local/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/Cellar/mahout/0.9/libexec/mahout-examples-0.9-job.jar
15/02/09 17:32:25 WARN driver.MahoutDriver: No parallelALS.props found on classpath, will use command-line arguments only
15/02/09 17:32:26 INFO common.AbstractJob: Command line arguments: {--alpha=[0.8], --endPhase=[2147483647], --implicitFeedback=[true], --input=[prova1.csv], --lambda=[0.1], --numFeatures=[2], --numIterations=[5], --numThreadsPerSolver=[1], --output=[output.csv], --startPhase=[0], --tempDir=[tmp]}
15/02/09 17:32:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/09 17:32:26 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
15/02/09 17:32:26 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
15/02/09 17:32:26 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.mahout.common.HadoopUtil.getCustomJobName(HadoopUtil.java:174)
at org.apache.mahout.common.AbstractJob.prepareJob(AbstractJob.java:614)
at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.run(ParallelALSFactorizationJob.java:166)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.main(ParallelALSFactorizationJob.java:113)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Lore:~ lorenzoferri$
I also tried:
Lore:~ lorenzoferri$ mahout recommenditembased -i '/Users/lorenzoferri/Desktop/prova1.csv' -o '/Users/lorenzoferri/Desktop/output.csv' -n 100 -s SIMILARITY_COOCCURRENCE
No MAHOUT_CONF_DIR found
Running on hadoop, using /usr/local/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/Cellar/mahout/0.9/libexec/mahout-examples-0.9-job.jar
15/02/09 17:37:02 WARN driver.MahoutDriver: No recommenditembased.props found on classpath, will use command-line arguments only
15/02/09 17:37:02 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/Users/lorenzoferri/Desktop/prova1.csv], --maxPrefsInItemSimilarity=[500], --maxPrefsPerUser=[10], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[100], --output=[/Users/lorenzoferri/Desktop/output.csv], --similarityClassname=[SIMILARITY_COOCCURRENCE], --startPhase=[0], --tempDir=[temp]}
15/02/09 17:37:02 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/Users/lorenzoferri/Desktop/prova1.csv], --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
15/02/09 17:37:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/09 17:37:03 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
15/02/09 17:37:03 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
15/02/09 17:37:03 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.mahout.common.HadoopUtil.getCustomJobName(HadoopUtil.java:174)
at org.apache.mahout.common.AbstractJob.prepareJob(AbstractJob.java:614)
at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:73)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:164)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:322)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
I installed mahout via brew, on mac os x 10.10, but i got the same problem on ubuntu. On the web i can only find tutorial on eclipse, but i would like to run it through the terminal.

No MAHOUT_CONF_DIR found
add to your .bash_profile or .zshrc:
export MAHOUT=/path/to/mahout
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
In order to run Mahout locally
add to your .bash_profile or .zshrc:
export MAHOUT_LOCAL="not-an-empty-string"
set to anything other than an empty string to force mahout to run locally
even if HADOOP_CONF_DIRand HADOOP_HOME are set.
In order to get up and running quickly I will recommend for you to:
Download mahout as a tar.gz: http://apache.mivzakim.net/mahout/0.9/mahout-distribution-0.9.tar.gz
Follow steps 1 and 2 as mentioned, the /path/to/mahout is where you extracted mahout-distribution-0.9
Run the mahout command from mahout-distribution-0.9/bin/mahout
i.e: ./bin/mahout recommenditembased -i '/Users/lorenzoferri/Desktop/prova1.csv' -o '/Users/lorenzoferri/Desktop/output.csv' -n 100 -s SIMILARITY_COOCCURRENCE
In mahout-distribution-0.9/examples you can run and explore some shell scripts
https://svn.apache.org/repos/asf/mahout/trunk/bin/mahout

Related

File not found error while executing a sqoop job

When I execute a sqoop job it throws a FileNotFoundException error as below
18/05/29 06:18:59 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hduser/compile/0ce66d1f09ce960a71c165855afbe42c/QueryResult.jar
18/05/29 06:18:59 INFO mapreduce.ImportJobBase: Beginning query import.
18/05/29 06:18:59 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
18/05/29 06:18:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/05/29 06:18:59 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/05/29 06:19:01 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
18/05/29 06:19:01 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/05/29 06:19:01 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
18/05/29 06:19:01 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
18/05/29 06:19:01 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/app/hadoop/tmp/mapred/staging/hduser1354549662/.staging/job_local1354549662_0001
18/05/29 06:19:01 ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://svn-server:54310/home/hduser/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib/postgresql-9.2-1002-jdbc4.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:196)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:169)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:266)
at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:729)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
`
It should look for jars and other dependencies in local sqoop/lib directory but it looks it in HDFS with the same file path as my local sqoop lib path. As per project requirement, I need sqoop to look into my local library. How can I achieve this? Thanks.

Have you setup the SQOOP_HOME and PATH environment variables for the sqoop in .bashrc file
if not Please add
vi ~/.bashrc
include the below lines
export SQOOP_HOME=/your/path/to/the/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
save the file and execute the following command
source ~/.bashrc
Hope this is helpfull to you!!!

Not able to run mahout jar

I am executing a command to run mahout jar for input file to generate an output file. But I am facing several errors. I have placed the input file in hdfs. The command is:
mahout recommenditembased -s SIMILARITY_COOCCURRENCE -i /input.txt -o /output --booleanData true
I am facing error:
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.12.0-job.jar
16/07/05 00:23:47 WARN driver.MahoutDriver: No recommenditembased.props found on classpath, will use command-line arguments only
16/07/05 00:23:48 INFO common.AbstractJob: Command line arguments: {--booleanData=[true], --endPhase=[2147483647], --input=[/input.txt], --maxPrefsInItemSimilarity=[500], --maxPrefsPerUser=[10], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[10], --output=[/output], --similarityClassname=[SIMILARITY_COOCCURRENCE], --startPhase=[0], --tempDir=[temp]}
16/07/05 00:23:48 INFO common.AbstractJob: Command line arguments: {--booleanData=[true], --endPhase=[2147483647], --input=[/input.txt], --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
16/07/05 00:23:48 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
16/07/05 00:23:48 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
16/07/05 00:23:48 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
16/07/05 00:23:49 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-15-19.us-west-2.compute.internal/172.31.15.19:8032
16/07/05 00:23:51 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1467669538614_0015
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /input.txt
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:317)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:352)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:77)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:168)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Thank you in advance.

From error trace it is clear that your mahout job is not able to find input file location (/input.txt).
Check your mahout configuration and hadoop config
If you are running it Locally you can try running with
file:///input.txt ( file protocol is for local file system) or if you are running on hdfs you can use hdfs://input.txt ( hdfs is for hdfs file system)

Debugging hadoop Wordcount program in eclipse in windows

Trying to run Wordcount program in hadoop in eclipse (windows 7). and passing these argument in eclipse only
E:\hadoop\eclipse-hadoop-pro\workspace-hadoop\WordCountPro\input\word.txt
E:\hadoop\eclipse-hadoop-pro\workspace-hadoop\WordCountPro\output
I have created input file in project only like input folder and inside it word.txt file
But it is throughing below excption
2015-04-08 15:30:09,947 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-04-08 15:30:10,238 ERROR [main] util.Shell (Shell.java:getWinUtilsPath(373)) - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable E:\hadoop\hadoop-HADOOP_HOME\hadoop-2.6.0\bin\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)
at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:72)
at org.apache.hadoop.mapreduce.Job.<init>(Job.java:144)
at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:187)
at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:206)
at com.WordCount.main(WordCount.java:52)
2015-04-08 15:30:11,039 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1049)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2015-04-08 15:30:11,041 INFO [main] jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/E:/hadoop/eclipse-hadoop-pro/workspace-hadoop/WordCountPro/output already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:562)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at com.WordCount.main(WordCount.java:61)

I doubt if Hadoop is installed correctly. Check in your machine if all the daemons are running or not.If not, then consider re-checking or re-installing what you are missing.
ERROR [main] util.Shell (Shell.java:getWinUtilsPath(373)) - Failed to locate the winutils binary in the hadoop binary path java.io.IOException: Could not locate executable

not able to run pig on windows 7 using cygwin

I configured pig as directed in the documentation.
Enviornment: Windows 7, Hadoop-0.20.2, pig 0.13.0, Cygwin
But when i type pig (mapreduce) on the command prompt it just displays below thing. I am not sure whether pig is started or not. I don't see GRUNT shell to execute script.
Btw, Hadoop is running on the same node.
Can someone please help?
$ pig
Find hadoop at /hadoop-0.20.2/bin/hadoop
dry run:
HADOOP_CLASSPATH: C:\cygwin64\pig-0.13.0\conf;C;C:\Program Files\Java\jdk1.6.0_25\lib\tools.jar;C;C:\cygwin64\hadoop-0.20.2\conf;C:\cygwin64\pig-0.13.0\lib\accumulo-core-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-fate-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-server-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-start-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-trace-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\avro-1.7.5.jar;C:\cygwin64\pig-0.13.0\lib\avro-mapred-1.7.5.jar;C:\cygwin64\pig-0.13.0\lib\avro-tools-1.7.5-nodeps.jar;C:\cygwin64\pig-0.13.0\lib\groovy-all-1.8.6.jar;C:\cygwin64\pig-0.13.0\lib\hbase-0.94.1.jar;C:\cygwin64\pig-0.13.0\lib\jruby-complete-1.6.7.jar;C:\cygwin64\pig-0.13.0\lib\js-1.7R2.jar;C:\cygwin64\pig-0.13.0\lib\json-simple-1.1.jar;C:\cygwin64\pig-0.13.0\lib\jython-standalone-2.5.3.jar;C:\cygwin64\pig-0.13.0\lib\piggybank.jar;C:\cygwin64\pig-0.13.0\lib\protobuf-java-2.4.0a.jar;C:\cygwin64\pig-0.13.0\lib\zookeeper-3.4.5.jar:C:\cygwin64\PIG-01~1.0/pig-withouthadoop-h2.jar:
HADOOP_OPTS: -Xmx1000m -Dpig.log.dir=C:\cygwin64\PIG-01~1.0\logs -Dpig.log.file=pig.log -Dpig.home.dir=C:\cygwin64\PIG-01~1.0\
HADOOP_CLIENT_OPTS: -Xmx1000m -Dpig.log.dir=C:\cygwin64\PIG-01~1.0\logs -Dpig.log.file=pig.log -Dpig.home.dir=C:\cygwin64\PIG-01~1.0\
/hadoop-0.20.2/bin/hadoop jar C:\cygwin64\PIG-01~1.0/pig-withouthadoop-h2.jar
when i run in debug mode, i see this exception. This is because Hadoop Jar is not set.
localhsot#mymachine~
$ echo $PIG_INSTALL
C:\cygwin64\pig-0.13.0
localhsot#mymachine~
$ export PIG_INSTALL=/cygdrive/c/cygwin64/pig-0.13.0
localhsot#mymachine~
$ export HADOOP_INSTALL=/cygdrive/c/cygwin64/hadoop-0.20.2/
localhsot#mymachine~
$ export PATH=$PATH:$PIG_INSTALL/bin:$HADOOP_INSTALL/bin
$ pig
14/08/26 14:05:12 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/26 14:05:12 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/26 14:05:12 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-26 14:05:12,998 [main] INFO org.apache.pig.Main - Apache Pig version 0. 13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-08-26 14:05:12,998 [main] INFO org.apache.pig.Main - Logging error message s to: C:\cygwin64\home\chparekh\pig_1409076312996.log
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/map reduce/task/JobContextImpl
at org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java :68)
at org.apache.pig.Main.run(Main.java:643)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl. java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces sorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.task.Jo bContextImpl
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 8 more

you can refer below link for the same, i hope this will help you.
http://abhijitsureshshingate.wordpress.com/2013/07/08/code-debug-test-apache-pig-scripts-using-eclipse-on-windows/

Mahout Cookbook - Weird FileNotFoundException

I'm following the mahout cookbook example. In one of them, I'm getting an exception:
mahout seqdirectory -i /home/hduser/cook/lastfm/current -o /home/hduser/cook/lastfm sequencefiles/
And then, I'm getting the following exception:
14/04/29 15:45:38 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/home/hduser/cook/lastfm/current], --keyPrefix=[], --method=[mapreduce], --output=[/home/hduser/cook/lastfm/sequencefiles/], --startPhase=[0], --tempDir=[temp]}
14/04/29 15:45:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/04/29 15:45:38 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/04/29 15:45:38 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
14/04/29 15:45:38 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/hduser/cook/lastfm/current
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1110)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.mahout.text.SequenceFilesFromDirectory.runMapReduce(SequenceFilesFromDirectory.java:162)
at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:91)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
The problem is that the folder exists, as shown here:
hduser#ACER-V3-571G:~/cook/lastfm/current$ ls
artists.txt ArtistTags.dat README.txt tags.txt
Mahout 0.9 and Hadoop 2.2.0
jps shows me:
5435 ResourceManager
7257 Jps
5531 NodeManager
5104 DataNode
5262 SecondaryNameNode
5008 NameNode

I'm sorry, but i found the solution by myself.
Since the file isn't in the Hadoop Filesystem, I should run mahout as local.
export MAHOUT_LOCAL=TRUE
solves the problem.

Since an "export" configures a system environment variable in unix, an similar command should do it in Windows. Try:
SET MAHOUT_LOCAL=TRUE

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Item Recommendation mahout Java error - macos

Related

File not found error while executing a sqoop job

Not able to run mahout jar

Debugging hadoop Wordcount program in eclipse in windows

not able to run pig on windows 7 using cygwin

Mahout Cookbook - Weird FileNotFoundException

Categories

Resources