File not found error while executing a Sqoop job - Hadoop

When I execute a Sqoop job, it throws a FileNotFoundException as shown below:
18/05/29 06:18:59 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hduser/compile/0ce66d1f09ce960a71c165855afbe42c/QueryResult.jar
18/05/29 06:18:59 INFO mapreduce.ImportJobBase: Beginning query import.
18/05/29 06:18:59 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
18/05/29 06:18:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/05/29 06:18:59 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/05/29 06:19:01 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
18/05/29 06:19:01 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/05/29 06:19:01 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
18/05/29 06:19:01 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
18/05/29 06:19:01 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/app/hadoop/tmp/mapred/staging/hduser1354549662/.staging/job_local1354549662_0001
18/05/29 06:19:01 ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://svn-server:54310/home/hduser/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib/postgresql-9.2-1002-jdbc4.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:196)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:169)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:266)
at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:729)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
It should look for jars and other dependencies in the local sqoop/lib directory, but instead it looks for them in HDFS under the same path as my local Sqoop lib directory. As per the project requirement, I need Sqoop to look in my local library. How can I achieve this? Thanks.

Have you set up the SQOOP_HOME and PATH environment variables for Sqoop in your .bashrc file?
If not, please add them:
vi ~/.bashrc
Include the lines below:
export SQOOP_HOME=/your/path/to/the/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
Save the file and execute the following command:
source ~/.bashrc
Hope this is helpful to you!
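If the job still fails with the same FileNotFoundException, note that the stack trace shows the job submission (distributed cache) trying to resolve the driver jar on HDFS at the same absolute path as the local Sqoop lib directory. A workaround often suggested for this class of error is to upload the jar to that HDFS path so the submitter can find it; this is only a sketch, with the paths copied from the error message above:
# Create the directory the job is looking for on HDFS and upload the JDBC driver jar
hdfs dfs -mkdir -p /home/hduser/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib
hdfs dfs -put /home/hduser/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib/postgresql-9.2-1002-jdbc4.jar /home/hduser/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib/
This does not make Sqoop read from the local filesystem; it only satisfies the path the submitted job expects on HDFS.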

Related

Sqoop query failing to import table

I am trying to execute the Sqoop import below:
sqoop import --connect 'jdbc:sqlserver://server-IP;database=db_name' --username xxx --password xxx --table xxx --hive-import --hive-table amit_hive --target-dir /user/hive/amitesh123 -m 1
I have to import a DB table directly to the desired location. As far as my understanding goes, the above Sqoop command-line syntax is correct. But on executing it, I get the following error:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=xxx, access=EXECUTE, inode="/user/hive/amitesh123":hive:hdfs:drwx
Somebody informed me that we also have to mention the Hive database name in the above Sqoop command. Is that true? If yes, can someone help me with the parameter I have to use? As far as I know, we just need to mention --table to bring the table from the DB into a Hive table. Please suggest.
To test further, I created a new folder and gave it 777 permissions, but I am still getting the same error. I have now added the Hive db.table name with --hive-table, so the new Sqoop import is as follows:
sqoop import --connect 'jdbc:sqlserver://server-IP;database=db_name' --username xxx --password xxx --table xxx --hive-import --hive-table amitesh_db.amit_hive --target-dir /amitesh012345/amitesh -m 1
However, the permission denied error is still there...
INFO mapreduce.Job: Job job_1486315054135_2834 failed with state FAILED due to: Job setup failed : org.apache.hadoop.security.AccessControlException: Permission denied: user=xxx, access=WRITE, inode="/amitesh012345/amitesh/_temporary/1":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1704)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1687)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3890)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3002)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2970)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1061)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1877)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:305)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:254)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=at732615, access=WRITE, inode="/amitesh012345/amitesh/_temporary/1":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1704)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1687)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3890)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.mkdirs(Unknown Source)
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:55
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3000)
... 13 more
17/03/14 05:23:38 INFO mapreduce.Job: Counters: 2
Job Counters
Total time spent by all maps in occupied slots (ms)=0
Total time spent by all reduces in occupied slots (ms)=0
17/03/14 05:23:38 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
17/03/14 05:23:38 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 24.4698 seconds (0 bytes/sec)
17/03/14 05:23:38 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
17/03/14 05:23:38 INFO mapreduce.ImportJobBase: Retrieved 0 records.
17/03/14 05:23:38 ERROR tool.ImportTool: Error during import: Import job failed!
Second full stack trace:
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/03/14 05:38:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6_IBM_27
17/03/14 05:38:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/03/14 05:38:02 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/03/14 05:38:02 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
17/03/14 05:38:02 INFO manager.SqlManager: Using default fetchSize of 1000
17/03/14 05:38:02 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:path_to/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:path_to/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/03/14 05:38:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM [T_VND] AS t WHERE 1=0
17/03/14 05:38:03 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is path_to/hadoop
Note: path_to/T_VND.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/03/14 05:38:04 INFO orm.CompilationManager: Writing jar file: path_to/T_VND.jar
17/03/14 05:38:04 INFO mapreduce.ImportJobBase: Beginning import of T_VND
17/03/14 05:38:05 INFO impl.TimelineClientImpl: Timeline service address: http://xxxxxx/
17/03/14 05:38:05 INFO client.RMProxy: Connecting to ResourceManager at xxxxxx/server-IP:port
17/03/14 05:38:06 INFO db.DBInputFormat: Using read commited transaction isolation
17/03/14 05:38:07 INFO mapreduce.JobSubmitter: number of splits:1
17/03/14 05:38:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1486315054135_2836
17/03/14 05:38:07 INFO impl.YarnClientImpl: Submitted application application_1486315054135_2836
17/03/14 05:38:07 INFO mapreduce.Job: The url to track the job: http://xxxxxx/server-IP:port/proxy/application_1486315054135_2836/
17/03/14 05:38:07 INFO mapreduce.Job: Running job: job_1486315054135_2836
17/03/14 05:38:13 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
17/03/14 05:38:13 INFO mapreduce.Job: Job job_1486315054135_2836 running in uber mode : false
17/03/14 05:38:13 INFO mapreduce.Job: map 0% reduce 100%
17/03/14 05:38:13 INFO mapreduce.Job: Job job_1486315054135_2836 failed with state FAILED due to:
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: The MapReduce job has already been retired. Performance
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: counters are unavailable. To get this information,
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: you will need to enable the completed job store on
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: the jobtracker with:
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.active = true
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: mapreduce.jobtracker.persist.jobstatus.hours = 1
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: A jobtracker restart is required for these settings
17/03/14 05:38:13 INFO mapreduce.ImportJobBase: to take effect.
17/03/14 05:38:13 ERROR tool.ImportTool: Error during import: Import job failed!
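Both stack traces show an AccessControlException: the user running the Sqoop job has no WRITE access to the target directory, which is owned by hdfs:hdfs with mode drwxr-xr-x, so granting 777 on a different folder does not help unless it is the directory the job actually writes to. A sketch of how one might check and fix the ownership (assuming you can run commands as the hdfs superuser, and using the directory from the question):
# Inspect ownership and permissions of the target directory itself
hdfs dfs -ls -d /amitesh012345/amitesh
# Either hand the directory over to the importing user...
sudo -u hdfs hdfs dfs -chown -R xxx:xxx /amitesh012345/amitesh
# ...or point --target-dir at a directory the user already owns, e.g. under /user/<username>/
Mentioning the Hive database via --hive-table db.table (as done above) is valid, but it does not address the HDFS permission problem.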

Unable to load Pig in my new Hadoop installation

I am executing the following Pig commands:
grunt> a = load 'hdfs://localhost:50070/user/data/file2' using PigStorage(',') as (usernames:chararray, passwords:chararray, cost:int);
grunt> dump a;
On executing the dump command I get the following error, which I am not able to resolve. I am new to big data and the Apache Hadoop stack, and I am unable to track down this error.
2015-12-23 10:06:48,003 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2015-12-23 10:06:48,021 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-12-23 10:06:48,021 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-12-23 10:06:48,022 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-12-23 10:06:48,022 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2015-12-23 10:06:48,024 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2015-12-23 10:06:48,025 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2015-12-23 10:06:48,025 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2015-12-23 10:06:48,039 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-12-23 10:06:48,041 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-12-23 10:06:48,042 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-12-23 10:06:48,042 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-12-23 10:06:48,042 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-12-23 10:06:48,093 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/lib/pig/pig-0.14.0/pig-0.14.0-core-h2.jar to DistributedCache through /tmp/temp111104108/tmp-1319253659/pig-0.14.0-core-h2.jar
2015-12-23 10:06:48,137 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/lib/pig/pig-0.14.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp111104108/tmp-845511984/automaton-1.11-8.jar
2015-12-23 10:06:48,171 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/lib/pig/pig-0.14.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp111104108/tmp-588474917/antlr-runtime-3.4.jar
2015-12-23 10:06:48,226 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp111104108/tmp1116649992/guava-11.0.2.jar
2015-12-23 10:06:48,270 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/lib/pig/pig-0.14.0/lib/joda-time-2.1.jar to DistributedCache through /tmp/temp111104108/tmp181546662/joda-time-2.1.jar
2015-12-23 10:06:48,277 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-12-23 10:06:48,277 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2015-12-23 10:06:48,277 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2015-12-23 10:06:48,277 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2015-12-23 10:06:48,286 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-12-23 10:06:48,290 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-12-23 10:06:48,298 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area file:/app/hadoop/tmp/mapred/staging/hduse281081215/.staging/job_local281081215_0006
2015-12-23 10:06:48,298 [JobControl] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:DefaultJobName got an error while submitting
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:652)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:490)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:599)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:182)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
2015-12-23 10:06:48,787 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local281081215_0006
2015-12-23 10:06:48,787 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a
2015-12-23 10:06:48,787 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[6,4],a[-1,-1] C: R:
2015-12-23 10:06:48,791 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-12-23 10:06:53,796 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-12-23 10:06:53,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local281081215_0006 has failed! Stop running all dependent jobs
2015-12-23 10:06:53,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-12-23 10:06:53,797 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-12-23 10:06:53,797 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
2015-12-23 10:06:53,797 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-12-23 10:06:53,798 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.14.0 hduse 2015-12-23 10:06:48 2015-12-23 10:06:53 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local281081215_0006 a MAP_ONLY Message: ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:652)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:490)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:599)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:182)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
hdfs://localhost:54310/tmp/temp111104108/tmp484937622,
Input(s):
Failed to read data from "hdfs://localhost:50070/user/data/file2"
Output(s):
Failed to produce result in "hdfs://localhost:54310/tmp/temp111104108/tmp484937622"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local281081215_0006
2015-12-23 10:06:53,798 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-12-23 10:06:53,800 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias a. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
2015-12-23 10:06:53,800 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2015-12-23 10:06:53,800 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias a. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.pig.PigServer.openIterator(PigServer.java:925)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:746)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:558)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:822)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
at org.apache.pig.PigServer.store(PigServer.java:997)
at org.apache.pig.PigServer.openIterator(PigServer.java:910)
... 13 more
Caused by: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:294)
at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:540)
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getTaskReports(HadoopShims.java:235)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:801)
... 20 more
Please also find my configuration settings below.
core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
.bashrc:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
#HADOOP VARIABLES END
### Pig variables
export PIG_HOME="/usr/lib/pig/pig-0.14.0"
export PIG_CONF_DIR="$PIG_HOME/conf"
export PIG_CLASSPATH="$PIG_CONF_DIR"
export PATH="$PIG_HOME/bin:$PATH"
#PIG VARIABLES END
Also, here is my HDFS file listing:
hduse@vijee-Lenovo-IdeaPad-S510p:/home/vijee$ hadoop fs -ls hdfs:/
Found 2 items
drwxr-xr-x - hduse supergroup 0 2015-12-23 10:17 hdfs:///tmp
drwxr-xr-x - hduse supergroup 0 2015-12-23 09:15 hdfs:///user
hduse@vijee-Lenovo-IdeaPad-S510p:/home/vijee$ hadoop fs -ls hdfs:///user/data
Found 1 items
-rwxrwxrwx 1 hduse supergroup 120 2015-12-23 09:24 hdfs:///user/data/file2
I found another place where this question was asked. There the conclusion was this:
(...) here is a workaround that we found - we replaced the Task Controller from LinuxTaskController to the DefaultTaskController and things are back to normal. Jobs are successful and no more ENOENT messages.
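One more thing worth checking, based on the configuration shown above: the load statement uses hdfs://localhost:50070, but core-site.xml sets fs.default.name to hdfs://localhost:54310, and 50070 is normally the NameNode web UI port rather than the filesystem RPC port. A small sketch of how one might verify the path through the configured filesystem before re-running the script (same paths as in the question):
# Check that the input is reachable via the configured default filesystem (port 54310, not 50070)
hadoop fs -ls hdfs://localhost:54310/user/data/file2
# If that works, load the file without hard-coding a host/port, e.g. load '/user/data/file2' in the Pig script,
# so the path is resolved against fs.default.name.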

HBase warning about deprecated hadoop.native.lib

Hadoop conf file: opt/hadoop/etc/hadoop/core-site.xml
When I set
<name>hadoop.native.lib</name>
and then start the HBase shell, there are several lines of this warning:
2015-02-10 11:07:46,956 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,005 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,046 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,081 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2015-02-10 11:07:47,169 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
But when I set
<name>io.native.lib.available</name>
and then start the HBase shell, there is one line of warning:
2015-02-10 11:07:46,956 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
How can I configure things so that none of these warnings are shown?
I'm on Hadoop 2.5.2 and HBase 0.98.8 on Ubuntu x64.
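Since these messages come from the org.apache.hadoop.conf.Configuration.deprecation logger at INFO level, one commonly used way to hide them is to raise that logger's level in the HBase log4j configuration. A sketch, assuming HBASE_HOME points at your installation and its conf directory holds log4j.properties:
# Suppress the deprecation INFO messages emitted when the HBase shell starts
echo 'log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN' >> $HBASE_HOME/conf/log4j.properties
Alternatively, removing the deprecated hadoop.native.lib property from core-site.xml and keeping only io.native.lib.available should at least reduce the number of warnings, as the question already observed.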

Sqoop running into local job runner mode

When I run Sqoop, I am not sure why it runs in local job runner mode and then says that I have provided an invalid jobtracker URL for LocalJobRunner. Can anyone tell me what's going on?
$ bin/sqoop import -jt myjobtracker:50070 --connect jdbc:mysql://mydbhost.com/mydata --username foo --password bar --as-parquetfile --table campaigns --target-dir hdfs://myhdfs:8020/user/myself/campaigns
14/08/20 21:04:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-SNAPSHOT
14/08/20 21:04:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/08/20 21:04:51 INFO manager.SqlManager: Using default fetchSize of 1000
14/08/20 21:04:51 INFO tool.CodeGenTool: Beginning code generation
14/08/20 21:04:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `campaigns` AS t LIMIT 1
14/08/20 21:04:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `campaigns` AS t LIMIT 1
14/08/20 21:04:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `campaigns` AS t LIMIT 1
14/08/20 21:04:51 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-myself/compile/6acdb40688239f19ddf86a1290ad6c64/campaigns.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/08/20 21:04:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-myself/compile/6acdb40688239f19ddf86a1290ad6c64/campaigns.jar
14/08/20 21:04:54 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/08/20 21:04:54 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/08/20 21:04:54 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/08/20 21:04:54 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/08/20 21:04:54 INFO mapreduce.ImportJobBase: Beginning import of campaigns
14/08/20 21:04:54 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
14/08/20 21:04:54 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/hbase/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
14/08/20 21:04:54 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/08/20 21:04:56 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/08/20 21:04:56 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "myjobtracker:50070"
14/08/20 21:04:56 ERROR security.UserGroupInformation: PriviledgedActionException as:myself (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
14/08/20 21:04:56 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1235)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1234)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1263)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:102)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Figured out the problem: I was running Sqoop 1.4.5 and pointing it at the latest Hadoop 2.0.0-cdh4.4.0, which also had the YARN stuff; that's why it was complaining.
When I pointed Sqoop at hadoop-0.20/2.0.0-cdh4.4.0 (MR1, I think) it worked.
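In other words, it appears the client configuration Sqoop picked up made it fall back to the local job runner (hence the "Invalid mapreduce.jobtracker.address configuration value for LocalJobRunner" message). A hypothetical sketch of pointing Sqoop at a specific MapReduce client install via the environment variables its launcher honours; the paths are placeholders, not the actual layout from the question:
# Point Sqoop at the MR1 client libraries instead of the YARN ones
export HADOOP_COMMON_HOME=/usr/lib/hadoop
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
bin/sqoop import --connect jdbc:mysql://mydbhost.com/mydata --username foo -P --table campaigns --target-dir hdfs://myhdfs:8020/user/myself/campaigns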

Mahout Cookbook - Weird FileNotFoundException

I'm following the Mahout Cookbook examples. In one of them, I run:
mahout seqdirectory -i /home/hduser/cook/lastfm/current -o /home/hduser/cook/lastfm/sequencefiles/
And then I get the following exception:
14/04/29 15:45:38 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/home/hduser/cook/lastfm/current], --keyPrefix=[], --method=[mapreduce], --output=[/home/hduser/cook/lastfm/sequencefiles/], --startPhase=[0], --tempDir=[temp]}
14/04/29 15:45:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/04/29 15:45:38 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/04/29 15:45:38 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
14/04/29 15:45:38 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/hduser/cook/lastfm/current
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1110)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.mahout.text.SequenceFilesFromDirectory.runMapReduce(SequenceFilesFromDirectory.java:162)
at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:91)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
The problem is that the folder exists, as shown here:
hduser@ACER-V3-571G:~/cook/lastfm/current$ ls
artists.txt ArtistTags.dat README.txt tags.txt
Mahout 0.9 and Hadoop 2.2.0
jps shows me:
5435 ResourceManager
7257 Jps
5531 NodeManager
5104 DataNode
5262 SecondaryNameNode
5008 NameNode
I'm sorry, but I found the solution by myself.
Since the file isn't in the Hadoop filesystem, I should run Mahout locally.
export MAHOUT_LOCAL=TRUE
solves the problem.
Since "export" sets an environment variable on Unix, a similar command should work on Windows. Try:
SET MAHOUT_LOCAL=TRUE
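An alternative, if you do want the job to run against the cluster, is to copy the input into HDFS first, since the stack trace shows the path being resolved against the DistributedFileSystem. A sketch using the directories from the question (the HDFS destination path is just an example):
# Copy the local input directory into HDFS, then run seqdirectory against the HDFS paths
hadoop fs -mkdir -p /user/hduser/cook/lastfm
hadoop fs -put /home/hduser/cook/lastfm/current /user/hduser/cook/lastfm/
mahout seqdirectory -i /user/hduser/cook/lastfm/current -o /user/hduser/cook/lastfm/sequencefiles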
