Error invoking "pig" in cloudera-quickstart-vm-5.8.0 - hadoop

I am about a month into learning Hadoop. I have cloudera-quickstart-vm-5.8.0 on my Windows laptop. When I invoke 'pig' in the Cloudera VM, I cannot get into the grunt shell. The error I get is below:
[Fatal Error] :-1:-1: Premature end of file.
2017-04-25 06:39:53,207 [main] FATAL org.apache.hadoop.conf.Configuration - error parsing conf hdfs-default.xml
org.xml.sax.SAXParseException; Premature end of file.
Kindly let me know how to resolve this.
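No answer is recorded for this question, but a "Premature end of file" SAXParseException generally means the XML parser reached the end of a document that is empty or truncated. As a rough diagnostic sketch (not a confirmed fix for this VM), the Python helper below walks a directory and lists XML files that are empty or fail to parse. The function name `find_broken_xml` and the directory you pass in (for example /etc/hadoop/conf) are assumptions for illustration; the file actually at fault may live elsewhere on Pig's classpath.

```python
import os
import xml.etree.ElementTree as ET


def find_broken_xml(conf_dir):
    """Return (path, reason) pairs for XML files that are empty or unparsable.

    conf_dir is an assumed location; point it at whichever directory
    holds the Hadoop/Pig XML configuration on your machine.
    """
    broken = []
    for root, _dirs, files in os.walk(conf_dir):
        for name in sorted(files):
            if not name.endswith(".xml"):
                continue
            path = os.path.join(root, name)
            # A zero-byte file is the classic cause of "Premature end of file".
            if os.path.getsize(path) == 0:
                broken.append((path, "empty file"))
                continue
            try:
                ET.parse(path)
            except ET.ParseError as exc:
                # Truncated or otherwise malformed XML ends up here.
                broken.append((path, str(exc)))
    return broken
```

Calling `find_broken_xml("/etc/hadoop/conf")` (an assumed path) returns an empty list when every XML file parses; any entry it does return is a candidate for restoring from a clean copy.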

Related

DSE does not start, stating it is unable to write to the commit log directory

I am getting the error below while starting DSE:
ERROR [main] 2020-02-26 13:08:33,269 DseModule.java:97 - {}. Exiting...
com.google.inject.CreationException: Unable to create injector, see the following errors:
1) An exception was caught and reported. Message: Unable to check disk space available to /u01/dse_ops/logs. Perhaps the Cassandra user does not have the necessary permissions
at com.datastax.bdp.DseModule.configure(Unknown Source)

Getting error while loading file in pig:

I am trying to execute a Pig script in the terminal and I am getting the following error:
INFO [Thread-13] org.apache.hadoop.util.NativeCodeLoader - Loaded the native-hadoop library
WARN [Thread-13] org.apache.hadoop.mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
INFO [Thread-13] org.apache.hadoop.mapred.JobClient - Cleaning up the staging area file:/tmp/hadoop-biadmin/mapred/staging/biadmin-341199244/.staging/job_local_0001
ERROR [Thread-13] org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:biadmin cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/home/biadmin/PIGData/books.csv
ERROR [main] org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/home/biadmin/PIGData/books.csv
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1024)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1041)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:959)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
at java.security.AccessController.doPrivileged(AccessController.java:310)
at javax.security.auth.Subject.doAs(Subject.java:573)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:912)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:886)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:738)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/biadmin/PIGData/books.csv
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)
... 15 more
ERROR [main] org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
ERROR [main] org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias b
Details at logfile: /opt/ibm/biginsights/pig/bin/pig_1487413261020.log
Can anybody help me resolve this?
The code:
data = LOAD '/home/biadmin/PIGData/books.csv';
b = FOREACH data GENERATE $0;
DUMP b;
Based on the exception above, the input file does not exist at the given path, file:/home/biadmin/PIGData/books.csv (a local file system path).
Pig has two execution modes:
1. Local mode (processes files on the local file system)
$ pig -x local
2. MapReduce mode (processes files in HDFS)
$ pig
or
$ pig -x mapreduce
Make sure that you are running the Pig script in the mode that matches where the file actually lives.
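As a rough illustration of choosing between the two modes, the following Python sketch picks the `-x` value from whether the input file exists on the local file system. The helper name `pig_command` and the heuristic itself are assumptions for illustration, not part of Pig; a path that is missing locally may also be missing in HDFS, so treat the result as a starting point only.

```python
import os


def pig_command(script, input_path):
    """Build a pig command line for `script` (hypothetical helper).

    Heuristic, assumed for illustration: if input_path exists on the
    local file system, use local mode; otherwise assume the data is
    expected in HDFS and use mapreduce mode.
    """
    mode = "local" if os.path.exists(input_path) else "mapreduce"
    return ["pig", "-x", mode, script]
```

For the script in the question (saved under a hypothetical name such as books.pig), `pig_command("books.pig", "/home/biadmin/PIGData/books.csv")` yields the local-mode command only if books.csv is really present at that local path.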

java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO Fails to start DFS

I have installed and configured Hadoop (hadoop-2.6.0) on Windows.
I cannot successfully run the "sbin\start-dfs" command.
I am getting the error below:
16/12/20 13:03:56 FATAL namenode.NameNode: Failed to start namenode.
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
at org.apache.hadoop.fs.FileUtil.canWrite(FileUtil.java:996)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:490)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:308)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1020)
There was a similar question about running YARN, where the advice was that adding hadoop-2.6.0/sbin and hadoop-2.6.0/bin to the PATH would resolve the problem. I did that, but I still get the error.
Can anyone help me fix this?
Make sure you have the proper access permissions on the namenode directory. Then format the namenode and start the HDFS services.
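The permission check can be sketched in Python as follows. The helper name `can_write_to` is hypothetical, and the directory to pass is whatever your configuration uses for namenode storage (in Hadoop 2.x this is normally the `dfs.namenode.name.dir` property in hdfs-site.xml). The sketch tests write access by actually creating a temporary file, and it should be run as the same user that starts the namenode.

```python
import os
import tempfile


def can_write_to(directory):
    """Return True if a temporary file can be created inside `directory`.

    Hypothetical helper: an actual write attempt is used instead of a
    permission-bit lookup so the result reflects what the OS allows.
    """
    if not os.path.isdir(directory):
        return False
    try:
        # The temporary file is removed automatically when closed.
        with tempfile.TemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False
```

A `False` result points at a missing directory or a permission problem; a `True` result only rules out that one cause and says nothing about the native library error in the stack trace.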

Integrating Pig with HBase

I have installed hadoop-2.5.0, Pig 0.13.0 and HBase 0.98.6.1 on Linux. When I try to run a simple Pig script, the following error occurs:
2014-10-14 16:01:54,891 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Details at logfile: /home/labuser/pig_1413279561970.log
Pasted the log below...
Pig Stack Trace
ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:281)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:344)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:382)
at org.apache.hadoop.hbase.TableName.<clinit>(TableName.java:82)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
It seems that HBase 0.98.6.1 does not work with Pig 0.13.0.
How can I make it work, or which version of HBase does Pig 0.13.0 support?
The root cause has been identified as https://issues.apache.org/jira/browse/HBASE-6658, which says the class "org.apache.hadoop.hbase.filter.WritableByteArrayComparable" was renamed.
You may need to recompile Pig with the build profile that matches the HBase version you are using.

FileNotFoundException dfs.include - Hortonworks for windows

I tried to install Hortonworks 2.2.0.2.0.6.0-0009 for Windows on a Windows Server 2012.
The installation itself looks clean, but when I launch "start_local_hdp_services.cmd" to start the Hadoop services, the namenode and historyserver services fail to start and generate the following logs:
For "hadoop-namenode-M1BY1HADOOP.log":
2014-03-06 09:39:06,755 ERROR org.apache.hadoop.hdfs.server.namenode.HostFileManager: failed to read include file 'c:\hdp\hadoop-2.2.0.2.0.6.0-0009/etc/hadoop/dfs.include'. Continuing to use previous include list.
java.io.FileNotFoundException: c:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop\dfs.include (The system cannot find the file specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.util.HostsFileReader.readFileToSet(HostsFileReader.java:54)
at org.apache.hadoop.hdfs.server.namenode.HostFileManager$MutableEntrySet.readFile(HostFileManager.java:265)
at org.apache.hadoop.hdfs.server.namenode.HostFileManager.refresh(HostFileManager.java:284)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.<init>(DatanodeManager.java:176)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:237)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:609)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:567)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:443)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:491)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:684)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:669)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1254)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
For "hadoop-historyserver-M1BY1HADOOP.log":
2014-03-06 09:39:20,130 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory: [hdfs://VMHADOOP:8020/mapred/history/done]
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory: [hdfs://VMHADOOP:8020/mapred/history/done]
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:503)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:88)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceInit(JobHistoryServer.java:93)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:155)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:165)
Does anybody know the reason for this error and how to solve it?
Thank you
This is a known bug (https://issues.apache.org/jira/browse/AMBARI-2355). Create empty dfs.include and dfs.exclude files and the problem should go away.
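A minimal Python sketch of that workaround is shown below. The helper name `ensure_host_files` is hypothetical, and it assumes dfs.exclude belongs in the same directory the log reports for dfs.include; existing files are left untouched.

```python
import os


def ensure_host_files(conf_dir, names=("dfs.include", "dfs.exclude")):
    """Create each named file as an empty file if it is missing.

    Returns the list of paths that were created. conf_dir is assumed
    to be the Hadoop configuration directory named in the error log.
    """
    created = []
    for name in names:
        path = os.path.join(conf_dir, name)
        if not os.path.exists(path):
            # Append mode creates the file without truncating anything.
            open(path, "a").close()
            created.append(path)
    return created
```

With the path from the log above, the call would be `ensure_host_files(r"c:\hdp\hadoop-2.2.0.2.0.6.0-0009\etc\hadoop")`, followed by restarting the services.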
