Pig register jar, file does not exist error - hadoop

I'm using the Hortonworks sandbox and trying to run a simple Pig script. There appears to be an annoying error related to "file does not exist".
Below is the script:
REGISTER '/piggybank.jar';
inp = load '/my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage..
And this is the error:
ERROR 2997: Encountered IOException. File does not exist:
hdfs://sandbox.hortonworks.com:8020/tmp/udfs/ '/piggybank.jar'
However, my jar is present at the root (/), and I have set the proper permissions as well. I don't know why the path is pointing to /tmp/udfs....
Can anyone provide a suggestion?

Do not place the path within quotes. Also, provide the full URI of the jar file's location.
REGISTER hdfs://sandbox.hortonworks.com:8020/piggybank.jar;
Refer to REGISTER (a jar/script).
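For example, assuming the jar really does sit at the HDFS root as described, a quick listing (a sketch; the host and port are taken from the error message above) confirms the exact URI to hand to REGISTER:
# Verify the jar's location in HDFS before registering it
hadoop fs -ls hdfs://sandbox.hortonworks.com:8020/piggybank.jar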

Related

Could not find or load main class hdfs problem

I am trying to use Apache Rya for some tests (https://rya.apache.org/).
For those who are familiar with Rya and RDF stores, I am trying to do a bulk load, which is explained here: https://github.com/apache/rya/blob/master/extras/rya.manual/src/site/markdown/loaddata.md.
Briefly, I should copy a jar file 'mapreduce/target/rya.mapreduce-<version>-shaded.jar' into an HDFS volume and then run the following command:
hadoop hdfs://volume/rya.mapreduce-<version>-shaded.jar org.apache.rya.accumulo.mr.tools.RdfFileInputTool -Dac.zk=localhost:2181 -Dac.instance=accumulo -Dac.username=root -Dac.pwd=secret -Drdf.tablePrefix=rya_ -Drdf.format=N-Triples hdfs://volume/dir1,hdfs://volume/dir2,hdfs://volume/file1.nt
Well, I copied the needed jar and the input files into HDFS using the bin/hadoop fs -put command and verified that they are really there. My problem is that when I run the command from the official example, I get the following error lines, which I could not understand or resolve.
/project/hadoop/libexec/hadoop-functions.sh: line 2393: HADOOP_HDFS://LOCALHOST:9000/USER/RYA.MAPREDUCE-4.0.0-INCUBATING-SHADED.JAR_USER: invalid variable name
/project/hadoop/libexec/hadoop-functions.sh: line 2358: HADOOP_HDFS://LOCALHOST:9000/USER/RYA.MAPREDUCE-4.0.0-INCUBATING-SHADED.JAR_USER: invalid variable name
/project/hadoop/libexec/hadoop-functions.sh: line 2453: HADOOP_HDFS://LOCALHOST:9000/USER/RYA.MAPREDUCE-4.0.0-INCUBATING-SHADED.JAR_OPTS: invalid variable name
Error: Could not find or load main class hdfs:..localhost:9000.user.rya.mapreduce-4.0.0-incubating-shaded.jar
For information: all environment variables, HADOOP_HOME and HADOOP_PREFIX, are properly set.
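For reference, the staging steps described above amount to something like the following sketch; the jar name and the /user target directory are inferred from the error output, not stated explicitly in the question:
# Copy the shaded jar into HDFS, then confirm it is really there
bin/hadoop fs -put mapreduce/target/rya.mapreduce-4.0.0-incubating-shaded.jar /user/
bin/hadoop fs -ls /user/rya.mapreduce-4.0.0-incubating-shaded.jar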

Getting error file not found for ProcessCenter_CaseManagerConfig.properties while running BPMGenerateUpgradeSchemaScripts.bat command

I am upgrading IBM BPM 8.6.0 to IBM Business Automation Workflow Version 18.0.0.2. After applying the fix pack for IBAW, I get an error when I run the command below.
BPMGenerateUpgradeSchemaScripts.bat -profileName Node1Profile -de ProcessCenter
Below is the error that comes up when running the above command.
Unable to find the response file
C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties
Unable to find the file C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties, please run the command 'BPMConfig -update -profile deployment_manager_profile -de deployment_environment_name -caseConfigure' to collect the configuration information for the content data sources, please read the knowledge center for details.
java.io.FileNotFoundException: C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties (The system cannot find the file specified.)
CWMCO6007E: The BPMGenerateUpgradeSchemaScripts command could not complete successfully. The following exception occurred :
Faild to initialize the CommonInfo. java.io.FileNotFoundException: C:\IBM\BPM\v8.6\profiles\Node1Profile\config\cells\PCCell1\ProcessCenter_CaseManagerConfig.properties (The system cannot find the file specified.)
The command that the above error asks to run first is at point 11 in the upgrade guide. Can someone please suggest what's wrong here?
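For reference, the command named in the error message would look something like this, with the -de value taken from the question and the deployment manager profile name left as a placeholder (it is not given in the question):
BPMConfig -update -profile <deployment_manager_profile> -de ProcessCenter -caseConfigure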

Reading sas file from blob storage in R

I am trying to read a .sas7bdat file from the default container. I have tried the following so far:
sas_file <- RxSasData("wasbs://container@storageaccount.blob.core.windows.net/abc/xyz.sas7bdat")
sas_df <- rxImport(sas_file)
but I get the following error:
The file 'wasbs://container@storageaccount.blob.core.windows.net/abc/xyz.sas7bdat' does not exist.
Could not open data source.
Error in doTryCatch(return(expr), name, parentenv, handler) :
Could not open data source.
The file exists at the location mentioned in the code, yet it still throws this error. Can someone please help me with this?
According to your code, I think you want to load a SAS data file from HDFS on Azure HDInsight via RxSasData. However, RxSasData does not appear to be supported in a Hadoop environment; please see the RxSasData documentation.
Please try copying the file to the local filesystem on HDI and reading it from there.
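A minimal sketch of that workaround, assuming the file lives in the cluster's default (blob-backed) container so it is reachable through hdfs dfs, and using /tmp as a placeholder local target:
# Pull the SAS file down to the local filesystem of the HDI node,
# then point RxSasData at the local copy instead of the wasbs:// URI
hdfs dfs -copyToLocal /abc/xyz.sas7bdat /tmp/xyz.sas7bdat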

Cannot get schema from loadFunc org.apache.pig.builtin.AvroStorage

I am getting the following error while running this Pig script:
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/avro.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/json-simple-1.1.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/jackson-core-asl-1.8.8.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/jackson-mapper-asl-1.8.8.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/piggybank.jar
list_cookies = LOAD '/user/xyz/testbed/llama-2014-Oct-12d/abc'
USING org.apache.pig.piggybank.storage.avro.AvroStorage();
and I got the following error:
2014-10-22 11:51:14,705 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245: Cannot get schema from loadFunc org.apache.pig.builtin.AvroStorage
Details at logfile: /home/xyz/pig_1413991623605.log
In my case, it was simply the fact that the input folder did not exist. Pig's error messages are off the mark and not at all helpful. After changing the input folder to one that existed, the error went away. So be sure to check that before spending a lot of time on more difficult debugging!
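A quick way to run that check, using the LOAD path from the script above (adjust to your own input):
# Confirm the input folder actually exists before deeper debugging
hadoop fs -ls /user/xyz/testbed/llama-2014-Oct-12d/abc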

sas hadoop error - picklist

I am executing a SAS program. I have declared CLASSPATH and the other variables properly. However, when I define a libname to access Hadoop, I get an error. Below is the relevant part of the SAS log.
ERROR: The Java picklist file was not found.
1 libname testdata spde './' hdfshost=default;
ERROR: tkhdjn1 constructNewObjectOfClass: failed.
ERROR: tkhdjn2 JnlFromException: Missing exception.
ERROR: Can't construct instance of class org.apache.hadoop.conf.Configuration.
ERROR: Probable classpath problem.
ERROR: Could not connect to HDFS.
ERROR: Libref TESTDATA is not assigned.
ERROR: Error in the LIBNAME statement.
Can someone please look into this issue and let me know exactly what the problem is?
My guess is that you're not providing the correct path in your libname statement. According to the documentation:
http://support.sas.com/documentation/cdl/en/engspdehdfsug/67403/HTML/default/viewer.htm#n1s4fhx0fko8zkn1fiinudodmmai.htm
You should have a fully qualified path, and './' is not fully qualified.
If I were you, I'd focus on double-checking all the requirements specified in the linked documentation.
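For instance, before rewriting the LIBNAME, a listing like the one below (the directory is purely a placeholder) can confirm that the fully qualified HDFS path you intend to use actually exists:
# Check that the intended SPDE target directory exists in HDFS
hadoop fs -ls /user/testdata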
