I started off trying to do a simple Pig + Cassandra integration with this tutorial from DataStax: http://docs.datastax.com/en/datastax_enterprise/4.5/datastax_enterprise/ana/anaPigExRel.html
But when I try to store the result into CQL, I get this error:
Message: org.apache.pig.backend.executionengine.ExecException: ERROR
2118: Could not get input splits
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:279)
Any ideas what's happening? I read some answers here referring to changing my PIG_PARTITIONER to Murmur3Partitioner, which I already did, and it still happens. Is it a configuration issue?
export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
I found out that after doing:
export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
I need to run source ~/.bashrc and start pig from that same console.
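In full, the sequence that worked looks something like this (a sketch, assuming the export line was added to ~/.bashrc; adjust if you keep it elsewhere):
# Append the partitioner setting to ~/.bashrc (hypothetical location)
echo 'export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner' >> ~/.bashrc
# Reload the shell configuration, then start pig from this same console
source ~/.bashrc
pig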
I do get another error afterwards, but I think this case is solved.
I am trying to generate the extracted_queries.json file using the persistgraphql CLI utility, but when I run the following command
persistgraphql src/app/ --add_typename
it results in this error:
Unable to process input path src/app/. Error message:
Syntax Error GraphQL request (1:1) Unexpected <EOF>
1:
^
How can I fix this?
After looking for answers for a while, I was able to make it work.
In my case, our queries and mutations are written in .js files, so the command below worked for me:
persistgraphql src/app --js --extension=js --add_typename
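For reference, this is the kind of file the command picks up (a hypothetical example; with --js and --extension=js, persistgraphql extracts gql-tagged template literals from .js files):
// src/app/queries.js (hypothetical file name)
import gql from 'graphql-tag';

export const GET_USERS = gql`
  query GetUsers {
    users {
      id
      name
    }
  }
`;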
My Python version is 3.7. After I ran pip3 install happybase, I started the Thrift server with the command hbase thrift start and tried to write a brief .py file as follows:
import happybase

connection = happybase.Connection('master')
table = connection.table('jmlr')  # 'jmlr' is a table in HBase
for i in table.scan():
    print(i)
table.put('001', {'title': 'dasds'})  # error here
connection.close()
When it gets to table.put(), it reports this error:
thriftpy2.transport.base.TTransportException: TTransportException(type=4, message='TSocket read 0 bytes')
At the same time, the Thrift server reported an error:
ERROR [thrift-worker-1] thrift.TBoundedThreadPoolServer: Error occurred during processing of message. java.lang.IllegalArgumentException: Invalid famAndQf provided.
But when I ran this Python file again just now, it gave me a different error from Thrift:
thrift.TBoundedThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Bad version in readMessageBegin
I have tried adding parameters like protocol='compact' and transport='framed', but this didn't work; even table.scan() failed.
Everything in the HBase shell is OK, so I can't figure out what went wrong. I'm about to collapse.
I ran into the same issue and found this solution. You need to include a column qualifier, even an empty one, in the column name passed to put() (':' is the delimiter between column family and column qualifier):
table.put('001', {'title:': 'dasds'})
Also, you get a different error message on the second run of the script because the Thrift server has already crashed by then.
I hope this helps.
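For completeness, a minimal sketch of the corrected script (same host and table as in the question):
import happybase

# Connect to the Thrift server started with `hbase thrift start`
connection = happybase.Connection('master')
table = connection.table('jmlr')

# scan() yields (row_key, data) tuples
for row_key, data in table.scan():
    print(row_key, data)

# Column names are 'family:qualifier'; an empty qualifier still needs the trailing ':'
table.put('001', {'title:': 'dasds'})
connection.close()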
I'm using the Hortonworks sandbox and trying to run a simple Pig script. There is an annoying error related to "file does not exist".
Below is the script:
REGISTER '/piggybank.jar';
inp = load '/my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage..
ERROR 2997: Encountered IOException. File does not exist:
hdfs://sandbox.hortonworks.com:8020/tmp/udfs/ '/piggybank.jar'
However, my jar is present at the root (/) and I have given it proper permissions as well. I don't know why the path is pointing to /tmp/udfs....
Can anyone provide a suggestion?
Do not place the path within quotes. Also, provide the full URI of the jar file's location.
REGISTER hdfs://sandbox.hortonworks.com:8020/piggybank.jar;
Refer to REGISTER (a jar/script) in the Pig documentation.
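Putting it together, the load would look something like this (a sketch; the CSVExcelStorage arguments were truncated in the question, so the default constructor is assumed):
REGISTER hdfs://sandbox.hortonworks.com:8020/piggybank.jar;
inp = LOAD '/my.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage();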
I am getting the following error while running this Pig script:
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/avro.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/json-simple-1.1.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/jackson-core-asl-1.8.8.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/lib/jackson-mapper-asl-1.8.8.jar
REGISTER /opt/cloudera/parcels/CDH/lib/pig/piggybank.jar
list_cookies = LOAD '/user/xyz/testbed/llama-2014-Oct-12d/abc'
USING org.apache.pig.piggybank.storage.avro.AvroStorage();
I got the following error:
2014-10-22 11:51:14,705 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245: Cannot get schema from loadFunc org.apache.pig.builtin.AvroStorage
Details at logfile: /home/xyz/pig_1413991623605.log
In my case, it was simply that the input folder did not exist. Pig's error messages are off the mark and not at all helpful here. After changing the input folder to one that existed, this error went away. So be sure to check that before spending a lot of time on more difficult debugging!
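A quick way to verify the input path before digging deeper (using the path from the question above):
hadoop fs -ls /user/xyz/testbed/llama-2014-Oct-12d/abc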
I am following the steps mentioned on AWS to use an interactive Hive session via SSH.
I used the following resources
https://github.com/ucbtwitter/getting-started/wiki/Using-Elastic-Map-Reduce-via-Command-Line
http://docs.amazonwebservices.com/ElasticMapReduce/latest/GettingStartedGuide/SignUp.html
I was initially getting the error "Error: Missing key access-id", and then I fixed my JSON file. The JSON file is in the same format as mentioned in the links above.
When I run this command
./elastic-mapreduce
I am getting the following error:
Error: Unable to parse credentials.json: can't convert String into Integer.
I checked the values required in the JSON at AWS as well.
Does anyone have an idea why I am getting this error?
The region value in the credentials.json must be of int type.
{
  ......
  ......
  "region": 1
}
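Separately from the type issue, it can help to confirm the file is at least valid JSON before rerunning the CLI; Python's built-in validator works for this (assuming the file is named credentials.json in the current directory):
python -m json.tool credentials.json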