We are using SonarQube 5.4 and we found the following log lines:
INFO [c.q.p.s.s.p.e.AbstractSmellMeasureComputer] Computed measure 'SMELL_COUNT_BAD_FRAMEWORK_USAGE': 0)
INFO [c.q.p.s.s.p.e.AbstractSmellMeasureComputer] Computed measure 'SMELL_COUNT_USELESS_TEST': 0)
INFO [c.q.p.s.s.p.e.AbstractSmellMeasureComputer] Computed measure 'SMELL_COUNT_WRONG_LANGUAGE': 0)
INFO [c.q.p.s.s.p.e.AbstractSmellMeasureComputer] Computed measure 'SMELL_COUNT_WRONG_LOGIC': 0)
ERROR [c.h.s.m.SonargraphMeasureComputer] Not processing for module: xxx
ERROR [c.h.s.m.SonargraphMeasureComputer] Not processing for module: xxx
INFO [c.q.p.s.s.p.e.AbstractSmellMeasureComputer] Computing measures for component 'xxx' and level PROJECT)
INFO [c.q.p.s.s.p.e.AbstractSmellMeasureComputer] Computed measure 'SMELL_COUNT': 0)
ERROR [c.h.s.m.SonargraphMeasureComputer] Not processing for module: xxx
ERROR [c.h.s.m.SonargraphMeasureComputer] Not processing for module: xxx
INFO [o.s.s.c.s.ExecuteVisitorsStep] Execution time for each component visitor:
INFO [o.s.s.c.s.ExecuteVisitorsStep] - LoadComponentUuidsHavingOpenIssuesVisitor | time=79ms
INFO [o.s.s.c.s.ExecuteVisitorsStep] - IntegrateIssuesVisitor | time=29920ms
We could not find an explanation for the error lines. Does anyone have any ideas?
I installed HDP 3.1.5 and enabled Kerberos security.
In Hive, a normal CREATE TABLE works fine, but when I try to create any role I get the error below. Please suggest a solution.
0: jdbc:hive2://host> create role userRole;
INFO : Compiling command(queryId=hive_20200320085236_d9a4f82e-dab8-4952-aa53-da11a1cda4b6): create role userRole
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20200320085236_d9a4f82e-dab8-4952-aa53-da11a1cda4b6); Time taken: 0.021 seconds
INFO : Executing command(queryId=hive_20200320085236_d9a4f82e-dab8-4952-aa53-da11a1cda4b6): create role bdauserRole
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. createRole not implemented in FallbackHiveAuthorizer
INFO : Completed executing command(queryId=hive_20200320085236_d9a4f82e-dab8-4952-aa53-da11a1cda4b6); Time taken: 0.02 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. createRole not implemented in FallbackHiveAuthorizer (state=08S01,code=1)
Tweets from Twitter are stored in HDFS in Hadoop.
The tweets need to be processed for sentiment analysis. The tweets in HDFS are in Avro format, so they need to be processed using a JSON loader, but the Pig script is not reading the tweets from HDFS. After changing the jar files, the Pig script shows a failure message.
With the following jar files, the Pig script fails:
REGISTER '/home/cloudera/Desktop/elephant-bird-hadoop-compat-4.17.jar';
REGISTER '/home/cloudera/Desktop/elephant-bird-pig-4.17.jar';
REGISTER '/home/cloudera/Desktop/json-simple-3.1.0.jar';
This is another set of jar files with which the script does not fail, but no data is read either:
REGISTER '/home/cloudera/Desktop/elephant-bird-hadoop-compat-4.17.jar';
REGISTER '/home/cloudera/Desktop/elephant-bird-pig-4.17.jar';
REGISTER '/home/cloudera/Desktop/json-simple-1.1.jar';
Here are all the Pig commands I have used:
tweets = LOAD '/user/cloudera/OutputData/tweets' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;
B = FOREACH tweets GENERATE myMap#'id' as id ,myMap#'tweets' as tweets;
tokens = foreach B generate id, tweets, FLATTEN(TOKENIZE(tweets)) As word;
dictionary = load ' /user/cloudera/OutputData/AFINN.txt' using PigStorage('\t') AS(word:chararray,rating:int);
word_rating = join tokens by word left outer, dictionary by word using 'replicated';
describe word_rating;
rating = foreach word_rating generate tokens::id as id,tokens::tweets as tweets, dictionary::rating as rate;
word_group = group rating by (id,tweets);
avg_rate = foreach word_group generate group, AVG(rating.rate) as tweet_rating;
positive_tweets = filter avg_rate by tweet_rating>=0;
DUMP positive_tweets;
negative_tweets = filter avg_rate by tweet_rating<=0;
DUMP negative_tweets;
Error when dumping the tweets with the first set of jar files:
Failed to read data from "/user/cloudera/OutputData/tweets"
Failed to produce result in "hdfs://quickstart.cloudera:8020/tmp/temp-1614543351/tmp37889715"
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
2019-05-03 09:59:09,409 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2019-05-03 09:59:09,427 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias tweets. Backend error : org.json.simple.parser.ParseException
Details at logfile: /home/cloudera/pig_1556902594207.log
Output when dumping the tweets with the second set of jar files:
Successfully read 0 records (5178477 bytes) from: "/user/cloudera/OutputData/tweets"
Successfully stored 0 records in: "hdfs://quickstart.cloudera:8020/tmp/temp-1614543351/tmp479037703"
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
2019-05-03 10:01:05,417 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-05-03 10:01:05,418 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2019-05-03 10:01:05,418 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-05-03 10:01:05,428 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-05-03 10:01:05,428 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
The expected output was sorted positive and negative tweets, but I am getting errors.
Please help. Thank you.
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias tweets. Backend error : org.json.simple.parser.ParseException
This usually indicates a syntax error in the Pig script.
The AS keyword in a LOAD statement usually requires a schema. myMap in your LOAD statement is not a valid schema.
See https://stackoverflow.com/a/12829494/8886552 for an example of JsonLoader.
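It may also be worth confirming that the input really is one JSON object per line, since the ParseException comes from org.json.simple while it parses the data. The following is only a rough sketch in Python rather than Pig, assuming a sample of the file has been copied locally. The function name count_json_lines and the sample lines are made up for illustration.

```python
import json


def count_json_lines(lines):
    """Return (ok, bad): non-empty lines that do / do not parse as a JSON object."""
    ok = bad = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        try:
            parsed = json.loads(line)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            bad += 1
            continue
        if isinstance(parsed, dict):
            ok += 1
        else:
            bad += 1  # valid JSON, but not an object with keys such as 'id'
    return ok, bad


# Hypothetical sample: one well-formed tweet record and one line that is not JSON.
sample = [
    '{"id": 1, "tweets": "good morning"}',
    'this line is not JSON',
    '',
]
ok, bad = count_json_lines(sample)
print(ok, bad)
```

If most lines land in the bad count, the file is probably not in the format the loader expects, and no choice of jar files will make it readable as JSON.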
I'm trying to download a 25-day-ahead forecast from the ECMWF MARS Web API for all of 2018. These forecasts (WAEF Control Forecast) are only published on Mondays and Thursdays, and this is where I'm running into problems fetching the data using the MARS Web API.
I tried requesting the intuitive 2018-01-01/to/2018-12-31, but since there are 5 days a week where there aren't any fields to retrieve, the request fails.
My MARS request file is as follows:
Which results in the following response:
mars - INFO - 20190215.100826 - Welcome to MARS
mars - INFO - 20190215.100826 - MARS Client build stamp: 20190130224336
mars - INFO - 20190215.100826 - MARS Client version: 6.23.3
mars - INFO - 20190215.100826 - MIR version: 1.1.2
mars - INFO - 20190215.100826 - Using ecCodes version 2.10.1
mars - INFO - 20190215.100826 - Using odb_api version: 0.15.9 (file format version: 0.5)
mars - INFO - 20190215.100826 - Maximum retrieval size is 30.00 G
retrieve,target="output.grib",stream=waef,param=229.140/245.140,padding=0,step=600/624/648/672,expver=1,time=00:00:00,date=2018-01-01/to/2018-12-31,type=cf,class=od
mars - WARN - 20190215.100826 - For wave data, LEVTYPE forced to Surface
mars - INFO - 20190215.100826 - Automatic split by date is on
mars - INFO - 20190215.100826 - Request has been split into 12 monthly retrievals
mars - INFO - 20190215.100826 - Processing request 1
EXPVER = 0001,
PARAM = 229.140/245.140,
TIME = 0000,
STEP = 600/624/648/672,
TARGET = "output.grib",
DATE = 20180101/20180102/20180103/20180104/20180105/20180106/20180107/20180108/20180109/20180110/20180111/20180112/20180113/20180114/20180115/20180116/20180117/20180118/20180119/20180120/20180121/20180122/20180123/20180124/20180125/20180126/20180127/20180128/20180129/20180130/20180131
mars - INFO - 20190215.100826 - Web API request id: xxx
mars - INFO - 20190215.100826 - Requesting 248 fields
mars - INFO - 20190215.100826 - Calling mars on 'marsod', callback on 36551
mars - INFO - 20190215.100827 - Server task is 228 [marsod]
mars - INFO - 20190215.100827 - Request cost: 72 fields, 17.2754 Mbytes on 1 tape, nodes: hpss [marsod]
2019-02-15 11:08:59 Request is active
mars - INFO - 20190215.102300 - Transfering 18114554 bytes
mars - WARN - 20190215.102301 - Visiting database marsod : expected 248, got 72
mars - ERROR - 20190215.102301 - Expected 248, got 72.
mars - ERROR - 20190215.102301 - Request failed
Is there any way to allow receiving fewer fields than requested, or any other elegant solution to this problem, other than requesting only the correct dates for Mondays and Thursdays?
I managed to find the answer in the MARS documentation after all. Using expect = any in the control section solved the issue. More information can be found here: https://confluence.ecmwf.int/pages/viewpage.action?pageId=43521134
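For anyone who would rather request only the valid dates, here is a small sketch in Python that lists the Mondays and Thursdays of 2018 and puts them into a request together with expect = any. The keyword values are copied from the log above. The helper name forecast_dates and the use of a plain dictionary to hold the request are my own assumptions, and how the request is submitted is not shown.

```python
from datetime import date, timedelta


def forecast_dates(year, weekdays=(0, 3)):
    """Return ISO dates in `year` that fall on the given weekdays (Mon=0, Thu=3)."""
    day = date(year, 1, 1)
    out = []
    while day.year == year:
        if day.weekday() in weekdays:
            out.append(day.isoformat())
        day += timedelta(days=1)
    return out


dates = forecast_dates(2018)

# Request keywords taken from the log in the question; "expect" comes from the answer.
# The dictionary layout itself is an assumption made for illustration.
request = {
    "class": "od",
    "stream": "waef",
    "type": "cf",
    "expver": "1",
    "param": "229.140/245.140",
    "step": "600/624/648/672",
    "time": "00:00:00",
    "date": "/".join(dates),  # explicit list instead of 2018-01-01/to/2018-12-31
    "expect": "any",          # accept fewer fields than requested
    "target": "output.grib",
}
print(len(dates), dates[0], dates[-1])
```

With an explicit date list the field count should already match, so expect = any is kept here only as a safety net.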
I am running Cassandra and have about 20k records in it to play with. I am trying to run a filter in pig on this data but am getting the following message back:
2015-07-23 13:02:23,559 [Thread-4] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: com.datastax.driver.core.exceptions.InvalidQueryException: Expected 8 or 0 byte long (1)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:260)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:205)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Expected 8 or 0 byte long (1)
at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:263)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:179)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:44)
at org.apache.cassandra.hadoop.cql3.CqlRecordReader$RowIterator.<init>(CqlRecordReader.java:259)
at org.apache.cassandra.hadoop.cql3.CqlRecordReader.initialize(CqlRecordReader.java:151)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:256)
... 7 more
You would think this is an obvious error, and believe me, there are a ton of results on Google for it. It's clear that some piece of my data doesn't conform to the expected type of a given column. What I don't understand is 1.) why this is happening, and 2.) how to debug it. If I try to insert invalid data into Cassandra from my Node.js app, it throws this kind of error when my data type doesn't match the column's data type, so shouldn't this be impossible? I've read that data validation using UTF8 is wonky and that setting a different kind of validation is the answer, but I don't know how to do that. Here are my steps to reproduce:
grunt> define CqlNativeStorage org.apache.cassandra.hadoop.pig.CqlNativeStorage();
grunt> test = load 'cql://blah/blahblah' USING CqlNativeStorage();
grunt> describe test;
13:09:54.544 [main] DEBUG o.a.c.hadoop.pig.CqlNativeStorage - Found ksDef name: blah
13:09:54.544 [main] DEBUG o.a.c.hadoop.pig.CqlNativeStorage - partition keys: ["ad_id"]
13:09:54.544 [main] DEBUG o.a.c.hadoop.pig.CqlNativeStorage - cluster keys: []
13:09:54.544 [main] DEBUG o.a.c.hadoop.pig.CqlNativeStorage - row key validator: org.apache.cassandra.db.marshal.UTF8Type
13:09:54.544 [main] DEBUG o.a.c.hadoop.pig.CqlNativeStorage - cluster key validator: org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type)
blahblah: {ad_id: chararray,address: chararray,city: chararray,date_created: long,date_listed: long,fireplace: bytearray,furnished: bytearray,garage: bytearray,neighbourhood: chararray,num_bathrooms: int,num_bedrooms: int,pet_friendly: bytearray,postal_code: chararray,price: double,province: chararray,square_feet: int,url: chararray,utilities_included: bytearray}
grunt> query1 = FILTER blahblah BY city == 'New York';
grunt> dump query1;
Then it runs for a while, dumps out tons of logs, and the error appears.
Discovered my problem: the Pig partitioner did not match CQL3, and therefore the data was being parsed incorrectly. Previously the environment variable was PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner. After I changed it to PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner it started working.
I have two input files
Student file :
abc 30 4.5
xyz 34 9.5
def 28 6.5
klm 35 10.5
Location file :
abc hawthorne
xyz artesia
def garnet
klm vanness
My desired output:
abc hawthorne
xyz artesia
def garnet
klm vanness
To achieve this, I wrote the following pig program.
A = LOAD '/user/hive/warehouse/students.txt' USING PigStorage(' ') AS (NAME:CHARARRAY,AGE:INT,GPA:FLOAT);
B = LOAD '/user/hive/warehouse/location.txt.txt' using PigStorage(' ') AS (NAME:CHARARRAY,LOCATION:CHARARRAY);
The trouble is that I don't see any output. On top of that, I see the following warnings during execution:
2014-01-22 15:18:15,829 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 2 time(s).
2014-01-22 15:18:15,829 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 2 time(s).
2014-01-22 15:18:15,829 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2014-01-22 15:18:15,829 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2014-01-22 15:18:15,832 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2014-01-22 15:18:15,832 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2014-01-22 15:18:15,841 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-01-22 15:18:15,841 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-01-22 15:18:15,841 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
Hadoop Job IDs executed by Pig: job_201401210934_0082,job_201401210934_0083
I think you are not seeing any output because the join does not produce any matches.
You are joining on NAME from A (abc, xyz, def, klm) and LOCATION from B (hawthorne, artesia, garnet, vanness). There are no matching strings between the two data sets, so the join returns nothing.
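To illustrate the point outside Pig, here is a minimal sketch in Python that uses the sample rows from the question. The helper inner_join is a hypothetical stand-in for an inner equi-join and is not Pig's implementation.

```python
students = [("abc", 30, 4.5), ("xyz", 34, 9.5), ("def", 28, 6.5), ("klm", 35, 10.5)]
locations = [("abc", "hawthorne"), ("xyz", "artesia"), ("def", "garnet"), ("klm", "vanness")]


def inner_join(left, right, left_key, right_key):
    """Inner equi-join of two lists of tuples on the given column indexes."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    # Keep a left row only when some right row has the same key value.
    return [l + r for l in left for r in index.get(l[left_key], [])]


# Student NAME (column 0) against LOCATION (column 1): no key is shared.
unmatched = inner_join(students, locations, 0, 1)
# Student NAME (column 0) against location NAME (column 0): every row matches.
matched = inner_join(students, locations, 0, 0)

print(len(unmatched))
print([(row[0], row[4]) for row in matched])
```

Joining on the name column of both relations gives the name and location pairs from the desired output, while joining name against location gives an empty result.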