pig - transform data from multiple rows to columns using Pig - apache-pig-grunt

I am running into an issue while trying to transform data from long form to wide form.
My input file:
Id1
Name1
Mail1
Id2
Name2
Mail2
I want the data to be transformed like this:
Id1 Name1 Mail1
Id2 Name2 Mail2
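The reshaping asked for here can be pinned down outside Pig: read the values in order and start a new output row after every third value. The Python below is only an illustrative sketch, assuming every record spans exactly three consecutive lines (id, name, mail). The function name rows_to_records is hypothetical and is not part of Pig.

```python
def rows_to_records(lines, width=3):
    """Group consecutive non-empty lines into records of `width` fields."""
    values = [line.strip() for line in lines if line.strip()]
    if len(values) % width != 0:
        # Assumption: a trailing partial record is treated as bad input.
        raise ValueError("input does not divide evenly into records")
    return [values[i:i + width] for i in range(0, len(values), width)]


sample = ["Id1", "Name1", "Mail1", "Id2", "Name2", "Mail2"]
for record in rows_to_records(sample):
    print(" ".join(record))
```

In Pig itself the same grouping would need a per-line sequence number to divide by three (for instance one produced by the RANK operator), because a relation carries no positional index of its own.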

Related

where() with multiple columns and anyOf() with Dexie

I have table data as mentioned below.
Id Name Type
1 ABC SCAN
2 BDC SCAN
3 ABC MANUAL
4 BDC EXTDEV
5 ABC EXTDEV
From the table above, I want the ABC rows for all three types.
I have tried the function below:
export const getInventoryScanProductByAccNameAndType = async (accName) =>
  await db.InventoryScans
    .where(['AccountName', 'ProductEnterType'])
    .anyOf(accName, ['SCAN', 'MANUAL', 'EXTDEVICE'])
    .toArray();
Can anyone help me to get the data below?
Id Name Type
1 ABC SCAN
2 ABC MANUAL
3 ABC EXTDEV

LATERAL VIEW explode function in Hive

I am trying to export data from Excel into a Hive table. One of my columns, 'ABC', has values like '1,2,3'.
I used the LATERAL VIEW explode function, but it does not do anything to my data.
Here is my code snippet:
CREATE TABLE table_name
(
  id string,
  brand string,
  data_name string,
  name string,
  address string,
  country string,
  flag string,
  sample_list array
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'location' INTO TABLE table_name;
output sample:
id brand data_name name address country flag sample_list
19 1 ABC SQL ABC Cornstarch IN 1 ["[1,2,3]"]
Then I run:
select * from franchise_unsupress LATERAL VIEW explode(SEslist) SEslist as final_SE;
output sample:
id brand data_name name address country flag sample_list
19 1 ABC SQL ABC Cornstarch IN 1 [1,2,3]
I also tried:
select * from franchise_unsupress lateral view explode(split(SEslist,',')) SEslist AS final_SE ;
but got an error:
FAILED: ClassCastException org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
whereas what I need is:
id brand data_name name address country flag sample_list
19 1 ABC SQL ABC Cornstarch IN 1 1
19 1 ABC SQL ABC Cornstarch IN 1 2
19 1 ABC SQL ABC Cornstarch IN 1 3
Any help will be greatly appreciated. Thank you!
The problem is that the array is parsed incorrectly and loaded as a single-element array, ["[1,2,3]"]. It should be [1,2,3], or ["1","2","3"] if the column is array<string>.
When creating the table, specify the delimiter for collection items:
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
I wanted to add my own answer.
The issue was with the input itself: the values in my input text file were wrapped in [] brackets. Once I removed the brackets, it worked.
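Both answers can be illustrated with a small simulation of how a delimited text row is split. This is a sketch in Python, not Hive's actual SerDe: the helper parse_row and the shortened three-column row are assumptions made for the example. It relies on the fact that, when no COLLECTION ITEMS delimiter is declared, Hive's default item delimiter is a control character rather than a comma, so a comma-separated field stays in one piece.

```python
def parse_row(line, field_delim="\t", item_delim=None):
    """Split a text row into fields; treat the last field as a collection."""
    fields = line.rstrip("\n").split(field_delim)
    raw = fields[-1]
    # With no matching item delimiter, the whole field is one element.
    items = raw.split(item_delim) if item_delim else [raw]
    return fields[:-1], items


row = "19\tABC\t[1,2,3]"  # shortened stand-in for the row in the question

_, no_delim = parse_row(row)
_, with_delim = parse_row(row, item_delim=",")
_, cleaned = parse_row(row.replace("[", "").replace("]", ""), item_delim=",")

print(no_delim)    # ['[1,2,3]']
print(with_delim)  # ['[1', '2', '3]']
print(cleaned)     # ['1', '2', '3']
```

The middle case shows why declaring the comma delimiter alone is not enough for this input: the brackets remain attached to the first and last elements until they are stripped from the file.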

In Pig, how to remove similar values

In my Pig script I have columns for country1 and country2 and an id. Some of the country values are similar, as shown below. How do I filter out rows whose two values share at least 2 consecutive identical characters?
Ex:
a = load file
a = generate id, country1, country2
output:
id1, us, usa
id2, gb, gba
id3, in, ind
id4, in, usa
expected output:
id4, in, usa
Use SUBSTRING to get the first two characters of the third column and compare them with the second column's value.
B = FILTER A BY (LOWER($1) != SUBSTRING(LOWER($2), 0, 2));
DUMP B;
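To check the comparison against the sample rows, the same predicate can be written in Python. This is a sketch of the logic only, with the hypothetical helper is_dissimilar standing in for the FILTER condition. It tests just the first two characters of country2, so it would not catch a shared pair of characters that appears later in the string.

```python
rows = [
    ("id1", "us", "usa"),
    ("id2", "gb", "gba"),
    ("id3", "in", "ind"),
    ("id4", "in", "usa"),
]


def is_dissimilar(country1, country2):
    """True when country1 differs from the first two characters of country2."""
    return country1.lower() != country2.lower()[:2]


kept = [row for row in rows if is_dissimilar(row[1], row[2])]
print(kept)  # [('id4', 'in', 'usa')]
```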

Delete a specific value from a set of values in a column - HQL

How can I write HQL to delete a specific value from a column (the column contains a set of comma-separated values)?
Table1
ID Name Value
001 Rajesh 90,100,210,400
002 Suresh 100,400,300,66
003 Mahesh 200,500
004 Virat 400,578,57
I tried the following code, but it's wrong.
Session session=getSession();
Query query = session.createQuery("delete from Table1 where Value=:Value and Name=:Name");
query.setParameter("Value", "100");
query.setParameter("Name", "Rajesh");
query.executeUpdate();
I want to delete the value 100 from the row where Name is Rajesh.
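Because the value sits inside a comma-separated string, removing it amounts to rewriting that string and updating the row. It is not a row delete, and a condition such as Value = '100' compares against the whole string. The string manipulation can be sketched as follows. This is an illustration in Python with a hypothetical helper remove_value, and it assumes values are separated by plain commas with no quoting.

```python
def remove_value(csv_values, target):
    """Return the comma-separated string with every exact `target` removed."""
    parts = [part.strip() for part in csv_values.split(",")]
    # Compare whole tokens so that removing "100" leaves "1000" untouched.
    return ",".join(part for part in parts if part != target)


print(remove_value("90,100,210,400", "100"))  # 90,210,400
```

The rewritten string would then be saved back to the row for Rajesh through an update.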

Last unique row

This problem relates to report design in BIRT.
Due to limitations in the data source, I have a data set with the following rows:
Id Name Class
---------------
1 Name1 Foo
2 Name1 Bar
3 Name2 Fizz
4 Name2 Buzz
5 Name3 Baz
Duplicates of the name column should be suppressed, and only the last result should be displayed:
Id Name Class
---------------
2 Name1 Bar
4 Name2 Buzz
5 Name3 Baz
How can I do that?
(Assuming your data is in ID order)
Insert a Group in your BIRT report, on name.
Copy the detail row items into the new group footer.
Delete the detail row and new group header row.
The output from the report should match your requirements.
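The effect of the group footer, which is evaluated after the last detail row of each group, can be sketched as "keep the last row seen for each name". The Python below only illustrates that logic. The helper last_per_name is hypothetical, and like the answer it assumes the rows arrive in Id order.

```python
rows = [
    (1, "Name1", "Foo"),
    (2, "Name1", "Bar"),
    (3, "Name2", "Fizz"),
    (4, "Name2", "Buzz"),
    (5, "Name3", "Baz"),
]


def last_per_name(rows):
    """Keep only the last row seen for each name (second column)."""
    last = {}
    for row in rows:
        last[row[1]] = row  # a later row overwrites the earlier one
    return list(last.values())


for row in last_per_name(rows):
    print(*row)
```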
