Convert semi-structured data to structured data using Talend Big Data (ETL)

Employee
Employee Type : 0130
Unit : 4189670095711234
Basic Salary : 11.00
Joined Date : 04/12/yy 06:30:05
Country : 826-United Kingdom
(123.66) --- Endof Employee -------------
R 4567 ABCD -> Len f---- i 01/14
Employee
Employee Type : 0120
Unit : 4189670095711234
Basic Salary : 11.00
Joined Date : 04/12/yy 06:30:05
Country : 826-United Kingdom
(123.66)- --- Endof Employee ------------
R 4567 ABCD -> Len f---- i 01/14
Employee
Employee Type : 0130
Unit : 4189670095711235
Basic Salary : 11.00
Joined Date : 04/12/yy 06:30:05
Country : 826-United Kingdom
(123.66) --- Endof Employee -------------
Hi,
I would like to convert the semi-structured data above to structured data using Talend.
Please let me know how I can convert the data to a structured form so that I can insert it into a relational table.

Here is a solution based on the tPivotToColumnsDelimited component.
tFileInputDelimited is associated with a two-field schema (fields named "property" and "value") and uses a special field separator, " : " (space-colon-space). The Advanced settings options "Trim all columns" and "Check each row structure against schema" are ticked.
tMap assigns a rank to each input line, using a sequence whose name is the value of "property".
Because the sequence name is based on the property name, every file record belonging to the same employee gets the same rank value.
Finally, tPivotToColumnsDelimited moves all the input records with the same rank value onto a single line and, most importantly, associates each value with the right property.
Set "Pivot column" to "property", "Aggregation column" to "value", "Aggregation function" to "first" and "Group by" to "rank". Select the desired filename for the output and you will get the desired result.
Hope this helps.
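The job logic described above can be sketched outside Talend to make the ranking and pivot steps concrete. This is only an illustration in Python, not the Talend job itself: the function name pivot_records and the shortened SAMPLE text are hypothetical, and the per-property counter stands in for what is presumably a tMap expression along the lines of Numeric.sequence(row1.property, 1, 1) (the original screenshot is not available, so that expression is an assumption).

```python
from collections import defaultdict

# Two records taken from the question's sample file.
SAMPLE = """Employee
Employee Type : 0130
Unit : 4189670095711234
Basic Salary : 11.00
Joined Date : 04/12/yy 06:30:05
Country : 826-United Kingdom
(123.66) --- Endof Employee -------------
R 4567 ABCD -> Len f---- i 01/14
Employee
Employee Type : 0120
Unit : 4189670095711234
Basic Salary : 11.00
Joined Date : 04/12/yy 06:30:05
Country : 826-United Kingdom
(123.66)- --- Endof Employee ------------
"""


def pivot_records(text):
    """Turn 'property : value' lines into one dict per employee."""
    seen = defaultdict(int)   # one counter per property name (the "sequence")
    rows = defaultdict(dict)  # rank -> {property: value}
    for line in text.splitlines():
        parts = line.split(" : ")  # space-colon-space, so times like 06:30:05 survive
        if len(parts) != 2:
            continue  # skip lines that do not fit the two-field schema
        prop, value = (p.strip() for p in parts)
        seen[prop] += 1                           # rank of this line for its property
        rows[seen[prop]].setdefault(prop, value)  # "first" aggregation
    return [rows[rank] for rank in sorted(rows)]


records = pivot_records(SAMPLE)
for record in records:
    print(record)
```

Note that ranking by property name only lines up if every employee block contains every property exactly once; a block with a missing line would shift the later values of that property onto the wrong rank.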

Related

LATERAL VIEW explode function in Hive

I am trying to export data from Excel into a Hive table. One of the columns, 'ABC', has values like '1,2,3'.
I used the LATERAL VIEW explode function, but it does not do anything to my data.
Here is my code snippet:
CREATE TABLE table_name
(
  id string,
  brand string,
  data_name string,
  name string,
  address string,
  country string,
  flag string,
  sample_list array<string>
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH 'location' INTO TABLE table_name;
output sample:
id brand data_name name address country flag sample_list
19 1 ABC SQL ABC Cornstarch IN 1 ["[1,2,3]"]
Then I run:
select * from franchise_unsupress LATERAL VIEW explode(SEslist) SEslist as final_SE;
output sample:
id brand data_name name address country flag sample_list
19 1 ABC SQL ABC Cornstarch IN 1 [1,2,3]
I also tried:
select * from franchise_unsupress lateral view explode(split(SEslist,',')) SEslist AS final_SE ;
but got an error:
FAILED: ClassCastException org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
whereas what I need is:
id brand data_name name address country flag sample_list
19 1 ABC SQL ABC Cornstarch IN 1 1
19 1 ABC SQL ABC Cornstarch IN 1 2
19 1 ABC SQL ABC Cornstarch IN 1 3
Any help will be greatly appreciated. Thank you!
The problem is that the array is not parsed correctly: the whole field is loaded as a single-element array, ["[1,2,3]"]. It should be [1,2,3], or ["1","2","3"] if the column is array<string>.
When creating the table, specify the delimiter for collection items:
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
I wanted to provide my own answer.
The issue was with the input being provided. My input text file had [] around the value. Once the brackets were removed, it worked.
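The two answers fit together, and a small Python model (not Hive code) may make that clearer. It assumes the raw field in the file is the text [1,2,3]; parse_array and explode are hypothetical helper names that only imitate how a delimited text table splits collection items and how LATERAL VIEW explode fans a row out.

```python
def parse_array(field, item_sep):
    """Split a raw text field into array items on the collection delimiter."""
    return field.split(item_sep)


def explode(row, col):
    """Return one copy of the row per array item, as explode() would."""
    return [{**row, col: item} for item in row[col]]


raw = "[1,2,3]"

# Hive's default collection delimiter is the control character \002,
# which never occurs in the field, so the whole text stays one item.
default_sep = parse_array(raw, "\x02")

# With ',' as the collection delimiter the brackets stick to the outer items.
comma_sep = parse_array(raw, ",")

# With the brackets removed from the input as well, the items come out clean.
cleaned = parse_array(raw.strip("[]"), ",")

row = {"id": "19", "flag": "1", "sample_list": cleaned}
exploded = explode(row, "sample_list")

print(default_sep)  # ['[1,2,3]']
print(comma_sep)    # ['[1', '2', '3]']
for r in exploded:
    print(r)
```

The ClassCastException from the split() attempt is consistent with this picture: split() expects a string, while the column is already an array.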

How to query distinct column values when querying all row data

I have one MySQL table:
id  name   birthdate   city
1   Owen   2011/01/01  USA
2   Mark   2012/05/01  UK
3   Marry  2011/01/01  JP
4   John   2011/01/01  JP
First, I used jqGrid to read all the row data. Now I want to know how many different cities there are in the table where birthdate = 2011/01/01.
Can this be done without SQL, using only the jqGrid plugin?
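You are looking for the DISTINCT keyword (your_table stands for the actual table name):
SELECT DISTINCT city FROM your_table WHERE birthdate = '2011/01/01';
To get the number of different cities rather than the list, use COUNT(DISTINCT city) instead.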
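As a quick check of the query, here is a minimal sketch that runs it against the question's sample rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name people is an assumption, since the question does not name the table. The last lines show the same count computed from rows already loaded on the client, which is the idea behind doing it without SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, birthdate TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "Owen", "2011/01/01", "USA"),
        (2, "Mark", "2012/05/01", "UK"),
        (3, "Marry", "2011/01/01", "JP"),
        (4, "John", "2011/01/01", "JP"),
    ],
)

target = "2011/01/01"

# List of distinct cities for the given birthdate.
cities = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT city FROM people WHERE birthdate = ? ORDER BY city",
        (target,),
    )
]

# Number of distinct cities, computed by the database.
count = conn.execute(
    "SELECT COUNT(DISTINCT city) FROM people WHERE birthdate = ?", (target,)
).fetchone()[0]

# Same count without SQL, from rows that were already fetched.
all_rows = conn.execute("SELECT id, name, birthdate, city FROM people").fetchall()
client_count = len({city for _, _, birthdate, city in all_rows if birthdate == target})

print(cities, count, client_count)  # ['JP', 'USA'] 2 2
```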

SQLite query to determine gender by first name

I have two sqlite3 tables.
FND is a table of names and their likely gender, e.g.:
nm,gndr <-column names
Aliyah,F
Moses,M
Peter,M
Members is the second table, e.g.:
Fname,Lname <-column names
DAVID X, BAKER
MARY MIA,MCGEE
TINA HEATHER,JOHNSON
JIM PETER TOM, SANTINO
The Members table holds first and middle names together in the Fname column.
I am trying to write a query that lists the Fname column of the Members table along with a generated column indicating gender, based on the first word in Fname.
I tried this but it didn't work:
select m.fname,(select gndr from FND where upper(nm) like m.fname||'%')as gender
from Members m
Can anyone correct my SQL statement?
... upper(nm) like m.fname||'%'
Let's look at some example values:
nm: 'David'
fname: 'DAVID X'
SQL: 'DAVID' LIKE 'DAVID X%'
This obviously does not match.
You have to reverse the LIKE operands:
m.fname LIKE nm||'%'
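The corrected query can be tried with Python's built-in sqlite3 module. This is a sketch under some assumptions: the FND rows for David, Mary and Tina are made up so that some members actually match (the question's sample FND rows match none of its sample members), and Jim is left out on purpose to show the no-match case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FND (nm TEXT, gndr TEXT)")
conn.execute("CREATE TABLE Members (Fname TEXT, Lname TEXT)")
conn.executemany(
    "INSERT INTO FND VALUES (?, ?)",
    [
        ("Aliyah", "F"), ("Moses", "M"), ("Peter", "M"),  # from the question
        ("David", "M"), ("Mary", "F"), ("Tina", "F"),     # hypothetical extra rows
    ],
)
conn.executemany(
    "INSERT INTO Members VALUES (?, ?)",
    [
        ("DAVID X", "BAKER"),
        ("MARY MIA", "MCGEE"),
        ("TINA HEATHER", "JOHNSON"),
        ("JIM PETER TOM", "SANTINO"),
    ],
)

# Operands reversed as in the answer: the member's Fname must START WITH nm.
# SQLite's LIKE is case-insensitive for ASCII letters, so upper() is not needed.
query = """
SELECT m.Fname,
       (SELECT f.gndr FROM FND f WHERE m.Fname LIKE f.nm || '%') AS gender
FROM Members m
ORDER BY m.rowid
"""
result = conn.execute(query).fetchall()
for fname, gender in result:
    print(fname, gender)
```

'JIM PETER TOM' gets no gender here even though Peter is in FND, because the pattern only matches at the start of Fname. A plain prefix match would also let a short name such as Mary match a longer first name such as MARYANN, so a stricter comparison on the first word may be worth considering.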

Crystal Reports 2008 formula

Hi,
I am new to Crystal Reports and am having trouble creating a formula.
I have two tables:
tbl_Details:
Emp_id  Emp_name  Emp_Deptt
1       Ram       MMM
2       Naresh    NNN
3       kapil     HHH
4       Namita    DDD
tbl_Mapping:
Type1         Type2         Emp_Deptt
ButterScotch  ButterScotch  NNN
ButterScotch  Strawberry    DDD
Olive         Starch        MMM
Olive         Olive         HHH
Note: the Emp_Deptt column is common to both tables.
I want to create a formula like the following:
if Mapping.Type1 = Mapping.Type2, then look up Emp_Deptt in the Details table and get the Emp_name.
For example:
ButterScotch = ButterScotch, so Emp_Deptt NNN is matched against the Details table and the Emp_name is Naresh.
If no match is found, nothing should be done.
I want the formula to return this value so that I can place it somewhere in the Crystal report.
From the Database menu, select 'Database Expert'. Add tbl_Mapping and tbl_Details to the selected tables, and make sure the two tables are linked on their Emp_Deptt fields.
Then write the following in the formula editor:
if {tbl_Mapping.Type1} = {tbl_Mapping.Type2} then {tbl_Details.Emp_name}
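To see what the linked tables and the formula produce together, the same logic can be expressed as a join and run with Python's built-in sqlite3 module. This is only an illustration, not Crystal Reports code. It assumes the link behaves like an inner join on Emp_Deptt, and that Type1 in the first tbl_Mapping row is meant to read ButterScotch, as the question's own example implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Details (Emp_id INTEGER, Emp_name TEXT, Emp_Deptt TEXT)")
conn.execute("CREATE TABLE tbl_Mapping (Type1 TEXT, Type2 TEXT, Emp_Deptt TEXT)")
conn.executemany(
    "INSERT INTO tbl_Details VALUES (?, ?, ?)",
    [(1, "Ram", "MMM"), (2, "Naresh", "NNN"), (3, "kapil", "HHH"), (4, "Namita", "DDD")],
)
conn.executemany(
    "INSERT INTO tbl_Mapping VALUES (?, ?, ?)",
    [
        ("ButterScotch", "ButterScotch", "NNN"),  # spelling assumed, see above
        ("ButterScotch", "Strawberry", "DDD"),
        ("Olive", "Starch", "MMM"),
        ("Olive", "Olive", "HHH"),
    ],
)

# Join on the shared department column, keep only rows where Type1 = Type2.
matches = [
    r[0]
    for r in conn.execute(
        """
        SELECT d.Emp_name
        FROM tbl_Mapping m
        JOIN tbl_Details d ON d.Emp_Deptt = m.Emp_Deptt
        WHERE m.Type1 = m.Type2
        ORDER BY d.Emp_id
        """
    )
]
print(matches)  # ['Naresh', 'kapil']
```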

How to write formula to get data from two tables linked together by one column in crystal report

Hi,
I am new to Crystal Reports and am having trouble creating a formula.
I have two tables:
tbl_Details:
Emp_id  Emp_name  Emp_Deptt
1       Ram       MMM
2       Naresh    NNN
3       kapil     HHH
4       Namita    DDD
tbl_Mapping:
Type1         Type2         Emp_Deptt
ButterScotch  ButterScotch  NNN
ButterScotch  Strawberry    DDD
Olive         Starch        MMM
Olive         Olive         HHH
Note: the Emp_Deptt column is common to both tables.
I want to create a formula like the following:
if Mapping.Type1 = Mapping.Type2, then look up Emp_Deptt in the Details table and get the Emp_name.
For example:
ButterScotch = ButterScotch, so Emp_Deptt NNN is matched against the Details table and the Emp_name is Naresh.
If no match is found, nothing should be done.
I want the formula to return this value so that I can place it somewhere in the Crystal report.
You need to create a view first that joins the two tables, tbl_Details and tbl_Mapping,
linked on the department column (Emp_Deptt). Once that is done, add the view to the report and use a formula like this:
if totext({view.Type1}) = totext({view.Type2}) then
    {view.Employee_Name}
else
    "" // no match: return an empty string
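The view-based approach can be sketched with Python's built-in sqlite3 module. This is not Crystal Reports code: the view name emp_view and its column list are assumptions, the spelling ButterScotch in the first mapping row follows the question's own example, and the per-row conditional mirrors a formula that returns the name on a match and an empty string otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Details (Emp_id INTEGER, Emp_name TEXT, Emp_Deptt TEXT);
    CREATE TABLE tbl_Mapping (Type1 TEXT, Type2 TEXT, Emp_Deptt TEXT);
    INSERT INTO tbl_Details VALUES
        (1, 'Ram', 'MMM'), (2, 'Naresh', 'NNN'),
        (3, 'kapil', 'HHH'), (4, 'Namita', 'DDD');
    INSERT INTO tbl_Mapping VALUES
        ('ButterScotch', 'ButterScotch', 'NNN'),
        ('ButterScotch', 'Strawberry', 'DDD'),
        ('Olive', 'Starch', 'MMM'),
        ('Olive', 'Olive', 'HHH');

    -- Hypothetical view joining both tables on the department column.
    CREATE VIEW emp_view AS
    SELECT d.Emp_id, d.Emp_name AS Employee_Name, m.Type1, m.Type2
    FROM tbl_Mapping m
    JOIN tbl_Details d ON d.Emp_Deptt = m.Emp_Deptt;
    """
)

# One output value per view row: the name if Type1 = Type2, else an empty string.
formula_values = [
    name if type1 == type2 else ""
    for name, type1, type2 in conn.execute(
        "SELECT Employee_Name, Type1, Type2 FROM emp_view ORDER BY Emp_id"
    )
]
print(formula_values)  # ['', 'Naresh', 'kapil', '']
```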
