Pentaho Data Integration 8.2 - Timestamp blank on Excel Writer

I have a Table Input step pulling data from an Informix data warehouse, and one of the columns (DLV_TM) is of type DATETIME HOUR TO MINUTE.
I added a Select Values step after it and formatted the column to Timestamp with the h:mm:ss format.
In the Excel Writer content tab, the column has the Timestamp type and the h:mm:ss format.
The transformation runs successfully and the preview data even shows the correct time format, but when I open the Excel file, the column is blank.
Please help!

Related

Transfer xml data to Oracle table by column or fields by using talend

I am using Talend Studio with a tFileInputDelimited component connected by a row1 (Main) link to tOracleOutput. What I want is to transfer the data from an XML file to an Oracle table.
I want to transfer the values of the last two columns (product_label and email_order) of my file to the product table, which has this column structure (PRODUCT_ID, PRODUCT_CODE, PRODUCT_LABEL, EMAIL_COMMAND, ORDER_ID).
Also, I want to handle this condition: if a row in my file has an empty product code column, then do not insert that row's product_label and email_command values.
XML File to load
Product table
What are the proper settings in tFileInputDelimited, or do I need to use other components?
Use a tFileInputXML component, filter the records using tFilterRow, and then connect it to tOracleOutput.

How to specify the timestamp format when creating a table from an HDFS directory

I have the following CSV file located at path/to/file in my HDFS store.
1842,10/1/2017 0:02
7424,10/1/2017 4:06
I'm trying to create a table using the below command:
create external table t
(
number string,
reported_time timestamp
)
ROW FORMAT delimited fields terminated BY ','
LOCATION 'path/to/file';
I can see in the Impala query editor that the reported_time column in table t is always null. I guess this is due to the fact that my timestamp wasn't in an accepted timestamp format.
Question:
How can I specify that the timestamp column should be in the dd/MM/yyyy HH:mm format so that it correctly parses the timestamp?
You can't customize the timestamp format (as far as I know), but you can create the table with a string data type and then convert the string to a timestamp as below:
select number,
       reported_time,
       -- parse the string using its actual pattern, then render it in the default timestamp format
       from_unixtime(unix_timestamp(reported_time, 'dd/MM/yyyy HH:mm')) as reported_time_ts
from t;
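For completeness, the corresponding table definition would then declare reported_time as a string (a sketch adapted from the question's DDL; not verified against the original data):
create external table t
(
  number        string,
  reported_time string  -- keep the raw text and convert it at query time, as shown above
)
ROW FORMAT delimited fields terminated BY ','
LOCATION 'path/to/file';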

How to resolve date differences between Hive text file format and Parquet file format

We created an external Parquet table in Hive and inserted the existing text file data into it using insert overwrite.
However, we observed that dates from the existing text file do not match the dates in the Parquet files.
Data from the two files:
txt file date : 2003-09-06 00:00:00
parquet file date : 2003-09-06 04:00:00
Questions:
1) How can we resolve this issue?
2) Why are we getting this discrepancy in the data?
We faced a similar issue when Sqooping tables from SQL Server; in our case it was caused by a driver/jar issue.
When you do the insert overwrite, try using a cast for the date fields.
This should work; let me know if you face any issues.
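A minimal sketch of what that suggestion would look like (the table and column names here are hypothetical, and it assumes the Parquet table has the same layout as the text table):
-- cast the date field explicitly while copying from the text table into the Parquet table
INSERT OVERWRITE TABLE parquet_table
SELECT id,
       CAST(event_date AS timestamp) AS event_date
FROM text_table;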
Thanks for your help. We use both Beeline and the Impala query editor in Hue to access the data stored in the Parquet table, and the timestamp issue occurs when we query via Impala in Hue.
This is most likely related to a known difference in the way Hive and Impala handle timestamp values:
- when Hive stores a timestamp value in Parquet format, it converts local time to UTC, and when it reads the data back, it converts UTC back to local time.
- Impala, on the other hand, does no conversion when it reads the timestamp field, hence UTC time is returned instead of local time.
If your servers are located in the US Eastern time zone, this explains the +4h offset as follows:
- the timestamp 2003-09-06 00:00 in the example should be understood as EDT time (Sept. 06 falls within daylight saving time, hence the UTC-4h offset)
- +4h is added to the timestamp when it is stored by Hive
- the same offset is subtracted when it is read back by Hive, giving the correct value
- no correction is done when it is read back by Impala, thus showing 2003-09-06 04:00:00
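One way to compensate on the Impala side (a sketch, assuming the servers run in the US Eastern time zone; the table and column names are placeholders):
-- convert the UTC value written by Hive back to local time at query time
select id,
       from_utc_timestamp(event_ts, 'America/New_York') as event_ts_local
from parquet_table;
Impala also has a startup flag, -convert_legacy_hive_parquet_utc_timestamps=true, that makes it apply this conversion automatically when reading Hive-written Parquet files, at the cost of some read performance.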

Creating Date Dimension table using SSIS

I'm new to SSIS. I want to extract the Flight Date from Excel and store it in a 'Date' dimension using SSIS.
I googled and found ways to populate other dimension tables, but not 'Date'.
In my Excel source, the date is stored in the 12/31/2011 12:00:00 AM format. I want to extract the date (e.g. 12/31/2011) and then store it in a SQL table as:
Year
Month
Day
IsWeekEnd
Any hint or helpful link will be highly appreciated.
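One possible approach (not from the original thread; a T-SQL sketch in which the staging and dimension table names and the FlightDate column are hypothetical) is to load the raw dates into a staging table and derive the parts with date functions:
-- FlightDate is assumed to arrive from the Excel source as a datetime
-- datename(weekday, ...) returns English day names under the default language setting
insert into dbo.DimDate ([Date], [Year], [Month], [Day], IsWeekEnd)
select distinct
       cast(FlightDate as date),
       datepart(year,  FlightDate),
       datepart(month, FlightDate),
       datepart(day,   FlightDate),
       case when datename(weekday, FlightDate) in ('Saturday', 'Sunday')
            then 1 else 0 end
from dbo.StagingFlights;
The same expressions can alternatively be computed inside the data flow with a Derived Column transformation before writing to the dimension table.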

Informatica Date/Time Conversion

In one of our requirements, Informatica fetches data from a flat file source and inserts records into a temporary table temp in a DB2 database. The flat file has one column of datetime type (YYYY/MM/DD HH:MM:SS). However, Informatica reads this column as a string (since Informatica's date format differs from this column's and from DB2's). So before loading into the DB2 temp table, I need to convert this column back into a datetime format.
With an Expression transformation I can do this, but I don't know how. There is a TO_DATE conversion function (TO_DATE(FIELD, 'YYYY/MM/DD HH:MM:SS')), but it only takes care of the date part (YYYY/MM/DD); it does not handle the time (HH:MM:SS), and because of this the records are not being inserted into the temp table.
How can I convert the datetime from a string to the DB2 datetime format (YYYY/MM/DD HH:MM:SS)?
You tried to use the month format string (i.e. MM) for the minutes part of the date.
You need to use MI:
TO_DATE(FIELD, 'YYYY/MM/DD HH:MI:SS')
