Snowflake shows time value in UTC when data is inserted using DataStage - JDBC

I am using DataStage as an ETL tool to insert data into Snowflake. When a time value is inserted, Snowflake shows a difference of 7 hours.
For example, if the actual time value is 1:00:01, Snowflake shows 6:00:01.
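An offset like this usually comes down to which time zone Snowflake applies when the value is written and read. As a minimal sketch (assuming the offset is introduced by the Snowflake session time zone or timestamp mapping rather than by a DataStage transform; the parameter names below are standard Snowflake session parameters), you can inspect and pin these settings on the connection DataStage uses:

-- Inspect the current session time zone and the default timestamp mapping
SHOW PARAMETERS LIKE 'TIMEZONE' IN SESSION;
SHOW PARAMETERS LIKE 'TIMESTAMP_TYPE_MAPPING' IN SESSION;

-- Pin the session to UTC (or to the zone the source values are really in)
ALTER SESSION SET TIMEZONE = 'UTC';

-- Or map plain TIMESTAMP columns to TIMESTAMP_NTZ so no zone conversion is applied on load
ALTER SESSION SET TIMESTAMP_TYPE_MAPPING = 'TIMESTAMP_NTZ';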

Related

Delete from table is very slow in Oracle Standard Edition

Deletes on a table in Oracle Standard Edition (no partitioning) get slower over time.
Important Info: I am working on Oracle Standard Edition, so the partitioning option is not available.
Detail:
I have one table with no constraints on it (no PK, other keys, triggers, or indexes).
More than a million records are inserted into this table every 15 minutes using SQL*Loader.
We need to process each 15-minute batch of records every 15 minutes, and at the end of processing delete any record older than 30 minutes, so that at any point in time there is no more than 30-40 minutes of data in the table.
Problem:
As time passes, the frequent insertion and deletion make responses from the table slow.
Data extraction and deletes from the table take more time with every passing run.
After a while, even a simple SELECT query takes too long.
We can't truncate the table because the data loader runs continuously and we may lose data if we truncate; we also don't have CREATE TABLE privileges to drop and recreate the table.
We have to process the data every 15 minutes and make it available downstream for further processing, and it just keeps getting slower.
Kindly help me with the aforementioned situation.
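For reference, a minimal sketch of the rolling 30-minute delete described above, in plain Oracle SQL; the table and column names (staging_feed, load_time) are hypothetical, and the index is an assumption since the table currently has none (without it, the age-based delete has to scan the whole segment on every run):

-- Hypothetical index so the age-based delete can avoid a full scan
CREATE INDEX staging_feed_load_time_ix ON staging_feed (load_time);

-- Remove anything older than 30 minutes (30/1440 of a day)
DELETE FROM staging_feed
 WHERE load_time < SYSDATE - 30/1440;
COMMIT;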

Sqoop changes Date to Long when ingested data is saved as Avro data

I am using Sqoop to ingest data from Oracle into HDFS as Avro data. The date/timestamp column fields are changed to longs, so the values appear altered.
Example :
28-MAR-18 12.42.06.328000 PM changes to 1523401161454.
Any insights on the issue?
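Sqoop's Avro import typically represents Oracle DATE/TIMESTAMP columns as a long holding the epoch time in milliseconds, so the value is the same instant in a different encoding rather than corrupted data. A hedged sketch of reading it back in Hive (the table and column names avro_tbl and event_ts are hypothetical):

-- Divide by 1000 to go from milliseconds to seconds; sub-second precision is dropped
SELECT from_unixtime(CAST(event_ts / 1000 AS BIGINT)) AS event_time
FROM avro_tbl;

Alternatively, Sqoop's --map-column-java option can be used to import the column as a String so the original formatting is preserved.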

How to ingest data in real time from Oracle to Elasticsearch

I am using a loop in Scala to query an Oracle table every 10 seconds, since the Oracle table continuously receives inserts. I create a select request, then build n JSON strings containing n rows from Oracle that I push into Elasticsearch. After that I create a delete request to erase from the Oracle table the n rows that I have inserted into ES. This is a completely beginner approach, so can you suggest a better approach to load data from Oracle into ES in real time or in micro-batches and delete it from Oracle? I have heard about Logstash and StreamSets. Do you have any ideas? Thanks
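One pattern worth sketching here (not a full answer, and it assumes an extra batch_id column can be added; the names events, batch_id, id and payload are hypothetical) is a "claim then delete" micro-batch: mark a batch in Oracle before pushing it, so the later delete removes exactly the rows that were shipped to ES and nothing that arrived in between:

-- 1) Claim up to 1000 unclaimed rows for this batch
UPDATE events SET batch_id = 42 WHERE batch_id IS NULL AND ROWNUM <= 1000;
COMMIT;

-- 2) Read the claimed rows and push them to Elasticsearch from the application
SELECT id, payload FROM events WHERE batch_id = 42;

-- 3) Once ES has acknowledged the batch, delete only those rows
DELETE FROM events WHERE batch_id = 42;
COMMIT;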

Talend ETL Oracle error: 0 rows inserted

I am a newbie to Talend ETL and am using Talend Open Studio for Big Data version 6.2.
I have developed a simple Talend ETL job that picks up data from a tFileInputExcel and a tOracleInput (date dimension) and inserts the data into my local Oracle database.
Below is how my package looks:
This job runs, but I get 0 rows inserted into my local Oracle database.
Your picture shows that no rows come out of your tMap component. Verify that the links inside the tMap are correct.
It seems there is no data that matches between fgf.LIBELLE_MOIS and row2.B.

How to resolve date differences between Hive text file format and Parquet file format

We created an external Parquet table in Hive and inserted the existing text file data into it using INSERT OVERWRITE,
but we observed that the dates from the existing text file do not match the dates in the Parquet files.
Data from the two files:
txt file date : 2003-09-06 00:00:00
parquet file date : 2003-09-06 04:00:00
Questions:
1) How can we resolve this issue?
2) Why are we getting this discrepancy in the data?
We faced a similar issue when Sqooping tables from SQL Server; it was caused by a driver/jar issue.
When you are doing the INSERT OVERWRITE, try using CAST for the date fields.
This should work; let me know if you face any issues.
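A short sketch of what that CAST during the INSERT OVERWRITE could look like in Hive (the table and column names parquet_tbl, text_tbl, id and event_date are hypothetical):

-- Cast the date/timestamp columns explicitly while copying into the Parquet table
INSERT OVERWRITE TABLE parquet_tbl
SELECT id,
       CAST(event_date AS TIMESTAMP) AS event_date
FROM text_tbl;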
Thanks for your help..
I am using both Beeline and the Impala query editor in Hue to access the data stored in the Parquet table; the timestamp issue occurs when using an Impala query via Hue.
This is most likely related to a known difference in the way Hive and Impala handle timestamp values:
- when Hive stores a timestamp value in Parquet format, it converts local time into UTC time, and when it reads the data out, it converts back to local time.
- Impala, on the other hand, does no conversion when it reads the timestamp field, hence UTC time is returned instead of local time.
If your servers are located in the US Eastern time zone, this explains the +4h offset as follows:
- the timestamp 2003-09-06 00:00 in the example should be understood as EDT time (September 6 falls in daylight saving time, therefore UTC-4)
- +4h is added to the timestamp when it is stored by Hive
- the same offset is subtracted when it is read back by Hive, giving the correct value
- no correction is done when it is read back by Impala, thus showing 2003-09-06 04:00:00
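If that is the cause, one hedged workaround on the Impala side is to convert the value back at query time (the table and column names parquet_tbl and ts_col are hypothetical; use the zone your servers actually run in):

-- Impala returns the raw UTC value, so convert it back to Eastern time on read
SELECT from_utc_timestamp(ts_col, 'America/New_York') AS ts_local
FROM parquet_tbl;

Depending on the Impala version, the impalad startup flag convert_legacy_hive_parquet_utc_timestamps can also make Impala apply the same UTC-to-local adjustment that Hive does when reading these Parquet timestamps.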
