I am doing functionality testing (I need to write Hive code while referring to Scala code) in my project, and I am having an issue with the date functions in my code. In Scala we cast our date data type to string and changed its structure to 'YYYYMM', so the value inside my date column looks like 201706 (YYYYMM), which is not accepted in Hive (I read that it accepts only YYYY-MM-DD).
My questions are:
1) How do I change YYYYMM to YYYY-MM-DD? I have tried casting to date and also UNIX_TIMESTAMP; neither of them works, and the query fails in the end.
2) We are also using filter(to_date(colm1, "YYYYMM").between(add_months(to_date(colm2, "YYYYMM"), -27), add_months(to_date(colm2, "YYYYMM"), -2))) in our Scala code. How can I translate that to Hive? I am unable to come up with anything.
Thanks in advance.
Regards,
M Sontosh Aditya
use
unix_timestamp(DATE_COLUMN, string pattern)
For further details, please refer to the Hive documentation on date functions.
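To expand on this (a sketch, assuming colm1 holds strings like '201706'): in Hive you can round-trip through a Unix timestamp with from_unixtime(unix_timestamp(colm1, 'yyyyMM'), 'yyyy-MM-dd'), which parses the YYYYMM value and re-renders it as YYYY-MM-DD, defaulting to the first day of the month. The same format round-trip can be checked in plain Python:

```python
from datetime import datetime

def yyyymm_to_iso(s: str) -> str:
    """Parse a YYYYMM string (e.g. '201706') and re-render it as
    YYYY-MM-DD. The day defaults to the first of the month, which is
    also what Hive's unix_timestamp/from_unixtime round trip yields."""
    return datetime.strptime(s, "%Y%m").strftime("%Y-%m-%d")

print(yyyymm_to_iso("201706"))  # 2017-06-01
```

Recent Hive versions also provide an add_months function, which may cover the between filter in the second question, once the column has been converted to a proper date format.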
Related
I want to filter a parquet partitioned by date.
When I apply the filter
.filter(col('DATE')>= '2020-08-01')
It casts the DATE column to string when doing the filtering, as shown in the physical plan. I read that this is not efficient and results in a full file scan.
PartitionFilters: [isnotnull(DATE#5535), (cast(DATE#5535 as string) >= 2020-08-01)]
How do I cast the string as a date in the filter clause? All the examples on the internet suggest using to_date, but that works only on columns.
Is this possible, or even worth it?
Please advise.
Thank You
Try this -
import pyspark.sql.functions as F
.filter(F.expr("`DATE` >= to_date('2020-08-01', 'yyyy-MM-dd')"))
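Two notes worth adding (both assumptions to verify against your own physical plan): to_date does accept a literal if you wrap it with F.lit, e.g. F.col('DATE') >= F.to_date(F.lit('2020-08-01')), since lit() turns the literal into a column; and because yyyy-MM-dd strings are zero-padded with fields in most-significant-first order, comparing them as strings gives the same ordering as comparing the dates themselves. That ordering property is easy to confirm in plain Python:

```python
from datetime import date

# ISO-formatted (yyyy-MM-dd) date strings compare lexicographically in
# exactly the same order as the dates they represent, because each field
# is zero-padded and ordered most-significant first.
samples = ["2020-07-31", "2020-08-01", "2020-12-05", "2021-01-01"]

by_string = sorted(samples)
by_date = sorted(samples, key=date.fromisoformat)
print(by_string == by_date)  # True
```

So the string comparison in the plan still selects the right rows; the efficiency question is only whether the cast prevents partition pruning.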
I am working on a CLI app based on an ActiveRecord migration, and I am trying to give one of the table columns a t.date attribute so that users can check open availability during certain dates. The trouble I am having is how to format the date when I create a new instance of my class.
As of now, I just have the integer 2020 in there so I can test out my other methods, but I will need to get this sorted before I move on to the more advanced methods. I have tried the mm/dd/yyyy format in quotes and without, as well as mm-dd-yyyy and some other formats. I am not sure whether it should be a string or just integer values.
I hope I am asking my question correctly.
You should be able to create the date in a variety of different ways before passing it to your method, for example Date.new(2020, 8, 1) or Date.parse("2020-08-01").
See the Ruby Date and DateTime class documentation (Tadayoshi Funaba, 1998-2011) for more examples of ways to create a date object.
FILTER("source"."recordCount" USING "source"."snapshot_date" =
EVALUATE('TO_CHAR(%1, ''YYYYMMDD'')', TIMESTAMPADD(SQL_TSI_DAY, -7, EVALUATE('TO_DATE(%1, %2)', "source"."snapshot_date" , 'YYYYMMDD'))))
So I have this piece of code here. I know some will say "Just use the AGO function", but somehow that causes problems because of its connection with other tables, so what I'm trying to achieve here is a remake. The process goes this way:
The snapshot_date there is actually in varchar format, not date. So it's like "20131016", and I'm trying to change it to a date, subtract 7 days from it using the TIMESTAMPADD function, and then finally return it to varchar to use it with FILTER.
This snippet somehow works when testing the FILTER with hardcoded values like "20131016", but when tested with the code above, all the rows are blank. On paper, the process I assumed would happen goes like this: "20131016" is turned into a date in yyyymmdd format, then less 7 days gives 20131009, which is turned back into the string "20131009" to be used in the filter.
But somehow that doesn't happen. I think the date format is not being applied in either the string-to-date or the date-to-string conversion, which results in the values not getting a match at all.
Does anyone have any idea what's wrong with my code?
By the way, I've already tried CAST instead of EVALUATE, and TO_TIMEDATE, with the same result. Oh, and this goes in the formula of the column in the BMM.
Thanks
You might get some clues by looking at the SQL generated by the BI Server. I can't see any issues with your column expression, so I wouldn't limit your debugging to that alone.
A query returning nulls is often caused by incorrect levels being set (especially on logical table sources, but potentially on a measure column too). This will often result in some form of SELECT NULL FROM ... in the physical SQL.
Try this :
FILTER("source"."recordCount" USING "source"."snapshot_date" =
EVALUATE('TO_CHAR(%1, %2)', TIMESTAMPADD(SQL_TSI_DAY, -7, EVALUATE('TO_DATE(%1, %2)', TO_CHAR("source"."snapshot_date" , 'YYYYMMDD') , 'YYYYMMDD')) , 'YYYYMMDD'))
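The intent of that expression — parse a YYYYMMDD string, step back 7 days, re-render it as YYYYMMDD — can be sanity-checked outside OBIEE. A small Python sketch (purely to confirm the expected values, not OBIEE code):

```python
from datetime import datetime, timedelta

def shift_yyyymmdd(s: str, days: int = -7) -> str:
    """Parse a YYYYMMDD string, shift it by `days`, and re-render it as
    YYYYMMDD -- the same round trip the EVALUATE/TIMESTAMPADD expression
    performs on snapshot_date."""
    return (datetime.strptime(s, "%Y%m%d") + timedelta(days=days)).strftime("%Y%m%d")

print(shift_yyyymmdd("20131016"))  # 20131009
```

If the filter still returns blanks with a correct expression, the format is most likely being lost in one of the two EVALUATE conversions rather than in the arithmetic.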
I have a string like this: "2014-09-02T03:01:09.8093664Z", and I'm trying to convert it into the local timezone. I tried from_utc_timestamp(eventTime, 'GMT') and from_utc_timestamp(eventTime, 'PDT'), but Hive just returns an error:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"2014-09-02T03:01:09.8093664Z",
.
.
.
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating Converting field _col0 from UTC to timezone: 'PDT'
Am I doing something wrong here?
I searched stackoverflow and did not find a solution to this problem (Local Time Convert To UTC Time In Hive is related but doesn't solve the problem)
from_unixtime(unix_timestamp("2014-09-02T03:01:09Z", "yyyy-MM-dd'T'HH:mm:ss'Z'"), "yyyy-MM-dd HH:mm:ss")
This converts it to 2014-09-02 03:01:09.
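The same parse-and-reformat (ignoring the fractional seconds and treating the value as UTC) can be sketched in Python to double-check the pattern:

```python
from datetime import datetime

def iso_z_to_hive(ts: str) -> str:
    """Parse a UTC timestamp like '2014-09-02T03:01:09Z' (fractional
    seconds already stripped) and re-render it in Hive's default
    'yyyy-MM-dd HH:mm:ss' layout."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").strftime("%Y-%m-%d %H:%M:%S")

print(iso_z_to_hive("2014-09-02T03:01:09Z"))  # 2014-09-02 03:01:09
```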
A useful way to solve this problem is to create a UDF for this operation. It could be specific to this case, or generic enough to handle more datetime format conversions. Some benefits:
It makes your Hive query more readable
It avoids duplicated code if you need this operation in other queries
It makes your system easier to maintain, because you can update the method whenever you want
It delegates the complex operations to Java code, so you will be able to unit-test those complex parts
You can read more about how to create a custom UDF here.
If you need to know how to implement this method in Java, I found a Stack Overflow post that explains one way to do it; here is the entry.
You must first extract the date and time string in the proper format before you convert it. from_utc_timestamp requires the format 'yyyy-MM-dd HH:mm:ss'.
Use regexp_replace to extract the string, then pass the result to the from_utc_timestamp function, like this:
select from_utc_timestamp(regexp_replace(event_time, '^(\\d{4}-\\d{2}-\\d{2})T(\\d{2}:\\d{2}:\\d{2}).*', '$1 $2'), 'GMT') from my_table;
Your output is then: 2014-09-02 03:01:09
Good luck!
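The extraction step on its own (before any timezone conversion) can be checked with the same pattern in Python — note that Python uses \1 \2 for backreferences where Hive's regexp_replace uses $1 $2:

```python
import re

def extract_hive_timestamp(ts: str) -> str:
    """Strip the 'T' separator, fractional seconds, and trailing 'Z'
    from an ISO-8601 UTC timestamp, yielding Hive's
    'yyyy-MM-dd HH:mm:ss' layout."""
    return re.sub(r'^(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2}).*', r'\1 \2', ts)

print(extract_hive_timestamp("2014-09-02T03:01:09.8093664Z"))  # 2014-09-02 03:01:09
```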
Fairly straightforward... I have an EntityDataSource where, in the Select property, I'm pulling a variety of fields. One of them is a date that I would like returned in the "MM/dd/yyyy" format. How can I accomplish this?
You could use .ToString() with a format string:
DateTime time = EDS.Field;
Console.WriteLine(time.ToString("MM/dd/yyyy"));
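As an aside (Python rather than C#, purely to illustrate the same month/day/year pattern), the equivalent rendering with strftime uses %m/%d/%Y:

```python
from datetime import date

# Render a date in MM/dd/yyyy order, mirroring the .NET custom format
# string "MM/dd/yyyy" (month and day are zero-padded).
d = date(2014, 9, 2)
print(d.strftime("%m/%d/%Y"))  # 09/02/2014
```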
This solution works fine for me:
cast(it.[interview_start_date] as System.String)