How to use the min function without GROUP BY in Hive - hadoop

Consider the following Hive query.
SELECT
id,
name,
min(from_unixtime(unix_timestamp(), 'yyyy_MM_dd_HH_mm_ss')) as SYSDATE
FROM tablename
The reason I used the min function is that I wanted the same SYSDATE in all of my records. If I don't add min here, multiple SYSDATE values may appear.
I got an error running the query:
An exception was caught.
Error while compiling statement: FAILED: SemanticException [Error 10025]: Line 3:4 Expression not in GROUP BY key 'name'
So I added GROUP BY in my query and it worked.
SELECT
id,
name,
min(from_unixtime(unix_timestamp(), 'yyyy_MM_dd_HH_mm_ss')) as SYSDATE
FROM tablename
GROUP BY id, name
But what if I have twenty or more columns? Isn't it inconvenient to add them all to GROUP BY? And why should I add GROUP BY here? I just want a consistent SYSDATE all across the records. Is there any other way to make it work?

If performance is not a concern, try using a window function to calculate the min:
SELECT
id ,
name ,
min(from_unixtime(unix_timestamp(), 'yyyy_MM_dd_HH_mm_ss')) over(partition by 1) as SYSDATE
FROM tablename
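Another option, sketched here on the assumption that the cluster runs Hive 1.2 or later: current_timestamp is evaluated once at the start of the query, so every row sees the same value and no aggregate or GROUP BY is needed. date_format (also available from Hive 1.2) applies the same pattern as in the question.

```sql
-- Assumes Hive 1.2+; current_timestamp is fixed for the whole query,
-- so every row gets an identical SYSDATE string.
SELECT
  id,
  name,
  date_format(current_timestamp, 'yyyy_MM_dd_HH_mm_ss') AS SYSDATE
FROM tablename;
```

Unlike the window-function version, this does not shuffle all rows into one partition, so it should scale the same way with twenty columns as with two.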

Related

query to change the date format in the column

I have a report to create, but there's a little problem I can't solve because the date column I generate contains differing values. I use it in a subquery. My question is: can I apply a format so that I can adjust the value of the column? Please see the table below for reference.
My column(date) contains
date_columns
2019-06-20T11:09:15.674+00:00
2019-06-20T11:09:15.674+00:00
2019-06-20T11:09:15.674+00:00
2019-06-20T11:09:15.673+00:00
Now, my problem is that it returns ORA-01427: single-row subquery returns more than one row because of that 2019-06-20T11:09:15.673+00:00 value. Can I apply a format to make it look like 2019-06-20T11:09:15?
I tried the queries below but nothing changed. They returned the same error.
select distinct to_date(substr(dar.last_update_date,1,15),'YYYY-MM-DD HH:MI:SS')
select distinct to_date(dar.last_update_date,1,15,'YYYY-MM-DD HH:MI:SS')
Thanks!
2019-06-20T11:09:15.673+00:00 appears to be a string of a datetime in the official XML representation. We can turn it into an actual timestamp using to_timestamp_tz() and then cast the timestamp to a date:
select cast(
to_timestamp_tz('2019-06-20T11:09:15.673+00:00','YYYY-MM-DD"T"HH24:MI:SS.FFTZH:TZM')
as date)
from dual;
However, I'm not sure how this will resolve the ORA-01427: single-row subquery returns more than one row error. This exception occurs when we use a subquery like this …
where empno = ( select empno
from emp
where deptno = 30
and sal > 2300 )
… and the subquery returns more than one row because the WHERE clause is too lax. The solution is to fix the subquery's WHERE clause so it returns only one row (or use distinct in the subquery's projection if that's not possible).
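The two usual ways out can be sketched against the same emp table. Which one is right depends on what the outer query means; choosing MAX in the second variant is an arbitrary assumption made only for illustration.

```sql
-- Option 1: accept every row the subquery returns by using IN instead of =.
SELECT *
FROM emp
WHERE empno IN (SELECT empno
                FROM emp
                WHERE deptno = 30
                AND sal > 2300);

-- Option 2: force a single row with an aggregate
-- (MAX is an arbitrary pick here, not a recommendation).
SELECT *
FROM emp
WHERE empno = (SELECT MAX(empno)
               FROM emp
               WHERE deptno = 30
               AND sal > 2300);
```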

Oracle CLOB column and LAG

I'm facing a problem when I try to use LAG function on CLOB column.
So let's assume we have a table
create table test (
id number primary key,
not_clob varchar2(255),
this_is_clob clob
);
insert into test values (1, 'test1', to_clob('clob1'));
insert into test values (2, 'test2', to_clob('clob2'));
DECLARE
x CLOB := 'C';
BEGIN
FOR i in 1..32767
LOOP
x := x||'C';
END LOOP;
INSERT INTO test(id,not_clob,this_is_clob) values(3,'test3',x);
END;
/
commit;
Now let's do a select using non-clob columns
select id, lag(not_clob) over (order by id) from test;
It works fine as expected, but when I try the same with clob column
select id, lag(this_is_clob) over (order by id) from test;
I get
ORA-00932: inconsistent datatypes: expected - got CLOB
00932. 00000 - "inconsistent datatypes: expected %s got %s"
*Cause:
*Action:
Error at Line: 1 Column: 16
Can you tell me what's the solution of this problem as I couldn't find anything on that.
The documentation says the argument for any analytic function can be any datatype but it seems unrestricted CLOB is not supported.
However, there is a workaround:
select id, lag(dbms_lob.substr(this_is_clob, 4000, 1)) over (order by id)
from test;
This is not the whole CLOB but 4k should be good enough in many cases.
I'm still wondering what is the proper way to overcome the problem
Is upgrading to 12c an option? The problem has nothing to do with CLOB as such; it's the fact that Oracle has a hard limit of 4000 characters for strings in SQL. In 12c we have the option to use extended data types (provided we can persuade our DBAs to turn it on!).
Some features may not work properly in SQL when using CLOBs (such as DISTINCT, ORDER BY, GROUP BY, etc.). It looks like LAG is also one of them, but I couldn't find this anywhere in the docs.
If your values in the CLOB columns are always less than 4000 characters, you may use TO_CHAR
select id, lag( TO_CHAR(this_is_clob)) over (order by id) from test;
OR
convert it into an equivalent self join (may not be as efficient as LAG)
-- assumes the ids are consecutive, as in the sample data
SELECT a.id,
b.this_is_clob AS lagging
FROM test a
LEFT JOIN test b ON b.id = a.id - 1;
I know this is an old question, but I think I found an answer which eliminates the need to restrict the CLOB length and wanted to share it. Utilizing CTE and recursive subqueries, we can replicate the lag functionality with CLOB columns.
First, let's take a look at my "original" query:
WITH TEST_TABLE AS
(
SELECT LEVEL ORDER_BY_COL,
TO_CLOB(LEVEL) AS CLOB_COL
FROM DUAL
CONNECT BY LEVEL <= 10
)
SELECT tt.order_by_col,
tt.clob_col,
LAG(tt.clob_col) OVER (ORDER BY tt.order_by_col)
FROM test_table tt;
As expected, I get the following error:
ORA-00932: inconsistent datatypes: expected - got CLOB
Now, let's look at the modified query:
WITH TEST_TABLE AS
(
SELECT LEVEL ORDER_BY_COL,
TO_CLOB(LEVEL) AS CLOB_COL
FROM DUAL
CONNECT BY LEVEL <= 10
),
initial_pull AS
(
SELECT tt.order_by_col,
LAG(tt.order_by_col) OVER (ORDER BY tt.order_by_col) AS PREV_ROW,
tt.clob_col
FROM test_table tt
),
recursive_subquery (order_by_col, prev_row, clob_col, prev_clob_col) AS
(
SELECT ip.order_by_col, ip.prev_row, ip.clob_col, NULL
FROM initial_pull ip
WHERE ip.prev_row IS NULL
UNION ALL
SELECT ip.order_by_col, ip.prev_row, ip.clob_col, rs.clob_col
FROM initial_pull ip
INNER JOIN recursive_subquery rs ON ip.prev_row = rs.order_by_col
)
SELECT rs.order_by_col, rs.clob_col, rs.prev_clob_col
FROM recursive_subquery rs;
So here is how it works:
1) I create TEST_TABLE. This is only for the example, as you should already have this table somewhere in your schema.
2) I create a CTE of the data I want to pull, plus a LAG function on the primary key (or a unique column) of the table, partitioned and ordered in the same way as in my original query.
3) I create a recursive subquery using the initial row as the root and descending row by row, joining on the lagged column. It returns both the CLOB column from the current row and the CLOB column from its parent row.
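The same idea (apply LAG to the key, not to the CLOB) can also be sketched without recursion, using the test table from the question: compute the previous id analytically, then join back to fetch that row's CLOB. The names ids, prev_id and prev_clob are made up for this example.

```sql
-- LAG runs on the NUMBER key only, so the CLOB restriction never applies.
WITH ids AS (
  SELECT id,
         LAG(id) OVER (ORDER BY id) AS prev_id
  FROM test
)
SELECT i.id,
       p.this_is_clob AS prev_clob   -- NULL for the first row
FROM ids i
LEFT JOIN test p ON p.id = i.prev_id
ORDER BY i.id;
```

Because the join is on the lagged key, gaps in the id sequence are not a problem and the CLOB is returned in full.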

Print column2 from row with max(column1) without including column2 in group by clause

I know it is a silly question and may already be answered somewhere; please point me to the link if it is.
I want to print a column which is not included in the GROUP BY clause. Oracle says that it should be included in the GROUP BY expression, but I want the value to come from the same row from which the max() value for the other column was selected.
For example: if I have a table with following columns:
Employee_Name, Action_code, Action_Name
I want to see the name of the action with the maximum action_code for each employee; also, I cannot use a subquery in the condition.
I want something like this:
select employee_name, max(action_code), action_name --for max code
from emp_table
group by employee_name
The action_name in the select statement is causing the problem. If I add action_name to the GROUP BY clause, the query will show the action name for each action for each employee, which makes it meaningless.
Thanks for support
You can use a keep .. last pattern:
select employee_name,
max(action_code) as action_code,
max(action_name) keep (dense_rank last order by action_code) as action_name
from emp_table
group by employee_name
The documentation explains this more fully under the sister function first().
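For comparison, a sketch of the same result with an analytic ROW_NUMBER over an inline view (not a subquery in the WHERE condition). The alias rn is made up for this example.

```sql
-- Rank each employee's actions by action_code, highest first,
-- then keep only the top-ranked row per employee.
SELECT employee_name, action_code, action_name
FROM (
  SELECT employee_name,
         action_code,
         action_name,
         ROW_NUMBER() OVER (PARTITION BY employee_name
                            ORDER BY action_code DESC) AS rn
  FROM emp_table
)
WHERE rn = 1;
```

If two rows share the maximum action_code, ROW_NUMBER picks one of them arbitrarily, whereas the keep .. last version returns the greatest action_name among the tied rows.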

how to use or create temp table in Oracle

I am pretty new to Oracle.
I am stuck trying to implement the following logic. I am creating a SQL script in Oracle that will help me generate a report. This script will run twice a day, so it shouldn't pick up the same file when it runs the next time.
1) Run the query at 11 AM, save the result set, and store the Order Id in the temp table.
2) Run the query a second time at 3 PM, check the temp table, and return the result set that's not in the temp table.
The following query will generate the result set, but I am not sure how to create a temp table and validate against it when the query runs.
select
rownum as LineNum,
'New' as ActionCode,
ORDER_ID,
AmountType,
trunc(sysdate),
trunc(systime)
from crd.V_IVZ_T19 t19
where
(t19.acct_cd in
(select fc.child_acct_cd
from cs_config fc
where fc.parent_acct like 'G_TRI_RPT'))
and t19.date>= trunc(sysdate)
and t19.date<= trunc(sysdate);
Any help much appreciated. I am not sure how to get only the timestamp also.
A temp table is not the right tool here, because a temp table does not keep its data for long (only for the session); you just need to create a normal table. Hope this helps:
--- table for storing ORDER_IDs for later checking; use the correct data type. You can also add a date column to the table to expire old order_ids.
CREATE TABLE order_id_store (
order_id NUMBER,
end_date DATE
);
--- filling the table for further checking
INSERT INTO order_id_store
SELECT ORDER_ID, trunc(sysdate)
FROM crd.V_IVZ_T19 t19
WHERE t19.order_id NOT IN (SELECT DISTINCT order_id FROM order_id_store)
AND t19.date>= trunc(sysdate)
AND t19.date<= trunc(sysdate);
--- delete data that is no longer needed by date period, for example anything older than the last 2 days:
DELETE FROM order_id_store WHERE end_date <= trunc(SYSDATE - 2);
COMMIT;
---- report query: select only rows whose order_id is not already stored (run it before the fill step above, otherwise the orders just stored are excluded too)
SELECT
rownum as LineNum,
'New' as ActionCode,
ORDER_ID,
AmountType,
trunc(sysdate),
to_char(sysdate, 'HH24:MI:SS') -- time of day only
FROM crd.V_IVZ_T19 t19
WHERE
(t19.acct_cd in
(select fc.child_acct_cd
from cs_config fc
where fc.parent_acct like 'G_TRI_RPT'))
AND t19.order_id NOT IN (SELECT DISTINCT order_id FROM order_id_store)
AND t19.date>= trunc(sysdate)
AND t19.date<= trunc(sysdate);
I'm not sure about your "t19.date>=" and "t19.date<=" conditions, because together they only match rows dated exactly at midnight today; correct them if that is not what you intend.
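If the intent is "everything dated today", a half-open range is the usual pattern. The sketch below uses made-up sample rows (sample_orders and order_date are hypothetical names) so that it runs on its own.

```sql
-- Two sample rows: one from today at 09:00, one from yesterday at 15:00.
WITH sample_orders AS (
  SELECT 1 AS order_id, TRUNC(SYSDATE) + 9/24 AS order_date FROM dual
  UNION ALL
  SELECT 2, TRUNC(SYSDATE) - 1 + 15/24 FROM dual
)
SELECT order_id, order_date
FROM sample_orders
WHERE order_date >= TRUNC(SYSDATE)       -- from midnight today (inclusive)
  AND order_date <  TRUNC(SYSDATE) + 1;  -- up to midnight tomorrow (exclusive)
```

Only the row from today is returned, whatever its time of day.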

Oracle: Selecting * and aggregate column

Is it possible to select fields using the method below?
SELECT *, count(FIELD) FROM TABLE GROUP BY TABLE
I get the following error
ORA-00923: FROM keyword not found where expected
00923. 00000 - "FROM keyword not found where expected"
*Cause:
*Action:
Error at Line: 1 Column: 9
Is it a syntax error or do you have to explicitly define each column rather than using *?
You can't combine an unqualified * with other columns. If you qualify it with a table alias, you can:
SELECT t.*
, count(FIELD)
FROM TABLE t
Also, your GROUP BY TABLE is wrong. You can't group by the table name, you must specify some columns, like this:
SELECT t.customer
, count(FIELD)
FROM TABLE t
GROUP BY t.customer
Each column in the select list should be
an expression used as one of the GROUP BY criteria, or
an aggregate function, or
a literal value.
So you need to list the fields you need, and each must fit one of the criteria above.
SELECT FIELD1,FIELD2, COUNT(*) FROM TABLE1 GROUP BY FIELD1, FIELD2
If you insist on keeping the logic of your query, a subquery should help.
For example,
SELECT * FROM TABLE1 T1 INNER JOIN (SELECT FIELD1, COUNT(FIELD1) AS CountOfFIELD1 FROM TABLE1 T2 GROUP BY FIELD1) T3 ON T1.FIELD1 = T3.FIELD1
Instead of * you need to give the column names:
SELECT a, b, COUNT(FIELD)
FROM TABLE
GROUP BY a, b;
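If every column is needed next to the count, an analytic COUNT avoids GROUP BY altogether. This is a sketch using the TABLE1 and FIELD1 names from the answers above; the alias count_of_field1 is made up.

```sql
-- Each row keeps all of its columns and also carries the number of rows
-- that share its FIELD1 value (rows with a NULL FIELD1 get a count of 0).
SELECT t.*,
       COUNT(t.FIELD1) OVER (PARTITION BY t.FIELD1) AS count_of_field1
FROM TABLE1 t;
```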

Resources