Getting ParseException when running Hive query - hadoop

I'm trying to find the number of employees who are paid less than average wage.
I'm pretty new to Hive and struggling a bit. Could someone explain what's wrong with my statement and help me out, please?
My statement -
SELECT COUNT(*) FROM(SELECT wage, AVG(wage) AS avgWage FROM emp_wages) WHERE wage < avgWage;
The error -
ParseException line 1:82 cannot recognize input near 'where' 'wage' '<' in subquery source
Any help appreciated!

A syntax error. Derived table should be aliased.
SELECT COUNT(*)
FROM (SELECT wage, AVG(wage) AS avgWage FROM emp_wages group by wage) t --alias needed here
WHERE wage < avgWage;
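That fixes the parse error, but the query logic also needs a change: grouping by wage makes avgWage equal to wage in every group, so the filter never matches. Use a window function to attach the overall average to each row: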
select count(*)
from (SELECT wage, AVG(wage) over() AS avgWage
FROM emp_wages
) t
where wage < avgWage
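To try the window-function form without a Hive cluster, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is only a stand-in for Hive here (it needs SQLite 3.25 or newer for window functions), and the sample wages and the helper name count_below_average are made up for illustration.

```python
import sqlite3

def count_below_average(wages):
    """Count wages strictly below the overall average via AVG(...) OVER ()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp_wages (wage REAL)")
    conn.executemany("INSERT INTO emp_wages (wage) VALUES (?)",
                     [(w,) for w in wages])
    # The empty OVER () turns AVG into a window function over all rows,
    # so every row keeps its own wage next to the overall average.
    (n,) = conn.execute(
        """
        SELECT COUNT(*)
        FROM (SELECT wage, AVG(wage) OVER () AS avgWage FROM emp_wages) t
        WHERE wage < avgWage
        """
    ).fetchone()
    conn.close()
    return n

# Hypothetical data: the average is 40, so three wages fall below it.
print(count_below_average([10, 20, 30, 100]))
```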

SELECT COUNT(*)
FROM (SELECT wage, AVG(wage) AS avgWage FROM emp_wages group by wage)avg --group by needed
WHERE wage < avgWage;

The problem is that AVG is an aggregate function, so it collapses the rows into one. To compare every row against that single value, cross join the table with a one-row subquery that computes the average:
select
count(*), avg(v.wage),
sum(case when v.wage < v2.avgwage then 1 else 0 end) below_average
from
emp_wages v cross join (select avg(wage) as avgwage from emp_wages) as v2
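A rough way to check the cross join approach locally is the sketch below, again with Python's sqlite3 as a stand-in for Hive. The table aliases v and v2 follow the answer; the sample data and the function name wage_summary are assumptions for the example.

```python
import sqlite3

def wage_summary(wages):
    """Return (total rows, average wage, rows below average) in one pass."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp_wages (wage REAL)")
    conn.executemany("INSERT INTO emp_wages (wage) VALUES (?)",
                     [(w,) for w in wages])
    # v2 is a one-row derived table, so the cross join simply attaches
    # the overall average to every row of emp_wages.
    row = conn.execute(
        """
        SELECT COUNT(*) AS total,
               AVG(v.wage) AS avg_wage,
               SUM(CASE WHEN v.wage < v2.avgwage THEN 1 ELSE 0 END) AS below_average
        FROM emp_wages v
        CROSS JOIN (SELECT AVG(wage) AS avgwage FROM emp_wages) AS v2
        """
    ).fetchone()
    conn.close()
    return row

print(wage_summary([10, 20, 30, 100]))
```

Because the derived table has exactly one row, the join does not multiply the row count, which is why COUNT(*) still reports the real number of employees.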

The correct query would be:
select count(*) from emp_wages where wage < (select avg(wage) from emp_wages);
You are getting a parsing error because wage and avgWage are in the subquery.

Related

Oracle SQL -- Finding count of rows that match date maximum in table

I am trying to use a query to return the count from rows such that the date of the rows matches the maximum date for that column in the table.
Oracle SQL: version 11.2:
The following syntax would seem to be correct (to me), and it compiles and runs. However, instead of returning JUST the count for the maximum, it returns several counts, more or less as if the "HAVING" clause wasn't there.
Select ourDate, Count(1) as OUR_COUNT
from schema1.table1
group by ourDate
HAVING ourDate = max(ourDate) ;
How can this be fixed, please?
You can use:
SELECT MAX(ourDate) AS ourDate,
COUNT(*) KEEP (DENSE_RANK LAST ORDER BY ourDate) AS ourCount
FROM schema1.table1
or:
SELECT ourDate,
COUNT(*) AS our_count
FROM (
SELECT ourDate,
RANK() OVER (ORDER BY ourDate DESC) AS rnk
FROM schema1.table1
)
WHERE rnk = 1
GROUP BY ourDate
Which, for the sample data:
CREATE TABLE table1 (ourDate) AS
SELECT SYSDATE FROM DUAL CONNECT BY LEVEL <= 5 UNION ALL
SELECT SYSDATE - 1 FROM DUAL;
Both output:
OURDATE | OUR_COUNT
2022-06-28 13:35:01 | 5
I don't know if I understand what you want. Try this:
Select x.ourDate, Count(1) as OUR_COUNT
from schema1.table1 x
where x.ourDate = (select max(y.ourDate) from schema1.table1 y)
group by x.ourDate
One option is to use a subquery which fetches maximum date:
select ourdate, count(*)
from table1
where ourdate = (select max(ourdate)
from table1)
group by ourdate;
Or, a more modern approach (if your database version supports it; 11g doesn't, though):
select ourdate, count(*)
from table1
group by ourdate
order by ourdate desc
fetch first 1 rows only;
You can use this SQL query:
select MAX(ourDate),COUNT(1) as OUR_COUNT
from schema1.table1
where ourDate = (select MAX(ourDate) from schema1.table1)
group by ourDate;
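To see why the original HAVING clause filters nothing, it may help to run both forms side by side. The sketch below uses Python's sqlite3 as a stand-in for Oracle, with made-up dates stored as text. Inside HAVING, MAX(ourDate) is evaluated per group, and every row in a group shares the same ourDate, so the condition holds for every group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (ourDate TEXT)")
# Hypothetical sample: five rows on the latest date, one row a day earlier.
rows = [("2022-06-28",)] * 5 + [("2022-06-27",)]
conn.executemany("INSERT INTO table1 (ourDate) VALUES (?)", rows)

# MAX(ourDate) here is the maximum within each group, so nothing is filtered.
having_rows = conn.execute(
    """
    SELECT ourDate, COUNT(*) FROM table1
    GROUP BY ourDate
    HAVING ourDate = MAX(ourDate)
    ORDER BY ourDate
    """
).fetchall()

# The subquery computes the table-wide maximum before grouping.
subquery_rows = conn.execute(
    """
    SELECT ourDate, COUNT(*) FROM table1
    WHERE ourDate = (SELECT MAX(ourDate) FROM table1)
    GROUP BY ourDate
    """
).fetchall()
conn.close()

print(having_rows)    # one row per date
print(subquery_rows)  # only the latest date
```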

ORA-00979 Not a Group function error for query with User defined function in select statement

I have this query where a user defined function is added in the select and group by statement.
The inner select query without the WITH clause runs fine and doesn't give any error. But after adding WITH clause it gives the following error -
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 3 Column: 29
I need the WITH clause to return only a subset of the entire result set based on input ranges.
Query is as follows:
WITH INFO AS (
SELECT
GET_EVAULATED_VALUE(T.C_IMP, T.IMP) AS IMPORTANCE,
count(*) AS NO_OF_PC_AFFECTED
FROM TABLE_NAME T
WHERE T.ACNT_REL_ID = 16
GROUP BY
(GET_EVAULATED_VALUE(T.C_IMP, T.IMP))
ORDER BY IMPORTANCE desc
)
SELECT * FROM
(
SELECT ROWNUM AS RN,
(SELECT COUNT(*) FROM INFO) COUNTS,
IMPORTANCE
FROM INFO
)
WHERE RN > 0 AND RN <= 10;
I am not sure how to use a CTE with GROUP BY on a user-defined function. But I realized that I can rewrite the query to remove the subquery and the CTE and make it simpler, as follows (and it works):
select * from (
select a.*, ROWNUM rnum from
(SELECT
count(*) over() as COUNTS,
GET_EVAULATED_VALUE(T.C_IMP, T.IMP) AS IMPORTANCE,
count(*) AS NO_OF_PC_AFFECTED
FROM TABLE_NAME T
WHERE T.ACNT_RELATION_ID = 16
GROUP BY
(GET_EVAULATED_VALUE(T.C_IMP, T.IMP))
ORDER BY importance desc) a
where ROWNUM <= 10 )
where rnum >= 0;
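The rewrite relies on count(*) over() being evaluated after the GROUP BY, so it counts groups rather than source rows. A small sketch of that behaviour follows, using Python's sqlite3 as a stand-in for Oracle. To keep it short it groups on a plain importance column instead of a user-defined function and uses LIMIT instead of ROWNUM; the data and the function name first_page are invented for the example.

```python
import sqlite3

def first_page(importances, page_size):
    """Return (total groups, importance, rows in group) for the top groups."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (importance INTEGER)")
    conn.executemany("INSERT INTO table_name (importance) VALUES (?)",
                     [(i,) for i in importances])
    # COUNT(*) is the per-group aggregate; COUNT(*) OVER () runs on the
    # grouped result, so it reports how many groups exist in total.
    rows = conn.execute(
        """
        SELECT COUNT(*) OVER () AS counts,
               importance,
               COUNT(*) AS no_of_pc_affected
        FROM table_name
        GROUP BY importance
        ORDER BY importance DESC
        LIMIT ?
        """,
        (page_size,),
    ).fetchall()
    conn.close()
    return rows

# Three distinct importance values, first page of two groups.
print(first_page([3, 3, 2, 1, 1, 1], 2))
```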
Same issue here, I created a table "TABLE_CTE" instead of using a CTE and it worked.
CREATE TABLE TABLE_CTE
AS
SELECT
USER_DEFINED_FUNCTION(date_1),
COUNT(*)
FROM
TABLE_NAME
GROUP BY
USER_DEFINED_FUNCTION(date_1)
;
SELECT * FROM TABLE_CTE

FROM keyword not found where expected while filtering data

I got one error
ORA-00923: FROM keyword not found where expected
Here is my query
SELECT * FROM
(SELECT *, (SELECT COUNT(*) FROM invoices) AS numberOfRows
FROM invoices ORDER BY Id DESC) WHERE rownum <= 1
I am a beginner in Oracle SQL, but as far as I can see the FROM keyword is there and everything looks OK.
I tried modifying the query like this, but I still get another error:
ORA-00933: SQL command not properly ended
SELECT * FROM
(SELECT COUNT(*) FROM invoices) AS numberOfRows
FROM invoices ORDER BY Id DESC) WHERE rownum <= 1
What is wrong with the first SELECT query? What is missing? I have checked everything, starting with the special characters ( . , )
I also tried this kind of solution and got this error:
ORA-00936: missing expression
SELECT * FROM (SELECT , (SELECT COUNT() FROM invoices) AS numberOfRows FROM invoices ORDER BY Id DESC) WHERE rownum <= 1
The railroad diagram in the documentation shows that you can either use * on its own, or <something>.* along with other columns or expressions. So you need to precede your * with the table name or an alias:
SELECT * FROM
(SELECT i.*, (SELECT COUNT(*) FROM invoices) AS numberOfRows
FROM invoices i ORDER BY Id DESC) WHERE rownum <= 1
If you're on a recent version of Oracle you can do this much more simply with:
select i.*, count(*) over () as numberOfRows
from invoices i
order by id desc
fetch first row only
On older versions you still need a subquery, but only one level:
select *
from (
select i.*, count(*) over () as numberOfRows
from invoices i
order by id desc
)
where rownum = 1
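The "latest row plus total row count" idea can be tried out with the sketch below, using Python's sqlite3 as a stand-in for Oracle. Note the assumptions: SQLite uses LIMIT rather than FETCH FIRST or ROWNUM, it does not reproduce the ORA-00923 error (it accepts a bare * next to other columns), and the invoices columns and data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO invoices (id, amount) VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 30.0)])

# The window count is computed over all rows before LIMIT trims the result,
# so the single returned row still carries the total number of invoices.
latest = conn.execute(
    """
    SELECT i.*, COUNT(*) OVER () AS numberOfRows
    FROM invoices i
    ORDER BY id DESC
    LIMIT 1
    """
).fetchone()
conn.close()

print(latest)
```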
looks like the FROM is missing from this select "SELECT *,"
SELECT * FROM
(SELECT , (SELECT COUNT() FROM invoices) AS numberOfRows
FROM invoices ORDER BY Id DESC) WHERE rownum <= 1

Not a single-group group function on a case count in oracle

I'm trying to adapt a query that works in MSSQL to Oracle, the query is much bigger (this part is just a field from a much bigger query) but I managed to reduce it so it looks simpler.
SELECT CASE WHEN COUNT(*) > 0 THEN COUNT(*)
ELSE (SELECT COUNT(*) FROM table2)
END
FROM table1
The error I'm getting is:
ORA-00937: not a single-group group function
Can someone tell me where's the problem or how can I redefine it?
You can try with this query:
SELECT CASE WHEN (SELECT COUNT(*) FROM table1) > 0 then (SELECT COUNT(*) FROM table1)
ELSE (SELECT COUNT(*) FROM table2)
END
FROM dual;
It is still ugly but it works :)
Update:
To explain how it works, we have two cases:
- If there are records in table1, return how many there are.
- If table1 is empty, return the number of records in table2.
DUAL is Oracle's one-row dummy table, so the outer query returns exactly one row.
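The two cases can be checked with a short sketch, using Python's sqlite3 as a stand-in for Oracle. SQLite lets a SELECT omit the FROM clause, so DUAL is not needed there; the table contents and the function name count_with_fallback are assumptions for illustration.

```python
import sqlite3

def count_with_fallback(n_table1, n_table2):
    """Return table1's row count, or table2's count when table1 is empty."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (x INTEGER)")
    conn.execute("CREATE TABLE table2 (x INTEGER)")
    conn.executemany("INSERT INTO table1 (x) VALUES (?)",
                     [(i,) for i in range(n_table1)])
    conn.executemany("INSERT INTO table2 (x) VALUES (?)",
                     [(i,) for i in range(n_table2)])
    # Each count is its own scalar subquery, so no aggregate is mixed
    # with a non-aggregated expression in the outer select list.
    (n,) = conn.execute(
        """
        SELECT CASE WHEN (SELECT COUNT(*) FROM table1) > 0
                    THEN (SELECT COUNT(*) FROM table1)
                    ELSE (SELECT COUNT(*) FROM table2)
               END
        """
    ).fetchone()
    conn.close()
    return n

print(count_with_fallback(3, 5))  # table1 has rows
print(count_with_fallback(0, 5))  # table1 is empty, falls back to table2
```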
I think that NikNik's answer is cleaner, but another solution would be:
SELECT *
FROM (SELECT CASE
WHEN Count(*) > 0 THEN Count(*)
ELSE (SELECT Count(*)
FROM table2)
END
FROM table1
GROUP BY table1.primarykey1,
table1.primarykey2)
WHERE ROWNUM = 1

ORA-00937 error during select subquery

I am attempting to write a query that returns the number of employees, the average salary, and the number of employees paid below the average.
The query I have so far is:
select trunc(avg(salary)) "Average Pay",
count(salary) "Total Employees",
(
select count(salary)
from employees
where salary < (select avg(salary) from employees)
) UnderPaid
from employees;
But when I run this I get the ORA-00937 error in the subquery.
I had thought that maybe the "count" function was causing the issue, but even running a simpler subquery such as:
select trunc(avg(salary)) "Average Pay",
count(salary) "Total Employees",
(
select avg(salary) from employees
) UnderPaid
from employees;
still returns the same error. As both AVG and COUNT seem to be aggregate functions, I'm not sure why I'm getting the error?
Thanks
When you use a scalar subquery, which is a subquery in the select list, it should return only one row.
In general, a subquery can return multiple rows. So when you use one in the select list alongside an aggregate function, you should wrap it in an aggregate function that doesn't change its value, such as max().
select count(*), (select count(*) from emp) from emp
-- ERROR. Oracle doesn't know that the subquery returns only 1 row.
select count(*), max((select count(*) from emp)) from emp
-- You know that the subquery returns 1 row, so applying max() gives the same result.
Or you can rewrite the query like this:
select avg(salary), count(*), count(case when salary < sal_avg then 1 end)
from (select salary, avg(salary) over () sal_avg from emp);
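For anyone who wants to sanity-check the rewritten query, here is a minimal sketch using Python's sqlite3 as a stand-in for Oracle. It will not reproduce ORA-00937, since SQLite accepts the original form; it only confirms the numbers the rewrite produces. The salaries and the function name salary_summary are made up.

```python
import sqlite3

def salary_summary(salaries):
    """Return (average salary, employee count, count paid below average)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (salary REAL)")
    conn.executemany("INSERT INTO emp (salary) VALUES (?)",
                     [(s,) for s in salaries])
    # The inner query tags each row with the overall average; the outer
    # COUNT(CASE ...) counts only rows where the CASE yields a non-NULL value.
    row = conn.execute(
        """
        SELECT AVG(salary),
               COUNT(*),
               COUNT(CASE WHEN salary < sal_avg THEN 1 END)
        FROM (SELECT salary, AVG(salary) OVER () AS sal_avg FROM emp)
        """
    ).fetchone()
    conn.close()
    return row

# Average is 3000, so the 1000 and 2000 salaries are below it.
print(salary_summary([1000, 2000, 3000, 6000]))
```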
ntalbs' answer works (thanks, ntalbs!), but see question "ORA-00937: Not a single-group group function - Query error" for a more complete explanation if you want one.
