I have a pivot table that I need to have return NULL whenever there is not a value in the database. This seems pretty simple when you look at the Google API for their query language:
http://code.google.com/apis/visualization/documentation/querylanguage.html#Pivot
(source: cjweed.com)
(source: descentcampaign.com)
The Google example is the first image. Notice how it contains null, these null values are from data that does not exist in the original table. When I run the sum(vals) for the pivot to work, it always returns 0, even if it doesn't exist.
Also, my sql statement is:
SELECT weekNumber, sum(value) FROM allval
WHERE weekNumber != 0 AND custId != 28
GROUP BY weekNumber
PIVOT custId
ORDER BY weekNumber
Notice the custId != 28. It still appears in my result table, except everything is set to 0.
Anyone know how the Google example in the link was able to maintain the NULL values even through the aggregate function sum?
Related
i'm with a problem in a query.
I have a table called "store" that I need to query.
Select s.store_name
from store s
where s.store = case
when p_store != 0 then
p_store
else
s.store
end;
This should work but as I have "stores" with characters (-) in column and this is defined as number, that query raise an exception: ORA-01722: invalid number.
So I want to do something like this in query:
IF p_store != 0 then
select store_name from store where store = p_store
else
select store_name from store;
Is it possible?
Thanks!
EDIT:
The query that I wrote above was an example of the query I was running.
The exception was raised because another column (too much hours in front of PC :-( ).
This table have a column that's varchar2(15) and I was doing this condition:
(...)AND S.CODE > 4 (...)
The correct condition that I want to do is:
(...)AND LENGTH(S.CODE) > 4
Thank you all!
As far as I understand you have varchar2 s.store which in fact contains numbers so Oracle tries to compare it casting to numbers but at some point it gets - and throws an exception. What you should do is update on table replacing - by null. But if you don't want to do that you can try to make case return varchar2
Select s.store_name
from store s
where s.store = case
when to_char(p_store) != '0' then
to_char(p_store)
else
s.store
end;
I am guessing you are running the query in a plsql block (as cursor?). In that case dynamic sql is the way to go. check out REF CURSOR too (google!).
I have a table with 3 columns:
table1: ID, CODE, RESULT, RESULT2, RESULT3
I have this SAS code:
data table1
set table1;
BY ID, CODE;
IF FIRST.CODE and RESULT='A' THEN OUTPUT;
ELSE IF LAST.CODE and RESULT NE 'A' THEN OUTPUT;
RUN;
So we are grouping the data by ID and CODE, and then writing to the dataset if certain conditions are met. I want to write a hive query to replicate this. This is what I have:
proc sql;
create table temp as
select *, row_number() over (partition by ID, CODE) as rowNum
from table1;
create table temp2 as
select a.ID, a.CODE, a.RESULT, a.RESULT2, a.RESULT3
from temp a
inner join (select ID, CODE, max(rowNum) as maxRowNum
from temp
group by ID, CODE) b
on a.ID=b.ID and a.CODE=b.CODE
where (a.rowNum=1 and a.RESULT='A') or (a.rowNum=b.maxRowNum and a.RESULT NE 'A');
quit;
There are two issues I see with this.
1) The row that is first or last in each BY group is entirely dependant on the order of rows in table1 in SAS, we aren't ordering by anything. I don't think row order is preserved when translating to a hive query.
2) The SAS code is taking the first row in each BY GROUP or the last, not both. I think that my HIVE query is taking both, resulting in more rows than I want.
Any suggestions or insight on how to improve my query is appreciated. Is it even possible to replicate this SAS code in HIVE?
The SAS code has a by statement (BY ID CODE;), which tells SAS that the set dataset is sorted at those levels. So, not a random selection for first. and last..
That said, we can replicate this in HIVE by using the first_value and last_value window functions.
FIRST.CODE should replicate to
first_value(code) over (partition by Id order by code)fcode
Similarly, LAST.CODE would be
last_value(code) over (partition by Id order by code)lcode
Once you have the fcode and lcode columns, use case when statements for the result column criteria. Like,
case when (code=fcode and result='A') or (code=lcode and result<>'A')
then 1 else 0 end as op_flag
Then the fetch the table with where op_flag = 1
SAMPLE
select id, code, result from (
select *,
first_value(code) over (partition by id order by code)fcode,
last_value(code) over (partition by id order by code)lcode
from footab) f
where (code=fcode and result='A') or (code=lcode and result<>'A')
Regarding point 1) the BY group processing requires the input data to be sorted or indexed on BY variables, so though the code contains no ordering, the source data is processed in order. If the input data was not indexed/sorted, SAS will throw error.
Regarding this, possible differences are on rows with same values of BY variables, especially if the RESULT is different.
In SAS, I would pre-sort data by ID, CODE, RESULT, then use BY ID CODE in order to not be influenced by order of rows.
Regarding 2) FIRST and LAST can be both true in SAS. Since your condition for first and last on RESULT is different, I guess this is not a source of differences.
I guess you could add another field as
row_number() over (partition by ID, CODE desc) as rowNumDesc
to detect last row with rowNumDesc = 1 (so that you skip the join).
EDIT:
I think the two programs above both include random selection of rows for groups with same values of ID and CODE variables, especially with same values of RESULT. But you should get same number of rows from both. If not, just debug it.
However the random aspect in SAS code/storage is based on physical order of rows, while the ROW_NUMBERs randomness within a group will be influenced by the implementation of the function in the engine.
I need a query to return boolean when there's table has data in the given range.
Assume table
Customer
[User ID, Name, Date, Products_Purchased]
I'm trying to do:
select case when exists(
select Date, count(*)
from Customer
where date between '2015-08-03' and '2015-08-05'
)
then cast(1 as BIT)
else case(0 as BIT)end;
This is throwing an error near "select Date".
However, weird part is the inner query is running perfectly fine.
Im wondering if im missing out something here !
What about something more straightforward e.g.
select case when count(*) >0 then 1 else 0 end as HIT
from ... where ...
That way you don't have to bother about Hive assuming that EXISTS implies a correlated sub-query, automagically translated into a MapJoin, i.e. a Java HashMap shuffled to the 2nd line of Mappers jobs, etc. Not exactly your use case.
Then it's not useful to compute the exact count, so the query could be refined as
select case when count(*) >0 then 1 else 0 end as HIT
from
(select ... from ... where ... limit 1) X
[Edit] There is no "bit" datatype in Hive. But the default "int" should be OK if you just want a return flag (zero / non-zero)
I have a query that in the select statement uses a custom built function to return one of the values.
The problem I have is every now and then this function will error out because it returns more than one row of information. SQL Error: ORA-01422: exact fetch returns more than requested number of rows
To further compound the issue I have checked the table data within the range that this query should be running and can't find any rows that would duplicate based on the where clause of this Function.
So I would like a quick way to identify on which Row of the original query this crashes so that I can take the values from that query that would be passed into the function and rebuild the Functions query with these values to get it's result and see which two or more rows are returned.
Any ideas? I was hoping there could be a way to force Oracle to process one row at a time until it errors so you can see the results UP to the first error.
Added the code:
FUNCTION EFFPEG
--Returns Effective Pegged Freight given a Effdate, ShipTo, Item
DATE1 IN NUMBER -- Effective Date (JULIANDATE)
, SHAN IN NUMBER -- ShipTo Number (Numeric)
, ITM IN NUMBER -- Short Item Number (Numeric)
, AST IN VARCHAR -- Advance Pricing type (varchar)
, MCU IN VARCHAR Default Null --ShipFrom Plant (varchar)
) RETURN Number
IS
vReturn Number;
BEGIN
Select ADFVTR/10000
into vReturn
from PRODDTA.F4072
where ADEFTJ <= DATE1
and ADEXDJ >= DATE1
and ADAN8 = SHAN and ADITM = ITM
and TRIM(ADAST) = TRIM(AST)
and ADEXDJ = (
Select min(ADEXDJ) ADEXDJ
from PRODDTA.F4072
where ADEFTJ <= DATE1
and ADEXDJ >= DATE1
and ADAN8 = SHAN
and ADITM = ITM
and TRIM(ADAST) = TRIM(AST));
Query that calls this code and passes in the values is:
select GLEXR, ORDTYPE,
EFFPEG(SDADDJ, SDSHAN, SDITM, 'PEGFRTT', SDMCU),
from proddta.F42119
I think the best way to do it is trough Exceptions.
What you need to do is to add the code to handle many rows exception in your function:
EXCEPTION
WHEN TOO_MANY_ROWS THEN
INSERT INTO ERR_TABLE
SELECT your_columns
FROM query_that_sometimes_returns_multiple_rows
In this example the doubled result will go to separated table or you can decide to simply print out with dbms_output.
An easy page to start can be this, then just google exception and you should be able to find all you need.
Hope this can help.
I have two tables seatinfo(siid,seatno,classid,tsid) and booking (bookid,siid,date,status).
I've input parameter bookDate,v_tsId ,v_clsId. I need exactly one row (bookid) to return. This query is not working. I don't no why. How can I fix it?
select bookid
into v_bookid
from booking
where (to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
and status=0
and rownum <= 1
and siid in(select siid
from seatinfo
where tsid=v_tsId
and classid= v_clsId);
I also tried this:
select bookid
into v_bookid
from booking,
seatinfo
where booking.siid=seatinfo.siid
and (to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
and booking.status=0
and rownum <= 1
and seatinfo.tsid=v_tsId
and seatinfo.classid= v_clsId;
Are you saying that you get an "ORA-01422: exact fetch returns more than requested number of rows" when you run both of those queries? That seems highly unlikely since you're including the predicate rownum <= 1. Can you cut and paste from a SQL*Plus session that runs just this query in a PL/SQL block and generates the error?
If you are not complaining about the error you mention in the title, and the problem is just that you're not getting the data you expect, the likely problem is that you apparently have a bookDate parameter that has the same name as a column in your table. That is not going to work. When you say
(to_char(booking.bookdate,'dd-mon-yy'))=(to_char(bookDate,'dd-mon-yy'))
you presumably mean to compare the bookDate column in the booking table against the bookDate parameter. But since column names have precedence over local variables, the left-hand side of your expression is also looking at the bookDate column in the booking table. So you're comparing a column to itself. It would make much more sense to change the name of the parameter (to, say, p_bookDate) and then write
booking.bookDate = p_bookDate
or, if you want to do the comparison ignoring the time component of the dates
trunc( booking.bookDate ) = trunc( p_bookDate )