Summation of a field in specified date range - oracle

I need help with the following.
How do I sum a field over a specified date range? I am new to Oracle.

Try something like this:
SELECT FIELD1, FIELD2, TRUNC(MIN(FIELD3)), TRUNC(MAX(FIELD3)), SUM(FIELD4)
FROM SOME_TABLE
WHERE FIELD3 BETWEEN DATE '2013-02-01'
AND DATE '2013-02-04' + INTERVAL '1' DAY - INTERVAL '1' SECOND
GROUP BY FIELD1, FIELD2
ORDER BY MIN(FIELD3);
Share and enjoy.

The questions and answers in the comments on your original post show that you are actually looking for two different selections: one for the first date range and one for the overlapping second date range. You just want all the result records in a single result set. You can use UNION for that:
select field1, field2, min(trunc(field3)) || '-' || max(trunc(field3)), sum(field4)
from yourtable
where to_char(field3, 'yyyymmdd') between '20130201' and '20130204'
group by field1, field2
UNION
select field1, field2, min(trunc(field3)) || '-' || max(trunc(field3)), sum(field4)
from yourtable
where to_char(field3, 'yyyymmdd') between '20130203' and '20130207'
group by field1, field2
order by 1, 2, 3;

Using hard-coded dates is a bit odd, as is the way you're making your ranges (and your field4 value appears to be wrong in your sample output), but assuming you know what you want...
You can use a case expression to assign a dummy group number to the rows based on the dates, and then have an outer query that uses group by against that dummy field, which I've called gr:
select field1, field2,
to_char(min(field3), 'MM/DD/YYYY')
||'-'|| to_char(max(field3), 'MM/DD/YYYY') as field3,
sum(field4) as field4
from (
select field1, field2, field3, field4,
case when field3 between date '2013-02-01'
and date '2013-02-05' - interval '1' second then 1
when field3 between date '2013-02-05'
and date '2013-02-08' - interval '1' second then 2
end as gr
from t42
)
group by field1, field2, gr
order by field1, field2, gr;
F     FIELD2 FIELD3                    FIELD4
- ---------- --------------------- ----------
A          1 02/01/2013-02/04/2013         14
A          1 02/05/2013-02/07/2013         21
The display of field3 will look wrong if there is no data for one of the boundary days, but I'm not sure that's the biggest problem with this approach *8-)
You can potentially modify the case to have more generic groups, but I'm not sure how this will be used.
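The bucketing idea can be tried outside Oracle. The following is only a sketch, using Python's built-in sqlite3 module as a stand-in database: the sample rows are invented (one row per day with field4 equal to the day number), dates are stored as ISO strings, and half-open comparisons replace the `- interval '1' second` trick. It is meant to show the shape of the query, not the original poster's data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t42 (field1 TEXT, field2 INTEGER, field3 TEXT, field4 INTEGER)"
)

# Invented sample data: one row per day, 2013-02-01 .. 2013-02-07.
rows = [("A", 1, f"2013-02-0{d}", d) for d in range(1, 8)]
conn.executemany("INSERT INTO t42 VALUES (?, ?, ?, ?)", rows)

query = """
SELECT field1, field2,
       MIN(field3) || ' to ' || MAX(field3) AS date_range,
       SUM(field4) AS total
FROM (
    SELECT field1, field2, field3, field4,
           -- half-open ranges: start inclusive, end exclusive
           CASE WHEN field3 >= '2013-02-01' AND field3 < '2013-02-05' THEN 1
                WHEN field3 >= '2013-02-05' AND field3 < '2013-02-08' THEN 2
           END AS gr
    FROM t42
)
WHERE gr IS NOT NULL          -- drop rows outside both ranges
GROUP BY field1, field2, gr
ORDER BY field1, field2, gr
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this made-up data the first bucket sums days 1 to 4 and the second sums days 5 to 7, so each row of the result covers one date range.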

In a comment you say you specify two groups of dates which do not overlap. This contradicts the data you posted in your question. Several people have wasted their time proposing non-solutions because of your tiresome inability to explain your requirements in a clear and simple fashion.
Anyway, assuming you have finally got your story straight and the two groups don't overlap this should work for you:
with data as (
select field1
, field2
, field4
, case when field3 between date '2011-10-30' and date '2012-01-28' then 'GR1'
when field3 between date '2012-10-28' and date '2013-02-03' then 'GR2'
else null
end as GRP
from your_table )
select field1
, field2
, GRP
, sum(field4) as sum_field4
from data
where GRP is not null
group by field1
, field2
, GRP
order by 1, 2, 3
/

Related

Hive Joining multiple tables to create a horizontal layout

We have six hive tables with sample (example) structure like
(where each table has millions of merchant records)
Table1
MerchntId ,field1, field2
Table2
MerchantId, field3,field4
Table3
MerchantId, field5,field6,field7
Table4
MerchantId, field8,field9,field10
Table5
MerchantId, field11, field12, field13
and so on
The requirement is to create a horizontal layout containing every unique merchant that has a value in at least one field.
A MerchantId may or may not be present in the other tables (a merchant may or may not have records in each of them).
Final Table
MerchntId, field1, field2, field3, field4,
field5,field6,field7,field8,field9,field10,field11, field12, field13
After joining, the output should look like this:
i) 101 abc def ghi
ii) 102 ghj fert hyu ioj khhh jjh ddd aas fff kkk fff vvv ff
for case (i) only three fields have values
for case (ii) all fields have values
For this we do a FULL OUTER JOIN on MerchantId, two tables at a time, and then create the final table from the intermediate results.
Is there a better approach?
For example, my current approach:
SELECT distinct
(case when a.MerchntId IS NOT NULL then a.MerchntId else (case when
b.MerchntId IS NOT NULL
then b.MerchntId else '' end ) end ) as MerchntId,
(case when a.field1 IS NOT NULL then a.field1 else '' end ) as field1,
(case when a.field2 IS NOT NULL then a.field2 else '' end ) as field2,
(case when b.field3 IS NOT NULL then b.field3 else '' end ) as field3,
(case when b.field4 IS NOT NULL then b.field4 else '' end ) as field4
from Table1 a
full outer join Table2 b
ON a.MerchntId = b.MerchntId;
The same full outer join is done for Table3 and Table4, and then the two intermediate results are full outer joined to create the final table.
I don't see any other option since your requirements explicitly translate to a full outer join. However, your query can be improved by using COALESCE and NVL:
SELECT
COALESCE(a.MerchntId, b.MerchntId) as MerchntId,
NVL(a.field1, '') as field1,
NVL(a.field2, '') as field2,
NVL(b.field3, '') as field3,
NVL(b.field4, '') as field4
from Table1 a
full outer join Table2 b
ON a.MerchntId = b.MerchntId;
Also, I'm not sure why you use distinct in your query.
UNION ALL the six tables, substituting NULLs for the fields each table lacks. Then aggregate by MerchantId using min or max:
select MerchantId, max(field1) field1, max(field2) field2, ... max(field13) field13 from
(
select MerchantId, field1, field2, null field3, null field4, ... null field13 from Table1
union all
select MerchantId, null field1, null field2, field3, field4, ... null field13 from Table2
union all
...
select MerchantId, null field1, null field2, ... field11, field12, field13
from table6
)s group by MerchantId
After this you can apply your logic for replacing NULLs with '' if necessary.
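A small runnable sketch of the UNION ALL plus aggregate idea, cut down to two tables and using Python's sqlite3 module rather than Hive (both are assumptions made for illustration, as are the sample rows). MAX ignores NULLs, so each merchant collapses to one row that carries whichever fields exist.

```python
import sqlite3

# SQLite stands in for Hive here; table contents are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (MerchantId INTEGER, field1 TEXT, field2 TEXT);
CREATE TABLE Table2 (MerchantId INTEGER, field3 TEXT, field4 TEXT);
INSERT INTO Table1 VALUES (101, 'abc', 'def'), (102, 'ghj', 'fert');
INSERT INTO Table2 VALUES (102, 'hyu', 'ioj'), (103, 'ghi', NULL);
""")

query = """
SELECT MerchantId,
       MAX(field1) AS field1, MAX(field2) AS field2,
       MAX(field3) AS field3, MAX(field4) AS field4
FROM (
    -- pad each table with NULLs for the columns it does not have
    SELECT MerchantId, field1, field2, NULL AS field3, NULL AS field4 FROM Table1
    UNION ALL
    SELECT MerchantId, NULL, NULL, field3, field4 FROM Table2
) s
GROUP BY MerchantId
ORDER BY MerchantId
"""
wide = conn.execute(query).fetchall()
for row in wide:
    print(row)
```

Merchant 101 appears only in Table1, 103 only in Table2, and 102 in both, which covers the same cases a full outer join would. Note that this assumes at most one row per merchant per table; with several rows per merchant, MAX would pick values independently per column.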

Oracle join to get max data and a non-grouped column

Consider this part of my query:
SELECT field1, field2, field3, ...
LEFT JOIN (
SELECT field1, field2, MAX(field3) field3
FROM table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2) jointable ON other.fk= jointable.field1
So field4 is a date. I need the date from table. If I add it to the select list I must add it to the group by and as such it will no longer be grouped in a way to pull the MAX(field3).
I could join table again on their primary keys but that doesn't seem ideal. Is there a way to accomplish this?
You could use the aggregate KEEP (DENSE_RANK ...) syntax to get the date associated with the maximum field3 value for each field1/field2 combination:
SELECT field1, field2, field3, ...
LEFT JOIN (
SELECT field1, field2, MAX(field3) field3,
MAX(field4) KEEP (DENSE_RANK LAST ORDER BY field3) field4
FROM table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2) jointable ON other.fk= jointable.field1
Quick demo of just the subquery, with a CTE for some simple data, where the highest field3 is not on the latest field4 date:
with your_table (field1, field2, field3, field4) as (
select 'A', '1', 1, date '2016-11-01' from dual
union all select 'A', '1', 2, date '2016-09-30' from dual
)
SELECT field1, field2, MAX(field3) field3,
MAX(field4) KEEP (DENSE_RANK LAST ORDER BY field3) field4
FROM your_table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2
/
F F FIELD3 FIELD4
- - ---------- ----------
A 1 2 2016-09-30
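KEEP (DENSE_RANK ...) is Oracle-specific. As a rough cross-check of the same demo data on another engine, the sketch below swaps in a different technique, ROW_NUMBER() in a subquery, and runs it through Python's sqlite3 module (this assumes the bundled SQLite supports window functions, i.e. version 3.25 or later). The SYSDATE filter is left out because the sample dates are fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (field1 TEXT, field2 TEXT, field3 INTEGER, field4 TEXT);
INSERT INTO your_table VALUES
    ('A', '1', 1, '2016-11-01'),
    ('A', '1', 2, '2016-09-30');
""")

query = """
SELECT field1, field2, field3, field4
FROM (
    SELECT field1, field2, field3, field4,
           -- rank rows so the highest field3 comes first; ties on field3
           -- are broken by the latest field4, mirroring MAX(field4) KEEP ...
           ROW_NUMBER() OVER (
               PARTITION BY field1, field2
               ORDER BY field3 DESC, field4 DESC
           ) AS rn
    FROM your_table
)
WHERE rn = 1
"""
latest = conn.execute(query).fetchall()
print(latest)
```

As in the Oracle demo, the row returned carries the date that belongs to the highest field3, not the latest date in the group.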
Seems like a window function would work well here...
SELECT field1, field2, field3, ...
LEFT JOIN (
SELECT field1, field2, MAX(MAX(field3)) over (partition by field1, field2) field3, field4
FROM table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2, field4) jointable ON other.fk= jointable.field1
The max of field3 will now be independent of field4 but still dependent on field1 and field2.

multiple column update from other tables column in single query - performance improvement

I have a table MAIN_TABLE with 6 million records. The query below takes 4 hours. Can someone suggest a performance improvement?
MAIN_TABLE structure:
FIELD1 (NUMBER), FIELD2 (VARCHAR), FIELD3............
UPDATE MAIN_TABLE MT
SET
FIELD3 = (SELECT FLDA FROM TABLE1 where FLDX= MT.FIELD2 AND FLDZ='XYZ'),
FIELD4 = (SELECT FLDB FROM TABLE1 where FLDX= MT.FIELD2 AND FLDZ='PQR'),
FIELD6 = (SELECT FLDD FROM TABLE1 where FLDX= MT.FIELD2 AND FLDW='FGH');
Will looping help? Will dividing the population into multiple threads help?
Will any DB hint work?

remove successive rows in hive

What is an efficient way to remove successive rows with duplicate values in specific fields in Hive? For example:
Input:
ID field1 field2 date
1 a b 2015-01-01
1 a b 2015-01-02
2 e d 2015-01-03
output:
ID field1 field2 date
1 a b 2015-01-01
2 e d 2015-01-03
Thanks in advance
One way to remove successive duplicates is to use lag to check the previous id and only keep rows where the previous id is different:
select * from (
select * ,
lag(id) over (order by date) previous_id
from mytable
) t where t.previous_id <> t.id
or t.previous_id is null -- accounts for the 1st row
If you also need to check field1 and field2, then you can add separate lag statements for each field:
select * from (
select * ,
lag(id) over (order by date) previous_id,
lag(field1) over (order by date) previous_field1
from mytable
) t where (t.previous_id <> t.id or t.previous_field1 <> t.field1)
or t.previous_id is null
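The lag approach can be exercised on the question's sample rows. This is a sketch only: it uses Python's sqlite3 module instead of Hive (assuming a SQLite build with window functions, 3.25 or later), renames the `date` column to a hypothetical `dt`, adds one invented fourth row, and assumes none of the compared columns contain NULLs. A row is kept when it is the first row or when any compared column differs from the previous row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER, field1 TEXT, field2 TEXT, dt TEXT);
INSERT INTO mytable VALUES
    (1, 'a', 'b', '2015-01-01'),
    (1, 'a', 'b', '2015-01-02'),   -- repeats the previous row: dropped
    (2, 'e', 'd', '2015-01-03'),
    (2, 'e', 'x', '2015-01-04');   -- invented row: same id, different field2
""")

query = """
SELECT id, field1, field2, dt
FROM (
    SELECT *,
           LAG(id)     OVER (ORDER BY dt) AS prev_id,
           LAG(field1) OVER (ORDER BY dt) AS prev_field1,
           LAG(field2) OVER (ORDER BY dt) AS prev_field2
    FROM mytable
) t
WHERE prev_id IS NULL              -- first row has no predecessor
   OR prev_id <> id
   OR prev_field1 <> field1
   OR prev_field2 <> field2
ORDER BY dt
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

Only the 2015-01-02 row is removed, since it is the only one that matches its predecessor on every compared column.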

oracle not in query takes longer than in query

I have 2 tables each has about 230000 records. When I make a query:
select count(*)
from table1
where field1 in (select field2 from table2)
It takes about 0.2 seconds.
If I use the same query, just changing IN to NOT IN:
select count(*)
from table1
where field1 NOT in (select field2 from table2)
it never finishes.
Why ?
It's the difference between a scan and a seek.
When you ask for "IN" you ask for specifically these values.
This means the database engine can use indexes to seek to the correct data pages.
When you ask for "NOT IN" you ask for all values except these values.
This means the database engine has to scan the entirety of the table/indexes to find all values.
The other factor is the amount of data. The IN query likely involves much less data and therefore much less I/O than the NOT IN.
Compare it to a phonebook. If you want only the people named Smith, you can just pick the section for Smith and return that. You don't have to read any pages in the book before or after the Smith section.
If you ask for all non-Smith - you have to read all pages before Smith and all after Smith.
This illustrates both the seek/scan aspect and the data amount aspect.
It's better to use NOT EXISTS, as NOT IN uses a row-by-row search, which takes too long.
In the worst case both queries can be resolved using two full table scans plus a hash join (semi or anti). We're talking a few seconds for 230 000 rows unless there is something exceptional going on in your case.
My guess is that either field1 or field2 is nullable. When you use a NOT IN construct, Oracle has to perform an expensive filter operation which is basically executing the inner query once for each row in the outer table. This is 230 000 full table scans...
You can verify this by looking at the execution plan. It would look something like:
SELECT
FILTER (NOT EXISTS SELECT 0...)
TABLE ACCESS FULL ...
TABLE ACCESS FULL ...
If there are no NULL values in either column (field1, field2) you can help Oracle with this piece of information so another more efficient execution strategy can be used:
select count(*)
from table1
where field1 is not null
and field1 not in (select field2 from table2 where field2 is not null)
This will generate a plan that looks something like:
SELECT
HASH JOIN ANTI
FULL TABLE SCAN ...
FULL TABLE SCAN ...
...or you can change the construct to NOT EXISTS (will generate the same plan as above):
select count(*)
from table1
where not exists(
select 'x'
from table2
where table2.field2 = table1.field1
);
Please note that changing from NOT IN to NOT EXISTS may change the result of the query. Have a look at the following example and try the two different where-clauses to see the difference:
with table1 as(
select 1 as field1 from dual union all
select null as field1 from dual union all
select 2 as field1 from dual
)
,table2 as(
select 1 as field2 from dual union all
select null as field2 from dual union all
select 3 as field2 from dual
)
select *
from table1
--where field1 not in(select field2 from table2)
where not exists(select 'x' from table2 where field1 = field2)
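The difference comes from standard SQL three-valued logic, so it can be reproduced outside Oracle. A minimal sketch with the same values, run through Python's sqlite3 module (SQLite as a stand-in engine is an assumption; the table and column names follow the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (field1 INTEGER);
CREATE TABLE table2 (field2 INTEGER);
INSERT INTO table1 VALUES (1), (NULL), (2);
INSERT INTO table2 VALUES (1), (NULL), (3);
""")

# NOT IN: the NULL in table2 makes "2 NOT IN (1, NULL, 3)" evaluate to
# unknown rather than true, so no row passes the filter.
not_in = conn.execute(
    "SELECT field1 FROM table1 "
    "WHERE field1 NOT IN (SELECT field2 FROM table2)"
).fetchall()

# NOT EXISTS: a row is kept whenever no equal field2 is found. NULL never
# equals anything, so the NULL row and the 2 row are both kept.
not_exists = conn.execute(
    "SELECT field1 FROM table1 "
    "WHERE NOT EXISTS (SELECT 'x' FROM table2 WHERE table2.field2 = table1.field1)"
).fetchall()

print("NOT IN:    ", not_in)
print("NOT EXISTS:", not_exists)
```

The NOT IN version returns nothing, while NOT EXISTS returns the NULL row and the row with 2, so the two forms are interchangeable only when neither column can contain NULLs.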
Try:
SELECT count(*)
FROM table1 t1
LEFT JOIN table2 t2 ON t1.field1 = t2.field2
WHERE t2.primary_key IS NULL
