I'm coming back here because I need your help again!
Which of the following is the better choice?
The question is:
I have a table myTable with columns ['DateYYYYMMDD', 'field1', 'field2', 'field3', 'MyField'], and every day someone inserts many records.
I have to create two (fast) views, myView1 and myView2, that select records (from myTable) created in the last 30 days, each with a different set of MYFIELD values.
I have found some simple alternative solutions and I would like to know which is the fastest:
--Solution 1
--myView1:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and MYFIELD in ('65643L', '65643L174', '65643L8N',
...
'6564L7174', '6564L78N','6564L78N_2O15',
...
'6564L78N3226T2_2O15', '6564L78N8N322',
'6564L78N6T2', '6564L78N6T2_2O15',
'6564L7-NOTT1-6T2', '6564L76T2',
...
'6563XP8N322', '6563XP8N322_2O15',
'6563XP8N3226T2', '6563XP8N3226T2_2O15',
'6563XP8N6T2', '6563XP-NOTT1-6T2',
'6563XP6T2', '9563XPT1',
'9563XPT1_2O15',
...
'9566UB', '9566UB_2O15',
'9566UB174', '9566UB8N',
'6566UB8N_2O15', '6566UB8N174',
'6566UB8N322',
...)
--myView2:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and MYFIELD in ('9P26_B', '9P26_BN',
'9P26_8N',
...
'9P26_8NN', '9P26_2O158N9',
'556_B', '556_8N',
...
'5566NP4P', '696N65T',
'696N65T6T2',
...
'696W1P_B', '696W1P_8N')
--Solution 2
--myView1:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and (MYFIELD like '656%' or MYFIELD like '956%')
--myView2:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and (MYFIELD like '9P26%'
or MYFIELD like '556_%'
or MYFIELD like '5566%'
or MYFIELD like '696%')
--Solution 3
--myView1:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and (REGEXP_LIKE(MYFIELD, '^656') or REGEXP_LIKE(MYFIELD, '^956'))
--myView2:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and (REGEXP_LIKE(MYFIELD, '^9P26')
or REGEXP_LIKE(MYFIELD, '^556_')
or REGEXP_LIKE(MYFIELD, '^5566')
or REGEXP_LIKE(MYFIELD, '^696'))
I hope that explains what I need. If there is a better solution, please suggest it!
Thank you very, very much!
Why not just use LIKE?
--myView1:
select field1, field2, ...., fieldn, MYFIELD
from myTable
where DateYYYYMMDD > sysdate -30
and (MYFIELD like '656%' or MYFIELD like '956%')
etc.
REGEXP functions are powerful but not fast.
Like @Tony Andrews said, I would avoid the REGEXP_LIKE option because you don't need any functionality it provides that LIKE doesn't.
Having an appropriate index is going to help you much more than switching between IN and LIKE. Ideally you would have an index on (DateYYYYMMDD, MYFIELD). If you do, I would be surprised if the difference between IN and LIKE made any noticeable difference at all with how you are using them.
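To make the effect concrete, here is a minimal sketch using SQLite via Python (not Oracle, and all table/column/index names are invented for the demo); the query plan shows the date-range filter being served by the composite index instead of a full table scan:

```python
# SQLite stand-in (not Oracle): a composite index on (date, myfield)
# lets the planner seek on the date range instead of scanning every row.
# All table/column/index names here are invented for the demo.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE mytable (dateyyyymmdd TEXT, myfield TEXT, field1 TEXT);
CREATE INDEX mytable_date_myfield_ix ON mytable (dateyyyymmdd, myfield);
""")

# EXPLAIN QUERY PLAN reports whether the index is used.
plan = con.execute("""
EXPLAIN QUERY PLAN
SELECT field1, myfield
FROM mytable
WHERE dateyyyymmdd > '20240101' AND myfield LIKE '656%'
""").fetchall()

plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

In Oracle the equivalent would simply be `CREATE INDEX some_ix ON myTable (DateYYYYMMDD, MYFIELD)`, possibly followed by gathering fresh statistics so the optimizer picks it up.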
Related
In the MyTable table, I have the following data in the MyField column whose size is 80 characters:
MyField
-------
WA
W
W51534
W
W
I am trying to exclude lines starting with WA using regexp_like, but the following query returns the W51534 line and not the W lines:
select MyField
from MyTable
where regexp_like(upper(ltrim(MyField)), '^[W][^A]');
I would like it to also return the W lines. How can I do that?
Thank you in advance
You may not even need to use REGEXP_LIKE here; regular LIKE may suffice:
SELECT MyField
FROM MyTable
WHERE MyField NOT LIKE 'WA%';
Demo
Finally, I solved my problem, as I said in a comment, by adding rpad:
regexp_like(upper(rpad(MyField, 80, '#')), '^[W][^A]');
If someone has a better idea, I'm interested.
Thanks
You can negate the regexp_like, so have it match the patterns you don't want:
with mytable(id, myfield) as (
select 1, 'WA' from dual union all
select 2, 'W' from dual union all
select 3, 'W51534' from dual union all
select 4, 'Z' from dual union all
select 5, '' from dual
)
select id, myfield
from mytable
where not regexp_like(upper(myfield), '^WA') or
myfield is null
order by id;
We have six Hive tables with a sample (example) structure like the following, where each table has millions of merchant records:
Table1
MerchntId ,field1, field2
Table2
MerchantId, field3,field4
Table3
MerchantId, field5,field6,field7
Table4
MerchantId, field8,field9,field10
Table5
MerchantId, field11, field12, field13
and so on
The requirement is to create a horizontal layout taking all unique merchants where at least one field has a value for a merchantId.
A merchantId may or may not be present in the other tables (for a given merchant there may be records in the other tables, or there may not).
Final Table
MerchntId, field1, field2, field3, field4,
field5,field6,field7,field8,field9,field10,field11, field12, field13
After joining, the output should look like:
i) 101 abc def ghi
ii) 102 ghj fert hyu ioj khhh jjh ddd aas fff kkk fff vvv ff
for case (i) only three fields have values
for case (ii) all fields have values
For this we are doing a FULL OUTER JOIN on merchantId for two tables at a time and then creating the final table.
Is there any better approach to doing this?
For example, my current approach:
SELECT distinct
(case when a.MerchntId IS NOT NULL then a.MerchntId else (case when
b.MerchntId IS NOT NULL
then b.MerchntId else '' end ) end ) as MerchntId,
(case when a.field1 IS NOT NULL then a.field1 else '' end ) as field1,
(case when a.field2 IS NOT NULL then a.field2 else '' end ) as field2,
(case when b.field3 IS NOT NULL then b.field3 else '' end ) as field3,
(case when b.field4 IS NOT NULL then b.field4 else '' end ) as field4
from Table1 a
full outer join Table2 b
ON a.MerchntId = b.MerchntId;
Then a full outer join of Table3 and Table4,
and then a full outer join of these two results to create the final table.
I don't see any other option since your requirements explicitly translate to a full outer join. However, your query can be improved by using COALESCE and NVL:
SELECT
COALESCE(a.MerchntId, b.MerchntId) as MerchntId,
NVL(a.field1, '') as field1,
NVL(a.field2, '') as field2,
NVL(b.field3, '') as field3,
NVL(b.field4, '') as field4
from Table1 a
full outer join Table2 b
ON a.MerchntId = b.MerchntId;
Also, I'm not sure why you use distinct in your query.
Union all six tables, substituting missing fields with nulls. Then aggregate by MerchantId using min or max:
select MerchantId, max(field1) field1, max(field2) field2...max(field13) field13 from
(
select MerchntId as MerchantId, field1, field2, null field3, null field4, ... null field13 from Table1
union all
select MerchantId, null field1, null field2, field3, field4, ... null field13 from Table2
union all
...
select MerchantId, null field1, null field2... field11, field12, field13
from table6
)s group by MerchantId
After this you can apply your logic of replacing nulls with '' if necessary.
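As a quick sanity check of the UNION ALL plus MAX aggregation shape, here is a minimal sketch using SQLite via Python (not Hive; the two tables and all of their values are made up for the demo):

```python
# SQLite stand-in for the Hive query: UNION ALL the tables with NULL
# placeholders, then collapse to one row per merchant with MAX().
# Table names, columns, and values are all invented for this demo.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t1 (MerchantId TEXT, field1 TEXT, field2 TEXT);
CREATE TABLE t2 (MerchantId TEXT, field3 TEXT, field4 TEXT);
INSERT INTO t1 VALUES ('101', 'abc', 'def');
INSERT INTO t2 VALUES ('101', 'ghi', NULL);
INSERT INTO t2 VALUES ('102', 'xxx', 'yyy');
""")

rows = con.execute("""
SELECT MerchantId,
       MAX(field1) AS field1, MAX(field2) AS field2,
       MAX(field3) AS field3, MAX(field4) AS field4
FROM (
    SELECT MerchantId, field1, field2, NULL AS field3, NULL AS field4 FROM t1
    UNION ALL
    SELECT MerchantId, NULL, NULL, field3, field4 FROM t2
)
GROUP BY MerchantId
ORDER BY MerchantId
""").fetchall()

for row in rows:
    print(row)
```

This works because MAX() ignores nulls, so for each merchant the single non-null value per field (when one exists) survives the aggregation.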
Consider this part of my query:
SELECT field1, field2, field3, ...
LEFT JOIN (
SELECT field1, field2, MAX(field3) field3
FROM table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2) jointable ON other.fk= jointable.field1
So field4 is a date, and I need that date from the table. If I add it to the select list I must add it to the GROUP BY, and then the rows will no longer be grouped in a way that pulls MAX(field3).
I could join the table again on its primary key, but that doesn't seem ideal. Is there a way to accomplish this?
You could use the aggregate KEEP DENSE_RANK syntax to get the date associated with the maximum field3 value for each field1/field2 combination:
SELECT field1, field2, field3, ...
LEFT JOIN (
SELECT field1, field2, MAX(field3) field3,
MAX(field4) KEEP (DENSE_RANK LAST ORDER BY field3) field4
FROM table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2) jointable ON other.fk= jointable.field1
Quick demo of just the subquery, with a CTE for some simple data, where the highest field3 is not on the latest field4 date:
with your_table (field1, field2, field3, field4) as (
select 'A', '1', 1, date '2016-11-01' from dual
union all select 'A', '1', 2, date '2016-09-30' from dual
)
SELECT field1, field2, MAX(field3) field3,
MAX(field4) KEEP (DENSE_RANK LAST ORDER BY field3) field4
FROM your_table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2
/
F F FIELD3 FIELD4
- - ---------- ----------
A 1 2 2016-09-30
Seems like a window function would work well here...
SELECT field1, field2, field3, ...
LEFT JOIN (
SELECT field1, field2, MAX(MAX(field3)) over (partition by field1, field2) field3, field4
FROM table
WHERE field2 IN ('1','2','3','4')
AND field4 > SYSDATE - 365
GROUP BY field1, field2, field4) jointable ON other.fk= jointable.field1
The max of field3 will now be independent of field4 but still dependent on field1 and field2.
I need help with the following, please.
How do I sum values over a date range? I am a newbie in Oracle.
Try something like this:
SELECT FIELD1, FIELD2, TRUNC(MIN(FIELD3)), TRUNC(MAX(FIELD3)), SUM(FIELD4)
FROM SOME_TABLE
WHERE FIELD3 BETWEEN DATE '2013-02-01'
AND DATE '2013-02-04' + INTERVAL '1' DAY - INTERVAL '1' SECOND
GROUP BY FIELD1, FIELD2
ORDER BY MIN(FIELD3);
Share and enjoy.
The questions/answers in the comment section of your original answer show that you are actually looking for two different selections: one for the first date range and one for the overlapping second date range, but you want all the result records in a single result set. You can use UNION for that:
select field1, field2, min(trunc(field3)) || '-' || max(trunc(field3)), sum(field4)
from yourtable
where to_char(field3, 'yyyymmdd') between '20130201' and '20130204'
group by field1, field2
UNION
select field1, field2, min(trunc(field3)) || '-' || max(trunc(field3)), sum(field4)
from yourtable
where to_char(field3, 'yyyymmdd') between '20130203' and '20130207'
group by field1, field2
order by 1, 2, 3;
Using hard-coded dates is a bit odd, as is the way you're making your ranges (and your field4 value appears to be wrong in your sample output), but assuming you know what you want...
You can use a CASE statement to assign a dummy group number to the rows based on the dates, and then have an outer query that groups by that dummy field, which I've called gr:
select field1, field2,
to_char(min(field3), 'MM/DD/YYYY')
||'-'|| to_char(max(field3), 'MM/DD/YYYY') as field3,
sum(field4) as field4
from (
select field1, field2, field3, field4,
case when field3 between date '2013-02-01'
and date '2013-02-05' - interval '1' second then 1
when field3 between date '2013-02-05'
and date '2013-02-08' - interval '1' second then 2
end as gr
from t42
)
group by field1, field2, gr
order by field1, field2, gr;
F FIELD2 FIELD3 FIELD4
- ---------- --------------------- ----------
A 1 02/01/2013-02/04/2013 14
A 1 02/05/2013-02/07/2013 21
The display of field3 will look wrong if there is no data for one of the boundary days, but I'm not sure that's the biggest problem with this approach *8-)
You can potentially modify the case to have more generic groups, but I'm not sure how this will be used.
In a comment you say you specify two groups of dates which do not overlap. This contradicts the data you posted in your question. Several people have wasted their time proposing non-solutions because of your inability to explain your requirements in a clear and simple fashion.
Anyway, assuming you have finally got your story straight and the two groups don't overlap this should work for you:
with data as (
select field1
, field2
, field4
, case when field3 between date '2011-10-30' and date '2012-01-28' then 'GR1'
when field3 between date '2012-10-28' and date '2013-02-03' then 'GR2'
else null
end as GRP
from your_table )
select field1
, field2
, GRP
, sum(field4) as sum_field4
from data
where GRP is not null
group by field1, field2, GRP
order by 1, 2, 3
/
How can I get the count of the distinct combinations of two columns?
Select count(distinct cola, colb)
SELECT COUNT(*)
FROM (
SELECT DISTINCT a, b
FROM mytable
)
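If you want to convince yourself quickly, here is a minimal sketch of the same subquery shape using SQLite via Python (table name and sample data are invented):

```python
# SQLite stand-in: count distinct (a, b) pairs via a DISTINCT subquery.
# Table name and sample data are invented for this demo.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (a TEXT, b TEXT)")
con.executemany("INSERT INTO mytable VALUES (?, ?)",
                [("x", "1"), ("x", "1"), ("x", "2"), ("y", "1")])

(n,) = con.execute("""
SELECT COUNT(*)
FROM (SELECT DISTINCT a, b FROM mytable)
""").fetchone()

# Four rows inserted, but only three distinct (a, b) combinations.
print(n)
```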
SELECT COUNT(1)
FROM (
SELECT DISTINCT COLA, COLB
FROM YOUR_TABLE
)
Another way of doing it. Be aware that plain concatenation can count two different pairs as one (e.g. 'ab' || 'c' and 'a' || 'bc' both give 'abc'), so include a separator if that can happen:
SELECT COUNT(DISTINCT COLA || '|' || COLB)
FROM THE_TABLE
http://www.sqlfiddle.com/#!4/c287e/2
SELECT (select count(cola) from ...), (select count(colb) from ...) from ...
You may want to look at this:
http://www.java2s.com/Code/Oracle/Aggregate-Functions/COUNTcolumnandCOUNTcountthenumberofrowspassedintothefunction.htm
You can put Distinct in the subqueries, if you desire.
In Oracle DB, you can concatenate the columns and then count the concatenated string, like below:
SELECT count(DISTINCT concat(ColumnA, ColumnB)) FROM TableX;
In MySQL, you can just pass the columns as parameters to COUNT(DISTINCT ...):
SELECT count(DISTINCT ColumnA, ColumnB) FROM TableX;
SELECT COUNT(*)
FROM (
SELECT DISTINCT a, b
FROM mytable
) As Temp