I have a table in an Oracle database which may contain amounts >= $10M or <= $-10B.
If the value is greater than or equal to $10M, I need to break it into one or more 99999999.99 chunks and also include the remainder.
If the value is less than or equal to $-10B, I need to break it into one or more 999999999.99 chunks and also include the remainder.
Your question is hard to follow and you did not provide examples, but here is something to start with, which may help you or someone with a similar problem.
Let's say you have this data and you want to divide amounts into chunks not greater than 999:
ID  AMOUNT
--  ------
 1    1500
 2     800
 3    2500
This query:
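select id, amount,
       -- the last level of each id carries the remainder, all other levels a full chunk
       case when level = floor(amount/999) + 1 then mod(amount, 999) else 999 end chunk
  from data
-- generate floor(amount/999) + 1 rows per id
connect by level <= floor(amount/999) + 1
       and prior id = id
       -- the non-deterministic check keeps Oracle from reporting a CONNECT BY loop
       and prior dbms_random.value is not null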
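...divides each amount into chunks; the last row for each id contains the remainder. (An amount that is an exact multiple of 999 gets a final row with a chunk of 0.) Output is: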
    ID     AMOUNT      CHUNK
------ ---------- ----------
     1       1500        999
     1       1500        501
     2        800        800
     3       2500        999
     3       2500        999
     3       2500        502
Edit: full query according to additional explanations:
select id, amount,
       case
         when amount >= 0 and level = floor(amount/9999999.99) + 1 then mod(amount, 9999999.99)
         when amount >= 0 then 9999999.99
         when level = floor(-amount/999999999.99) + 1 then -mod(-amount, 999999999.99)
         else -999999999.99
       end chunk
  from data
connect by ((amount >= 0 and level <= floor(amount/9999999.99) + 1)
         or (amount < 0 and level <= floor(-amount/999999999.99) + 1))
       and prior id = id and prior dbms_random.value is not null
Please adjust numbers for positive and negative borders (9999999.99 and 999999999.99) according to your needs.
There are other possible solutions (a recursive CTE query, a PL/SQL procedure, maybe others); this hierarchical query is just one of them.
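To make the arithmetic behind the query easier to check outside the database, here is a minimal Python sketch of the same splitting rule. It is not part of the original answer: the function name split_amount and its parameters are made up for illustration, and unlike the query it leaves out a zero remainder when the amount is an exact multiple of the chunk size.

```python
from decimal import Decimal


def split_amount(amount, pos_chunk, neg_chunk):
    """Split amount into full-size chunks plus a remainder, keeping its sign.

    pos_chunk and neg_chunk are the (positive) chunk sizes used for
    non-negative and negative amounts. Both names are hypothetical.
    """
    amount = Decimal(str(amount))
    size = Decimal(str(pos_chunk if amount >= 0 else neg_chunk))
    sign = 1 if amount >= 0 else -1
    # Number of full chunks and what is left over after them.
    full, remainder = divmod(abs(amount), size)
    chunks = [sign * size] * int(full)
    # Emit the remainder row only when it is non-zero (or the amount is 0).
    if remainder or not chunks:
        chunks.append(sign * remainder)
    return chunks


if __name__ == "__main__":
    for value in (1500, 800, 2500):
        print(value, split_amount(value, 999, 999))
```

Decimal is used instead of float so that chunk sizes such as 9999999.99 do not pick up binary rounding errors.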
How do I show the percentage of a total value in a pie chart? E.g. let's say I have a total sick leave of 30 days and I take 15 days; how do I show 15/30 in a pie chart?
Generally speaking, to get a percentage value, you'd divide those two values and multiply the result by 100 (and, possibly, round it to 0, 1 or 2 decimals).
That would be - in your example - 50%, right?
SQL> select round(15 / 30 * 100, 2) from dual;

ROUND(15/30*100,2)
------------------
                50

SQL>
Chart query expects 3 columns to be specified, e.g.
select null           as link,
       what           as label,
       number_of_days as value
  from some_table
 where some_condition
If you could provide a test case so that we could see what you really have, we'd be able to suggest something more.
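As a rough illustration of the label/value rows such a chart query would need to return for the sick-leave example, here is a small Python sketch. The function name pie_slices and the labels "Taken" and "Remaining" are assumptions made for this example only.

```python
def pie_slices(taken, total):
    """Return (label, value, percent) rows for a two-slice pie chart."""
    if total <= 0 or not 0 <= taken <= total:
        raise ValueError("expected total > 0 and 0 <= taken <= total")
    remaining = total - taken
    # Same formula as in the answer: part / total * 100, rounded to 2 decimals.
    return [
        ("Taken", taken, round(taken / total * 100, 2)),
        ("Remaining", remaining, round(remaining / total * 100, 2)),
    ]


if __name__ == "__main__":
    for row in pie_slices(15, 30):
        print(row)
```

Each tuple corresponds to one slice; in the chart query the first two elements would be the label and value columns.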
Let me start by saying that I am very new to Hive, so I'm not sure what information folks will need to help me out. Please let me know what information would be useful. Also, while I'd usually create a small dataset to recreate the problem with, I think this problem has to do with the scale of my dataset, because I can't seem to recreate it on a smaller dataset. Let me know if you have suggestions to make this easier to answer.
Okay now that's out of the way, here's my problem. I have a huge dataset, partitioned by month, with about 500 million rows per month. I have a column with an ID number in it (I'll call it idcol), and I want to closely examine a couple of examples where there's a high number of repeated IDs and a very low number. So, I used this:
SELECT idcol, COUNT(*) FROM table WHERE month = 7 GROUP BY idcol LIMIT 10;
And got:
000005185884381 13
000035323848000 24
000017027256315 531
000010121767109 54
000039844553332 3
000013731352481 309
000024387407996 3
000028461234451 67
000016564844672 1
000032933040806 17
So, I went to investigate the first idcol value with a count of 3, with:
SELECT * FROM table WHERE month = 7 AND idcol = '000039844553332';
I expected to see just 3 rows, but ended up with 469 rows found! That was strange enough, but then I just happened to run the original line of code above but with LIMIT 5 instead and ended up with:
000005185884381 13
000017027256315 75
000010121767109 25
000013731352481 59
000024387407996 1
And, it may be hard to see because the idcol is so long, but idcol 000017027256315 ended up with a count of 531 when I did LIMIT 10 and just 75 when I did LIMIT 5.
What am I missing?! How can I get a correct count of just a small number of values so I can investigate further?!
BTW my first thought was to make the counting part a sub-query, but that didn't change a thing. I used:
SELECT * FROM (SELECT idcol, COUNT(*) FROM table WHERE month = 7 GROUP BY idcol) x LIMIT 10;
...same EXACT results
Most likely the counts are being computed from statistics. See here for the bug and the related discussion. Try turning that behaviour off:
SET hive.compute.query.using.stats=false;
If this doesn't fix it, try the ANALYZE command before running the count(*):
ANALYZE TABLE table_name PARTITION(month) COMPUTE STATISTICS;
I'm embarrassed to admit this is a total noob question, but I take shelter in the fact that I come from a T-SQL world and this is totally new territory for me.
This is a simple table I have, with only 4 records:
ContractorID  ProjectID  Cost
           1        100  1000
           2        100   800
           3        200  1005
           4        300  2000
This is my PL/SQL function, which should take a contractor ID and a project ID and return the cost (1000 in this case):
create or replace FUNCTION GetCost(contractor_ID IN NUMBER,
                                   project_ID    IN NUMBER)
  RETURN NUMBER
IS
  ContractorCost NUMBER;
BEGIN
  SELECT Cost INTO ContractorCost
    FROM Contractor_Project_Table
   WHERE ContractorID = contractor_ID AND ProjectID = project_ID;
  RETURN ContractorCost;
END;
But then using
select GetCost(1,100) from Contractor_Project_Table;
This returns the same row 4 times:
1000
1000
1000
1000
What is wrong here? Why is this returning 4 rows instead of 1?
As @a_horse_with_no_name points out, the problem is that Contractor_Project_Table has (presumably) 4 rows, so any SELECT against Contractor_Project_Table with no WHERE clause will always return 4 rows. Your function is getting called 4 times, once for each row in the table.
If you want to call the function with a single set of parameters and return a single row of data, you probably want to select from the dual table:
SELECT GetCost( 1, 100 )
FROM dual
Because you have 4 rows in the Contractor_Project_Table table, the function is called once per row. Use this query to get one record:
select GetCost(1,100) from dual;
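To see the per-row behaviour in a runnable form, here is a small Python sketch that uses the built-in sqlite3 module as a stand-in for Oracle. Everything in it is an assumption made for illustration: the table name contractor_project, the dictionary-based get_cost lookup (it is not the PL/SQL function), and the use of a SELECT without a FROM clause in place of Oracle's dual table, which SQLite does not have.

```python
import sqlite3


def demo():
    """Return the rows produced by a per-row call and by a single call."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contractor_project (contractor_id, project_id, cost)"
    )
    rows = [(1, 100, 1000), (2, 100, 800), (3, 200, 1005), (4, 300, 2000)]
    conn.executemany("INSERT INTO contractor_project VALUES (?, ?, ?)", rows)

    # Hypothetical stand-in for GetCost: a plain lookup by both ids.
    costs = {(c, p): cost for c, p, cost in rows}

    def get_cost(contractor_id, project_id):
        return costs.get((contractor_id, project_id))

    conn.create_function("get_cost", 2, get_cost)

    # Selecting from the 4-row table evaluates the expression for every row.
    per_row = conn.execute(
        "SELECT get_cost(1, 100) FROM contractor_project"
    ).fetchall()
    # Selecting without a table (the role dual plays in Oracle) gives one row.
    single = conn.execute("SELECT get_cost(1, 100)").fetchall()
    conn.close()
    return per_row, single


if __name__ == "__main__":
    per_row, single = demo()
    print(len(per_row), "rows from the table,", len(single), "row without it")
```

The number of result rows is driven by the FROM clause, not by the function, which is why the original query returned 1000 four times.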
This is a very complicated situation for me and I was wondering if someone can help me with it:
Here is my table:
Record_no  Type       Solde  SQLCalculatedPmu  DesiredValues
------------------------------------------------------------------------
2570088    Insertion     60  133               133
2636476    Insertion     67  119,104           119,104
2636477    Insertion     68  117,352           117,352
2958292    Insertion     74  107,837           107,837
3148350    Radiation     73  107,837           107,83     <---
3282189    Insertion     80  98,401            98,395
3646066    Insertion    160  49,201            49,198
3783510    Insertion    176  44,728            44,725
3783511    Insertion    177  44,475            44,472
4183663    Insertion    188  41,873            41,87
4183664    Insertion    189  41,651            41,648
4183665    Radiation    188  41,651            41,64      <---
4183666    Insertion    195  40,156            40,145
4183667    Insertion    275  28,474            28,466
4183668    Insertion    291  26,908            26,901
4183669    Insertion    292  26,816            26,809
4183670    Insertion    303  25,842            25,836
4183671    Insertion    304  25,757            25,751
In my table, every value in the SQLCalculatedPmu column or the DesiredValues column is calculated based on the preceding value.
As you can see, I have calculated the SQLCalculatedPmu column rounded to 3 decimals. The issue is that on each Radiation line, the client wants to start the next calculation from a value with 2 decimals instead of 3 (as shown in the DesiredValues column). The following values then have to be recalculated. For example, line 6 will change because the value in line 5 now has 2 decimals. I could handle this if there were a single Radiation, but in my case I have a lot of Radiations, and each of them changes all the following values based on the 2-decimal calculation.
In summary, here are the steps:
1 - on a radiation row, take the value of the preceding row, reduce it to 2 decimals and put it in the radiation row.
2 - calculate all following insertion rows.
3 - when we reach another radiation, redo steps 1 and 2, and so on.
I'm using an Oracle DB and I'm the owner, so I can create procedures and run insert, update and select statements.
But I'm not familiar with procedures or loops.
For information, the formula for SQLCalculatedPmu uses two additional columns, price and number, and is calculated cumulatively on every line for each investor:
(price * number)+(cumulative (price*number) of the preceding lines)
I tried something like this :
update PMUTemp
   set SQLCalculatedPmu =
       case when Type = 'Insertion' then
         (number*price)+lag(SQLCalculatedPmu, 1) over (partition by investor
                                                       order by Record_no)/
         (number+lag(solde, 1) over (partition by investor order by Record_no))
       else
         TRUNC(lag(SQLCalculatedPmu, 1) over (partition by investor order by Record_no))
       end;
but it gave me this error (I think it's because I'm looking at the preceding line, which is itself modified during the SQL statement):
ORA-30486 : window function are allowed only in the SELECT list of a query.
I was wondering if creating a procedure that is called as many times as there are radiations would do the job, but I'm really not good with procedures.
Any help
Regards,
Just to make my need simpler: all I want is to produce the DesiredValues column starting from the SQLCalculatedPmu column. The steps are:
1 - on a radiation, the value becomes trunc(preceding value, 2)
2 - calculate all following insertion rows this way: (price * number)+(cumulative (price*number) of the preceding lines). As the radiation value has changed, I need to recalculate the following lines based on it
3 - when we reach another radiation, redo steps 1 and 2, and so on
Kindest regards
You should not need a procedure here -- a SQL update of the Radiation rows in the table would do this quicker and more reliably.
Something like ..
update my_table t1
   set (column_1, column_2) =
       (select round(column_1, 2), round(column_2, 2)
          from my_table t2
         where t2.type = 'Insertion' and
               t2.record_no = (select max(t3.record_no)
                                 from my_table t3
                                where t3.type = 'Insertion' and
                                      t3.record_no < t1.record_no))
 where t1.type = 'Radiation'
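The recalculation described in the question is sequential: each row depends on the already adjusted value of the row before it. The Python sketch below only illustrates that row-by-row logic; it is not an Oracle implementation. The names recalculate, truncate and insertion_step are hypothetical, and because the exact PMU formula is not fully specified in the question, it is passed in as a placeholder function.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(value, quantum=Decimal("0.01")):
    """Cut a Decimal down to 2 decimals without rounding up."""
    return value.quantize(quantum, rounding=ROUND_DOWN)


def recalculate(rows, insertion_step):
    """Recompute values in Record_no order.

    rows: list of (row_type, payload) tuples, row_type being
          'Insertion' or 'Radiation'.
    insertion_step(previous_value, payload): placeholder for the real
          formula; it returns the new value for an Insertion row.
    """
    values = []
    previous = None
    for row_type, payload in rows:
        if row_type == "Radiation":
            if previous is None:
                raise ValueError("a Radiation row needs a preceding row")
            # Step 1: the radiation row takes the truncated preceding value.
            current = truncate(previous)
        else:
            # Step 2: insertion rows build on the (possibly truncated) value.
            current = insertion_step(previous, payload)
        values.append(current)
        previous = current
    return values


if __name__ == "__main__":
    # Toy step that just adds the payload; the real formula would go here.
    toy_step = lambda prev, payload: (prev or 0) + payload
    sample = [
        ("Insertion", Decimal("1.005")),
        ("Radiation", None),
        ("Insertion", Decimal("1.001")),
    ]
    print(recalculate(sample, toy_step))
```

Because every Radiation row feeds the rows after it, the loop handles any number of radiations in a single pass, which matches step 3 of the question.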
I have a situation in my application where I display counts of data matching different criteria. Since the performance of counting degrades as the database grows, we decided to show only availability information using the EXISTS clause.
Below is my table structure:
Table: DocInfo
---------------------------------------
DocId         number
DocName       varchar(250)
DocStatus     number
SignedBy      number
ForwardedBy   number
ForwardCount  number
DocOwner      number
MgrID         number
ProjectId     number
The current query which does the counting is like this
SELECT NVL(SUM(CASE
                 WHEN (DocStatus IN (1150,1155,1170,1182,1190) AND
                       DocOwner = 56366 AND
                       ForwardCount = 0)
                 THEN 1
                 ELSE 0
               END), 0) "ForReview",
       NVL(SUM(CASE
                 WHEN (DocStatus IN (1200) AND
                       MgrID = 56366 AND
                       ForwardCount = 0)
                 THEN 1
                 ELSE 0
               END), 0) "Accepted",
       NVL(SUM(CASE
                 WHEN (DocStatus IN (1150,1155,1170,1182,1190) AND
                       DocOwner = 56366 AND
                       MgrID = 0)
                 THEN 1
                 ELSE 0
               END), 0) "Waiting"
  FROM DocInfo
 WHERE ProjectId = 313 AND
       (DocOwner = 56366 OR MgrID = 56366)
I need to change the counting to an EXISTS clause so that I can show whether documents are available or not in each category.
Since this change is meant to improve performance, running this as separate queries is also not advisable. Please help me, I have run out of my limited knowledge.
Sorry for leaving out what I have already tried.
I have changed the above query to a union with an EXISTS clause in each branch, like below:
SELECT 'ForReview' AS A
  FROM DUAL
 WHERE EXISTS (SELECT NULL
                 FROM DocInfo
                WHERE ProjectId = 313 AND
                      (DocOwner = 56366 OR MgrID = 56366) AND
                      (DocStatus IN (1150,1155,1170,1182,1190) AND
                       DocOwner = 56366 AND
                       ForwardCount = 0))
UNION
SELECT 'Accepted' AS A
  FROM DUAL
 WHERE EXISTS (SELECT NULL
                 FROM DocInfo
                WHERE ProjectId = 313 AND
                      (DocOwner = 56366 OR MgrID = 56366) AND
                      (DocStatus IN (1200) AND
                       MgrID = 56366 AND
                       ForwardCount = 0))
UNION
SELECT 'Waiting' AS A
  FROM DUAL
 WHERE EXISTS (SELECT NULL
                 FROM DocInfo
                WHERE ProjectId = 313 AND
                      (DocOwner = 56366 OR MgrID = 56366) AND
                      (DocStatus IN (1150,1155,1170,1182,1190) AND
                       DocOwner = 56366 AND
                       MgrID = 0))
I have mentioned only 3 conditions, whereas my actual application has 8 different criteria to be added into this query. so when i have 8 Exists clauses, it runs internally as 8 different queries, and in effect it takes more time - single segment in the entire union query takes only 560 ms whereas all queries together takes around 7 seconds to generate the output.
Since my requirement is only to identify the availability of any such record, I do not want to go through the entire record set and count it.
Is there any way to optimize or rewrite this query?
Thank You
"so when i have 8 Exists clauses, it runs internally as 8 different
queries, and in effect it takes more time - single segment in the
entire union query takes only 560 ms whereas all queries together
takes around 7 seconds to generate the output."
Surprise, surprise. Running what amounts to the same query eight times will not be faster than running that query once.
Now it is true that EXISTS can be faster, because it only needs to find a single row which matches the given criteria, rather than retrieving an entire data set. However you have just shifted the retrieved data into the WHERE clause so the database still has to do the same amount of work. In fact, it is apparently doing a lot more work, because 7s > (560ms * 8).
To solve your problem properly you need to understand how the database works and how to tune it.
For a start, define a tuning goal. Your original query takes half a second to run: that's not lightning fast but it is pretty quick. Why is this a problem? How quickly do you want it to run?
Next, run an EXPLAIN PLAN. Is the query using indexes? How efficient is its index usage? What percentage of the rows are being selected?
Now you also need to understand your data. Is the selected data evenly distributed throughout the table or are there clusters? Do some projects, owners or managers have more records than others? How does that distribution affect performance?
Please bear in mind, tuning is a science and it is complicated: there are whole books on the subject and some people make very fine livings as performance troubleshooters. It requires a lot of information about your system, both knowledge of what your application does and low-level information on which activities your database is doing. We can help you in your quest to find a more performant solution, but we cannot just look at a shonky query and tell you how to rewrite it so it runs quicker.