How do I select a random row from the database based on the probability (chance) assigned to each row?
Example:
Make        Chance  Value
ALFA ROMEO  0.0024  20000
AUDI        0.0338  35000
BMW         0.0376  40000
CHEVROLET   0.0087  15000
CITROEN     0.016   15000
........
How do I select a random make and its value, weighted by the probability that it is chosen?
Would a combination of rand() and ORDER BY work? If so what is the best way to do this?
You can do this by using rand() and then using a cumulative sum. Assuming they add up to 100%:
select t.*
from (select t.*, (@cumep := @cumep + chance) as cumep
      from t cross join
           (select @cumep := 0, @r := rand()) params
     ) t
where @r between cumep - chance and cumep
limit 1;
Notes:
rand() is called once in a subquery to initialize a variable. Multiple calls to rand() are not desirable.
There is a remote chance that the random number will be exactly on the boundary between two values. The limit 1 arbitrarily chooses one of them.
This could be made more efficient by stopping the subquery when cumep > @r.
The values do not have to be in any particular order.
This can be modified to handle chances where the sum is not equal to 1, but that would be another question.
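For readers who want to see the cumulative-sum idea outside SQL, here is a minimal Python sketch. The function name pick_weighted, the optional r parameter and the sample rows are assumptions made for illustration, and the chances are assumed to sum to 1.

```python
import random

def pick_weighted(rows, r=None):
    """Pick one (make, value) pair; rows are (make, chance, value) tuples.

    Assumes the chances sum to 1. r can be passed in for testing;
    by default a single random number in [0, 1) is drawn.
    """
    if r is None:
        r = random.random()  # drawn once, like rand() in the subquery
    cumulative = 0.0
    for make, chance, value in rows:
        cumulative += chance
        if r <= cumulative:  # stop as soon as the running sum reaches r
            return make, value
    # Guard against a floating-point shortfall in the running sum.
    return rows[-1][0], rows[-1][2]

# Hypothetical rows whose chances sum to 1.
rows = [("AUDI", 0.25, 35000), ("BMW", 0.5, 40000), ("CITROEN", 0.25, 15000)]
print(pick_weighted(rows))
```

Unlike the query, this loop stops as soon as the running sum passes the random number. Python's standard library also provides random.choices(population, weights=...), which performs a weighted draw directly.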
I would like to ask about some algorithms related to checking if a customer can book a table at the store?
I will describe my problem with the following example:
Restaurant:
User M has a restaurant R. R is open from 08:00 to 17:00.
Restaurant R has 3 tables (T1, T2, T3), each table will have 6 seats.
R offers F1 food, which can be eaten within 2 hours.
Booking:
A customer C has booked table T1 for 5 people with F1 food; call this booking B[0].
B[0] has a start time of 9 AM.
M is the manager of the store, so M wants to know whether a given date YYYY-MM-DD is already fully booked or can still be booked.
My current algorithm is:
I create an array with one element per minute of the day, each defaulting to 0:
24 * 60 = 1440
=> I have: arr[1440] = [0, 0, 0, ...]
Next I get all the bookings for the day YYYY-MM-DD. The result is an array B[].
Then I loop over the array B[]:
for b in B[]
For each b, I loop from its start_time to its end_time in steps of 1 minute:
for (time = start_time; time <= end_time; time++)
In each iteration I set the element of arr whose index is the corresponding minute of the day to 1.
(It is quite similar to the Sieve of Eratosthenes.)
Finally I iterate over arr once more; if at least one value is still 0, the date YYYY-MM-DD is still bookable.
But my algorithm will not be optimal if the number of tables increases or if many days have to be checked (for example from 2022-01-01 to 2022-02-01), ...
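To make the steps concrete, the minute-array approach could look roughly like this in Python (not Laravel/PHP). The function name, the (start, end) minute pairs used to represent a booking, and the optional opening-hours window are assumptions. Like the description, it treats the day as a single timeline and does not distinguish between tables.

```python
MINUTES_PER_DAY = 24 * 60  # 1440

def is_bookable(bookings, open_min=0, close_min=MINUTES_PER_DAY):
    """Return True if at least one minute in [open_min, close_min) is free.

    bookings: list of (start_minute, end_minute) pairs for one day,
    both ends inclusive, counted from midnight (hypothetical format).
    """
    arr = [0] * MINUTES_PER_DAY
    for start, end in bookings:
        for minute in range(start, end + 1):
            arr[minute] = 1  # mark this minute as taken
    # A single remaining 0 inside the window means the day is still bookable.
    return any(slot == 0 for slot in arr[open_min:close_min])

# One booking from 09:00 to 11:00, restaurant open 08:00 to 17:00.
print(is_bookable([(9 * 60, 11 * 60)], open_min=8 * 60, close_min=17 * 60))
```

Restricting the final scan to the opening hours avoids counting the closed hours as free time, which the whole-day scan in the description would do.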
Thank you very much.
P.S.: Regarding the technology background, I am currently using Laravel 9.
I have a wide variety of numbers: in the ten thousands, thousands, hundreds, etc.
I would like to round each number to its highest place value, e.g.:
Starting #: 2555.5
Correctly Rounded : 3000
——
More examples ( in the same report )
Given: 255
Rounded: 300
Given: 25555
Rounded: 30000
Given: 2444
Rounded: 2000
But with the Round() or Ceil() functions I get the following
Given: 2555.5
Did not want : 2556
Any ideas ??? Thank you in advance
You can combine numeric functions like this
SELECT
col,
ROUND(col / POWER(10,TRUNC(LOG(10, col)))) * POWER(10,TRUNC(LOG(10,col)))
FROM Data
See fiddle
Explanation:
LOG(10, number) gets the power you need to raise 10 to in order to get the number. E.g., LOG(10, 255) = 2.40654 and 10^2.40654 = 255.
TRUNC(LOG(10, col)) is the number of digits excluding the leading digit (2 for 255).
POWER(10,TRUNC(LOG(10, col))) converts, e.g., 255 to 100.
Then we divide the number by this power of ten. E.g. for 255 we get 255 / 100 = 2.55.
Then we round. ROUND(2.55) = 3
Finally we multiply this rounded result again by the previous divisor: 3 * 100 = 300.
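The same steps can be traced in a few lines of Python, purely as an illustration. The function name is made up and the input is assumed to be positive. floor(x + 0.5) is used so that halves round up, as Oracle's ROUND does for positive numbers; Python's built-in round() rounds halves to even instead.

```python
import math

def round_to_leading_digit(x):
    """Round a positive number to its highest place value."""
    exponent = math.floor(math.log10(x))   # TRUNC(LOG(10, col))
    magnitude = 10 ** exponent              # POWER(10, ...)
    leading = math.floor(x / magnitude + 0.5)  # ROUND(col / magnitude)
    return leading * magnitude

for number in (2555.5, 255, 25555, 2444):
    print(number, round_to_leading_digit(number))
```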
Oracle's ROUND function accepts a second parameter giving the number of digits to round to. Passing a negative value rounds to the left of the decimal point, which lets us simplify the select command (see fiddle):
SELECT
col,
ROUND(col, -TRUNC(LOG(10, col))) AS rounded
FROM Data
You can also use this to round by other fractions like quarters of the main number:
ROUND(4 * col, -TRUNC(LOG(10, col))) / 4 AS quarters
see fiddle
Similar to what Olivier built, you can use a combination of functions to round the numbers as you need. I built a similar method, except that instead of LOG I used LENGTH to get the number of digits before the decimal point.
WITH
nums (num)
AS
(SELECT 2555.5 FROM DUAL
UNION ALL
SELECT 255 FROM DUAL
UNION ALL
SELECT 25555 FROM DUAL
UNION ALL
SELECT 2444 FROM DUAL)
SELECT num,
ROUND (num, (LENGTH (TRUNC (num)) - 1) * -1) as rounded
FROM nums;
      NUM    ROUNDED
_________ __________
   2555.5       3000
      255        300
    25555      30000
     2444       2000
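A rough Python counterpart of the digit-count variant, as a sketch only. The function name is hypothetical and the input is assumed to be positive. The decimal module is used with ROUND_HALF_UP so that ties behave like Oracle's ROUND.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_by_digit_count(num):
    """Round a positive number to its leading digit using its digit count."""
    d = Decimal(str(num))
    digits = len(str(int(d)))            # LENGTH(TRUNC(num))
    unit = Decimal(10) ** (digits - 1)   # place value of the leading digit
    leading = (d / unit).quantize(Decimal(1), rounding=ROUND_HALF_UP)
    return leading * unit

for number in (2555.5, 255, 25555, 2444):
    print(number, round_by_digit_count(number))
```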
I want to randomly pull 3 records from a table and then order them by the field "sponsor_ranking".
My code reads:
$sql = "SELECT * FROM $TableSponsors ORDER BY RAND(), sponsor_ranking asc LIMIT 3";
But it does not order the results by "sponsor_ranking"; it only randomizes them.
Any suggestions?
Thank you.
The secondary sort on sponsor_ranking only takes effect between records that have the same RAND() value, which is very unlikely.
You can solve it like this: order by random, limit to 3, then order again by sponsor_ranking.
SELECT * FROM
(SELECT * FROM $TableSponsors
ORDER BY RAND()
LIMIT 3) x
ORDER BY
sponsor_ranking
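The two-step logic (pick at random first, sort afterwards) is perhaps easier to see outside SQL. A small Python sketch, where the list-of-dicts layout and the function name are assumptions for illustration:

```python
import random

def pick_sorted(sponsors, k=3):
    """Pick k sponsors at random, then order them by sponsor_ranking."""
    chosen = random.sample(sponsors, k)  # inner query: ORDER BY RAND() LIMIT 3
    return sorted(chosen, key=lambda s: s["sponsor_ranking"])  # outer ORDER BY

# Hypothetical sponsor rows.
sponsors = [{"name": f"S{i}", "sponsor_ranking": i} for i in range(1, 11)]
print(pick_sorted(sponsors))
```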
You could make a subtable in the FROM clause:
$sql = "SELECT * FROM (SELECT * FROM $TableSponsors ORDER BY RAND() LIMIT 3) Faketable ORDER BY sponsor_ranking";
This will never work. When ordering by multiple fields, the second and subsequent fields are only considered for rows whose earlier fields have the same value.
You'll have to use a subquery to do the rand() ordering and the limit, then order by the other field in the parent query:
SELECT *
FROM (
    SELECT *
    FROM $TableSponsors
    ORDER BY RAND()
    LIMIT 3
) as foo
ORDER BY sponsor_ranking
e.g. if your table had this:
x y
1 5
1 6
2 7
3 8
4 9
... ORDER BY x DESC, y ASC
then you'd get
x y
4 9 // only one "4", so 9 is ignored, no point in sorting a single value
3 8 // only one "3", so 8 is ignored, no point in sorting a single value
2 7 // ditto
1 5 // hey, there's two "1" values, so now the second field **IS** sorted
1 6
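The same tie-breaking behaviour can be reproduced with a two-part sort key in Python. This is only an illustration of how multi-column ordering works, using the x/y rows from the example above:

```python
rows = [(1, 5), (1, 6), (2, 7), (3, 8), (4, 9)]

# Sort by x descending; y ascending only decides between rows with equal x.
ordered = sorted(rows, key=lambda row: (-row[0], row[1]))
print(ordered)  # [(4, 9), (3, 8), (2, 7), (1, 5), (1, 6)]
```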
Consider the following example:
100.00 - Original Number
33.33 - 1st divided by 3
33.33 - 2nd divided by 3
33.33 - 3rd divided by 3
99.99 - Is the sum of the 3 division outcomes
But I want it to match the original 100.00
One way that I saw it could be done was to subtract the first two divisions from the original number and use the result as my third number. Now if I add those 3 numbers I get my original number.
100.00 - Original Number
33.33 - 1st divided by 3
33.33 - 2nd divided by 3
33.34 - 3rd number
100.00 - Which gives me my original number correctly. (33.33+33.33+33.34 = 100.00)
Is there a formula for this either in Oracle PL/SQL or a function or something that could be implemented?
Thanks in advance!
This version takes precision as a parameter as well:
with q as (select 100 as val, 3 as parts, 2 as prec from dual)
select rownum as no
,case when rownum = parts
then val - round(val / parts, prec) * (parts - 1)
else round(val / parts, prec)
end v
from q
connect by level <= parts
no v
=== =====
1 33.33
2 33.33
3 33.34
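The logic of the query (round every share, then let the last part absorb the remainder) can be sketched in Python as follows. The function name is an assumption; the decimal module is used to avoid binary floating-point noise, with ROUND_HALF_UP as the rounding mode.

```python
from decimal import Decimal, ROUND_HALF_UP

def split_amount(val, parts, prec=2):
    """Split val into `parts` rounded shares that add up to val exactly."""
    val = Decimal(str(val))
    quantum = Decimal(1).scaleb(-prec)   # 0.01 for prec=2
    share = (val / parts).quantize(quantum, rounding=ROUND_HALF_UP)
    last = val - share * (parts - 1)     # the last part takes the remainder
    return [share] * (parts - 1) + [last]

print(split_amount(100, 3))
```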
For example, if you want to split the value among the number of days in the current month, you can do this:
with q as (select 100 as val
,extract(day from last_day(sysdate)) as parts
,2 as prec from dual)
select rownum as no
,case when rownum = parts
then val - round(val / parts, prec) * (parts - 1)
else round(val / parts, prec)
end v
from q
connect by level <= parts;
1 3.33
2 3.33
3 3.33
4 3.33
...
27 3.33
28 3.33
29 3.33
30 3.43
To apportion the value amongst each month, weighted by the number of days in each month, you could do this instead (change the level <= 3 to change the number of months it is calculated for):
with q as (
select add_months(date '2013-07-01', rownum-1) the_month
,extract(day from last_day(add_months(date '2013-07-01', rownum-1)))
as days_in_month
,100 as val
,2 as prec
from dual
connect by level <= 3)
,q2 as (
select the_month, val, prec
,round(val * days_in_month
/ sum(days_in_month) over (), prec)
as apportioned
,row_number() over (order by the_month desc)
as reverse_rn
from q)
select the_month
,case when reverse_rn = 1
then val - sum(apportioned) over (order by the_month
rows between unbounded preceding and 1 preceding)
else apportioned
end as portion
from q2;
01/JUL/13 33.7
01/AUG/13 33.7
01/SEP/13 32.6
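A weighted version of the same idea, again only as a Python sketch with a made-up function name: every portion except the last is rounded, and the last one is whatever is left over.

```python
from decimal import Decimal, ROUND_HALF_UP

def apportion(val, weights, prec=2):
    """Split val in proportion to weights; the last portion takes the remainder."""
    val = Decimal(str(val))
    quantum = Decimal(1).scaleb(-prec)
    total = sum(weights)
    portions = [
        (val * w / total).quantize(quantum, rounding=ROUND_HALF_UP)
        for w in weights[:-1]
    ]
    portions.append(val - sum(portions))
    return portions

# Days in July, August and September, as in the query above.
print(apportion(100, [31, 31, 30]))
```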
Use rational numbers. You could store the numbers as fractions rather than simple values. That's the only way to assure that the quantity is truly split in 3, and that it adds up to the original number. Sure you can do something hacky with rounding and remainders, as long as you don't care that the portions are not exactly split in 3.
The "algorithm" is simply that
100/3 + 100/3 + 100/3 == 300/3 == 100
Store both the numerator and the denominator in separate fields, then add the numerators. You can always convert to floating point when you display the values.
The Oracle docs even have a nice example of how to implement it:
CREATE TYPE rational_type AS OBJECT
( numerator INTEGER,
denominator INTEGER,
MAP MEMBER FUNCTION rat_to_real RETURN REAL,
MEMBER PROCEDURE normalize,
MEMBER FUNCTION plus (x rational_type)
RETURN rational_type);
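For comparison, here is how the rational-number idea looks with Python's fractions module. This is a sketch, not the Oracle object type, and the function name is hypothetical.

```python
from fractions import Fraction

def split_exact(total, parts):
    """Split total into `parts` exact rational shares."""
    share = Fraction(total, parts)  # stores numerator and denominator
    return [share] * parts

shares = split_exact(100, 3)
print(shares[0])    # 100/3
print(sum(shares))  # 100
```

The shares are only converted to a decimal representation, e.g. with float(), at display time.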
Here is a parameterized SQL version
SELECT COUNT (*), grp
FROM (WITH input AS (SELECT 100 p_number, 3 p_buckets FROM DUAL),
data
AS ( SELECT LEVEL id, (p_number / p_buckets) group_size
FROM input
CONNECT BY LEVEL <= p_number)
SELECT id, CEIL (ROW_NUMBER () OVER (ORDER BY id) / group_size) grp
FROM data)
GROUP BY grp
output:
COUNT(*) GRP
33 1
33 2
34 3
If you edit the input parameters (p_number and p_buckets), the SQL essentially distributes p_number as evenly as possible among the number of buckets requested (p_buckets).
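For whole units, an even distribution can also be computed directly with integer division. The Python sketch below uses a different technique from the query (divmod instead of row grouping), so bucket sizes may be assigned in a different order for other inputs. The function name is an assumption.

```python
def distribute(n, buckets):
    """Distribute n whole units as evenly as possible over the buckets."""
    base, extra = divmod(n, buckets)
    # The last `extra` buckets each receive one additional unit.
    return [base] * (buckets - extra) + [base + 1] * extra

print(distribute(100, 3))  # [33, 33, 34]
```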
I solved this problem yesterday by subtracting two of the three parts from the starting number, e.g. 100 - 33.33 - 33.33 = 33.34, so the sum of the parts is still 100.
I have a table with some positive integers:
n
----
1
2
5
10
For each row of this table I want the value cos(cos(...cos(0)..)) (cos applied n times) to be calculated by means of a SQL statement (PL/SQL stored procedures and functions are not allowed):
n coscos
--- --------
1 1
2 0.540302305868
5 0.793480358743
10 0.731404042423
I can do this in Oracle 11g by using recursive queries.
Is it possible to do the same in Oracle 10g ?
The MODEL clause can solve this:
Test data:
create table test1(n number unique);
insert into test1 select * from table(sys.odcinumberlist(1,2,5,10));
commit;
Query:
--The last row for each N has the final coscos value.
select n, coscos
from
(
--Set each value to the cos() of the previous value.
select * from
(
--Each value of N has N rows, with value rownumber from 1 to N.
select n, rownumber
from
(
--Maximum number of rows needed (the largest number in the table)
select level rownumber
from dual
connect by level <= (select max(n) from test1)
) max_rows
cross join test1
where max_rows.rownumber <= test1.n
order by n, rownumber
) n_to_rows
model
partition by (n)
dimension by (rownumber)
measures (0 as coscos)
(
coscos[1] = cos(0),
coscos[rownumber > 1] = cos(coscos[cv(rownumber)-1])
)
)
where n = rownumber
order by n;
Results:
N COSCOS
1 1
2 0.54030230586814
5 0.793480358742566
10 0.73140404242251
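As a cross-check of the numbers above, the iteration that the MODEL rules perform (each value is the cosine of the previous one) is a short loop in Python. The function name is chosen for illustration only.

```python
import math

def coscos(n):
    """Apply cos to 0 exactly n times."""
    value = 0.0
    for _ in range(n):
        value = math.cos(value)
    return value

for n in (1, 2, 5, 10):
    print(n, coscos(n))
```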
Let the holy wars begin:
Is this query a good idea? I wouldn't run this query in production, but hopefully it is a useful demonstration that any problem can be solved with SQL.
I've seen literally thousands of hours wasted because people are afraid to use SQL. If you're heavily using a database it is foolish not to use SQL as your primary programming language. It's good to occasionally spend a few hours testing the limits of SQL. A few strange queries are a small price to pay to avoid the disastrous row-by-row processing mindset that infects many database programmers.
Using WITH FUNCTION (Oracle 12c):
WITH FUNCTION coscos(n INT) RETURN NUMBER IS
BEGIN
IF n > 1
THEN RETURN cos(coscos(n-1));
ELSE RETURN cos(0);
END IF;
END;
SELECT n, coscos(n)
FROM t;
db<>fiddle demo
Output:
+-----+-------------------------------------------+
| N | COSCOS(N) |
+-----+-------------------------------------------+
| 1 | 1 |
| 2 | .5403023058681397174009366074429766037354 |
| 5 | .793480358742565591826054230990284002387 |
| 10 | .7314040424225098582924268769524825209688 |
+-----+-------------------------------------------+