I have a database table that stores multiple records of survey scores; the scores are between 1 and 100. I'm trying to present a frequency distribution on the app's front end by grouping the scores into the following ranges:
Less than 20
20-30
30-40
40-50
50-60
60-70
70-80
80-90
90-100
So if the table had the data 87, 92, 95, 98, the user would see
80 - 90 (1)
90 - 100 (3)
etc. I think collections are the way to go about it, but I don't know where to start to get this sort of output, or whether it's even possible in Laravel?
Yes, it's possible. I believe this is the SQL query that you need (assume your table name is "scores", and "score" is the appropriate field):
select (case when score between 0 and 20 then 'Less than 20'
when score between 21 and 30 then 'Between 21 and 30'
when score between 31 and 40 then 'Between 31 and 40'
when score between 41 and 50 then 'Between 41 and 50'
when score between 51 and 60 then 'Between 51 and 60'
when score between 61 and 70 then 'Between 61 and 70'
when score between 71 and 80 then 'Between 71 and 80'
when score between 81 and 90 then 'Between 81 and 90'
when score between 91 and 100 then 'Between 91 and 100'
end) as score_range, count(*) as count
from scores
group by score_range
order by min(score);
So for Laravel it could work like this:
$frequency = DB::select("SELECT (CASE
WHEN score BETWEEN 0 AND 20 THEN 'Less than 20'
WHEN score BETWEEN 21 AND 30 THEN '20-30'
WHEN score BETWEEN 31 AND 40 THEN '30-40'
WHEN score BETWEEN 41 AND 50 THEN '40-50'
WHEN score BETWEEN 51 AND 60 THEN '50-60'
WHEN score BETWEEN 61 AND 70 THEN '60-70'
WHEN score BETWEEN 71 AND 80 THEN '70-80'
WHEN score BETWEEN 81 AND 90 THEN '80-90'
WHEN score BETWEEN 91 AND 100 THEN '90-100'
END) AS score_range, COUNT(*) as count
FROM scores
GROUP BY score_range
ORDER BY MIN(score);");
You can edit the label text as you like.
In this query, "40-50" (for example) means that the score is between 41 and 50. You can also replace "ORDER BY MIN(score)" with "ORDER BY count" if you want to sort by frequency instead.
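For comparison, the same bucketing can be done in application code. This is only a minimal Python sketch, assuming integer scores and the same bounds as the CASE expression (20 and below, then 21-30, 31-40, and so on); the names LABELS, score_range and frequency are hypothetical.

```python
from collections import Counter

# Labels in display order; they match the labels used in the Laravel query.
LABELS = ["Less than 20", "20-30", "30-40", "40-50", "50-60",
          "60-70", "70-80", "80-90", "90-100"]


def score_range(score):
    """Return the label for an integer score, using the CASE bounds."""
    if score <= 20:
        return LABELS[0]
    # 21-30 -> index 1, 31-40 -> index 2, ..., 91-100 -> index 8
    return LABELS[(score - 1) // 10 - 1]


def frequency(scores):
    """Count scores per range, keeping only non-empty ranges in order."""
    counts = Counter(score_range(s) for s in scores)
    return [(label, counts[label]) for label in LABELS if counts[label]]


for label, count in frequency([87, 92, 95, 98]):
    print(f"{label} ({count})")
```

With the sample data from the question this yields one score in 80-90 and three in 90-100.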
table
room-number  entry-number  electricity
n100         5             100
n100         4             90
n200         2             75
n200         1             69
n300         6             150
n300         5             111
result should be
room-number  electricity
n100         100
n200         75
n300         150
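I haven't tested this, but try the following. The hyphenated column name and the reserved word "table" have to be quoted (MySQL uses backticks instead of double quotes):
SELECT "room-number", MAX(electricity) AS electricity FROM "table" GROUP BY "room-number"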
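To show what the GROUP BY with MAX computes, here is a small Python sketch over the sample rows. It is only an illustration; the function name max_per_room and the tuple layout are assumptions.

```python
def max_per_room(rows):
    """rows: iterable of (room_number, entry_number, electricity) tuples.

    Returns a dict mapping each room to its highest electricity value.
    """
    best = {}
    for room, _entry, electricity in rows:
        if room not in best or electricity > best[room]:
            best[room] = electricity
    return best


rows = [
    ("n100", 5, 100), ("n100", 4, 90),
    ("n200", 2, 75), ("n200", 1, 69),
    ("n300", 6, 150), ("n300", 5, 111),
]
print(max_per_room(rows))
```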
For example, I have a table as follows:
id math science english history
1 80 90 90 90
2 70 60 81 78
3 69 50 45 80
4 30 40 10 80
I only want to find the maximum value in the math and science columns.
Is it possible?
Simply use this:
select max(science),max(math) from your_table
I have a set of 80 students and I need to sort them into 20 groups of 4.
I have their previous exam scores from a prerequisite module, and I want to ensure that the average of each group's scores is as close as possible to the overall average of the previous exam scores.
Sorry, if that isn't particularly clear.
Here's a snapshot of the problem:
Student Score
AA 50
AB 45
AC 80
AD 70
AE 45
AF 55
AG 65
AH 90
So the average of the scores here is 62.5. How would I best go about sorting these eight students into two groups of four such that, for both groups, the average of their combined exam scores is as close as possible to 62.5?
My problem is exactly this but with 80 data points (20 groups) rather than 8 (2 groups).
The more I think about this problem the harder it seems.
Does anyone have any ideas?
Thanks
One Possible Solution:
I would try going with a greedy algorithm that starts by pairing each student with another student that gets you closest to your target average. After the initial pairing you should then be able to make subsequent pairs out of the first pairs using the same approach.
After the first round of pairing, this approach leverages taking the average of two averages and comparing that to the target mean to create subsequent groups. You can read more about why that will work for this problem here.
However, this will not necessarily give you the optimal solution; it is a heuristic technique for the problem. One noted example below is when one low value must be offset by three high values to reach the targeted mean. These types of groupings will not be accounted for by this technique. However, if you know you have a relatively normal distribution centered around your targeted mean, then I think this approach should give a decent approximation.
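A rough Python sketch of this pairing heuristic, using the eight students from the question. The order in which students are picked is not specified above, so taking the item farthest from the target first is an assumption, as are the function names pair_round and greedy_groups_of_four.

```python
def pair_round(items, target):
    """Pair up items so that each pair's mean is close to target.

    items: list of (members, mean) where members is a tuple of names.
    Assumption: items farthest from the target are handled first.
    """
    remaining = sorted(items, key=lambda it: abs(it[1] - target), reverse=True)
    paired = []
    while len(remaining) >= 2:
        members, mean = remaining.pop(0)
        # Pick the partner whose combined mean is closest to the target.
        best = min(range(len(remaining)),
                   key=lambda i: abs((mean + remaining[i][1]) / 2 - target))
        partner_members, partner_mean = remaining.pop(best)
        paired.append((members + partner_members, (mean + partner_mean) / 2))
    return paired


def greedy_groups_of_four(scores):
    """scores: dict name -> score; the number of students must divide by 4."""
    target = sum(scores.values()) / len(scores)
    singles = [((name,), float(score)) for name, score in scores.items()]
    pairs = pair_round(singles, target)   # students -> pairs
    return pair_round(pairs, target)      # pairs -> groups of four


scores = {"AA": 50, "AB": 45, "AC": 80, "AD": 70,
          "AE": 45, "AF": 55, "AG": 65, "AH": 90}
for members, mean in greedy_groups_of_four(scores):
    print(members, mean)
```

Because both halves of a group have the same size, the mean of the two pair means equals the mean of the four scores, which is what the second round relies on.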
First sort the group by score. So it becomes:
AH 90
AC 80
.....
AB 45
AE 45
Then start combining the first with the last:
(AE, AH, 67.5)
(AB, AC, 62.5)
(AD, AA, 60)
(AG, AF, 60)
And so on. For groups of four, you combine two by two instead: the first two with the last two.
Another way:
1. Find all the possible groups of 4 students.
2. Then, for every combination of groups, find each group's absolute deviation from the average score and sum these up for the combination.
3. Choose the combination of groups with the lowest sum.
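The three steps above can be sketched in Python as an exhaustive search. This is only workable for small inputs such as the eight-student example; with 80 students the number of possible partitions is far too large to enumerate. The names partitions and best_partition are hypothetical.

```python
from itertools import combinations


def partitions(names, size):
    """Yield every way to split names into unordered groups of `size`."""
    if not names:
        yield []
        return
    first, rest = names[0], names[1:]
    # Fixing the first name in the first group avoids duplicate orderings.
    for others in combinations(rest, size - 1):
        group = (first,) + others
        left = [n for n in rest if n not in others]
        for tail in partitions(left, size):
            yield [group] + tail


def best_partition(scores, size=4):
    """Return the partition with the lowest summed absolute deviation."""
    target = sum(scores.values()) / len(scores)

    def cost(groups):
        return sum(abs(sum(scores[n] for n in g) / size - target)
                   for g in groups)

    return min(partitions(list(scores), size), key=cost)


scores = {"AA": 50, "AB": 45, "AC": 80, "AD": 70,
          "AE": 45, "AF": 55, "AG": 65, "AH": 90}
for group in best_partition(scores):
    print(group, sum(scores[n] for n in group) / 4)
```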
Initially, I did think about the top-bottom match option.
However, as John has highlighted, the results certainly aren't optimal:
Scores Students Avg.
40 94 40 94 'AE' 'DA' 'AI' 'AR' 67
40 90 40 88 'AK' 'CI' 'AM' 'BP' 64.5
40 85 40 80 'AQ' 'AW' 'AT' 'BD' 61.25
40 79 40 77 'AU' 'BC' 'AV' 'AB' 59
40 76 40 75 'AX' 'CG' 'AZ' 'CQ' 57.75
40 75 40 75 'BF' 'CB' 'BN' 'BQ' 57.5
40 75 40 74 'BR' 'BI' 'CF' 'CZ' 57.25
40 74 40 74 'CK' 'CO' 'CP' 'AL' 57
40 72 41 71 'DB' 'CN' 'AG' 'BO' 56
41 71 42 70 'CD' 'BM' 'AH' 'BS' 56
42 70 42 69 'BG' 'BL' 'CU' 'CX' 55.75
43 68 44 67 'BK' 'CY' 'AD' 'CE' 55.5
44 64 44 64 'BJ' 'CR' 'BZ' 'BY' 54
45 64 45 63 'BW' 'BV' 'CS' 'BE' 54.25
45 62 47 60 'CV' 'CH' 'AC' 'CM' 53.5
47 59 47 58 'BT' 'AY' 'CL' 'AP' 52.75
47 57 48 57 'CT' 'BA' 'BX' 'AS' 52.25
48 56 49 56 'CA' 'AJ' 'AN' 'AA' 52.25
50 55 50 54 'BB' 'AF' 'CJ' 'AO' 52.25
51 52 51 52 'CC' 'BU' 'CW' 'BH' 51.5
Does anyone know how to create a calculated column (in Spotfire) that will sum data in order of increasing values contained within another column?
For example, what would the expression be to sum the data in [P] in increasing order of [K], for each [Well]?
Some example data:
Well Depth P K
A 85 0.191 108
A 85.5 0.192 102
A 87 0.17 49
A 88 0.184 47
A 89 0.192 50
B 298 0.215 177
B 298.5 0.2 177
B 300 .017 105
B 301 0.23 200
You can use:
Sum([P]) OVER (intersect([Well],AllPrevious([K])))
This returns the cumulative sum of P per Well, in ascending order of K.
Well K P Cumulative Sum of P
A 47 0,184 0,184
A 49 0,17 0,354
A 50 0,192 0,546
A 102 0,192 0,738
A 108 0,191 0,929
B 105 0,017 0,017
B 177 0,215 0,432
B 177 0,2 0,432
B 200 0,23 0,662
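To make the semantics of the table concrete, here is a small Python sketch of the same calculation: for each row, P is summed over all rows of the same Well whose K is less than or equal to the row's K, so rows with tied K values (such as the two K = 177 rows) get the same total. The function name and the rounding to three decimals for display are assumptions.

```python
def cumulative_p_by_k(rows):
    """rows: list of (well, k, p). Returns (well, k, p, cumulative_p) tuples."""
    result = []
    for well, k, p in rows:
        # Sum P over rows of the same well with K up to and including this K.
        total = sum(p2 for w2, k2, p2 in rows if w2 == well and k2 <= k)
        result.append((well, k, p, round(total, 3)))  # rounded for display
    return result


rows = [
    ("A", 108, 0.191), ("A", 102, 0.192), ("A", 49, 0.17),
    ("A", 47, 0.184), ("A", 50, 0.192),
    ("B", 177, 0.215), ("B", 177, 0.2), ("B", 105, 0.017), ("B", 200, 0.23),
]
for row in sorted(cumulative_p_by_k(rows), key=lambda r: (r[0], r[1])):
    print(row)
```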
Edit based on OP's comment:
To get the cumulative sum in descending order of K, you can use:
Sum([P]) OVER (intersect([Well],AllNext([K])))
My input table looks like:
guest_id days
101 79
101 70
101 68
101 61
102 101
102 90
102 55
103 99
103 90
Note that days are in descending order within each guest_id.
Desired output table:
guest_id days days_diff
101 79 0
101 70 9
101 68 2
101 61 7
102 101 0
102 90 11
102 55 35
103 99 0
103 90 9
days_diff is the first-order difference within each guest_id (not across the whole days column).
You need to have a unique id column as well (otherwise Hive doesn't know about the order of your rows).
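Then you can self-join each row to the previous one (a.id = b.id + 1, assuming consecutive ids) to get your differences. The outer join keeps the very first row, which has no predecessor:
select a.guest_id,
a.days,
case when a.guest_id = b.guest_id then b.days - a.days else 0 end as days_diff
from
input a
left outer join input b on a.id = b.id + 1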
Edit: As pointed out by Kunal in the comments, Hive does have a LAG window function, which requires a PARTITION BY ... ORDER BY clause. You still need something to order your table by; for example, if you have a date column, you would use it like this (the first row of each guest has no previous value, so COALESCE turns its NULL into 0):
SELECT guest_id,
days,
COALESCE(LAG(days, 1) OVER (PARTITION BY guest_id ORDER BY date) - days, 0) AS days_diff
FROM input;
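As a cross-check of the expected output, the same first-order difference can be sketched in Python. This assumes the rows are already in the order shown in the input table; the function name days_diff is hypothetical.

```python
def days_diff(rows):
    """rows: list of (guest_id, days), already ordered as in the input table.

    Returns (guest_id, days, diff), where diff is the previous row's days
    minus this row's days within the same guest, and 0 for a guest's first row.
    """
    result = []
    prev_guest, prev_days = None, None
    for guest_id, days in rows:
        diff = prev_days - days if guest_id == prev_guest else 0
        result.append((guest_id, days, diff))
        prev_guest, prev_days = guest_id, days
    return result


rows = [(101, 79), (101, 70), (101, 68), (101, 61),
        (102, 101), (102, 90), (102, 55),
        (103, 99), (103, 90)]
for row in days_diff(rows):
    print(row)
```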