Creating frequency interval in Crystal Report - oracle

I am trying to create a dataset of frequency intervals in Crystal Reports, something like the one below. The first column is the row id, the second is the interval start, the third is the interval end, and the fourth is the interval name.
1 0 29 0 - 29
2 30 59 30 - 59
3 60 89 60 - 89
4 90 119 90 - 119
5 120 149 120 - 149
6 150 179 150 - 179
7 180 209 180 - 209
8 210 239 210 - 239
9 240 269 240 - 269
10 270 299 270 - 299
11 300 329 300 - 329
12 330 359 330 - 359
13 360 389 360 - 389
14 390 419 390 - 419
15 420 449 420 - 449
16 450 479 450 - 479
17 480 509 480 - 509
18 510 539 510 - 539
19 540 569 540 - 569
20 570 599 570 - 599
21 600 629 600 - 629
22 630 659 630 - 659
23 660 689 660 - 689
24 690 719 690 - 719
25 720 749 720 - 749
26 750 779 750 - 779
27 780 809 780 - 809
28 810 839 810 - 839
29 840 869 840 - 869
30 870 899 870 - 899
Can I write a CTE to generate these intervals so that I can use them directly in Crystal Reports without writing a function on the database side? Below is the code I wrote:
declare intervalStart integer := 0;
intervalEnd integer := 900;
intervalMins varchar(10) := 30;
totalIntervals number := 0;
begin
begin
execute immediate 'create global temporary table intervalTable (row_Id int not null, intStart integer, intEnd integer, intervalName varchar2(25))ON COMMIT DELETE ROWS';
exception when others then dbms_output.put_line(sqlerrm);
end;
totalIntervals := intervalEnd/intervalMins;
--dbms_output.put_line(totalIntervals);
for i in 1 ..totalIntervals loop
intervalStart := 0;
intervalEnd := 0;
intervalStart := intervalStart + (i-1)*intervalMins;
intervalEnd := intervalEnd + (i*intervalMins)-1;
--dbms_output.put_line(intervalStart || ' - ' || intervalEnd);
insert into intervalTable
(
row_id,
intStart,
intEnd,
intervalName
)
values(i, intervalStart, intervalEnd, (intervalStart || ' - ' || intervalEnd));
end loop;
end;

I think you want something like this:
with freq_data as (
select level as id, (level-1)*30 as start_interval, ((level-1)*30) + 29 as end_interval, (level-1)*30 || ' - ' || to_char(((level-1)*30) + 29) as label
from dual
connect by level <= 30
order by level
)
select * from freq_data;
Output
ID START_INTERVAL END_INTERVAL LABEL
1 0 29 0 - 29
2 30 59 30 - 59
3 60 89 60 - 89
4 90 119 90 - 119
5 120 149 120 - 149
6 150 179 150 - 179
7 180 209 180 - 209
8 210 239 210 - 239
9 240 269 240 - 269
10 270 299 270 - 299
11 300 329 300 - 329
12 330 359 330 - 359
13 360 389 360 - 389
14 390 419 390 - 419
15 420 449 420 - 449
16 450 479 450 - 479
17 480 509 480 - 509
18 510 539 510 - 539
19 540 569 540 - 569
20 570 599 570 - 599
21 600 629 600 - 629
22 630 659 630 - 659
23 660 689 660 - 689
24 690 719 690 - 719
25 720 749 720 - 749
26 750 779 750 - 779
27 780 809 780 - 809
28 810 839 810 - 839
29 840 869 840 - 869
30 870 899 870 - 899
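For comparison, the same rows can be produced outside the database. This is a minimal Python sketch, assuming a fixed width of 30 and 30 intervals as in the question; the function name frequency_intervals is made up for illustration and is not part of Oracle or Crystal Reports.

```python
def frequency_intervals(width=30, count=30):
    """Return (row_id, start, end, label) tuples, mirroring the CTE columns."""
    rows = []
    for row_id in range(1, count + 1):
        start = (row_id - 1) * width      # same as (level-1)*30 in the CTE
        end = start + width - 1           # same as ((level-1)*30) + 29
        rows.append((row_id, start, end, f"{start} - {end}"))
    return rows


# Print the first three rows in the same layout as the query output.
for row in frequency_intervals()[:3]:
    print(*row)
```

Changing width or count gives other bucket sizes without touching the loop.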
An example using the above in a join query:
create table my_test
(
num number
-- other important data ...
);
-- insert some random numbers
insert into my_test
select trunc(DBMS_RANDOM.VALUE(0,900))
from dual
connect by level <= 10;
commit;
Now joining to get the label for each num field:
with freq_data as (
select level as id, (level-1)*30 as start_interval, ((level-1)*30) + 29 as end_interval, (level-1)*30 || ' - ' || to_char(((level-1)*30) + 29) as label
from dual
connect by level <= 30
order by level
)
select t.num, d.label
from my_test t
left join freq_data d ON (t.num between d.start_interval and d.end_interval);
Output:
NUM LABEL
64 60 - 89
73 60 - 89
128 120 - 149
154 150 - 179
267 240 - 269
328 300 - 329
550 540 - 569
586 570 - 599
745 720 - 749
795 780 - 809
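Because every interval has the same width, the label for a value can also be computed directly with integer division instead of a range join. A small Python sketch of that idea, assuming integer inputs (as produced by trunc above); interval_label is a hypothetical helper name:

```python
def interval_label(num, width=30, upper=900):
    """Return the 'start - end' label of the interval containing num, or None."""
    if not 0 <= num < upper:
        return None  # plays the role of the NULL a left join gives unmatched rows
    start = (num // width) * width
    return f"{start} - {start + width - 1}"


for num in (64, 128, 795):
    print(num, interval_label(num))
```

The join in the query stays the more general choice when the intervals are not all the same width.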

Related

increasing range parsing challenge with awk

I wrote this in response to Reddit's daily programmer challenge, and I would like to get some of your feedback on it to improve the code (it seems to work). The challenge is as follows:
We are given a list of numbers in a "short-hand" range notation where only the significant part of the next number is written because we know the numbers are always increasing (ex. "1,3,7,2,4,1" represents [1, 3, 7, 12, 14, 21]). Some people use different separators for their ranges (ex. "1-3,1-2", "1:3,1:2", "1..3,1..2" represent the same numbers [1, 2, 3, 11, 12]) and they sometimes specify a third digit for the range step (ex. "1:5:2" represents [1, 3, 5]).
NOTE: For this challenge range limits are always inclusive.
Our job is to return a list of the complete numbers.
The possible separators are: ["-", ":", ".."]
Sample input:
104..02
545,64:11
Sample output:
104 105 106...200 201 202 # truncated for simplicity
545 564 565 566...609 610 611 # truncated for simplicity
My solution:
BEGIN { FS = "," }
function next_value(current_value, previous_value) {
regexp = current_value "$"
while(current_value <= previous_value || !(current_value ~ regexp)) {
current_value += 10
}
return current_value;
}
{
j = 0
delete number_list
for(i = 1; i <= NF; i++) {
# handle fields with ranges
if($i ~ /-|:|\.\./) {
split($i, range, /-|:|\.\./)
if(range[1] > range[2]) {
if(j != 0) {
range[1] = next_value(range[1], number_list[j-1])
range[2] = next_value(range[2], range[1])
}
else
range[2] = next_value(range[2], range[1]);
}
if(range[3] == "")
number_to_iterate_by = 1;
else
number_to_iterate_by = range[3];
range_iterator = range[1]
while(range_iterator <= range[2]) {
number_list[j] = range_iterator
range_iterator += number_to_iterate_by
j++
}
}
else {
number_list[j] = $i
j++
}
}
# apply increasing range logic and print
for(i = 0; i < j; i++ ) {
if(i == 0) {
if(NR != 1) printf "\n"
current_value = number_list[i]
}
else {
previous_value = current_value
current_value = next_value(number_list[i], previous_value)
}
printf "%s ", current_value
}
}
END { printf "\n" }
This is BASH (Not AWK).
I believe it is a valid answer because the original challenge doesn't specify a language.
#!/bin/bash
mkord(){ local v=$1 dig base
max=$2
(( dig=10**${#v} , base=max/dig*dig , v+=base ))
while (( v < max )); do (( v+=dig )); done
max=$v
}
while read line; do
line="${line//[,\"]/ }" line="${line//[:-]/..}"
IFS=' ' read -a arr <<<"$line"
max=0 a='' res=''
for val in "${arr[@]//../ }"; do
IFS=" " read v1 v2 v3 <<<"$val"
(( a==0 )) && max=$v1
[[ $v1 ]] && mkord "$v1" "$max" && v1=$max
[[ $v2 ]] && mkord "$v2" "$max" && v2=$max
res=$res${a:+,}${v2:+\{}$v1${v2:+\.\.}$v2${v3:+\.\.}$v3${v2:+\}}
a=1
done
(( ${#arr[@]} > 1 )) && res={$res}
eval set -- $res
echo "\"$*\""
done <"infile"
If the source of the tests is:
$ cat infile
"1,3,7,2,4,1"
"1-3,1-2"
"1:5:2"
"104-2"
"104..02"
"545,64:11"
The result will be:
"1 3 7 12 14 21"
"1 2 3 11 12"
"1 3 5"
"104 105 106 107 108 109 110 111 112"
"104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202"
"545 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611"
This gets the list done in 7 milliseconds.
My solution uses gawk's RT variable (which holds the input text that matched the record separator RS) and a next_n function that uses the modulo operation to find the next number based on the last one:
cat range.awk
BEGIN{
RS="\\.\\.|,|:|-"
start = ""
end = 0
temp = ""
}
function next_n(n, last){
mod = last % (10**length(n))
if(mod < n) return last - mod + n
return last + ((10**length(n))-mod) + n
}
{
if(RT==":" || RT==".." || RT=="-"){
if(start=="") start = next_n($1,end)
else temp = $1
}else{
if(start != ""){
if(temp==""){
end = next_n($1,start)
step = 1
}else {
end = next_n(temp,start)
step = $1
}
for(i=start; i<=end; i+=step) printf "%s ", i
start = ""
temp = ""
}else{
end = next_n($1,end)
printf "%s ", end
}
}
}
END{
print ""
}
TEST 1
echo "104..02" | awk -f range.awk
OUTPUT 1
104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202
TEST 2
echo "545,64:11" | awk -f range.awk
OUTPUT 2
545 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611
TEST 3
echo "2..5,7,2-1,2:1,0-3,2-7,8..0,4,4,2..1" | awk -f range.awk
OUTPUT 3
2 3 4 5 7 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 40 41 42 43 52 53 54 55 56 57 58 59 60 64 74 82 83 84 85 86 87 88 89 90 91
TEST 4 with step
echo "1:5:2,99,88..7..3" | awk -f range.awk
OUTPUT 4
1 3 5 99 188 191 194 197
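The modulo step used by next_n can also be sketched in Python, together with a small driver that expands a whole line. This is an illustration rather than a port of any of the scripts: the name expand and the starting value of -1 (chosen so that a leading "0" expands to 0) are assumptions, and, like the gawk script, it continues counting from the range limit rather than from the last number emitted.

```python
import re


def next_n(suffix, last):
    """Smallest integer greater than `last` whose decimal digits end in `suffix`."""
    base = 10 ** len(suffix)
    n = int(suffix)
    mod = last % base
    if mod < n:
        return last - mod + n
    return last + (base - mod) + n


def expand(spec):
    """Expand a short-hand list such as '545,64:11' into the full numbers."""
    numbers = []
    last = -1  # assumption: start below zero so that a leading "0" expands to 0
    for item in spec.split(","):
        parts = re.split(r"\.\.|-|:", item)
        start = next_n(parts[0], last)
        if len(parts) == 1:          # plain number, no range
            numbers.append(start)
            last = start
            continue
        end = next_n(parts[1], start)
        step = int(parts[2]) if len(parts) > 2 else 1
        numbers.extend(range(start, end + 1, step))
        last = end                   # continue from the range limit
    return numbers


print(expand("1,3,7,2,4,1"))
print(expand("1:5:2,99,88..7..3"))
```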

Keep lines based on ratio between lines

I have a sort -g -k9 command on a file that gives me this on standard output:
55.19 645 156 15 9 520 58 702 0.0 661
55.50 636 159 16 9 520 58 693 0.0 654
55.19 645 156 15 9 520 58 702 0.0 658
56.52 644 147 16 9 520 59 701 0.0 669
55.97 645 151 15 9 520 65 709 0.0 672
55.97 645 151 15 9 520 65 709 4e-124 674
28.32 671 301 32 1 507 48 702 3e-49 183
28.32 671 301 32 1 507 47 701 3e-49 183
31.40 516 247 24 86 507 196 698 1e-46 176
31.41 519 243 25 86 507 196 698 5e-46 175
27.72 588 290 26 19 481 98 675 2e-39 154
30.56 337 170 17 101 413 302 598 5e-20 96.3
30.56 337 170 17 101 413 302 598 8e-20 95.5
I would like to cut my data based on the 9th column. The idea is to take the value of the 9th column on line i and divide it by the value of the 9th column on line i+1; if the ratio is 0 OR 0/0 OR > 1e-50, lines i and i+1 are kept. As soon as none of these conditions is met, stop reading. The desired output would be:
55.19 645 156 15 9 520 58 702 0.0 661
55.50 636 159 16 9 520 58 693 0.0 654
55.19 645 156 15 9 520 58 702 0.0 658
56.52 644 147 16 9 520 59 701 0.0 669
55.97 645 151 15 9 520 65 709 0.0 672
55.97 645 151 15 9 520 65 709 4e-124 674
I can obtain this output with head -n 6 but this is obviously not based on the condition on values in the 9th column. Please note that the values are in 'scientific' format.
I know how to do this in Python (write the standard output to a file, calculate ratios, etc.) but for convenience I'd prefer a shell-based solution (with awk or sort, for instance), although I don't know if that's possible. Thanks for your help!
Just exit the script as soon as the condition is no longer met; otherwise, print the current line and store its 9th field to compare against on the next line:
$ awk 'NR>1 && $9 && prev/$9 && prev/$9<=1e-50 {exit} {print; prev=$9}' file
55.19 645 156 15 9 520 58 702 0.0 661
55.50 636 159 16 9 520 58 693 0.0 654
55.19 645 156 15 9 520 58 702 0.0 658
56.52 644 147 16 9 520 59 701 0.0 669
55.97 645 151 15 9 520 65 709 0.0 672
55.97 645 151 15 9 520 65 709 4e-124 674
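The stopping rule from the question can also be written out step by step. The following Python sketch is one reading of that rule, not the only possible one: a line is kept when the ratio previous/current is 0, is undefined or infinite because the current value is 0, or is greater than the threshold. The function name keep_while_ratio_ok and the small SAMPLE list are illustrative.

```python
def keep_while_ratio_ok(lines, col=8, threshold=1e-50):
    """Keep lines until the ratio of consecutive values in `col` drops too low."""
    kept = []
    prev = None
    for line in lines:
        value = float(line.split()[col])   # col=8 is the 9th whitespace field
        if prev is not None and value != 0.0:
            ratio = prev / value
            # ratio == 0 is explicitly allowed; anything else must exceed the threshold
            if ratio != 0.0 and ratio <= threshold:
                break
        kept.append(line)
        prev = value
    return kept


SAMPLE = [
    "55.97 645 151 15 9 520 65 709 0.0 672",
    "55.97 645 151 15 9 520 65 709 4e-124 674",
    "28.32 671 301 32 1 507 48 702 3e-49 183",
    "28.32 671 301 32 1 507 47 701 3e-49 183",
]
for line in keep_while_ratio_ok(SAMPLE):
    print(line)
```

On this sample only the first two lines are kept, because 4e-124 / 3e-49 is far below 1e-50.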

Nurse scheduling model formulation in AMPL

I have been working on a Nurse scheduling problem in AMPL for the following conditions:
Total no. of Nurses=20
Total no. of shifts = 3 #morning,day,night
Planning Horizon 7 days: let's say M T W R F Sa Su
Along with following constraints:
Max no. of working days in a week: 5
A rest day after 4 consecutive night shifts.
Consecutive night and morning shifts are not allowed.
Demand per shift is 7 nurses.
A nurse can only work in one shift per day, i.e. morning, night, day
Cost scenarios:
Morning shift: $12
Day shift: $13
Night shift : $15
Objective function is to minimize the cost of operation as per Nurse preferences.
Can anyone give me an idea of how this problem can be formulated ?
First, some unusual things in your problem definition:
This is not a real optimization problem, since your objective function is fixed by definition (every shift has 7 nurses, and every nurse costs the same per shift).
In your problem you defined 7 nurses per shift with a maximum of 5 working days. So you need 7 nurses on three shifts on seven days, which equals 147 nurse-shifts. But with the cap of five working days and only one shift per day, you have just 20 nurses with 5 shifts each, which equals 100 nurse-shifts.
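The capacity argument is simple arithmetic and can be checked in a few lines of Python. The variable names below are illustrative; the numbers come from the question.

```python
nurses, days, shifts = 20, 7, 3
max_days_per_nurse = 5
demand_per_shift = 7

required = demand_per_shift * shifts * days         # nurse-shifts that must be covered
available = nurses * max_days_per_nurse             # nurse-shifts the staff can supply
max_feasible_demand = available // (shifts * days)  # largest uniform demand that fits

print(required, available, max_feasible_demand)     # 147 100 4
```

The last value is why a demand of 4 nurses per shift is the largest uniform demand that 20 nurses can cover.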
I've built the problem in Mathprog but the code should be more or less equal to AMPL. I've started with three sets for the nurses, days and shifts.
set shifts := {1,2,3};
set days := {1,2,3,4,5,6,7};
set nurses := {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20};
The schedule is defined as a set of binary variables:
var schedule{nurses, days, shifts}, binary;
The simple objective contains the sum of all nurse/shifts in this week with the related prices:
minimize cost: sum{i in nurses, j in days}(schedule[i,j,1]*c_morning+schedule[i,j,2]*c_day+schedule[i,j,3]*c_night);
For your first constraint, one can limit the sum of all shifts per nurse to five, since only one shift per day is possible:
s.t. working_days{n in nurses}:
sum{i in days, j in shifts}(schedule[n,i,j]) <= 5;
The rest day is the hardest part of the problem. For simplicity I've created another set which contains just the days on which a nurse could have completed four night shifts in a row. You can also formulate the constraint with the original set of days and exclude the first four days.
set nigth_days := {5,6,7};
s.t. rest{n in nurses,i in nigth_days}:
(schedule[n,i-4,3]+schedule[n,i-3,3]+schedule[n,i-2,3]+schedule[n,i-1,3]+sum{j in shifts}(schedule[n,i,j])) <= 4;
To avoid a morning shift right after a night shift, I used the same approach as for the rest days. The seventh day is excluded, since there is no eighth day on which to look for a morning shift.
set yester_days := {1,2,3,4,5,6};
s.t. night_morning{i in yester_days, n in nurses}:
(schedule[n,i,3]+schedule[n,i+1,1]) <= 1;
The demand of four nurses per shift should be met (I've reduced the number, since more than 4 nurses per shift is infeasible due to the 5-shift limit):
s.t. demand_shift{i in days, j in shifts}:
sum{n in nurses}(schedule[n,i,j]) = 4;
The fifth constraint is to limit the shifts per day to a max of one.
s.t. one_shift{n in nurses, i in days}:
sum{ j in shifts}(schedule[n,i,j]) <= 1;
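To see what these five constraints demand of a concrete roster, they can be restated as a plain checker. The Python sketch below is only an illustration and does not replace the solver model: it assumes 0-based indices, shift order morning/day/night, and the reduced demand of 4 per shift; check_schedule and example_schedule are made-up names, and the example roster is hand-built, not solver output.

```python
MORNING, DAY, NIGHT = 0, 1, 2


def check_schedule(schedule, demand=4, max_days=5):
    """Return a list of violations for schedule[nurse][day][shift] (0/1 entries)."""
    n_days = len(schedule[0])
    problems = []
    for n, plan in enumerate(schedule):
        worked = [sum(day) for day in plan]  # shifts worked on each day
        if max(worked) > 1:
            problems.append(f"nurse {n}: more than one shift on a day")
        if sum(worked) > max_days:
            problems.append(f"nurse {n}: more than {max_days} working days")
        for d in range(4, n_days):
            # four nights in a row must be followed by a day without any shift
            nights = sum(plan[d - k][NIGHT] for k in range(1, 5))
            if nights + worked[d] > 4:
                problems.append(f"nurse {n}: no rest on day {d} after 4 nights")
        for d in range(n_days - 1):
            if plan[d][NIGHT] + plan[d + 1][MORNING] > 1:
                problems.append(f"nurse {n}: morning after night on day {d + 1}")
    for d in range(n_days):
        for s in (MORNING, DAY, NIGHT):
            staffed = sum(plan[d][s] for plan in schedule)
            if staffed != demand:
                problems.append(f"day {d}, shift {s}: {staffed} nurses")
    return problems


def example_schedule(nurses=20, days=7):
    """Hand-built roster: 6 nurses per shift type, each resting every third day."""
    schedule = [[[0, 0, 0] for _ in range(days)] for _ in range(nurses)]
    for n in range(18):              # nurses 18 and 19 stay unassigned
        shift = n // 6               # 0-5 mornings, 6-11 days, 12-17 nights
        for d in range(days):
            if d % 3 != n % 3:
                schedule[n][d][shift] = 1
    return schedule


print(check_schedule(example_schedule()))
```

An empty list means the roster satisfies all five constraints; any violated constraint is reported by name.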
set nurse; #no. of full time employees working in the facility
set days; #planning horizon
set shift; #no. of shift in a day
set S; #shift corresponding to the outsourced nurses
set D;#day corresponding to the outsourced nurses
set N;#
# ith nurse working on day j
# j starts from Monday (j=1), Tuesday( j=2), Wednesday (j=3), Thursday(j=4), Friday(j=5), Saturday(j=6), Sunday(j=7)
#s be the shift as morning, day and night
param availability{i in nurse, j in days};
param costpershift{i in nurse, j in days, s in shift};
param outcost{n in N, l in D, m in S};
var nurseavailability{i in nurse,j in days,s in shift} binary; # = 1 if nurse i is available on jth day working on sth shift, 0 otherwise
var outsourced{n in N, l in D, m in S} integer;
#Objective function
minimize Cost: sum{i in nurse, j in days, s in shift} costpershift[i,j,s]*nurseavailability[i,j,s]+ sum{ n in N, l in D, m in S}outcost[n,l,m]*outsourced[n,l,m];
#constraints
#maximum no. of shifts per day
subject to maximum_shifts_perday {i in nurse,j in days}:
sum{s in shift} nurseavailability[i,j,s]*availability[i,j] <= 1;
#maximum no. of working days a week
subject to maximum_days_of_work {i in nurse}:
sum{j in days,s in shift} availability[i,j]*nurseavailability[i,j,s]<=5; #maximum working days irrespective of shifts
# rest days after night shifts
subject to rest_days_after_night_shift{i in nurse}:
sum{j in days} availability[i,j]*nurseavailability[i,j,3]<=4;
#demand per shift
subject to supply{j in days, s in shift, l in D, m in S}:
sum{i in nurse} availability[i,j]*nurseavailability[i,j,s] + sum{n in N} outsourced[n,l,m]=7;
#outsourcing only works well when there is more variability in supply.
#increasing the staff no. would be effective for reducing the cost variability in demand.
#considering a budget of $16,000 per week
#outsourcing constraints: a maximum of 20 nurses can be outsourced per shift
# no. of fulltime employees=30
#demand is 7 nurses per shift
#the average variability
#all nurses are paid equally # $12 per hour.
#cost of an outsourced shift is $144.
#cost of morning shift is $96.
#cost of day shift is $104.
#cost of night shift is $120.
data;
#set nurse ordered:= nurse1 nurse2 nurse3 nurse4 nurse5 nurse6 nurse7 nurse8
#nurse9 nurse10 nurse11 nurse12 nurse13 nurse14 nurse15 nurse16 nurse17
#nurse18 nurse19 nurse20;
set nurse:= 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30;
#set days ordered:= Monday Tuesday Wednesday Thursday Friday Saturday Sunday;
set days:= 1 2 3 4 5 6 7;
#set shift ordered:= Morning Day Night;
set shift:= 1 2 3;
set D:= 1 2 3 4 5 6 7; #outsourced days
set S:=1 2 3; #outsourced shifts
set N := 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20;
param outcost
[*,*,1]:
1 2 3 4 5 6 7:=
1 144 144 144 144 144 144 144
2 144 144 144 144 144 144 144
3 144 144 144 144 144 144 144
4 144 144 144 144 144 144 144
5 144 144 144 144 144 144 144
6 144 144 144 144 144 144 144
7 144 144 144 144 144 144 144
8 144 144 144 144 144 144 144
9 144 144 144 144 144 144 144
10 144 144 144 144 144 144 144
11 144 144 144 144 144 144 144
12 144 144 144 144 144 144 144
13 144 144 144 144 144 144 144
14 144 144 144 144 144 144 144
15 144 144 144 144 144 144 144
16 144 144 144 144 144 144 144
17 144 144 144 144 144 144 144
18 144 144 144 144 144 144 144
19 144 144 144 144 144 144 144
20 144 144 144 144 144 144 144
[*,*,2]:
1 2 3 4 5 6 7:=
1 144 144 144 144 144 144 144
2 144 144 144 144 144 144 144
3 144 144 144 144 144 144 144
4 144 144 144 144 144 144 144
5 144 144 144 144 144 144 144
6 144 144 144 144 144 144 144
7 144 144 144 144 144 144 144
8 144 144 144 144 144 144 144
9 144 144 144 144 144 144 144
10 144 144 144 144 144 144 144
11 144 144 144 144 144 144 144
12 144 144 144 144 144 144 144
13 144 144 144 144 144 144 144
14 144 144 144 144 144 144 144
15 144 144 144 144 144 144 144
16 144 144 144 144 144 144 144
17 144 144 144 144 144 144 144
18 144 144 144 144 144 144 144
19 144 144 144 144 144 144 144
20 144 144 144 144 144 144 144
[*,*,3]:
1 2 3 4 5 6 7:=
1 144 144 144 144 144 144 144
2 144 144 144 144 144 144 144
3 144 144 144 144 144 144 144
4 144 144 144 144 144 144 144
5 144 144 144 144 144 144 144
6 144 144 144 144 144 144 144
7 144 144 144 144 144 144 144
8 144 144 144 144 144 144 144
9 144 144 144 144 144 144 144
10 144 144 144 144 144 144 144
11 144 144 144 144 144 144 144
12 144 144 144 144 144 144 144
13 144 144 144 144 144 144 144
14 144 144 144 144 144 144 144
15 144 144 144 144 144 144 144
16 144 144 144 144 144 144 144
17 144 144 144 144 144 144 144
18 144 144 144 144 144 144 144
19 144 144 144 144 144 144 144
20 144 144 144 144 144 144 144;
param availability:
1 2 3 4 5 6 7 :=
1 0 0 0 0 0 0 0
2 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1
4 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1
6 1 1 1 1 1 1 1
7 1 0 1 1 1 1 1
8 1 1 1 1 1 1 1
9 1 1 1 1 1 1 1
10 1 1 1 1 1 1 1
11 1 1 1 1 1 1 1
12 1 1 1 1 1 1 1
13 1 1 1 1 1 1 1
14 1 1 1 1 1 1 1
15 1 1 1 1 1 1 1
16 1 1 1 1 1 1 1
17 0 1 1 1 1 1 1
18 1 1 1 1 1 1 1
19 1 1 1 1 1 1 1
20 1 1 1 1 1 1 1
21 1 1 1 1 1 1 1
22 1 1 1 1 1 1 1
23 1 1 1 1 1 1 1
24 1 1 1 1 1 1 1
25 1 1 1 1 1 1 1
26 1 1 1 1 1 1 1
27 1 1 1 1 1 1 1
28 1 1 1 1 1 1 1
29 1 1 1 1 1 1 1
30 1 1 1 1 1 1 1;
param costpershift:=
[*,*,1]: 1 2 3 4 5 6 7 :=
1 96 96 96 96 96 96 96
2 96 96 96 96 96 96 96
3 96 96 96 96 96 96 96
4 96 96 96 96 96 96 96
5 96 96 96 96 96 96 96
6 96 96 96 96 96 96 96
7 96 96 96 96 96 96 96
8 96 96 96 96 96 96 96
9 96 96 96 96 96 96 96
10 96 96 96 96 96 96 96
11 96 96 96 96 96 96 96
12 96 96 96 96 96 96 96
13 96 96 96 96 96 96 96
14 96 96 96 96 96 96 96
15 96 96 96 96 96 96 96
16 96 96 96 96 96 96 96
17 96 96 96 96 96 96 96
18 96 96 96 96 96 96 96
19 96 96 96 96 96 96 96
20 96 96 96 96 96 96 96
21 96 96 96 96 96 96 96
22 96 96 96 96 96 96 96
23 96 96 96 96 96 96 96
24 96 96 96 96 96 96 96
25 96 96 96 96 96 96 96
26 96 96 96 96 96 96 96
27 96 96 96 96 96 96 96
28 96 96 96 96 96 96 96
29 96 96 96 96 96 96 96
30 96 96 96 96 96 96 96
[*,*,2] : 1 2 3 4 5 6 7 :=
1 104 104 104 104 104 104 104
2 104 104 104 104 104 104 104
3 104 104 104 104 104 104 104
4 104 104 104 104 104 104 104
5 104 104 104 104 104 104 104
6 104 104 104 104 104 104 104
7 104 104 104 104 104 104 104
8 104 104 104 104 104 104 104
9 104 104 104 104 104 104 104
10 104 104 104 104 104 104 104
11 104 104 104 104 104 104 104
12 104 104 104 104 104 104 104
13 104 104 104 104 104 104 104
14 104 104 104 104 104 104 104
15 104 104 104 104 104 104 104
16 104 104 104 104 104 104 104
17 104 104 104 104 104 104 104
18 104 104 104 104 104 104 104
19 104 104 104 104 104 104 104
20 104 104 104 104 104 104 104
21 104 104 104 104 104 104 104
22 104 104 104 104 104 104 104
23 104 104 104 104 104 104 104
24 104 104 104 104 104 104 104
25 104 104 104 104 104 104 104
26 104 104 104 104 104 104 104
27 104 104 104 104 104 104 104
28 104 104 104 104 104 104 104
29 104 104 104 104 104 104 104
30 104 104 104 104 104 104 104
[*,*,3] : 1 2 3 4 5 6 7 :=
1 120 120 120 120 120 120 120
2 120 120 120 120 120 120 120
3 120 120 120 120 120 120 120
4 120 120 120 120 120 120 120
5 120 120 120 120 120 120 120
6 120 120 120 120 120 120 120
7 120 120 120 120 120 120 120
8 120 120 120 120 120 120 120
9 120 120 120 120 120 120 120
10 120 120 120 120 120 120 120
11 120 120 120 120 120 120 120
12 120 120 120 120 120 120 120
13 120 120 120 120 120 120 120
14 120 120 120 120 120 120 120
15 120 120 120 120 120 120 120
16 120 120 120 120 120 120 120
17 120 120 120 120 120 120 120
18 120 120 120 120 120 120 120
19 120 120 120 120 120 120 120
20 120 120 120 120 120 120 120
21 120 120 120 120 120 120 120
22 120 120 120 120 120 120 120
23 120 120 120 120 120 120 120
24 120 120 120 120 120 120 120
25 120 120 120 120 120 120 120
26 120 120 120 120 120 120 120
27 120 120 120 120 120 120 120
28 120 120 120 120 120 120 120
29 120 120 120 120 120 120 120
30 120 120 120 120 120 120 120;

Data for simple TSP [closed]

I wrote a simple genetic algorithm that can solve the traveling salesman problem with 5 cities. I want to see how it does on a problem with more cities, something like 10, 25, 50, 100, but I can't find sample data for the problem to try it on. Basically, I am looking for 2D lists or matrices with distances between cities. It would be nice if a solution were included. Where should I look?
Thank You in Advance
A well-known benchmark library for the TSP, with instances ranging from as few as 14 to close to 100,000 cities, is TSPLIB. The instances have been solved to optimality; for some instances the optimal solution is also available.
Many of the instances have a real-world background, such as travel between cities in Germany, Switzerland, the USA, or the whole world. Some of the instances represent drilling problems for computer board layout. There's also an instance that represents the voyage of Ulysses.
The sources I've found online are quite huge. I might be doing something wrong, but 10 places (cities) take ~0.6s and 11 places take ~7s. The smallest known-solution dataset I could find was 15 places (and considered "small", the "classical" one being 48 places) but perhaps those are for optimized (non-brute force) algorithms. In the end I made my own table with real-world cities:
Columns are in the same order as the rows (0 = maastricht ... 10 = epen):
                 0    1    2    3    4    5    6    7    8    9   10
 0 maastricht    0   29   20   21   16   31  100   12    4   31   18
 1 aachen       29    0   15   29   28   40   72   21   29   41   12
 2 heerlen      20   15    0   15   14   25   81    9   23   27   13
 3 sittard      21   29   15    0    4   12   92   12   25   13   25
 4 geleen       16   28   14    4    0   16   94    9   20   16   22
 5 echt         31   40   25   12   16    0   95   24   36    3   37
 6 bonn        100   72   81   92   94   95    0   90  101   99   84
 7 hulsberg     12   21    9   12    9   24   90    0   15   25   13
 8 kanne         4   29   23   25   20   36  101   15    0   35   18
 9 ohe          31   41   27   13   16    3   99   25   35    0   38
10 epen         18   12   13   25   22   37   84   13   18   38    0
Optimal (by program): cities 0-7-4-3-9-5-2-6-1-10-8-0 = 253km
maastricht -> hulsberg -> geleen -> sittard -> ohe -> echt -> heerlen
-> bonn -> aachen -> epen -> kanne -> maastricht
The data format readable by the program is a partial table (because it's symmetrical):
29 20 21 16 31 100 12 4 31 18
15 29 28 40 72 21 29 41 12
15 14 25 81 9 23 27 13
4 12 92 12 25 13 25
16 94 9 20 16 22
95 24 36 3 37
90 101 99 84
15 25 13
35 18
38
For me this takes ~6.7 seconds to process on a 3rd gen i7 (i7-3630QM). Program is written in C++, single-threaded and simply brute-forces the possibilities. For testing it might be more practical to remove one place, then it takes ~660ms (0.7s) which is still enough to see if code changes make much of a difference.
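The partial table can be expanded back into a full symmetric matrix, and the reported tour length checked against it. The sketch below is in Python rather than the C++ of the program described above, and the helper names read_upper_triangle and tour_length are made up for illustration.

```python
TRIANGLE = """\
29 20 21 16 31 100 12 4 31 18
15 29 28 40 72 21 29 41 12
15 14 25 81 9 23 27 13
4 12 92 12 25 13 25
16 94 9 20 16 22
95 24 36 3 37
90 101 99 84
15 25 13
35 18
38
"""


def read_upper_triangle(text):
    """Build a symmetric distance matrix from rows holding only the entries j > i."""
    rows = [[int(tok) for tok in line.split()] for line in text.strip().splitlines()]
    n = len(rows) + 1
    dist = [[0] * n for _ in range(n)]
    for i, row in enumerate(rows):
        for offset, d in enumerate(row):
            j = i + 1 + offset
            dist[i][j] = dist[j][i] = d
    return dist


def tour_length(dist, tour):
    """Sum the distances along consecutive stops of a closed tour."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))


dist = read_upper_triangle(TRIANGLE)
print(tour_length(dist, [0, 7, 4, 3, 9, 5, 2, 6, 1, 10, 8, 0]))  # 253
```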
For future readers, I'll paste some more "small" cases:
You can find more tests here, but the files have a ".tsp" extension, so you need a simple parser that translates them into a matrix of distances.
(distance in miles)
6 cities, expected: 1248.0
9999 64 378 519 434 200
64 9999 318 455 375 164
378 318 9999 170 265 344
519 455 170 9999 223 428
434 375 265 223 9999 273
200 164 344 428 273 9999
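For a matrix this small, the expected value can be cross-checked by plain enumeration. The Python sketch below fixes city 0 as the start and tries every ordering of the remaining cities; brute_force_tsp is an illustrative name, and the approach is only practical for roughly ten cities or fewer. The 9999 diagonal entries are never read, because a tour never goes from a city to itself.

```python
from itertools import permutations

SIX_CITIES = [
    [9999, 64, 378, 519, 434, 200],
    [64, 9999, 318, 455, 375, 164],
    [378, 318, 9999, 170, 265, 344],
    [519, 455, 170, 9999, 223, 428],
    [434, 375, 265, 223, 9999, 273],
    [200, 164, 344, 428, 273, 9999],
]


def brute_force_tsp(dist):
    """Return (length, tour) of the shortest closed tour starting at city 0."""
    n = len(dist)
    best_length, best_tour = None, None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if best_length is None or length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour


length, tour = brute_force_tsp(SIX_CITIES)
print(length, tour)
```

The printed length should agree with the expected value of 1248 listed for this instance.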
15 cities, expected: 1194.0
-1 141 134 152 173 289 326 329 285 401 388 366 343 305 276
141 -1 152 150 153 312 354 313 249 324 300 272 247 201 176
134 152 -1 24 48 168 210 197 153 280 272 257 237 210 181
152 150 24 -1 24 163 206 182 133 257 248 233 214 187 158
173 153 48 24 -1 160 203 167 114 234 225 210 190 165 137
289 312 168 163 160 -1 43 90 124 250 264 270 264 267 249
326 354 210 206 203 43 -1 108 157 271 290 299 295 303 287
329 313 197 182 167 90 108 -1 70 164 183 195 194 210 201
285 249 153 133 114 124 157 70 -1 141 147 148 140 147 134
401 324 280 257 234 250 271 164 141 -1 36 67 88 134 150
388 300 272 248 225 264 290 183 147 36 -1 33 57 104 124
366 272 257 233 210 270 299 195 148 67 33 -1 26 73 96
343 247 237 214 190 264 295 194 140 88 57 26 -1 48 71
305 201 210 187 165 267 303 210 147 134 104 73 48 -1 30
276 176 181 158 137 249 287 201 134 150 124 96 71 30 -1
Hugeeee 29 cities, expected: 27603
image: Western Sahara
-1 74 4110 3048 2267 974 4190 3302 4758 3044 3095 3986 5093 6407 5904 8436 6963 6694 6576 8009 7399 7267 7425 9639 9230 8320 9300 8103 7799
74 -1 4070 3000 2214 901 4138 3240 4702 2971 3021 3915 5025 6338 5830 8369 6891 6620 6502 7939 7326 7193 7351 9571 9160 8249 9231 8030 7725
4110 4070 -1 1173 1973 3496 892 1816 1417 3674 3778 2997 2877 3905 5057 5442 4991 5151 5316 5596 5728 5811 5857 6675 6466 6061 6523 6165 6164
3048 3000 1173 -1 817 2350 1172 996 1797 2649 2756 2317 2721 3974 4548 5802 4884 4887 4960 5696 5537 5546 5634 7045 6741 6111 6805 6091 5977
2267 2214 1973 817 -1 1533 1924 1189 2498 2209 2312 2325 3089 4401 4558 6342 5175 5072 5075 6094 5755 5712 5828 7573 7222 6471 7289 6374 6187
974 901 3496 2350 1533 -1 3417 2411 3936 2114 2175 3014 4142 5450 4956 7491 5990 5725 5615 7040 6430 6304 6459 8685 8268 7348 8338 7131 6832
4190 4138 892 1172 1924 3417 -1 1233 652 3086 3185 2203 1987 3064 4180 4734 4117 4261 4425 4776 4844 4922 4971 5977 5719 5228 5780 5302 5281
3302 3240 1816 996 1189 2411 1233 -1 1587 1877 1979 1321 1900 3214 3556 5175 4006 3947 3992 4906 4615 4599 4700 6400 6037 5288 6105 5209 5052
4758 4702 1417 1797 2498 3936 652 1587 -1 3286 3374 2178 1576 2491 3884 4088 3601 3818 4029 4180 4356 4469 4497 5331 5084 4645 5143 4761 4787
3044 2971 3674 2649 2209 2114 3086 1877 3286 -1 107 1360 2675 3822 2865 5890 4090 3723 3560 5217 4422 4257 4428 7000 6514 5455 6587 5157 4802
3095 3021 3778 2756 2312 2175 3185 1979 3374 107 -1 1413 2725 3852 2826 5916 4088 3705 3531 5222 4402 4229 4403 7017 6525 5451 6598 5142 4776
3986 3915 2997 2317 2325 3014 2203 1321 2178 1360 1413 -1 1315 2511 2251 4584 2981 2778 2753 4031 3475 3402 3531 5734 5283 4335 5355 4143 3897
5093 5025 2877 2721 3089 4142 1987 1900 1576 2675 2725 1315 -1 1323 2331 3350 2172 2275 2458 3007 2867 2935 2988 4547 4153 3400 4222 3376 3307
6407 6338 3905 3974 4401 5450 3064 3214 2491 3822 3852 2511 1323 -1 2350 2074 1203 1671 2041 1725 1999 2213 2173 3238 2831 2164 2901 2285 2397
5904 5830 5057 4548 4558 4956 4180 3556 3884 2865 2826 2251 2331 2350 -1 3951 1740 1108 772 2880 1702 1450 1650 4779 4197 2931 4270 2470 2010
8436 8369 5442 5802 6342 7491 4734 5175 4088 5890 5916 4584 3350 2074 3951 -1 2222 2898 3325 1276 2652 3019 2838 1244 1089 1643 1130 2252 2774
6963 6891 4991 4884 5175 5990 4117 4006 3601 4090 4088 2981 2172 1203 1740 2222 -1 684 1116 1173 796 1041 974 3064 2505 1368 2578 1208 1201
6694 6620 5151 4887 5072 5725 4261 3947 3818 3723 3705 2778 2275 1671 1108 2898 684 -1 432 1776 706 664 756 3674 3090 1834 3162 1439 1120
6576 6502 5316 4960 5075 5615 4425 3992 4029 3560 3531 2753 2458 2041 772 3325 1116 432 -1 2174 930 699 885 4064 3469 2177 3540 1699 1253
8009 7939 5596 5696 6094 7040 4776 4906 4180 5217 5222 4031 3007 1725 2880 1276 1173 1776 2174 -1 1400 1770 1577 1900 1332 510 1406 1002 1499
7399 7326 5728 5537 5755 6430 4844 4615 4356 4422 4402 3475 2867 1999 1702 2652 796 706 930 1400 -1 371 199 3222 2611 1285 2679 769 440
7267 7193 5811 5546 5712 6304 4922 4599 4469 4257 4229 3402 2935 2213 1450 3019 1041 664 699 1770 371 -1 220 3583 2970 1638 3037 1071 560
7425 7351 5857 5634 5828 6459 4971 4700 4497 4428 4403 3531 2988 2173 1650 2838 974 756 885 1577 199 220 -1 3371 2756 1423 2823 852 375
9639 9571 6675 7045 7573 8685 5977 6400 5331 7000 7017 5734 4547 3238 4779 1244 3064 3674 4064 1900 3222 3583 3371 -1 620 1952 560 2580 3173
9230 9160 6466 6741 7222 8268 5719 6037 5084 6514 6525 5283 4153 2831 4197 1089 2505 3090 3469 1332 2611 2970 2756 620 -1 1334 74 1961 2554
8320 8249 6061 6111 6471 7348 5228 5288 4645 5455 5451 4335 3400 2164 2931 1643 1368 1834 2177 510 1285 1638 1423 1952 1334 -1 1401 648 1231
9300 9231 6523 6805 7289 8338 5780 6105 5143 6587 6598 5355 4222 2901 4270 1130 2578 3162 3540 1406 2679 3037 2823 560 74 1401 -1 2023 2617
8103 8030 6165 6091 6374 7131 5302 5209 4761 5157 5142 4143 3376 2285 2470 2252 1208 1439 1699 1002 769 1071 852 2580 1961 648 2023 -1 594
7799 7725 6164 5977 6187 6832 5281 5052 4787 4802 4776 3897 3307 2397 2010 2774 1201 1120 1253 1499 440 560 375 3173 2554 1231 2617 594 -1

data.frame to spatial polygon data frame

I have this data.frame
data <- read.table(text="Id x y valecolo valecono
1 1 12.18255221 29.406365240 4 990
2 2 9.05893970 20.923087170 4 1090
3 3 1.11192442 2.460411416 0 420
4 4 15.51290096 27.185287490 16 1320
5 5 20.41913438 32.166268590 13 1050
6 6 12.75939095 17.552435030 60 1010
7 7 28.06853355 30.839057830 12 1030
8 8 6.96288868 7.177616682 33 1010
9 9 30.60527190 20.792242110 23 640
10 10 12.07646283 7.658266843 19 810
11 11 10.42878294 5.520913954 0 700
12 12 23.61674977 11.111217320 0 838
13 13 27.16148898 12.259423750 11 1330
14 14 28.00931750 6.258448426 20 777
15 15 20.79999922 -0.000877298 4 630
16 16 21.59999968 -0.005502197 38 830
17 17 19.46122172 -1.229166015 7 740
18 18 28.20370719 -6.305622777 12 660
19 19 29.94840042 -7.192584050 0 1030
20 20 29.28601258 -12.133404940 10 870
21 21 5.88104817 -3.608777319 0 1050
22 22 30.37845976 -26.784308510 0 900
23 23 13.68270042 -12.451253320 0 300
24 24 26.01871530 -26.024342420 22 1330
25 25 20.17735764 -20.829648070 21 1190
26 26 5.04404016 -5.550464740 7 1030
27 27 17.98312114 -26.468988540 0 1200
28 28 8.50660753 -12.957145840 9 850
29 29 10.79633248 -18.938827100 36 1200
30 30 13.36599497 -28.413203870 7 1240
31 31 10.77987946 -28.531459810 0 350
32 32 8.35194396 -24.410755680 28 910
33 33 1.55014408 -12.302725060 10 980
34 34 -0.00388992 -17.899999200 12 1120
35 35 -2.82062504 -16.155620130 12 450
36 36 -4.75903628 -22.962014490 20 920
37 37 -6.07839546 -15.339592840 28 840
38 38 -11.32647798 -24.068047630 0 665
39 39 -11.88138209 -24.245262620 12 1180
40 40 -14.06823800 -25.587589260 36 350
41 41 -10.92180227 -18.461223360 7 1180
42 42 -12.48843186 -20.377660600 0 400
43 43 -18.63696964 -27.415068190 18 1220
44 44 -16.73351789 -23.807549250 0 500
45 45 -22.49024869 -29.944803740 7 1040
46 46 -22.66130064 -27.391018580 0 500
47 47 -15.26565038 -17.866446720 16 1060
48 48 -24.20192852 -23.451155780 0 600
49 49 -21.39663774 -20.089958090 0 750
50 50 -12.33344998 -9.875526199 16 980
51 51 -30.94772590 -22.478895910 0 790
52 52 -24.85783868 -15.225318840 25 720
53 53 -2.44485324 -1.145728097 54 970
54 54 -24.67985433 -7.169018707 4 500
55 55 -30.82457650 -7.398346555 4 750
56 56 -23.56898920 -5.265475270 4 760
57 57 -3.91708603 -0.810208045 0 350
58 58 -26.86563675 -4.251776497 0 440
59 59 -26.64738877 -1.675324623 8 450
60 60 -8.79897138 -0.134558536 11 830
61 61 -21.78250663 1.716077388 0 920
62 62 -28.98396759 6.007465815 24 980
63 63 -34.61607994 8.311853049 8 500
64 64 -25.63850107 7.453677191 15 880
65 65 -22.98762116 11.266290120 11 830
66 66 -33.48522130 19.100848030 0 350
67 67 -25.53096486 16.777135830 21 740
68 68 -18.95412327 15.681238150 0 300
69 69 -8.94874230 8.144324435 0 500
70 70 -10.91433241 10.579099310 4 750
71 71 -13.44807236 14.327310800 0 1090
72 72 -16.24086139 20.940019610 0 500
73 73 -17.51162097 24.111886810 0 940
74 74 -12.47496424 18.363422910 0 1020
75 75 -17.76118016 27.990410510 0 660
76 76 -5.54534556 9.730834410 0 850
77 77 -11.30971858 29.934766840 0 950
78 78 -10.38743785 27.493148220 0 740
79 79 -8.61491396 25.166312360 0 950
80 80 -3.40550077 14.197273530 0 710
81 81 -0.77957621 3.770246702 0 750
82 82 -3.01234325 21.186924550 0 1200
83 83 -2.05241931 32.685624900 0 1200
84 84 -2.26900366 36.128820600 0 970
85 85 0.82954518 5.790885396 0 850
86 86 22.08151130 19.671119440 19 870
87 87 12.60107972 23.864904860 0 1260
88 88 9.78406607 26.163968270 0 600
89 89 11.69995152 33.091322170 0 1090
90 90 20.64705880 -16.439632140 0 840
91 91 24.68314851 -21.314655730 0 1561
92 92 30.33133300 -27.235396100 0 1117
93 93 -26.24691654 -22.405635470 0 1040
94 94 -21.68016500 -24.458519270 10 1000
95 95 -1.57455856 -30.874986140 0 500
96 96 -29.75642086 -5.610894981 0 350
97 97 -3.66771076 26.448084810 0 900
98 98 -26.54457307 29.824419350 0 1050
99 99 -17.90426678 18.751297440 0 200
100 100 10.22894253 -6.274450952 0 880")
And I would like to create a visualization with Thiessen polygons, then color the polygons according to their "valecono" value.
I tried this:
> library(deldir)
> z <- deldir(data$x, data$y, rw=c(-34.51608,30.7052719,-30.774986,36.2288206))
> w <- tile.list(z)
> plot(w, fillcol=data$valecono, close=TRUE)
Which seems weird to me, and I'm not sure how R attributed these colors.
Do you have any other suggestions for this case?
I also tried to convert my data.frame into a SpatialPolygonsDataFrame, which I did not manage to do. I tried converting my data.frame into a SpatialPointsDataFrame, which was not a problem but was not very useful, because I could not find out how to convert that into a SpatialPolygonsDataFrame.
spdf <- SpatialPointsDataFrame(coords = coords, data = data,
proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
I am trying all this because I think that with a SpatialPolygonsDataFrame it would be easier to get this visualization of polygons colored according to the valecono of the points.
You can do
library(dismo)
coordinates(data) <- ~x + y
v <- voronoi(data)
spplot(v, "valecolo")
With base plot
s <- (floor(sort(v$valecono)/400) + 1)
plot(v, col=rainbow(60)[v$valecolo+1])
points(data, cex=s/2, col=gray((1:4)/4)[s])

Resources