cumsum with more than 1 variable using ddply - plyr

I'm trying to get cumsum for more than one variable using ddply, but it's not working.
I'm using this code:
ddply(.data=Summaryday, .variables=('DaysToClose_'),.fun=transform,
cumsumPosit=cumsum(PositCount),
cumsumNegat=cumsum(NegatCount))
but the result isn't correct:
DaysToClose_ PositCount NegatCount cumsumPosit cumsumNegat
1 1 7340 27256 7340 27256
2 2 2243 7597 2243 7597
3 3 1526 4545 1526 4545
4 4 1315 3756 1315 3756
5 5 1142 3320 1142 3320
6 6 1216 3118 1216 3118
7 7 1252 3324 1252 3324
8 8 1180 3077 1180 3077
9 9 975 2053 975 2053
10 10 684 1429 684 1429
11 11 613 1244 613 1244
12 12 596 1199 596 1199
13 13 542 1218 542 1218
14 14 711 1434 711 1434
15 15 645 1333 645 1333
16 16 577 899 577 899
17 17 373 667 373 667
18 18 369 656 369 656
19 19 340 624 340 624
If someone can help me on this, I appreciate that.

I am not sure why you would use ddply here. You can't really subset by DaysToClose because each row is then a unique subset, and so you always get cumsum of a single value. Maybe you'd want to use mutate instead
library(tidyverse)
data %>% mutate(cumsumPosit = cumsum(PositCount),
cumsumNegat = cumsum(NegatCount))

Related

MPICH output not printing

Problem
I'm running an executable cp2k installed on HPC cluster using mpich-3.2. The output from the executable is printed in an out file. The problem is, that there is no output in the out file after some steps are printed, but when I see the status of my job on the cluster, it turns out that it is still running. Basically, the problem is that my job is still running, but the output is not getting printed.
Script
I'm using the following job script:
#!/bin/bash
#PBS -N test
#PBS -o test.log
#PBS -j oe
#PBS -l nodes=2:ppn=20
#PBS -q mini
#PBS -l walltime=2:00:00
cd $PBS_O_WORKDIR
echo Master process running on `hostname`
echo Directory is `pwd`
echo PBS has allocated the following nodes:
echo `cat $PBS_NBODEFILE`
NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS nodes
export I_MPI_FABRICS=shm:dapl
export I_MPI_PROVIDER=psm2
export I_MPI_FALLBACK=0
export KMP_AFFINITY=verbose,scatter
export OMP_NUM_THREADS=1
export I_MPI_IFACE=ib0
echo Starting executation at `date`
EXEC="/home/arshil/software/cp2k-5.1.0/exe/local/cp2k.popt"
cp $EXEC ./cp2k
mpiexec -np $NPROCS --machinefile $PBS_NODEFILE ./cp2k -i test.inp >& out
rm cp2k
echo Finished at `date`
Error
The ouput in the out file:
SCF WAVEFUNCTION OPTIMIZATION
----------------------------------- OT ---------------------------------------
Minimizer : DIIS : direct inversion
in the iterative subspace
using 7 DIIS vectors
safer DIIS on
Preconditioner : FULL_SINGLE_INVERSE : inversion of
H + eS - 2*(Sc)(c^T*H*c+const)(Sc)^T
Precond_solver : DEFAULT
stepsize : 0.08000000 energy_gap : 0.08000000
eps_taylor : 0.10000E-15 max_taylor : 4
----------------------------------- OT ---------------------------------------
Step Update method Time Convergence Total energy Change
------------------------------------------------------------------------------
1 OT DIIS 0.80E-01 21.3 0.00002878 -8797.2068024142 -8.80E+03
2 OT DIIS 0.80E-01 10.9 0.00007114 -8797.2061897209 6.13E-04
3 OT DIIS 0.80E-01 10.8 0.00001688 -8797.2073257531 -1.14E-03
As it can be seen, there is no printing after step 3 in the output file, but the job is still running in the background. Even after the walltime is over, the output file remains the same as above. Where is the output going?
The executable cp2k is used to perform quantum chemical calculations and was installed on the cluster along with mpich-3.2. All CP2K needs is an input file with extension .inp. For my case, test.inp is the input file.
&FORCE_EVAL
METHOD Quickstep
&DFT
BASIS_SET_FILE_NAME GTH_BASIS_SETS
POTENTIAL_FILE_NAME GTH_POTENTIALS
&MGRID
NGRIDS 4
CUTOFF 380
REL_CUTOFF 60
&END MGRID
&QS
METHOD GPW
MAP_CONSISTENT
EXTRAPOLATION ASPC
EXTRAPOLATION_ORDER 3
&END QS
&SCF
MAX_SCF 1000
EPS_SCF 1.0E-5
SCF_GUESS ATOMIC
&OT
PRECONDITIONER FULL_SINGLE_INVERSE
MINIMIZER DIIS
N_DIIS 7
&END OT
&PRINT
&RESTART OFF
&END RESTART
&END PRINT
&END SCF
&XC
&XC_FUNCTIONAL PBE
&END XC_FUNCTIONAL
&vdW_POTENTIAL
DISPERSION_FUNCTIONAL PAIR_POTENTIAL
&PAIR_POTENTIAL
PARAMETER_FILE_NAME dftd3.dat
TYPE DFTD3
REFERENCE_FUNCTIONAL PBE
R_CUTOFF [angstrom] 12.3
&END PAIR_POTENTIAL
&END vdW_POTENTIAL
&END XC
&END DFT
&SUBSYS
&CELL
ABC 24.6904 24.6904 24.6904
PERIODIC XYZ
&END CELL
&KIND C
BASIS_SET TZV2P-GTH
POTENTIAL GTH-PBE-q4
&END KIND
&KIND P
BASIS_SET TZV2P-GTH
POTENTIAL GTH-PBE-q5
&END KIND
&KIND H
BASIS_SET TZV2P-GTH
POTENTIAL GTH-PBE-q1
&END KIND
&KIND O
BASIS_SET TZV2P-GTH
POTENTIAL GTH-PBE-q6
&END KIND
&KIND N
BASIS_SET TZV2P-GTH
POTENTIAL GTH-PBE-q5
&END KIND
&KIND Mg
BASIS_SET TZV2P-GTH
POTENTIAL GTH-PBE-q10
&END KIND
&COLVAR
&COORDINATION
ATOMS_FROM 41
ATOMS_TO 38
R_0 [bohr] 4.5
NN 6
ND 12
&END COORDINATION
&END COLVAR
&COLVAR
&COORDINATION
ATOMS_FROM 41
ATOMS_TO 42 44 47 50 53 56 59 62 65 68 71 74 77 80 83 86 89 92 95 98 101 104 107 110 113 116 119 122 125 128 131 134 137 140 143 146 149 152 155 158 161 164 167 170 173 176 179 182 185 188 191 194 197 200 203 206 209 212 215 218 221 224 227 230 233 236 239 242 245 248 251 254 257 260 263 266 269 272 275 278 281 284 287 290 293 296 299 302 305 308 311 314 317 320 323 326 329 332 335 338 341 344 347 350 353 356 359 362 365 368 371 374 377 380 383 386 389 392 395 398 401 404 407 410 413 416 419 422 425 428 431 434 437 440 443 446 449 452 455 458 461 464 467 470 473 476 479 482 485 488 491 494 497 500 503 506 509 512 515 518 521 524 527 530 533 536 539 542 545 548 551 554 557 560 563 566 569 572 575 578 581 584 587 590 593 596 599 602 605 608 611 614 617 620 623 626 629 632 635 638 641 644 647 650 653 656 659 662 665 668 671 674 677 680 683 686 689 692 695 698 701 704 707 710 713 716 719 722 725 728 731 734 737 740 743 746 749 752 755 758 761 764 767 770 773 776 779 782 785 788 791 794 797 800 803 806 809 812 815 818 821 824 827 830 833 836 839 842 845 848 851 854 857 860 863 866 869 872 875 878 881 884 887 890 893 896 899 902 905 908 911 914 917 920 923 926 929 932 935 938 941 944 947 950 953 956 959 962 965 968 971 974 977 980 983 986 989 992 995 998 1001 1004 1007 1010 1013 1016 1019 1022 1025 1028 1031 1034 1037 1040 1043 1046 1049 1052 1055 1058 1061 1064 1067 1070 1073 1076 1079 1082 1085 1088 1091 1094 1097 1100 1103 1106 1109 1112 1115 1118 1121 1124 1127 1130 1133 1136 1139 1142 1145 1148 1151 1154 1157 1160 1163 1166 1169 1172 1175 1178 1181 1184 1187 1190 1193 1196 1199 1202 1205 1208 1211 1214 1217 1220 1223 1226 1229 1232 1235 1238 1241 1244 1247 1250 1253 1256 1259 1262 1265 1268 1271 1274 1277 1280 1283 1286 1289 1292 1295 1298 1301 1304 1307 1310 1313 1316 1319 1322 1325 1328 1331 1334 1337 1340 1343 1346 1349 1352 1355 1358 1361 1364 1367 1370 1373 1376 1379 1382 1385 1388 1391 1394 1397 1400 1403 1406 1409 1412 1415 1418 1421 1424 1427 1430 1433 1436 1439 1442 1445 1448 1451 1454 1457
ATOMS_TO 1460 1463 1466 1469 1472 1475 1478 1481 1484 1487 1490 1493 1496 1499 1502 1505
R_0 [bohr] 4.5
NN 6
ND 12
&END COORDINATION
&END COLVAR
&END SUBSYS
&END FORCE_EVAL
&GLOBAL
PROJECT test
RUN_TYPE MD
PRINT_LEVEL LOW
&END GLOBAL
&MOTION
&MD
ENSEMBLE NVT
STEPS 100000
TIMESTEP 0.5
TEMPERATURE 310
TEMP_TOL 100
&THERMOSTAT
&NOSE
LENGTH 3
YOSHIDA 3
TIMECON 100.0
MTS 2
&END NOSE
&END
&PRINT
&ENERGY
&EACH
MD 10
&END
&END
&PROGRAM_RUN_INFO
&EACH
MD 100
&END
&END
FORCE_LAST
&END PRINT
&END MD
&FREE_ENERGY
&METADYN
DO_HILLS
LAGRANGE .TRUE.
NT_HILLS 40
WW [kcalmol] 1
TEMPERATURE 310
TEMP_TOL 10
&METAVAR
SCALE 0.05
COLVAR 1
MASS 50
LAMBDA 2
&WALL
POSITION 0.0
TYPE QUARTIC
&QUARTIC
DIRECTION WALL_MINUS
K 10.0
&END
&END
&END METAVAR
&METAVAR
SCALE 0.05
COLVAR 2
MASS 50
LAMBDA 2
&WALL
POSITION 0.0
TYPE QUARTIC
&QUARTIC
DIRECTION WALL_MINUS
K 10.0
&END
&END
&END METAVAR
&PRINT
&COLVAR
COMMON_ITERATION_LEVELS 3
&EACH
MD 1
&END
&END
&HILLS
COMMON_ITERATION_LEVELS 3
&EACH
MD 1
&END
&END
&END
&END METADYN
&END
&PRINT
&TRAJECTORY
&EACH
MD 1
&END
&END
&VELOCITIES OFF
&END
&RESTART
&EACH
MD 20
&END
ADD_LAST NUMERIC
&END
&RESTART_HISTORY
&EACH
MD 2000
&END
&END
&END
&END MOTION
&EXT_RESTART
RESTART_FILE_NAME NVT-1.restart
RESTART_COUNTERS .FALSE.
&END
The problem in my opinion is not with the input file. It has got to do something with mpich-3.2. I would really appreciate some help.
This may be something similar going on / solutions that can be used here: Python "print" not working when embedded into MPI program It is not perfect as you are not using python however it may help.
At a basic level MPI launches many processes - but only the command that launches it has access to stdio etc. The redirect at the end of the line starting with mpiexec sends the stdout of mpiexec to a file. The output from your script is buffered by mpiexec until the processes end (either they complete or they are stopped).
Where the output is going is a good question and may require changes in test.np or some other way of shutting down (you mention you were out of wall time). I'm looking to solve the same problem - and will update this (if) I find an answer.
Also the output from different processes started by mpi can arrive in random order. I do not care about this but if you do you may need to pass the messages back to some common thread which sorts their order.

if parts of two columns match in a file, copy row and move to new file

I have a blastn output file with tens of thousands of rows. I'm only interested in rows where part of the query sequence ID does not match with part of the subject sequence ID, which I'd like to put into a new text file. Here is an excerpt of the massive output file for which I want to extract information from, as an example:
qseqid qlen qstart qend sseqid slen sstart send evalue bitscore length pident nident mismatch gaps
OFAS003927-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 744 121 679 OFAS003927-RA-EXON03_Anisoscelini_Anisoscelis_flavolineatus_CMF_0018_S7_L005_UQ_trinity_assembled 557 1 557 0 832 562 93.594 526 28 8
OFAS003927-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 744 155 650 OFAS003927-RA-EXON03_Placoscelini_Plaxiscelis_limbata_CMF_0072_S29_L005_UQ_trinity_assembled 820 327 819 0 808 496 96.169 477 16 3
OFAS003927-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 744 222 686 OFAS003927-RA-EXON03_Anisoscelini_Leptoscelis_tricolor_CMF_0079_S32_L005_UQ_trinity_assembled 465 1 465 0 793 465 97.419 453 12 0
OFAS003927-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 744 429 635 OFAS003927-RA-EXON03B_Clavigrallini_Clavigralla_sp_CMF_0335_S81_L005_UQ_trinity_assembled 655 1 207 4.30E-87 316 207 94.203 195 12 0
OFAS003927-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 744 531 629 OFAS003927-RA-EXON07_Mictini_Anoplocnemis_sp_CMF_0052_S20_L005_UQ_trinity_assembled 668 1 99 9.92E-39 156 99 94.949 94 5 0
OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 696 1 696 OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 696 1 696 0 1286 696 100 696 0 0
OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 696 1 696 OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_declivis_CMF_0069_S26_L005_UQ_trinity_assembled 1060 332 1025 0 1212 696 98.132 683 11 2
OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 696 1 696 OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_thomasi_CMF_0028_S13_L005_UQ_trinity_assembled 814 50 745 0 1147 698 96.418 673 21 4
OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_alata_CMF_0025_S10_L005_UQ_trinity_assembled 696 1 695 OFAS007459-RA-EXON03_Acanthocephalini_Acanthocephala_confraterna_CMF_0123_S44_L005_UQ_trinity_assembled 1313 578 1274 0 1131 699 95.994 671 22 6
qseqid = query sequence ID
sseqid = subject sequence ID
What should be matching is the OFAS#-RA-EXON# between the two ID's for each row. When this isn't the case, e.g., the 4th and 5th row, I want to extract the entire row and place into a new text file. I know some regex pattern will need to be employed, but how to indicate columns and search on a per row basis isn't clear to me.
This will work with GNU Awk :
tail -n+2 input.txt | awk '{ if( substr($1,0,21) != substr($5,0,21)) { print $0 } }'
Regards!

Keep lines based on ratio between lines

I have a sort -g k9 command on a file that gives me this in the bash standard output:
55.19 645 156 15 9 520 58 702 0.0 661
55.50 636 159 16 9 520 58 693 0.0 654
55.19 645 156 15 9 520 58 702 0.0 658
56.52 644 147 16 9 520 59 701 0.0 669
55.97 645 151 15 9 520 65 709 0.0 672
55.97 645 151 15 9 520 65 709 4e-124 674
28.32 671 301 32 1 507 48 702 3e-49 183
28.32 671 301 32 1 507 47 701 3e-49 183
31.40 516 247 24 86 507 196 698 1e-46 176
31.41 519 243 25 86 507 196 698 5e-46 175
27.72 588 290 26 19 481 98 675 2e-39 154
30.56 337 170 17 101 413 302 598 5e-20 96.3
30.56 337 170 17 101 413 302 598 8e-20 95.5
I would like to cut my data based on the 9th column. The idea would be to compare the value of the 9th column on line i, divide it by the value of the 9th column on line i+1, and if the ratio is 0 OR 0/0 OR > 1e-50, line i and i+1 are kept. As soon as one of these conditions is not filled, stop reading. The desired output would be:
55.19 645 156 15 9 520 58 702 0.0 661
55.50 636 159 16 9 520 58 693 0.0 654
55.19 645 156 15 9 520 58 702 0.0 658
56.52 644 147 16 9 520 59 701 0.0 669
55.97 645 151 15 9 520 65 709 0.0 672
55.97 645 151 15 9 520 65 709 4e-124 674
I can obtain this output with head -n 6 but this is obviously not based on the condition on values in the 9th column. Please note that the values are in 'scientific' format.
I know how to do this in Python (write the standard output to a file, calculate ratios, etc.) but for commodity reasons I'd prefer a shell-based solution (with awk or sort for instance) although I don't know if that's possible. Thanks for your help!
Just exit the script when the condition is not accomplished; otherwise, print the previous line and store the 9th field to compare on the next loop:
$ awk '($9 && prev/$9>1e-50) {exit} {print stored; prev=$9; stored=$0}' file
55.19 645 156 15 9 520 58 702 0.0 661
55.50 636 159 16 9 520 58 693 0.0 654
55.19 645 156 15 9 520 58 702 0.0 658
56.52 644 147 16 9 520 59 701 0.0 669
55.97 645 151 15 9 520 65 709 0.0 672
55.97 645 151 15 9 520 65 709 4e-124 674

Creating frequency interval in Crystal Report

I am trying to create a dataset of frequency interval in crystal report something like below. First column is rowid, second is start interval , third column is end interval and fourth column is interval name.
1 0 29 0 - 29
2 30 59 30 - 59
3 60 89 60 - 89
4 90 119 90 - 119
5 120 149 120 - 149
6 150 179 150 - 179
7 180 209 180 - 209
8 210 239 210 - 239
9 240 269 240 - 269
10 270 299 270 - 299
11 300 329 300 - 329
12 330 359 330 - 359
13 360 389 360 - 389
14 390 419 390 - 419
15 420 449 420 - 449
16 450 479 450 - 479
17 480 509 480 - 509
18 510 539 510 - 539
19 540 569 540 - 569
20 570 599 570 - 599
21 600 629 600 - 629
22 630 659 630 - 659
23 660 689 660 - 689
24 690 719 690 - 719
25 720 749 720 - 749
26 750 779 750 - 779
27 780 809 780 - 809
28 810 839 810 - 839
29 840 869 840 - 869
30 870 899 870 - 899
Can I write a CTE to generate this interval so that I can use it directly in crystal report without writing function on database side? Below is the code which I wrote:
declare intervalStart integer := 0;
intervalEnd integer := 900;
intervalMins varchar(10) := 30;
totalIntervals number := 0;
begin
begin
execute immediate 'create global temporary table intervalTable (row_Id int not null, intStart integer, intEnd integer, intervalName varchar2(25))ON COMMIT DELETE ROWS';
exception when others then dbms_output.put_line(sqlerrm);
end;
totalIntervals := intervalEnd/intervalMins;
--dbms_output.put_line(totalIntervals);
for i in 1 ..totalIntervals loop
intervalStart := 0;
intervalEnd := 0;
intervalStart := intervalStart + (i-1)*intervalMins;
intervalEnd := intervalEnd + (i*intervalMins)-1;
--dbms_output.put_line(intervalStart || ' - ' || intervalEnd);
insert into intervalTable
(
row_id,
intStart,
intEnd,
intervalName
)
values(i, intervalStart, intervalEnd, (intervalStart || ' - ' || intervalEnd));
end loop;
end;
I think you want something like this:
with freq_data as (
select level as id, (level-1)*30 as start_interval, ((level-1)*30) + 29 as end_interval, (level-1)*30 || ' - ' || to_char(((level-1)*30) + 29) as label
from dual
connect by level <= 30
order by level
)
select * from freq_data;
Output
ID START_INTERVAL END_INTERVAL LABEL
1 0 29 0 - 29
2 30 59 30 - 59
3 60 89 60 - 89
4 90 119 90 - 119
5 120 149 120 - 149
6 150 179 150 - 179
7 180 209 180 - 209
8 210 239 210 - 239
9 240 269 240 - 269
10 270 299 270 - 299
11 300 329 300 - 329
12 330 359 330 - 359
13 360 389 360 - 389
14 390 419 390 - 419
15 420 449 420 - 449
16 450 479 450 - 479
17 480 509 480 - 509
18 510 539 510 - 539
19 540 569 540 - 569
20 570 599 570 - 599
21 600 629 600 - 629
22 630 659 630 - 659
23 660 689 660 - 689
24 690 719 690 - 719
25 720 749 720 - 749
26 750 779 750 - 779
27 780 809 780 - 809
28 810 839 810 - 839
29 840 869 840 - 869
30 870 899 870 - 899
An example using the above in a join query:
create table my_test
(
num number
-- other important data ...
);
-- insert some random numbers
insert into my_test
select trunc(DBMS_RANDOM.VALUE(0,900))
from dual
connect by level <= 10;
commit;
Now joining to get the label for each num field:
with freq_data as (
select level as id, (level-1)*30 as start_interval, ((level-1)*30) + 29 as end_interval, (level-1)*30 || ' - ' || to_char(((level-1)*30) + 29) as label
from dual
connect by level <= 30
order by level
)
select t.num, d.label
from my_test t
left join freq_data d ON (t.num between d.start_interval and d.end_interval);
Output:
NUM LABEL
64 60 - 89
73 60 - 89
128 120 - 149
154 150 - 179
267 240 - 269
328 300 - 329
550 540 - 569
586 570 - 599
745 720 - 749
795 780 - 809

Data for simple TSP [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
I wrote a simple genetic algorithm that can solve traveling salesman problem with 5 cities. I want to see how it does on a problem with more cities, something like 10, 25, 50, 100, but I can't find a sample date for the problem to try it on. Basically, I am looking for 2D lists or matrices with distances between cities. It would be nice if there is a solution. Where should I look?
Thank You in Advance
A well-known benchmark library for the TSP with instances ranging from as few as 14 to close to 100,000 cities is the TSPLIB. The instances have been solved to optimality, for some instances the optimal solution is also available.
Many of the instances have a real-world background such as travel been cities in Germany, Switzerland, the USA, or in the whole world. Some of the instances represent drilling problems for computer board layout There's also an instance that represents the voyage of Ulysses.
The sources I've found online are quite huge. I might be doing something wrong, but 10 places (cities) take ~0.6s and 11 places take ~7s. The smallest known-solution dataset I could find was 15 places (and considered "small", the "classical" one being 48 places) but perhaps those are for optimized (non-brute force) algorithms. In the end I made my own table with real-world cities:
m
a
a h
s h s u
t a e i g l
r a e t e s
i c r t l e b b a e
c h l a e c o e n o p
h e e r e h n r n h e
t n n d n t n g e e n
maastricht 0 29 20 21 16 31 100 12 4 31 18
aachen 29 0 15 29 28 40 72 21 29 41 12
heerlen 20 15 0 15 14 25 81 9 23 27 13
sittard 21 29 15 0 4 12 92 12 25 13 25
geleen 16 28 14 4 0 16 94 9 20 16 22
echt 31 40 25 12 16 0 95 24 36 3 37
bonn 100 72 81 92 94 95 0 90 101 99 84
hulsberg 12 21 9 12 9 24 90 0 15 25 13
kanne 4 29 23 25 20 36 101 15 0 35 18
ohe 31 41 27 13 16 3 99 25 35 0 38
epen 18 12 13 25 22 37 84 13 18 38 0
Optimal (by program): cities 0-7-4-3-9-5-2-6-1-10-8-0 = 253km
maastricht -> hulsberg -> geleen -> sittard -> ohe -> kanne -> echt
-> heerlen -> bonn -> aachen -> epen -> kanne -> maastricht
The data format readable by the program is a partial table (because it's symmetrical):
29 20 21 16 31 100 12 4 31 18
15 29 28 40 72 21 29 41 12
15 14 25 81 9 23 27 13
4 12 92 12 25 13 25
16 94 9 20 16 22
95 24 36 3 37
90 101 99 84
15 25 13
35 18
38
For me this takes ~6.7 seconds to process on a 3rd gen i7 (i7-3630QM). Program is written in C++, single-threaded and simply brute-forces the possibilities. For testing it might be more practical to remove one place, then it takes ~660ms (0.7s) which is still enough to see if code changes make much of a difference.
For next incomes, i'll paste some more "small" cases:
You can find more tests in here. but the file is ".tsp" extension and you should do a simple parse that translate to the matrix of distances.
(distance in miles)
6 cities, expected: 1248.0
9999 64 378 519 434 200
64 9999 318 455 375 164
378 318 9999 170 265 344
519 455 170 9999 223 428
434 375 265 223 9999 273
200 164 344 428 273 9999
15 cities, expected: 1194.0
-1 141 134 152 173 289 326 329 285 401 388 366 343 305 276
141 -1 152 150 153 312 354 313 249 324 300 272 247 201 176
134 152 -1 24 48 168 210 197 153 280 272 257 237 210 181
152 150 24 -1 24 163 206 182 133 257 248 233 214 187 158
173 153 48 24 -1 160 203 167 114 234 225 210 190 165 137
289 312 168 163 160 -1 43 90 124 250 264 270 264 267 249
326 354 210 206 203 43 -1 108 157 271 290 299 295 303 287
329 313 197 182 167 90 108 -1 70 164 183 195 194 210 201
285 249 153 133 114 124 157 70 -1 141 147 148 140 147 134
401 324 280 257 234 250 271 164 141 -1 36 67 88 134 150
388 300 272 248 225 264 290 183 147 36 -1 33 57 104 124
366 272 257 233 210 270 299 195 148 67 33 -1 26 73 96
343 247 237 214 190 264 295 194 140 88 57 26 -1 48 71
305 201 210 187 165 267 303 210 147 134 104 73 48 -1 30
276 176 181 158 137 249 287 201 134 150 124 96 71 30 -1
Hugeeee 29 cities, expected: 27603
imagem: western sahara
-1 74 4110 3048 2267 974 4190 3302 4758 3044 3095 3986 5093 6407 5904 8436 6963 6694 6576 8009 7399 7267 7425 9639 9230 8320 9300 8103 7799
74 -1 4070 3000 2214 901 4138 3240 4702 2971 3021 3915 5025 6338 5830 8369 6891 6620 6502 7939 7326 7193 7351 9571 9160 8249 9231 8030 7725
4110 4070 -1 1173 1973 3496 892 1816 1417 3674 3778 2997 2877 3905 5057 5442 4991 5151 5316 5596 5728 5811 5857 6675 6466 6061 6523 6165 6164
3048 3000 1173 -1 817 2350 1172 996 1797 2649 2756 2317 2721 3974 4548 5802 4884 4887 4960 5696 5537 5546 5634 7045 6741 6111 6805 6091 5977
2267 2214 1973 817 -1 1533 1924 1189 2498 2209 2312 2325 3089 4401 4558 6342 5175 5072 5075 6094 5755 5712 5828 7573 7222 6471 7289 6374 6187
974 901 3496 2350 1533 -1 3417 2411 3936 2114 2175 3014 4142 5450 4956 7491 5990 5725 5615 7040 6430 6304 6459 8685 8268 7348 8338 7131 6832
4190 4138 892 1172 1924 3417 -1 1233 652 3086 3185 2203 1987 3064 4180 4734 4117 4261 4425 4776 4844 4922 4971 5977 5719 5228 5780 5302 5281
3302 3240 1816 996 1189 2411 1233 -1 1587 1877 1979 1321 1900 3214 3556 5175 4006 3947 3992 4906 4615 4599 4700 6400 6037 5288 6105 5209 5052
4758 4702 1417 1797 2498 3936 652 1587 -1 3286 3374 2178 1576 2491 3884 4088 3601 3818 4029 4180 4356 4469 4497 5331 5084 4645 5143 4761 4787
3044 2971 3674 2649 2209 2114 3086 1877 3286 -1 107 1360 2675 3822 2865 5890 4090 3723 3560 5217 4422 4257 4428 7000 6514 5455 6587 5157 4802
3095 3021 3778 2756 2312 2175 3185 1979 3374 107 -1 1413 2725 3852 2826 5916 4088 3705 3531 5222 4402 4229 4403 7017 6525 5451 6598 5142 4776
3986 3915 2997 2317 2325 3014 2203 1321 2178 1360 1413 -1 1315 2511 2251 4584 2981 2778 2753 4031 3475 3402 3531 5734 5283 4335 5355 4143 3897
5093 5025 2877 2721 3089 4142 1987 1900 1576 2675 2725 1315 -1 1323 2331 3350 2172 2275 2458 3007 2867 2935 2988 4547 4153 3400 4222 3376 3307
6407 6338 3905 3974 4401 5450 3064 3214 2491 3822 3852 2511 1323 -1 2350 2074 1203 1671 2041 1725 1999 2213 2173 3238 2831 2164 2901 2285 2397
5904 5830 5057 4548 4558 4956 4180 3556 3884 2865 2826 2251 2331 2350 -1 3951 1740 1108 772 2880 1702 1450 1650 4779 4197 2931 4270 2470 2010
8436 8369 5442 5802 6342 7491 4734 5175 4088 5890 5916 4584 3350 2074 3951 -1 2222 2898 3325 1276 2652 3019 2838 1244 1089 1643 1130 2252 2774
6963 6891 4991 4884 5175 5990 4117 4006 3601 4090 4088 2981 2172 1203 1740 2222 -1 684 1116 1173 796 1041 974 3064 2505 1368 2578 1208 1201
6694 6620 5151 4887 5072 5725 4261 3947 3818 3723 3705 2778 2275 1671 1108 2898 684 -1 432 1776 706 664 756 3674 3090 1834 3162 1439 1120
6576 6502 5316 4960 5075 5615 4425 3992 4029 3560 3531 2753 2458 2041 772 3325 1116 432 -1 2174 930 699 885 4064 3469 2177 3540 1699 1253
8009 7939 5596 5696 6094 7040 4776 4906 4180 5217 5222 4031 3007 1725 2880 1276 1173 1776 2174 -1 1400 1770 1577 1900 1332 510 1406 1002 1499
7399 7326 5728 5537 5755 6430 4844 4615 4356 4422 4402 3475 2867 1999 1702 2652 796 706 930 1400 -1 371 199 3222 2611 1285 2679 769 440
7267 7193 5811 5546 5712 6304 4922 4599 4469 4257 4229 3402 2935 2213 1450 3019 1041 664 699 1770 371 -1 220 3583 2970 1638 3037 1071 560
7425 7351 5857 5634 5828 6459 4971 4700 4497 4428 4403 3531 2988 2173 1650 2838 974 756 885 1577 199 220 -1 3371 2756 1423 2823 852 375
9639 9571 6675 7045 7573 8685 5977 6400 5331 7000 7017 5734 4547 3238 4779 1244 3064 3674 4064 1900 3222 3583 3371 -1 620 1952 560 2580 3173
9230 9160 6466 6741 7222 8268 5719 6037 5084 6514 6525 5283 4153 2831 4197 1089 2505 3090 3469 1332 2611 2970 2756 620 -1 1334 74 1961 2554
8320 8249 6061 6111 6471 7348 5228 5288 4645 5455 5451 4335 3400 2164 2931 1643 1368 1834 2177 510 1285 1638 1423 1952 1334 -1 1401 648 1231
9300 9231 6523 6805 7289 8338 5780 6105 5143 6587 6598 5355 4222 2901 4270 1130 2578 3162 3540 1406 2679 3037 2823 560 74 1401 -1 2023 2617
8103 8030 6165 6091 6374 7131 5302 5209 4761 5157 5142 4143 3376 2285 2470 2252 1208 1439 1699 1002 769 1071 852 2580 1961 648 2023 -1 594
7799 7725 6164 5977 6187 6832 5281 5052 4787 4802 4776 3897 3307 2397 2010 2774 1201 1120 1253 1499 440 560 375 3173 2554 1231 2617 594 -1

Resources