Sort a dict by its value pairs - sorting

I have a dict which looks like this:
Unsorted:
12 {12 489} 29 {89 12} 27 {301 302} 26 {489 329} 8 {89 302} 55 {44 301}
I would like to sort it like this:
55 {44 301} 27 {301 302} 8 {89 302} 29 {89 12} 12 {12 489} 26 {489 329}
As you can see, most of the time the second value of an entry is identical to the first value of the following entry (12 and 489 in the last two entries).
This is not a requirement, though. The 302 in the second and third entries also fulfills the "chain" requirement, since it appears in both of them.
All I want is to sort these entries so that the values in braces form an uninterrupted chain.
It does not matter whether the result looks like the example or is mirrored.
From Tcl 8.6 on I could do something similar to "Sort Tcl dict by value" using stride. But I'm stuck with this version (Tcl 8.5.9). What is the easiest way to do this?

I don't know if this is the easiest way:
set x [dict create 12 {12 489} 29 {89 12} 27 {301 302} 26 {489 329} 8 {89 302} 55 {44 301}]
# transform the dict into a list of lists
dict for {k v} $x {lappend unsorted [list $k $v]}
lappend sorted [lindex $unsorted 0]
set unsorted [lrange $unsorted 1 end]
# keep going until there's nothing more to add to the sorted list
while {[llength $unsorted] != 0} {
    set changed false
    for {set idx 0} {$idx < [llength $unsorted]} {incr idx} {
        set elem [lindex $unsorted $idx]
        lassign [lindex $elem end] a b
        set head [lindex $sorted 0 end]
        set tail [lindex $sorted end end]
        if {$a in $head || $b in $head} {
            set sorted [linsert $sorted 0 $elem]
            set changed true
        } elseif {$a in $tail || $b in $tail} {
            lappend sorted $elem
            set changed true
        }
        if {$changed} {
            set unsorted [lreplace $unsorted $idx $idx]
            break
        }
    }
    # avoid infinite loop if the unsorted list is not empty, but
    # contains nothing to add to the sorted list
    if {! $changed} break
}
foreach elem $sorted {dict set y {*}$elem}
puts "Unsorted: $x"
puts "Sorted: $y"
Unsorted: 12 {12 489} 29 {89 12} 27 {301 302} 26 {489 329} 8 {89 302} 55 {44 301}
Sorted: 55 {44 301} 27 {301 302} 8 {89 302} 29 {89 12} 12 {12 489} 26 {489 329}

Related

Interchange columns using bash

I have a file containing two columns e.g.
10 25
26 38
40 62
85 65
88 96
97 8
I want first column to contain all minimum values and second column containing all maximum values. Something like this:
10 25
26 38
40 62
65 85
88 96
8 97
Hope this helps.
awk '{
    if ($2 < $1)
        print $2, $1;
    else
        print $1, $2;
}' filename.txt
Make sure the columns are separated by whitespace.
Using Python, it is straightforward:
values = [
(10, 25),
(26, 38),
(40, 62),
(85, 65),
(88, 96),
(97, 8),
]
result = [(min(v), max(v)) for v in values]
You get:
[(10, 25), (26, 38), (40, 62), (65, 85), (88, 96), (8, 97)]
Using bash… I don't know:
python -c '<your command here>'
I have tried this
#!/bin/sh
echo -n "Enter a file name > "
read name
if ["$1" -gt "$2"];
then
awk ' { t= $1; $1 = $2; $2=t ; print; } ' $name
fi
exit 0;
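The attempt above compares the script's own positional parameters ($1 and $2) rather than the two fields of each input line, and [ needs spaces around its arguments. A minimal pure-shell sketch that does the comparison per line might look like this; the function name swap_min_max is made up for illustration, and it assumes integer values, since -gt does not handle decimals:

```shell
#!/bin/bash
# Hypothetical helper: read "a b" pairs on stdin and print the smaller
# value first. Assumes both fields are integers.
swap_min_max() {
    local a b
    while read -r a b; do
        if [ "$a" -gt "$b" ]; then
            printf '%s %s\n' "$b" "$a"
        else
            printf '%s %s\n' "$a" "$b"
        fi
    done
}
```

It could be called as swap_min_max < filename.txt.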

Need help formatting data

I need help formatting my data. I have data like this:
Ver1
12 45
Ver2
134 23
Ver3
2345 980
ver4
21 1
ver36
213141222 22
....
...etc
I need my data in the format below:
ver1 12 45
ver2 134 23
ver3 2345 980
ver4 21 1
etc.....
I also want the totals of columns 2 and 3 at the end of the output. I'm not sure how to script this (maybe awk can do it, but I'm not sure). If possible, please share a detailed answer so that I can learn and understand.
$ awk 'NR%2{printf "%s ", $0; next}
{col1+=$1; col2+=$2} 1;
END{print "TOTAL col1="col1, "col2="col2}' file
Ver1 12 45
Ver2 134 23
Ver3 2345 980
ver4 21 1
ver36 213141222 22
TOTAL col1=213143734 col2=1071
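It merges every two lines, as in Kent's solution: odd-numbered lines are printed without a trailing newline, so the line that follows is appended to them. It also sums the 1st and 2nd fields of the numeric lines into the col1 and col2 variables. Finally, it prints the totals in the END {} block.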
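The same pairing can be sketched in plain shell by reading two lines per loop iteration. This is only an illustration of the idea, not a replacement for the awk one-liner; merge_and_total is a hypothetical name, and it assumes the input strictly alternates between a label line and a line with two integers:

```shell
#!/bin/bash
# Hypothetical helper: join each label line with the data line after it
# and keep running totals of the two numeric fields.
merge_and_total() {
    local label a b sum1=0 sum2=0
    # First read takes the label, second read takes the two numbers.
    while read -r label && read -r a b; do
        printf '%s %s %s\n' "$label" "$a" "$b"
        sum1=$((sum1 + a))
        sum2=$((sum2 + b))
    done
    printf 'TOTAL col1=%d col2=%d\n' "$sum1" "$sum2"
}
```

It could be called as merge_and_total < file.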

Bash script number generator

I need to generate random numbers in a specific format as test data. For example, given a number "n" I need to produce "n" random numbers and write them in a file. The file must contain at most 3 numbers per line. Here is what I have:
#!/bin/bash
m=$1
output=$2
for ((i=1; i<= m; i++)) do
echo $((RANDOM % 29+2)) >> $output
done
This outputs the numbers as:
1
2
24
21
10
14
and what I want is:
1 2 24
21 10 14
Thank you for your help!
Pure bash (written as a function rather than a script file)
randx3() {
local d=$' \n'
local i
for ((i=0;i<$(($1 - 1));++i)); do
printf "%d%c" $((RANDOM%29 + 2)) "${d:$((i%3)):1}"
done
printf "%d\n" $((RANDOM%29 + 2))
}
Note that it doesn't take a file argument; rather it outputs to stdout, so you would use it like this:
randx3 11 > /path/to/output
That style is often more flexible.
Here's a less hacky one which allows you to select how often you want a newline:
randx() {
local i
local m=$1
local c=${2:-3}
for ((i=1;i<=m;++i)); do
if ((i%c && i<m)); then
printf "%d " $((RANDOM%29 + 2))
else
printf "%d\n" $((RANDOM%29 + 2))
fi
done
}
Call that one as randx 11 or randx 11 7 (second argument defaults to 3).
Pipe the output to a command that will read 3 lines at a time:
for ((i=1; i<= m; i++)) do
echo $((RANDOM % 29+2))
done | sed -e '$!N;$!N;s/\n/ /g' >> $output
This is what paste was designed for:
$ for i in {0..10}; do echo $RANDOM; done | paste -d' ' - - -
14567 3240 16354
17457 25616 12772
3912 7490 12206
7342 10554
Another approach would be to build up the values in an array, then use printf.
m=$1
output=$2
vals=()
while (( m-- )); do
vals+=( $((RANDOM % 29+2)) )
done
printf '%d %d %d\n' "${vals[@]}" > "$output"
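One caveat with the printf approach: the format string is reused until the arguments run out, and a %d with no argument left prints 0, so a count that is not a multiple of 3 pads the last line with zeros. A sketch that avoids the padding by consuming the values three at a time (group_by_three is a hypothetical name, not part of the answer above):

```shell
#!/bin/bash
# Hypothetical helper: print its arguments three per line, with a
# shorter last line instead of zero padding.
group_by_three() {
    while [ "$#" -gt 3 ]; do
        echo "$1 $2 $3"
        shift 3
    done
    # Between 1 and 3 values remain here, unless there were none at all.
    if [ "$#" -gt 0 ]; then
        echo "$*"
    fi
}
```

With the array from the answer it could be called as group_by_three "${vals[@]}" > "$output".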
Shortest!!!
I need to produce "n" random numbers and write them in a file. The file must contain at most 3 numbers per line.
pr -t -3 -s' ' <(for ((n=6;n--;)){ echo $((RANDOM % 29+2));}) >file
Then
cat file
11 29 27
14 21 22
YAS: Yet another bash solution
As a script:
#!/bin/bash
n=$1
file=$2
out=()
>$file
for ((i=1;i<=n;i++));do
out+=($((RANDOM%29+2)))
[ $((i%3)) -eq 0 ] && echo ${out[*]} >>$file && out=()
done
[ "$out" ] && echo ${out[*]} >>$file
Usage:
script <quantity of random> <filename>
Important remark about RANDOM%29
This way of producing a random number between 2 and 30 is not uniform!
As $RANDOM gives a number between 0 and 32767, the distribution is:
for ((i=0;i<32768;i++)) ;do
((RL[$((i%29+2))]++))
done
for ((i=0;i<32;i++));do
printf "%3d %5d\n" $i ${RL[i]}
done | column
0 0 7 1130 14 1130 21 1130 28 1130
1 0 8 1130 15 1130 22 1130 29 1129
2 1130 9 1130 16 1130 23 1130 30 1129
3 1130 10 1130 17 1130 24 1130 31 0
4 1130 11 1130 18 1130 25 1130
5 1130 12 1130 19 1130 26 1130
6 1130 13 1130 20 1130 27 1130
... there are 1130 chances to obtain each number from 2 to 28, but only 1129 chances to obtain a 29 or a 30.
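These counts follow from integer division: $RANDOM has 32768 possible values, and 32768 = 29 × 1129 + 27, so the first 27 remainders (results 2 to 28) occur once more than the last two (29 and 30). A quick check of that arithmetic, with bias_counts as a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: for a modulus $1, print the minimum number of
# times each result occurs, then how many results occur one extra time.
bias_counts() {
    echo "$((32768 / $1)) $((32768 % $1))"
}

bias_counts 29    # prints: 1129 27
```

Any $RANDOM value at or above 29 × 1129 = 32741 falls in the incomplete last cycle, which is where the bias comes from.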
To prevent this, you have to drop unwanted results:
random2to30() {
local _random=32769
while (( $_random>=32741 )) ;do
_random=$RANDOM;
done;
printf -v $1 "%d" $((2+_random%29))
}
The proof:
tstr2to30() {
unset $1
local _random=32769
while (( $_random>=32741 )); do
read _random || break
done
[ "$_random" ] && printf -v $1 "%d" $((2 +_random % 29 ))
}
unset RL
while tstr2to30 MyRandom && [ "$MyRandom" ] ;do
((RL[MyRandom]++))
done < <(seq 0 32767)
for ((i=0;i<32;i++));do
printf "%3d %5d\n" $i ${RL[i]}
done | column
This gives:
0 0 7 1129 14 1129 21 1129 28 1129
1 0 8 1129 15 1129 22 1129 29 1129
2 1129 9 1129 16 1129 23 1129 30 1129
3 1129 10 1129 17 1129 24 1129 31 0
4 1129 11 1129 18 1129 25 1129
5 1129 12 1129 19 1129 26 1129
6 1129 13 1129 20 1129 27 1129
Now every value has exactly the same number of chances (1129)!
Final usable script
So the script could become (don't forget the bash shebang!):
#!/bin/bash
n=${1:-11} # default to 11 values
c=${2:-3} # default to 3 values per line
minval=${3:-2} # default to a minimum random value of 2
maxval=${4:-30} # default to a maximum random value of 30
file=${5:-/dev/stdout} # default to STDOUT
rnum=$(( maxval - minval + 1 ))
rmax=$(( ( 32768 / rnum ) * rnum ))
randomGen() {
local _random=33000
while [ $_random -ge $rmax ] ;do
_random=$RANDOM
done
printf -v $1 "%d" $(( minval +_random % rnum ))
}
out=()
for ((i=1;i<=n;i++));do
randomGen MyRandom
out+=($MyRandom)
[ $((i%c)) -eq 0 ] && echo ${out[*]} >>"$file" && out=()
done
[ "$out" ] && echo ${out[*]} >>"$file"
This awk command prints a space after each number, or a newline after every third one:
for ((i=1; i<= m; i++)); do
echo $((RANDOM % 29+2))
done | awk '{printf "%s%c", $1, (NR % 3) ? " " : "\n"}' >> $output
Yet another way of doing it, letting xargs group the numbers three per line:
for ((i=0; i<m; i++)); do echo $((RANDOM % 29+2)); done | xargs -n3 > "$output"

split a file based upon line number

I have a large file that needs to be split based on line numbers.
For instance, my file looks like this:
aaaaaa
bbbbbb
cccccc
dddddd
****** //here blank line//
eeeeee
ffffff
gggggg
hhhhhh
*******//here blank line//
ıııııı
jjjjjj
kkkkkk
llllll
******
//And so on...
I need two separate files: one should contain the first 4 lines, the third 4 lines, the fifth 4 lines and so on, and the other should contain the second 4 lines, the fourth 4 lines, the sixth 4 lines and so on. How can I do that in a bash script?
You can play with the number of the line, NR:
$ awk 'NR%10>0 && NR%10<5' your_file > file1
$ awk 'NR%10>5' your_file > file2
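Each block is 4 lines plus a blank line, so the pattern repeats every 10 lines.
If the line number is 10k + n with 0 < n < 5, the line goes to the first file.
If it is 10k + n with n > 5, the line goes to the second file.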
In one line:
$ awk 'NR%10>0 && NR%10<5 {print > "file1"} NR%10>5 {print > "file2"}' file
Test
$ cat a
1
2
3
4
6
7
8
9
11
12
13
14
16
17
18
19
21
22
23
24
26
27
28
29
31
32
33
34
36
37
38
39
41
42
43
44
46
47
48
49
51
$ awk 'NR%10>0 && NR%10<5 {print > "file1"} NR%10>5 {print > "file2"}' a
$ cat file1
1
2
3
4
11
12
13
14
21
22
23
24
31
32
33
34
41
42
43
44
51
$ cat file2
6
7
8
9
16
17
18
19
26
27
28
29
36
37
38
39
46
47
48
49
You can do this with head and tail (which are not part of bash itself):
head -n 20 <file> | tail -n 5
gives you lines 16 to 20.
This is inefficient, however, if you want to get multiple sections of your file, since the file has to be parsed again and again. In that case I'd prefer some real scripting.
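For a single range, sed can do the same in one pass and stop reading once the last wanted line has been printed. A small sketch, assuming a POSIX sed; print_range is a hypothetical name:

```shell
#!/bin/bash
# Hypothetical helper: print lines $1 through $2 of stdin, then quit
# so the rest of the input is not read.
print_range() {
    sed -n "${1},${2}p;${2}q"
}
```

It could be called as print_range 16 20 < file. Several ranges can also be listed in one sed script, for example sed -n '1,4p;11,14p' file, which still reads the file only once.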
Another approach is to treat blank-line-separated paragraphs as the records, and print odd-numbered and even-numbered records to different files:
awk -v RS= -v ORS='\n\n' '{
outfile = (NR % 2 == 1) ? "file1" : "file2"
print > outfile
}' file
Maybe something like that:
#!/bin/bash
EVEN="even.log"
ODD="odd.log"
line_count=0
block_count=0
while IFS= read -r line
do
    # ignore blank lines
    if [ -n "$line" ]; then
        if [ $(( block_count % 2 )) -eq 0 ]; then
            # even
            echo "$line" >> "$EVEN"
        else
            # odd
            echo "$line" >> "$ODD"
        fi
        line_count=$(( line_count + 1 ))
        if [ "$line_count" -eq 4 ]; then
            block_count=$(( block_count + 1 ))
            line_count=0
        fi
    fi
done < "$1"
The first argument is the source file: ./split.sh split_input
This script prints lines from file 1.txt with indexes 0, 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, ...
i=0
while read -r p; do
    # [ ] does no arithmetic itself, so compute the remainder with $(( ))
    if [ $((i % 8)) -lt 4 ]
    then
        echo "$p"
    fi
    i=$((i + 1))
done < 1.txt
This script prints lines with indexes 4, 5, 6, 7, 12, 13, 14, 15, ...
i=0
while read -r p; do
    # [ ] does no arithmetic itself, so compute the remainder with $(( ))
    if [ $((i % 8)) -gt 3 ]
    then
        echo "$p"
    fi
    i=$((i + 1))
done < 1.txt

Sorting a column with only min value

I have this kind of file :
abak 1 2 3 4
b.b 2 3 4 5
abak 2 5 6 2
b.b -1.2 3 4 6
cc 3 4 5 6
And I want
abak 1 2 3 4
b.b -1.2 3 4 6
cc 3 4 5 6
That is, a file sorted by column 2 that keeps only the minimum value of that column for each name.
As a first step I tried to sort the lines with:
set file [open "[lindex $argv 0]" "r"]
foreach line [split [read $file] "\n"] {
    lappend records [split $line " "]
}
set records [lsort -index 1 -real $records]
foreach record $records {
    puts [join $record " "]
}
but I got the error:
expected floating-point number but got ""
while executing
"lsort -index 1 -real $records"
Not all values in column 2 are written as floating-point numbers, but they are all real numbers.
Why doesn't it work?
Thanks
This is very much a question about creating and manipulating a data structure. This is how I would approach it:
set fid [open filename r]
set data [dict create]
while {[gets $fid line] != -1} {
set fields [regexp -inline -all {\S+} $line]
dict lappend data [lindex $fields 0] [lrange $fields 1 end]
}
dict for {key values} $data {
puts [format "%-5s %s" $key [lindex [lsort -real -index 0 $values] 0]]
}
outputs
abak 1 2 3 4
b.b -1.2 3 4 6
cc 3 4 5 6
Your key problem is that split is not the right way to extract those records: it converts multi-space sequences into empty elements. Instead, you want to use this:
lappend records [regexp -all -inline {\S+} $line]
That will convert the line into its list of non-space sequences. (Yes, you lose the spaces when you reconvert; that's usually not too big a problem but you can handle it if you need to.) The rest of your code looks fine enough.
