Here is another question, about sorting a list with decimals:
$ list="1 2 5 2.1"
$ for j in "${list[@]}"; do echo "$j"; done | sort -n
1 2 5 2.1
I expected
1 2 2.1 5
If you intended the variable list to be an array, then you needed to say:
list=(1 2 5 2.1)
for j in "${list[@]}"; do echo "$j"; done | sort -n
which would result in
1
2
2.1
5
Alternatively, keep list as the string it is: you do not need "${list[@]}", just plain $list. Quoted that way, the whole string stays in one field, so the loop echoes a single line and sort has nothing to reorder. Unquoted, it is split on whitespace:
for j in $list; do echo "$j"; done | sort -n
or
printf '%s\n' $list | sort -n
$ for j in $list; do echo $j; done | sort -n
1
2
2.1
5
With your original code, nothing was sorted at all:
$ list="77 1 2 5 2.1 99"
$ for j in "${list[@]}"; do echo "$j"; done | sort -n
77 1 2 5 2.1 99
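To see why nothing was sorted, count the fields the loop actually receives. A quick check (plain bash, counting the lines each expansion produces):
$ list="77 1 2 5 2.1 99"
$ printf '%s\n' "$list" | wc -l   # quoted: one field -> one line
1
$ printf '%s\n' $list | wc -l     # unquoted: split on whitespace -> six lines
6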
I have a file like (a blank line separates the two halves):
1
2
3
4
5

a
b
c
d
e
And want to put it like:
1 a
2 b
3 c
4 d
5 e
Is there a quick way to do it in bash?
pr is the tool to use for columnizing data: -2 requests two columns filled down the page, -T suppresses pagination and headers, and -s" " separates the columns with a single space:
pr -s" " -T -2 filename
With paste and process substitution:
$ paste -d " " <(sed -n '1,/^$/{/^$/d;p}' file) <(sed -n '/^$/,${//!p}' file)
1 a
2 b
3 c
4 d
5 e
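To see what paste is stitching together, run each sed half on its own: the first prints everything above the blank line, the second everything below it:
$ sed -n '1,/^$/{/^$/d;p}' file
1
2
3
4
5
$ sed -n '/^$/,${//!p}' file
a
b
c
d
e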
A simple bash script that does the job:
#!/bin/bash
nums=()
is_line=0
while read line
do
    if [[ ${line} == '' ]]
    then
        # hit the blank separator: switch from collecting to pairing
        is_line=1
    else
        if [[ ${is_line} == 0 ]]
        then
            # first half: queue the line
            nums=("${nums[@]}" "${line}")
        else
            # second half: print the oldest queued line with this one
            echo "${nums[0]} ${line}"
            nums=("${nums[@]:1}")
        fi
    fi
done < "${1}"
Reading with done < "${1}" instead of cat ${1} | while ... keeps the loop out of a subshell.
Run it like this: ./script filename
Example:
$ ./script filein
1 a
2 b
3 c
4 d
5 e
$ rs 2 5 <file | rs -T
1 a
2 b
3 c
4 d
5 e
If you want that extra separator space gone, add -g1 to the latter rs. Explained:
rs 2 5 prints the file in 2 rows of 5 columns
rs -T transposes that result
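To see the intermediate step, run the first rs by itself: it lays the ten entries out as 2 rows of 5, which the second rs then transposes (exact spacing may vary between rs implementations):
$ rs 2 5 <file
1  2  3  4  5
a  b  c  d  e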
I'm working on a small script to take 3 numbers in a single line, sum and average them, and print the result at the end of the line. I know how to use the paste command, but everything I'm finding is telling me how to average a column. I need to average a line, not a column. Any advice? Thanks!
awk to the rescue!
$ echo 1 2 3 | awk -v RS=' ' '{sum+=$1; count++} END{print sum, sum/count}'
6 2
works for any number of input fields
$ echo 1 2 3 4 | awk -v RS=' ' '{sum+=$1; count++} END{print sum, sum/count}'
10 2.5
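If you want the sum and mean appended to the original line, as the question asked, a small variant loops over the fields instead of changing the record separator:
$ echo 1 2 3 | awk '{sum=0; for (i=1; i<=NF; i++) sum+=$i; print $0, sum, sum/NF}'
1 2 3 6 2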
You can manipulate your line before giving it to bc. With bc you have additional possibilities such as setting the scale.
A simple mean of 1 2 3 would be (the first sed expression turns each space after a digit into +, the second wraps the sum as (...)/3):
echo "1 2 3" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/3/' | bc
You can wrap it in a function and see more possibilities:
function testit {
    echo "Input $@"
    echo "Integer mean"
    echo "$@" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/'$#'/' | bc
    echo "floating decimal mean"
    echo "$@" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/(&)\/'$#'/' | bc -l
    echo "2 decimal output mean"
    echo "$@" | sed -e 's/\([0-9.]\) /\1+/g' -e 's/.*/scale=2; (&)\/'$#'/' | bc
    echo
}
Here $@ expands to the numbers themselves and $# to their count, which becomes the divisor.
testit 4 5 6
testit 4 5 8
testit 4.2 5.3 6.4
testit 1 2 3 4 5 6 7 8 9
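For instance, the first call should print roughly this (bc -l defaults to a scale of 20, hence the long fraction):
Input 4 5 6
Integer mean
5
floating decimal mean
5.00000000000000000000
2 decimal output mean
5.00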
I have this bash code:
k=$1
m=$2
fileName=$3
head -n -$k "$fileName" | tail -n +$m
When I execute it, it removes less than it should. For example, ./strip.sh 4 5 hi.txt > bye.txt should remove the first 4 lines and the last 5 lines, but it removes the first 4 lines and only the last "4" lines. Also, when I execute ./strip.sh 1 1 hi.txt > bye.txt, it only removes the last line, not the first line....
The arguments got crossed: head -n -"$k" trims the last k lines (not the first), and tail -n +"$m" starts output at line m, dropping only the first m-1 lines, which is why 4 5 trimmed four lines from each end. Swapping the roles and adding 1 fixes it:
#!/bin/sh
tail -n +"$(( $1 + 1 ))" <"$3" | head -n -"$2"
Tested as follows:
set -- 4 5 /dev/stdin # assign $1, $2 and $3
printf '%s\n' {1..20} | tail -n +"$(( $1 + 1 ))" <"$3" | head -n -"$2"
...which correctly prints numbers between 5 and 15, trimming the first 4 from the front and 5 from the back. Similarly, with set -- 3 6 /dev/stdin, numbers between 4 and 14 inclusive are printed, which is likewise correct.
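Folding the fix back into the shape of your script (note that head -n -NUM relies on GNU head; it is not POSIX):
#!/bin/sh
k=$1         # lines to strip from the top
m=$2         # lines to strip from the bottom
fileName=$3
# skip the first k lines, then trim the last m
tail -n +"$((k + 1))" "$fileName" | head -n -"$m"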
if I have the following:
1 5 a
2 5 a
3 5 a
4 5 a
5 5 a
6 5 a
1 3 b
2 3 b
3 3 b
4 3 b
5 3 b
6 3 b
How do I only select rows where the two columns have the same value i.e.
5 5 a
3 3 b
in bash / awk / sed.
I know how to select rows with certain values using awk, but only when I specify the value.
Just say:
$ awk '$1==$2' file
5 5 a
3 3 b
As you can see, when the condition $1 == $2 holds, awk prints the line automatically: a pattern with no action defaults to printing the record.
perl -ane 'print if $F[0] == $F[1]' file
For completeness:
bash
while read first second rest; do
[[ $first -eq $second ]] && echo "$first $second $rest"
done < file
or if content is not just integers:
while read first second rest; do
[[ $first == $second ]] && echo "$first $second $rest"
done < file
sed
sed -En '/^([^ ]+) \1 /p' file
This might work for you (GNU grep):
grep '^\(\S\+\)\s\+\1\s\+' file
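If you later need to choose which columns to compare, the awk version parameterizes cleanly; here a and b are just assumed column numbers passed in with -v:
$ awk -v a=1 -v b=2 '$a == $b' file
5 5 a
3 3 b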
Problem
If I have a long file with lots of lines of varying lengths, how can I count the occurrences of each line length?
Example:
file.txt
this
is
a
sample
file
with
several
lines
of
varying
length
Running count_line_lengths file.txt would give:
Length Occurrences
1 1
2 2
4 3
5 1
6 2
7 2
Ideas?
This counts the line lengths using awk, then sorts the (numeric) line lengths using sort -n, and finally counts the unique line-length values with uniq -c:
$ awk '{print length}' input.txt | sort -n | uniq -c
1 1
2 2
3 4
1 5
2 6
2 7
In the output, the first column is the number of lines with the given length, and the second column is the line length.
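To get the Length/Occurrences order the question asked for, swap the two columns with one more awk stage:
$ awk '{print length}' input.txt | sort -n | uniq -c | awk '{print $2, $1}'
1 1
2 2
4 3
5 1
6 2
7 2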
Pure awk
awk '{++a[length()]} END{for (i in a) print i, a[i]}' file.txt
4 3
5 1
6 2
7 2
1 1
2 2
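The for (i in a) loop visits the keys in no particular order; pipe through sort -n if you want the lengths ascending:
awk '{++a[length()]} END{for (i in a) print i, a[i]}' file.txt | sort -n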
Using bash arrays:
#!/bin/bash
while read line; do
    ((histogram[${#line}]++))
done < file.txt

echo "Length Occurrence"
for length in "${!histogram[@]}"; do
    printf "%-6s %s\n" "${length}" "${histogram[$length]}"
done
Example run:
$ ./t.sh
Length Occurrence
1 1
2 2
4 3
5 1
6 2
7 2
$ perl -lne '$c{length($_)}++ }{ print qq($_ $c{$_}) for (keys %c);' file.txt
Output
6 2
1 1
4 3
7 2
2 2
5 1
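The key order is arbitrary here as well; sorting the keys numerically inside the one-liner fixes that:
$ perl -lne '$c{length($_)}++ }{ print qq($_ $c{$_}) for (sort { $a <=> $b } keys %c);' file.txt
1 1
2 2
4 3
5 1
6 2
7 2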
Try this:
awk '{print length}' FILENAME
Or the following if you want the longest length (remember to pass the file name; FILENAME is set by awk itself):
awk '{ln=length} ln>max{max=ln} END {print FILENAME " " max}' FILENAME
You can combine above command with find using -exec option.
You can accomplish this by using basic unix utilities only:
$ printf "%s %s\n" $(for line in $(cat file.txt); do printf $line | wc -c; done | sort -n | uniq -c | sed -E "s/([0-9]+)[^0-9]+([0-9]+)/\2 \1/")
1 1
2 2
4 3
5 1
6 2
7 2
How does it work?
Here's the source file:
$ cat file.txt
this
is
a
sample
file
with
several
lines
of
varying
length
Replace each line of the source file with its length:
$ for line in $(cat file.txt); do printf $line | wc -c; done
4
2
1
6
4
4
7
5
2
7
6
Sort and count the number of length occurrences:
$ for line in $(cat file.txt); do printf $line | wc -c; done | sort -n | uniq -c
1 1
2 2
3 4
1 5
2 6
2 7
Swap and format the numbers:
$ printf "%s %s\n" $(for line in $(cat file.txt); do printf $line | wc -c; done | sort -n | uniq -c | sed -E "s/([0-9]+)[^0-9]+([0-9]+)/\2 \1/")
1 1
2 2
4 3
5 1
6 2
7 2
If you allow for the columns to be swapped and don't need the headers, something as easy as
while read line; do echo -n "$line" | wc -m; done < file | sort | uniq -c
(without any advanced tricks with sed or awk) will work. The output is:
1 1
2 2
3 4
1 5
2 6
2 7
One important thing to keep in mind: wc -c counts the bytes, not the characters, and will not give the correct length for strings containing multibyte characters. Therefore the use of wc -m.
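A quick illustration of the difference, assuming a UTF-8 locale (ï is one character but two bytes):
$ echo -n "naïve" | wc -c
6
$ echo -n "naïve" | wc -m
5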
References:
uniq(1)
sort(1)
wc(1)