bash sorting -g for scientific notation (E values)

I tried sort in bash using the -g option. I got the following output for sort -g name.dat:
1.2978025974026E+15 1.2800000000000E-28 3.1000000000000E-29
1.3565266968326E+13 3.9650000000000E-26 1.0000000000000E-29
1.3879277777778E+14 2.5900000000000E-27 6.6000000000000E-28
2.4176806451613E+14 .........................................
Only the first few digits are sorted (e.g., 1.29 < 1.35 < 1.38), but the order isn't actually correct because the scientific notation exponent (E+15) is being ignored.
I also tried sort -k 1 -n name.dat and sort -k 1 -g name.dat, but those don't work either. Changing the E to e also doesn't work.
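One possible explanation, offered as a guess rather than a diagnosis: sort -g parses numbers according to the current locale. If that locale uses a comma as the decimal separator, each key is only read up to the ".", the exponent is never seen, and ties are then broken by comparing the lines as text, which would give exactly the order shown above. The sketch below forces the C locale for a single command. The sample values are the first column of the output above, and sort_sci is a hypothetical helper name.

```shell
# Sort lines by a leading number written in scientific notation.
# LC_ALL=C makes sort treat "." as the decimal point regardless of the
# user's locale; -g (a GNU extension) understands exponents such as E+15.
sort_sci() {
    LC_ALL=C sort -g "$@"
}

# Sample values taken from the first column of the question's output.
printf '%s\n' \
    '1.2978025974026E+15' \
    '1.3565266968326E+13' \
    '1.3879277777778E+14' \
    '2.4176806451613E+14' | sort_sci
```

If this prints the E+13 line first and the E+15 line last, the locale was the culprit; sort_sci name.dat would then sort the real file the same way.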

Related

sort does not work with -h with text file

In my OS, I can find
-h, --human-numeric-sort
compare human readable numbers (e.g., 2K 1G)
And I have a file aaa.txt:
2M
5904K
1G
Then I type
sort -h aaa.txt
The output is
5904K
2M
1G
It's wrong. It should be
2M
5904K
1G
Questions:
Why does sort -h not work? The result is wrong even from a lexicographic perspective. How can I sort the aaa.txt file by human-readable numbers?
Or does it only work with du -h? But the most-voted answer seems to work with awk.
With du -h, sort does not need a field specification such as sort -k1h,1. Why? What would happen if the size were not in the first field?

Why does sort -h not work?
Below is a comment from GNU sort's source code.
/* Compare numbers ending in units with SI xor IEC prefixes
<none/unknown> < K/k < M < G < T < P < E < Z < Y
Assume that numbers are properly abbreviated.
i.e. input will never have both 6000K and 5M. */
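It's not mentioned in the man page, but -h is not supposed to work with your input: the file contains both 5904K and 2M, which is exactly the kind of improperly abbreviated mix the comment rules out, so the K value is ranked below the M value whatever the numbers are.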
How to sort the aaa.txt file in human readable numbers.
You can use numfmt to perform a Schwartzian transform as shown below.
$ numfmt --from=auto < aaa.txt | paste - aaa.txt | sort -n | cut -f2
2M
5904K
1G
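The same decorate-sort-undecorate idea can be stretched to the case asked about in the question, where the size is not in the first field. This is only a sketch, assuming GNU numfmt is available, that fields are separated by blanks, and that every value in the chosen field is a valid size; sort_by_size_field and the sample file names are made up for illustration.

```shell
# Usage: sort_by_size_field FIELD FILE
# Decorate each line with the chosen field converted to a plain number,
# sort numerically on that number, then strip the decoration again.
sort_by_size_field() {
    awk -v f="$1" '{ print $f }' "$2" |   # extract the size field
        numfmt --from=auto |              # 2M -> 2000000, 5904K -> 5904000
        paste - "$2" |                    # number <TAB> original line
        sort -k1,1n |                     # order by the plain number
        cut -f2-                          # drop the number, keep the line
}

# Demo with a throwaway file whose sizes sit in the second field.
tmp=$(mktemp)
printf '%s\n' 'a.log 2M' 'b.log 5904K' 'c.log 1G' > "$tmp"
sort_by_size_field 2 "$tmp"
rm -f "$tmp"
```

Note that numfmt --from=auto reads K, M and G as powers of 1000 and Ki, Mi and Gi as powers of 1024.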

Bash: reshape a dataset of many rows to dataset of many columns

Suppose I have the following data:
# all the numbers are their own number. I want to reshape exactly as below
0 a
1 b
2 c
0 d
1 e
2 f
0 g
1 h
2 i
...
And I would like to reshape the data such that it is:
0 a d g ...
1 b e h ...
2 c f i ...
I would like to do this without writing a complex composition of commands. Is this possible using the unix/bash toolkit?
Yes, I can trivially do this inside a programming language, but the idea is NOT to "just" do that. If some cat X.csv | rs [magic options] sort of solution exists, that is what I am looking for (rs, or the bash reshape command, would be great, except it isn't working here on Debian stretch).
Otherwise, an equivalent answer that involves a composition of commands or a script is out of scope: I already have that, but would rather not use it.

Using GNU datamash:
$ datamash -s -W -g 1 collapse 2 < file
0 a,d,g
1 b,e,h
2 c,f,i
Options:
-s sort
-W use whitespace (spaces or tabs) as delimiters
-g 1 group on the first field
collapse 2 print comma-separated list of values of the second field
To convert the tabs and commas to space characters, pipe the output to tr:
$ datamash -s -W -g 1 collapse 2 < file | tr '\t,' ' '
0 a d g
1 b e h
2 c f i

bash version:
function reshape {
local index number key
declare -A result
while read index number; do
result[$index]+=" $number"
done
for key in "${!result[@]}"; do
echo "$key${result[$key]}"
done
}
reshape < input
Just make sure the input uses Unix (LF) line endings.
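Two caveats of the function above are that bash does not promise any particular order when iterating over associative-array keys, and that a stray carriage return ends up inside the values. Here is a rough alternative in awk, wrapped in a shell function, that prints keys in first-seen order and strips a trailing CR itself. The name reshape_ordered is made up, and skipping lines that start with # is my assumption about how the comment line in the sample data should be treated.

```shell
# Read "key value" pairs on stdin, print one line per key with all of
# its values, keeping the order in which the keys first appeared.
reshape_ordered() {
    awk '
        { sub(/\r$/, "") }                   # drop a trailing CR (DOS files)
        NF < 2 || $1 ~ /^#/ { next }         # skip blank or comment lines
        !($1 in row) { order[++n] = $1 }     # remember first-seen key order
        { row[$1] = row[$1] " " $2 }         # append the value to its row
        END { for (i = 1; i <= n; i++) print order[i] row[order[i]] }
    '
}

# Demo with CRLF line endings on purpose.
printf '0 a\r\n1 b\r\n2 c\r\n0 d\r\n1 e\r\n2 f\r\n' | reshape_ordered
```

It is called the same way as the bash version: reshape_ordered < input.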

Sorting using -k

I tried this solution on my list, but I can't get what I want after sorting.
I have this list:
m_2_mdot_3_a_1.dat ro= 303112.12
m_1_mdot_2_a_0.dat ro= 300.10
m_2_mdot_1_a_3.dat ro= 221.33
m_3_mdot_1_a_1.dat ro= 22021.87
I used sort -k 2 -n >name.txt
I would like to get the list ordered from the lowest ro to the highest ro. What did I do wrong?
The output was sorted either by the names in the first column or by the last value, but in an order like 1000, 100001, 1000.2 ..., as if only the first four significant digits were compared.

cat test.txt | tr . , | sort -k3 -g | tr , .
The linked question "Sort scientific and float" has a good answer.
In brief:
you need the -g option to sort decimal numbers;
field numbers for the -k option start from 1, not 0;
and, depending on your locale, sort may expect , rather than . as the decimal separator.
However, be careful if your name.txt contains , characters, because the final tr turns every comma into a dot.

Since there's a space or a tab between ro= and the numeric value, you need to sort on the 3rd column instead of the 2nd. So your command will become:
cat input.txt | sort -k 3 -n
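Putting the points from both answers together, a minimal sketch: restrict the key to the third field with -k3,3 (a bare -k 3 runs from field 3 to the end of the line), compare it with -g, and pin the locale so that "." is the decimal point without the tr round trip. The sample lines are the ones from the question; sort_by_ro is a hypothetical helper name.

```shell
# Sort "name ro= value" lines by the numeric value in the third field.
# -k3,3g : key starts and ends at field 3, compared as a general number
# LC_ALL=C : treat "." as the decimal point regardless of the user's locale
sort_by_ro() {
    LC_ALL=C sort -k3,3g "$@"
}

printf '%s\n' \
    'm_2_mdot_3_a_1.dat ro= 303112.12' \
    'm_1_mdot_2_a_0.dat ro= 300.10' \
    'm_2_mdot_1_a_3.dat ro= 221.33' \
    'm_3_mdot_1_a_1.dat ro= 22021.87' | sort_by_ro
```

For a file, sort_by_ro input.txt > name.txt gives the list from the lowest ro to the highest.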

Determining the number of decimals in a float number

I want to run a command (such as ls -lrt) 49 times and every time 20 milliseconds after the previous run. What I have written in my bash file is:
for i in `seq 1 49`;
do
v=6.$((i*20))
sleep $v && ls -lrt
done
But it apparently does not distinguish cases such as i=4 and i=40, since both result in v=6.8. What I need is to wait 6.080 for i=4 and 6.800 for i=40.

You can use printf to format the number:
printf -v v '6.%03d' $((i*20))
-v v specifies that the variable $v should hold the result.

How about v=$(echo "scale=2;6+$i*0.02"|bc)
Unlike string concatenation, this keeps increasing correctly once the result passes 7, although that won't happen for i up to 49. Personally I think it is better than string concatenation.
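A third variant, sketched here with plain shell integer arithmetic: keep the delay in whole milliseconds and only format it as seconds at the end. That gives the zero padding of the printf answer and the correct carry past 7 of the bc answer without calling bc. The 6-second base and 20 ms step come from the question; delay_for is a made-up name.

```shell
# Print the delay for step $1 as seconds with three decimals.
# All arithmetic is done in integer milliseconds, so 6000 + 4*20 = 6080
# is printed as 6.080 and anything past 6999 ms carries into the seconds.
delay_for() {
    ms=$((6000 + $1 * 20))
    printf '%d.%03d\n' "$((ms / 1000))" "$((ms % 1000))"
}

for i in 4 40 50; do
    echo "i=$i -> $(delay_for "$i")"
done
```

Inside the original loop this would be used as sleep "$(delay_for "$i")" && ls -lrt.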

KornShell Sort Array of Integers

Is there a command in KornShell (ksh) scripting to sort an array of integers? In this specific case, I am interested in simplicity over efficiency. For example, suppose the variable $UNSORTED_ARR contains the values "100911, 111228, 090822" and I want to store the sorted result in $SORTED_ARR.

Is it actually an indexed array or a list in a string?
Array:
UNSORTED_ARR=(100911 111228 090822)
SORTED_ARR=($(printf "%s\n" ${UNSORTED_ARR[@]} | sort -n))
String:
UNSORTED_ARR="100911, 111228, 090822"
SORTED_ARR=$(IFS=, printf "%s\n" ${UNSORTED_ARR[@]} | sort -n | sed ':a;$s/\n/,/g;N;ba')
There are several other ways to do this, but the principle is the same.
Here's another way for a string using a different technique:
set -s -- ${UNSORTED_ARR//,}
SORTED_ARR=$@
SORTED_ARR=${SORTED_ARR// /, }
Note that this is a lexicographic sort so you would see this kind of thing when the numbers don't have leading zeros:
$ set -s -- 10 2 1 100 20
$ echo $@
1 10 100 2 20
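When the values lack leading zeros and a numeric order is needed, one way around the lexicographic behaviour is to hand the comparison to sort -n. The sketch below is plain POSIX shell rather than ksh-specific set -s, and sort_numeric is a hypothetical helper name. It also shows that the result can still be looped over by letting the shell split it into words.

```shell
# Print the arguments in ascending numeric order on one line.
sort_numeric() {
    printf '%s\n' "$@" | sort -n | paste -sd ' ' -
}

# Numeric order, unlike the lexicographic result of "set -s".
sort_numeric 10 2 1 100 20

# Unquoted command substitution is split into words, so it can be looped over.
for n in $(sort_numeric 100911 111228 090822); do
    echo "value: $n"
done
```

The first call prints 1 2 10 20 100, and the loop visits 090822, 100911 and 111228 in that order.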

If I take that out then it works but I can't loop through it (because it's a list of strings now) – pws5068 Mar 4 '11 at 21:01
Do this:
# create sorted array
set -s -A $@
