sort strings in csh - sorting

set abc=( x1 y1 x2 y2 x21 y21 x22 y22 )
set new=`echo $abc | sort -kn`
echo $new
The above script gives me the same array.
I expect
x1 x2 x21 x22 y1 y2 y21 y22
Where did I go wrong?

sort sorts by lines, and you're giving it just a single line of input.
Put each word on its own line first (here with fmt -1), then sort:
set abc = ( x1 y1 x2 y2 x21 y21 x22 y22 )
set new = `echo $abc | fmt -1 | sort -n`
echo $new
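The same idea can be sketched in POSIX sh, in case csh is not a requirement. This is only an illustration: the variable names and the use of LC_ALL=C (to pin the collation order) are assumptions, not part of the original answer.

```shell
# Word list from the question, kept as a plain space-separated string.
words="x1 y1 x2 y2 x21 y21 x22 y22"

# printf repeats its format for every argument, so each word lands on its own
# line; sort then sees eight lines instead of one, and paste -s joins the
# sorted lines back into a single space-separated line.
# LC_ALL=C is an assumption made here to get plain byte-order collation.
sorted=$(printf '%s\n' $words | LC_ALL=C sort | paste -s -d ' ' -)

echo "$sorted"
```

Note that this is a plain string sort; it happens to give the order asked for because x2 is a prefix of x21 and x22.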

Related

awk or sed command for columns and rows selection from multiple files

Looking for a command for the following task:
I have three files, each with two columns, as seen below.
I would like to create file4 with four columns.
The output should resemble a merge-sorted version of file1, file2 and file3, such that the first column is sorted, the second column is the second column of file1, the third column is the second column of file2, and the fourth column is the second column of file3.
The entries in columns 2 to 4 should not be sorted, but should match the key in the first column of the original files.
I tried an intersection in Linux, but it does not give the desired output.
Any help will be appreciated. Thanks in advance!!
$ cat -- file1
A1 B5
A10 B2
A3 B15
A15 B6
A2 B10
A6 B19
$ cat -- file2
A10 C4
A4 C8
A6 C5
A3 C10
A12 C14
A15 C18
$ cat -- file3
A3 D1
A22 D9
A20 D3
A10 D5
A6 D10
A21 D11
$ cat -- file4
col1 col2 col3 col4
A1   B5
A2   B10
A3   B15  C10  D1
A4        C8
A6   B19  C5   D10
A10  B2   C4   D5
A12       C14
A15  B6   C18
A20            D3
A21            D11
A22            D9
Awk + Bash version (ARGIND is a GNU awk extension, so awk must be gawk here):
( echo "col1, col2, col3, col4" &&
awk 'ARGIND==1 { a[$1]=$2; allkeys[$1]=1 } ARGIND==2 { b[$1]=$2; allkeys[$1]=1 } ARGIND==3 { c[$1]=$2; allkeys[$1]=1 }
END{
for (k in allkeys) {
print k", "a[k]", "b[k]", "c[k]
}
}' file1 file2 file3 | sort -V -k1,1 ) | column -t -s ','
Pure Bash version:
declare -A a
while read key value; do a[$key]="${a[$key]:-}${a[$key]:+, }$value"; done < file1
while read key value; do a[$key]="${a[$key]:-, }${a[$key]:+, }$value"; done < file2
while read key value; do a[$key]="${a[$key]:-, , }${a[$key]:+, }$value"; done < file3
(echo "col1, col2, col3, col4" &&
for i in ${!a[@]}; do
echo $i, ${a[$i]}
done | sort -V -k1,1) | column -t -s ','
For an explanation of "${a[$key]:-, , }${a[$key]:+, }$value", see Shell Parameter Expansion in the Bash manual.
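As a rough illustration of those two expansions in isolation: the helper name append_cell below is hypothetical and only mirrors the expression used for file3.

```shell
# ${v:-word} expands to $v if v is set and non-empty, otherwise to word.
# ${v:+word} expands to word if v is set and non-empty, otherwise to nothing.

# Hypothetical helper: $1 is the row collected so far, $2 is the new value.
append_cell() {
  cell=$1
  value=$2
  # Empty row: pad with two empty columns (", , ") and add no extra separator.
  # Non-empty row: keep it as-is and add one ", " separator before the value.
  printf '%s\n' "${cell:-, , }${cell:+, }$value"
}

first=$(append_cell "" D9)            # key seen only in file3
second=$(append_cell "B15, C10" D1)   # key already seen in file1 and file2

echo "$first"
echo "$second"
```

The first call yields ", , D9" and the second yields "B15, C10, D1", which is how the value ends up in the fourth column in both cases.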
Using GNU Awk:
gawk '{ a[$1] = substr($1, 2); b[$1, ARGIND] = $2 }
END {
PROCINFO["sorted_in"] = "@val_num_asc"
for (i in a) {
t = i
for (j = 1; j <= ARGIND; ++j)
t = t OFS b[i, j]
print t
}
}' file{1..3} | column -t
There is a simple tool called join that allows you to perform this operation:
#!/usr/bin/env bash
cut -d ' ' -f1 file{1,2,3} | sort -k1,1 -u > ftmp
for f in file1 file2 file3; do
mv -- ftmp file4
join -a1 -e "---" -o auto file4 <(sort -k1,1 "$f") > ftmp
done
sort -k1,1V ftmp > file4
cat file4
This outputs
A1 B5 --- ---
A2 B10 --- ---
A3 B15 C10 D1
A4 --- C8 ---
A6 B19 C5 D10
A10 B2 C4 D5
A12 --- C14 ---
A15 B6 C18 ---
A20 --- --- D3
A21 --- --- D11
A22 --- --- D9
I used --- to indicate an empty field. If you want to pretty print this, you have to re-parse it with awk or anything else.
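One possible way to do that re-parsing is sketched below. The function name pretty and the column width of 6 are assumptions chosen for this data, not part of the answer above.

```shell
# Hypothetical helper: read the join output on stdin, blank out the "---"
# placeholders, and pad every cell to a fixed width so the columns line up.
pretty() {
  awk '{
    line = ""
    for (i = 1; i <= NF; i++)
      line = line sprintf("%-6s", ($i == "---") ? "" : $i)
    sub(/ +$/, "", line)   # drop the padding after the last filled cell
    print line
  }'
}

printf 'A3 B15 C10 D1\nA4 --- C8 ---\n' | pretty
```

With the script above, this would be used as `pretty < file4`.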
This might work for you (GNU sed and sort):
s=''; for f in file{1,2,3}; do s="$s\t"; sed -E "s/\s+/$s/" $f; done |
sort -V |
sed -Ee '1i\col1\tcol2\tcol3\tcol4' -e ':a;N;s/^((\S+\t).*\S).*\n\2\t+/\1\t/;ta;P;D'
Replace spaces by tabs and insert the number of tabs between the key and value depending on which file is being processed.
Sort the output by key column order.
Coalesce each line with its key and print the result.

Sort file by different format of date field

I am trying to sort a file by a date field. I realize this has been done before, however, I cannot find an example that has the following date format.
Canada Goose + 1x03 + For the Triumph of Evil + Sep/30/2013
Rucksack + 10x03 + Everybody's Crying Mercy + Oct/03/13
Test + 4x01 + Season 4, Episode 1 + Jun/01/14
New Family + 3x03 + Double Date + Oct/01/2013
I tried this command but it doesn't work
sort -t '+' -k 4.8,4.11 -k 4.4M -k 4.1,4.2 -b Test.txt
If you have GNU awk installed, you may want to try this approach.
sort.awk
#!/bin/gawk -f
function convertToSeconds(date, fields) {
split(date, fields, /\//)
fields[1]=months[tolower(fields[1])]
fields[2]=sprintf("%02d", fields[2])
fields[3]=(length(fields[3]) == 2) ? sprintf("2%03d", fields[3]) : fields[3]
return mktime(sprintf("%s %s %s 00 00 00", fields[3], fields[1], fields[2]))
}
BEGIN {
FS="( \\+ )"
months["jan"]="01"; months["feb"]="02"; months["mar"]="03"; months["apr"]="04"
months["may"]="05"; months["jun"]="06"; months["jul"]="07"; months["aug"]="08"
months["sep"]="09"; months["oct"]="10"; months["nov"]="11"; months["dec"]="12"
}
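{
# Key on the timestamp plus the line number, both zero-padded, so that lines sharing a date are all kept.
arr[sprintf("%012d %08d", convertToSeconds($4), NR)]=$0
}
END {
# asorti() sorts the keys as strings and returns how many there are.
n=asorti(arr, dst)
for(i=1; i<=n; ++i) {
print arr[dst[i]]
}
}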
Give it an execute permission, then run it:
$ chmod +x ./sort.awk
$ ./sort.awk Test.txt
To save the sorted output to a new file, redirect it with the > operator:
$ ./sort.awk Test.txt > SortedTest.txt
** UPDATE 1 **
Revised the sort key to explicitly prefix the 4-digit year, to circumvent year-end crossover issues.
Since the OP only wants to sort by the date field, an exact epoch mapping isn't needed at all:
mawk '$++NF = 366 * ( (_=($3) % 100) + 1900 + 100 * (_<50) ) \
+ int(_ * 10^8) + ($2) + (31) * \
(index(" JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC", toupper($2)) / 3 - 1)'
23284 SEP 30 2013 201300737036
23285 OCT 1 2013 201300737038
23287 OCT 3 2013 201300737040
23541 JUN 14 2014 201400737293
The 1st column is the original date generation order (the correct rank ordering), and the last column is the calculated sort index value. I tested every date from Jan 1st 1950 to Dec 31st 2025, and this simplistic approach ranks them correctly even though it doesn't bother to calculate exact Julian dates or exact leap years, since the objective is merely to find a ranking method that yields the same sort order as exact epoch seconds.
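The underlying idea (any number that preserves date order is good enough as a sort key) can be sketched with a simpler YYYYMMDD key in portable awk. This is not the author's formula. The function name sort_by_date is hypothetical, and two-digit years are assumed to mean 20xx.

```shell
# Hypothetical helper: prefix each line with a numeric YYYYMMDD key,
# sort on that key, then cut the key off again.
sort_by_date() {
  awk -F' [+] ' '
    BEGIN {
      n = split("jan feb mar apr may jun jul aug sep oct nov dec", names, " ")
      for (i = 1; i <= n; i++) month[names[i]] = i
    }
    {
      split($4, d, "/")                # d[1]=month name, d[2]=day, d[3]=year
      year = d[3] + 0
      if (year < 100) year += 2000     # assumption: two-digit years are 20xx
      key = year * 10000 + month[tolower(d[1])] * 100 + d[2]
      printf "%d\t%s\n", key, $0
    }' "$@" | sort -n -k1,1 | cut -f2-
}
```

Usage would be `sort_by_date Test.txt`; with no argument the function reads standard input.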
You're nearly there. Use sed, for example, to add the missing centuries;
then the M option on the 2nd KEYDEF works with GNU sort:
sed 's:/\([0-9][0-9]\)$:/20\1:' << 'HERE' |
f1 + f2 + f3 + NoV/30/15
f1 + f2 + f3 + Sep/30/2013
f1 + f2 + f3 + Oct/03/13
f1 + f2 + f3 + Jun/01/14
f1 + f2 + f3 + Oct/01/2013
f1 + f2 + f3 + mAr/11/11
f1 + f2 + f3 + oct/03/2013
f1 + f2 + f3 + juL/17/1998
HERE
LC_ALL=C sort -t '+' -k 4.9 -k 4.2M,4.4 -k 4.6,4.7
Output:
f1 + f2 + f3 + juL/17/1998
f1 + f2 + f3 + mAr/11/2011
f1 + f2 + f3 + Sep/30/2013
f1 + f2 + f3 + Oct/01/2013
f1 + f2 + f3 + Oct/03/2013
f1 + f2 + f3 + oct/03/2013
f1 + f2 + f3 + Jun/01/2014
f1 + f2 + f3 + NoV/30/2015

Using bash to get two variables in a for-loop from two different lists

I'm working with bash and I have two lists:
AZU SJI IOP
A1 B1 C1
Using the bash code below:
for f1 in AZU SJI IOP
do
for f2 in A1 B1 C1
do
echo $f1 + $f2
done
done
I get this result:
$ bash dir.sh
AZU + A1
AZU + B1
AZU + C1
SJI + A1
SJI + B1
SJI + C1
IOP + A1
IOP + B1
IOP + C1
I would like to get the result in this way
AZU A1
SJI B1
IOP C1
Define two arrays such that they have the same indices, then iterate over the indices of one array:
list1=(AZU SJI IOP)
list2=(A1 B1 C1)
for i in "${!list1[@]}"; do
echo "${list1[i]} ${list2[i]}"
done
You could of course use paste, but since your lists are not in files, you might be interested in a solution without external commands:
set -- A1 B1 C1
for f1 in AZU SJI IOP
do echo $f1 $1
shift
done

Merging multiple tables on key column and different types of columns

I'm trying to merge multiple TSV tables but I'm struggling to get the outputs I need.
Let's say we have file1:
K1 V1
K2 V2
K3 V3
K4 V4
file2:
K1 X1 Y1
K2 X2 Y2
K4 X4 Y4
file3: (UX is a column we don't want to include in the final merge)
K1 UX A1
K2 UX A2
K3 UX A3
K4 UX A4
Now let's say I want to merge file1, file2 and file3 on their keys, selecting only certain columns.
So suppose I want a certain output:
K1 V1 X1 Y1 A1
K2 V2 X2 Y2 A2
K4 V4 X4 Y4 A4
Currently I'm trying to use join -t$'\t' <(sort -t$'\t' -k1,1 file1) etc. ... but I'm facing difficulties because I'm trying to choose certain columns in various different tables.
Does anyone know a solution to this?
Thank you!
EDIT: So currently I have merged the tables like this:
join -t$'\t' <(sort -t$'\t' -k1,1 file1) \
<(sort -t$'\t' -k1,1 file2) \
<(sort -t$'\t' -k1,1 file3) > join1.txt
...but obviously this does not let me select the columns. I'm trying to use a awk loop to try and do it, but it seems more complicated than it should be.
I'm not sure your attempt with join would have worked since join accepts only two files at a time.
You can always tell join which columns to report. The following works with your data:
join -t$'\t' -o1.1,1.2,1.3,1.4,2.3 \
<(join -t$'\t' \
<(sort -t$'\t' -k1,1 file1) \
<(sort -t$'\t' -k1,1 file2) ) \
<(sort -t$'\t' -k1,1 file3)
Output:
K1 V1 X1 Y1 A1
K2 V2 X2 Y2 A2
K4 V4 X4 Y4 A4
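To show the -o column selection on its own, here is a small self-contained sketch. The temporary file names left and right are made up for the example, and the data is a two-row subset of file1 and file3.

```shell
# Build two small, already sorted TSV inputs in a scratch directory.
tmp=$(mktemp -d)
printf 'K1\tV1\nK2\tV2\n' > "$tmp/left"
printf 'K1\tUX\tA1\nK2\tUX\tA2\n' > "$tmp/right"

# A literal tab for -t, written portably (no $'\t' needed).
tab=$(printf '\t')

# -o takes FILENUM.FIELD pairs: key and value from the first file,
# then only the third field of the second file, which skips the UX column.
merged=$(join -t "$tab" -o 1.1,1.2,2.3 "$tmp/left" "$tmp/right")

rm -r "$tmp"
printf '%s\n' "$merged"
```

Each additional table needs one more join wrapped around the previous result, as in the nested command above.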

Add different value to each column in array

How can I add a different value to each column in a bash script?
Example: three functions f1(x), f2(x), f3(x) tabulated over x
test.dat:
# x f1 f2 f3
1 0.1 0.01 0.001
2 0.2 0.02 0.002
3 0.3 0.03 0.003
Now I want to add a different offset value to each function:
values = 1 2 3
Desired result:
# x f1 f2 f3
1 1.1 2.01 3.001
2 1.2 2.02 3.002
3 1.3 2.03 3.003
So the first column should be unaffected; every other column gets its offset added.
I tried this, but it doesn't work:
declare -a energy_array=( 1 2 3 )
for (( i =0 ; i < ${#energy_array[@]} ; i ++ ))
do
local energy=${energy_array[${i}]}
cat "test.dat" \
| awk -v "offset=${energy}" \
'{ for(j=2; j<NF;j++) printf "%s",$j+offset OFS; if (NF) printf "%s",$NF; printf ORS} '
done
You can try the following:
declare -a energy_array=( 1 2 3 )
awk -voffset="${energy_array[*]}" \
'BEGIN { n=split(offset,a) }
NR> 1{
for(j=2; j<=NF;j++)
$j=$j+a[j-1]
print;next
}1' test.dat
With output:
# x f1 f2 f3
1 1.1 2.01 3.001
2 1.2 2.02 3.002
3 1.3 2.03 3.003
