Concatenation of two columns from the same file - bash

From a text file named file:
a d
b e
c f
how can the tab-delimited columns be concatenated into one column:
a
b
c
d
e
f
Currently I use awk to output the columns to two files, which I then concatenate with cat. But surely there is a better one-line command?

For a generalized approach:
$ f() { awk '{print $'$1'}' file; }; f 1; f 2
a
b
c
d
e
f
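A variant of the same function (a sketch, not from the original answer) that passes the column number with -v instead of splicing it into the awk program, avoiding quoting surprises:

f() { awk -v c="$1" '{ print $c }' file; }   # c carries the column number; $c is that field
f 1; f 2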
If the file is tab-delimited, perhaps simply use cut (the inverse operation of paste):
$ cut -f1 file.t; cut -f2 file.t

This simple awk command should do the job:
awk '{print $1; s=s $2 ORS} END{printf "%s", s}' file
a
b
c
d
e
f
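A commented expansion of the same one-liner, spread over several lines for readability:

awk '
    { print $1 }             # print column 1 as each line is read
    { s = s $2 ORS }         # append column 2 (plus a newline) to a buffer
    END { printf "%s", s }   # after the last line, flush the buffered column 2
' file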

You can use process substitution; that eliminates the need to create a temporary file for each column.
$ cat file
a d
b e
c f
$ cat <(awk '{print $1}' file) <(awk '{print $2}' file)
a
b
c
d
e
f
$
Or, as per the comment, you can simply group multiple commands and redirect their combined output to a file like this:
$ cat file
a d
b e
c f
$ (awk '{print $1}' file; awk '{print $2}' file) > output
$ cat output
a
b
c
d
e
f
$

Try this: without reading the file twice and without calling any external commands, a single awk comes to the rescue. This assumes your Input_file is like the sample shown.
awk '{VAL1=VAL1?VAL1 ORS $1:$1;VAL2=VAL2?VAL2 ORS $2:$2} END{print VAL1 ORS VAL2}' Input_file
Explanation: the variable VAL1 accumulates each line's $1 onto its own value, and VAL2 likewise accumulates each line's $2. The END section of awk then prints VAL1 followed by VAL2.
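A commented version of the same command, with the ternaries spelled out (an equivalent sketch):

awk '
{
    VAL1 = (VAL1 == "" ? $1 : VAL1 ORS $1)   # append $1 to the first buffer
    VAL2 = (VAL2 == "" ? $2 : VAL2 ORS $2)   # append $2 to the second buffer
}
END { print VAL1 ORS VAL2 }                  # print both buffers, one after the other
' Input_file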

You can combine bash commands with ; to get a single stream:
$ awk '{print $1}' file; awk '{print $2}' file
a
b
c
d
e
f
Use command substitution if you want to capture that combined output as if it were a single file's contents:
$ txt=$(awk '{print $1}' file; awk '{print $2}' file)
$ echo "$txt"
a
b
c
d
e
f
Or feed it to a Bash while loop via process substitution:
$ while read -r line; do echo "line: $line"; done < <(awk '{print $1}' file; awk '{print $2}' file)
line: a
line: b
line: c
line: d
line: e
line: f

If you're using Notepad++, you could replace all tab characters with the newline sequence "\r\n" (note that this interleaves the two columns row by row rather than stacking column 1 above column 2).

Another approach (generalized below):
for i in $(seq 1 2); do
    awk '{print $'$i'}' file
done
output:
a
b
c
d
e
f
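To avoid hard-coding the number of columns, the loop bound can be taken from awk's NF; a sketch along the same lines, passing the column index with -v rather than splicing it into the program:

ncols=$(awk '{ print NF; exit }' file)    # number of columns in the first line
for ((i = 1; i <= ncols; i++)); do
    awk -v col="$i" '{ print $col }' file # print column i in full
done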

Related

How to print lines with the specified word in the path?

Let's say I have file abc.txt which contains the following lines:
a b c /some/path/123/path/120
a c b /some/path/312/path/098
a p t /some/path/123/path/321
a b c /some/path/098/path/123
and numbers.txt:
123
321
123
098
I want to print only those whole lines that contain "123" in the third path component, i.e. under "/some/path/123/path". I don't want to print the line "a c b /some/path/312/path/098" or "a b c /some/path/098/path/123". I want to save all lines with "123" in the third place to a new file.
I tried several methods, and the best way seems to be to use awk. Here is my example code, which is not working correctly:
for i in `cat numbers.txt | xargs`
do
    cat abc.txt | awk -v i=$i '$4 ~ /i/ {print $0}' > ${i}_number.txt;
done
because it also catches, for example, "a b c /some/path/098/path/123".
Example:
For number "123" I want to save only one line from abc.txt in 123_number.txt:
a b c /some/path/123/path/120
For number "312" I want to save only one line from abc.txt in 312_number.txt:
a c b /some/path/312/path/098
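For what it's worth, the loop in the question fails because /i/ is a literal regular expression matching the letter i, not the variable passed with -v; a variable must be referenced without slashes, and -F'/' is needed for $4 to be the path component you want to compare. A minimal corrected loop (a sketch; it still runs one awk pass per number, so the single-call answer below is preferable):

while read -r n; do
    # with -F'/', $4 is the third path component; == avoids substring matches
    awk -F'/' -v n="$n" '$4 == n' abc.txt > "${n}_number.txt"
done < numbers.txt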
This can be accomplished in a single awk call:
$ awk -F'/' 'NR==FNR{a[$0];next} ($4 in a){f=$4"_number.txt";print >>f;close(f)}' numbers.txt abc.txt
$ cat 098_number.txt
a b c /some/path/098/path/123
$ cat 123_number.txt
a b c /some/path/123/path/120
a p t /some/path/123/path/321
Keep the numbers in an array and use it to match lines, appending each matching line to its corresponding file.
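A commented expansion of the same one-liner:

awk -F'/' '
NR == FNR { a[$0]; next }    # first file (numbers.txt): store each number as a key
$4 in a {                    # later file: the 4th /-separated field is a known number
    f = $4 "_number.txt"
    print >> f               # append the whole line to that number's file
    close(f)                 # close so we do not exceed the open-file limit
}
' numbers.txt abc.txt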
If your files are huge, you may speed up the process using sort:
sort -t'/' -k4 abc.txt | awk -F'/' 'NR==FNR{a[$0];next} ($4 in a){if($4!=p){close(f);f=(p=$4)"_number.txt"};print >>f}' numbers.txt -

Splitting csv file into multiple files with 2 columns in each file

I am trying to split a file (testfile.csv) that contains the following:
1,2,4,5,6,7,8,9
a,b,c,d,e,f,g,h
q,w,e,r,t,y,u,i
a,s,d,f,g,h,j,k
z,x,c,v,b,n,m,z
into a file
1,2
a,b
q,w
a,s
z,x
and another file
4,5
c,d
e,r
d,f
c,v
but I cannot seem to do that in awk using an iterative solution.
awk -F, '{print $1, $2}' testfile.csv
awk -F, '{print $3, $4}' testfile.csv
do it for me, but I would like a looping solution.
I tried
awk -F, '{ for (i=1;i< NF;i+=2) print $i, $(i+1) }' testfile.csv
but it gives me a single stream: each pair is printed to standard output on its own line, so the pairs from each row come out one after another instead of landing in separate files.
You can use cut:
$ cut -d, -f1,2 file > file_1
$ cut -d, -f3,4 file > file_2
If you are going to use awk, be sure to set the OFS so that the columns remain comma-separated:
$ awk 'BEGIN{FS=OFS=","}
{print $1,$2 >"f1"; print $3,$4 > "f2"}' file
$ cat f1
1,2
a,b
q,w
a,s
z,x
$ cat f2
4,5
c,d
e,r
d,f
c,v
Is there a quick and dirty way of naming the resulting files after the values in the first row (so the first file would be 1.csv and the second 4.csv)?
awk 'BEGIN{FS=OFS=","}
FNR==1 {n1=$1 ".csv"; n2=$3 ".csv"}
{print $1,$2 >n1; print $3,$4 > n2}' file
awk -F, '{ for (i = 1; i < NF; i += 2) print $i, $(i+1) > (i ".csv") }' testfile.csv
works for me. Earlier I was trying to collect the output in the shell, which is why it was all jumbled up.
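A commented variant of that loop (a sketch) with OFS set so the output files stay comma-separated; parenthesizing the redirection target avoids grammar ambiguity across awk implementations, and files are named after the pair's starting column (1.csv, 3.csv, ...):

awk -F, -v OFS=, '{
    for (i = 1; i < NF; i += 2)          # walk the columns two at a time
        print $i, $(i+1) > (i ".csv")    # the pair starting at column i goes to i.csv
}' testfile.csv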
It's do-able in bash, but it will be much slower than awk:
f=testfile.csv
IFS=, read -ra first < <(head -1 "$f")
for ((i = 0; i < (${#first[@]} + 1) / 2; i++)); do
    slice_file="${f%.csv}$((i+1)).csv"
    cut -d, -f"$((2 * i + 1))-$((2 * (i + 1)))" "$f" > "$slice_file"
done
with sed:
sed -r '
h
s/(.,.),.*/\1/w file1.txt
g
s/.,.,(.,.),.*/\1/w file2.txt' file.txt
(h saves each line to the hold space, g restores it before the second substitution, and each w flag writes the substituted line to its file. The single-character . patterns assume one-character fields, as in the sample.)

How do I read a file into a matrix in bash?

I have a text file like this
A;green;3
B;blue;2
A;red;4
C;red;2
C;blue;3
B;green;3
I have to write a script that if started with parameter "B" gives me the color of the row with the biggest number (from the rows starting with B). In this case it would be the last line, so the output would be "green".
How do I split the elements on ";" and newlines and store them in a matrix so I can work with them? Do I even need to do that, or is there an easier solution?
Thanks in advance!
awk + sort solution:
awk -v param="B" -F';' '$1==param{ print $2; exit }' <(sort -t';' -k1,1 -k3nr file.txt)
The output:
green
Or, in addition to @William Pursell's answer, to extract only the color value:
awk -F';' '/^B/ && $3>m{ m=$3; c=$2 }END{ print c }' file.txt
green
Via bash script:
get_max_color.sh script:
#!/bin/bash
awk -F';' -v p="$1" '$0~"^"p && $3>m{ m=$3; c=$2 }END{ print c }' "$2"
Usage:
bash get_max_color.sh B "file.txt"
green
You just need to filter out the appropriate lines and keep the one with the max value seen so far (note that m must be updated as you go). The obvious solution is:
awk '/^B/ && $3 > m { m = $3; s = $0 } END { print s }' FS=\; input
To use a parameter, do
awk "/^$1/"' && $3 > m{s=$0} END { print s}' FS=\; input
A non-awk solution, possibly less elegant and slower than the solutions already proposed (note that uniq -w is a GNU extension):
sort -t\; -k1,1 -k3,3nr file | uniq -w1 | grep "^B" | cut -f2 -d\;
awk to the rescue!
I may not have fully understood what you want to achieve, but
awk -v key="$c" -F\; 'm[$1]<$3{m[$1]=$3; c[$1]=$2} END{print c[key]}' file
will pick the color with the highest number from the file for the given key.
An (inefficient) usage pattern:
$ for c in A B C;
do
echo $c "->" $(awk -v key="$c" -F\; 'm[$1]<$3 {m[$1]=$3; c[$1]=$2}
END {print c[key]}' file);
done;
A -> red
B -> green
C -> blue
You can probably implement the rest of the script in awk and do this whole process in a single pass.
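A sketch of that single-pass idea, computing the best color for every key in one read of the file:

awk -F';' '
$3 > m[$1] { m[$1] = $3; c[$1] = $2 }      # track the max number and its color per key
END { for (k in c) print k, "->", c[k] }   # key order here is unspecified
' file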
Or perhaps you want an associative array, which can be built as below:
$ declare -A colors;
while IFS=\; read k c _ ;
do
colors[$k]=$c;
done < <(sort -t\; -k1,1 -k3nr file | uniq -w1)
$ echo ${colors[A]}
red

awk: find and replace values in file1 with values from file2 when matched

I tried a couple of answers from similar questions but am not quite getting correct results. I am trying to look each value up in the second file and replace it with the mapped value if it is there, otherwise keep the original...
File1.txt
a
2
c
4
e
f
File2.txt
2 b
4 d
Wanted Output.txt
a
b
c
d
e
f
So far what I have seems to sort of work, but wherever the replacement should happen I am getting a blank row instead of the new value...
Current Output.txt
a
c
e
f
Current code:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} {print (($1 in a) ? a[$1] : $1)}' file2.txt file1.txt > output.txt
I also tried the following and got the same results:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} {$1 = a[$1]}1' file2.txt file1.txt > output.txt
Sorry, I first wrote it incorrectly; I have fixed the key/value issue.
I did try what you suggested and am still getting blank rows in output.txt:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} $1 in a{$1 = a[$1]}1' file2.txt file1.txt > output.txt
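One guess (an assumption, since the raw separators in the files aren't shown): if -F'\t' doesn't match the actual separator in File2.txt, the keys or values stored in the array end up empty or wrong, which can produce blank rows. Letting awk use its default whitespace splitting sidesteps the issue:

# assumption: both files are whitespace-separated, so awk's default FS is used
awk 'NR==FNR { a[$1] = $2; next } ($1 in a) { $1 = a[$1] } 1' File2.txt File1.txt > output.txt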
Your key/value pair is not right: $1 is the key, $2 is the value.
$ awk -F'\t' 'NR==FNR{a[$1]=$2;next} $1 in a{$1=a[$1]}1' file.2 file.1
a
b
c
d
e
f
Try the solution below:
awk 'NR==FNR{a[$1]=$NF;next} {print (a[$NF]?a[$NF]:$1)}' file2.txt file1.txt
a
b
c
d
e
f

Subtracting row values from two different text files

I have two text files, and each file has one column with several rows:
FILE1
a
b
c
FILE2
d
e
f
I want to create a file that has the following output:
a - d
b - e
c - f
All the entries are meant to be numbers (decimals). I am completely stuck and do not know how to proceed.
Using paste seems like the obvious choice, but unfortunately you can't specify a multi-character delimiter. To get around this, you can pipe the output to sed:
$ paste -d- file1 file2 | sed 's/-/ - /'
a - d
b - e
c - f
Paste joins the two files together and sed adds the spaces around the - (note that only the first - on each line is substituted, so this can misfire if a value itself contains a hyphen, e.g. a negative number).
If your desired output is the result of the subtraction, then you could use awk:
paste file1 file2 | awk '{ print $1 - $2 }'
Given:
$ cat /tmp/a.txt
1
2
3
$ cat /tmp/b.txt
4
5
6
awk is a good bet to process the two files and do arithmetic:
$ awk 'FNR==NR { a[FNR""] = $0; next } { print a[FNR""]+$1 }' /tmp/a.txt /tmp/b.txt
5
7
9
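The FNR==NR two-file idiom, commented (a sketch; note the sample adds the values, so swap + for - to subtract as the question asks):

awk '
FNR == NR { a[FNR] = $0; next }   # first file: store each line, keyed by its line number
          { print a[FNR] + $1 }   # second file: combine with the stored line at that number
' /tmp/a.txt /tmp/b.txt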
Or, if you want the strings rather than arithmetic:
$ awk 'FNR==NR { a[FNR""] = $0; next } { print a[FNR""] " - "$0 }' /tmp/a.txt /tmp/b.txt
1 - 4
2 - 5
3 - 6
Another solution, using while and file descriptors:
while read -r line1 <&3 && read -r line2 <&4
do
    #printf '%s - %s\n' "$line1" "$line2"
    printf '%s\n' "$((line1 - line2))"
done 3<f1.txt 4<f2.txt
