How to multiply AWK output - bash

I have a file data.csv with multiple lines that reads:
A
B
C
and I want the output of the code to be multiplied n times:
A
B
C
A
B
C
Here is an example of a line I've been trying and what it returns:
awk '{for (i=0; i<3 ;i++){ print $1}}' input.csv
A
A
A
B
B
B
C
C
C
I get the same result with cat and other tools.

$ awk -v n=3 'BEGIN{ for (i=1;i<n;i++) {ARGV[ARGC]=ARGV[1]; ARGC++} } 1' file
A
B
C
A
B
C
A
B
C
Note that the above only stores the name of the file n times, not the contents of the file and so it'd work for any file of any size as it uses negligible memory.

This would do:
for i in {1..3}; do cat data.csv; done
It won't work with pipes, though.

You can use cat and printf:
cat $(printf "%.0sfile " {1..3})
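A quick sanity check of the printf trick, using literal arguments in place of the bash-specific {1..3} expansion: %.0s prints each argument with zero precision (i.e. nothing), so only the literal text survives, once per argument.

```shell
# %.0s consumes each argument but prints it with zero width,
# leaving only the literal "file " repeated once per argument.
printf "%.0sfile " 1 2 3
# -> file file file
```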

Here is a single efficient one-liner: yes data | head -3 | xargs cat
$ cat data
A
B
C
$ yes data | head -3 | xargs cat
A
B
C
A
B
C
A
B
C
$
head -3 => here 3 is n, the number of times to repeat the file.
Or using an awk solution:
$ cat data
A
B
C
$ awk '{a[NR]=$0} END {for(i=0;i<3;i++) for(j=1;j<=NR;j++) print a[j]}' data
A
B
C
A
B
C
A
B
C
$

Try this:
seq 2 | xargs -Inone cat input.csv
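A sketch of how this behaves, using an illustrative /tmp path for input.csv; with -I, xargs runs one cat per line seq emits, even though the placeholder "none" never appears in the command.

```shell
# Illustrative input file; seq 2 makes xargs invoke cat twice.
printf 'A\nB\nC\n' > /tmp/mult_input.csv
seq 2 | xargs -Inone cat /tmp/mult_input.csv
# -> A B C A B C (one letter per line, file printed twice)
```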

Probably the shortest:
cat input.csv{,,}
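The {,,} is bash brace expansion: the word is duplicated once per (empty) alternative before cat ever runs. A quick illustration:

```shell
# Brace expansion happens in the shell, before cat sees any arguments.
echo input.csv{,,}
# under bash -> input.csv input.csv input.csv
```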

Supposing you're writing a shell-script, why use awk?
for i in `seq 3`; do
cat data.csv
done
If you want to do this using pipes, e.g. with awk, you'll need to store the file data in memory or save it temporarily to disk. For example:
cat data.csv | \
awk '{a = a $0 "\n"} END { for (i=0; i<3 ;i++){ printf "%s",a; }}'
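Because the script buffers the whole stream in a, it works even when the input arrives through a pipe rather than a seekable file; a minimal sketch:

```shell
# The entire stream is accumulated in a, then emitted three times in END,
# so a pipe works just as well as a regular file.
printf 'A\nB\nC\n' | awk '{a = a $0 "\n"} END {for (i=0; i<3; i++) printf "%s", a}'
```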

for (( c=1; c<=3; c++ ))
do
cat Input_file.csv
done

With sed and the hold/pattern space:
In this situation, with the three single-letter lines A, B and C:
If you want to print once:
cat temp | sed 'N;N;h'
output:
[anthony@aguevara ~]$ cat temp | sed 'N;N;h'
A
B
C
[anthony@aguevara ~]$
Twice (just append a semicolon and a capital G to the end):
cat temp | sed 'N;N;h;G'
output:
[anthony@aguevara ~]$ cat temp | sed 'N;N;h;G'
A
B
C
A
B
C
[anthony@aguevara ~]$
Three times (another G):
cat temp | sed 'N;N;h;G;G'
output:
[anthony@aguevara ~]$ cat temp | sed 'N;N;h;G;G'
A
B
C
A
B
C
A
B
C
[anthony@aguevara ~]$
and so on. The two N commands pull all three lines into the pattern space, h saves a copy to the hold space (lowercase h copies rather than appends, which avoids a stray blank line), and each G appends that copy back to the pattern space.
The input file:
[anthony@aguevara ~]$ cat temp
A
B
C
[anthony@aguevara ~]$

Related

How to print lines with the specified word in the path?

Let's say I have file abc.txt which contains the following lines:
a b c /some/path/123/path/120
a c b /some/path/312/path/098
a p t /some/path/123/path/321
a b c /some/path/098/path/123
and numbers.txt:
123
321
123
098
I want to print the whole line only when it contains "123" in the third path component, i.e. under "/some/path/123/path".
I don't want to print the line "a c b /some/path/312/path/098" or
"a b c /some/path/098/path/123". I want to save all lines with "123" in that third position to a new file.
I tried several methods and the best way seems to be use awk. Here is my example code which is not working correctly:
for i in `cat numbers.txt | xargs`
do
cat abc.txt | awk -v i=$i '$4 ~ /i/ {print $0}' > ${i}_number.txt;
done
because it's catching also for example "a b c /some/path/098/path/123/".
Example:
For number "123" I want to save only one line from abc.txt in 123_number.txt:
a b c /some/path/123/path/120
For number "312" I want to save only one line from abc.txt in 312_number.txt:
a c b /some/path/312/path/098
This can be accomplished in a single awk call:
$ awk -F'/' 'NR==FNR{a[$0];next} ($4 in a){f=$4"_number.txt";print >>f;close(f)}' numbers.txt abc.txt
$ cat 098_number.txt
a b c /some/path/098/path/123
$ cat 123_number.txt
a b c /some/path/123/path/120
a p t /some/path/123/path/321
Keep the numbers in an array and use it to match lines, appending each matching line to the corresponding file.
If your files are huge, you may speed up the process using sort:
sort -t'/' -k4 abc.txt | awk -F'/' 'NR==FNR{a[$0];next} ($4 in a){if($4!=p){close(f);f=(p=$4)"_number.txt"};print >>f}' numbers.txt -

Unix awk command to return all matching lines

I have a file which looks like the below -
A
B
C
D
E
-----
A
B
C
D
C
---
X
Y
A
B
XEC
---
When the fifth row of each block is/contains E, I want the previous 4 lines to be returned. I wrote the command below, but it is buggy:
awk '{a[NR]=$0} $0~s {f=NR} END {print a[f-4]; print a[f-6]; print a[f-8];}' s="E" file.txt
But it is returning only the last match. I want all the matched lines to be returned.
For the above entries, the output needs to be
A
B
C
D
---
X
Y
A
B
Is there any other way to achieve this?
Using gawk (a multi-character RS is only supported in GNU awk):
awk -v RS='\n\n[-]+\n\n*' -v FS="\n" '$5 ~ /E/{printf "%s\n%s\n%s\n%s\n---\n",$1,$2,$3,$4}' inputfile
A
B
C
D
---
X
Y
A
B
---
I'm not sure exactly what output you want; do you really need the --- separator followed by a newline?
Using tac and awk you can try the one below.
Print the N records after some regexp:
awk -v n=4 'c&&c--;/regexp/{c=n}' <input_file>
Print the N records before some regexp:
tac <input_file> | awk -v n=4 'c&&c--;/regexp/{c=n}' | tac
Here the first tac reverses the file, n is the number of lines to print when the regexp is found, and the final tac reverses the output back into the original order.
Input
$ cat infile
A
B
C
D
E
-----
A
B
C
D
C
---
X
Y
A
B
XEC
---
When n=4
$ tac infile | awk -v n=4 'c&&c--;/E/{c=n}' | tac
A
B
C
D
X
Y
A
B
When n=2
$ tac infile | awk -v n=2 'c&&c--;/E/{c=n}' | tac
C
D
A
B

Concatenation of two columns from the same file

From a text file
file
a d
b e
c f
how are the tab delimited columns concatenated into one column
a
b
c
d
e
f
Now I use awk to output the columns to two files that I then concatenate using cat. But there must be a better one-line command?
For a generalized approach:
$ f() { awk '{print $'$1'}' file; }; f 1; f 2
a
b
c
d
e
f
If the file is tab-delimited, perhaps simply use cut (the inverse operation of paste):
$ cut -f1 file.t; cut -f2 file.t
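A quick check of the cut approach, using a hypothetical tab-separated file.t written under /tmp:

```shell
# Hypothetical tab-separated input; cut prints each column in turn.
printf 'a\td\nb\te\nc\tf\n' > /tmp/file.t
cut -f1 /tmp/file.t; cut -f2 /tmp/file.t
# -> a b c d e f (one per line)
```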
This simple awk command should do the job:
awk '{print $1; s=s $2 ORS} END{printf "%s", s}' file
a
b
c
d
e
f
You can use process substitution; that would eliminate the need to create a file for each column.
$ cat file
a d
b e
c f
$ cat <(awk '{print $1}' file) <(awk '{print $2}' file)
a
b
c
d
e
f
$
OR
As per the comment, you can just combine multiple commands and redirect their output to a different file like this:
$ cat file
a d
b e
c f
$ (awk '{print $1}' file; awk '{print $2}' file) > output
$ cat output
a
b
c
d
e
f
$
Try this: without reading the file twice and without calling any external commands, a single awk to the rescue. This assumes your Input_file looks like the shown sample.
awk '{VAL1=VAL1?VAL1 ORS $1:$1;VAL2=VAL2?VAL2 ORS $2:$2} END{print VAL1 ORS VAL2}' Input_file
Explanation: we create a variable named VAL1 that accumulates each $1 value, and a variable VAL2 that accumulates each $2 value. The END section of awk then prints VAL1 followed by VAL2.
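A sample run of the above, with an illustrative two-column file written under /tmp:

```shell
# Illustrative two-column input.
printf 'a d\nb e\nc f\n' > /tmp/cols.txt
awk '{VAL1=VAL1?VAL1 ORS $1:$1;VAL2=VAL2?VAL2 ORS $2:$2} END{print VAL1 ORS VAL2}' /tmp/cols.txt
# -> a b c d e f (one per line)
```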
You can combine bash commands with ; to get a single stream:
$ awk '{print $1}' file; awk '{print $2}' file
a
b
c
d
e
f
Use command substitution if you want to capture that combined stream as if it were a single file:
$ txt=$(awk '{print $1}' file; awk '{print $2}' file)
$ echo "$txt"
a
b
c
d
e
f
Or for a Bash while loop:
$ while read -r line; do echo "line: $line"; done < <(awk '{print $1}' file; awk '{print $2}' file)
line: a
line: b
line: c
line: d
line: e
line: f
If you're using Notepad++ you could replace all tab characters with the newline sequence "\r\n".
Another approach:
for i in $(seq 1 2); do
awk '{print $'$i'}' file
done
output:
a
b
c
d
e
f

Subtracting row values from two different text files

I have two text files, and each file has one column with several rows:
FILE1
a
b
c
FILE2
d
e
f
I want to create a file that has the following output:
a - d
b - e
c - f
All the entries are meant to be numbers (decimals). I am completely stuck and do not know how to proceed.
Using paste seems like the obvious choice but unfortunately you can't specify a multi-character delimiter. To get around this, you can pipe the output to sed:
$ paste -d- file1 file2 | sed 's/-/ - /'
a - d
b - e
c - f
Paste joins the two files together and sed adds the spaces around the -.
If your desired output is the result of the subtraction, then you could use awk:
paste file1 file2 | awk '{ print $1 - $2 }'
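A sketch with made-up numeric files (the /tmp paths are illustrative); paste joins the rows and awk does the row-wise subtraction:

```shell
# Made-up numeric columns; awk subtracts row by row.
printf '5\n7\n9\n' > /tmp/f1
printf '1\n2\n3\n' > /tmp/f2
paste /tmp/f1 /tmp/f2 | awk '{ print $1 - $2 }'
# -> 4 5 6 (one per line)
```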
Given:
$ cat /tmp/a.txt
1
2
3
$ cat /tmp/b.txt
4
5
6
awk is a good bet to process the two files and do arithmetic:
$ awk 'FNR==NR { a[FNR""] = $0; next } { print a[FNR""]+$1 }' /tmp/a.txt /tmp/b.txt
5
7
9
Or, if you want the strings rather than arithmetic:
$ awk 'FNR==NR { a[FNR""] = $0; next } { print a[FNR""] " - "$0 }' /tmp/a.txt /tmp/b.txt
1 - 4
2 - 5
3 - 6
Another solution using while and file descriptors:
while read -r line1 <&3 && read -r line2 <&4
do
#printf '%s - %s\n' "$line1" "$line2"
printf '%s\n' $(($line1 - $line2))
done 3<f1.txt 4<f2.txt

Merge columns cut & cat

I have file.txt 3 columns.
1 A B
2 C D
3 E F
I want to add #1&#3 as the end of #2. Result should look like this:
1A
2C
3E
1B
2D
3F
I am doing this by
cut -f 1,2 > tmp1
cut -f 1,3 > tmp2
cat *tmp * > final_file
But I am getting repeated lines! If I check the final output with:
cat * | sort | uniq -d
there are plenty of repeated lines and there are none in the primary file.
Can anyone suggest other way of doing this? I believe the one I am trying to use is too complex and that's why I am getting such a weird output.
pzanoni#vicky:/tmp$ cat file.txt
1 A B
2 C D
3 E F
pzanoni#vicky:/tmp$ cut -d' ' -f1,2 file.txt > result
pzanoni#vicky:/tmp$ cut -d' ' -f1,3 file.txt >> result
pzanoni#vicky:/tmp$ cat result
1 A
2 C
3 E
1 B
2 D
3 F
I'm using bash
Preserves the order with one pass through the file:
awk '
{print $1 $2; pass2 = pass2 sep $1 $3; sep = "\n"}
END {print pass2}
' file.txt
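A sample run of the above, with the question's data written to an illustrative /tmp path:

```shell
# The question's sample data; column 2 prints immediately, column 3 is
# buffered in pass2 and printed at the end.
printf '1 A B\n2 C D\n3 E F\n' > /tmp/merge.txt
awk '{print $1 $2; pass2 = pass2 sep $1 $3; sep = "\n"} END {print pass2}' /tmp/merge.txt
# -> 1A 2C 3E 1B 2D 3F (one per line)
```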
The reason this (cat tmp* * > final_file) is wrong:
I assume *tmp was a typo
I assume at this point the directory only contains "tmp1" and "tmp2"
Look at how those wildcards will be expanded:
tmp* expands to "tmp1" and "tmp2"
* also expands to "tmp1" and "tmp2"
So your command line becomes cat tmp1 tmp2 tmp1 tmp2 > final_file and hence you get all the duplicated lines.
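The duplication can be reproduced in a scratch directory (created with mktemp here, purely for illustration):

```shell
# In a directory holding only tmp1 and tmp2, both globs match both files,
# so the expanded command line names each file twice.
dir=$(mktemp -d)
cd "$dir"
echo x > tmp1
echo y > tmp2
echo tmp* *
# -> tmp1 tmp2 tmp1 tmp2
```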
awk '{print $1 $2; s = s $1 $3 "\n"} END {printf "%s", s}' file.txt
