Loop comma separated values in awk function - shell

I want to extract the values from the value2 variable, but this awk script is not printing them.
value2=aaaa,bbbb,cccc,dddd
awk -F '|' -v value1=".$1" -v value2="$2" '
{
print "value1: " value1
print "value2: " value2
for ( i in value2//./ )
{
print "looping: " i
}
}
Input value: value2=aaaa,bbbb,cccc,dddd
Expected output:
aaaa
bbbb
cccc
dddd
How would I print all values using awk?

echo 'aaaa,bbbb,cccc,dddd' |
mawk NF=NF FS=',' OFS='\n'
aaaa
bbbb
cccc
dddd
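If the list arrives in an awk variable (as in the question) rather than on stdin, split() can turn it into an array that a counting loop then walks. The snippet below is only a minimal sketch: print_values is a hypothetical helper name, and it assumes the separator is always a plain comma. Note that `for ( i in value2//./ )` is not awk syntax, and that `for (i in array)` does not guarantee order, which is why the loop counts from 1 to n.

```shell
# print_values is a hypothetical helper name, not part of the original question.
print_values() {
    # $1: comma-separated list, e.g. aaaa,bbbb,cccc,dddd
    awk -v list="$1" 'BEGIN {
        # split() fills parts[1..n] and returns the number of pieces
        n = split(list, parts, ",")
        for (i = 1; i <= n; i++)
            print parts[i]
    }'
}

print_values 'aaaa,bbbb,cccc,dddd'
```

Inside the original script the same split() call could go in the main block, using value2 in place of list.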

Related

How to use awk to print a line before match and until next blank space after the match

For example, I have a variable $var
awk '/'$var'/ { }' file.txt
Once awk matches the variable in the text file, I want to print from one line before the match until the next blank line.
Edit:
My File
AAAA
BBBB
SSSS
CCCC
DDDD
LLLL
PPPP
ASAD
BEKK

SSEE
AASS
If $var = SSSS, my output should look like:
BBBB
SSSS
CCCC
DDDD
LLLL
PPPP
ASAD
BEKK
Sorry, I am new here, so my explanation may not be very clear.
With your shown samples, please try the following GNU grep solution (written and tested with GNU grep):
grep -ozP '(?:[^\n]+\n)?AAAA(?:\n[^\n]+)*' Input_file
A few scenarios, checking the above code against the shown samples with 3 different input strings.
1st scenario: Checking with input string SSSS:
grep -ozP '(?:[^\n]+\n)?SSSS(?:\n[^\n]+)*' Input_file
BBBB
SSSS
CCCC
DDDD
LLLL
PPPP
ASAD
BEKK
2nd scenario: Checking with string AAAA in code:
grep -ozP '(?:[^\n]+\n)?AAAA(?:\n[^\n]+)*' Input_file
AAAA
BBBB
SSSS
CCCC
DDDD
LLLL
PPPP
ASAD
BEKK
3rd scenario: Checking with input string BEKK:
grep -ozP '(?:[^\n]+\n)?BEKK(?:\n[^\n]+)*' Input_file
ASAD
BEKK
awk -v tgt="$var" '!f && ($0==tgt){print prev; f=1} f{if (NF) print; else exit} {prev=$0}' file
The above assumes you just want the first such range printed. If that's wrong then change to:
awk -v tgt="$var" '!f && ($0==tgt){print prev; f=1} f{if (NF) print; else f=0} {prev=$0}' file
Both scripts assume you want to do a full-line string match.
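For readability, the first one-liner can be spread over several lines with comments. This is only a sketch: print_block is a hypothetical helper name, the sample data assumes a blank line follows BEKK, and the `prev != ""` guard is an addition (not in the answer above) so that no empty line is printed when the match is on the first line or directly after a blank line.

```shell
# print_block is a hypothetical helper; $1 is the target line, $2 the file.
print_block() {
    awk -v tgt="$1" '
        # First full-line match: print the previous line, if it is non-empty.
        !f && $0 == tgt { if (prev != "") print prev; f = 1 }
        # While inside the block, print until the first blank line.
        f { if (NF) print; else exit }
        # Remember the current line for the next record.
        { prev = $0 }
    ' "$2"
}

# Demo on the sample data (assumed to contain a blank line after BEKK).
tmp=$(mktemp)
printf '%s\n' AAAA BBBB SSSS CCCC DDDD LLLL PPPP ASAD BEKK '' SSEE AASS > "$tmp"
print_block SSSS "$tmp"
rm -f "$tmp"
```

As in the one-liners, the comparison is a full-line string match, so regex metacharacters in the target need no escaping.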
You may use this awk solution using the match function and an empty RS (paragraph mode):
awk -v var='SSSS' -v RS= 'match($0, "(^|[^\n]+\n[^\n]*)" var) {print substr($0, RSTART)}' file
BBBB
SSSS
CCCC
DDDD
LLLL
PPPP
ASAD
BEKK
# more testing
awk -v var='BEKK' -v RS= 'match($0, "(^|[^\n]+\n[^\n]*)" var) {print substr($0, RSTART)}' file
ASAD
BEKK
awk -v var='AAAA' -v RS= 'match($0, "(^|[^\n]+\n[^\n]*)" var) {print substr($0, RSTART)}' file
AAAA
BBBB
SSSS
CCCC
DDDD
LLLL
PPPP
ASAD
BEKK

How to crop text after pattern in bash

If the text is
aaaa
bbbb
cccc
====
dddd
I want dddd as the result
If the text is
aaaa
====
bbbb
cccc
dddd
I want
bbbb
cccc
dddd
as the result.
I'm trying something like awk '{print $1}' | sed '/.*\n=*$/d' but it seems like sed can only delete a line.
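You can try something like
# line number of the first line made only of "=" (at least one)
n=$(grep -n '^==*$' "$1" | awk -F: '{print $1; exit}')
n=$((n + 1))
# print from the line after the separator to the end
tail -n +"$n" "$1"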
You can indicate a range of lines, e.g. from line 1 to the line containing the pattern:
sed '1,/====/d'
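The same idea can be written in awk with a flag that is switched on by the separator line. This is a minimal sketch, assuming the separator is a line made only of "=" characters; after_separator is a hypothetical helper name that reads the text on stdin. Unlike the `1,/====/d` range, which in sed only starts looking for the end pattern on line 2, it also handles a separator on the very first line.

```shell
# after_separator is a hypothetical helper; it filters stdin.
after_separator() {
    awk '
        found                  # once the flag is set, print every line
        /^=+$/ { found = 1 }   # a line made only of "=" sets the flag
    '
}

printf 'aaaa\n====\nbbbb\ncccc\ndddd\n' | after_separator
```

Because the print rule comes before the rule that sets the flag, the separator line itself is not printed.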

awk inline command and full script has different output

I want to count the number of leading spaces on each line. My sample text file is the following:
aaaa bbbb cccc dddd
  aaaa bbbb cccc dddd
    aaaa bbbb cccc dddd
aaaa bbbb cccc dddd
When I wrote a simple script to do the counting, I noticed a difference in output between the inline awk command and the full awk script.
First try
#!/bin/bash
while IFS= read -r line; do
echo "$line" | awk '
{
FS="[^ ]"
print length($1)
}
'
done < "tmp"
The output is
4
4
4
4
Second try
#!/bin/bash
while IFS= read -r line; do
echo "$line" | awk -F "[^ ]" '{print length($1)}'
done < "tmp"
The output is
0
2
4
0
I want to write a full script that gives the same output as the inline version.
Could anyone explain this difference to me? Thank you very much.
Fixed your first try:
$ while IFS= read -r line; do
echo "$line" | awk '
BEGIN { # you forgot the BEGIN
FS="[^ ]" # gotta set FS before record is read
}
{
print length($1)
}'
done < file
Output now:
0
2
4
0
And to speed it up, just use awk for it:
$ awk '
BEGIN {
FS="[^ ]"
}
{
print length($1)
}' file
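To see why the position of the FS assignment matters: awk splits a record with the FS that was in effect when the record was read, so an assignment in the main block only applies from the next record on. In the question's loop every awk invocation reads a single line, so the new FS never takes effect. A small sketch of the effect, assuming a POSIX-conforming awk such as gawk or mawk; fs_late and fs_begin are hypothetical helper names:

```shell
# FS assigned in the main rule: too late for the record already read.
fs_late()  { awk '{ FS = ":"; print $1 }'; }
# FS assigned in BEGIN: in effect before the first record is read.
fs_begin() { awk 'BEGIN { FS = ":" } { print $1 }'; }

printf 'a:b\nc:d\n' | fs_late    # prints "a:b", then "c"
printf 'a:b\nc:d\n' | fs_begin   # prints "a", then "c"
```

Passing the separator with -F, as in the second try, has the same effect as the BEGIN block.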
Could you please try the following, which does not change FS. Written and tested at https://ideone.com/N8QcC8
awk '{if(match($0,/^ +/)){print RSTART+RLENGTH-1} else{print 0}}' Input_file
OR try:
awk '{match($0,/^ */); print RLENGTH}' Input_file
Output will be:
0
2
4
0
Explanation: the first solution uses a simple if/else. In the if part, awk's match function is given a regex that matches the leading spaces of the line, and RSTART+RLENGTH-1 is printed as the number of spaces. This works because RSTART and RLENGTH are built-in awk variables that are set whenever match finds a match; since the regex is anchored at the start of the line, RSTART is 1 and the sum equals RLENGTH.
The second solution, per rowboat's suggestion, simply prints RLENGTH. Because /^ */ also matches zero spaces, it prints 0 without needing the if/else.
You can try Perl. Simply capture the leading spaces in a group and print its length.
"a"=~/a/ is just to reset the regex captures at the end of each line.
perl -nle ' /(^\s+)/; print length($1)+0; "a"=~/a/ ' count_space.txt
0
2
4
0

how to fetch multiple pattern and numbers

I have this file (pattern1 and pattern2 are fixed, but the numbers are random):
aaaa patern1[1234] bbbb cccc pattern2[5678]
jjjj patern1[9999] hhhhhhhh
and I want to extract the following patterns with a bash script:
pattern1[1234] pattern2[5678]
pattern1[9999]
I tried grep -Eo 'pattern1\[[0-9]{1,4}'; it works for one pattern but not for two.
$ cat ip.txt
aaaa pattern1[1234] bbbb cccc pattern2[5678]
jjjj pattern1[9999] hhhhhhhh
$ perl -lne 'print join " ", /pattern[12]\[\d+\]/g' ip.txt
pattern1[1234] pattern2[5678]
pattern1[9999]
pattern[12]\[\d+\] is the pattern to extract
print join " ", prints the matches separated by a space
If lines not containing the desired pattern are to be omitted:
perl -lne 'print join " ", //g if /pattern[12]\[\d+\]/' ip.txt
You can use the pipe character | to allow for multiple patterns:
grep -oP '(patern1|pattern2)\[[0-9]{1,4}\]' file
patern1[1234]
pattern2[5678]
patern1[9999]
Since the patterns are similar, you can simplify like this:
grep -oP 'patt?ern[12]\[[0-9]{1,4}\]' file
$ awk '{ c=0; while ( match($0,/(patern1|pattern2)[[][^][]+[]]/) ) { printf "%s%s", (c++?OFS:""), substr($0,RSTART,RLENGTH); $0=substr($0,RSTART+RLENGTH) } if (c) print "" }' file
patern1[1234] pattern2[5678]
patern1[9999]
If you prefer brevity over clarity then consider this, using GNU awk for multi-char RS and RT and run against the same input file as shown in https://stackoverflow.com/a/39453928/1745001:
$ awk -v RS='pattern[12][[][0-9]+[]]|\n' '{$0=RT;ORS=(/\n/?x:FS)} 1' file
pattern1[1234] pattern2[5678]
pattern1[9999]

awk compare two files - erase rows from second file based on a condition on the first file

I need some help.
first file
0.5
0.4
0.1
0.6
0.9
second file .bam
(I have to use samtools view)
aaaa bbbb cccc
aaab bbaa ccaa
hoho jojo toto
sese rere baba
jouj douj trou
And I need output:
aaaa bbbb cccc
aaab bbaa ccaa
sese rere baba
Condition: if $1 from the first file is in the interval <0.3;0.6>, print the same row from the second file; if it is not, erase it. I want to filter the second file by a condition on the first file. I prefer awk or bash code, but it is not important.
condition for the first file:
awk '{if($1>0.3 && $1<0.6) {print $0}}'
Please could you help me?
Thanks a lot
Another way
paste file1 file2 | awk '$1<=0.6&&$1>=0.3{$1="";print substr($0,2) }'
Here is one awk solution:
awk 'FNR==NR {a[NR]=$1;next} a[FNR]>0.3 && a[FNR]<0.6' firstfile secondfile
aaaa bbbb cccc
aaab bbaa ccaa
sese rere baba is not printed since you say <0.6 and not <=0.6
You can use awk and its getline function. The script reads lines from the second file and, for each one, uses getline to read the corresponding line from the first file, compares its number against the range, and prints the row if it matches:
awk '
BEGIN { f = ARGV[2]; --ARGC }
{
getline n <f
if ( (n >= 0.3) && (n <= 0.6) ) {
print $0
}
}
' file2 file1
It yields:
aaaa bbbb cccc
aaab bbaa ccaa
sese rere baba
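Since the second file is a .bam that has to go through samtools view, the rows can also be fed to awk on stdin by giving - as the second file name. This is only a sketch under a few assumptions: filter_rows is a hypothetical helper name, the bounds are taken as inclusive (to match the expected output), the first file is not empty, and samtools view is assumed to write one record per line to stdout.

```shell
# filter_rows is a hypothetical helper: $1 is the file of numbers,
# and the rows to filter arrive on stdin.
filter_rows() {
    awk '
        # First input (the numbers file): remember per line whether the value is in range.
        FNR == NR { keep[NR] = ($1 >= 0.3 && $1 <= 0.6); next }
        # Second input (stdin): print rows whose matching value was in range.
        keep[FNR]
    ' "$1" -
}

# Possible usage (input.bam is a placeholder name):
#   samtools view input.bam | filter_rows firstfile
```

Storing only a 0/1 flag per line keeps the second rule to a bare pattern, so awk's default action prints the row unchanged.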
