bash - replace all occurrences in a line with a captured pattern from that line - bash

I have an input file:
a=,1,2,3
b=,4,5,6,7
c=,8,9
d=,10,11,12
e=,13,14,15
That I need to transform into
a/1 a/2 a/3
b/4 b/5 b/6 b/7
c/8 c/9
d/10 d/11 d/12
e/13 e/14 e/15
So I need to capture the phrase before the = sign and replace every comma with  \1/.
My most successful attempt was:
sed 's#\([^,]*\)=\([^,]*\),#\2 \1/#g'
but that would only replace the first occurrence.
Any suggestions?

With awk:
awk -F'[=,]' '{ for(i=3;i<=NF;i++) printf "%s/%s%s", $1,$i,(i==NF? ORS:OFS) }' file
The output:
a/1 a/2 a/3
b/4 b/5 b/6 b/7
c/8 c/9
d/10 d/11 d/12
e/13 e/14 e/15
Or a shorter one with gsub/sub substitution:
awk -F'=' '{ gsub(",", OFS $1"/"); sub(/^[^ ]+ /, "") }1' file
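To see why the second sub() call is needed, it may help to look at the line after gsub() alone. This is a small sketch on one sample line from the question; the variable names line, after_gsub and final are only for illustration.

```shell
line='b=,4,5,6,7'

# Step 1 only: every comma becomes OFS (a space) followed by "b/".
after_gsub=$(printf '%s\n' "$line" | awk -F'=' '{ gsub(",", OFS $1 "/") } 1')

# Both steps: sub() then removes the leftover "b= " prefix.
final=$(printf '%s\n' "$line" | awk -F'=' '{ gsub(",", OFS $1 "/"); sub(/^[^ ]+ /, "") } 1')

printf '%s\n%s\n' "$after_gsub" "$final"
```

The first printed line still carries the "b= " prefix, which is exactly what the sub() call strips.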

The following awk may also help.
awk -F"=" '{gsub(/\,/,FS $1"/");$1="";gsub(/^ +| +$/,"")} 1' Input_file
Explanation of the above solution:
awk -F"=" '{
gsub(/\,/,FS $1"/"); ##Globally replace every comma with FS (the field separator, "="), then $1, then a slash; each key/value pair becomes its own field.
$1=""; ##Empty the first field; this rebuilds the line with OFS (a space) between the fields.
gsub(/^ +| +$/,"") ##Strip the leading and trailing spaces.
}
1 ##awk works as condition/action pairs; 1 is an always-true condition with no action, so the default action (print the current line) runs.
' Input_file ##Name of the input file.
Output will be as follows:
a/1 a/2 a/3
b/4 b/5 b/6 b/7
c/8 c/9
d/10 d/11 d/12
e/13 e/14 e/15

With sed
sed -E '
:A
s/([^=]*)(=[^,]*),([^,]*)/\1\2\1\/\3 /
tA
s/.*=//
' infile
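The sed answer above comes without an explanation, so here is a hedged sketch of how the loop seems to work, run on two of the question's sample lines. The final `s/ $//` is an addition that is not in the original answer; it is assumed here only to trim the trailing space the loop leaves behind.

```shell
# Sample input taken from the question.
input='a=,1,2,3
c=,8,9'

# :A       defines a label.
# s/.../   rewrites the first remaining ",value" as "key/value " on each pass.
# tA       jumps back to :A whenever that substitution succeeded.
# s/.*=//  drops the leading "key=" once no comma is left.
# s/ $//   (assumption, added here) trims the trailing space.
result=$(printf '%s\n' "$input" | sed -E '
:A
s/([^=]*)(=[^,]*),([^,]*)/\1\2\1\/\3 /
tA
s/.*=//
s/ $//
')

printf '%s\n' "$result"
```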

Related

Condition on Nth character of string in a Mth column in bash

I have a sample
$ cat c.csv
a,1234543,c
b,1231456,d
c,1230654,e
I need to grep only the lines where the 4th character of the 2nd column is not 0 or 1.
Output must be
a,1234543,c
I know this only
awk -F, 'BEGIN { OFS = FS } $2 ~/^[2-9]/' c.csv
Is it possible to put a condition on 4th character?
Could you please try following.
awk 'BEGIN{FS=","} substr($2,4,1)!=0 && substr($2,4,1)!=1' Input_file
Or, as per Ed's suggestion:
awk 'BEGIN{FS=","} substr($2,4,1)!~/[01]/' Input_file
Explanation of the first command:
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here.
FS="," ##Setting field separator as comma here.
} ##Closing BLOCK for this program BEGIN section.
substr($2,4,1)!=0 && substr($2,4,1)!=1 ##If the 4th character of the 2nd field is neither 0 nor 1, print the current line.
' Input_file ##Mentioning Input_file name here.
This might work for you (GNU sed or grep):
grep -vE '^([^,]*,){1}[^,]{3}[01]' file
or:
sed -E '/^([^,]*,){1}[^,]{3}[01]/d' file
For the m-th column, replace the 1 with m-1; for the n-th character in that column, replace the 3 with n-1.
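The rule for adapting the pattern (skip m-1 columns, then n-1 characters) can be sketched as a small shell function. The name filter_nth_char and its argument order are hypothetical, chosen only for this illustration.

```shell
# Hypothetical helper: keep lines whose n-th character of the m-th
# comma-separated column is NOT 0 or 1.
# Usage: filter_nth_char M N FILE
filter_nth_char() {
    m=$1
    n=$2
    file=$3
    # Skip m-1 whole columns, then n-1 characters of the m-th column.
    grep -vE "^([^,]*,){$((m - 1))}[^,]{$((n - 1))}[01]" "$file"
}

# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'a,1234543,c' 'b,1231456,d' 'c,1230654,e' > "$tmp"

result=$(filter_nth_char 2 4 "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Because grep -v is used, lines whose column is shorter than n characters are kept as well.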
Grep is the answer.
But here is another way using array and variable substitution
test=( $(cat c.csv) ) # load c.csv data to an array
echo ${test[@]//*,???[0-1]*/} # print all items from an array,
# but remove the ones that correspond to this regex *,???[0-1]*
# so 'b,1231456,d' and 'c,1230654,e' from example will be removed
# and only 'a,1234543,c' will be printed
There are many ways to do this with awk. The most literal form would be:
4th character of 2nd column is not 0 or 1
$ awk -F, '($2 !~ /^...[01]/)' file
$ awk -F, '($2 ~ /^...[^01]/)' file
These will also match a line a,abcdefg,b
2nd column is an integer and 4th character is not 0 or 1
$ awk -F, '($2+0==$2) && ($2 !~ /[.]/) && ($2 !~ /^...[01]/)' file
$ awk -F, '($2 ~ /^[0-9][0-9][0-9][2-9][0-9]*$/)' file
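The difference between the character-only test and the digits-only test can be checked with a short sketch. The data below is the question's sample plus the non-numeric row mentioned above; the variable names loose and strict are only for illustration.

```shell
# Sample data: two rows from the question plus a non-numeric row.
data='a,1234543,c
b,1231456,d
a,abcdefg,b'

# Character test only: the non-numeric row also passes.
loose=$(printf '%s\n' "$data" | awk -F, '$2 !~ /^...[01]/')

# Digits-only test: the 2nd column must consist of digits,
# and its 4th character must be one of 2-9.
strict=$(printf '%s\n' "$data" | awk -F, '$2 ~ /^[0-9][0-9][0-9][2-9][0-9]*$/')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```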

How to add single quote after specific word using sed?

I am trying to write a script that adds single quotes around the value after the word "GOOD".
For example, I have file1 :
//WER GOOD=ONE
//WER1 GOOD=TWO2
//PR1 GOOD=THR45
...
The desired change is to add single quotes:
//WER GOOD='ONE'
//WER1 GOOD='TWO2'
//PR1 GOOD='THR45'
...
This is the script which I am trying to run:
#!/bin/bash
for item in `grep "GOOD" file1 | cut -f2 -d '='`
do
sed -i 's/$item/`\$item/`\/g' file1
done
Thank you for the help in advance !
Could you please try following.
sed "s/\(.*=\)\(.*\)/\1'\2'/" Input_file
Or, as per the OP's comment, to also remove empty lines use:
sed "s/\(.*=\)\(.*\)/\1'\2'/;/^$/d" Input_file
Explanation: following is only for explanation purposes.
sed " ##Starting sed command from here.
s/ ##Using s to start substitution process from here.
\(.*=\)\(.*\) ##Two capture groups: everything up to and including = goes into group 1, the rest of the line into group 2.
/\1'\2' ##Replace the match with group 1 followed by group 2 wrapped in single quotes, i.e. the quotes go around the value after the =.
/" Input_file ##Closing block for substitution, mentioning Input_file name here.
Use the -i option with the above command if you want to save the output back into Input_file itself.
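A minimal sketch of the in-place variant, run on a throwaway file so it is safe to try. It assumes the `-i.bak` spelling (suffix attached to the option), which to my knowledge both GNU and BSD sed accept and which keeps a backup copy of the original file.

```shell
# Work on a temporary copy so the example cannot damage real data.
tmp=$(mktemp)
printf '%s\n' '//WER GOOD=ONE' '//WER1 GOOD=TWO2' > "$tmp"

# -i.bak edits the file in place and keeps the original as "$tmp.bak".
sed -i.bak "s/\(.*=\)\(.*\)/\1'\2'/" "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp" "$tmp.bak"
```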
2nd solution with awk:
awk 'match($0,/=.*/){$0=substr($0,1,RSTART) "\047" substr($0,RSTART+1,RLENGTH) "\047"} 1' Input_file
Explanation of the above code:
awk '
match($0,/=.*/){ ##Use match() to find everything from = to the end of the line.
$0=substr($0,1,RSTART) "\047" substr($0,RSTART+1,RLENGTH) "\047" ##Rebuild $0: the text up to and including =, a single quote (\047), the rest of the line, then a closing single quote.
} ##RSTART and RLENGTH are set by match() whenever the regex matches.
1 ##1 prints every line, edited or not.
' Input_file ##Name of the input file.
3rd solution: If each line of your Input_file has only 2 fields (a single =), a simpler awk works:
awk 'BEGIN{FS=OFS="="} {$2="\047" $2 "\047"} 1' Input_file
Explanation of 3rd code: Use only for explanation purposes, for running please use above code itself.
awk ' ##Starting awk program here.
BEGIN{FS=OFS="="} ##Set both FS and OFS to = for every line of Input_file.
{$2="\047" $2 "\047"} ##Wrap $2 in single quotes (\047), as the OP needs.
1 ##1 prints every line, edited or not.
' Input_file ##Mentioning Input_file name here.

Copy numbers at the beginning of each line to the end of line

I have a file containing lines like the ones below. I want to edit these lines and write them to passageiros.txt:
a82411:x:1015:1006:Adriana Morais,,,:/home/a82411:/bin/bash
a60395:x:1016:1006:Afonso Pichel,,,:/home/a60395:/bin/bash
a82420:x:1017:1006:Afonso Alves,,,:/home/a82420:/bin/bash
a69225:x:1018:1006:Afonso Alves,,,:/home/a69225:/bin/bash
a82824:x:1019:1006:Afonso Carreira,,,:/home/a82824:/bin/bash
a83112:x:1020:1006:Aladje Sanha,,,:/home/a83112:/bin/bash
a82652:x:1022:1006:Alexandre Ferreira,,,:/home/a82652:/bin/bash
a83063:x:1023:1006:Alexandre Feijo,,,:/home/a83063:/bin/bash
a82540:x:1024:1006:Ana Santana,,,:/home/a82540:/bin/bash
With the following code I'm able to get something like this:
cat /etc/passwd |grep "^a[0-9]" | cut -d ":" -f1,5 | sed "s/a//" | sed "s/,//g" > passageiros.txt
sed -e "s/$/:::a/" -i passageiros.txt
82411:Adriana Morais:::a
60395:Afonso Pichel:::a
82420:Afonso Alves:::a
69225:Afonso Alves:::a
82824:Afonso Carreira:::a
83112:Aladje Sanha:::a
82652:Alexandre Ferreira:::a
83063:Alexandre Feijo:::a
82540:Ana Santana:::a
So my goal is to create something like this:
82411:Adriana Morais:::a82411#
60395:Afonso Pichel:::a60395#
82420:Afonso Alves:::a82420#
69225:Afonso Alves:::a69225#
82824:Afonso Carreira:::a82824#
83112:Aladje Sanha:::a83112#
82652:Alexandre Ferreira:::a82652#
83063:Alexandre Feijo:::a83063#
82540:Ana Santana:::a82540#
How can I do this?
Could you please try following.
awk -F'[:,]' '{val=$1;sub(/[a-z]+/,"",$1);print $1,$5,_,_,val"#"}' OFS=":" Input_file
Explanation of the above code:
awk -F'[:,]' ' ##Start the awk script with colon and comma as field separators.
{ ##Start the main block.
val=$1 ##Save the first field in the variable val.
sub(/[a-z]+/,"",$1) ##Remove the first run of lowercase letters (a to z) from the first field.
print $1,$5,_,_,val"#" ##Print the 1st and 5th fields, two empty (unset) variables, and val followed by #.
} ##Close the main block.
' OFS=":" Input_file ##Set OFS to a colon and name the input file.
EDIT: Adding @Aserre's solution here too.
awk -F'[:,]' '{print substr($1, 2),$5,_,_,$1"#"}' OFS=":" Input_file
You may use the following awk:
awk 'BEGIN {FS=OFS=":"} {sub(/^a/, "", $1); gsub(/,/, "", $5); print $1, $5, _, _, "a" $1 "#"}' file > passageiros.txt
Details
BEGIN {FS=OFS=":"} sets the input and output field separator to :
sub(/^a/, "", $1) removes the first a from Field 1
gsub(/,/, "", $5) removes all , from Field 5
print $1, $5, _, _, "a" $1 "#" prints only the necessary fields to the output.
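The steps listed above can be checked on two lines of the question's sample. This sketch feeds them in through printf instead of reading /etc/passwd; the variable names sample and result are only for illustration.

```shell
# Two passwd-style lines copied from the question.
sample='a82411:x:1015:1006:Adriana Morais,,,:/home/a82411:/bin/bash
a60395:x:1016:1006:Afonso Pichel,,,:/home/a60395:/bin/bash'

# "_" is an unset awk variable, so it prints as an empty field.
result=$(printf '%s\n' "$sample" | awk 'BEGIN {FS=OFS=":"} {
    sub(/^a/, "", $1)      # strip the leading "a" from the login
    gsub(/,/, "", $5)      # drop the commas from the name field
    print $1, $5, _, _, "a" $1 "#"
}')

printf '%s\n' "$result"
```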
You can use just one sed:
grep '^a' file | cut -d: -f1,5 | sed 's/a\([^:]*\)\(.*\)/\1\2:::a\1#/;s/,,,//'

Shell script to add values to a specific column

I have semicolon-separated columns, and I would like to add some characters to a specific column.
aaa;111;bbb
ccc;222;ddd
eee;333;fff
I want to add '#' to the second column, so the output should be:
aaa;#111;bbb
ccc;#222;ddd
eee;#333;fff
I tried
awk -F';' -OFS=';' '{ $2 = "#" $2}1' file
It adds the character but replaces all semicolons with spaces.
You could use sed to do your job:
# replaces just the first occurrence of ';', note the absence of `g` that
# would have made it a global replacement
sed 's/;/;#/' file > file.out
or, to do it in place:
sed -i 's/;/;#/' file
Or, use awk:
awk -F';' '{$2 = "#"$2}1' OFS=';' file
All the above commands result in the same output for your example file:
aaa;#111;bbb
ccc;#222;ddd
eee;#333;fff
@atb: Try:
1st:
awk -F";" '{print $1 FS "#" $2 FS $3}' Input_file
The above works only when your Input_file has exactly 3 fields.
2nd:
awk -F";" -vfield=2 '{$field="#"$field} 1' OFS=";" Input_file
With the above code you can pass any field number to suit your needs.
Here the field separator is set to ";" and the variable field holds the number of the field to change; "#" is prepended to that field's value. The trailing 1 is an always-true condition with no action, so the default action (printing the current line) runs.
You just misunderstood how to set variables. Change -OFS to -v OFS:
awk -F';' -v OFS=';' '{ $2 = "#" $2 }1' file
but in reality you should set them both to the same value at one time:
awk 'BEGIN{FS=OFS=";"} { $2 = "#" $2 }1' file
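The effect of OFS can be seen by running the same assignment with and without it. This is a small sketch on one line of the sample input; the variable names without_ofs and with_ofs are only for illustration.

```shell
line='aaa;111;bbb'

# Without OFS, assigning to $2 rebuilds the record with the default
# output separator (a single space).
without_ofs=$(printf '%s\n' "$line" | awk -F';' '{ $2 = "#" $2 } 1')

# With OFS set to ";", the rebuilt record keeps its semicolons.
with_ofs=$(printf '%s\n' "$line" | awk -F';' -v OFS=';' '{ $2 = "#" $2 } 1')

printf '%s\n%s\n' "$without_ofs" "$with_ofs"
```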

awk delete all lines not containing substring using if condition

I want to delete lines where the first column does not contain the substring 'cat'.
So if the string in column 1 is 'caterpillar', I want to keep it.
awk -F"," '{if($1 != cat) ... }' file.csv
How can I go about doing it?
I want to delete lines where the first column does not contain the substring 'cat'
That can be taken care by this awk:
awk -F, 'index($1, "cat")' file.csv
If that doesn't work, I would suggest adding your sample input and expected output to the question.
This awk does the job too
awk -F, '$1 ~ /cat/{print}' file.csv
Explanation
-F, : sets the field delimiter to a comma
$1 ~ /cat/ : match pattern cat in field 1
{print} : print
A shorter command is:
awk -F, '$1 ~ "cat"' file.csv
-F is the field delimiter: (,)
$1 ~ "cat" is a (not anchored) regular expression match, match at any position.
As no action has been given, the default: {print} is assumed by awk.
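A hedged sketch comparing the substring test (index) with the regex test (~) on made-up data; the rows below are assumptions for illustration and not from the question. Both forms keep the lines whose first column contains cat and drop the rest.

```shell
# Made-up sample rows; only the first column is tested.
data='caterpillar,1
dog,2
bobcat,3
bird,cat'

# Substring test: index() returns the position of "cat" in $1, or 0.
by_index=$(printf '%s\n' "$data" | awk -F, 'index($1, "cat")')

# Regex test: unanchored match of cat anywhere in $1.
by_regex=$(printf '%s\n' "$data" | awk -F, '$1 ~ /cat/')

printf '%s\n' "$by_index"
```

The row bird,cat is dropped by both commands because cat appears only in its second column.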
