Modify characters between two patterns in each line - shell

I need to modify certain characters between two patterns in each line.
Example (file content saved as myfile.txt):
abc, def, 1, {,jsdfsd,kfgdsf,lgfgd}, 2, pqr, stu
abc, def, 1, {,jsdfsqwe,k,fdfsfl}, 2, pqr, stu
abc, def, 1, {,asdasdj,kgfdgdf,ldsfsdf}, 2, pqr, stu
abc, def, 1, {,jsds,kfdsf,fdsl}, 2, pqr, stu
I want to edit & save myfile.txt as shown below:
abc, def, 1, {jsdfsd kfgdsf lgfgd}, 2, pqr, stu
abc, def, 1, {jsdfsqwe k fdfsfl}, 2, pqr, stu
abc, def, 1, {asdasdj kgfdgdf ldsfsdf}, 2, pqr, stu
abc, def, 1, {jsds kfdsf fdsl}, 2, pqr, stu
I've used the following command to edit & save myfile.txt:
sed '/1,/,/,2/{/1,/n;/,2/!{s/,/ /g}}' myfile.txt
This command did not help me achieve my goal. Please help me fix this issue.

awk would be more suitable in this case:
awk 'BEGIN{ FS=OFS=", " }{ gsub(/,/, " ", $4); sub(/\{ /, "{", $4) }1' file
The output:
abc, def, 1, {jsdfsd kfgdsf lgfgd}, 2, pqr, stu
abc, def, 1, {jsdfsqwe k fdfsfl}, 2, pqr, stu
abc, def, 1, {asdasdj kgfdgdf ldsfsdf}, 2, pqr, stu
abc, def, 1, {jsds kfdsf fdsl}, 2, pqr, stu
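If you'd rather stay with sed, a substitution loop that converts one comma inside the braces per pass also works (a sketch for GNU sed, assuming at most one {...} group per line):

```shell
# Repeatedly turn "{...X,Y...}" into "{...X Y...}" until no comma is left
# between the braces, then drop the space left where the comma followed "{".
sed ':a; s/{\([^,}]*\),\([^}]*\)}/{\1 \2}/; ta; s/{ /{/' myfile.txt
```

Add -i (GNU sed) to save the result back into myfile.txt.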

In vim there is also the possibility of lookaheads and lookbehinds:
%s/\v(\{.*)@<=,(.*})@=/ /g
This matches every , between a { and a } and replaces it with a space.
If a , directly after a { should be deleted rather than replaced with a space, it is possible to run this slightly modified version first:
%s/\v(\{)@<=,(.*})@=//

Since you also have the tag vim, you can do it in vim via:
:%normal 0f{vi{:s/\%V,/ /g^M
Where the last two characters (^M) are entered as Ctrl+V followed by Ctrl+M.

One option is to use a sub-replace-expression in your :substitute command.
:%s/{\zs[^}]*\ze}/\=substitute(submatch(0)[1:], ',', ' ', 'g')
This matches in between your curly braces and then replaces each , with a space while avoiding the first comma.
For more help see:
:h sub-replace-expression
:h /\zs
:h submatch()
:h sublist
:h substitute()

Could you please try the following awk and let me know if it helps you.
awk 'match($0,/{.*}/){val=substr($0,RSTART,RLENGTH);gsub(/,/," ",val);gsub(/{ /,"{",val);gsub(/} /,"}",val);print substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH);next} 1' Input_file
Here is a non-one-liner form of the solution too.
awk '
match($0,/{.*}/){
val=substr($0,RSTART,RLENGTH);
gsub(/,/," ",val);
gsub(/{ /,"{",val);
gsub(/} /,"}",val);
print substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH)
next
}
1
' Input_file

Using vim:
:%normal! 4f,xf,r f,r 
(note that there is a trailing space after each r)
: ........... command mode
% ........... in the whole file
normal! ..... normal mode
4f, ......... jump to the 4th comma (just inside the { block)
x ........... delete that comma
f, .......... jump to the next comma
r<Space> .... replace it with a space (f,r<Space> is done once per remaining comma)

Some more awk:
awk 'NR>1{sub(/^,/,"",$1); gsub(/,/," ",$1); $0=RS $0}1' FS=} OFS=} RS={ ORS= file
or
awk '{sub(/^,/, "", $2); gsub(/,/, " ", $2); $2="{" $2 "}"}1' FS='[{}]' OFS= file
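The first of these relies on record splitting: RS={ makes everything after each { its own record, and FS=} isolates the brace contents as $1. A spread-out sketch of the same idea, deleting the leading comma explicitly:

```shell
awk '
  NR > 1 {              # every record after the first starts just past a "{"
    sub(/^,/, "", $1)   # delete the comma that directly follows the "{"
    gsub(/,/, " ", $1)  # remaining commas inside the braces become spaces
    $0 = RS $0          # put back the "{" that RS consumed
  }
  1' FS=} OFS=} RS={ ORS= file
```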

Related

Check if there is space in specific range of indexes in a line in a file and replace the space with 0

I am facing the following issue. I have several lines in a file; an example is shown below.
The values between positions 4-10 can contain any combination of values and spaces, and a character has to be replaced with 0 only if it is a space.
#input.txt
014       789000
234455    899800
1213839489404040
In the example lines above, we can see that there are spaces between positions 4-10. The positions to check are fixed. I want to check whether every line in my file has spaces between positions 4-10, and if so, replace each space with 0. The desired output in the file is provided below.
#desired output in the input.txt
0140000000789000 # note that 0s have been added in positions 4-10
2344550000899800 # note that 0s have been added in positions 7-10
1213839489404040
I was able to do the following in my code below. It can only insert values at a specified position using the sed command; I want to modify it so that it can do the task mentioned above.
Could someone please help me?
My Code:
#append script
function append_vals_to_line {
insert_pos=$(($1 - 1))
sed -i 's/\(.\{'$insert_pos'\}\)/\1'$2'/' "$3"
}
column_num_to_insert="$1"
append_value="$2"
input_file_to_append_vals_in="$3"
append_vals_to_line "$column_num_to_insert" "$append_value" "$input_file_to_append_vals_in"
I suggest an awk solution for this:
awk '{s = substr($0, 4, 7); gsub(/ /, 0, s); print substr($0, 1, 3) s substr($0, 11)}' file
#output
0140000000789000
2344550000899800
1213839489404040
A more readable version:
awk '{
s = substr($0, 4, 7)
gsub(/ /, 0, s)
print substr($0, 1, 3) s substr($0, 11)
}' file
This command first extracts the substring from the 4th to the 10th position into variable s. Then, using gsub, we replace each space with a 0, and finally we recompose and print the whole line.
Consider this awk command that takes the start and end positions from arguments:
awk -v p1=4 -v p2=10 '{
s = substr($0, p1, p2-p1+1)
gsub(/ /, 0, s)
print substr($0, 1, p1-1) s substr($0, p2+1)
}' file
You can do:
awk '
/[[:blank:]]/{
sub(/[[:blank:]]*$/,"") # remove trailing spaces if any
gsub(/[[:blank:]]/,"0") # replace each /[[:blank:]]/ with "0"
} 1' file
Alternatively, you can do:
awk '
length($2) {
$0=sprintf("%s%0*d",$1,16-length($1),$2)
} 1' file
Either prints:
0140000000789000
2344550000899800
1213839489404040
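The %0*d in the second command is what does the padding: the * takes the field width from the next printf argument and the leading 0 zero-fills to that width. Seen in isolation (a sketch; the dynamic * width is supported by GNU awk and most modern awks):

```shell
# width = 16 - length("014") = 13, so 789000 is zero-padded to 13 digits
awk 'BEGIN { printf "%s%0*d\n", "014", 13, 789000 }'
# prints 0140000000789000
```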

Awk to Add [ ] in special characters

I am trying to add [ ] if the column contains special characters or numbers, not counting the comma at the end. The first line needs to stay as it is in the file.
Current:
CREATE TEST
a,
b,
23_test,
Expectation:
CREATE TEST
a,
b,
[23_test],
Assuming that the special characters are digits, whitespace, minus signs, plus signs, dots, and underscores (please modify the pattern according to your definition), how about:
awk 'NR>1 && /[-0-9_+. ]/ {$0 = "[" gensub(",$", "", 1) "],"} {print}' input.txt
If you can be specific that the special characters are any characters other than letters and commas, try this instead:
awk 'NR>1 && /[^a-zA-Z,]/ {$0 = "[" gensub(",$", "", 1) "],"} {print}' input.txt
Hope this helps.
A variant that pipes through sed to clean up any ", ]" left in the output:
awk 'NR>1 && /[-0-9_+. ]/ {$0 = "[" gensub(",$", "", 1) "],"} {print}' <filename.out> | sed 's/, ]/]/'
If only that one literal value needs wrapping:
awk '{sub(/23_test/,"[23_test]")}1' file
CREATE TEST
a,
b,
[23_test],
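Note that gensub() is GNU-awk-specific. A portable sketch with plain sub(), under the same "anything other than letters and commas" definition of special characters:

```shell
# wrap any non-header line containing a special character in [ ],
# moving the trailing comma outside the brackets
awk 'NR > 1 && /[^a-zA-Z,]/ { sub(/,$/, ""); $0 = "[" $0 "]," } { print }' input.txt
```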

Switch column if value is found in an array

If the table contains a string from the file, I need to replace the second column with a '-' and then change column four to whatever column two had.
I have the following .txt file:
0
1
2
and I have a csv:
carrot, 0, cat, r
orange, 2, cat, m
banana, 4, robin, d
output:
carrot, -, cat, 0
orange, -, cat, 2
banana, 4, robin, d
What I've currently got is a for loop over the csv file line by line, using grep to check whether the line contains the word and replacing the column with a dash if it does. I think this method is very inefficient and was wondering if there is a better method.
This is a classical case for the awk tool:
awk 'BEGIN{ FS = OFS = ", " }
NR == FNR{ a[$1]; next }
{ if ($2 in a) { $4 = $2; $2 = "-" } }1' file.txt file.csv
The output:
carrot, -, cat, 0
orange, -, cat, 2
banana, 4, robin, d
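The NR==FNR pattern is what makes this two-file processing work: NR and FNR are only equal while the first file is being read, so its values are stored as array keys and the line is skipped; every line of the second file is then checked against that array. The same command with comments (a sketch):

```shell
awk 'BEGIN { FS = OFS = ", " }
     NR == FNR { a[$1]; next }       # 1st file: remember each value as a key
     $2 in a   { $4 = $2; $2 = "-" } # 2nd file: copy $2 into $4, dash out $2
     1' file.txt file.csv
```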

Filtering w/ AWK

Hello, I am a bit new to coding, so sorry if this does not make sense. I need help with a script I am working on for a file. The file is comma delimited and I would like to filter it to only show records with a specific character while keeping its header. I am trying to filter the file by the collateral codes in field 10, and I have over 2000 records in this file.
Example:
Name, Address, Phone, Zip, Coll Code,
Susan Mary, abc, 12345678, 12345, T, etc..
Jon Doe, abc, 12345678, 12345, Y, etc..
Carry Mclaughlin, abc, 12345678, 12345, T, etc..
Larry Burk, abc, 12345678, 12345, M, etc..
Wanted Output:
Name, Address, Phone, Zip, Coll Code, etc..
Susan Mary, abc, 12345678, 12345, T, etc..
Carry Mclaughlin, abc, 12345678, 12345, T, etc..
Here's the sample I am currently using (the code is in field 10):
awk -F, '{if ($10 == "T") print $0}' originalfile > newfile
The only problem I am having right now is keeping the header of this file.
-Thank you
It's hard to tell from your question but it sounds like this might be what you want:
awk -F, 'NR==1 || $10=="T"' file
Okay, so it took me an hour to finally figure this one out. Here is the command I used:
awk -F, 'NR==1; NR > 1{if ($10 == "T") print $0}' originalfile > newfile
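One thing to keep in mind with -F,: awk keeps the space after each comma as part of the next field, so with data written like `, T,` the tenth field is really " T". If the comparison ever fails because of that, letting the separator swallow the spaces helps (a sketch):

```shell
# ", *" means a comma followed by any number of spaces, so $10 is "T", not " T"
awk -F', *' 'NR==1 || $10=="T"' originalfile > newfile
```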

Split column using awk or sed

I have a file containing the following text.
dog
aa 6469
bb 5946
cc 715
cat
aa 5692
Bird
aa 3056
bb 2893
cc 1399
dd 33
I need the following output:
A-Z ,aa ,bb, cc, dd
dog, 6469, 5946 ,715, 0
cat ,5692, 0, 0, 0
Bird ,3056, 2893, 1399, 33
I tried:
awk '{$1=$1}1' OFS="," RS=
But it is not giving the format I need.
Thanks in advance for your help.
Cris
With Perl
perl -00 -nE'
($t, %p) = split /[\n\s]/; $h{$t} = {%p}; # Top line, Pairs on lines
$o{$t} = ++$c; # remember Order
$k{$_} = 1 for keys %p; # accumulate the full set of subKeys
}{ # END block starts
say join ",", "A-Z", sort keys %k;
for $t (sort { $o{$a} <=> $o{$b} } keys %h) {
say join ",", $k, map { ($h{$k}{$_} // 0) } sort keys %k;
}
' data.txt
This prints, preserving the original order of the first column:
A-Z,aa,bb,cc,dd
dog,6469,5946,715,0
cat,5692,0,0,0
Bird,3056,2893,1399,33
Here's a sed solution. It works on your input, but it requires that you know the column names in advance, that the column names are given as a sorted full range starting with the first column name (so nothing like aa, cc or bb, aa or bb, cc), and that every paragraph is followed by one empty line. You would also need to adjust the script if you don't have exactly four numeric columns:
echo 'A-Z, aa, bb, cc, dd';sed -e '/./{s/.* //;H;d};x;s/\n/, /g;s/, //;s/$/, 0, 0, 0, 0/;:a;s/,[^,]*//5;ta' file
If you need to look up the sed commands, you can look at info sed, especially 3.5 Less Frequently-Used Commands.
awk to the rescue!
awk -v OFS=, 'NF==1 {h[++c]=$1}
     NF==2 {v[c,$1]=$2; ks[$1]}
     END {printf "%s", "A-Z";
          for(k in ks) printf "%s", OFS k;
          print "";
          for(i=1;i<=c;i++)
            {printf "%s", h[i];
             for(k in ks) printf "%s", OFS v[i,k]+0;
             print ""}}' file
The order of the columns will be random.
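To make that order deterministic without GNU-awk extensions, the collected keys can be sorted by the script itself (a portable sketch using a small insertion sort over the key list):

```shell
awk -v OFS=, '
  NF == 1 { h[++c] = $1 }
  NF == 2 { v[c, $1] = $2; if (!($1 in ks)) { ks[$1] = 1; ko[++nk] = $1 } }
  END {
    # insertion sort of the collected sub-keys, for a stable header order
    for (i = 2; i <= nk; i++)
      for (j = i; j > 1 && ko[j] < ko[j-1]; j--) {
        t = ko[j]; ko[j] = ko[j-1]; ko[j-1] = t
      }
    printf "%s", "A-Z"
    for (j = 1; j <= nk; j++) printf "%s", OFS ko[j]
    print ""
    for (i = 1; i <= c; i++) {
      printf "%s", h[i]
      for (j = 1; j <= nk; j++) printf "%s", OFS v[i, ko[j]] + 0
      print ""
    }
  }' file
```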
