I have a scenario where I want to collapse runs of multiple double quotes inside the data into a single double quote. Because the input is comma-delimited and every column value is enclosed in double quotes, I ran into the issue explained below.
The sample data looks like this:
"int","","123","abd"""sf123","top"
So, the output would be:
"int","","123","abd"sf123","top"
I tried the approach below, but only the first occurrence is replaced, and I am not sure why:
sed -ie 's/,"",/,"NULL",/g;s/""/"/g;s/,"NULL",/,"",/g' inputfile.txt
Step 1: replace every ,"", with ,"NULL",
Step 2: collapse every run of double quotes (such as "", """ or """") into a single "
Step 3: undo step 1 by replacing every ,"NULL", with ,"",
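A note on the first step: one likely cause of the behaviour described in this question is that sed matches made with the g flag never overlap, so the comma that ends one ,"", match cannot also start the next one. The sketch below shows this on a made-up sample line; the mask_empty helper name is hypothetical, and repeating the substitution is only one possible workaround.

```shell
# Hypothetical sample with two adjacent empty fields.
line='a,"","",b'

# One pass: the first match consumes the shared comma, so the second
# empty field is left unmasked.
printf '%s\n' "$line" | sed 's/,"",/,"NULL",/g'
# prints: a,"NULL","",b

# Hypothetical helper: running the same substitution a second time
# picks up the field that the first pass skipped.
mask_empty() {
  sed 's/,"",/,"NULL",/g;s/,"",/,"NULL",/g'
}
printf '%s\n' "$line" | mask_empty
# prints: a,"NULL","NULL",b
```

The last step, which turns ,"NULL", back into ,"", shares a comma between adjacent matches in the same way and would need the same treatment.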
However, only the first occurrence is changed and the rest stay the same, as shown below:
If input is :
"int","","","123","abd"""sf123","top"
the output is coming as:
"int","","NULL","123","abd"sf123","top"
But, the output should be:
"int","","","123","abd"sf123","top"
You may try this perl with a lookahead:
perl -pe 's/("")+(?=")//g' file
"int","","123","abd"sf123","top"
"int","","","123","abd"sf123","top"
"123"abcs"
Where input is:
cat file
"int","","123","abd"""sf123","top"
"int","","","123","abd"""sf123","top"
"123"""""abcs"
Breakup:
("")+: Match 1+ pairs of double quotes
(?="): If those pairs are followed by a single "
Using sed
$ sed -E 's/(,"",)?"+(",)?/\1"\2/g' input_file
"int","","123","abd"sf123","top"
"int","","NULL","123","abd"sf123","top"
"int","","","123","abd"sf123","top"
With your shown samples, please try the following awk code. It was written and tested in GNU awk and should work in any version of awk.
awk '
BEGIN{ FS=OFS="," }
{
for(i=1;i<=NF;i++){
if($i!~/^""$/){
gsub(/"+/,"\"",$i)
}
}
}
1
' Input_file
Explanation: set the field separator and the output field separator to , for all lines of Input_file. Then loop over each field of the line; if a field is not an empty "" value, globally replace every run of one or more " with a single ". Finally, print the line.
With sed, you could match one or more repetitions of "" using a group, followed by a single ".
Then use a single " in the replacement:
sed -E 's/("")+"/"/g' file
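As a quick check of how this pattern treats runs of different lengths, here is a small sketch. The sample strings and the collapse_quotes helper name are made up for illustration. Because the pattern needs one or more pairs followed by one extra quote, an odd run of three or more collapses to a single quote, while an even run is only shortened to two, which is also why empty "" fields survive.

```shell
# Hypothetical helper wrapping the same substitution.
collapse_quotes() {
  sed -E 's/("")+"/"/g'
}

# A run of three quotes collapses to one.
printf '%s\n' '"a"""b"' | collapse_quotes
# prints: "a"b"

# A run of four quotes is only reduced to two.
printf '%s\n' '"a""""b"' | collapse_quotes
# prints: "a""b"
```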
For this content
$ cat file
"int","","123","abd"""sf123","top"
"int","","","123","abd"""sf123","top"
"123"""""abcs"
The output is
"int","","123","abd"sf123","top"
"int","","","123","abd"sf123","top"
"123"abcs"
sed s'#"""#"#' file
That works. I will demonstrate another method though, which you may also find useful in other situations.
#!/bin/sh -x
cat > ed1 <<EOF
3s/"""/"/
wq
EOF
cp file stack
cat stack | tr ',' '\n' > f2
ed -s f2 < ed1
cat f2 | tr '\n' ',' > stack
rm -v ./f2
rm -v ./ed1
The point of this is that if you have a big csv record all on one line and you want to edit a specific field, then, if you know the field number, you can convert all the commas to newlines and use the field number as a line number to substitute in that line, append after it, or insert before it with ed, and then convert the result back to csv.
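The same idea can be sketched as a single pipeline without temporary files. This is an assumption-laden variant, not the script above: it uses sed in place of ed, the edit_field helper name is hypothetical, and it assumes one record per invocation with no commas inside fields. Using paste to rejoin the lines avoids the trailing comma that tr '\n' ',' would leave.

```shell
# Hypothetical helper: edit_field FIELD_NUMBER SED_COMMAND
# Splits one csv record into one field per line, applies the sed command
# to the line whose number equals the field number, then rejoins.
edit_field() {
  field=$1
  cmd=$2
  tr ',' '\n' | sed "${field}${cmd}" | paste -sd, -
}

# Collapse the triple quote in field 2 only.
printf '%s\n' '"a","b"""c","d"' | edit_field 2 's/"""/"/'
# prints: "a","b"c","d"
```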
I have comma-separated strings inside brackets, and I need to replace a string in the rows that match a pattern.
There is an unknown number of strings at the start and at the end. In the example below, I need to replace the "c++" string with "c" if the row contains the string "ruby".
I tried the sed command below, but it didn't work.
```
("java","php","ruby",".net","scala","c++",...n),
(".net","ruby","php","java","c++",...n),
("java",".net","ruby","php","c++",...n),
("ruby","java",".net","php","c++",...n);
```
```
sed -e "s/(\(.*\),\("ruby"\),\(.*\),"c++",\(.*\))/(\1,\2,\3,"c",\4)/g"
```
("java","php","ruby",".net","scala","c++",...n),
(".net","ruby","php","java","c++",...n),
("java",".net","ruby","php","c++",...n),
("ruby","java",".net","php","c++",...n);
With awk (mawk, nawk, or gawk), given the input above:
awk '/\42ruby\42/ ? NF = NF : NF' FS='"c[+][+]"' OFS='"c"' file
Output:
("java","php","ruby",".net","scala","c",...n),
(".net","ruby","php","java","c",...n),
("java",".net","ruby","php","c",...n),
("ruby","java",".net","php","c",...n);
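It seems the double quotes in your sed command are not escaped. The script is wrapped in double quotes, so the shell strips the inner ones before sed sees them: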
sed -e "s/(\(.*\),\("ruby"\),\(.*\),"c++",\(.*\))/(\1,\2,\3,"c",\4)/g"
Wrap the script in single quotes instead:
sed -e 's/(\(.*\),\("ruby"\),\(.*\),"c++",\(.*\))/(\1,\2,\3,"c",\4)/g' file.txt
Or, more simply, use this one:
sed -e 's/\("ruby"\),\(.*\),"c++"/\1,\2,"c"/g' my_file.txt
which will output
("jsjs","java",".net","php","c++",...n);
("java","php","ruby",".net","scala","c",...n),
(".net","ruby","php","java","c",...n),
("java",".net","ruby","php","c",...n),
("ruby","java",".net","php","c",...n);
("rubys","java",".net","php","c++",...n);
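Another way to express the same condition, sketched here as an alternative rather than as the answer's own method, is a sed address: the substitution runs only on lines that contain "ruby", so no capture groups are needed. The replace_cpp helper name and the two sample rows are made up for illustration, and [+] is used to keep the plus signs literal.

```shell
# Hypothetical helper: on lines containing "ruby" (with its quotes),
# replace every "c++" field with "c"; other lines pass through unchanged.
replace_cpp() {
  sed '/"ruby"/s/"c[+][+]"/"c"/g'
}

printf '%s\n' \
  '("java","php","ruby","c++",...n),' \
  '("jsjs","java","c++",...n);' | replace_cpp
# prints:
# ("java","php","ruby","c",...n),
# ("jsjs","java","c++",...n);
```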
I need to find a string in a text file and store it in a variable.
The file looks like this (rsa.txt):
Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
Required output (variable name : encstring):
encstring = $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
I tried the code below, but it shows no result:
encstring=$(grep -oE '$ENC[^()]*==)' <<< rsa.txt)
With awk, please try the following. It searches for lines containing the string Encrypted string, checks that the last field of that line contains $ENC, and then prints that last field using $NF.
encstring=$(awk '/Encrypted string/ && $NF~/\$ENC/{print $NF}' rsa.txt)
You can use
encstring=$(sed -n 's/.*\(\$ENC(.*)\).*/\1/p' rsa.txt)
# OR
encstring=$(grep -oP '\$ENC\(.*?\)' rsa.txt)
See an online demo:
s='Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)'
encstring=$(sed -n 's/.*\(\$ENC(.*)\).*/\1/p' <<< "$s")
echo "$encstring"
# => $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
The sed -n 's/.*\(\$ENC(.*)\).*/\1/p' command does the following:
-n suppresses the default line output
s/.*\(\$ENC(.*)\).*/\1/ - finds any text, then captures $ENC(...) into Group 1 and then matches the rest of the string, and replaces the match with the Group 1 value
p - prints the result of the substitution.
The grep -oP '\$ENC\(.*?\)' command extracts all $ENC(...) matches, with any chars, as few as possible, between ( and ).
Your pattern searches for ENC followed by zero or more characters that are not an opening or closing parenthesis. In your input file, however, an opening parenthesis comes right after ENC, so [^()]* matches the empty string. The pattern then expects the string ==), so it would only match input of the form ENC==).
You need to escape $ as \$ as it means "end of string" with -E
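Putting those points together, a corrected extended-regex version might look like the sketch below. It escapes the dollar sign and both parentheses, and it reads the text from a pipe; note that <<< rsa.txt in the original attempt passes the literal string rsa.txt to grep instead of the file contents. The sample variable is only there to make the example self-contained, and -o is assumed to be available (it is in GNU and BSD grep).

```shell
# One line of the sample file, held in a variable for the demo.
sample='... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)'

# \$ is a literal dollar sign, \( and \) are literal parentheses,
# and [^()]* matches everything between them.
encstring=$(printf '%s\n' "$sample" | grep -oE '\$ENC\([^()]*\)')
echo "$encstring"
# prints: $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
```

With the real file, the same pattern would be given the file name as an argument, as in grep -oE '\$ENC\([^()]*\)' rsa.txt.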
I have this shell code that concatenates two strings:
new_group="second result is: ${result},\"${policyActivite}_${policyApplication}\""
echo "result is: ${new_group}"
The result:
result is: "Team","Application_Team"
How can I change the result to: result is: "Team, Application_Team"?
Use sed:
echo "$new_group" | sed 's/"//g;s/^\(.*\)$/"\1"/'
The first command removes all double quotes. The second one adds a double quote at the start and the end of the line.
Alternatively, if you want to replace "," with a comma followed by a space, use this sed command: sed 's/","/, /g'
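To compare the two commands side by side, here is a small sketch that assumes the variable holds exactly the value shown in the question's output. The first command joins the names without a space, while the second yields the spacing asked for.

```shell
# Assumed input, copied from the output shown in the question.
new_group='"Team","Application_Team"'

# Strip all quotes, then wrap the whole line in one pair of quotes.
printf '%s\n' "$new_group" | sed 's/"//g;s/^\(.*\)$/"\1"/'
# prints: "Team,Application_Team"

# Replace each "," separator with a comma and a space.
printf '%s\n' "$new_group" | sed 's/","/, /g'
# prints: "Team, Application_Team"
```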
I have a csv file which has the following string:
"2016-10-25T14:07:49.298-07:00"
which I would like to replace with:
"2016-10-25", "14:07:49"
I matched the original string with a regular expression:
([0-9]{4}-[0-9]{2}-[0-9]{2})[T]([0-9]{2}\:[0-9]{2}\:[0-9]{2})\.[0-9]{3}-07\:00
but I need some help
With awk, assuming T and . are unique
$ echo '"2016-10-25T14:07:49.298-07:00"' | awk -F'[T.]' '{print $1 "\", \"" $2 "\""}'
"2016-10-25", "14:07:49"
-F'[T.]' assigns T or . as the field separator
Then print the first and second fields with the required formatting
With sed:
sed -E 's/^([^T]+)T([^.]+).*/\1", "\2"/'
^([^T]+) matches the portion up to T and puts it in capture group 1
T matches T literally
([^.]+) matches up to the next . and puts it in capture group 2
.* matches the rest
in the replacement, the capture groups are used with the required formatting to get the desired output: \1", "\2"
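Since the value lives in a csv file, the substitution above would also swallow any columns after the timestamp, because .* runs to the end of the line. A stricter sketch, closer to the regular expression in the question, matches only the quoted timestamp and leaves the rest of the row alone. The split_timestamp helper name and the sample row are made up, and accepting any +hh:mm or -hh:mm offset instead of only -07:00 is an assumption.

```shell
# Hypothetical helper: rewrite every quoted ISO-style timestamp in a row
# as two quoted fields (date, time), dropping milliseconds and the offset.
split_timestamp() {
  sed -E 's/"([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})\.[0-9]{3}[+-][0-9]{2}:[0-9]{2}"/"\1", "\2"/g'
}

printf '%s\n' 'id1,"2016-10-25T14:07:49.298-07:00",ok' | split_timestamp
# prints: id1,"2016-10-25", "14:07:49",ok
```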
Example:
$ sed -E 's/^([^T]+)T([^.]+).*/\1", "\2"/' <<<'"2016-10-25T14:07:49.298-07:00"'
"2016-10-25", "14:07:49"