Sed converting underscore string to CamelCase fails on numbers - bash

I have an assignment to convert function names written like function_name() to camelCase. There are some restrictions:
don't convert functions with uppercase character in them
don't convert part of function with two underscores (two__underscores())
I came up with a sed command that works fairly well, except that it fails on a single digit between underscores:
command:
sed -re '/[A-Z]+/!s/([0-9a-z])(_)([a-z0-9])/\1\u\3/g'
What it does:
this_is_simple() -> thisIsSimple()
this_is_2_simple() -> thisIs2_simple()
this_is_22_simple() -> thisIs22Simple()
The problem is the second example. Why does it fail on a single digit but not on a number with more digits? I tried using [[:digit:]] and replacing ([0-9a-z]) with ([a-z0-9]|[[:digit:]]). They work the same.
Thank you in advance.

Loop through it manually and replace up until there is nothing more to replace.
sed -re '/[A-Z]+/!{ : again; /([0-9a-zA-Z])_([a-z0-9])/{ s//\1\u\2/; b again; }; }'
I have added A-Z in the first regex to handle cases like:
this_is_a_simple -> thisIsASimple
After the first match it becomes thisIsA_simple, so in the second loop we want to match A_simple.
Maybe a better version would be:
sed -re '/[A-Z]+/!{ : again; /(.*[0-9a-z])_([a-z0-9])/{ s//\1\u\2/; b again; }; }'
Because the regex is greedy, this replaces from the end, so this_is_a_simple first becomes this_is_aSimple, then this_isASimple, then thisIsASimple.
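As for why the original command stumbles on a single digit: with the g flag, sed resumes scanning after the end of the previous match, so a character consumed as the last group of one match (the 2 in s_2) cannot also serve as the first group of the next match (2_s). With 22, the first 2 ends one match and the second 2 starts the next. The loop approach can be sketched as a small function; this assumes GNU sed (for \u), and the name to_camel is made up for illustration:

```shell
# to_camel: hypothetical helper; reads names on stdin, one per line.
# Assumes GNU sed, because \u (uppercase the next character) is a GNU extension.
to_camel() {
  sed -E '
    # leave lines that already contain an uppercase letter untouched
    /[A-Z]/b
    :again
    # convert one single underscore, then rescan the line from the start
    s/([0-9a-zA-Z])_([a-z0-9])/\1\u\2/
    t again
  '
}

printf '%s\n' 'this_is_2_simple()' 'two__underscores()' | to_camel
```

Because each pass replaces only one underscore and then starts over, overlapping candidates such as s_2 and 2_s are both seen.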

Related

sed replace string with pipe and stars

I have the following string:
|**barak**.version|2001.0132012031539|
in file text.txt.
I would like to replace it with the following:
|**barak**.version|2001.01.2012031541|
So I run:
sed -i "s/\|\*\*$module\*\*.version\|2001.0132012031539/|**$module**.version|$version/" text.txt
but the result is a duplicate instead of replacing:
|**barak**.version|2001.01.2012031541|**barak**.version|2001.0132012031539|
What am I doing wrong?
Here is the value for module and version:
$ echo $module
barak
$ echo $version
2001.01.2012031541
Assumptions:
lines of interest start and end with a pipe (|) and have one more pipe somewhere in the middle of the data
search is based solely on the value of ${module} existing between the 1st/2nd pipes in the data
we don't know what else may be between the 1st/2nd pipes
the version number is the only thing between the 2nd/3rd pipes
we don't know the version number that we'll be replacing
Sample data:
$ module='barak'
$ version='2001.01.2012031541'
$ cat text.txt
**barak**.version|2001.0132012031539| <<<=== leave this one alone
|**apple**.version|2001.0132012031539|
|**barak**.version|2001.0132012031539| <<<=== replace this one
|**chuck**.version|2001.0132012031539|
|**barak**.peanuts|2001.0132012031539| <<<=== replace this one
One sed solution with -Extended regex support enabled and making use of a capture group:
$ sed -E "s/^(\|[^|]*${module}[^|]*).*/\1|${version}|/" text.txt
Where:
\| - an escaped pipe matches a literal pipe (with -E an unescaped | means alternation); the pipes inside the bracket expression [^|] and in the replacement text are already literal
^(\|[^|]*${module}[^|]*) - first capture group that starts at the beginning of the line, starts with a pipe, then some number of non-pipe characters, then the search pattern (${module}), then more non-pipe characters (continues up to next pipe character)
.* - matches rest of the line (which we're going to discard)
\1|${version}| - replace line with our first capture group, then a pipe, then the new replacement value (${version}), then the final pipe
The above generates:
**barak**.version|2001.0132012031539|
|**apple**.version|2001.0132012031539|
|**barak**.version|2001.01.2012031541| <<<=== replaced
|**chuck**.version|2001.0132012031539|
|**barak**.peanuts|2001.01.2012031541| <<<=== replaced
An awk alternative using GNU awk:
awk -v mod="$module" -v vers="$version" -F \| '{ OFS=FS;split($2,map,".");inmod=substr(map[1],3,length(map[1])-4);if (inmod==mod) { $3=vers } }1' file
Pass two variables, mod and vers, to awk from $module and $version. Set the field delimiter to |. Split the second field into the array map using the split function with . as the delimiter. Then strip the leading and trailing "**" from the first element of the array using the substr function to expose the module name as inmod. Compare this to the mod variable and, if there is a match, change the 3rd field to the value of vers. Print every line with the shorthand 1.
Pipe is only special when you're using extended regular expressions: sed -E
There's no reason why you need extended here, stick with basic regex:
sed "
# for lines matching module.version
/|\*\*$module\*\*.version|/ {
# replace the version
s/|2001.0132012031539|/|$version|/
}
" text.txt
or as an unreadable one-liner
sed "/|\*\*$module\*\*.version|/ s/|2001.0132012031539|/|$version|/" text.txt
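A likely reason the original command prepended text instead of replacing it: in GNU sed's basic regex syntax \| means alternation, so a pattern that begins with \| starts with an empty alternative, which matches at the very start of the line. Below is a minimal sketch of the basic-regex approach that does not hard-code the old version number. The function name set_version is hypothetical, and it assumes the module name contains no regex metacharacters and the new version contains no / or &:

```shell
# set_version MODULE NEW_VERSION: hypothetical helper, filters stdin to stdout.
# In basic regex an unescaped | is a literal pipe, so no escaping is needed.
set_version() {
  # address: lines starting with |**MODULE**.version|
  # substitution: replace the last |...| field on the line with |NEW_VERSION|
  sed "/^|\*\*$1\*\*\.version|/ s/|[^|]*|\$/|$2|/"
}

printf '%s\n' '|**apple**.version|2001.0132012031539|' \
              '|**barak**.version|2001.0132012031539|' |
  set_version barak 2001.01.2012031541
```

To edit the file in place with GNU sed, the same script could be passed to sed -i together with the file name.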

sed preserve wildcard value inside pattern

I have some app config file tmp.cfg. And need to change some given values inside.
Here are the string examples:
app-stat!error!25871a5f-9f50-40ac-923d-c80a660fe21d!1!2
app-stat!queued!25871a5f-9f50-40ac-923d-c80a660fe21d!5!10
app-stat!error!fbbf0e80-8a21-4ebf-9a78-b1017c58a19d!1!2
app-stat!error!5670b363-6a5d-4fcd-819e-85786c5957f1!120!200
For all strings that contains
!error! then following some GUID and then values !1!2 change to
!error! then preserve some GUID and then NEW values !7!10
I do not need to touch other string that contains !error! then GUID but different values in the end
Here what I've tried:
sed -i "s/error\!.*\!1\!2/error\!.*\!4\!8/g" tmp.cfg
It finds all the strings I need, but it replaces the GUID with the literal characters .* instead of keeping the GUID itself.
How can I build the sed expression so that the wildcard part is preserved?
The expected result is:
app-stat!error!fbbf0e80-8a21-4ebf-9a78-b1017c58a19d!4!8
The actual result is:
app-stat!error!.*!4!8
sed 's/\(!error!.*\)!1!2/\1!4!8/g' file
Guess you need something like this.
Patterns enclosed within \( ... \) are saved in registers for later use and can be accessed as \1, \2, … up to \9.
In the above sed expression, pattern from !error!<GUID> is captured in \1 and used while replacing as \1!4!8.
You can omit g from the sed expression if you are sure that the same pattern won't occur twice on a line.
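A slightly stricter variant of the same capture-group idea, as a sketch: [^!]* keeps the captured part within a single !-delimited field, and $ anchors the old values to the end of the line so that something like !1!20 is not touched. The function name bump_error_limits is made up for illustration, and the new values !4!8 are taken from the expected result in the question:

```shell
# bump_error_limits: hypothetical helper, filters stdin to stdout.
# \( ... \) captures "!error!<GUID>"; \1 puts it back in the replacement.
bump_error_limits() {
  sed 's/\(!error![^!]*\)!1!2$/\1!4!8/'
}

printf '%s\n' \
  'app-stat!error!25871a5f-9f50-40ac-923d-c80a660fe21d!1!2' \
  'app-stat!queued!25871a5f-9f50-40ac-923d-c80a660fe21d!5!10' \
  'app-stat!error!5670b363-6a5d-4fcd-819e-85786c5957f1!120!200' |
  bump_error_limits
```

Only the first sample line ends in !1!2 after an !error! field, so only that line should change.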
This is easy to do with awk
awk '$2=="error" && $4==1 && $5==2 {$4=7;$5=10}1' FS="!" OFS="!" file
app-stat!error!25871a5f-9f50-40ac-923d-c80a660fe21d!7!10
app-stat!queued!25871a5f-9f50-40ac-923d-c80a660fe21d!5!10
app-stat!error!fbbf0e80-8a21-4ebf-9a78-b1017c58a19d!7!10
app-stat!error!5670b363-6a5d-4fcd-819e-85786c5957f1!120!200
Separate fields by !
Then if field 2 is error, field 4 is 1 and field 5 is 2,
set fields 4 and 5 to 7 and 10.
The trailing 1 prints every line.
This sed command should work:
sed -r 's/(.*)!error!(.*)!1!2$/\1!error!\2!4!8/g' file_name

Replace and increment letters and numbers with awk or sed

I have a string that contains
fastcgi_cache_path /var/run/nginx-cache15 levels=1:2 keys_zone=MYSITEP:100m inactive=60m;
One of the goals of this script is to increment the number after nginx-cache, based on the value found in the previous file. To do that I used this code:
# Replace cache_path
PREV=$(ls -t /etc/nginx/sites-available | head -n1) #find the previous cache_path number
CACHE=$(grep fastcgi_cache_path $PREV | awk '{print $2}' |cut -d/ -f4) #take the string to change
SUB=$(echo $CACHE |sed "s/nginx-cache[0-9]*[0-9]/&#/g;:a {s/0#/1/g;s/1#/2/g;s/2#/3/g;s/3#/4/g;s/4#/5/g;s/5#/6/g;s/6#/7/g;s/7#/8/g;s/8#/9/g;s/9#/#0/g;t a};s/#/1/g") #increment number
sed -i "s/nginx-cache[0-9]*/$SUB/g" $SITENAME #replace number
Maybe not so elegant, but it works.
The other goal is to increment the last letter of all occurrences of MYSITEx: MYSITEP, in this case, should become MYSITEQ, and so on. Once MYSITEZ is reached, another letter should be added, like MYSITEAA, MYSITEAB, etc.
I thought something like:
sed -i "s/MYSITEP[A-Z]*/MYSITEGG/g" $SITENAME
but it can't work because MYSITEGG is a static value.
How can I calculate the last letter, increment it to the next one and, once Z is reached, add another letter?
Thank you!
Perl's autoincrement will work on letters as well as digits, in exactly the manner you describe
We may as well tidy your nginx-cache increment as well while we're at it
I assume SITENAME holds the name of the file to be modified?
It would look like this. I have to assign the capture $1 to an ordinary variable $n to increment it, as $1 is read-only
perl -i -pe 's/nginx-cache\K(\d+)/ ++($n = $1) /e; s/MYSITE\K(\w+)/ ++($n = $1) /e;' $SITENAME
If you wish, this can be done in a single substitution, like this
perl -i -pe 's/(?:nginx-cache|MYSITE)\K(\w+)/ ++($n = $1) /ge' $SITENAME
Note: The solution below is needlessly complicated, because as Borodin's helpful answer demonstrates (and @stevesliva's comment on the question hinted at), Perl directly supports incrementing letters alphabetically in the manner described in the question, by applying the ++ operator to a variable containing a letter (sequence); e.g.:
$ perl -E '$letters = "ZZ"; say ++$letters'
AAA
The solution below may still be of interest as an annotated showcase of how Perl's power can be harnessed from the shell, showing techniques such as:
use of s///e to determine the replacement string with an expression.
splitting a string into a character array (split //, "....")
use of the ord and chr functions to get the codepoint of a char., and convert a(n incremented) codepoint back to a char.
string replication (x operator)
array indexing and slices:
getting an array's last element ($chars[-1])
getting all but the last element of an array (@chars[0..$#chars-1])
A perl solution (in effect a re-implementation of what ++ can do directly):
perl -pe 's/\bMYSITE\K([A-Z]+)/
@chars = split qr(), $1; $chars[-1] eq "Z" ?
"A" x (1 + scalar @chars)
:
join "", @chars[0..$#chars-1], chr (1 + ord $chars[-1])
/e' <<'EOF'
...=MYSITEP:...
...=MYSITEZP:...
...=MYSITEZZ:...
EOF
yields:
...=MYSITEQ:... # P -> Q
...=MYSITEZQ:... # ZP -> ZQ
...=MYSITEAAA:... # ZZ -> AAA
You can use perl's -i option to replace the input file with the result
(perl -i -pe '...' "$SITENAME").
As Borodin's answer demonstrates, it's not hard to solve all tasks in the question using perl alone.
The s function's /e option allows use of a Perl expression for determining the replacement string, which enables sophisticated replacements:
$1 references the current MYSITE suffix in the expression.
@chars = split qr(), $1 splits the suffix into a character array.
$chars[-1] eq "Z" tests if the last suffix char. is Z
If so: The suffix is replaced with all As, with an additional A appended
("A" x (1 + scalar @chars)).
Otherwise: The last suffix char. is replaced with the following letter in the alphabet
(join "", @chars[0..$#chars-1], chr (1 + ord $chars[-1]))

Deleting all until you find capital letter in bash

I have an output
timeout.o:
U alarm
000000000000t000 T catch_sig_alarm
0000000000000b13 T set_timeout
U signal
0000000g00000000 B timeout
and I need to get rid of the numbers and letters before T, U and B so the output looks like this:
timeout.o:
U alarm
T catch_sig_alarm
T set_timeout
U signal
B timeout
How can I do that using sed? I tried something like sed 's/[0-9]*//;s/ *//' but I don't know how to tell it to delete the letters too.
Update
Based on the real input data (I thought timeout.o was the file name):
... | awk 'NF>1 {sub("^[^A-Z]*","")} {print}'
timeout.o:
U alarm
T catch_sig_alarm
T set_timeout
U signal
B timeout
The substitution is done only if the line contains more than one field, so the first line is skipped. Using NR>1 would have the same effect here.
You can use this:
$ sed 's/^[^A-Z]*//' timeout.o
U alarm
T catch_sig_alarm
T set_timeout
U signal
B timeout
This matches all characters from the beginning of the line (^ anchors to the start) that are not capital letters ([^A-Z]*) and replaces them with an empty string.
Note that sed 's/hello/bye/' replaces only the first hello on each line with bye. To replace every occurrence (not needed in this case), use sed 's/hello/bye/g'.
For an in-place substitution, use sed -i ....
input | sed '/^[a-zA-Z0-9.]\+\.[a-z]\+:$/!s/^[^A-Z]*//'
Explanation: [^A-Z] is everything not uppercase letter. The first ^ makes sure, the expression starts at line beginning and doesn't go rogue in the middle of the line. The expression simply starts deleting everything in a line, until it finds an uppercase letter.
The first part /^[a-zA-Z0-9.]\+\.[a-z]\+:$/! up to the s restricts the removal to lines that do not (the final !) match exactly [letter]...[a dot][letter]...[a colon], which looks like a filename.
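The same "skip the header, strip everything before the first capital" idea can be sketched with a simpler address. This assumes the only lines to keep intact are the ones ending in a colon (such as timeout.o:); the function name strip_addresses is hypothetical:

```shell
# strip_addresses: hypothetical helper, filters stdin to stdout.
# /:$/! applies the substitution only to lines that do NOT end in a colon.
strip_addresses() {
  sed '/:$/!s/^[^A-Z]*//'
}

printf '%s\n' 'timeout.o:' \
              '                 U alarm' \
              '000000000000t000 T catch_sig_alarm' |
  strip_addresses
```

Unlike the filename pattern above, this relies only on the trailing colon, so it would also skip any symbol line that happened to end in one.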
cat timeout.o | sed 's/^[^BUT]* //'
Or
sed 's/[0-9a-z]* //;s/ *//'
Given that the first column seems to have a fixed width, I'd just use
{ read; echo "$REPLY"; cut -c18-; } < timeout.o
to remove the first 17 characters (while preserving the initial line in full).

sed command to edit stream on given rule

I have an input stream like this:
afs=1;bgd=1;cgd=1;djh=1;fgjhh=1;
Now the rule I have to edit the stream is:
(1)if we have
"djh=number;"
replace it with
"djh=number,"
(2)else replace "string=number;" with
"string,"
I can handle case 2 as:
sed 's/afs=1/afs,/g;s/dbg=1/dbg,/g;..... so on for rest
How to take care for condition 1?
The "djh" number can be any number(1,12,100), the other numbers are always 1.
all the double quotes I have used are for reference only; no double quotes are present in the input stream. "afs" can be "Afs" also.
Thanks in advance.
sed -e 's/;/,/g; s/,djh=/,#=/; s/\([a-z][a-z]*\)=[0-9]*,/\1,/g; s/#/djh/g'
This does the following
replace all ; by ,
replace djh with #
remove =number from all lower cased strings
replace # with djh
This results in afs,bgd,cgd,djh=1,fgjhh, for your input. Of course you could substitute djh with any other character that makes it easy to match the other strings. This is just illustrating the idea.
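A variation on the same placeholder idea, as a sketch: instead of hiding the name djh, it hides the = that should survive, and it also handles djh as the first entry on the line. It assumes # never occurs in the input and that sed supports -E; the function name apply_rule is made up for illustration:

```shell
# apply_rule: hypothetical helper, filters stdin to stdout.
# Assumes the character # does not occur anywhere in the input.
apply_rule() {
  sed -E '
    # every entry terminator becomes a comma
    s/;/,/g
    # protect the = of djh (at line start or after a comma) with a placeholder
    s/(^|,)djh=([0-9]+),/\1djh#\2,/g
    # drop =number from all remaining entries
    s/=[0-9]+,/,/g
    # restore the protected =
    s/#/=/g
  '
}

echo 'afs=1;bgd=1;cgd=1;djh=12;fgjhh=1;' | apply_rule
```

Since names are never matched against a character class here, entries such as Afs=1 are handled the same way as afs=1.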
echo 'afs=1;bgd=1;cgd=1;djh=1;fgjhh=1;' |
sed -e 's/\(djh=[0-9]\+\);/\1,/g' -e 's/\([a-zA-Z0-9]\+\)=1;/\1,/g'
This might work for you:
echo "afs=1;bgd=1;cgd=1;djh=1;fgjhh=1;" |
sed 's/^/\n/;:a;/\n\(djh=[0-9]*\);/s//\1,\n/;ta;s/\n\([^=]*\)=1;/\1,\n/;ta;s/.$//'
afs,bgd,cgd,djh=1,fgjhh,
Explanation:
This method uses a unique marker (\n is a good choice because it cannot appear in the pattern space as it is used by sed as the line delimiter) as anchor for comparing throughout the input string. It is slow but can scale if more than one exception is needed.
Place the marker in front of the string s/^/\n/
Name a loop label :a
Match the exception(s) /\n\(djh=[0-9]*\);/
If the exception occurs, substitute as necessary and bump the marker along s//\1,\n/
If a substitution was made, branch back to the loop label ta
Match the normal case and substitute, again bumping the marker along s/\n\([^=]*\)=1;/\1,\n/
If a substitution was made, branch back to the loop label ta
All done: remove the marker s/.$//
or:
echo "afs=1;bgd=1;cgd=1;djh=1;fgjhh=1;" |
sed 's/\<djh=/\n/g;s/=[^;]*;/,/g;s/\n\([^;]*\);/djh=\1,/g'
afs,bgd,cgd,djh=1,fgjhh,
Explanation:
This is fast but does not scale for multiple exceptions:
Globally replace the exception string with a newline s/\<djh=/\n/g
Globally replace the normal condition s/=[^;]*;/,/g
Globally replace the \n with the exception string s/\n\([^;]*\);/djh=\1,/g
N.B. When replacing the exception string make sure that it begins on a word boundary \<

Resources