Case-insensitive search and replace with sed - macos

I'm trying to use SED to extract text from a log file. I can do a search-and-replace without too much trouble:
sed 's/foo/bar/' mylog.txt
However, I want to make the search case-insensitive. From what I've googled, it looks like appending i to the end of the command should work:
sed 's/foo/bar/i' mylog.txt
However, this gives me an error message:
sed: 1: "s/foo/bar/i": bad flag in substitute command: 'i'
What's going wrong here, and how do I fix it?

Update: Starting with macOS Big Sur (11.0), sed supports the I flag for case-insensitive matching, so the command in the question should now work (BSD sed doesn't report its version, but you can go by the date at the bottom of the man page, which should be March 27, 2017 or more recent). A simple example:
# BSD sed on macOS Big Sur and above (and GNU sed, the default on Linux)
$ sed 's/ö/#/I' <<<'FÖO'
F#O # `I` matched the uppercase Ö correctly against its lowercase counterpart
Note: I (uppercase) is the documented form of the flag, but i works as well.
Similarly, starting with macOS Big Sur (11.0) awk now is locale-aware (awk --version should report 20200816 or more recent):
# BSD awk on macOS Big Sur and above (and GNU awk, the default on Linux)
$ awk '{ print tolower($0) }' <<<'FÖO'
föo # non-ASCII character Ö was properly lowercased
The following applies to macOS up to Catalina (10.15):
To be clear: On macOS, sed - which is the BSD implementation - does NOT support case-insensitive matching - hard to believe, but true. The formerly accepted answer, which itself shows a GNU sed command, gained that status because of the perl-based solution mentioned in the comments.
To make that Perl solution work with foreign characters as well, via UTF-8, use something like:
perl -C -Mutf8 -pe 's/öœ/oo/i' <<< "FÖŒ" # -> "Foo"
-C turns on UTF-8 support for streams and files, assuming the current locale is UTF-8-based.
-Mutf8 tells Perl to interpret the source code as UTF-8 (in this case, the string passed to -pe) - this is the shorter equivalent of the more verbose -e 'use utf8;'. (Thanks, Mark Reed.)
(Note that using awk is not an option either, as awk on macOS (i.e., BWK awk and BSD awk) appears to be completely unaware of locales altogether - its tolower() and toupper() functions ignore foreign characters (and sub() / gsub() don't have case-insensitivity flags to begin with).)
A note on the relationship of sed and awk to the POSIX standard:
BSD sed and awk limit their functionality mostly to what the POSIX sed and
POSIX awk specs mandate, whereas their GNU counterparts implement many more extensions.
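Since support for the I flag depends on the sed version, a script can probe for it at run time. The following is only a sketch: ci_replace and the CI_WORD / CI_REPL variable names are made up for illustration, and it assumes the search word and the replacement are plain alphanumeric text (no regex or delimiter characters).

```shell
# ci_replace WORD REPLACEMENT   (hypothetical helper, reads stdin)
# Replaces every case-insensitive occurrence of WORD with REPLACEMENT.
ci_replace() {
  # Probe: does this sed accept the I flag?
  if [ "$(printf 'A\n' | sed 's/a/b/I' 2>/dev/null)" = "b" ]; then
    # GNU sed, or BSD sed on macOS 11 and later.
    sed "s/$1/$2/gI"
  else
    # Older BSD sed: fall back to Perl. The values travel through the
    # environment so the shell does not splice them into Perl code.
    CI_WORD=$1 CI_REPL=$2 perl -pe 's/$ENV{CI_WORD}/$ENV{CI_REPL}/gi'
  fi
}

printf '%s\n' 'FOO Foo foo' | ci_replace foo bar
```

Note that sed and Perl use different regex dialects, so anything beyond a literal word may behave differently in the two branches.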

Editor's note: This solution doesn't work on macOS (out of the box), because it only applies to GNU sed, whereas macOS comes with BSD sed.
Capitalize the 'I'.
sed 's/foo/bar/I' file

Another work-around for sed on Mac OS X is to install gsed from MacPorts or Homebrew and then create the alias sed='gsed'.

If you are doing pattern matching first, e.g.,
/pattern/s/xx/yy/g
then you want to put the I after the pattern:
/pattern/Is/xx/yy/g
Example:
echo Fred | sed '/fred/Is//willma/g'
returns willma; without the I, it returns the string untouched (Fred).

The sed FAQ addresses the closely related case-insensitive search. It points out that a) many versions of sed support a flag for it and b) it's awkward to do in sed, you should rather use awk or Perl.
But to do it in POSIX sed, they suggest three options (adapted for substitution here):
Convert to uppercase and store the original line in the hold space; this won't work for substitutions, though, as the original content will be restored before printing, so it's only good for inserting or appending lines based on a case-insensitive match.
Maybe the possibilities are limited to FOO, Foo and foo. These can be covered by
s/FOO/bar/;s/[Ff]oo/bar/
To search for all possible matches, one can use bracket expressions for each character:
s/[Ff][Oo][Oo]/bar/
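Writing the bracket expressions by hand gets tedious for longer words, so they can be generated. This is a minimal sketch; ci_pattern is a hypothetical helper name, and it assumes the word contains no regex metacharacters, since everything other than ASCII letters is passed through unchanged.

```shell
# ci_pattern WORD   (hypothetical helper)
# Prints WORD with every ASCII letter turned into a two-letter bracket
# expression, e.g. foo -> [Ff][Oo][Oo].
ci_pattern() {
  printf '%s\n' "$1" | awk '{
    out = ""
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c ~ /[A-Za-z]/)
        out = out "[" toupper(c) tolower(c) "]"
      else
        out = out c
    }
    print out
  }'
}

# Works with any POSIX sed, because no I flag is involved.
printf '%s\n' 'FOO Foo foo' | sed "s/$(ci_pattern foo)/bar/g"
```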

The Mac version of sed seems a bit limited. One way to work around this is to use a Linux container (via Docker) which has a usable version of sed:
cat your_file.txt | docker run -i busybox /bin/sed -r 's/[0-9]{4}/****/Ig'

Use the following to replace all occurrences:
sed 's/foo/bar/gI' mylog.txt

I had a similar need and came up with the following.
This command simply finds all the files:
grep -i -l -r foo ./*
this one to exclude this_shell.sh (in case you put the command in a script called this_shell.sh), tee the output to the console to see what happened, and then use sed on each file name found to replace the text foo with bar:
grep -i -l -r --exclude "this_shell.sh" foo ./* | tee /dev/fd/2 | while read -r x; do sed -b -i 's/foo/bar/gi' "$x"; done
I chose this method because I didn't like having the timestamps changed on files that were not modified. Feeding in the grep result means only the files containing the target text are touched (which likely improves performance as well).
Be sure to back up your files and test before using this. It may not work in some environments for files with embedded spaces. (?)
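A variant of the same idea that avoids the GNU-only -b and i flags might look like the sketch below. replace_in_tree is a hypothetical name, and foo / bar are the placeholders from above. Reading names with IFS= read -r copes with embedded spaces (though not with newlines in file names), and the matching files are collected before any editing starts so that grep never sees the temporary .bak files.

```shell
# replace_in_tree DIR   (hypothetical helper)
# Replaces foo, in any letter case, with bar in every file under DIR that
# contains it. Files without a match are never opened for writing.
replace_in_tree() {
  # Collect the matching file names first; grep exits non-zero if none match.
  files=$(grep -rli 'foo' "$1") || return 0
  printf '%s\n' "$files" | while IFS= read -r file; do
    # -i.bak is accepted by both GNU and BSD sed; remove the backup on success.
    sed -i.bak 's/[Ff][Oo][Oo]/bar/g' "$file" && rm -f "$file.bak"
  done
}

# Usage: replace_in_tree ./logs
```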

The following should be fine:
sed -i 's/foo/bar/gi' mylog.txt

Related

How to use sed to remove ./ between two characters in Unix shell

I am trying to remove ./ between two characters using sed but not getting the desired output.
Sample:
e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt
I tried the below but it is not working as expected, even the . in the ".txt" is getting removed.
sed -i 's/[./,]//g'
Beware: don't even think of using the -i option until you know the code is working. You can screw things up big time!
Use:
sed -e 's%[.]/%%g'
You can choose the delimiter in a s/// command, and when the regular expressions involve /, it is sensible to choose something else — I often use % when it doesn't figure in the text. The -e is optional. Using [.] to detect an actual dot is one way; you can write \. if you prefer, but I'm allergic to avoidable backslashes (if you've never had to write 16 backslashes in a row to get troff to do what you want, you haven't suffered enough).
Be aware that the -i option behaves differently in GNU sed and BSD (macOS) sed. Using -i.bak works in both (for an arbitrary, non-empty string such as .bak). Otherwise, your code isn't portable (which may or may not matter to you now, but might well do later on).
You have:
sed -i 's/[./,]//g'
The trouble with this is that it looks for any of the characters ., / or , in isolation — so it removes the . in .txt as well as the . and / in ./. You need to look for consecutive characters — as in my suggested solution.
Try this:
echo "e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt" | sed -e 's|\./||'
You need to escape the dot with a backslash (\):
's#\.\/##g'
$ echo "e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt" | sed 's#\.\/##g'
e2b66a3d84ee448c33d7f2a2f7e51c58 2017_06_10_0400.txt

I need to replace literal \n with newline in shell

We receive a file which is essentially an ssh token key.
This upon inception has values, say
Foo\nBarFoo,\nFoo\nBarFoo
Now, I want to replace these with
Foo
BarFoo,
Foo
BarFoo
I have tried sed and tr commands by copying the entire key in a variable.
One thing that seemed to work was : %s/\\n/\r/g, but this is not acceptable since I cannot open the vi editor.
I recently tried echo -e 'Foo\nBarFoo,\nFoo\nBarFoo', but I want something more subtle.
You are probably facing this problem because you are using OS X or BSD. With GNU sed, 's/\\n/\n/g' should have worked.
For POSIX sed, use this,
printf '%s\n' 'Foo\nBarFoo,\nFoo\nBarFoo' | sed 's/\\n/\
/g'
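If the backslash-newline form is awkward to embed in a script, awk's gsub can do the same conversion on one line. This is a sketch only; unescape_newlines is a hypothetical helper name, not something from the original answers.

```shell
# unescape_newlines   (hypothetical helper, reads stdin)
# Turns every literal two-character sequence \n into a real newline.
unescape_newlines() {
  # /\\n/ matches a backslash followed by n; "\n" in the replacement
  # string is a real newline.
  awk '{ gsub(/\\n/, "\n"); print }'
}

# printf with %s leaves the backslashes in the argument untouched.
printf '%s\n' 'Foo\nBarFoo,\nFoo\nBarFoo' | unescape_newlines
```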

Sed command to uppercase text between two specific strings

I want to parse a file and replace the text between "::" and ":::" with the text already there, just now capitalized.
I've tried using this command:
sed 's/\(::\)\(.*\)\(:::\)/\1\U\2\E\3/' filename
but the output just puts a U at the beginning and an E at the end of the string I want capitalized.
Works for me, which makes me think you may not be on Linux?
echo "This is :: some sample text ::: to test uppercasing" | sed 's/\(::\)\(.*\)\(:::\)/\1\U\2\E\3/'
This is :: SOME SAMPLE TEXT ::: to test uppercasing
If Perl is your option, you can say something like:
echo "This is :: some sample text ::: to test uppercasing" | perl -pe 's/(::)(.*)(:::)/\1\U\2\E\3/'
This is :: SOME SAMPLE TEXT ::: to test uppercasing
gawk '{match($0,/::.*:::/,a) ;gsub(/::.*::/,toupper(a[0]))}1' input
This is a slightly less cryptic solution using gawk: match() finds the desired string, and gsub() then replaces it with its uppercase form, produced by toupper().
You are pretty close.
On Mac OS X, you will need to install GNU sed, because the feature you are using - \U - is a GNU extension.
So, start by installing it:
▶ brew install gnu-sed
Then I normally stick in some code like this somewhere:
shopt -s expand_aliases
alias sed='/usr/local/bin/gsed'
And then your GNU sed will work.
Finally, I would simplify that code as:
▶ sed -E 's/(::)(.*)(::)/\1\U\2\E\3/' <<< "foo::bar::baz"
foo::BAR::baz
Noting that -E gives you Extended Regular Expressions, and a cleaner syntax when you are doing captures.
This might work for you (GNU sed):
sed 's/::[^:]*:::/\U&/' file
or perhaps:
sed 's/::[^:]*:::/\n&\n/;h;y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/;G;s/.*\n\(.*\)\n.*\n\(.*\)\n.*\n/\2\1/' file
This uses sed's native y (transliterate) command, pattern matching, and a copy kept in the hold space.
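For systems without GNU sed or gawk, the same effect can be sketched with POSIX awk's match(), which sets RSTART and RLENGTH. upper_between is a hypothetical helper name, and the sketch assumes at most one ::...::: span per line with no colon inside it.

```shell
# upper_between   (hypothetical helper, reads stdin)
# Uppercases the first "::text:::" span on each line.
upper_between() {
  awk '{
    if (match($0, /::[^:]*:::/)) {
      # Rebuild the line: text before the span, uppercased span, text after.
      $0 = substr($0, 1, RSTART - 1) \
           toupper(substr($0, RSTART, RLENGTH)) \
           substr($0, RSTART + RLENGTH)
    }
    print
  }'
}

printf '%s\n' 'This is :: some sample text ::: to test uppercasing' | upper_between
```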

SED adding e suffixed file when used in OSX?

Not sure what's going on. I replaced the stock sed with GNU sed, but I am somehow getting backup files. This is exactly what I am doing:
mkdir tmp && cd $_
echo 'test' > test.txt
ls
test.txt
sed -ie 's/test/replaced/g' test.txt
ls
test.txt
test.txte
What's going on here and how do I prevent this? It should edit in place, not create a backup
sed version (GNU sed) 4.2.2
As per the fantastic comments from mklement0 the POSIX spec tells us that:
With optional option-arguments, POSIX utility conventions require that (emphasis his) "a conforming application shall place any option-argument for that option directly adjacent to the option in the same argument string, without intervening characters."
Since GNU sed considers the suffix argument to -i to be optional, it requires the suffix to be cuddled up against the option, so when you write -ie, GNU sed interprets that as requesting a suffix of e for the -i option. (BSD sed would interpret it in the same manner for reasons that are explained in the additional info at the bottom.)
What this all means is that you need to use -i -e to get the behavior you want (for GNU sed) instead (for BSD sed you would need -i '' -e).
Additional details about an unfortunate but interesting distinction between GNU sed and BSD sed:
GNU sed and BSD sed (OSX sed) disagree on whether the suffix value for the -i argument is optional or mandatory.
This matters because, complementing the POSIX requirement above, we find the following in the spec as well (emphasis mklement0's again):
an option with a mandatory option-argument [...], a conforming application shall use separate arguments for that option and its option-argument. However, a conforming implementation shall also permit applications to specify the option and option-argument in the same argument string without intervening characters.
GNU sed considers the suffix optional, which is what causes the behavior above: it accepts the cuddled e as the suffix, whereas a separated -e (or anything else) is treated as a separate argument, not as the suffix.
BSD sed considers the suffix mandatory (even though it may be empty), which implies that the suffix should be separated from the option with a space (e.g. -i .bak or -i ''), though, as the "However" note indicates, BSD sed also allows any non-empty suffix to be cuddled up against the -i option.
This disagreement, as mklement0 points out and which comes up on SO every now and then, means that you cannot use an empty in-place edit suffix in a manner that is portable to both GNU and BSD versions of sed.
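One common way around this is to always pass a non-empty suffix, which both implementations accept when it is cuddled up against -i, and then delete the backup. A minimal sketch, where sed_inplace is a hypothetical wrapper name:

```shell
# sed_inplace SCRIPT FILE   (hypothetical wrapper)
# Edits FILE in place with both GNU and BSD sed and leaves no backup behind.
sed_inplace() {
  # -i.bak is parsed identically by both seds; the backup is removed only
  # if sed succeeded, so a failed edit leaves FILE.bak for inspection.
  sed -i.bak -e "$1" "$2" && rm -f "$2.bak"
}

# Usage: sed_inplace 's/test/replaced/g' test.txt
```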
As always, the man page helps:
-i extension:
Edit files in-place, saving backups with the specified extension. If a zero-length extension is given, no backup will be saved. It is not recommended to give a zero-length extension when in-place editing files, as you risk corruption or partial content in situations where disk space is exhausted, etc.
Your use of -ie adds e to the end. Remove that.

Insert line after match using sed

For some reason I can't seem to find a straightforward answer to this and I'm on a bit of a time crunch at the moment. How would I go about inserting a choice line of text after the first line matching a specific string using the sed command. I have ...
CLIENTSCRIPT="foo"
CLIENTFILE="bar"
And I want insert a line after the CLIENTSCRIPT= line resulting in ...
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Try doing this using GNU sed:
sed '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
if you want to substitute in-place, use
sed -i '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
Output
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Doc
see sed doc and search \a (append)
Note the standard sed syntax (as in POSIX, so supported by all conforming sed implementations around (GNU, OS/X, BSD, Solaris...)):
sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' file
Or on one line:
sed -e '/CLIENTSCRIPT=/a\' -e 'CLIENTSCRIPT2="hello"' file
(The -e expressions (and the contents of -f files) are joined with newlines to make up the script that sed interprets.)
The -i option for in-place editing is also a GNU extension; some other implementations (like FreeBSD's) support -i '' for that.
Alternatively, for portability, you can use perl instead:
perl -pi -e '$_ .= qq(CLIENTSCRIPT2="hello"\n) if /CLIENTSCRIPT=/' file
Or you could use ed or ex:
printf '%s\n' /CLIENTSCRIPT=/a 'CLIENTSCRIPT2="hello"' . w q | ex -s file
Sed command that works on MacOS (at least, OS 10) and Unix alike (ie. doesn't require gnu sed like Gilles' (currently accepted) one does):
sed -e '/CLIENTSCRIPT="foo"/a\'$'\n''CLIENTSCRIPT2="hello"' file
This works in bash and maybe in other shells that know the $'\n' quoting style. Everything can be on one line and work in older/POSIX sed commands. If there might be multiple lines matching CLIENTSCRIPT="foo" (or your equivalent) and you wish to add the extra line only the first time, you can rework it as follows:
sed -e '/^ *CLIENTSCRIPT="foo"/b ins' -e b -e ':ins' -e 'a\'$'\n''CLIENTSCRIPT2="hello"' -e ': done' -e 'n;b done' file
(this creates a loop after the line insertion code that just cycles through the rest of the file, never getting back to the first sed command again).
You might notice I added a '^ *' to the matching pattern in case that line shows up in a comment, say, or is indented. It's not 100% perfect, but it covers some other situations likely to be common. Adjust as required...
These two solutions also get round the problem (for the generic solution to adding a line) that if your new inserted line contains unescaped backslashes or ampersands they will be interpreted by sed and likely not come out the same, just like the \n is - eg. \0 would be the first line matched. Especially handy if you're adding a line that comes from a variable where you'd otherwise have to escape everything first using ${var//} before, or another sed statement etc.
This solution is a little less messy in scripts (though the quoting and \n are not easy to read) when you don't want to put the text for the a command at the start of a line, say, in a function with indented lines. I've taken advantage of the fact that $'\n' is evaluated to a newline by the shell, whereas \n in a regular single-quoted value is not.
It's getting long enough, though, that I think perl or even awk might win due to being more readable.
A POSIX compliant one using the s command:
sed '/CLIENTSCRIPT="foo"/s/.*/&\
CLIENTSCRIPT2="hello"/' file
Maybe a bit late to post an answer for this, but I found some of the above solutions a bit cumbersome.
I tried simple string replacement in sed and it worked:
sed 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
The & stands for the matched string; \n then adds a newline, followed by the new line of text. (Note that \n in the replacement is a GNU sed extension.)
As mentioned, if you want to do it in-place:
sed -i 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
Another thing. You can match using an expression:
sed -i 's/CLIENTSCRIPT=.*/&\nCLIENTSCRIPT2="hello"/' file
Hope this helps someone
The awk variant :
awk '1;/CLIENTSCRIPT=/{print "CLIENTSCRIPT2=\"hello\""}' file
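The question asks for an insert after the first match only. A sketch of that with awk follows; insert_after_first and the PATTERN / INSERT variable names are hypothetical. Passing the text through the environment rather than with -v means backslashes and ampersands in the inserted line are printed as-is.

```shell
# insert_after_first REGEX LINE   (hypothetical helper, reads stdin)
# Prints the input, adding LINE after the first line that matches REGEX
# (an awk extended regular expression).
insert_after_first() {
  PATTERN=$1 INSERT=$2 awk '
    { print }
    # Fires at most once because of the done flag.
    !done && $0 ~ ENVIRON["PATTERN"] { print ENVIRON["INSERT"]; done = 1 }
  '
}

printf '%s\n' 'CLIENTSCRIPT="foo"' 'CLIENTFILE="bar"' |
  insert_after_first 'CLIENTSCRIPT=' 'CLIENTSCRIPT2="hello"'
```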
I had a similar task, and was not able to get the above perl solution to work.
Here is my solution:
perl -i -pe "BEGIN{undef $/;} s/^\[mysqld\]$/[mysqld]\n\ncollation-server = utf8_unicode_ci\n/sgm" /etc/mysql/my.cnf
Explanation:
Uses a regular expression to search for a line in my /etc/mysql/my.cnf file that contained only [mysqld] and replaced it with
[mysqld]
collation-server = utf8_unicode_ci
effectively adding the collation-server = utf8_unicode_ci line after the line containing [mysqld].
I recently had to do this as well, on both macOS and Linux. After browsing many posts and trying many things, I never got to where I wanted to be: a solution that is simple to understand, uses well-known standard commands with simple patterns, fits on one line, is portable, and can be extended with more constraints. Then I looked at it from a different perspective and realized I could do without the one-liner requirement if a two-liner met the rest of my criteria. In the end I came up with this solution, which works on both Ubuntu and macOS:
insertLine=$(( $(grep -n "foo" sample.txt | cut -f1 -d: | head -1) + 1 ))
sed -i -e "$insertLine"' i\'$'\n''bar'$'\n' sample.txt
In first command, grep looks for line numbers containing "foo", cut/head selects 1st occurrence, and the arithmetic op increments that first occurrence line number by 1 since I want to insert after the occurrence.
In second command, it's an in-place file edit, "i" for inserting: an ansi-c quoting new line, "bar", then another new line. The result is adding a new line containing "bar" after the "foo" line. Each of these 2 commands can be expanded to more complex operations and matching.

Resources