I tried sed -i '/^/d' myfile and it deleted every line in the file. How can I avoid this? I want to remove all lines that contain ^.
sed -i '/\^/d' myfile
You need to escape the ^ special character.
In regular expressions, characters that are "special" lose their special meaning when they exist within a bracket expression (square brackets). So you'd think that a search for [^] would be what you need.
Alas, it turns out that while this works for the caret, the caret also gains a different special meaning when it is the first character of a bracket expression. It is used to negate the expressions. So [^] is actually invalid regex syntax, and this character still needs to be escaped.
What you're looking for, in GNU sed, might look like:
sed -i '/[\^]/d' myfile
This looks awkward (especially when compared to @threadp's answer), but I prefer the square bracket approach for escaping specials because it works the same way on all other special characters and its behaviour is consistent across regex parsers. Backslashes are used for other things: continuing lines in shell scripts, converting characters to specials (\n, \t, etc). Too many backslashes can make things confusing. One caveat: POSIX treats a backslash inside a bracket expression as a literal character, so [\^] also matches lines that contain a backslash.
One interesting thing to note is that the caret is only special within a bracket expression if it is the FIRST character. So the following works:
$ printf 'one\ntwo^\n' | sed -ne '/[X^]/p'
two^
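To compare the unescaped and escaped caret side by side, here is a minimal sketch. The sample input and the variable names are made up for illustration, and sed runs without -i so no file is touched.

```shell
# Sample input: only the second line contains a caret.
input=$(printf 'one\ntwo^\nthree\n')

# /^/ is the start-of-line anchor, which matches every line,
# so the d command deletes everything.
all_deleted=$(printf '%s\n' "$input" | sed '/^/d')

# /\^/ is a literal caret, so only the line "two^" is deleted.
caret_deleted=$(printf '%s\n' "$input" | sed '/\^/d')

printf 'after /^/d:  [%s]\n' "$all_deleted"
printf 'after /\\^/d:\n%s\n' "$caret_deleted"
```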
I expect the following to work:
ls -l | grep '^.{38}<some date>'
It should give me the files which have said date in modification time. But it does not work. The following works:
ls -l | grep '^.\{38\}<some date>'
Isn't '...' supposed to turn off special meaning for all the meta characters? Why should we have to escape braces?
The regular expression .{38}, as interpreted here by grep, matches an arbitrary string of exactly 38 characters. To match literal braces, you need to escape them.
.\{38\}
In order to ensure that that exact 7-character sequence is seen by grep, you need to quote the string so that the shell doesn't perform quote removal and reduce it to .{38} before grep gets a chance to see it.
I misunderstood the question at first. It appears grep is using basic regular expressions, in which unescaped braces are literal characters and escaped ones introduce a repetition bound. In extended regular expressions, it's the other way around. In either case, though, the single quotes are protecting all enclosed characters from special treatment by the shell; whether grep treats them specially is another question.
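The shell side of this can be checked without grep at all. The sketch below (variable names are illustrative) passes the same seven characters to printf with and without quotes and shows what the program actually receives.

```shell
# Unquoted: the shell removes the backslashes before the program runs,
# so the argument arrives as .{38}
unquoted=$(printf '%s' .\{38\})

# Single-quoted: the shell passes every character through untouched,
# so the argument arrives as .\{38\}
quoted=$(printf '%s' '.\{38\}')

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

What grep then does with the backslashes it receives is decided by its regex syntax, not by the shell.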
There are many variants of regular expression syntax. By default, grep uses the "basic" ("BRE" or "obsolete") regular expression syntax, in which braces must be escaped to be treated as repetition bounds (what you're trying to do here); without the escapes, they're treated as just literal characters. In the "extended" ("ERE" or "modern"), Perl-compatible ("PCRE"), and ... well, pretty much all other variants, it's the other way around: escaped braces are treated as literal characters, and unescaped ones define repetition bounds.
grep '^.{38}<some date>' # Matches any character followed by literal braces around "38"
grep '^.\{38\}<some date>' # Matches 38 characters
grep -E '^.{38}<some date>' # Matches 38 characters (-E invokes "extended" syntax)
egrep '^.{38}<some date>' # Matches 38 characters (egrep uses "extended" syntax)
BTW, parentheses are the same: literal unless escaped in the basic syntax, literal if escaped in the extended syntax. And there are a few other differences; see the re_format man page. There are also many other syntax variants (Perl-compatible, etc). It's important to know what variant the tool you're using accepts, and format your RE appropriately for it.
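A small sketch of the brace difference, using a made-up sample line with a 6-character prefix rather than real ls output. grep -c prints the number of matching lines; the variable names are assumptions for illustration.

```shell
# Hypothetical line: six characters, then a date.
sample='abcde 2024-01-01 file.txt'

# Basic syntax, escaped braces: a repetition bound, so this matches.
bre_escaped=$(printf '%s\n' "$sample" | grep -c '^.\{6\}2024' || true)

# Basic syntax, unescaped braces: literal { and }, so this does not match.
bre_plain=$(printf '%s\n' "$sample" | grep -c '^.{6}2024' || true)

# Extended syntax, unescaped braces: a repetition bound again.
ere_plain=$(printf '%s\n' "$sample" | grep -Ec '^.{6}2024' || true)

printf 'BRE escaped: %s\nBRE plain:   %s\nERE plain:   %s\n' \
  "$bre_escaped" "$bre_plain" "$ere_plain"
```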
BTW2, as @Charles Duffy pointed out in a comment, parsing ls output isn't a good idea. In this case, the number of characters before the date will depend on the width of other fields (user, group, size), which will not be consistent, so skipping 38 characters might skip part of the date field or not skip enough. You'd be much better off using something like find with the -mtime or -mmin tests, or at least using stat instead of ls, since you can control the fields with the format string and, for example, put the date at the beginning of the line. Note that stat will still have some of ls's other problems.
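As a rough illustration of the find alternative, the sketch below creates a throwaway directory (the directory and file names are hypothetical) and selects files modified within the last 24 hours with -mtime -1. It is only meant to show the shape of the command, not to replace the date match from the question exactly.

```shell
# Work in a temporary directory so nothing real is touched.
tmpdir=$(mktemp -d)
touch "$tmpdir/report.txt"

# -mtime -1 selects files whose data was modified less than one day ago.
recent=$(find "$tmpdir" -type f -mtime -1)
printf '%s\n' "$recent"

rm -rf "$tmpdir"
```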
I have two variables in a bash script:
$string = "The cat is green.\n"
$line = "Sunny day today.\n"
Each of those variables contains the characters "\n". How can I use sed to search and replace, like this:
sed 's/$string/$line/g' file.txt
This doesn't seem to work. If I erase the "\n" from the strings, sed works properly.
If I had only the text I could escape "\n" by adding a backslash:
sed 's/"The cat is green.\\n"/"Sunny day today.\\n"/g' file.txt
How can I do the search/replace when the variables contain "\n"?
Thank you for the help.
It looks like you are trying to match the two-character sequence \n, as opposed to the single newline character that together they represent in some contexts. There is a tremendous difference between these.
As part of your example, you presented
sed 's/$string/$line/g' file.txt
, but that won't work at all, because variable references are not expanded within single-quoted strings. That has nothing whatever to do with the values of shell variables string and line.
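The quoting point is easy to verify with a short sketch. The variables old and new and the sample text are made up for illustration.

```shell
old='cat'
new='dog'

# Single quotes: sed receives the literal text $old and $new,
# which does not occur in the input, so nothing changes.
single=$(printf 'the cat\n' | sed 's/$old/$new/')

# Double quotes: the shell expands the variables first,
# so sed receives s/cat/dog/.
double=$(printf 'the cat\n' | sed "s/$old/$new/")

printf 'single quotes: %s\ndouble quotes: %s\n' "$single" "$double"
```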
But let's consider those values:
string="The cat is green.\n"
line="Sunny day today.\n"
[Extra spaces and the leading $ removed, so that these are valid assignments.]
Of course, the problem you're focusing on is that sed recognizes \n as a code for a newline character, but you also have the problem that in a regular expression, the . character matches any character, so if you want it to be treated as a literal then it, too, needs to be escaped (in the pattern, but not in the replacement). If you're trying to support search and replace for arbitrary text, then there are other characters you'll need to escape, too.
Answering the question as posed (escaping only \n sequences) you might do this:
sed "s/${string//\\n/\\\\n}/${line//\\n/\\\\n}/g"
The ${foo//pat/repl} form of parameter expansion performs pattern substitution on the expanded value, but note well that the pattern (pat) is interpreted according to shell globbing rules, not as a regular expression. That specific form replaces every appearance of the pattern; read the bash manual for alternatives that match only the first appearance and/or that match only at the beginning or the end of the parameter's value. Note, too, the extra doubling of the \ characters in the pattern substitution -- they need to be escaped for the shell, too.
Given your variable definitions, that command would be equivalent to this:
sed 's/The cat is green.\\n/Sunny day today.\\n/g'
In other words, exactly what you wanted. Again, however, be warned: that is not a general solution for arbitrary search & replace. If you want that, then you'll want to study the sed manual to determine which characters need to be escaped in the regex, and which need to be escaped in the replacement. Moreover, I don't see a way to do it with just one pattern substitution for each variable.
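Here is a sketch of that substitution end to end. It needs bash, since ${var//pat/repl} is not available in plain sh, and the names pattern, replacement and input are assumptions for illustration. The single-quoted assignments hold the same values as the double-quoted ones in the question, because bash leaves \n alone inside double quotes.

```shell
#!/usr/bin/env bash
# Each value ends in a backslash followed by the letter n, not a newline.
string='The cat is green.\n'
line='Sunny day today.\n'

# Double every backslash-n pair so that sed sees an escaped backslash.
pattern="${string//\\n/\\\\n}"
replacement="${line//\\n/\\\\n}"

# Hypothetical input containing the literal two-character sequence \n.
input='Look: The cat is green.\n'
result=$(printf '%s\n' "$input" | sed "s/$pattern/$replacement/g")

printf 'pattern: %s\nresult:  %s\n' "$pattern" "$result"
```

The dots in the pattern are still regex wildcards here, so this remains the narrow fix for \n only.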
I have an SQL file (samplesqlfile) and I want to replace a string that contains backticks with another string. Below is the code.
actualtext="FROM sampledatabase.\`Datatype\`"
replacetext="FROM sampledatabase.\`Datatype_details\`"
sed -i "s/\<${actualtext}\>/${replacetext}/g" samplesqlfile
This is not working. The actual word to be replaced is
FROM sampledatabase.`Datatype`
I added backslashes to escape the backticks, but it is still not working. Please help.
Observe that this does not work:
$ sed "s/\<${actualtext}\>/${replacetext}/g" samplesqlfile
FROM sampledatabase.`Datatype`
But this does:
$ sed "s/\<${actualtext}/${replacetext}/g" samplesqlfile
FROM sampledatabase.`Datatype_details`
The problem was the \>. The string variable $actualtext does not end with a word character. It ends with a backtick. Consequently, \> will never match there. The solution is to remove \>.
To clarify, \> matches at the boundary between a word character and a non-word character where the word character appears first. Word characters are alphanumerics and underscores.
\> is a GNU extension. The behavior under BSD/OSX sed will be different.
For purposes of illustration here, I removed the -i option. For your intended use, of course, add it back.
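The same comparison can be reproduced without a file. This sketch assumes GNU sed, since \< and \> are GNU extensions, and feeds the line through a pipe instead of reading samplesqlfile; the result variable names are illustrative.

```shell
actualtext='FROM sampledatabase.`Datatype`'
replacetext='FROM sampledatabase.`Datatype_details`'

# With \> after the trailing backtick: the end-of-word boundary cannot
# match after a non-word character, so the line comes back unchanged.
with_boundary=$(printf '%s\n' "$actualtext" |
  sed "s/\<${actualtext}\>/${replacetext}/g")

# Without \>: the substitution applies.
without_boundary=$(printf '%s\n' "$actualtext" |
  sed "s/\<${actualtext}/${replacetext}/g")

printf '%s\n%s\n' "$with_boundary" "$without_boundary"
```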
I'm trying to write a script in bash which extracts a database name from a PHP file. For example I want to copy CRM_123456789 from the below line:
$sugar_config['dbconfig']['db_name'] = 'CRM_123456789';
I have tried using sed, so essentially I want to copy the text between
['db_name'] = '
and
';
sed -n '/['db_name'] = /,/';/p' myfile.php
However this does not return anything. Does anyone know what I'm doing wrong?
Thanks
You cannot nest single quotes. Your expression evaluates to single-quoted /[ next to unquoted db_name where clearly you want to match on a literal single quote.
One workaround is to use double quotes for the outermost quoting, but make sure you make any necessary changes, because double quotes are weaker than single quotes in the shell. In your case, there's nothing to change in that respect, though.
However, you also appear to misunderstand how sed address expressions work. They identify lines, not substrings on a line. So your script would print between a line matching ['db_name'] and a line matching ';. To extract something from within a line, the common idiom is to substitute out the parts you don't want, then print what's left.
Also, because an opening square bracket is a metacharacter in sed, you need to backslash-escape it to match it literally.
sed -n "s/.*\['db_name'] = '\([^']*\)'.*/\1/p" myfile.php
This matches up through ['db_name'] = ', then captures whatever is inside the single-quoted string into \1, then matches anything from the next single quote through the end of line, and substitutes it with just the captured string; and prints that line after performing the substitution.
If the config file supports variable whitespace, a useful improvement would be to allow for optional whitespace around the equals sign, and possibly also within the square brackets. [ ]* will match zero or more spaces (the square brackets aren't really necessary around a single space, but I include them here for legibility reasons).
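A sketch of the whitespace-tolerant variant, run against a made-up config line with extra spaces around the equals sign. The variable names are assumptions, and only whitespace around the equals sign is handled here.

```shell
# Hypothetical line with irregular spacing around "=".
config_line="\$sugar_config['dbconfig']['db_name']   =  'CRM_123456789';"

# [ ]* allows zero or more spaces on either side of the equals sign;
# \([^']*\) captures everything up to the closing single quote.
db_name=$(printf '%s\n' "$config_line" |
  sed -n "s/.*\['db_name'][ ]*=[ ]*'\([^']*\)'.*/\1/p")

printf '%s\n' "$db_name"
```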
You could try the below sed command.
$ sed -n "s/.*\['db_name'\] = '\([^']*\)';.*/\1/p" file
CRM_123456789
How do I use a regex with egrep?
Source:
exec pro..do_pr_ddd_sum 123039246, 995, 201705848
egrep '*pr_ddd_sum*123039246*995*' *
-- no result found
The command above does not return any results.
Perhaps you mean 'pr_ddd_sum.*123039246.*995'.
You're confusing shell wildcards with regular expression metacharacters. In the shell, a "*" matches any string of characters. In a regex, this metacharacter means zero or more of the preceding character. Look at Michael's suggestion. In it, the dot ('.') stands for any character, so '.*' means zero or more repetitions of any character.
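To make the difference concrete, this sketch runs both styles against the source line from the question, piped in rather than read from files. grep -E stands in for egrep, and the variable names are illustrative.

```shell
line='exec pro..do_pr_ddd_sum 123039246, 995, 201705848'

# Regex style: .* bridges the text between the fixed parts.
regex_style=$(printf '%s\n' "$line" |
  grep -Ec 'pr_ddd_sum.*123039246.*995' || true)

# Glob style: m* means "zero or more m", so the space before
# 123039246 is not covered and the line does not match.
glob_style=$(printf '%s\n' "$line" |
  grep -Ec 'pr_ddd_sum*123039246' || true)

printf 'regex style matches: %s\nglob style matches:  %s\n' \
  "$regex_style" "$glob_style"
```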