how to use egrep regex? - bash

source
exec pro..do_pr_ddd_sum 123039246, 995, 201705848
egrep '*pr_ddd_sum*123039246*995*' *
-- no result found
The command above returns no results.

Perhaps you mean 'pr_ddd_sum.*123039246.*995'.

You're confusing shell wildcards with regular expression metacharacters. In the shell, a "*" matches any string of characters. In a regex, this metacharacter means zero or more of the preceding character. Look at Michael's suggestion. In it, the dot ('.') stands for any character, so '.*' means zero or more repetitions of any character.
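A minimal sketch of the corrected pattern, run against the sample line from the question. It uses grep -E, the POSIX spelling of egrep; the variable names are only for illustration.

```shell
# Sample line copied from the question.
line='exec pro..do_pr_ddd_sum 123039246, 995, 201705848'

# In a regex, ".*" plays the role that "*" plays in a shell glob:
# it matches any run of characters (including none).
match=$(printf '%s\n' "$line" | grep -E 'pr_ddd_sum.*123039246.*995')

printf '%s\n' "$match"
```

The single quotes keep the shell from expanding the asterisks as a glob before grep sees them.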

Related

How to NOT replace a word with an asterisk at the end?

I specify the command below to find strings that match EXACTLY returnType but the command also searches for strings that contain returnType* and performs the substitution.
sed -e "/\<returnType\>/ s/word1/word2/" input.cpp
How do I make the command so that it ignores returnType*? Why is it performing a substitution on a returnType*?
I'm trying to distinguish between regular returnType and pointer returnType*, and it's not working.
I'd appreciate any help, thanks.
Append '[^*]' to the pattern to require anything but *.
sed -e "/\<returnType\>[^*]/ s/word1/word2/"
Note that '*' has no special meaning inside a bracket expression.
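A small sketch of that address on two made-up declarations, assuming GNU sed (the \< and \> word boundaries are GNU extensions). The file contents and variable names are hypothetical.

```shell
input=$(mktemp)
printf '%s\n' 'returnType word1 a();' 'returnType* word1 b();' > "$input"

# The substitution only runs on lines where returnType is followed
# by a character other than "*".
result=$(sed -e '/\<returnType\>[^*]/ s/word1/word2/' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because [^*] has to match some character, a returnType at the very end of a line would not be selected by this address.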

Brace needs to be escaped with \ inside single quotes

I expect the following to work:
ls -l | grep '^.{38}<some date>'
It should give me the files which have said date in modification time. But it does not work. The following works:
ls -l | grep '^.\{38\}<some date>'
Isn't '...' supposed to turn off special meaning for all the meta characters? Why should we have to escape braces?
The regular expression .{38}, as interpreted here by grep, matches an arbitrary string of exactly 38 characters. To match literal braces, you need to escape them.
.\{38\}
To ensure that grep sees that exact 7-character sequence, you need to quote the string so that the shell doesn't perform quote removal and reduce it to .{38} before grep gets a chance to see it.
I misunderstood the question. It appears grep is using basic regular expressions, in which unescaped braces are literal characters and escaped ones introduce a repetition bound. In extended regular expressions, it's the other way around. In either case, though, the single quotes protect all enclosed characters from special treatment by the shell; whether grep treats them specially is another question.
There are many variants of regular expression syntax. By default, grep uses the "basic" ("BRE" or "obsolete") regular expression syntax, in which braces must be escaped to be treated as repetition bounds (what you're trying to do here); without the escapes, they're treated as just literal characters. In the "extended" ("ERE" or "modern"), Perl-compatible ("PCRE"), and ... well, pretty much all other variants, it's the other way around: escaped braces are treated as literal characters, and unescaped ones define repetition bounds.
grep '^.{38}<some date>' # Matches any character followed by literal braces around "38"
grep '^.\{38\}<some date>' # Matches 38 characters
grep -E '^.{38}<some date>' # Matches 38 characters (-E invokes "extended" syntax)
egrep '^.{38}<some date>' # Matches 38 characters (egrep uses "extended" syntax)
BTW, parentheses are the same: literal unless escaped in the basic syntax, literal if escaped in the extended syntax. And there are a few other differences; see the re_format man page. There are also many other syntax variants (Perl-compatible, etc). It's important to know what variant the tool you're using accepts, and format your RE appropriately for it.
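The brace rules can be checked on a short made-up string instead of ls output. This is only a sketch: the sample text and the bound of 3 are assumptions chosen to keep the example small, and the literal-brace case relies on grep's basic syntax as described above.

```shell
sample='abcXYZ'

# Basic syntax: escaped braces are a repetition bound.
bre=$(printf '%s\n' "$sample" | grep -c '^.\{3\}XYZ')

# Extended syntax: unescaped braces are a repetition bound.
ere=$(printf '%s\n' "$sample" | grep -Ec '^.{3}XYZ')

# Basic syntax with unescaped braces: they are literal, so the line
# must really contain "{3}" after one arbitrary character.
literal=$(printf '%s\n' 'a{3}XYZ' | grep -c '^.{3}XYZ')

printf 'bre=%s ere=%s literal=%s\n' "$bre" "$ere" "$literal"
```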
BTW2, as @Charles Duffy pointed out in a comment, parsing ls output isn't a good idea. In this case, the number of characters before the date will depend on the width of other fields (user, group, size), which will not be consistent, so skipping 38 characters might skip part of the date field or not skip enough. You'd be much better off using something like find with the -mtime or -mmin tests, or at least using stat instead of ls (since you can control the fields with the format string, and e.g. put the date at the beginning of the line) (but stat will still have some of ls's other problems).
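As a sketch of the find alternative: the temporary directory and file name below are made up for the demonstration, and -mtime -1 selects files modified less than 24 hours ago.

```shell
dir=$(mktemp -d)
touch "$dir/fresh.txt"

# Select regular files by modification time instead of
# counting columns in ls output.
recent=$(find "$dir" -type f -mtime -1)

printf '%s\n' "$recent"
rm -rf "$dir"
```

For a specific date rather than a relative age, the test would need adjusting; this only shows the general shape.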

Delete all lines containing a caret (^)

I tried sed -i '/^/d' myfile and it deleted the entire file. How to avoid this? I want to remove all lines with ^ in it.
sed -i '/\^/d' myfile
You need to escape the ^ special character.
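A quick sketch of the difference on a throwaway file (the file contents are invented, and -i is left out so the results can be inspected):

```shell
file=$(mktemp)
printf '%s\n' 'keep me' 'drop ^ me' 'keep too' > "$file"

# Unescaped: "^" is the start-of-line anchor, which every line has,
# so every line is deleted.
all=$(sed '/^/d' "$file")

# Escaped: "\^" is a literal caret, so only the middle line is deleted.
kept=$(sed '/\^/d' "$file")

printf '%s\n' "$kept"
rm -f "$file"
```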
In regular expressions, characters that are "special" lose their special meaning when they exist within a bracket expression (square brackets). So you'd think that a search for [^] would be what you need.
Alas, the caret gains a different special meaning when it is the first character of a bracket expression: it negates the expression. So [^] is actually invalid regex syntax. A backslash placed in front of it is itself a literal character inside a bracket expression, so [\^] matches either a backslash or a caret.
If your file has no lines containing a backslash that you want to keep, what you're looking for, in GNU sed, might look like:
sed -i '/[\^]/d' myfile
This looks awkward (especially when compared to @threadp's answer), but I prefer the square bracket approach to escape specials because it works on all other special characters the same way and its behaviour is consistent across regex parsers. Backslashes are used for other things -- continuing lines in shell scripts, converting characters to specials (\n, \t, etc). Too many backslashes can make things confusing.
One interesting thing to note is that the caret is only special within a bracket expression if it is the FIRST character. So the following works:
$ printf 'one\ntwo^\n' | sed -ne '/[X^]/p'
two^

Unix shell replacing a word containing backtick in a file

I have an SQL file (samplesqlfile) and I want to replace a string that contains backticks with another string. Below is the code.
actualtext="FROM sampledatabase.\`Datatype\`"
replacetext="FROM sampledatabase.\`Datatype_details\`"
sed -i "s/\<${actualtext}\>/${replacetext}/g" samplesqlfile
This is not working. The actual word to be replaced is
FROM sampledatabase.`Datatype`
I added backslashes to escape the backticks, but it still does not work. Please help.
Observe that this does not work:
$ sed "s/\<${actualtext}\>/${replacetext}/g" samplesqlfile
FROM sampledatabase.`Datatype`
But this does:
$ sed "s/\<${actualtext}/${replacetext}/g" samplesqlfile
FROM sampledatabase.`Datatype_details`
The problem was the \>. The string variable $actualtext does not end with a word character. It ends with a backtick. Consequently, \> will never match there. The solution is to remove \>.
To clarify, \> matches at the boundary between a word character and a non-word character where the word character appears first. Word characters are alphanumerics and underscores.
\> is a GNU extension. The behavior under BSD/OSX sed will be different.
For purposes of illustration here, I removed the -i option. For your intended use, of course, add it back.
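The same comparison as a self-contained sketch, assuming GNU sed. Here the strings are defined with single quotes, which keeps the backticks literal without any backslashes, and the input is piped in rather than read from samplesqlfile.

```shell
actualtext='FROM sampledatabase.`Datatype`'
replacetext='FROM sampledatabase.`Datatype_details`'

# With \> after the closing backtick the pattern cannot match,
# because a backtick is not a word character.
with_boundary=$(printf '%s\n' "$actualtext" |
  sed "s/\<${actualtext}\>/${replacetext}/g")

# Without \> the substitution goes through.
without_boundary=$(printf '%s\n' "$actualtext" |
  sed "s/\<${actualtext}/${replacetext}/g")

printf '%s\n%s\n' "$with_boundary" "$without_boundary"
```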

egrep and grep difference with dollar

I'm having trouble understanding the different behaviors of grep and egrep when using \$ in a pattern.
To be more specific:
grep "\$this->db" file # works
egrep "\$this->db" file # does not work
egrep "\\$this->db" file # works
Can someone tell me why, or link to an explanation?
Thank you very much.
The backslash is being eaten by the shell's escape processing, so in the first two cases the regexp is just $this->db. The difference is that grep treats a $ that isn't at the end of the regexp as an ordinary character, but egrep always treats it as an anchor that matches the end of the line.
In the last case, the shell reduces \\ to a single backslash, but it then expands $this as a shell variable. If that variable is unset, egrep receives \->db, which matches the text ->db, so the command appears to work without ever matching the $. To send \$this->db to egrep, so that the $ is treated as an ordinary character, use single quotes ('\$this->db') or write "\\\$this->db".
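A minimal sketch of the difference, using single quotes so that the shell passes the pattern through unchanged and only the regex flavour varies. The sample file contents are invented, and grep -E stands in for egrep.

```shell
file=$(mktemp)
printf '%s\n' '$this->db->query();' 'other line' > "$file"

# Escaped dollar: a literal "$" in both basic and extended syntax.
bre=$(grep -c '\$this->db' "$file")
ere=$(grep -Ec '\$this->db' "$file")

# Unescaped dollar in extended syntax: an end-of-line anchor that can
# never be followed by more text, so no line matches.
# (grep exits non-zero when nothing matches, hence "|| true".)
unescaped=$(grep -Ec '$this->db' "$file" || true)

printf 'bre=%s ere=%s unescaped=%s\n' "$bre" "$ere" "$unescaped"
rm -f "$file"
```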
See man grep:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
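With extended regular expressions (enabled by egrep or grep -E), a metacharacter such as $ has to be escaped with a backslash to match literally. Inside double quotes the shell also treats the backslash as an escape character, so the backslash itself has to be protected (or single quotes used) for it to reach egrep.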
