Optimize sed on logcat - bash

I've been using grep and sed on some logcat output to make it more readable and I noticed my output was noticeably slower than just grep-ing the output.
I understand sed is obviously going to add more runtime, but I wanted to check for any optimization techniques.
My commands look something like this for reference:
adb logcat | grep arg | sed $'s/{/\\\n{/g'

The useless grep is well-documented and easy to get rid of.
adb logcat | sed $'/\\*arg/s/{/\\\n{/g'
To briefly reiterate the linked web page, anything that looks like grep 'x' | sed 'y' can be refactored to sed '/x/y' (and similarly for grep 'x' | awk 'y', which reduces to awk '/x/ y'). sed and Awk are both generalized regex tools which can do everything grep can do (though in fairness some complex grep options are tedious to reimplement in a sed or Awk script; but this is obviously not one of these cases).
However, *arg* is not a well-defined regex; so I have to guess what you actually mean.
* at the beginning of a regex isn't well-defined; but many grep implementations will understand it to mean a literal asterisk. If that's not what you meant, probably take away the first \\*.
arg* is exactly equivalent to ar; if you don't care whether there are g characters after the match, just don't specify them. But perhaps you actually meant arg followed by anything?
But then I guess you probably meant just arg (implicitly preceded by and followed by anything).
In case it's not already obvious, * is not a wildcard character in regex. Instead, it says to repeat the preceding expression as many times as possible, zero or more (and thus the way to say "any string at all" in regex is .*, i.e. the "any character (except newline)" wildcard character . repeated zero or more times).
Also, grep (and sed, and Awk) look for the regex anywhere in a line (unless you put in explicit regex anchors or use grep -x or equivalent options in sed or Awk) so you don't need to specify "preceded by anything" or "followed by anything".
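To make the point about * concrete, here is a small sketch that counts matching lines for a few patterns. The sample lines are made up for illustration; nothing here comes from the original logcat output.

```shell
#!/usr/bin/env bash
# Hypothetical sample input, one candidate string per line.
lines=$'ar\narg\nargggg\nbar none\nxyz'

# "arg*" means "ar" followed by zero or more "g", so it matches wherever "ar" does.
star_count=$(printf '%s\n' "$lines" | grep -c 'arg*')
ar_count=$(printf '%s\n' "$lines" | grep -c 'ar')

# Plain "arg" already matches anywhere in the line; ".*" padding adds nothing.
arg_count=$(printf '%s\n' "$lines" | grep -c 'arg')
padded_count=$(printf '%s\n' "$lines" | grep -c '.*arg.*')

echo "arg*: $star_count, ar: $ar_count, arg: $arg_count, .*arg.*: $padded_count"
```

The first two counts agree with each other, and so do the last two, which is the whole argument for writing just arg.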
The Bash "C-style string" $'...' offers some conveniences, but also requires any literal backslash to be doubled. So $'/\\*/' is equivalent to '/\*/' in regular single quotes.
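The slowdown is probably due to buffering rather than to sed itself: when grep writes to a pipe instead of a terminal, it typically collects its output in blocks before passing it on. Getting rid of the useless grep coincidentally gets rid of that buffering too.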
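As a rough sketch of the refactoring, the following compares the two-process pipeline with a single sed on a few made-up lines (they stand in for adb logcat output, which is an assumption for the sake of a runnable example). One detail worth spelling out: an address like /arg/ only limits where the substitution applies, so if you also want non-matching lines dropped the way grep drops them, you need something like /arg/!d.

```shell
#!/usr/bin/env bash
# Hypothetical sample lines standing in for `adb logcat` output.
sample=$'I/app: arg {a=1}{b=2}\nD/other: no match {x}\nW/app: args {c=3}'

# Two processes: grep filters, then sed starts a new line before each "{".
two_step=$(printf '%s\n' "$sample" | grep 'arg' | sed $'s/{/\\\n{/g')

# One process: delete lines without "arg", then do the same substitution.
one_step=$(printf '%s\n' "$sample" | sed -e '/arg/!d' -e $'s/{/\\\n{/g')

printf '%s\n' "$one_step"
```

Both variables end up with the same text, so the grep can go without changing what you see.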

Related

Remove word from url

I need to remove /%(tenant_id)s from this source:
https://ext.an1.test.dev:8776/v3/%(tenant_id)s
To make it look like this:
https://ext.an1.test.dev:8776/v3
I'm trying through sed, but unsuccessfully.
curl ....... | jq -r .endpoints[].url | grep '8776/v3' | sed -e 's/[/%(tenant_id)s] //g'
I get it again:
https://ext.an1.test.dev:8776/v3/%(tenant_id)s
You seem to be confused about the meaning of square brackets.
curl ....... |
jq -r '.endpoints[].url' |
sed -n '\;8776/v3;s;/%.*;;p'
fixes the incorrect regex, loses the useless grep, and somewhat simplifies the processing by switching to a different delimiter. To protect against (fairly unlikely) shell wildcard matches on the text in the jq search expression, I also added single quotes around that.
In some more detail, sed -n avoids printing input lines, and the address expression \;8776/v3; selects only input lines which match the regex 8776/v3; we use ; as the delimiter around the regex, which (somewhat obscurely) requires the starting delimiter to be backslashed. Then, we perform the substitution: again, we use ; as the delimiter so that slashes and percent signs in the regex do not need to be escaped. The p flag on the substitution causes sed to print lines where the substitution was performed successfully; we remove the g flag, as we don't expect more than one match per input line. The substitution replaces everything after the first occurrence of /% with nothing.
(Equivalently, with slash delimiters, you would have to backslash all literal slashes: sed -n '/8776\/v3/s/\/%.*//p'.)
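Here is a self-contained way to try this without curl or jq. The first URL is the one from the question; the other two are invented so that there is something for the address to filter out.

```shell
#!/usr/bin/env bash
# First URL is from the question; the other two are hypothetical filler.
urls=$'https://ext.an1.test.dev:8776/v3/%(tenant_id)s\nhttps://ext.an1.test.dev:8774/v2.1\nhttps://ext.an1.test.dev:8776/v2/%(tenant_id)s'

# ";" as delimiter: select lines containing 8776/v3, cut from "/%" to end of line, print.
semi=$(printf '%s\n' "$urls" | sed -n '\;8776/v3;s;/%.*;;p')

# Same thing with "/" as delimiter, where every literal slash needs a backslash.
slash=$(printf '%s\n' "$urls" | sed -n '/8776\/v3/s/\/%.*//p')

printf '%s\n' "$semi"
```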
For the record, square brackets in regular expressions form a character class; the expression [abc] matches a single character which can be one of a, b, or c. Perhaps review the tips on the Stack Overflow regex tag info page for a quick rerun on this and other common beginner mistakes.
Besides the incorrect square brackets, your regex specified a space after s, which is unlikely to be there. Other than that, your regex should work fine if you are sure the string you want to remove is always exactly /%(tenant_id)s. (Many regex dialects require round parentheses to be escaped, but sed without -E or -r is not one of those.)
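If you do want to remove exactly /%(tenant_id)s and nothing else, a sketch along these lines should do, with the parentheses left bare in a basic regex and backslashed under -E. The trailing $ anchor is my addition, on the assumption that the suffix is always at the end of the line.

```shell
#!/usr/bin/env bash
url='https://ext.an1.test.dev:8776/v3/%(tenant_id)s'

# Basic regex (sed default): ( and ) are ordinary characters.
bre=$(printf '%s\n' "$url" | sed 's;/%(tenant_id)s$;;')

# Extended regex (-E): ( and ) would form a group, so they are escaped.
ere=$(printf '%s\n' "$url" | sed -E 's;/%\(tenant_id\)s$;;')

printf '%s\n%s\n' "$bre" "$ere"
```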
If you've managed to get the address into a variable then one parameter expansion idea:
$ myaddr='https://ext.an1.test.dev:8776/v3/%(tenant_id)s'
$ echo "${myaddr%/*}"
https://ext.an1.test.dev:8776/v3
$ mynewaddr="${myaddr%/*}"
$ echo "${mynewaddr}"
https://ext.an1.test.dev:8776/v3

How to use sed to remove ./ between two characters in Unix shell

I am trying to remove ./ between two characters using sed but not getting the desired output.
Sample:
e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt
I tried the below but it is not working as expected, even the . in the ".txt" is getting removed.
sed -i 's/[./,]//g'
Beware: don't even think of using the -i option until you know the code is working. You can screw things up big time!
Use:
sed -e 's%[.]/%%g'
You can choose the delimiter in a s/// command, and when the regular expressions involve /, it is sensible to choose something else — I often use % when it doesn't figure in the text. The -e is optional. Using [.] to detect an actual dot is one way; you can write \. if you prefer, but I'm allergic to avoidable backslashes (if you've never had to write 16 backslashes in a row to get troff to do what you want, you haven't suffered enough).
Be aware that the -i option behaves differently in GNU sed and BSD (macOS) sed. Using -i.bak works in both (for an arbitrary, non-empty string such as .bak). Otherwise, your code isn't portable (which may or may not matter to you now, but might well do later on).
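A cautious workflow might look like the sketch below: run the command without -i first to see what it would do, then edit in place with a backup suffix. The temporary directory and the file name sums.txt are made up for the example.

```shell
#!/usr/bin/env bash
# Scratch file in a temporary directory (hypothetical name).
tmpdir=$(mktemp -d)
file="$tmpdir/sums.txt"
printf '%s\n' 'e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt' > "$file"

# Dry run: prints the result, leaves the file alone.
sed 's%[.]/%%g' "$file"

# In-place edit; the original is kept as sums.txt.bak.
sed -i.bak 's%[.]/%%g' "$file"

edited=$(cat "$file")
backup=$(cat "$file.bak")
rm -rf "$tmpdir"
```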
You have:
sed -i 's/[./,]//g'
The trouble with this is that it looks for any of the characters ., / or , in isolation — so it removes the . in .txt as well as the . and / in ./. You need to look for consecutive characters — as in my suggested solution.
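To see the difference side by side, this sketch runs both expressions on the sample line from the question. I use % as the delimiter for both so that the slash inside the brackets causes no parsing surprises.

```shell
#!/usr/bin/env bash
line='e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt'

# Character class: removes every ".", "/" and "," on its own, including the dot in ".txt".
per_char=$(printf '%s\n' "$line" | sed 's%[./,]%%g')

# Sequence: removes only a dot immediately followed by a slash.
sequence=$(printf '%s\n' "$line" | sed 's%[.]/%%g')

printf '%s\n%s\n' "$per_char" "$sequence"
```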
Try this:
echo "e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt" | sed -e 's|\./||'
You need to escape the dot with the escape character \ so that it matches literally:
's#\.\/##g'
$ echo "e2b66a3d84ee448c33d7f2a2f7e51c58 ./2017_06_10_0400.txt" | sed 's#\.\/##g'
e2b66a3d84ee448c33d7f2a2f7e51c58 2017_06_10_0400.txt

Grepping for exact string while ignoring regex for dot character

So here's my issue. I need to develop a small bash script that can grep a file containing account names (let's call it file.txt). The contents would be something like this:
accounttest
account2
account
accountbtest
account.test
Matching an exact line SHOULD be easy but apparently it's really not.
I tried:
grep "^account$" file.txt
The output is:
account
So in this situation the output is OK, only "account" is displayed.
But if I try:
grep "^account.test$" file.txt
The output is:
accountbtest
account.test
So the next obvious solution that comes to mind, in order to stop interpreting the dot character as "any character", is using fgrep, right?
fgrep account.test file.txt
The output, as expected, is correct this time:
account.test
But what if I try now:
fgrep account file.txt
Output:
accounttest
account2
account
accountbtest
account.test
This time the output is completely wrong, because I can't use the beginning/end line characters with fgrep.
So my question is, how can I properly grep a whole line, including the beginning and end of line special characters, while also matching exactly the "." character?
EDIT: Please note that I do know that the "." character needs to be escaped, but in my situation, escaping is not an option, because of further processing that needs to be done to the account name, which would make things too complicated.
The . is a special character in regex notation which needs to be escaped to match it as a literal string when passing to grep, so do
grep "^account\.test$" file.txt
Or if you cannot afford to modify the search string use the -F flag in grep to treat it as literal string and not do any extra processing in it
grep -Fx 'account.test' file.txt
From man grep
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings (instead of regular expressions), separated by newlines, any of which is to be matched.
-x, --line-regexp
Select only those matches that exactly match the whole line. For a regular expression pattern, this is like parenthesizing the pattern and then surrounding it with ^ and $.
fgrep is the same as grep -F. grep also has the -x option which matches against whole lines only. You can combine these to get what you want:
grep -Fx account.test file.txt
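For a quick check against the account list from the question, something like this sketch works without touching the search string. The variable name is hypothetical; it just stands in for wherever your script keeps the account name.

```shell
#!/usr/bin/env bash
# Account list from the question.
accounts=$'accounttest\naccount2\naccount\naccountbtest\naccount.test'

# Regex match: the unescaped dot also matches the "b" in accountbtest.
regex_hits=$(printf '%s\n' "$accounts" | grep '^account.test$')

# Fixed string, whole line: the account name is passed through unmodified.
name='account.test'   # hypothetical variable holding the unescaped name
fixed_hits=$(printf '%s\n' "$accounts" | grep -Fx -e "$name")
short_hits=$(printf '%s\n' "$accounts" | grep -Fx -e 'account')

printf 'regex:\n%s\nfixed:\n%s\n' "$regex_hits" "$fixed_hits"
```

Passing the name after -e keeps grep from mistaking it for an option should it ever start with a dash.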

recursively replace text with sed

I want to use sed to replace each occurrence of a particular text in a full source file tree. I've attempted the following:
$ grep -rlI name2port\(\"Wan1\"\) . --exclude-dir=.svn --exclude=*.vxs | xargs sed -i 's/name2port\(\"Wan1\"\)/T_PORT_ID_WAN1/g'
but it doesn't seem to work; I think my sed command isn't correct. How do I do this?
The problem is that the replacements just do not happen.
I tried this:
$ sed -i 's/name2port\(\"Wan1\"\)/T_PORT_ID_WAN1/g' ./rtos_core/jpax_switch/api/src/nms/switch_l3_route.c
but it turns out the occurrences of name2port("Wan1") are not replaced.
sed uses BREs (basic regular expressions) by default, which, for historical reasons - and surprisingly for someone used to modern regular expressions - require escaping of certain metacharacters in order to be recognized as such.
In BREs, ( and ) are ordinary (literal) characters, and only become special when \-escaped.
Therefore, to match literal name2port("Wan1"), use that literal as-is in a BRE (given that you also don't need to \-escape " instances):
sed -i 's/name2port("Wan1")/T_PORT_ID_WAN1/g' ./rtos_core/jpax_switch/api/src/nms/switch_l3_route.c
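Plugged back into the recursive pipeline, it could look roughly like this sketch. It builds a throwaway tree so it can be run safely; the file names are invented, grep -F is my substitution for the backslash-quoted pattern (it searches for the text literally), and --exclude-dir assumes a grep that supports it, as the one in the question evidently does. File names containing whitespace would need extra care with xargs.

```shell
#!/usr/bin/env bash
# Throwaway source tree with hypothetical file names.
tree=$(mktemp -d)
mkdir -p "$tree/src" "$tree/.svn"
printf '%s\n' 'port = name2port("Wan1");' > "$tree/src/route.c"
printf '%s\n' 'port = name2port("Wan1");' > "$tree/.svn/route.c"
printf '%s\n' 'int unrelated;' > "$tree/src/other.c"

# List files containing the literal text, skipping .svn, and edit each in place.
grep -rlF --exclude-dir=.svn 'name2port("Wan1")' "$tree" |
  xargs sed -i.bak 's/name2port("Wan1")/T_PORT_ID_WAN1/g'

edited=$(cat "$tree/src/route.c")
skipped=$(cat "$tree/.svn/route.c")
rm -rf "$tree"
```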
If you're not concerned about portability, you can use -r (or -E for limited portability to macOS, though with -i that won't work), which then enables EREs (extended regular expressions), whose syntax and features are more likely to work as you expect:
sed -r -i 's/name2port\("Wan1"\)/T_PORT_ID_WAN1/g' ./rtos_core/jpax_switch/api/src/nms/switch_l3_route.c
Note how literals ( and ) now do need to be \-escaped, lest they be interpreted as enclosing a capture group.
In this particular case, it is the BRE that requires less escaping than the ERE; generally, though, it is the opposite.
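The three cases can be compared on a single made-up line of C (the line itself is an assumption; only the name2port("Wan1") text comes from the question):

```shell
#!/usr/bin/env bash
# Hypothetical source line containing the text to replace.
src='port = name2port("Wan1");'

# BRE, parentheses unescaped: they are literal, so the text matches.
bre=$(printf '%s\n' "$src" | sed 's/name2port("Wan1")/T_PORT_ID_WAN1/g')

# BRE, parentheses escaped: \( \) form a group, so the regex looks for
# name2port"Wan1" without parentheses and finds nothing to replace.
bre_escaped=$(printf '%s\n' "$src" | sed 's/name2port\("Wan1"\)/T_PORT_ID_WAN1/g')

# ERE (-E): escaping is now what makes the parentheses literal.
ere=$(printf '%s\n' "$src" | sed -E 's/name2port\("Wan1"\)/T_PORT_ID_WAN1/g')

printf '%s\n%s\n%s\n' "$bre" "$bre_escaped" "$ere"
```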

grep pipe searching for one word, not line

For some reason I cannot get this to output just the version of this line. I suspect it has something to do with how grep interprets the dash.
This command:
admin#DEV:~/TEMP$ sendemail
Yields the following:
sendemail-1.56 by Brandon Zehm
More output below omitted
The first line is of interest. I'm trying to store the version to variable.
TESTVAR=$(sendemail | grep '\s1.56\s')
Does anyone see what I am doing wrong? Thanks
TESTVAR is just empty. Even without TESTVAR, the output is empty.
I just tried the following too, thinking this might work.
sendemail | grep '\<1.56\>'
I just tried it again while editing, and I think I have another issue. Perhaps I'm not handling the output correctly. It's outputting the entire line, but I can see that grep is finding 1.56 because it highlights it in the line.
$ TESTVAR=$(echo 'sendemail-1.56 by Brandon Zehm' | grep -Eo '1.56')
$ echo $TESTVAR
1.56
The point is grep -Eo '1.56'
from grep man page:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
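A small sketch of -o in action on the line from the question. Two things here are my own additions: the dot is escaped so that it only matches a real dot, and the second pattern is one possible guess at what a version number looks like (digits, dot, digits) for the case where you don't know it in advance.

```shell
#!/usr/bin/env bash
line='sendemail-1.56 by Brandon Zehm'

# Known version; \. matches only a literal dot.
known=$(printf '%s\n' "$line" | grep -o '1\.56')

# Unknown version: assumed to be digits, a dot, then more digits.
any=$(printf '%s\n' "$line" | grep -Eo '[0-9]+\.[0-9]+')

printf '%s\n%s\n' "$known" "$any"
```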
Your regular expression doesn't match the form of the version. You have specified that the version is surrounded by spaces, yet in front of it you have a dash.
Replace the first \s with the capitalized form \S, or with an explicit set of characters, and it should work.
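You can verify the mismatch by counting matching lines, as in this sketch. I use the POSIX class [[:space:]] in place of \s, and a literal dash as the "explicit set of characters"; both are my choices for the example.

```shell
#!/usr/bin/env bash
line='sendemail-1.56 by Brandon Zehm'

# Whitespace required on both sides: the dash in front of 1.56 prevents a match.
# (grep exits non-zero when nothing matches, hence the "|| true".)
spaces_both=$(printf '%s\n' "$line" | grep -c '[[:space:]]1\.56[[:space:]]' || true)

# Dash in front, whitespace behind: this is what the line actually looks like.
dash_then_space=$(printf '%s\n' "$line" | grep -c -e '-1\.56[[:space:]]' || true)

echo "spaces on both sides: $spaces_both, dash then space: $dash_then_space"
```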
I'm wondering: In your example you seem to know the version (since you grep for it), so you could just assign the version string to the variable. I assume that you want to obtain any (unknown) version string there. The regular expression for this in sed could be (using POSIX character classes):
sendemail |sed -n -r '1 s/sendemail-([[:digit:]]+\.[[:digit:]]+).*/\1/ p'
The -n suppresses the normal default output of every line; -r enables extended regular expressions; the leading 1 tells sed to only work on line 1 (I assume the version appears in the first line). I anchored the version number to the telltale string sendemail- so that potential other numbers elsewhere in that line are not matched. If the program name changes or the hyphen goes away in future versions, this wouldn't match any longer though.
Both the grep solution above and this one have the disadvantage of reading the whole output, which (as emails go these days) may be long. In addition, grep would find all other lines in the program's output which contain the pattern (if it's indeed emails, somebody might discuss this problem in them, with examples!). If the version is indeed on the first line, piping through head -n 1 first would be efficient and prudent.
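Two ways to stop after the first line, sketched on stand-in text. Only the first line below is from the question; the second is invented to show that a later version-like string is ignored. The second variant has sed itself quit after line 1 instead of using head, and I write -E where the command above uses -r (on GNU sed they mean the same thing).

```shell
#!/usr/bin/env bash
# First line from the question; second line is hypothetical.
output=$'sendemail-1.56 by Brandon Zehm\nUsage: a later line mentioning sendemail-9.99'

# Variant 1: let head cut the input down to the first line.
via_head=$(printf '%s\n' "$output" | head -n 1 |
  sed -n -E 's/sendemail-([[:digit:]]+\.[[:digit:]]+).*/\1/p')

# Variant 2: sed handles line 1 and then quits without reading further.
via_quit=$(printf '%s\n' "$output" |
  sed -n -E -e '1s/sendemail-([[:digit:]]+\.[[:digit:]]+).*/\1/p' -e '1q')

printf '%s\n%s\n' "$via_head" "$via_quit"
```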
jayadevan#jayadevan-Vostro-2520:~$ echo $sendmail
sendemail-1.56 by Brandon Zehm
jayadevan#jayadevan-Vostro-2520:~$ echo $sendmail | cut -f2 -d "-" | cut -f1 -d" "
1.56
