Search all occurences of a instance ids in the variable

Search all occurences of a instance ids in the variable - bash

I have a bash variable which has the following content:
SSH exit status 255 for i-12hfhf578568tn
i-12hdfghf578568tn is able to connect
i-13456tg is not able to connect
SSH exit status 255 for 1.2.3.4
I want to search the string starting with i- and then extract only that instance id. So, for the above input, I want to have output like below:
i-12hfhf578568tn
i-12hdfghf578568tn
i-13456tg
I am open to use grep, awk, sed.
I am trying to achieve my task by using following command but it gives me whole line:
grep -oE 'i-.*'<<<$variable
Any help?

You can just change your grep command to:
grep -oP 'i-[^\s]*' <<<$variable
Tested on your input:
$ cat test
SSH exit status 255 for i-12hfhf578568tn
i-12hdfghf578568tn is able to connect
i-13456tg is not able to connect
SSH exit status 255 for 1.2.3.4
$ var=`cat test`
$ grep -oP 'i-[^\s]*' <<<$var
i-12hfhf578568tn
i-12hdfghf578568tn
i-13456tg
grep is exactly what you need for this task, sed would be more suitable if you had to reformat the input and awk would be nice if you had either to reformat a string or make some computation of some fields in the rows, columns
Explanation:
-P is to use perl regex
i-[^\s]* is a regex that will match literally i- followed by 0 to N non space character, you could change the * by a + if you want to impose that there is at least 1 char after the - or you could use {min,max} syntax to impose a range.
Let me know if there is something unclear.
Bonus:
Following the comment of Sundeep, you can use one of the improved versions of the regex I have proposed (the first one does use PCRE and the second one posix regex):
grep -oP 'i-\S*' <<<$var
or
grep -o 'i-[^[:blank:]]*' <<<$var

You could use following too(I tested it with GNU awk):
echo "$var" | awk -v RS='[ |\n]' '/^i-/'

You can also use this code (Tested in unix)
echo $test | grep -o "i-[0-z]*"
Here,
-o # Prints only the matching part of the lines
i-[0-z]* # This regular expression, matches all the alphabetical and numerical characters following 'i-'.

Related

Finding a particular string from a file

I have a file that contains one particular string many times. How can I print all occurrences of the string on the screen along with letters following that word till the next space is encountered.
Suppose a line contained:
example="123asdfadf" foo=bar
I want to print example="123asdfadf".
I had tried using less filename | grep -i "example=*" but it was printing the complete lines in which example appeared.

$ grep -o "example[^ ]*" foo
example="abc"
example="123asdfadf"

Since -o is only supported by GNU grep, a portable solution would be to use sed:
sed -n 's/.*\(example=[^[:space:]]*\).*/\1/p' file

grep exact pattern from a file in bash

I have the following IP addresses in a file
3.3.3.1
3.3.3.11
3.3.3.111
I am using this file as input file to another program. In that program it will grep each IP address. But when I grep the contents I am getting some wrong outputs.
like
cat testfile | grep -o 3.3.3.1
but I am getting output like
3.3.3.1
3.3.3.1
3.3.3.1
I just want to get the exact output. How can I do that with grep?

Use the following command:
grep -owF "3.3.3.1" tesfile
-o returns the match only and not the whole line.-w greps for whole words, meaning the match must be enclosed in non word chars like <space>, <tab>, ,, ; the start or the end of the line etc. It prevents grep from matching 3.3.3.1 out of 3.3.3.111.
-F greps for fixed strings instead of patterns. This prevents the . in the IP address to be interpreted as any char, meaning grep will not match 3a3b3c1 (or something like this).

To match whole words only, use grep -ow 3.3.3.1 testfile
UPDATE: Use the solution provided by hek2mgl as it is more robust.

You may use anhcors.
grep '^3\.3\.3\.1$' file
Since by default grep uses regex, you need to escape the dots in-order to make grep to match literal dot character.

Getting max version by file name

I need to write a shell script that does the following:
In a given folder with files that fit the pattern: update-8.1.0-v46.sql I need to find the maximum version
I need to write the maximum version I've found into a configuration file
For 1, I've found the following answer: Shell script: find maximum value in a sequence of integers without sorting
The only problem I have is that I can't get down to a list of only the versions,
I tried:
ls | grep -o "update-8.1.0-v\(\d*\).sql"
but I get the entire file name in return and not just the matching part
Any ideas?
Maybe move everything to awk?
I ended up using:
SCHEMA=`ls database/targets/oracle/ | grep -o "update-$VERSION-v.*.sql" | sed "s/update-$VERSION-v\([0-9]*\).sql/\1/p" | awk '$0>x{x=$0};END{print x}'`
based on dreamer's answer

you can use sed for this:
echo "update-8.1.0-v46.sql" | sed 's/update-8.1.0-v\([0-9]*\).sql/\1/p'
The output in this case will be 46

grep isn't really the best tool for extracting captured matches, but you can use look-behind assertions if you switch it to use perl-like regular expressions. Anything in the assertion will not be printed when using the -o flag.
ls | grep -Po "(?<=update-8.1.0-v)\d+"
46

Grep characters before and after match?

Using this:
grep -A1 -B1 "test_pattern" file
will produce one line before and after the matched pattern in the file. Is there a way to display not lines but a specified number of characters?
The lines in my file are pretty big so I am not interested in printing the entire line but rather only observe the match in context. Any suggestions on how to do this?

3 characters before and 4 characters after
$> echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}'
23_string_and

grep -E -o ".{0,5}test_pattern.{0,5}" test.txt
This will match up to 5 characters before and after your pattern. The -o switch tells grep to only show the match and -E to use an extended regular expression. Make sure to put the quotes around your expression, else it might be interpreted by the shell.

You could use
awk '/test_pattern/ {
match($0, /test_pattern/); print substr($0, RSTART - 10, RLENGTH + 20);
}' file

You mean, like this:
grep -o '.\{0,20\}test_pattern.\{0,20\}' file
?
That will print up to twenty characters on either side of test_pattern. The \{0,20\} notation is like *, but specifies zero to twenty repetitions instead of zero or more.The -o says to show only the match itself, rather than the entire line.

I'll never easily remember these cryptic command modifiers so I took the top answer and turned it into a function in my ~/.bashrc file:
cgrep() {
# For files that are arrays 10's of thousands of characters print.
# Use cpgrep to print 30 characters before and after search pattern.
if [ $# -eq 2 ] ; then
# Format was 'cgrep "search string" /path/to/filename'
grep -o -P ".{0,30}$1.{0,30}" "$2"
else
# Format was 'cat /path/to/filename | cgrep "search string"
grep -o -P ".{0,30}$1.{0,30}"
fi
} # cgrep()
Here's what it looks like in action:
$ ll /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
-rw-r--r-- 1 rick rick 25780 Jul 3 19:05 /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
$ cat /tmp/rick/scp.Mf7UdS/Mf7UdS.Source | cgrep "Link to iconic"
1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri
$ cgrep "Link to iconic" /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri
The file in question is one continuous 25K line and it is hopeless to find what you are looking for using regular grep.
Notice the two different ways you can call cgrep that parallels grep method.
There is a "niftier" way of creating the function where "$2" is only passed when set which would save 4 lines of code. I don't have it handy though. Something like ${parm2} $parm2. If I find it I'll revise the function and this answer.

With gawk , you can use match function:
x="hey there how are you"
echo "$x" |awk --re-interval '{match($0,/(.{4})how(.{4})/,a);print a[1],a[2]}'
ere are
If you are ok with perl, more flexible solution : Following will print three characters before the pattern followed by actual pattern and then 5 character after the pattern.
echo hey there how are you |perl -lne 'print "$1$2$3" if /(.{3})(there)(.{5})/'
ey there how
This can also be applied to words instead of just characters.Following will print one word before the actual matching string.
echo hey there how are you |perl -lne 'print $1 if /(\w+) there/'
hey
Following will print one word after the pattern:
echo hey there how are you |perl -lne 'print $2 if /(\w+) there (\w+)/'
how
Following will print one word before the pattern , then the actual word and then one word after the pattern:
echo hey there how are you |perl -lne 'print "$1$2$3" if /(\w+)( there )(\w+)/'
hey there how

If using ripgreg this is how you would do it:
grep -E -o ".{0,5}test_pattern.{0,5}" test.txt

You can use regexp grep for finding + second grep for highlight
echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}' | grep string
23_string_and

With ugrep you can specify -ABC context with option -o (--only-matching) to show the match with extra characters of context before and/or after the match, fitting the match plus the context within the specified -ABC width. For example:
ugrep -o -C30 pattern testfile.txt
gives:
1: ... long line with an example pattern to match. The line could...
2: ...nother example line with a pattern.
The same on a terminal with color highlighting gives:
Multiple matches on a line are either shown with [+nnn more]:
or with option -k (--column-number) to show each individually with context and the column number:
The context width is the number of Unicode characters displayed (UTF-8/16/32), not just ASCII.

I personally do something similar to the posted answers.. but since the dot key, like any keyboard key, can be tapped or held down.. and I often don't need a lot of context(if I needed more I might do the lines like grep -C but often like you I don't want lines before and after), so I find it much quicker for entering the command, to just tap the dot key for how many dots / how many characters, if it's a few then tapping the key, or hold it down for more.
e.g. echo zzzabczzzz | grep -o '.abc..'
Will have the abc pattern with one dot before and two after. ( in regex language, Dot matches any character). Others used dot too but with curly braces to specify repetition.
If I wanted to be strict re between (0 or x) characters and exactly y characters, then i'd use the curlies.. and -P, as others have done.
There is a setting re whether dot matches new line but you can look into that if it's a concern/interest.

Colorized grep -- viewing the entire file with highlighted matches

I find grep's --color=always flag to be tremendously useful. However, grep only prints lines with matches (unless you ask for context lines). Given that each line it prints has a match, the highlighting doesn't add as much capability as it could.
I'd really like to cat a file and see the entire file with the pattern matches highlighted.
Is there some way I can tell grep to print every line being read regardless of whether there's a match? I know I could write a script to run grep on every line of a file, but I was curious whether this was possible with standard grep.

Here are some ways to do it:
grep --color 'pattern\|$' file
grep --color -E 'pattern|$' file
egrep --color 'pattern|$' file
The | symbol is the OR operator. Either escape it using \ or tell grep that the search text has to be interpreted as regular expressions by adding -E or using the egrep command instead of grep.
The search text "pattern|$" is actually a trick, it will match lines that have pattern OR lines that have an end. Because all lines have an end, all lines are matched, but the end of a line isn't actually any characters, so it won't be colored.
To also pass the colored parts through pipes, e.g. towards less, provide the always parameter to --color:
grep --color=always 'pattern\|$' file | less -r
grep --color=always -E 'pattern|$' file | less -r
egrep --color=always 'pattern|$' file | less -r

Here's something along the same lines. Chances are, you'll be using less anyway, so try this:
less -p pattern file
It will highlight the pattern and jump to the first occurrence of it in the file.
You can jump to the next occurence with n and to the previous occurence with p. Quit with q.

I'd like to recommend ack -- better than grep, a power search tool for programmers.
$ ack --color --passthru --pager="${PAGER:-less -R}" pattern files
$ ack --color --passthru pattern files | less -R
$ export ACK_PAGER_COLOR="${PAGER:-less -R}"
$ ack --passthru pattern files
I love it because it defaults to recursive searching of directories (and does so much smarter than grep -r), supports full Perl regular expressions (rather than the POSIXish regex(3)), and has a much nicer context display when searching many files.

You can use my highlight script from https://github.com/kepkin/dev-shell-essentials
It's better than grep because you can highlight each match with its own color.
$ command_here | highlight green "input" | highlight red "output"

You can also create an alias. Add this function in your .bashrc (or .bash_profile on osx)
function grepe {
grep --color -E "$1|$" $2
}
You can now use the alias like this: "ifconfig | grepe inet" or "grepe css index.html".
(PS: don't forget to source ~/.bashrc to reload bashrc on current session)

Use colout program: http://nojhan.github.io/colout/
It is designed to add color highlights to a text stream. Given a regex and a color (e.g. "red"), it reproduces a text stream with matches highlighted. e.g:
# cat logfile but highlight instances of 'ERROR' in red
colout ERROR red <logfile
You can chain multiple invocations to add multiple different color highlights:
tail -f /var/log/nginx/access.log | \
colout ' 5\d\d ' red | \
colout ' 4\d\d ' yellow | \
colout ' 3\d\d ' cyan | \
colout ' 2\d\d ' green
Or you can achieve the same thing by using a regex with N groups (parenthesised parts of the regex), followed by a comma separated list of N colors.
vagrant status | \
colout \
'\''(^.+ running)|(^.+suspended)|(^.+not running)'\'' \
green,yellow,red

The -z option for grep is also pretty slick!
cat file1 | grep -z "pattern"

As grep -E '|pattern' has already been suggested, just wanted to clarify that it's possible to highlight a whole line too.
For example, tail -f somelog | grep --color -E '| \[2\].*' (specifically, the part -E '|):

I use rcg from "Linux Server Hacks", O'Reilly. It's perfect for what you want and can highlight multiple expressions each with different colours.
#!/usr/bin/perl -w
#
# regexp coloured glasses - from Linux Server Hacks from O'Reilly
#
# eg .rcg "fatal" "BOLD . YELLOW . ON_WHITE" /var/adm/messages
#
use strict;
use Term::ANSIColor qw(:constants);
my %target = ( );
while (my $arg = shift) {
my $clr = shift;
if (($arg =~ /^-/) | !$clr) {
print "Usage: rcg [regex] [color] [regex] [color] ...\n";
exit(2);
}
#
# Ugly, lazy, pathetic hack here. [Unquote]
#
$target{$arg} = eval($clr);
}
my $rst = RESET;
while(<>) {
foreach my $x (keys(%target)) {
s/($x)/$target{$x}$1$rst/g;
}
print
}

I added this to my .bash_aliases:
highlight() {
grep --color -E "$1|\$"
}

The sed way
As there is already a lot of different solution, but none show sed as solution,
and because sed is lighter and quicker than grep, I prefer to use sed for this kind of job:
sed 's/pattern/\o33[47;31;1m&\o033[0m/' file
This seems less intuitive.
\o33 is the sed syntax to generate the character octal 033 -> Escape.
(Some shells and editors also allow entering <Ctrl>-<V> followed by <Esc>, to type the character directly.)
Esc [ 47 ; 31 ; 1 m is an ANSI escape code: Background grey, foreground red and bold face.
& will re-print the pattern.
Esc [ 0 m returns the colors to default.
You could also highlight the entire line, but mark the pattern as red:
sed -E <file -e \
's/^(.*)(pattern)(.*)/\o33[30;47m\1\o33[31;1m\2\o33[0;30;47m\3\o33[0m/'
Dynamic tail -f, following logfiles
One of advantage of using sed: You could send a alarm beep on console, using bell ascii character 0x7. I often use sed like:
sudo tail -f /var/log/kern.log |
sed -ue 's/[lL]ink .*\([uU]p\|[dD]own\).*/\o33[47;31;1m&\o33[0m\o7/'
-u stand for unbuffered. This ensure that line will be treated immediately.
So I will hear some beep instantly, when I connect or disconnect my ethernet cable.
Of course, instead of link up pattern, you could watch for USB in same file, or even search for from=.*alice#bobserver.org in /var/log/mail.log (If
you're Charlie, anxiously awaiting an email from Alice;)...

To highlight patterns while viewing the whole file, h can do this.
Plus it uses different colors for different patterns.
cat FILE | h 'PAT1' 'PAT2' ...
You can also pipe the output of h to less -R for better reading.
To grep and use 1 color for each pattern, cxpgrep could be a good fit.

Use ripgrep, aka rg: https://github.com/BurntSushi/ripgrep
rg --passthru...
Color is the default:
rg -t tf -e 'key.*tfstate' -e dynamodb_table
--passthru
Print both matching and non-matching lines.
Another way to achieve a similar effect is by modifying your pattern to
match the empty string.
For example, if you are searching using rg foo then using
rg "^|foo" instead will emit every line in every file searched, but only
occurrences of foo will be highlighted.
This flag enables the same behavior without needing to modify the pattern.
Sacrilege, granted, but grep has gotten complacent.
brew/apt/rpm/whatever install ripgrep
You'll never go back.

another dirty way:
grep -A80 -B80 --color FIND_THIS IN_FILE
I did an
alias grepa='grep -A80 -B80 --color'
in bashrc.

Here is a shell script that uses Awk's gsub function to replace the text you're searching for with the proper escape sequence to display it in bright red:
#! /bin/bash
awk -vstr=$1 'BEGIN{repltext=sprintf("%c[1;31;40m&%c[0m", 0x1B,0x1B);}{gsub(str,repltext); print}' $2
Use it like so:
$ ./cgrep pattern [file]
Unfortunately, it doesn't have all the functionality of grep.
For more information , you can refer to an article "So You Like Color" in Linux Journal

One other answer mentioned grep's -Cn switch which includes n lines of Context. I sometimes do this with n=99 as a quick-and-dirty way of getting [at least] a screenfull of context when the egrep pattern seems too fiddly, or when I'm on a machine on which I've not installed rcg and/or ccze.
I recently discovered ccze which is a more powerful colorizer. My only complaint is that it is screen-oriented (like less, which I never use for that reason) unless you specify the -A switch for "raw ANSI" output.
+1 for the rcg mention above. It is still my favorite since it is so simple to customize in an alias. Something like this is usually in my ~/.bashrc:
alias tailc='tail -f /my/app/log/file | rcg send "BOLD GREEN" receive "CYAN" error "RED"'

Alternatively you can use The Silver Searcher and do
ag <search> --passthrough

I use following command for similar purpose:
grep -C 100 searchtext file
This will say grep to print 100 * 2 lines of context, before & after of the highlighted search text.

It might seem like a dirty hack.
grep "^\|highlight1\|highlight2\|highlight3" filename
Which means - match the beginning of the line(^) or highlight1 or highlight2 or highlight3. As a result, you will get highlighted all highlight* pattern matches, even in the same line.

Ok, this is one way,
wc -l filename
will give you the line count -- say NN, then you can do
grep -C NN --color=always filename

If you want highlight several patterns with different colors see this bash script.
Basic usage:
echo warn error debug info 10 nil | colog
You can change patterns and colors while running pressing one key and then enter key.

Here's my approach, inspired by #kepkin's solution:
# Adds ANSI colors to matched terms, similar to grep --color but without
# filtering unmatched lines. Example:
# noisy_command | highlight ERROR INFO
#
# Each argument is passed into sed as a matching pattern and matches are
# colored. Multiple arguments will use separate colors.
#
# Inspired by https://stackoverflow.com/a/25357856
highlight() {
# color cycles from 0-5, (shifted 31-36), i.e. r,g,y,b,m,c
local color=0 patterns=()
for term in "$#"; do
patterns+=("$(printf 's|%s|\e[%sm\\0\e[0m|g' "${term//|/\\|}" "$(( color+31 ))")")
color=$(( (color+1) % 6 ))
done
sed -f <(printf '%s\n' "${patterns[#]}")
}
This accepts multiple arguments (but doesn't let you customize the colors). Example:
$ noisy_command | highlight ERROR WARN

Is there some way I can tell grep to print every line being read
regardless of whether there's a match?
Option -C999 will do the trick in the absence of an option to display all context lines. Most other grep variants support this too. However: 1) no output is produced when no match is found and 2) this option has a negative impact on grep's efficiency: when the -C value is large this many lines may have to be temporarily stored in memory for grep to determine which lines of context to display when a match occurs. Note that grep implementations do not load input files but rather reads a few lines or use a sliding window over the input. The "before part" of the context has to be kept in a window (memory) to output the "before" context lines later when a match is found.
A pattern such as ^|PATTERN or PATTERN|$ or any empty-matching sub-pattern for that matter such as [^ -~]?|PATTERN is a nice trick. However, 1) these patterns don't show non-matching lines highlighted as context and 2) this can't be used in combination with some other grep options, such as -F and -w for example.
So none of these approaches are satisfying to me. I'm using ugrep, and enhanced grep with option -y to efficiently display all non-matching output as color-highlighted context lines. Other grep-like tools such as ag and ripgrep also offer a pass-through option. But ugrep is compatible with GNU/BSD grep and offers a superset of grep options like -y and -Q. For example, here is what option -y shows when combined with -Q (interactive query UI to enter patterns):
ugrep -Q -y FILE ...

Also try:
egrep 'pattern1|pattern2' FILE.txt | less -Sp 'pattern1|pattern2'
This will give you a tabular output with highlighted pattern/s.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio