Colorized grep -- viewing the entire file with highlighted matches - bash

I find grep's --color=always flag to be tremendously useful. However, grep only prints lines with matches (unless you ask for context lines). Given that each line it prints has a match, the highlighting doesn't add as much capability as it could.
I'd really like to cat a file and see the entire file with the pattern matches highlighted.
Is there some way I can tell grep to print every line being read regardless of whether there's a match? I know I could write a script to run grep on every line of a file, but I was curious whether this was possible with standard grep.

Here are some ways to do it:
grep --color 'pattern\|$' file
grep --color -E 'pattern|$' file
egrep --color 'pattern|$' file
The | symbol is the OR operator. Either escape it using \ or tell grep that the search text has to be interpreted as regular expressions by adding -E or using the egrep command instead of grep.
The search text "pattern|$" is actually a trick, it will match lines that have pattern OR lines that have an end. Because all lines have an end, all lines are matched, but the end of a line isn't actually any characters, so it won't be colored.
To also pass the colored parts through pipes, e.g. towards less, provide the always parameter to --color:
grep --color=always 'pattern\|$' file | less -r
grep --color=always -E 'pattern|$' file | less -r
egrep --color=always 'pattern|$' file | less -r

Here's something along the same lines. Chances are, you'll be using less anyway, so try this:
less -p pattern file
It will highlight the pattern and jump to the first occurrence of it in the file.
You can jump to the next occurence with n and to the previous occurence with p. Quit with q.

I'd like to recommend ack -- better than grep, a power search tool for programmers.
$ ack --color --passthru --pager="${PAGER:-less -R}" pattern files
$ ack --color --passthru pattern files | less -R
$ export ACK_PAGER_COLOR="${PAGER:-less -R}"
$ ack --passthru pattern files
I love it because it defaults to recursive searching of directories (and does so much smarter than grep -r), supports full Perl regular expressions (rather than the POSIXish regex(3)), and has a much nicer context display when searching many files.

You can use my highlight script from https://github.com/kepkin/dev-shell-essentials
It's better than grep because you can highlight each match with its own color.
$ command_here | highlight green "input" | highlight red "output"

You can also create an alias. Add this function in your .bashrc (or .bash_profile on osx)
function grepe {
grep --color -E "$1|$" $2
}
You can now use the alias like this: "ifconfig | grepe inet" or "grepe css index.html".
(PS: don't forget to source ~/.bashrc to reload bashrc on current session)

Use colout program: http://nojhan.github.io/colout/
It is designed to add color highlights to a text stream. Given a regex and a color (e.g. "red"), it reproduces a text stream with matches highlighted. e.g:
# cat logfile but highlight instances of 'ERROR' in red
colout ERROR red <logfile
You can chain multiple invocations to add multiple different color highlights:
tail -f /var/log/nginx/access.log | \
colout ' 5\d\d ' red | \
colout ' 4\d\d ' yellow | \
colout ' 3\d\d ' cyan | \
colout ' 2\d\d ' green
Or you can achieve the same thing by using a regex with N groups (parenthesised parts of the regex), followed by a comma separated list of N colors.
vagrant status | \
colout \
'\''(^.+ running)|(^.+suspended)|(^.+not running)'\'' \
green,yellow,red

The -z option for grep is also pretty slick!
cat file1 | grep -z "pattern"

As grep -E '|pattern' has already been suggested, just wanted to clarify that it's possible to highlight a whole line too.
For example, tail -f somelog | grep --color -E '| \[2\].*' (specifically, the part -E '|):

I use rcg from "Linux Server Hacks", O'Reilly. It's perfect for what you want and can highlight multiple expressions each with different colours.
#!/usr/bin/perl -w
#
# regexp coloured glasses - from Linux Server Hacks from O'Reilly
#
# eg .rcg "fatal" "BOLD . YELLOW . ON_WHITE" /var/adm/messages
#
use strict;
use Term::ANSIColor qw(:constants);
my %target = ( );
while (my $arg = shift) {
my $clr = shift;
if (($arg =~ /^-/) | !$clr) {
print "Usage: rcg [regex] [color] [regex] [color] ...\n";
exit(2);
}
#
# Ugly, lazy, pathetic hack here. [Unquote]
#
$target{$arg} = eval($clr);
}
my $rst = RESET;
while(<>) {
foreach my $x (keys(%target)) {
s/($x)/$target{$x}$1$rst/g;
}
print
}

I added this to my .bash_aliases:
highlight() {
grep --color -E "$1|\$"
}

The sed way
As there is already a lot of different solution, but none show sed as solution,
and because sed is lighter and quicker than grep, I prefer to use sed for this kind of job:
sed 's/pattern/\o33[47;31;1m&\o033[0m/' file
This seems less intuitive.
\o33 is the sed syntax to generate the character octal 033 -> Escape.
(Some shells and editors also allow entering <Ctrl>-<V> followed by <Esc>, to type the character directly.)
Esc [ 47 ; 31 ; 1 m is an ANSI escape code: Background grey, foreground red and bold face.
& will re-print the pattern.
Esc [ 0 m returns the colors to default.
You could also highlight the entire line, but mark the pattern as red:
sed -E <file -e \
's/^(.*)(pattern)(.*)/\o33[30;47m\1\o33[31;1m\2\o33[0;30;47m\3\o33[0m/'
Dynamic tail -f, following logfiles
One of advantage of using sed: You could send a alarm beep on console, using bell ascii character 0x7. I often use sed like:
sudo tail -f /var/log/kern.log |
sed -ue 's/[lL]ink .*\([uU]p\|[dD]own\).*/\o33[47;31;1m&\o33[0m\o7/'
-u stand for unbuffered. This ensure that line will be treated immediately.
So I will hear some beep instantly, when I connect or disconnect my ethernet cable.
Of course, instead of link up pattern, you could watch for USB in same file, or even search for from=.*alice#bobserver.org in /var/log/mail.log (If
you're Charlie, anxiously awaiting an email from Alice;)...

To highlight patterns while viewing the whole file, h can do this.
Plus it uses different colors for different patterns.
cat FILE | h 'PAT1' 'PAT2' ...
You can also pipe the output of h to less -R for better reading.
To grep and use 1 color for each pattern, cxpgrep could be a good fit.

Use ripgrep, aka rg: https://github.com/BurntSushi/ripgrep
rg --passthru...
Color is the default:
rg -t tf -e 'key.*tfstate' -e dynamodb_table
--passthru
Print both matching and non-matching lines.
Another way to achieve a similar effect is by modifying your pattern to
match the empty string.
For example, if you are searching using rg foo then using
rg "^|foo" instead will emit every line in every file searched, but only
occurrences of foo will be highlighted.
This flag enables the same behavior without needing to modify the pattern.
Sacrilege, granted, but grep has gotten complacent.
brew/apt/rpm/whatever install ripgrep
You'll never go back.

another dirty way:
grep -A80 -B80 --color FIND_THIS IN_FILE
I did an
alias grepa='grep -A80 -B80 --color'
in bashrc.

Here is a shell script that uses Awk's gsub function to replace the text you're searching for with the proper escape sequence to display it in bright red:
#! /bin/bash
awk -vstr=$1 'BEGIN{repltext=sprintf("%c[1;31;40m&%c[0m", 0x1B,0x1B);}{gsub(str,repltext); print}' $2
Use it like so:
$ ./cgrep pattern [file]
Unfortunately, it doesn't have all the functionality of grep.
For more information , you can refer to an article "So You Like Color" in Linux Journal

One other answer mentioned grep's -Cn switch which includes n lines of Context. I sometimes do this with n=99 as a quick-and-dirty way of getting [at least] a screenfull of context when the egrep pattern seems too fiddly, or when I'm on a machine on which I've not installed rcg and/or ccze.
I recently discovered ccze which is a more powerful colorizer. My only complaint is that it is screen-oriented (like less, which I never use for that reason) unless you specify the -A switch for "raw ANSI" output.
+1 for the rcg mention above. It is still my favorite since it is so simple to customize in an alias. Something like this is usually in my ~/.bashrc:
alias tailc='tail -f /my/app/log/file | rcg send "BOLD GREEN" receive "CYAN" error "RED"'

Alternatively you can use The Silver Searcher and do
ag <search> --passthrough

I use following command for similar purpose:
grep -C 100 searchtext file
This will say grep to print 100 * 2 lines of context, before & after of the highlighted search text.

It might seem like a dirty hack.
grep "^\|highlight1\|highlight2\|highlight3" filename
Which means - match the beginning of the line(^) or highlight1 or highlight2 or highlight3. As a result, you will get highlighted all highlight* pattern matches, even in the same line.

Ok, this is one way,
wc -l filename
will give you the line count -- say NN, then you can do
grep -C NN --color=always filename

If you want highlight several patterns with different colors see this bash script.
Basic usage:
echo warn error debug info 10 nil | colog
You can change patterns and colors while running pressing one key and then enter key.

Here's my approach, inspired by #kepkin's solution:
# Adds ANSI colors to matched terms, similar to grep --color but without
# filtering unmatched lines. Example:
# noisy_command | highlight ERROR INFO
#
# Each argument is passed into sed as a matching pattern and matches are
# colored. Multiple arguments will use separate colors.
#
# Inspired by https://stackoverflow.com/a/25357856
highlight() {
# color cycles from 0-5, (shifted 31-36), i.e. r,g,y,b,m,c
local color=0 patterns=()
for term in "$#"; do
patterns+=("$(printf 's|%s|\e[%sm\\0\e[0m|g' "${term//|/\\|}" "$(( color+31 ))")")
color=$(( (color+1) % 6 ))
done
sed -f <(printf '%s\n' "${patterns[#]}")
}
This accepts multiple arguments (but doesn't let you customize the colors). Example:
$ noisy_command | highlight ERROR WARN

Is there some way I can tell grep to print every line being read
regardless of whether there's a match?
Option -C999 will do the trick in the absence of an option to display all context lines. Most other grep variants support this too. However: 1) no output is produced when no match is found and 2) this option has a negative impact on grep's efficiency: when the -C value is large this many lines may have to be temporarily stored in memory for grep to determine which lines of context to display when a match occurs. Note that grep implementations do not load input files but rather reads a few lines or use a sliding window over the input. The "before part" of the context has to be kept in a window (memory) to output the "before" context lines later when a match is found.
A pattern such as ^|PATTERN or PATTERN|$ or any empty-matching sub-pattern for that matter such as [^ -~]?|PATTERN is a nice trick. However, 1) these patterns don't show non-matching lines highlighted as context and 2) this can't be used in combination with some other grep options, such as -F and -w for example.
So none of these approaches are satisfying to me. I'm using ugrep, and enhanced grep with option -y to efficiently display all non-matching output as color-highlighted context lines. Other grep-like tools such as ag and ripgrep also offer a pass-through option. But ugrep is compatible with GNU/BSD grep and offers a superset of grep options like -y and -Q. For example, here is what option -y shows when combined with -Q (interactive query UI to enter patterns):
ugrep -Q -y FILE ...

Also try:
egrep 'pattern1|pattern2' FILE.txt | less -Sp 'pattern1|pattern2'
This will give you a tabular output with highlighted pattern/s.

Related

Sed through files without using for loop?

I have a small script which basically generates a menu of all the scripts in my ~/scripts folder and next to each of them displays a sentence describing it, that sentence being the third line within the script commented out. I then plan to pipe this into fzf or dmenu to select it and start editing it or whatever.
1 #!/bin/bash
2
3 # a script to do
So it would look something like this
foo.sh a script to do X
bar.sh a script to do Y
Currently I have it run a for loop over all the files in the scripts folder and then run sed -n 3p on all of them.
for i in $(ls -1 ~/scripts); do
echo -n "$i"
sed -n 3p "~/scripts/$i"
echo
done | column -t -s '#' | ...
I was wondering if there is a more efficient way of doing this that did not involve a for loop and only used sed. Any help will be appreciated. Thanks!
Instead of a loop that is parsing ls output + sed, you may try this awk command:
awk 'FNR == 3 {
f = FILENAME; sub(/^.*\//, "", f); print f, $0; nextfile
}' ~/scripts/* | column -t -s '#' | ...
Yes there is a more efficient way, but no, it doesn't only use sed. This is probably a silly optimization for your use case though, but it may be worthwhile nonetheless.
The inefficiency is that you're using ls to read the directory and then parse its output. For large directories, that causes lots of overhead for keeping that list in memory even though you only traverse it once. Also, it's not done correctly, consider filenames with special characters that the shell interprets.
The more efficient way is to use find in combination with its -exec option, which starts a second program with each found file in turn.
BTW: If you didn't rely on line numbers but maybe a tag to mark the description, you could also use grep -r, which avoids an additional process per file altogether.
This might work for you (GNU sed):
sed -sn '1h;3{H;g;s/\n/ /p}' ~/scripts/*
Use the -s option to reset the line number addresses for each file.
Copy line 1 to the hold space.
Append line 3 to the hold space.
Swap the hold space for the pattern space.
Replace the newline with a space and print the result.
All files in the directory ~/scripts will be processed.
N.B. You may wish to replace the space delimiter by a tab or pipe the results to the column command.

How do I trim whitespace, but not newlines, from subshell output in bash?

There are many tens, maybe a hundred or more previous questions that seem "identical" to this already here, but after extensive search, I found NOTHING that even came close to working - though I did learn quite a lot - and so I decided to just RTFM and figure this out on my own.
The Problem
I wanted to search the output of a ps auxwww command to find processes of interest, and the issue was that I can't just simply use cut to find the exact data from them that I wanted. ps, it turns out, tries to columnate the output, adding either extra spaces or tabs that get in the way of using cut to get the correct data.
So, since I'm not a master at bash, I did a search... The answers I found were all focused on either variables - a "backup strategy" from my point of view that itself didn't solve the whole problem - or they only trimmed leading or trailing space or all "whitespace" including newlines. NOPE, Won't Work For Cut! And, neither will removing trailing newlines and so forth.
So, restated, the question is, how do we efficiently end up with the white space defined as simply a single space between other characters without eliminating newlines?
Below, I will give my answer, but I welcome others to give theirs - who knows, maybe someone has a better answer?!
Answer:
At least MY answer - please leave your own, too! - was to do this:
ps auxwww | grep <program> | tr -s [:blank:] | cut -d ' ' -f <field_of_interest>
This worked great!
Obviously, there are many ways to adapt this to other needs.
As an alternative to all of the pipes and grep with cut, you could simply use awk. The benefit of using awkwith the default field-separator (FS) being set to break on whitespace is that it considers any number of whitespace between fields as a single separator.
So using awk will do away with needing to use tr -s to "squeeze" whitespace to define fields. Further, awk gives far greater control over field matching using regular expressions rather than having to rely on grep of a full line and cut to locate a pre-determined field numbers. (though to some extent you will still have to tell awk what field out of the ps command you are interested in)
Using bash, you can also eliminate the pipe | by using process substitution to send the output of ps auxwww to awk on stdin using redirection, e.g. awk ... < <(ps auxwww) for a single tidy command line.
To get your "program" and "file_of_interest" into awk you have two options. You can initialize awk variables using the -v var=value option (there can be multiple -v otions given), or you can use the BEGIN rule to initialize the variables. The only difference being with -v you can provide a shell variable for value and there is no whitespace allowed surrounding the = sign, while within BEGIN any whitespace is ignored.
So in your case a couple of examples to get the virtual memory size for firefox processes, you could use:
awk -v prog="firefox" -v fnum="5" '
$11 ~ prog {print $fnum}
' < <(ps auxwww)
(above if you had myprog=firefox as a shell variable, you could use -v prog="$myprog" to initialize the prog variable for awk)
or using the BEGIN rule, you could do:
awk 'BEGIN {prog = "firefox"; fnum = "5"}
$11 ~ prog {print $fnum }
' < <(ps auxwww)
In each command above, it locates the COMMAND field from ps (field 11) and checks whether it contains firefox and if so it outputs field no. 5 the virtual memory size used by each process.
Both work fine as one-liners as well, e.g.
awk -v prog="firefox" -v fnum="5" '$11 ~ prog {print $fnum}' < <(ps auxwww)
Don't get me wrong, the pipeline is perfectly fine, it will just be slow. For short commands with limited output there won't be much difference, but when the output is large, awk will provide orders of magnitude improvement over having to tr and grep and cut reading over the same records three times.
The reason being, the pipes and the process on each side requires separate processes be spawned by the shell. So minimizes their use, improves the efficiency of what your script is doing. Now if the data is small as are the processes, there isn't much of a difference. However if you are reading a 3G file 3 times over -- that's is the difference in orders of magnitude. Hours verses minutes or seconds.
I had to use single quotes on CentosOS Linux to get tr working like described above:
ps -o ppid= $$ | tr -d '[:space:]'
You can reduce the number of pipes using this Perl one-liner, which uses Perl regexes instead of a separate grep process. This combines grep, tr and cut in a single command, with an easy way to manipulate the output (#F is the array of fields, 0-indexed):
Examples:
# Start an example process to provide the input for `ps` in the next commands:
/Applications/Emacs.app/Contents/MacOS/Emacs-x86_64-10_14 --geometry 109x65 /tmp/foo &
# Print single space-delimited output of `ps` for all emacs processes:
ps auxwww | perl -lane 'print "#F" if $F[10] =~ /emacs/i'
# Prints:
# bar 72144 0.0 0.5 4610272 82320 s006 SN 11:15AM 0:01.31 /Applications/Emacs.app/Contents/MacOS/Emacs-x86_64-10_14 --geometry 109x65 /tmp/foo
# Print emacs PID and file name opened with emacs:
ps auxwww | perl -lane 'print join "\t", #F[1, -1] if $F[10] =~ /emacs/i'
# Prints:
# 72144 /tmp/foo
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array #F on whitespace or on the regex specified in -F option.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)

Bash terminal output - highlight lines containing some text

When I get output in bash I get my standard 2 colour screen. Is there any way I can, by default, highlight a line if it contains some key text output?
E.g. if it contains the word "FAIL" then the line is coloured red.
I’ve read this https://unix.stackexchange.com/questions/46562/how-do-you-colorize-only-some-keywords-for-a-bash-script
but am looking for something simpler than having to write a wrapper script which I’d inevitably have to debug at some time in the future.
For a simple workaround, pipe it through grep --color to turn some words red.
Add a fallback like ^ to print lines which do not contain any matches otherwise.
grep --color -e 'FAIL' -e '^' <<<$'Foo\nBar FAIL Baz\nIck'
Grep output with multiple Colors? describes a hack for getting multiple colors if you need that.
If you're happy to install a BASH script and ack, the hhlighter package has useful default colours and an easy interface https://github.com/paoloantinori/hhighlighter:
You can use it like so to highlight rows that start with FAIL:
h -i 'FAIL.*'
or that contain FAIL:
h -i '.*FAIL.*'
or for various common log entries:
h -i '.*FAIL.*' '.*PASS.*' '.*WARN.*'
Building on tripleee's answer, following command will highlight the matching line red and preserve the other lines:
your_command | grep --color -e ".*FAIL.*" -e "^"
If you prefer a completely inverted line, with gnu grep:
your_command | GREP_COLORS='mt=7' grep --color -e ".*FAIL.*" -e "^"
(updated with mklement0 feedback)
This will highlight not only a word, but the whole line:
echo "foo bar error baz" | egrep --color '*.FAIL.*|$'
A search phrase should be enclosed by .* on either side. It will cause highlighting before and after the search word or phrase.
References
https://unix.stackexchange.com/a/330613/341457
How .* (dot star) works?

grep pipe searching for one word, not line

For some reason I cannot get this to output just the version of this line. I suspect it has something to do with how grep interprets the dash.
This command:
admin#DEV:~/TEMP$ sendemail
Yields the following:
sendemail-1.56 by Brandon Zehm
More output below omitted
The first line is of interest. I'm trying to store the version to variable.
TESTVAR=$(sendemail | grep '\s1.56\s')
Does anyone see what I am doing wrong? Thanks
TESTVAR is just empty. Even without TESTVAR, the output is empty.
I just tried the following too, thinking this might work.
sendemail | grep '\<1.56\>'
I just tried it again, while editing and I think I have another issue. Perhaps im not handling the output correctly. Its outputting the entire line, but I can see that grep is finding 1.56 because it highlights it in the line.
$ TESTVAR=$(echo 'sendemail-1.56 by Brandon Zehm' | grep -Eo '1.56')
$ echo $TESTVAR
1.56
The point is grep -Eo '1.56'
from grep man page:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output
line.
Your regular expression doesn't match the form of the version. You have specified that the version is surrounded by spaces, yet in front of it you have a dash.
Replace the first \s with the capitalized form \S, or explicit set of characters and it should work.
I'm wondering: In your example you seem to know the version (since you grep for it), so you could just assign the version string to the variable. I assume that you want to obtain any (unknown) version string there. The regular expression for this in sed could be (using POSIX character classes):
sendemail |sed -n -r '1 s/sendemail-([[:digit:]]+\.[[:digit:]]+).*/\1/ p'
The -n suppresses the normal default output of every line; -r enables extended regular expressions; the leading 1 tells sed to only work on line 1 (I assume the version appears in the first line). I anchored the version number to the telltale string sendemail- so that potential other numbers elsewhere in that line are not matched. If the program name changes or the hyphen goes away in future versions, this wouldn't match any longer though.
Both the grep solution above and this one have the disadvantage to read the whole output which (as emails go these days) may be long. In addition, grep would find all other lines in the program's output which contain the pattern (if it's indeed emails, somebody might discuss this problem in them, with examples!). If it's indeed the first line, piping through head -1 first would be efficient and prudent.
jayadevan#jayadevan-Vostro-2520:~$ echo $sendmail
sendemail-1.56 by Brandon Zehm
jayadevan#jayadevan-Vostro-2520:~$ echo $sendmail | cut -f2 -d "-" | cut -f1 -d" "
1.56

Grep characters before and after match?

Using this:
grep -A1 -B1 "test_pattern" file
will produce one line before and after the matched pattern in the file. Is there a way to display not lines but a specified number of characters?
The lines in my file are pretty big so I am not interested in printing the entire line but rather only observe the match in context. Any suggestions on how to do this?
3 characters before and 4 characters after
$> echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}'
23_string_and
grep -E -o ".{0,5}test_pattern.{0,5}" test.txt
This will match up to 5 characters before and after your pattern. The -o switch tells grep to only show the match and -E to use an extended regular expression. Make sure to put the quotes around your expression, else it might be interpreted by the shell.
You could use
awk '/test_pattern/ {
match($0, /test_pattern/); print substr($0, RSTART - 10, RLENGTH + 20);
}' file
You mean, like this:
grep -o '.\{0,20\}test_pattern.\{0,20\}' file
?
That will print up to twenty characters on either side of test_pattern. The \{0,20\} notation is like *, but specifies zero to twenty repetitions instead of zero or more.The -o says to show only the match itself, rather than the entire line.
I'll never easily remember these cryptic command modifiers so I took the top answer and turned it into a function in my ~/.bashrc file:
cgrep() {
# For files that are arrays 10's of thousands of characters print.
# Use cpgrep to print 30 characters before and after search pattern.
if [ $# -eq 2 ] ; then
# Format was 'cgrep "search string" /path/to/filename'
grep -o -P ".{0,30}$1.{0,30}" "$2"
else
# Format was 'cat /path/to/filename | cgrep "search string"
grep -o -P ".{0,30}$1.{0,30}"
fi
} # cgrep()
Here's what it looks like in action:
$ ll /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
-rw-r--r-- 1 rick rick 25780 Jul 3 19:05 /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
$ cat /tmp/rick/scp.Mf7UdS/Mf7UdS.Source | cgrep "Link to iconic"
1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri
$ cgrep "Link to iconic" /tmp/rick/scp.Mf7UdS/Mf7UdS.Source
1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri
The file in question is one continuous 25K line and it is hopeless to find what you are looking for using regular grep.
Notice the two different ways you can call cgrep that parallels grep method.
There is a "niftier" way of creating the function where "$2" is only passed when set which would save 4 lines of code. I don't have it handy though. Something like ${parm2} $parm2. If I find it I'll revise the function and this answer.
With gawk , you can use match function:
x="hey there how are you"
echo "$x" |awk --re-interval '{match($0,/(.{4})how(.{4})/,a);print a[1],a[2]}'
ere are
If you are ok with perl, more flexible solution : Following will print three characters before the pattern followed by actual pattern and then 5 character after the pattern.
echo hey there how are you |perl -lne 'print "$1$2$3" if /(.{3})(there)(.{5})/'
ey there how
This can also be applied to words instead of just characters.Following will print one word before the actual matching string.
echo hey there how are you |perl -lne 'print $1 if /(\w+) there/'
hey
Following will print one word after the pattern:
echo hey there how are you |perl -lne 'print $2 if /(\w+) there (\w+)/'
how
Following will print one word before the pattern , then the actual word and then one word after the pattern:
echo hey there how are you |perl -lne 'print "$1$2$3" if /(\w+)( there )(\w+)/'
hey there how
If using ripgreg this is how you would do it:
grep -E -o ".{0,5}test_pattern.{0,5}" test.txt
You can use regexp grep for finding + second grep for highlight
echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}' | grep string
23_string_and
With ugrep you can specify -ABC context with option -o (--only-matching) to show the match with extra characters of context before and/or after the match, fitting the match plus the context within the specified -ABC width. For example:
ugrep -o -C30 pattern testfile.txt
gives:
1: ... long line with an example pattern to match. The line could...
2: ...nother example line with a pattern.
The same on a terminal with color highlighting gives:
Multiple matches on a line are either shown with [+nnn more]:
or with option -k (--column-number) to show each individually with context and the column number:
The context width is the number of Unicode characters displayed (UTF-8/16/32), not just ASCII.
I personally do something similar to the posted answers.. but since the dot key, like any keyboard key, can be tapped or held down.. and I often don't need a lot of context(if I needed more I might do the lines like grep -C but often like you I don't want lines before and after), so I find it much quicker for entering the command, to just tap the dot key for how many dots / how many characters, if it's a few then tapping the key, or hold it down for more.
e.g. echo zzzabczzzz | grep -o '.abc..'
Will have the abc pattern with one dot before and two after. ( in regex language, Dot matches any character). Others used dot too but with curly braces to specify repetition.
If I wanted to be strict re between (0 or x) characters and exactly y characters, then i'd use the curlies.. and -P, as others have done.
There is a setting re whether dot matches new line but you can look into that if it's a concern/interest.

Resources