In file named appleFile:
1.apple_with_seeds###
2.apple_with_seeds###
3.apple_with_seeds_and_skins###
4.apple_with_seeds_and_skins###
5.apple_with_seeds_and_skins###
.....
.....
.....
How can I use grep to match only the lines containing "apple_with_seeds"?
Assume there are random characters after seeds and skins.
Expected result:
1.apple_with_seeds###
2.apple_with_seeds###
Maybe something like this will work for you:
grep 'apple_with_seeds[^_]' appleFile
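That prints every line where seeds is followed by a character other than _. Note that the bracket expression has to match one character, so a line that ends right after seeds is not matched. You can add other characters to exclude between the brackets (after the ^), e.g. [^_a-z] will additionally exclude all lowercase letters.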
Or you could explicitly include some characters (like #):
grep 'apple_with_seeds[#]*$' appleFile
And again you can add arbitrary characters between the brackets, e.g. [#A-Z] would match any of the characters # or A-Z.
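To see how the two patterns differ, here is a minimal sketch. The sample lines are made up for illustration (the fourth line, which ends right after seeds, is not from the question), and the temporary directory only keeps the example self-contained.

```shell
# Hypothetical sample data modelled on the question.
tmpdir=$(mktemp -d)
printf '%s\n' \
  '1.apple_with_seeds###' \
  '2.apple_with_seeds###' \
  '3.apple_with_seeds_and_skins###' \
  '4.apple_with_seeds' > "$tmpdir/appleFile"

# Negated bracket: needs one non-underscore character after "seeds",
# so line 4 (nothing after "seeds") is not matched.
negated=$(grep 'apple_with_seeds[^_]' "$tmpdir/appleFile")

# Explicit inclusion: zero or more "#" up to the end of the line,
# so line 4 is matched as well.
included=$(grep 'apple_with_seeds[#]*$' "$tmpdir/appleFile")

printf 'negated:\n%s\nincluded:\n%s\n' "$negated" "$included"
rm -rf "$tmpdir"
```

Which of the two fits depends on whether lines with nothing after seeds should count as matches.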
cat appleFile | grep "apple_with_seeds$"
UPDATE:
If you want to exclude something, try the -v option:
cat appleFile | grep "apple_with_seeds$" | grep -v "exclude_pattern"
Try this
cat appleFile|grep -i seeds$
In bash I need to extract a certain sequence of letters and numbers from a filename. In the example below I need to extract just the S??E?? section of the filenames. This must work with both upper/lowercase.
my.show.s01e02.h264.aac.subs.mkv
great.s03e12.h264.Dolby.mkv
what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
Expected output would be:
s01e02
s03e12
S05E11
I've been trying to do this with SED but can't get it to work. This is what I have tried, without success:
sed 's/.*s[0-9][0-9]e[0-9][0-9].*//'
Many thanks for any help.
With sed we can match the desired string in a capture group, and use the I suffix for case-insensitive matching, to accomplish the desired result.
For the sake of this answer I'm assuming the filenames are in a file:
$ cat fnames
my.show.s01e02.h264.aac.subs.mkv
great.s03e12.h264.Dolby.mkv
what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
One sed solution:
$ sed -E 's/.*\.(s[0-9][0-9]e[0-9][0-9])\..*/\1/I' fnames
s01e02
s03e12
S05E11
Where:
-E - enable extended regex support
\.(s[0-9][0-9]e[0-9][0-9])\. - match s??e?? with a pair of literal periods as bookends; the s??e?? (wrapped in parens) will be stored in capture group #1
\1 - print out capture group #1
/I - use case-insensitive matching (the I flag is a GNU sed extension)
I think your pattern is OK. With grep -o you get only the matched part of a line instead of the whole matching line. So
grep -Eio 'S[0-9]{2}E[0-9]{2}'
solves your problem. The -E is needed so that {2} is treated as a repetition count rather than as literal braces. Unlike your pattern, this one has no leading or trailing .*, so only the s??e?? part is matched. Maybe you can put it in an if, so that filenames without a match show that something is wrong with the filename.
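The "put it in an if" idea could be sketched like this. The third filename is invented to show the no-match branch, and the variable names found and missing are just placeholders. It assumes a grep that supports -o (GNU and BSD grep do).

```shell
found=""
missing=""

# Two names from the question plus one hypothetical name without an episode tag.
for name in my.show.s01e02.h264.aac.subs.mkv \
            what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4 \
            no.episode.tag.mkv; do
  # grep exits non-zero when nothing matches, which drives the if/else.
  if tag=$(printf '%s\n' "$name" | grep -Eio 's[0-9]{2}e[0-9]{2}'); then
    found="$found$tag "
  else
    missing="$missing$name "
  fi
done

printf 'found: %s\nmissing: %s\n' "$found" "$missing"
```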
Suppose you have those file names:
$ ls -1
great.s03e12.h264.Dolby.mkv
my.show.s01e02.h264.aac.subs.mkv
what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
You can extract the substring this way:
$ printf "%s\n" * | sed -E 's/^.*([sS][0-9][0-9][eE][0-9][0-9]).*/\1/'
Or with grep:
$ printf "%s\n" *.m* | grep -o '[sS][0-9][0-9][eE][0-9][0-9]'
Either prints:
s03e12
s01e02
S05E11
You could use that same sed or grep on a file (with filenames in it) as well.
I have an input file
RAKESH_ONE
RAKESH-TWO
RAKESH123
RAKESHTHREE
/RAKESH/
FIVERAKESH
456RAKESH
WELCOME123
This is RAKESH
I would like to get the output
RAKESH_ONE
RAKESH-TWO
/RAKESH/
This is RAKESH
I want to print the lines matching the pattern RAKESH. Lines where RAKESH is immediately preceded or followed by an alphanumeric character should be skipped.
([^a-zA-Z0-9]+|^)RAKESH([^a-zA-Z0-9]+|$)
This will match patterns on the lines without alphanumeric prefixes or suffixes. It will not match the whole line, but if used with grep or sed you can output just the lines you need.
UPDATE
As requested, here's the full grep command. Use the -E option to use extended regex:
grep -E "([^a-zA-Z0-9]+|^)RAKESH([^a-zA-Z0-9]+|$)" file.txt
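As a quick check, the command can be run against the input from the question. This sketch writes that input to a temporary file (the temporary path is an assumption made to keep it self-contained) and uses single quotes so the shell leaves the $ alone.

```shell
tmpdir=$(mktemp -d)

# Input lines copied from the question.
printf '%s\n' RAKESH_ONE RAKESH-TWO RAKESH123 RAKESHTHREE /RAKESH/ \
  FIVERAKESH 456RAKESH WELCOME123 'This is RAKESH' > "$tmpdir/file.txt"

# RAKESH must be bounded on each side by a non-alphanumeric run
# or by the start/end of the line.
matches=$(grep -E '([^a-zA-Z0-9]+|^)RAKESH([^a-zA-Z0-9]+|$)' "$tmpdir/file.txt")

printf '%s\n' "$matches"
rm -rf "$tmpdir"
```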
I have the following IP addresses in a file
3.3.3.1
3.3.3.11
3.3.3.111
I am using this file as input file to another program. In that program it will grep each IP address. But when I grep the contents I am getting some wrong outputs.
like
cat testfile | grep -o 3.3.3.1
but I am getting output like
3.3.3.1
3.3.3.1
3.3.3.1
I just want to get the exact output. How can I do that with grep?
Use the following command:
grep -owF "3.3.3.1" testfile
-o returns only the match, not the whole line.
-w matches whole words only, meaning the match must be enclosed by non-word characters such as <space>, <tab>, a comma, a semicolon, or the start or end of the line. It prevents grep from matching 3.3.3.1 out of 3.3.3.111.
-F searches for fixed strings instead of patterns. This prevents the . in the IP address from being interpreted as any character, meaning grep will not match 3a3b3c1 (or something like this).
To match whole words only, use grep -ow 3.3.3.1 testfile
UPDATE: Use the solution provided by hek2mgl as it is more robust.
You may use anchors.
grep '^3\.3\.3\.1$' file
Since grep uses regular expressions by default, you need to escape the dots to make grep match a literal dot character.
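The approaches from this thread can be compared side by side on the three addresses from the question. The last command uses -x (match the whole line), which is not mentioned in the answers above; it is a standard grep option, added here as an assumption about what "exact" means when each address sits on its own line.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 3.3.3.1 3.3.3.11 3.3.3.111 > "$tmpdir/testfile"

# Unanchored regex: finds 3.3.3.1 inside all three lines.
loose_count=$(grep -c '3.3.3.1' "$tmpdir/testfile")

# Whole word, fixed string.
word_match=$(grep -owF '3.3.3.1' "$tmpdir/testfile")

# Anchored regex with escaped dots.
anchored_match=$(grep '^3\.3\.3\.1$' "$tmpdir/testfile")

# Whole line, fixed string (assumption: one address per line).
line_match=$(grep -xF '3.3.3.1' "$tmpdir/testfile")

printf '%s %s %s %s\n' "$loose_count" "$word_match" "$anchored_match" "$line_match"
rm -rf "$tmpdir"
```

Keep in mind that -w treats . as a non-word character, so it would still find 3.3.3.1 inside a longer dotted string such as 3.3.3.1.5, whereas the anchored and -x forms would not.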
How do I find a line where a pattern is in the middle of the line? In the following example, I want to get only the 8th line and exclude the 1st and 5th lines when grepping for "#".
I know I would use grep "^#" to match # only as the first character, but how do I exclude that?
#DD65WKN1:203:H7T67ADXX:2:2216:19936:100494 1:N:0:
GTCGTTCTTCAGGTTCTC
+
FFFFFIIIIFFFIFFFFF
#DD65WKN1:203:H7T67ADXX:2:2216:6629:100501 1:N:0:
TAAAGTAGCAAAAATG
+
FFFFFFFFIFBFIFFF#DD65WKN1:203:H7T67ADXX:2:2216:6629:100501 1:N:0:
TAAAGTAGCAAAAATG
+
FFFFFFFFIFBFIFFF
Thanks
You can match any character beforehand, so that # won't be matched if just in the first position:
$ grep '.#' file
FFFFFFFFIFBFIFFF#DD65WKN1:203:H7T67ADXX:2:2216:6629:100501 1:N:0:
Note that . matches any character. To be completely sure (first solution would match a line starting with ##), you can negate # by using:
grep '[^#]#' file
Or you can require that the line start with a run of non-# characters (at least one, as indicated by \+) before the #:
grep '^[^#]\+#' file
Use grep with the Perl-regex option (-P), which supports negative lookbehind.
$ grep -P '(?<!^)#' file
FFFFFFFFIFBFIFFF#DD65WKN1:203:H7T67ADXX:2:2216:6629:100501 1:N:0:
The above grep command prints every line that has a # anywhere other than at the very beginning of the line.
The best thing about unix filters is combining them
grep --invert-match '^#' file | grep '#'
or more traditionally
sed '/^#/d' file | grep '#'
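Here is a small sketch comparing these variants. The sample lines are shortened stand-ins for the data in the question, and the ##header line is invented to show where '.#' and '[^#]#' disagree.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '#read1' 'GTCG' '+' 'FFFF#read2' '##header' > "$tmpdir/file"

# '.#' also counts the line starting with "##" (second "#" has a character before it).
any_char_count=$(grep -c '.#' "$tmpdir/file")

# '[^#]#' requires a non-# character directly before the "#".
non_hash_before=$(grep '[^#]#' "$tmpdir/file")

# Filter pipeline: drop lines starting with "#", then keep lines containing "#".
piped=$(grep -v '^#' "$tmpdir/file" | grep '#')

printf '%s\n%s\n%s\n' "$any_char_count" "$non_hash_before" "$piped"
rm -rf "$tmpdir"
```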
Here is my problem. I have an existing file named data.f, and I want to match "48" in it using the collating symbol "48" in a bracket expression:
grep '[[.48.]]' data.f
but I get this error:
grep: Invalid collation character
Character classes in bracket expressions, however, work without problems:
grep "[[:alpha:]]" data.f
if you want to grep 48
grep 48 file
if you want to grep "48"
grep '"48"' file
// to avoid discussion in comments I extend my post with more examples
if you want to grep n occurrences of "48" in one line you should use regular expressions
cat file | grep '\(.*"48"\)\{n\}' | grep -v '\(.*"48"\)\{n+1\}'
Basically, you grep for lines with at least n occurrences, and then use invert-match (-v) to exclude lines with n+1 or more occurrences, so you are left with lines that have exactly n occurrences.
In your comment you mentioned you wanted to grep lines with 5 occurrences of "48" that CAN be separated by other characters (that's the reason I put .* before "48").
So here is the sample:
cat file | grep '\(.*"48"\)\{5\}' | grep -v '\(.*"48"\)\{6\}'
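The same two-step filter is easier to check with a smaller count. This sketch uses n = 2 on three made-up lines; the file contents and variable names are assumptions for illustration only.

```shell
tmpdir=$(mktemp -d)

# One, two, and three occurrences of "48" (with the double quotes).
printf '%s\n' 'a "48"' '"48" b "48"' '"48""48""48"' > "$tmpdir/file"

# At least two occurrences.
at_least_two=$(grep -c '\(.*"48"\)\{2\}' "$tmpdir/file")

# Exactly two: keep "at least two", then drop "at least three".
exactly_two=$(grep '\(.*"48"\)\{2\}' "$tmpdir/file" | grep -v '\(.*"48"\)\{3\}')

printf '%s\n%s\n' "$at_least_two" "$exactly_two"
rm -rf "$tmpdir"
```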
Wouldn't grep '48' data.f work?
I have no idea what you mean by “I use collating symbol "48"” (I know what collation classes are, which is what grep expects to see in your input, but I don't know what a collation symbol would be), but from one of your comments, it seems you're actually looking for the exact string [[.48.]] in your file. Here's two ways of doing just that:
grep -F '[[.48.]]' data.f
grep '\[\[\.48\.]]' data.f
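A short sketch of the difference between a fixed-string search and a regex here, using two invented lines. In the regex form the dots need a backslash to be literal; an unescaped . matches any character, so it would also accept the second line.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'x = [[.48.]]' 'y = [[a48b]]' > "$tmpdir/data.f"

# Fixed string: no regex interpretation at all.
fixed=$(grep -F '[[.48.]]' "$tmpdir/data.f")

# Regex with brackets and dots escaped: same result as -F.
escaped=$(grep '\[\[\.48\.]]' "$tmpdir/data.f")

# Regex with unescaped dots: "." matches any character, so both lines match.
loose_count=$(grep -c '\[\[.48.]]' "$tmpdir/data.f")

printf '%s\n%s\n%s\n' "$fixed" "$escaped" "$loose_count"
rm -rf "$tmpdir"
```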
In one of your other comments, you asked how to make grep find lines with at least five occurrences of “48” on them. That's a pretty clear regex question:
grep -E '(.*48){5}' data.f