grep not working with BOM [duplicate]

This question already has answers here:
Elegant way to search for UTF-8 files with BOM?
(11 answers)
Closed 8 years ago.
I am trying to grep a string from a file, but grep returns nothing even though the string is present in the file. It turned out that the file starts with a ÿþ mark (a byte order mark, or BOM). If I remove it manually, grep works. How do I make grep work without manually removing the BOM?

What about:
strings <file> | grep <pattern>
Alternatively, check the man page of your grep command. What is actually happening is that grep looks at the first few bytes of your file, decides that it is a binary file, and therefore treats it as not searchable as text. With GNU grep you can override this with the following option (short form: -a):
grep --binary-files=text <pattern> <file>

You can also use cat with the -v (visible) option:
cat -v file | grep pattern
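
The two characters ÿþ are how the bytes FF FE look when shown as Latin-1, and FF FE is the UTF-16 little-endian byte order mark. If the file really is UTF-16 (an assumption, since the question does not show the file), every character is followed by a NUL byte and a plain-text pattern will not match the raw bytes. A minimal sketch that decodes the file with iconv before searching, using a hypothetical sample file created only for the demonstration:

```shell
# Hypothetical sample: "hello world" encoded as UTF-16LE with a BOM (FF FE).
sample=$(mktemp)
printf '\377\376h\000e\000l\000l\000o\000 \000w\000o\000r\000l\000d\000\n\000' > "$sample"

# Decode to UTF-8 first, then search the decoded text.
match=$(iconv -f UTF-16 -t UTF-8 "$sample" | grep 'world')
printf '%s\n' "$match"

rm -f "$sample"
```

If the file turns out to be UTF-8 with a BOM instead, this conversion does not apply and the --binary-files=text approach above is the one to try.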

Related

How to find a file with a certain string inside the content of the file by scanning a whole directory using grep [duplicate]

This question already has answers here:
How to find all files containing specific text (string) on Linux?
(54 answers)
Closed 4 years ago.
How would you use grep to find a file inside a directory that contains a certain string in its content?

You can use a command like:
grep -lR string directory/*
-l prints only the file name instead of the matching content. -R searches recursively. If you only need to search the files directly inside "directory", remove -R:
grep -l string directory/*
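
A small sketch of the recursive search on a throwaway directory tree (the file names and the word "needle" are made up for the demonstration). It passes the directory itself rather than directory/*, so hidden files at the top level are not skipped by the shell glob:

```shell
# Hypothetical directory tree used only for this demonstration.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'needle here\n' > "$dir/a.txt"
printf 'nothing\n' > "$dir/b.txt"
printf 'another needle\n' > "$dir/sub/c.txt"

# -r recurses into subdirectories, -l prints only the names of matching files.
found=$(grep -rl -- 'needle' "$dir" | sort)
printf '%s\n' "$found"

rm -rf "$dir"
```

This should list a.txt and sub/c.txt but not b.txt.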

Using grep to search for '----i' returns error [duplicate]

This question already has answers here:
How can I grep for a string that begins with a dash/hyphen?
(11 answers)
Closed 4 years ago.
I've been trying to find '----i' from a file. The file is a no-type text file with the following content:
-------------e-- ./login.defs
-------------e-- ./Public
-------------e-- ./lightdm.conf
----i--------e-- ./salad.sh
-------------e-- ./file4
-------------e-- ./Desktop
Using grep as follows:
grep -i '----i' filename
returns the following error:
grep: unrecognized option '----i'
Does anyone know how to solve this? Thanks!

The long and short of it is that you need to use -e PATTERN or --regexp=PATTERN if you want to use a pattern which may otherwise be treated as a grep option. So grep -i -e '----i' filename should do the trick.
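
As a quick check of the -e approach, here is a sketch that runs it on a shortened copy of the listing from the question. It also shows the common alternative of ending option parsing with --, which is not mentioned in the answer above:

```shell
# Shortened copy of the listing from the question, in a temporary file.
list=$(mktemp)
cat > "$list" <<'EOF'
-------------e-- ./login.defs
----i--------e-- ./salad.sh
-------------e-- ./file4
EOF

# -e marks the next argument as the pattern, even if it starts with a dash.
with_e=$(grep -e '----i' "$list")
# -- ends option parsing, so what follows is taken as pattern and file operands.
with_ddash=$(grep -- '----i' "$list")
printf '%s\n' "$with_e"

rm -f "$list"
```

Both forms should print only the ./salad.sh line.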

cat or grep a html file to find specific text [duplicate]

This question already has answers here:
How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?
(9 answers)
Extract lines between two patterns from a file [duplicate]
(3 answers)
Closed 4 years ago.
I am trying to use bash to parse an HTML file using grep.
The HTML structure won't change, so I should be able to find the text easily enough.
The HTML will look like this, and I just want the number, which will change each time the file changes:
<div class="total">
900 files inspected,
28301 offenses detected:
</div>
grep -E '^<div class="total">.</div>' my_file.html
Ideally I just want to pull the number of offenses so in the example above it would be 28301. I would like to assign it to a variable also.
Am I close?

You can do a simple:
a=$(grep -oP '(\d+)(?=\soffenses\sdetected)' abc); echo "$a"
which will give:
28301
-o prints only the matching part of the line
-P interprets the pattern as a Perl-compatible regular expression
abc is the name of the file
(\d+)(?=\soffenses\sdetected): in this regex we use a positive lookahead to capture only the digits that are followed by " offenses detected"
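
The -P option is not available in every grep build. As a hedged alternative, not part of the answer above, the same number can be pulled out with sed and a basic regular expression. The sketch below recreates the HTML fragment from the question in a temporary file:

```shell
# The HTML fragment from the question, written to a temporary file.
report=$(mktemp)
cat > "$report" <<'EOF'
<div class="total">
900 files inspected,
28301 offenses detected:
</div>
EOF

# Keep only the digits that precede " offenses detected"; -n suppresses all other lines.
offenses=$(sed -n 's/^[[:space:]]*\([0-9][0-9]*\) offenses detected.*/\1/p' "$report")
printf '%s\n' "$offenses"

rm -f "$report"
```

The variable offenses then holds the count for later use in the script.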

If you have GNU grep and GNU sed, you can do:
$ cat file | xargs | grep -Po '<div class=total>\K(.*?)</div>' | sed -E 's/<\/div>//; s/, /\n/'
900 files inspected
28301 offenses detected:
If you have ruby available:
$ ruby -e 'puts readlines.join[/(?<=<div class="total">).+(?=<\/div>)/m].gsub(/^[ \t]+/m,"")' file
900 files inspected,
28301 offenses detected:

Returning only a part of the string from a grep result [duplicate]

This question already has answers here:
Can grep show only words that match search pattern?
(15 answers)
Closed 6 years ago.
I'm using grep on a text file containing some simple logs in the following form:
[ABC.txt]
1=abc|2=def|3=ghi|4=hjk|5=lmn|6=opq
8=rst|9=uvx|10=wyz
.
.
.
.
and so on
The values for tags 1, 2, 3, 4, etc. differ throughout the file and in some cases include special characters. Is there a way to retrieve only the value for tag 4, and no other tags, with grep?
BTW, this log file is itself the result of a grep. So please advise whether I should redirect that output to a file first and then apply the second grep, or pipe the first grep into the second, considering it's a large file.

grep -Po '(?<=4=)[^|]*' ABC.txt
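
One caveat with the lookbehind above: (?<=4=) also matches after "14=" or "24=", so it can return values of other tags once tag numbers reach two digits. A sketch of a stricter variant using awk, which splits on "|" and checks that a field starts with exactly "4=". The second sample line with tag 14 is hypothetical and added only to show the difference:

```shell
# Sample log; the second line (with tag 14) is a made-up test case.
log=$(mktemp)
cat > "$log" <<'EOF'
1=abc|2=def|3=ghi|4=hjk|5=lmn|6=opq
8=rst|9=uvx|14=zzz
EOF

# Split on "|" and print the value of the field whose tag is exactly 4.
tag4=$(awk -F'|' '{ for (i = 1; i <= NF; i++) if ($i ~ /^4=/) print substr($i, 3) }' "$log")
printf '%s\n' "$tag4"

rm -f "$log"
```

Only hjk is printed here; the 14=zzz field is skipped because it does not begin with "4=".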

You could pipe the result of grep to cut, split the fields on "|", and select the fourth field:
[your grep command] | cut -d '|' -f 4
If you want the "4=" to be gone, you can run cut a second time, this time using "=" as the delimiter.
http://pubs.opengroup.org/onlinepubs/009695399/utilities/cut.html
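
The two cut steps can be sketched as follows, using the first sample line from the question in place of real grep output. This assumes tag 4 is always the fourth field on the line. The second cut uses -f 2- rather than -f 2 so that a value containing "=" is not truncated:

```shell
# Stand-in for one line of output from the first grep.
line='1=abc|2=def|3=ghi|4=hjk|5=lmn|6=opq'

# Quote the delimiter so the shell does not treat "|" as a pipe.
field=$(printf '%s\n' "$line" | cut -d '|' -f 4)
# Drop the "4=" prefix; -f 2- keeps everything after the first "=".
tag_value=$(printf '%s\n' "$field" | cut -d '=' -f 2-)
printf '%s\n' "$tag_value"
```

In practice the first grep can feed this pipeline directly, so no intermediate file is needed even for a large log.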

How to scrape end of line in grep? [duplicate]

This question already has answers here:
How to find patterns across multiple lines using grep?
(28 answers)
Closed 6 years ago.
I have a file that contains a sequence already broken into lines, something like this:
CGCCCATGGGTCGTATACGTAATGGGAAAACAAAGCATGGTGTAACTATGGTAAGTGCTA
GACAATACAAGAAGGCTGATATTTGTAGAATAATTCATTTGAATTATTATGCTGTAAATA
GCTAGATTATTATGCATAATTACTTTGAGAGGTGATCAATCAATTCGACCCTTGCCAATT
I want to search for a specific pattern in this file, for example GCTGTAAATAGCTAGATTA.
The problem is that the pattern may be cut by a newline at an unpredictable place.
I can use:
grep -e "pattern" file
but it cannot skip the newline character, so it returns no result. How can I modify my command to ignore \n in my search?
Edit:
I don't know whether my query exists in the file, and if it does, I don't know where.
The best solution that came into my mind is
tr -d '\n' < file | grep -e "CTACCCCAGACAAACTGGTCAGATACCAACCATCAGCGAAACTAACCAAACAAA"
but I know there should be more efficient ways to do that.

pattern="GCTGTAAATA"$'\n'"GCTAGATTA" # $'\n' is Bash's way of mentioning special chars
grep -e "$pattern" file
OR
pattern="GCTGTAAATA
GCTAGATTA" # with an actual newline at the end of the first line
grep -e "$pattern" file
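
Note that grep treats a newline inside a pattern as a separator between alternative patterns, so the commands above match any line containing either half, and they require knowing where the break falls. When the break position is unknown, joining the lines first (as in the question's own tr attempt) is one workable approach. A sketch using the three sample lines from the question:

```shell
# The sample sequence from the question, in a temporary file.
seq_file=$(mktemp)
cat > "$seq_file" <<'EOF'
CGCCCATGGGTCGTATACGTAATGGGAAAACAAAGCATGGTGTAACTATGGTAAGTGCTA
GACAATACAAGAAGGCTGATATTTGTAGAATAATTCATTTGAATTATTATGCTGTAAATA
GCTAGATTATTATGCATAATTACTTTGAGAGGTGATCAATCAATTCGACCCTTGCCAATT
EOF
query='GCTGTAAATAGCTAGATTA'

# Line-by-line search finds nothing because the query straddles lines 2 and 3.
# grep exits with status 1 when there is no match, hence the "|| true".
per_line=$(grep -c -- "$query" "$seq_file" || true)

# Removing the newlines lets grep see the sequence as one string; -o prints only the match.
joined=$(tr -d '\n' < "$seq_file" | grep -o -- "$query" || true)
printf 'per-line matches: %s\njoined match: %s\n' "$per_line" "$joined"

rm -f "$seq_file"
```

The joined search reports the query while the per-line count stays at 0. For very large files this still streams through tr, so the input file itself is never modified.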
