I am looking for a trivial solution to a trivial task.
My bash script loops over several folders, producing a log.txt file in each. If the operation in a given case completed successfully, the sentence "The unit is OK" should appear somewhere in that log.txt; however, its actual position (the exact line number) differs from one log.txt to another!
I need to put some condition in my loop (probably using if) to check whether that sentence is actually present somewhere in the log file. If so, the script should print "Everything is OK" in the terminal at the moment that particular folder is processed; otherwise (if the string is absent from the log), it should print something like "Bad news"!
I would be thankful for different solutions, especially for how to find lines containing selected phrases in a given log file.
Thanks!!
Gleb
To apply the test to an entire directory, a simple for loop can be used. If you need to process directories recursively, you can feed a while read -r ... loop with the results of find (a sketch of that follows the example output below). In either case the search will be similar. Here is an example searching a single directory of log files:
$ for i in dirname/*; do grep -q 'The unit is OK' "$i" && \
echo "$i - Everything is OK" || echo "$i - Bad news"; done
Searching where dirname is my test directory dat, example results would be:
dat/test.properties - Bad news
dat/test_pph_s.txt - Bad news
dat/testlog-1.txt - Everything is OK
dat/testlog-2.txt - Everything is OK
dat/testlog-3.txt - Everything is OK
Where the testlog-X.txt files contain, for example:
$ cat dat/testlog-1.txt
The unit is OK
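For the recursive case mentioned above, a sketch feeding a while read -r loop from the output of find (the path and name pattern are illustrative):
find dirname -type f -name '*.txt' | while read -r i; do
    grep -q 'The unit is OK' "$i" && echo "$i - Everything is OK" || echo "$i - Bad news"
done
(This breaks on filenames containing newlines; for those, use find -print0 with read -d ''.)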
Without more info (examples of the strings you are looking for, potential file names, whatever bash script you already have) it is hard to provide concrete advice.
Still, if you have a known set of potential strings, you could check for their presence in a file while you're looping:
if grep -qE 'pattern1|pattern2|pattern3' "$FILE"; then
    # at least one of the strings was found in the file
fi
Scanning files matching some name pattern (e.g. log.txt):
find /path/to/logs -name "log.txt" | while read -r file
do
    grep -q "The unit is OK" "$file" && echo "$file: Everything is OK" || echo "$file: Bad news"
done
Related
Hello, I am trying to get all files with Jane's name into a separate file called oldFiles.txt. From a directory called "data", I am reading a list of file names out of a file called list.txt, and I put all the file names containing the name Jane into the files variable. Then I test each entry of the files variable against the file system to ensure those files actually exist, and append all the files containing jane to the oldFiles.txt file (which will be in the scripts directory) once that test passes.
#!/bin/bash
> oldFiles.txt
files= grep " jane " ../data/list.txt | cut -d' ' -f 3
if test -e ~data/$files; then
for file in $files; do
if test -e ~/scripts/$file; then
echo $file>> oldFiles.txt
else
echo "no files"
fi
done
fi
The above code gets the desired files and displays them correctly, and it creates the oldFiles.txt file, but when I open the file after running the script I find that nothing was appended to it. I tried changing the assignment from files= grep " jane " ../data/list.txt | cut -d' ' -f 3 to command substitution, files=$(grep " jane " ../data/list.txt), to see if capturing the raw output to write to the file would help, but then the error "too many arguments on line 5" comes up, which is the first if test statement. The only way I get the script to work semi-properly is when I do ./findJane.sh > oldFiles.txt on the shell command line, which is essentially me creating the file manually. How would I go about creating oldFiles.txt and appending to it entirely within the script?
The biggest problem you have is matching names like "jane" or "Jane's", etc., while not matching "Janes". grep provides the options -i (case-insensitive match) and -w (whole-word match), which can tailor your search to what you appear to want without having to use the kludge of padding your search term with spaces (" jane "). (To do that properly you would use [[:space:]]jane[[:space:]].)
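For example, whole-word matching catches "Jane" case-insensitively without matching the plural (the sample lines here are invented):
$ printf 'Jane x fileA\njanes y fileB\n' | grep -iw jane
Jane x fileA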
You also have the problem of what your "script dir" is if you call your script from a directory other than the one containing it, such as calling it from your $HOME directory with bash script/findJane.sh. In that case your script will attempt to append to $HOME/oldFiles.txt. The positional parameter $0 contains the pathname used to invoke the current script, so you can capture the script directory no matter where you call the script from with:
dirname "$0"
You are using bash, so store all the filenames resulting from your grep command in an array, not in some general variable (especially since your use of " jane " suggests that your filenames contain whitespace).
You can make your script much more flexible if you take the name of your input file (e.g. list.txt), the term to search for (e.g. "jane"), the location to check for the files' existence (e.g. $HOME/data) and the output filename to append the names to (e.g. "oldFiles.txt") as command-line [positional] parameters. You can give each a default value so the script behaves as you currently desire without providing any arguments.
Even with the additional scripting flexibility of taking command-line arguments, the script actually has fewer lines: simply fill an array using mapfile (synonymous with readarray) and then loop over the contents of the array. You can also avoid the additional subshell for dirname with a simple parameter expansion and a test for whether the path component is empty (replacing it with '.') -- up to you.
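A minimal sketch of that parameter-expansion alternative (the variable name is arbitrary):
script="${0%/*}"                    # strip the trailing /name component
[ "$script" = "$0" ] && script="."  # no '/' in $0 -- script was run from '.'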
If I've understood your goal correctly, you can put all the pieces together with:
#!/bin/bash
# positional parameters
src="${1:-../data/list.txt}" # 1st param - input (default: ../data/list.txt)
term="${2:-jane}" # 2nd param - search term (default: jane)
data="${3:-$HOME/data}" # 3rd param - file location (default: $HOME/data)
outfn="${4:-oldFiles.txt}" # 4th param - output (default: oldFiles.txt)
# save the path to the current script in script
script="$(dirname "$0")"
# if outfn not given, prepend path to script to outfn to output
# in script directory (if script called from elsewhere)
[ -z "$4" ] && outfn="$script/$outfn"
# split names w/term into array
# using the -iw option for case-insensitive whole-word match
mapfile -t files < <(grep -iw "$term" "$src" | cut -d' ' -f 3)
# loop over files array
for ((i=0; i<${#files[@]}; i++)); do
# test existence of file in data directory, redirect name to outfn
[ -e "$data/${files[i]}" ] && printf "%s\n" "${files[i]}" >> "$outfn"
done
(note: test expression and [ expression ] are synonymous, use what you like, though you may find [ expression ] a bit more readable)
(further note: "Janes" being plural is not considered the same as the singular -- adjust the grep expression as desired)
Example Use/Output
As was pointed out in the comment, without a sample of your input file, we cannot provide an exact test to confirm your desired behavior.
Let me know if you have questions.
As far as I can tell, this is what you're going for. This is very much a community effort based on the comments; credit to Mark and Jetchisel for finding most of the issues. Notable changes:
Fixed $files to use command substitution
Fixed path to data/$file, assuming you have a directory at ~/data full of files
Fixed the test to not test for a string of files, but just the single file (also using -f to make sure it's a regular file)
Using double brackets; you could also use single brackets with double-quoted variables instead, but you explicitly have a Bash shebang, so there's no harm in using Bash syntax
Adding a second message about not matching files, because there are two possible cases there; you may need to adapt depending on the output you're looking for
Removed the initial empty redirection — if you need to ensure that the file is clear before the rest of the script, then it should be added back, but if not, it's not doing any useful work
Changed the shebang to make sure you're using the user's preferred Bash, and added set -e because you should always add set -e
#!/usr/bin/env bash
set -e
files=$(grep " jane " ../data/list.txt | cut -d' ' -f 3)
for file in $files; do
if [[ -f $HOME/data/$file ]]; then
if [[ -f $HOME/scripts/$file ]]; then
echo "$file" >> oldFiles.txt
else
echo "no matching file"
fi
else
echo "no files"
fi
done
I have a directory of multiple files. I want to search all of those files for the same keywords and tell me whether or not each keyword is present in every file in the directory.
I've put the keywords into an array, and this has gotten the script to at least run all the way through to the final echo commands (see below). But I don't think it's looping the grep command the way I want it to.
DATA_DIR=/directorypath
SEARCH1="sulfonamide"
SEARCH2="resistance"
declare -a LIST_STRINGS=(${SEARCH1} ${SEARCH2})
cd ${DATA_DIR}
LIST_FILES=`ls -1 */*.faa`
for file in ${LIST_FILES};do ##loop command for every file in the list
echo "Filename is ${file}"
for item in ${LIST_STRINGS[*]};do ##loop command for every search item in the array list
grep -qi ${item} ${file} ##search current file for current search item. q to return 0 status if a match is found. i to ignore upper/lower case
if [ $? == "0" ];then
echo "${item} found!"
else
echo "${item} not found"
fi
done
done
At the moment the output is constantly telling me that it can't find either of the items in the array in any of the files. My output looks like this (repeated for every file in the directory):
Filename is path/file.faa
sulfonamide not found
resistance not found
However, I know this is wrong because I have checked the files manually and found that SEARCH2 is present but SEARCH1 isn't.
So is the grep command returning a non-0 status because it can't find both terms? How do I get it to perform a renewed grep search for each item in the array of strings?
I have tried switching the order of the terms and scanning only one file. The result is the same - the output tells me that neither term could be found, when in reality one of them was present.
I am trying to write a bash script that will do the following:
Take a directory or file as input (will always begin with /mnt/user/)
Search other mount points for same file or directory (will always begin with /mnt/diskx)
Return value
So, for example, the input will be "/mnt/user/my_files/file.txt". The script will check whether "/mnt/disk1/my_files/file.txt" exists and will incrementally look on each disk (disk2, disk3, etc.) until it finds it, or up to disk20.
This is what I have so far:
#/user/bin/bash
var=$1
i=0
while [ -e $check_var = echo $var | sed 's:/mnt/user:/mnt/disk$i+1:']
do
final=$check_var
done
It's incomplete yes, but I am not that proficient in bash so I'm doing a little at a time. I'm sure my command won't work properly yet either but right now I am getting an "unexpected end of file" and I can't figure out why.
There are many issues here:
If this is the actual code you're getting "unexpected end of file" on, you should save the file in Unix format, not DOS format.
The shebang should be #!/usr/bin/bash or #!/bin/bash depending on your system
You have to assign check_var before running [ .. ] on it.
You have to use $(..) to expand a command
Variables like $i are not expanded in single quotes
sed can't add numbers
i is never incremented
the loop logic is inverted, it should loop until it matches and not while it matches.
You'd want to assign final after -- not in -- the loop.
Consider doing it in even smaller pieces, it's easier to debug e.g. the single statement sed 's:/mnt/user:/mnt/disk$i+1:' than your entire while loop.
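For instance, testing that piece in isolation makes the quoting and arithmetic problems obvious; the shell, not sed, must do the increment (a minimal illustration):
$ i=1; echo /mnt/user/my_files/file.txt | sed "s:/mnt/user:/mnt/disk$((i+1)):"
/mnt/disk2/my_files/file.txt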
Here's a more canonical way of doing it:
#!/bin/bash
var="${1#/mnt/user/}"
for file in /mnt/disk{1..20}/"$var"
do
[[ -e "$file" ]] && final="$file" && break
done
if [[ $final ]]
then
echo "It exists at $final"
else
echo "It doesn't exist anywhere"
fi
I've been handed a project that consists of several dozen (probably over 100, I haven't counted) bash scripts. Most of the scripts make at least one call to another one of the scripts. I'd like to get the equivalent of a call graph where the nodes are the scripts instead of functions.
Is there any existing software to do this?
If not, does anybody have clever ideas for how to do this?
The best plan I could come up with was to enumerate the scripts and check whether the basenames are unique (they span multiple directories). If there are duplicate basenames, then cry, because the script paths are usually held in variables, so you may not be able to disambiguate. If they are unique, then grep for the names in the scripts and use those results to build up a graph. Use some tool (suggestions?) to visualize the graph.
Suggestions?
Wrap the shell itself with your own implementation: log who called your wrapper, then exec the original shell.
Yes, you have to actually run the scripts in order to identify which scripts are really used. Otherwise you would need a tool with the same knowledge as the shell engine itself to support all the variable expansion, PATH lookups, etc. -- I have never heard of such a tool.
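A minimal sketch of such a wrapper, assuming the original shell has been moved aside to /bin/bash.real and that /tmp/shell_calls.log is writable (both paths are illustrative):
#!/bin/bash.real
# Installed as /bin/bash in place of the original shell.
# Log the parent process (the caller) and the requested command line,
# then hand control to the real shell.
echo "$(date '+%F %T') caller=$(ps -o args= -p $PPID) args=$*" >> /tmp/shell_calls.log
exec /bin/bash.real "$@"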
To visualize the calling graph, use GraphViz's dot format.
Here's how I wound up doing it (disclaimer: a lot of this is hack-ish, so you may want to clean up if you're going to use it long-term)...
Assumptions:
- Current directory contains all scripts/binaries in question.
- Files for building the graph go in subdir call_graph.
Created the script call_graph/make_tgf.sh:
#!/bin/bash
# Run from dir with scripts and subdir call_graph
# Parameters:
# $1 = sources (default is call_graph/sources.txt)
# $2 = targets (default is call_graph/targets.txt)
SOURCES=$1
if [ "$SOURCES" == "" ]; then SOURCES=call_graph/sources.txt; fi
TARGETS=$2
if [ "$TARGETS" == "" ]; then TARGETS=call_graph/targets.txt; fi
if [ ! -d call_graph ]; then echo "Run from parent dir of call_graph" >&2; exit 1; fi
(
# cat call_graph/targets.txt
for file in `cat $SOURCES `
do
for target in `grep -v -E '^ *#' $file | grep -o -F -w -f $TARGETS | grep -v -w $file | sort | uniq`
do echo $file $target
done
done
)
Then, I ran the following (I wound up doing the scripts-only version):
cat /dev/null | tee call_graph/sources.txt > call_graph/targets.txt
for file in *
do
if [ -d "$file" ]; then continue; fi
echo $file >> call_graph/targets.txt
if file $file | grep text >/dev/null; then echo $file >> call_graph/sources.txt; fi
done
# For scripts only:
bash call_graph/make_tgf.sh call_graph/sources.txt call_graph/sources.txt > call_graph/scripts.tgf
# For scripts + binaries (binaries will be leaf nodes):
bash call_graph/make_tgf.sh > call_graph/scripts_and_bin.tgf
I then opened the resulting tgf file in yEd, and had yEd do the layout (Layout -> Hierarchical). I saved as graphml to separate the manually-editable file from the automatically-generated one.
I found that there were certain nodes that were not helpful to have in the graph, such as utility scripts/binaries that were called all over the place. So, I removed these from the sources/targets files and regenerated as necessary until I liked the node set.
Hope this helps somebody...
Insert a line at the beginning of each shell script, after the #! line, which logs a timestamp, the full pathname of the script, and the argument list.
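A sketch of such a line (the log path is arbitrary; on GNU systems readlink -f resolves the full pathname):
echo "$(date '+%F %T') $(readlink -f "$0") $*" >> /tmp/script_calls.log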
Over time, you can mine this log to identify likely candidates, i.e. two lines logged very close together have a high probability of the first script calling the second.
This also allows you to focus on the scripts which are still actually in use.
You could use an ed script
1a
log blah blah blah
.
wq
and run it like so:
find / -perm /111 -exec ed {} \; < edscript
Make sure you test the find command with -print instead of the exec clause. And / is probably not the path that you want to use. If you have to include bin directories then you will probably need to switch to grep in order to identify the pathnames to include, then when you have a file full of the right names, use xargs instead of find to run the script.
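A sketch of that xargs variant, assuming the right pathnames have been collected in names.txt (both file names are illustrative); the inner redirection reopens edscript for every file, which the plain find version above does not:
xargs -n 1 sh -c 'ed -s "$0" < edscript' < names.txt
(Note that xargs splits its input on whitespace by default, so this assumes pathnames without spaces.)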
I was using the return value of fgrep -s 'text' /folder/*.txt to find out whether 'text' is in any .txt file in /folder/. It works, but I find it too slow for what I need; it seems to search for 'text' in all the files before giving me an answer.
I need something that quickly gives me a yes/no answer as soon as it finds at least one file containing the 'text'. Probably some awk script.
If I understand your question correctly, you want:
fgrep -m1
Which stops after one match.
Building on mopoke's answer, you can use this to shorten your search. It stops after the first match in the first file in which one is found:
# found=$(false)
found=1 # false
text="text"
for file in /folder/*.txt
do
if fgrep -m1 "$text" "$file" > /dev/null
then
found=$?
# echo "$text found in $file"
break
fi
done
# if [[ ! $found ]]
# then
# echo "$text not found"
# fi
echo $found # or use exit $found
Edit: commented out some lines and made a couple of other changes.
If there is a large number of files, then the for could fail and you should use a while loop with a find piped into it.
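A sketch of that variant, under the same assumptions as above; find is fed in through process substitution rather than a plain pipe, so that found survives the loop (a pipe would run the loop in a subshell):
found=1 # false
text="text"
while IFS= read -r file
do
    if fgrep -m1 "$text" "$file" > /dev/null
    then
        found=0
        break
    fi
done < <(find /folder -maxdepth 1 -name '*.txt')
echo $found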
If all you want to do is find out whether there is any .txt file in a folder regardless of the file's contents, then use something like this all by itself:
find /folder -name "*.txt"
Building on the answers above, this should do it, I think:
find /folder -name \*.txt -exec grep "text" {} \;
But I'm not sure I fully understand the problem: is fgrep doing a full depth recursion before it starts outputting, or something? find reports as it finds, so it might be better for what you need; dunno.
[edit, not tested]: Change the 'grep' above to a shell script that does something like:
grep -q "text" "$1" && exit 0 || exit 1
to get the true/false result you need (you'll have to play around with this; I haven't tested the exact steps here -- no access to Unix at the moment :-( )