Bash: Native way to check if an entry is one line? - bash

I have a find script that automatically opens a file if just one file is found. The way I currently handle it is doing a word count on the number of lines of the search results. Is there an easier way to do this?
if [ "$( cat "$temp" | wc -l | xargs echo )" == "1" ]; then
edit `cat "$temp"`
fi
EDITED - here is the context of the whole script.
term="$1"
temp=".aafind.txt"
find src sql common -iname "*$term*" | grep -v 'src/.*lib' >> "$temp"
if [ ! -s "$temp" ]; then
echo "ø - including lib..." 1>&2
find src sql common -iname "*$term*" >> "$temp"
fi
if [ "$( cat "$temp" | wc -l | xargs echo )" == "1" ]; then
# just open it in an editor
edit `cat "$temp"`
else
# format output
term_regex=`echo "$term" | sed "s%\*%[^/]*%g" | sed "s%\?%[^/]%g" `
cat "$temp" | sed -E 's%//+%/%' | grep --color -E -i "$term_regex|$"
fi
rm "$temp"

Unless I'm misunderstanding, the variable $temp contains one or more filenames, one per line, and if there is only one filename it should be edited?
[ $(wc -l <<< "$temp") = "1" ] && edit "$temp"
If $temp is a file containing filenames:
[ $(wc -l < "$temp") = "1" ] && edit "$(cat "$temp")"

Several of the results here will read through an entire file, whereas one can stop and have an answer after one line and one character:
if { IFS='' read -r result && ! read -n 1 _; } <file; then
echo "Exactly one line: $result"
else
echo "Either no valid content at all, or more than one line"
fi
For safely reading from find, if you have GNU find and bash as your shell, replace <file with < <(find ...) in the above. Even better, in that case, is to use NUL-delimited names, such that filenames with newlines (yes, they're legal) don't trip you up:
if { IFS='' read -r -d '' result && ! read -r -d '' -n 1 _; } \
< <(find ... -print0); then
printf 'Exactly one file: %q\n' "$result"
else
echo "Either no results, or more than one"
fi
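The same early-exit test can be wrapped in a small function. This is only a sketch: the names exactly_one_line and only_line are made up for illustration, and it assumes bash (for read -n).

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if FILE holds exactly one
# newline-terminated line. On success that line is left in the
# global variable only_line.
exactly_one_line() {
    local extra
    only_line=''
    {
        IFS= read -r only_line &&   # a first complete line must exist
        ! read -r -n 1 extra        # and not even one more character may follow
    } < "$1"
}
```

With the script from the question this could be used as exactly_one_line "$temp" && edit "$only_line". A single line without a trailing newline counts as "not exactly one line" here, because read reports failure when it hits end-of-file before the delimiter.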

Well, given that you are storing these results in the file $temp this is a little easier:
[ "$( wc -l < $temp )" -eq 1 ] && edit "$( cat $temp )"
Instead of 'cat $temp' you can do '< $temp', but it might take away some readability if you are not very familiar with redirection 8)

If you want to test whether the file is empty or not, test -s does that.
if [ -s "$temp" ]; then
edit `cat "$temp"`
fi
(A non-empty file normally contains at least one line, and wc -l will agree as long as the content ends with a newline.)
If you genuinely want a line count of exactly one, then yes, it can be simplified substantially;
if [ $( wc -l <"$temp" ) = 1 ]; then
edit `cat "$temp"`
fi

You can use arrays:
x=($(find . -type f))
[ "${#x[*]}" -eq 1 ] && echo "just one" || echo "many"
But you might have problems in case of filenames with whitespace, etc.
Still, something like this would be a native way
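If the whitespace problem matters, one possible variation is to fill the array from NUL-delimited output instead of relying on word splitting. A sketch, assuming bash 4.4 or later (for mapfile -d) and a find that supports -print0; collect_matches and matches are illustrative names:

```bash
#!/usr/bin/env bash
# Hypothetical helper: run find with the given arguments and store every
# result in the global array "matches", one element per file name.
collect_matches() {
    matches=()
    # -d '' splits on NUL bytes, so spaces and newlines in names survive
    mapfile -d '' -t matches < <(find "$@" -print0)
}

# Example:
#   collect_matches . -type f
#   [ "${#matches[@]}" -eq 1 ] && echo "just one" || echo "not exactly one"
```

The count test is the same as above; only the way the array is filled changes.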

No, this is the way, though you're making it over-complicated:
if [ "`wc -l $temp | cut -d' ' -f1`" = "1" ]; then
edit "$temp";
fi
What's complicating it is:
a useless use of cat,
an unnecessary use of xargs,
and I'm not sure you really want edit `cat $temp`, which edits the file named by the contents of $temp.

Related

Delete empty files - Improve performance of logic

I need to find and remove empty files. In my use case, an empty file is one that has zero lines.
I did try testing whether the file is empty. However, that behaves strangely: even though the file is empty, the test does not detect it as such.
Hence, the best thing I could write is the script below, which is way too slow given that it has to test several hundred thousand files.
#!/bin/bash
LOOKUP_DIR="/path/to/source/directory"
cd ${LOOKUP_DIR} || { echo "cd failed"; exit 0; }
for fname in $(realpath */*)
do
if [[ $(wc -l "${fname}" | awk '{print $1}') -eq 0 ]]
then
echo "${fname}" is empty
rm -f "${fname}"
fi
done
Is there a better way to do what I'm after or alternatively, can the above logic be re-written in a way that brings better performance please?
Your script is slow because wc reads every file to the end, which is not needed for your purpose. This might be what you're looking for:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || echo rm -f -- "$file"
fi
done
Drop the echo after making sure it works as intended.
Another version, calling the rm only once, could be:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || files_to_be_deleted+=("$file")
fi
done
rm -f -- "${files_to_be_deleted[@]}"
Explanation:
The core logic is in the line
read < "$file" || rm -f -- "$file"
The read < "$file" command attempts to read a line from the $file. If it succeeds, that is, a line is read, then the rm command on the right-hand side of the || won't be executed (that's how the || works). If it fails then the rm command will be executed. In any case, at most one line will be read. This has great advantage over the wc command because wc would read the whole file.
if ! read < "$file"; then rm -f -- "$file"; fi
could be used instead. The two lines are equivalent.
To check whether "$fname" exists and is non-empty, use [ -s "$fname" ]:
#!/usr/bin/env sh
LOOKUP_DIR="/path/to/source/directory"
for fname in "$LOOKUP_DIR"/*/*; do
if ! [ -s "$fname" ]; then
echo "${fname}" is empty
# remove echo when output is what you want
echo rm -f "${fname}"
fi
done
See: help test:
File operators:
...
-s FILE True if file exists and is not empty.
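Note that -s and the read test answer slightly different questions, which may be why -s looked wrong to the asker (that is a guess; the question does not say). A small sketch, with has_complete_line as a made-up name: a file holding bytes but no newline is non-empty for -s, yet has zero lines as far as wc -l and read are concerned.

```bash
#!/usr/bin/env bash
# Hypothetical helper: true only if FILE begins with a newline-terminated line.
has_complete_line() {
    local first
    IFS= read -r first < "$1"
}

# Example (file name is a placeholder):
#   printf 'abc' > no_newline.txt
#   [ -s no_newline.txt ] && echo "non-empty according to -s"
#   has_complete_line no_newline.txt || echo "zero lines according to read"
```

Which test is right depends on whether "empty" means zero bytes or zero lines.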
Yet another method
wc -l ~/tmp/* 2>/dev/null | awk '$1 == 0 {print $2}' | xargs echo rm
This will break if any of your files have whitespace in the name.
To work around that, with awk still
wc -l ~/tmp/* 2>/dev/null \
| awk 'sub(/^[[:blank:]]+0[[:blank:]]+/, "")' \
| xargs echo rm
This works because the sub function returns the number of substitutions made, which can be treated as a boolean zero/not-zero condition.
Remove the echo to actually delete the files.

Shell: Add string to the end of each line, which match the pattern. Filenames are given in another file

I'm still new to the shell and need some help.
I have a file stapel_old.
Also I have in the same directory files like english_old_sync, math_old_sync and vocabulary_old_sync.
The content of stapel_old is:
english
math
vocabulary
The content of e.g. english is:
basic_grammar.md
spelling.md
orthography.md
I want to manipulate all files which are given in stapel_old like in this example:
take the first line of stapel_old 'english', (after that math, and so on)
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
search in english_old_sync line by line for the pattern '.md'
And append to each line after .md :::#a1
The result should be e.g. of english_old_sync:
basic_grammar.md:::#a1
spelling.md:::#a1
orthography.md:::#a1
of math_old_sync:
geometry.md:::#a1
fractions.md:::#a1
and so on. stapel_old should stay unchanged.
How can I realize that?
I tried with sed -n, while loop (while read -r line), and I'm feeling it's somehow the right way - but I still get errors and not the expected result after 4 hours inspecting and reading.
Thank you!
EDIT
Here is the working code (The files are stored in folder 'olddata'):
clear
echo -e "$(tput setaf 1)$(tput setab 7)Learning directories:$(tput sgr 0)\n"
# put here directories which should not become flashcards, command: | grep -v 'name_of_directory_which_not_to_learn1' | grep -v 'directory2'
ls ../ | grep -v 00_gliederungsverweise | grep -v 0_weiter | grep -v bibliothek | grep -v notizen | grep -v Obsidian | grep -v z_nicht_uni | tee olddata/stapel_old
# count folders
echo -ne "\nHow much different folders: " && wc -l olddata/stapel_old | cut -d' ' -f1 | tee -a olddata/stapel_old
echo -e "Are this learning directories correct? [j ODER y]--> yes; [Other]-->no\n"
read lernvz_korrekt
if [ "$lernvz_korrekt" = j ] || [ "$lernvz_korrekt" = y ];
then
read -n 1 -s -r -p "Learning directories correct. Press any key to continue..."
else
read -n 1 -s -r -p "Learning directories not correct, please change in line 4. Press any key to continue..."
exit
fi
echo -e "\n_____________________________\n$(tput setaf 6)$(tput setab 5)Found cards:$(tput sgr 0)$(tput setaf 6)\n"
#GET && WRITE FOLDER NAMES into olddata/stapel_old
anzahl_zeilen=$(cat olddata/stapel_old |& tail -1)
#GET NAMES of .md files of every stapel and write All to 'stapelname'_old_sync
i=0
name="var_$i"
for (( num=1; num <= $anzahl_zeilen; num++ ))
do
i="$((i + 1))"
name="var_$i"
name=$(cat olddata/stapel_old | sed -n "$num"p)
find ../$name/ -name '*.md' | grep -v trash | grep -v Obsidian | rev | cut -d'/' -f1 | rev | tee olddata/$name"_old_sync"
done
(tput sgr 0)
I tried to add:
input="olddata/stapel_old"
while IFS= read -r line
do
sed -n "$line"p olddata/stapel_old
done < "$input"
The code to change only the english_old_sync is:
lines=$(wc -l olddata/english_old_sync | cut -d' ' -f1)
for ((num=1; num <= $lines; num++))
do
content=$(sed -n "$num"p olddata/english_old_sync)
sed -i "s/"$content"/""$content":::#a1/g"" olddata/english_old_sync
done
So now this needs to be an inner for-loop inside an outer for-loop that holds the variable for english, right?
stapel_old should stay unchanged.
You could try a while + read loop and embed sed inside the loop.
#!/usr/bin/env bash
while IFS= read -r files; do
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
cp copies the file under a new name. If the goal is to rename the original file listed in stapel_old, change cp to mv.
The -n and -i flags of sed were omitted; include them if needed.
The script also assumes that stapel_old contains no empty or blank lines. If it might, add a test right after the do line:
[[ -n $files ]] || continue
It also assumes that every name listed in stapel_old is an existing file. To be safe, add another test:
[[ -e $files ]] || { printf >&2 '%s no such file or directory.\n' "$files"; continue; }
Or an if statement.
if [[ ! -e $files ]]; then
printf >&2 '%s no such file or directory\n' "$files"
continue
fi
See also help test
See also help continue
Combining them all together should be something like:
#!/usr/bin/env bash
while IFS= read -r files; do
[[ -n $files ]] || continue
[[ -e $files ]] || {
printf >&2 '%s no such file or directory.\n' "$files"
continue
}
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/stapel_old
Remove the echos once you're satisfied with the output, so that the script actually copies/renames and edits the files.
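If the *_old_sync files already exist, as the question describes, the copy step may not be needed at all. The following is a sketch under that assumption; tag_md_lines and its two parameters (the list file and the directory holding the *_old_sync files) are hypothetical names. It writes through a temporary file instead of relying on sed -i, whose syntax differs between GNU and BSD sed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: for every name listed in LIST, append ":::#a1" to each
# line ending in ".md" inside DIR/<name>_old_sync. LIST itself is only read.
tag_md_lines() {
    local list=$1 dir=$2 name target tmp
    # "|| [[ -n $name ]]" also handles a last line without a trailing newline
    while IFS= read -r name || [[ -n $name ]]; do
        [[ -n $name ]] || continue                 # skip blank lines
        target="$dir/${name}_old_sync"
        [[ -f $target ]] || { printf >&2 '%s: no such file\n' "$target"; continue; }
        tmp=$(mktemp) || return
        # & stands for the matched ".md", so the marker is appended after it
        if sed 's/\.md$/&:::#a1/' "$target" > "$tmp"; then
            cat "$tmp" > "$target"                 # keeps the permissions of $target
        fi
        rm -f -- "$tmp"
    done < "$list"
}
```

Called as tag_md_lines olddata/stapel_old olddata. Running it twice should be harmless, because tagged lines no longer end in .md.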

Add character to file name if duplicate when moving with bash

I currently use a bash script and PDFgrep to rename files to a certain structure. However, in order to stop overriding if the new file has a duplicate name, I want to add a number at the end of the name. Keep in mind that there may be 3 or 4 duplicate names. What's the best way to do this?
#!/bin/bash
if [ $# -ne 1 ]; then
echo Usage: Renamer file
exit 1
fi
f="$1"
id1=$(pdfgrep -m 1 -i "MR# : " "$f" | grep -oE "[M][0-9][0-9]+") || continue
id2=$(pdfgrep -m 1 -i "Visit#" "$f" | grep -oE "[V][0-9][0-9]+") || continue
{ read today; read dob; read dop; } < <(pdfgrep -i " " "$f" | grep -oE "[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]")
dobsi=$(echo $dob | sed -e 's/\//-/g')
dopsi=$(echo $dop | sed -e 's/\//-/g')
mv -- "$f" "${id1}_${id2}_$(printf "$dobsi")_$(printf "$dopsi")_1.pdf"
Use a loop that checks if the destination filename exists, and increments a counter if it does. Replace the mv line with this:
prefix="${id1}_${id2}_${dobsi}_${dopsi}"
counter=0
while true
do
if [ "$counter" -ne 0 ]
then target="${prefix}_${counter}.pdf"
else target="${prefix}.pdf"
fi
if [ ! -e "$target" ]
then
mv -- "$f" "$target"
break
fi
((counter++))
done
Note that this suffers from a TOCTTOU (time-of-check to time-of-use) problem if a file with the same name is created between the ! -e "$target" test and the mv. I thought it would be possible to replace the existence check with mv -n; but while that won't overwrite the file, it still treats the mv as successful, so you can't test the result to see if you need to increment the counter.
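One way to narrow that race is to let a single operation both check and claim the name. The sketch below uses ln, which refuses to create a hard link if the target already exists, and then removes the source. It assumes source and target are on the same filesystem and that the filesystem supports hard links; move_unique is a hypothetical name, not something from the script above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: move SRC to PREFIX.pdf, or PREFIX_1.pdf, PREFIX_2.pdf,
# and so on, without overwriting anything. Prints the name that was used.
move_unique() {
    local src=$1 prefix=$2 counter=0 target
    [[ -f $src ]] || return 1
    while :; do
        if (( counter == 0 )); then
            target="${prefix}.pdf"
        else
            target="${prefix}_${counter}.pdf"
        fi
        # ln fails if $target exists, so checking and claiming are one step
        if ln -- "$src" "$target" 2>/dev/null; then
            rm -f -- "$src"
            printf '%s\n' "$target"
            return 0
        fi
        # ln failed although the name is free: some other error, give up
        [[ -e $target || -L $target ]] || return 1
        counter=$(( counter + 1 ))
    done
}
```

In the renaming script this would replace the whole loop with something like move_unique "$f" "$prefix".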

Bash command - using IF - FI within a DO - DONE

I'm trying to run a command that should find PHP files that contain "base64_decode" and/or "eval", echo the file name, print the top three lines, if the file contains more than 3 lines, also the bottom 3.
I have the following at the moment:
for file in $(find . -name "*.php" -exec grep -il "base64_decode\|eval" {} \;); do echo $file; head -n 3 $file; if [ wc -l < $file -gt 3 ]; then tail -n 3 $file fi; done | less
This returns the following error:
bash: syntax error near unexpected token `done'
I would use the following:
while read -r file
do
echo ==$file==
head -n 3 "$file"
[[ $(grep -c '' "$file") -gt 3 ]] && (echo ----last-3-lines--- ; tail -n 3 "$file")
done < <(find . -name \*.php -exec grep -il 'base64_decode\|eval' {} \+)
Using while instead of for is better because the filenames could contain spaces (probably not in this case, but anyway).
Using grep -c '' "$file" is sometimes better than wc -l: wc counts the \n characters in the file, so it misses a last line that does not end with \n.
Using find with \+ instead of \; is more efficient.
Problem seems to be here:
if [ wc -l < $file -gt 3 ]; then
Since you need to use command substitution here to make sure wc -l command executes first and then compare the result:
if [[ $(wc -l < "$file") -gt 3 ]]; then
You want to execute your wc, more like:
if [[ $(wc -l < $file) -gt 3 ]]; then
try this:
#!/bin/bash
for file in $(grep -H "base64_decode\|eval" ./*.php | cut -d: -f1);
do
echo $file;
head -n 3 $file;
if [[ $(wc -l < $file) -gt 3 ]];
then
tail -n 3 $file
fi;
done
I tested and seems to work fine.
But be careful: if the PHP file has 4 lines, you will see:
line1
line2
line3
line2
line3
line4
EDIT: changed the script above to grep inside files.
cat a.php
asdasd
asd
base64_decode
l
a
and result
./test2.sh
./a.php
asdasd
asd
base64_decode
base64_decode
l
a
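The overlap shown in these outputs can be avoided by printing only as many trailing lines as were not already shown by head. A minimal sketch; head_and_tail is an illustrative name, and the line count uses grep -c '' so that a last line without a newline is counted too:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first 3 lines of FILE, then up to 3 trailing
# lines that head has not already printed.
head_and_tail() {
    local file=$1 total rest
    # "|| true" because grep exits non-zero when the count is 0
    total=$(grep -c '' -- "$file" || true)
    head -n 3 -- "$file"
    rest=$(( total - 3 ))          # lines left after the first three
    (( rest > 3 )) && rest=3       # never show more than three of them
    if (( rest > 0 )); then
        tail -n "$rest" -- "$file"
    fi
}
```

For a 4-line file this prints each of the four lines once.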

Find lines containing all keywords in bash script

Essentially, I would like something that behaves similarly to:
cat file | grep -i keyword1 | grep -i keyword2 | grep -i keyword3
How can I do this with a bash script that takes a variable-length list of keyword arguments? The script should do a case-insensitive match of lines containing all keywords.
Use this as a script
#! /bin/bash
awk -v IGNORECASE=1 -f <(
P=; for k; do [ -z "$P" ] && P="/$k/" || P="$P&&/$k/"; done
echo "$P{print}"
)
and invoke it as
script.sh keyword1 keyword2 keyword3 < file
I don't know if this is efficient, and I think this is ugly, also there might be some utility for that, but:
#!/bin/bash
unset keywords matchlist
keywords=("$@")
for kw in "${keywords[@]}"; do
matchlist="$matchlist /$kw/ &&"
done
matchlist="${matchlist% &&}"
# awk "$matchlist { print; }" < <(tr '[:upper:]' '[:lower:]' <file)
awk "$matchlist { print; }" file
And yes, it needs some robustness regarding special characters and stuff. It's just to show the idea.
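As a sketch of that robustness, the keywords can be passed to awk as data rather than pasted into the program text, so regex metacharacters and slashes are matched literally. match_all_awk is a made-up name; only POSIX awk features (tolower, index, ARGV) are used.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the lines of FILE that contain every keyword,
# ignoring case and treating keywords as plain strings.
match_all_awk() {
    local file=$1
    shift
    awk '
        BEGIN {
            # ARGV[1] is the file, everything after it is a keyword
            n = ARGC
            for (i = 2; i < n; i++) {
                kw[i] = tolower(ARGV[i])
                ARGV[i] = ""        # empty entries are skipped as input files
            }
        }
        {
            line = tolower($0)
            for (i = 2; i < n; i++)
                if (index(line, kw[i]) == 0) next
            print
        }
    ' "$file" "$@"
}
```

Example: match_all_awk file keyword1 keyword2 keyword3.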
Give this a try:
shopt -s nocasematch
keywords="keyword1|keyword2|keyword3"
while read line; do [[ $line =~ $keywords ]] && echo $line; done < file
Edit:
Here's a version that tests for all keywords being present, not just any:
keywords=(keyword1 keyword2 keyword3) # or keywords=("$@")
qty=${#keywords[@]}
while read line
do
count=0
for keyword in "${keywords[@]}"
do
[[ "$line" =~ $keyword ]] && (( count++ ))
done
if (( count == qty ))
then
echo $line
fi
done < textlines
Found a way to do this with grep.
KEYWORDS=$@
MATCH_EXPR="cat file"
for keyword in ${KEYWORDS};
do
MATCH_EXPR="${MATCH_EXPR} | grep -i ${keyword}"
done
eval ${MATCH_EXPR}
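The same pipeline can be built without eval, which avoids re-parsing keywords that contain spaces or shell metacharacters. A minimal sketch, assuming the input arrives on stdin; match_all is an illustrative name, and -F makes grep treat each keyword as a fixed string rather than a regex:

```bash
#!/usr/bin/env bash
# Hypothetical helper: filter stdin down to the lines that contain every
# keyword given as an argument (case-insensitive, literal match).
match_all() {
    if (( $# == 0 )); then
        cat                         # no keywords left: pass everything through
    else
        local kw=$1
        shift
        # filter on the first keyword, then recurse for the remaining ones
        grep -i -F -e "$kw" | match_all "$@"
    fi
}
```

Example: match_all keyword1 keyword2 keyword3 < file.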
You can use bash 4.0 or later:
shopt -s nocasematch
while read -r line
do
f=0 g=0 # reset the flags for every line
case "$line" in
*keyword1*) f=1;;&
*keyword2*) g=1;;&
*keyword3*)
[ "$f" -eq 1 ] && [ "$g" -eq 1 ] && echo "$line";;
esac
esac
done < "file"
shopt -u nocasematch
or gawk
gawk '/keyword1/&&/keyword2/&&/keyword3/' file
I'd do it in Perl.
For finding all lines that contain at least one of them:
perl -ne'print if /(keyword1|keyword2|keyword3)/i' file
For finding all lines that contain all of them:
perl -ne'print if /keyword1/i && /keyword2/i && /keyword3/i' file
Here is a script called search.sh in bash that will search lines within a file or folder for all keywords specified:
#!/bin/bash
if [ $# -lt 2 ]; then
echo "[-] $0 file_to_search/folder_to_search keyword1 keyword2 keyword3 ..."
exit
fi
all_args="$@"
i=0
results="" # this will store the cumulative results from each keyword search
for arg in $all_args; do
if [ $i -eq 0 ]; then
# first argument is the file/folder to search
file_to_search="$arg"
i=$(($i + 1))
elif [ $i -eq 1 ]; then
# search the file/folder with first keyword (first search)
results=`grep --color=always -r -n -i "$arg" "$file_to_search"`
i=$(($i + 1))
else
# now keep searching the results from first search for other keywords
results=`echo "$results" | grep --color=always -i "$arg"`
i=$(($i + 1))
fi
done
echo "$results"
Example invocation of script above will search the 'tools.txt' file for 'python' and 'jira' keywords:
./search.sh tools.txt python jira

Resources