How do I get file names with a certain extension in bash? [duplicate]

This question already has answers here:
How to loop through file names returned by find?
(17 answers)
Closed 12 months ago.
Here I'm trying to uncompress ROM (iso) files, which usually come as zip or 7z archives, and once I have the iso I'd like to compress it again to chd (a format readable by the emulator). I thought I could use the find command to look up the files. When I just execute find, the files are displayed properly (one per line), but when I try to get each file name to process it, the name seems to get split on spaces (yes, these files have spaces in them) instead of being kept as the full filename. It's worth mentioning that these iso files sit inside a subdirectory named the same as the file itself (without the .iso, obviously). This is what I'm trying:
#!/bin/bash
dir="/home/creeper/Downloads/"
dest="/home/creeper/Documents/"
for i in $(find $dir -name '*.7z' -or -name '*.zip' -or -name '*.iso');
do
if [[ $i == *7z ]]
then
7z x $i
rm -fr $i
fi
if [[ $i == *zip ]]
then
unzip $i
rm -fr $i
fi
if [[ $i == *iso ]]
then
chd_file="${i%.*}.chd"
chdman createcd -i $i -o $chd_file;
mv -v $chd_file $dest
rm -fr $i
fi
done

when I try to get each file name to process it, the name seems to get split on spaces (yes, these files have spaces in them) instead of being kept as the full filename
That's because the shell word-splits the unquoted command substitution before for ever sees it, so each space-separated chunk becomes a separate loop item. See Don't Read Lines with For on the Bash wiki for details.
One alternative is to use bash's extended globbing features instead of find:
#!/usr/bin/env bash
shopt -s extglob globstar
dir="/home/creeper/Downloads/"
for i in "$dir"/**/*.@(7z|zip|iso); do
# Remember to quote expansions of $i!
# ...
done
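Later answers in this thread use find with -print0 for the same problem; applied here, that approach also keeps spaced names intact. A minimal sketch (the throwaway directory, file names, and echoed commands below are illustrative only; the real chdman/unzip calls are stubbed out):

```shell
#!/usr/bin/env bash
# Sketch: NUL-delimited find output keeps spaced filenames intact.
# The directory and file names are made up for the demo.
tmp=$(mktemp -d)
touch "$tmp/Some Game (USA).iso" "$tmp/Another Game.zip"

count=0
while IFS= read -r -d '' i; do
    count=$((count + 1))
    # $i is the complete path, spaces included; quote it wherever it is used.
    case $i in
        *.iso) echo "would run: chdman createcd -i \"$i\"" ;;
        *.zip) echo "would run: unzip \"$i\"" ;;
    esac
done < <(find "$tmp" \( -name '*.7z' -or -name '*.zip' -or -name '*.iso' \) -print0)

echo "files seen: $count"
rm -rf "$tmp"
```

The process substitution (rather than a pipe) keeps the while loop in the current shell, so variables set inside it survive the loop.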

Related

How to search for keywords in metadata across all files in a folder recursively?

I need to search all subdirectories and files recursively from a location and print out any files that contains metadata matching any of my specified keywords.
e.g. If John Smith was listed as the author of hello.js in the metadata and one of my keywords was 'john' I would want the script to print hello.js.
I think the solution could be a combination of mdls and grep but I have not used bash much before so am a bit stuck.
I have tried the following command but this only prints the line the keyword is on if 'john' is found.
mdls hello.js | grep john
Thanks in advance.
(For reference I am using macOS.)
Piping the output of mdls into grep as you show in your question doesn't carry forward the filename. The following script iterates recursively over the files in the selected directory and checks to see if one of the attributes matches the desired pattern (using regex). If it does, the filename is output.
#!/bin/bash
shopt -s globstar # expand ** recursively
shopt -s nocasematch # ignore case
pattern="john"
attrib=Author
for file in /Users/me/myfiles/**/*.js
do
attrib_value=$(mdls -name "$attrib" "$file")
if [[ $attrib_value =~ $pattern ]]
then
printf 'Pattern: %s found in file %s\n' "$pattern" "$file"
fi
done
You can use a literal test instead of a regular expression:
if [[ $attrib_value == *$pattern* ]]
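The two tests agree for a plain word like this, and nocasematch applies to both. A quick check using a mocked-up sample string in place of real mdls output (the attribute value below is an assumption for the demo, not actual mdls output):

```shell
#!/usr/bin/env bash
shopt -s nocasematch
# Mocked-up attribute value; real mdls output will differ in detail.
attrib_value='kMDItemAuthors = ("John Smith")'
pattern="john"

regex_hit=no glob_hit=no
[[ $attrib_value =~ $pattern ]] && regex_hit=yes    # regex: pattern anywhere in the string
[[ $attrib_value == *$pattern* ]] && glob_hit=yes   # glob: literal substring match
echo "regex=$regex_hit glob=$glob_hit"
```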
In order to use globstar you will need a later version of Bash than the one installed by default on macOS. If that's not possible then you can use find, but there are challenges in dealing with filenames that contain newlines. This script takes care of that:
#!/bin/bash
shopt -s nocasematch # ignore case
dir=/Users/me/myfiles/
check_file () {
local attrib=$1
local pattern=$2
local file=$3
local attrib_value=$(mdls -name "$attrib" "$file")
if [[ $attrib_value =~ $pattern ]]
then
printf 'Pattern: %s found in file %s\n' "$pattern" "$file"
fi
}
export -f check_file
pattern="john"
attrib=Author
export pattern attrib
find "$dir" -name '*.js' -print0 | xargs -0 -I {} bash -c 'check_file "$attrib" "$pattern" "$1"' _ {}

Remove numbers at beginning of filenames in directory in bash

In an attempt to rename the files in one directory by adding numbers at the front, I made an error in my script so that this happened in the wrong directory. I therefore now need to remove these numbers from the beginning of all the filenames in that directory; they range from 1 to 3 digits. Examples of the filenames I am working with:
706terrain_Slope1000m_Minimum_all_25PCs_bolt_all_25PCs_qq_bolt.png
680met_sfcWind_all_25PCs_bolt_number.txt
460greenness_NDVI_500m_min_all_25PCs_bolt_number.txt
I was thinking of using mv but I'm not really sure how to do it with varying numbers of digits at the beginning, so any advice would be appreciated!
A simple way in bash is making use of a regular expression test:
for file in *; do
[[ -f "${file}" ]] && [[ "${file}" =~ (^[0-9]+) ]] && mv -- "${file}" "${file/${BASH_REMATCH[1]}}"
done
This does the following:
[[ -f "${file}" ]]: test if file is a file, if so
[[ "${file}" =~ (^[0-9]+) ]]: check if file starts with a number
${file/${BASH_REMATCH[1]}}: remove the number from the string file by using BASH_REMATCH, a variable that matches the groupings from the regex match.
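A quick illustration of the BASH_REMATCH mechanics on one of the sample names (no files are touched; the name is just a string here):

```shell
#!/usr/bin/env bash
file="706terrain_Slope1000m_Minimum.png"
if [[ $file =~ (^[0-9]+) ]]; then
    prefix=${BASH_REMATCH[1]}   # captured group: the leading digits, "706"
    newname=${file/$prefix}     # delete the first occurrence of that prefix
fi
echo "$file -> $newname"
```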
If you've got perl's rename installed, the following should work :
rename 's/^[0-9]{1,3}//' /path/to/files
/path/to/files can be a list of specific files, or probably in your case a glob (e.g. *.{png,txt}). You don't need to select only files starting with digits as rename won't modify those that do not.
Using bash parameter expansion:
shopt -s extglob
for i in +([0-9])*.{txt,png}; do
mv -- "$i" "${i##+([0-9])}"
done
This will remove the starting digits (any number of them) from filenames having png and txt extensions.
The ## removes the longest matching prefix pattern.
The +(...) is extended-glob syntax matching one or more occurrences of the enclosed pattern (enabled by shopt -s extglob).
And [0-9] matches a single digit.
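The expansion can be tried on a plain string before pointing it at real files:

```shell
#!/usr/bin/env bash
shopt -s extglob
name="460greenness_NDVI_500m_min.txt"
stripped=${name##+([0-9])}   # strip the longest run of leading digits
echo "$name -> $stripped"
```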
Alternate method using GNU find:
#!/usr/bin/env bash
find ./ \
    -maxdepth 1 \
    -type f \
    -name '[[:digit:]]*' \
    -exec bash -c 'shopt -s extglob; f="${1##*/}"; d="${1%%/*}"; mv -- "$1" "${d}/${f##+([[:digit:]])}"' _ {} \;
Find all actual files in current directory whose name start with a digit.
For each found file, execute the Bash script below:
shopt -s extglob # need for extended pattern syntax
f="${1##*/}" # Get file name without directory path
d="${1%%/*}" # Get directory path without file name
mv -- "$1" "${d}/${f##+([[:digit:]])}" # Rename without the leading digits
Using basic features of a POSIX-compliant shell:
#!/bin/sh
for f in [[:digit:]]*; do
    if [ -f "$f" ]; then
        pf="${f%"${f#???}"}"        # first three characters of the name
        pf="${pf##*[[:digit:]]}"    # drop everything up to the last digit among them
        mv "$f" "$pf${f#???}"       # reattach the remainder of the name
    fi
done

How to iterate over a directory and display only filename

I would want to iterate over contents of a directory and list only ordinary files.
The path of the directory is given as user input. The script works if the input is the current directory, but not with others.
I am aware that this can be done using ls, but I need to use a for ... in control structure.
#!/bin/bash
echo "Enter the path:"
read path
contents=$(ls $path)
for content in $contents
do
if [ -f $content ];
then
echo $content
fi
done
ls is only returning the file names, not including the path. You need to either:
Change your working directory to the path in question, or
Combine the path with the names for your -f test
Option #2 would just change:
if [ -f $content ];
to:
if [ -f "$path/$content" ];
Note that there are other issues here; ls may make changes to the output that break this, depending on wrapping. If you insist on using ls, you can at least make it (somewhat) safer with:
contents="$(command ls -1F "$path")"
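For completeness, the breakage is easy to reproduce: the unquoted command substitution is word-split, so one spaced name becomes several loop words (the throwaway directory below is just for the demo):

```shell
#!/usr/bin/env bash
# Demo of the word-splitting failure mode with an ls-based loop.
tmp=$(mktemp -d)
touch "$tmp/two words.txt"

pieces=0
for content in $(ls "$tmp"); do
    pieces=$((pieces + 1))   # "two words.txt" arrives as two separate words
done
echo "loop iterations: $pieces"
rm -rf "$tmp"
```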
You have two ways of doing this properly:
Either loop through the * pattern and test file type:
#!/usr/bin/env bash
echo "Enter the path:"
read -r path
for file in "$path/"*; do
if [ -f "$file" ]; then
echo "$file"
fi
done
Or using find to iterate a null delimited list of file-names:
#!/usr/bin/env bash
echo "Enter the path:"
read -r path
while IFS= read -r -d '' file; do
echo "$file"
done < <(
find "$path" -maxdepth 1 -type f -print0
)
The second way is preferred since it will properly handle files with special characters and offload the file-type check to the find command.
Use find, set to search for files (-type f), starting from the $path directory:
find "$path" -type f
Here is what you could write:
#!/usr/bin/env bash
path=
while [[ ! $path ]]; do
read -p "Enter path: " path
done
for file in "$path"/*; do
[[ -f $file ]] && printf '%s\n' "$file"
done
If you want to traverse all the subdirectories recursively looking for files, you can use globstar:
shopt -s globstar
for file in "$path"/**; do
    [[ -f $file ]] && printf '%s\n' "$file"
done
In case you are looking for specific files based on one or more patterns or some other condition, you could use the find command to pick those files. See this post:
How to loop through file names returned by find?

How to recursively traverse a directory tree and find only files?

I am working on a scp call to download a folder present on a remote system. Downloaded folder has subfolders and within these subfolders there are a bunch of files which I want to pass as arguments to a python script like this:
scp -r researcher#192.168.150.4:SomeName/SomeNameElse/$folder_name/ $folder_name/
echo "File downloaded successfully"
echo "Running BD scanner"
for d in $folder_name/*; do
if [[ -d $d ]]; then
echo "It is a directory"
elif [[ -f $d ]]; then
echo "It is a file"
echo "Running the scanner :"
python bd_scanner_new.py /home/nsadmin/Some/bash_script_run_files/$d
else
echo "$d is invalid file"
exit 1
fi
done
I have added the logic to find if there are any directories and excluding them. However, I don't traverse down those directories recursively.
Partial results below:
File downloaded successfully
Running BD scanner
It is a directory
It is a directory
It is a directory
Exiting
I want to improve this code so that it traverses all directories and picks up all files. Please help me with any suggestions.
You can use shopt -s globstar in Bash 4.0+:
#!/bin/bash
shopt -s globstar nullglob
cd _your_base_dir
for file in **/*; do
    # loop over all the regular files across the entire tree;
    # names with whitespace or other special characters are handled gracefully
    [[ -f $file ]] || continue
    python bd_scanner_new.py "$file"
done
Bash manual says this about globstar:
If set, the pattern ‘**’ used in a filename expansion context will
match all files and zero or more directories and subdirectories. If
the pattern is followed by a ‘/’, only directories and subdirectories
match.
More globstar discussion here: https://unix.stackexchange.com/questions/117826/bash-globstar-matching
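A self-contained sketch of that traversal, using throwaway files in place of the downloaded folder (and echo standing in for the python call):

```shell
#!/usr/bin/env bash
shopt -s globstar nullglob
# Stand-in tree for the downloaded folder.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub/deeper"
touch "$tmp/top.bin" "$tmp/sub/deeper/spaced name.bin"

found=0
for file in "$tmp"/**/*; do
    [[ -f $file ]] || continue      # skip the directories ** also matches
    found=$((found + 1))
    echo "would scan: $file"        # e.g. python bd_scanner_new.py "$file"
done
echo "regular files: $found"
rm -rf "$tmp"
```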
Why go through the trouble of using globbing for file matching? Rather, use find, which is meant for this, feeding a while loop through process substitution (<(...)).
#!/bin/bash
while IFS= read -r -d '' file; do
# single filename is in $file
python bd_scanner_new.py "$file"
done < <(find "$folder_name" -type f -print0)
Here, find does a recursive search for all the files from the given path down through any level of sub-directories. Filenames can contain blanks, tabs, and even newlines; to process them safely, find is used with -print0, which prints each filename with all its characters intact, terminated by a NUL byte, and the read command consumes the list using that same delimiter (-d '').
On a side note: always double-quote variables in bash to avoid unwanted word splitting and globbing by the shell.

Shell Script to list files in a given directory and if they are files or directories

Currently learning some bash scripting, and I'm having an issue with a question that involves listing all files in a given directory and stating whether each is a file or a directory. The issue I am having is that I only get either my current directory or, if I specify a directory, it just tells me that the directory is a directory, e.g. /home/user/shell_scripts will return "shell_scripts is a directory" rather than listing the files contained within it.
This is what I have so far:
dir=$dir
for file in $dir; do
    if [[ -d $file ]]; then
        echo "$file is a directory"
    fi
    if [[ -f $file ]]; then
        echo "$file is a regular file"
    fi
done
Your line:
for file in $dir; do
will expand $dir just to a single directory string. What you need to do is expand that to a list of files in the directory. You could do this using the following:
for file in "${dir}/"* ; do
This expands the "${dir}/"* section into one word per entry in that directory. As Biffen points out, keeping ${dir} quoted guarantees the file list won't end up with split partial file names when any of them contain whitespace.
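A runnable sketch of that expansion against a throwaway directory (names chosen to include a space):

```shell
#!/usr/bin/env bash
# Glob expansion yields one word per directory entry, spaces and all.
tmp=$(mktemp -d)
touch "$tmp/plain file.txt"
mkdir "$tmp/sub dir"

entries=0
for file in "${tmp}/"*; do
    entries=$((entries + 1))
    [[ -d $file ]] && echo "$file is a directory"
    [[ -f $file ]] && echo "$file is a regular file"
done
echo "entries: $entries"
rm -rf "$tmp"
```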
If you want to recurse into the directories in dir then using find might be a better approach. Simply use:
for file in $( find ${dir} ); do
Note that while simple, this will not handle files or directories with spaces in them. Because of this, I would be tempted to drop the loop and generate the output in one go. This might be slightly different than what you want, but is likely to be easier to read and a lot more efficient, especially with large numbers of files. For example, To list all the directories:
find ${dir} -maxdepth 1 -type d
and to list the files:
find ${dir} -maxdepth 1 -type f
If you want to iterate into the directories below, remove the -maxdepth 1.
This is a good use for globbing:
for file in "$dir/"*
do
[[ -d "$file" ]] && echo "$file is a directory"
[[ -f "$file" ]] && echo "$file is a regular file"
done
This will work even if files in $dir have special characters in their names, such as spaces, asterisks and even newlines.
Also note that variables should be quoted ("$file"). But * must not be quoted. And I removed dir=$dir since it doesn't do anything (except break when $dir contains special characters).
ls -F ~ | \
sed 's#.*/$#/& is a Directory#;t quit;s#.*#/& is a File#;:quit;s/[*/=>#|] / /'
The -F "classify" switch appends a "/" if a file is a directory. The sed code prints the desired message, then removes the suffix.
for file in $(ls $dir)
do
[ -f $file ] && echo "$file is File"
[ -d $file ] && echo "$file is Directory"
done
or replace the
$(ls $dir)
with
`ls $dir`
If you want to list files that also start with . use:
for file in "${dir}/"* "${dir}/".[!.]* "${dir}/"..?* ; do
