Bash Scripting: Trying to find files using ls wild-carding

Here's my problem: I have to resolve various filenames/locations (data directory may have sub-directories) which are user-configurable. If I can resolve the filename completely prior to the loop the following script works:
[prompt] more test.sh
#! /usr/bin/env bash
newfile=actual-filename
for directory in `find -L ${FILE_PATH}/data -type d`; do
for filename in `ls -1 ${directory}/${newfile} 2>/dev/null`; do
if [[ -r ${filename} ]]; then
echo "Found ${filename}"
fi
done
done
[prompt] ./test.sh
Found ${SOME_PATH}/actual-filename
However, if newfile contains any wildcarding, the inner loop will not run, even if the pattern matches only a single file.
I would use find with some regex, but auto-generating the proper expressions and doing the substitutions for some things will be tricky (e.g. pgb.f0010{0930,1001}{00,06,12,18} would correlate to the files associated with Sep. 30 and Oct. 1 of 2010; the first grouping is computed by my script for a provided date):
pgb.f0010093000 pgb.f0010093006 pgb.f0010093012 pgb.f0010093018 pgb.f0010100100
pgb.f0010100106 pgb.f0010100112 pgb.f0010100118
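For reference, brace expansion alone already generates that set when typed at a prompt:
[prompt] echo pgb.f0010{0930,1001}{00,06,12,18}
pgb.f0010093000 pgb.f0010093006 pgb.f0010093012 pgb.f0010093018 pgb.f0010100100 pgb.f0010100106 pgb.f0010100112 pgb.f0010100118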
I'm running Fedora 15 64-bit.

newfile="*"
find -L "${FILE_PATH}/data" -name "${newfile}" \
| while IFS= read -r filename
do
if [[ -r ${filename} ]]; then
echo "Found ${filename}"
fi
done
-or-
newfile="*"
find -L ${FILE_PATH}/data -name "${newfile}" -readable -exec echo "Found {}" \;
-or with regular expressions-
newfile='.*/pgb.f0010(0930|1001)(00|06|12|18)'
FILE_PATH=.
find -L ${FILE_PATH}/. -regextype posix-extended \
-regex "${newfile}" -readable -exec echo "Found {}" \;

The root problem is that the broken script depends on a shell expansion that never happens: brace expansion runs before parameter expansion, so braces stored in a variable reach ls as literal characters. One fix is to re-parse the command with eval:
#! /usr/bin/env bash
FILE_PATH="."
newfile=pgb.f0010{0930,1001}{00,06,12,18}
for directory in `find -L ${FILE_PATH}/data -type d`; do
for filename in `eval ls -1 ${directory}/${newfile} 2>/dev/null`; do
if [[ -r ${filename} ]]; then
echo "Found ${filename}"
fi
done
done
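If you'd rather not eval an entire ls command line, a minimal sketch that confines eval to the brace expansion itself (the nullglob setting and FILE_PATH="." default are assumptions added here):
#! /usr/bin/env bash
shopt -s nullglob                # patterns with no match expand to nothing
FILE_PATH="."
newfile='pgb.f0010{0930,1001}{00,06,12,18}'
eval "names=( ${newfile} )"      # brace-expand the pattern into an array once
while IFS= read -r directory; do
    for name in "${names[@]}"; do
        for filename in "${directory}/"${name}; do   # unquoted ${name}: wildcards still glob
            [[ -r ${filename} ]] && echo "Found ${filename}"
        done
    done
done < <(find -L "${FILE_PATH}/data" -type d)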

Related

glob operator with for loop is stuck

I am trying to traverse all files in /home directory recursively. I want to do some linux command with each file . So, I am making use of for loop as below:
for i in /home/**/*
I have put below statements as start of script as well:
shopt -s globstar
shopt -s nullglob
But it's getting stuck in the for loop. It might be a problem with handling so many files. If I give the for loop some other directory (with fewer files), it traverses properly.
What else can I try?
Complete code:
#!/bin/bash
shopt -s globstar
shopt -s nullglob
echo "ggg"
for i in /home/**/*
do
NAME=${i}
echo "It's there." $NAME
if [ -f "$i" ]; then
echo "It's there." $NAME
printf "\n\n"
fi
done
Your code isn't getting stuck. It will just be very, very slow since it needs to build up the list of all files before entering the for loop. The standard alternative is to use find, but you need to be careful about what exactly you want to do. If you want it to behave exactly like your for loop, which means i) ignore hidden files (those whose name starts with .) and ii) follow symlinks, you can do this (assuming GNU find since you are on Linux):
find -L . -type f -not -name '.*' -printf '.\n' | wc -l
That will print a . for each file found, so wc -l will give you the number of files. The -L makes find dereference symlinks and the -not -name '.*' will exclude hidden files.
If you want to iterate over the output and do something to each file, you would need to use this:
find -L . -type f -not -name '.*' -print0 |
while IFS= read -r -d '' file; do
printf -- "FILE: %s\n" "$file"
done
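One caveat: a pipe runs the while body in a subshell, so variables set inside the loop don't survive it. If that matters, a common bash variant feeds the same loop through process substitution (a sketch along the lines of the command above):
count=0
while IFS= read -r -d '' file; do
    printf -- "FILE: %s\n" "$file"
    (( ++count ))
done < <(find -L . -type f -not -name '.*' -print0)
echo "Processed $count files"    # count is still visible here, unlike with a pipe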
Perhaps this approach will help:
#!/bin/bash
shopt -s globstar
shopt -s nullglob
echo "ggg"
for homedir in /home/*/
do
for i in "$homedir"**
do
NAME=${i}
echo "It's there." "$NAME"
if [ -f "$i" ]; then
echo "It's there." "$NAME"
printf "\n\n"
fi
done
done
Update: Another approach in pure bash might be
#!/bin/bash
shopt -s nullglob
walktree() {
local file
for file in *; do
[[ -L $file ]] && continue
if [[ -f $file ]]; then
# Do something with the file "$PWD/$file"
echo "$PWD/$file"
elif [[ -d $file ]]; then
cd "./$file" || exit
walktree
cd ..
fi
done
}
cd /home || exit
walktree
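If the goal is to run an arbitrary command on each file rather than just echo its path, one variant (a sketch, not part of the original answer) passes the command in as arguments:
#!/bin/bash
shopt -s nullglob
walktree() {                     # usage: walktree COMMAND [ARGS...]
    local file
    for file in *; do
        [[ -L $file ]] && continue
        if [[ -f $file ]]; then
            "$@" "$PWD/$file"    # run the supplied command on the file
        elif [[ -d $file ]]; then
            cd "./$file" || return
            walktree "$@"
            cd ..
        fi
    done
}
cd /home || exit
walktree ls -l                   # example: long-list every regular file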

bash move 500 directories at a time to subdirectory from a total of 160,000 directories

I needed to move a large s3 bucket to a local file store for a variety of reasons, and the files were stored as 160,000 directories with subdirectories.
As this is just far too many folders to look at with something like a gui FTP interface, I'd like to move the 160,000 root directories into, say, 320 directories - 500 directories in each.
I'm a newbie at bash scripting, and I just wrote this up, but I'm scared I'm going to mangle the whole thing and have to redo the transfer. I tested with [[ "$i" -ge 3 ]] and some directories with subdirectories, and it looked like it worked okay, but I'm quite nervous. I do not want to retransfer all this data.
i=0;
j=0;
for file in *; do
if [[ -d "$file" && ! -L "$file" ]];
then
[[ "$file" == assets_* ]] && continue; # skip the target directories themselves
((i++))
echo "directory $file is being written to assets_$j";
mkdir -p "./assets_$j";
mv "$file" "./assets_$j/";
if [[ "$i" -ge 499 ]];
then
((j++));
((i=0));
fi
fi;
done
Thanks for the help!
Find all the directories in the current folder.
Read the directories in chunks of 500.
Exec mkdir and mv for each chunk:
find . -mindepth 1 -maxdepth 1 -type d |
while readarray -n 500 -t files && (( ${#files[@]} )); do
dest="./assets_$((j++))/"
echo mkdir -v -p "$dest"
echo mv -v "${files[@]}" "$dest";
done
On the condition that assets_1, assets_2, etc. do not exist in the working directory yet:
dirs=(./*/)
for (( i=0, j=1; i<${#dirs[@]}; i+=500, j++ )); do
echo mkdir ./assets_$j/
echo mv "${dirs[@]:i:500}" ./assets_$j/
done
If you're happy with the output, remove the echos.
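Once the real run has finished, a quick sanity check can count how many directories landed in each bucket (a small sketch added here, not part of the original answer):
for d in ./assets_*/; do
    printf '%s: %d\n' "$d" "$(find "$d" -mindepth 1 -maxdepth 1 -type d | wc -l)"
done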
A possible way, but one where you have no control over the counter, is:
find . -type d -mindepth 1 -maxdepth 1 -print0 \
| xargs -0 -n 500 sh -c 'echo mkdir -v ./assets_$$ && echo mv -v "$@" ./assets_$$' _
This derives the assets counter from the PID, which only repeats once the PID wrap-around is reached (Linux PID recycling).
The order in which find returns entries is slightly different from the glob * (find's default ordering is directory order, not sorted).
If you want the order to be alphabetical, you can add a simple sort:
find . -type d -mindepth 1 -maxdepth 1 -print0 | sort -z \
| xargs -0 -n 500 sh -c 'echo mkdir -v ./assets_$$ && echo mv -v "$@" ./assets_$$' _
Note: remove the echos once you are pleased with the output.

shell script to find files older than x days and delete them if they were not listed in log files

I am a newbie to scripting and I need a little shell script doing the following:
find all .txt files that are older than x days
delete them if they are not listed in logfiles (text files and gzipped text files)
I know the basics about find -mtime, grep, zgrep, etc., but it is very tricky for me to get this into a working script.
I tried something like this:
#! /bin/sh
for file in $(find /test/ -iname '*.txt')
do
echo "$file" ls -l "$file"
echo $(grep $file /test/log/log1)
done
IFS='
'
for i in `find /test/ -ctime +10`; do
grep -q "$i" log || echo "$i" # replace echo with rm if satisfied
done
Sets the internal field separator to handle filenames that contain spaces.
Finds all files older than 10 days in the /test/ folder.
Greps for the path in the log file.
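A variant of the same loop that is safe for arbitrary filenames (a sketch, assuming GNU find; log is the same log file as above, and -F makes the path match literally):
find /test/ -ctime +10 -print0 |
while IFS= read -r -d '' i; do
    grep -qF -- "$i" log || echo "$i"    # replace echo with rm if satisfied
done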
I would use something like this:
#!/bin/bash
# $1 is the number of days
log_files=$(ls /var/log)
files=$(find /test/ -iname "*.txt" -mtime +"$1")
for f in $files; do
found="false"
base=$(basename "$f")
for logfile in $log_files; do
if zgrep -q "$base" "/var/log/$logfile"; then
found="true"
break
fi
done
if [ "$found" = "false" ]; then
rm "$f"
fi
done
and call it:
#> ./find_and_delete.sh 10
You could create a small bash script that checks whether a file is in the logs or not:
$ cat ~/bin/checker.sh
#!/usr/bin/env bash
n=$(basename "$1")
grep -q "$n" "$2"
$ chmod +x ~/bin/checker.sh
And then use it in a single find command:
$ find . -type f ! -exec ~/bin/checker.sh {} log \; -exec echo {} \;
This should print only the files to be deleted. Once convinced that it does what you want:
$ find . -type f ! -exec ~/bin/checker.sh {} log \; -exec rm {} \;
deletes them.
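Putting the pieces together for the original question, a sketch of the whole job (the /test/ path and a 10-day cutoff come from the question; the /test/log/* location is an assumption):
find /test/ -iname '*.txt' -mtime +10 -print0 |
while IFS= read -r -d '' f; do
    base=$(basename "$f")
    # zgrep searches plain and gzipped log files alike
    if ! zgrep -qF -e "$base" /test/log/*; then
        echo rm -- "$f"    # drop the echo once the output looks right
    fi
done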

How to find files containing exactly 16 lines?

I have to find files that containing exactly 16 lines in Bash.
My idea is:
find -type f | grep '/^...$/'
Does anyone know how to utilise find + grep or maybe find + awk?
Then:
Move the matching files to another directory.
Delete all non-matching files.
I would just do (with shopt -s globstar enabled, so ** recurses):
wc -l **/* 2>/dev/null | awk '$1=="16"'
Keep it simple:
find . -type f |
while IFS= read -r file
do
size=$(wc -l < "$file")
if (( size == 16 ))
then
mv -- "$file" /wherever/you/like
else
rm -f -- "$file"
fi
done
If your file names can contain newlines then google for the find and read options to handle that.
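For reference, that usually means NUL-delimited output; the same loop in that style looks like this (a sketch, assuming GNU find):
find . -type f -print0 |
while IFS= read -r -d '' file; do
    size=$(wc -l < "$file")
    if (( size == 16 )); then
        mv -- "$file" /wherever/you/like
    else
        rm -f -- "$file"
    fi
done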
You should use grep instead of wc, because wc counts newline characters (\n) and will miss the last line if it doesn't end with a newline. For example:
grep -cH '' * 2>/dev/null | awk -F: '$2==16'
for more correct approach (without error messages, and without argument list too long error) you should combine it with the find and xargs commands, like
find . -type f -print0 | xargs -0 grep -cH '' | awk -F: '$2==16'
If you don't want to count empty lines (i.e. only lines that contain at least one character), you can replace the '' with '.'. And instead of awk, you can use a second grep, like:
find . -type f -print0 | xargs -0 grep -cH '.' | grep ':16$'
This will find all files that contain 16 non-empty lines, and so on.
GNU sed can report a file's line count ($= prints the number of the last line), which you can then test:
[ "$(sed -n '$=' file)" = 16 ] && echo "file has 16 lines"
A pure bash version:
#!/usr/bin/bash
for f in *; do # Look at files in the present dir
[ ! -f "$f" ] && continue # Skip anything that is not a regular file
cnt=0
# Count lines, stopping after 17 (no need to read further than that)
while ((cnt<17)) && IFS= read -r x; do ((++cnt)); done <"$f"
if [ "$cnt" -eq 16 ] ; then echo "Move '$f'"
else echo "Delete '$f'"
fi
done
This snippet will do the work:
find . -type f -readable -exec bash -c \
'if(( $(grep -m 17 -c "" "$0")==16 )); then echo "file $0 has 16 lines"; else echo "file $0 doesn'"'"'t have 16 lines"; fi' {} \;
Hence, if you need to delete the files that are not 16 lines long, and move those who are 16 lines long to folder /my/folder, this will do:
find . -type f -readable -exec bash -c \
'if(( $(grep -m 17 -c "" "$0")==16 )); then mv -nv "$0" /my/folder; else rm -v "$0"; fi' {} \;
Observe the quoting for "$0" so that it's safe regarding any file name with funny symbols in it (spaces, ...).
I'm using the -v option so that rm and mv are verbose (I like to know what's happening). The -n option to mv is no-clobber: a security to not overwrite an existing file; this option might not be available if you have an old system.
The good thing about this method: it's really safe with any filename containing funny symbols. The bad things: it forks a bash, a grep, and an mv or rm for each file found, which can be quite slow. This can be fixed using trickier stuff (while still remaining safe regarding funny symbols in filenames). If you really need it, I can give you a possible answer. It will also break if a file can't be (re)moved.
Remark. I'm using the -readable option to find, so that it only considers files that are readable. If you have this option, use it, you'll have a more robust command!
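One such trick, sketched here as an illustration rather than as the answerer's actual method: let a single grep count lines for whole batches of files, then filter on the count (assumes GNU tools; breaks on filenames containing newlines):
# /dev/null forces grep to prefix each count with its filename,
# even when a batch happens to contain a single file
find . -type f -readable -exec grep -c -m 17 '' /dev/null {} + |
awk -F: '$NF == 16 { sub(/:[^:]*$/, ""); print }' |
while IFS= read -r f; do
    echo mv -nv "$f" /my/folder    # the 16-liners; drop the echo when satisfied
done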
I would go with
find . -type f | while IFS= read -r f ; do
[[ $(wc -l < "${f}") -eq 16 ]] && mv "${f}" <any_directory> || rm -f "${f}"
done
or
find . -type f | while IFS= read -r f ; do
[[ $(grep -c '' "${f}") -eq 16 ]] && mv "${f}" <any_directory> || rm -f "${f}"
done
Replace <any_directory> with the directory you actually want to move the files to.
BTW, the find command will descend into sub-directories; if you don't want this, add -maxdepth 1 or otherwise change the find command to fit your needs.

Rename file names in current directory and all subdirectories

I have 4 files with the following names in different directories and subdirectories
tag0.txt, tag1.txt, tag2.txt and tag3.txt
and wish to rename them to tag0a.txt, tag1a.txt, tag2a.txt and tag3a.txt in all directories and subdirectories.
Could anyone help me out using a shell script?
Cheers
$ shopt -s globstar
$ rename -n 's/\.txt$/a\.txt/' **/*.txt
foo/bar/tag2.txt renamed as foo/bar/tag2a.txt
foo/tag1.txt renamed as foo/tag1a.txt
tag0.txt renamed as tag0a.txt
Remove -n to actually rename once you have checked the result; it is the "dry run" option.
This can of course be done with find:
find . -name 'tag?.txt' -type f -exec bash -c 'mv "$1" "${1%.*}a.${1##*.}"' -- {} \;
Here ${1%.*} is the path minus its extension and ${1##*.} is the extension itself, so tag0.txt becomes tag0a.txt.
Here is a posix shell script (checked with dash):
visitDir() {
local file
for file in "$1"/*; do
if [ -d "$file" ]; then
visitDir "$file";
else
if [ -f "$file" ] && echo "$file"|grep -q '^.*/tag[0-3]\.txt$'; then
newfile=$(echo "$file" | sed 's/\.txt$/a.txt/')
echo mv "$file" "$newfile"
fi
fi
done
}
visitDir .
If you can use bashisms, just replace the inner IF with:
if [[ -f "$file" && "$file" =~ ^.*/tag[0-3]\.txt$ ]]; then
echo mv "$file" "${file/.txt/a.txt}"
fi
First check that the result is what you expected, then possibly remove the "echo" in front of the mv command.
Using the Perl script version of rename that may be on your system:
find . -name 'tag?.txt' -exec rename 's/\.txt$/a$&/' {} \;
Using the binary executable version of rename:
find . -name 'tag?.txt' -exec rename .txt a.txt {} \;
which changes the first occurrence of ".txt". Since the file names are constrained by the -name argument, that won't be a problem.
Is this good enough?
jcomeau@intrepid:/tmp$ find . -name tag?.txt
./a/tag0.txt
./b/tagb.txt
./c/tag1.txt
./c/d/tag3.txt
jcomeau@intrepid:/tmp$ for txtfile in $(find . -name 'tag?.txt'); do \
mv $txtfile ${txtfile%%.txt}a.txt; done
jcomeau@intrepid:/tmp$ find . -name tag*.txt
./a/tag0a.txt
./b/tagba.txt
./c/d/tag3a.txt
./c/tag1a.txt
Don't actually put the backslash into the command, and if you do, expect a '>' prompt on the next line. I didn't put that into the output to avoid confusion, but I didn't want anybody to have to scroll either.
