Shell script: check whether a directory name is in the format YYYY_MM_DD_HH

I have a script that creates a file list of directories available in another path.
Now, I would like to do some tasks only if a directory in this file list has a name of the format "YYYY_MM_DD_HH".
My file list has the following entries:
2014_04_21_01
asdf
2012_01_19_10
2010_01
Now I would like to move the directories with names as YYYY_MM_DD_HH to another path. I.e., only 2014_04_21_01 & 2012_01_19_10 MUST be MOVED.
Please advise.

Use bash regex pattern matching:
for dir in $list
do if [[ "$dir" =~ ^[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]{2}$ ]]
then mv "$dir" newdir/
fi
done
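If the names live in a file, as in the question, a minimal sketch along the same lines could read that file one line at a time instead of relying on word splitting of $list. The function name move_stamped and its three arguments (the list file, the directory that holds the entries, and the destination) are assumptions made for illustration, not part of the question.

```shell
#!/usr/bin/env bash
# move_stamped LIST SRC DEST: move every directory named in LIST (one per line)
# whose name looks like YYYY_MM_DD_HH from SRC into DEST.
# LIST, SRC and DEST are hypothetical names chosen for this sketch.
move_stamped() {
    local list="$1" src="$2" dest="$3" dir
    local re='^[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]{2}$'
    mkdir -p "$dest"
    while IFS= read -r dir; do
        [[ $dir =~ $re ]] || continue      # skip names in any other format
        [[ -d $src/$dir ]] || continue     # skip entries that are not directories
        mv "$src/$dir" "$dest"/
    done < "$list"
}
```

It would be called as, for example, move_stamped file_list /path/with/dirs /another/path.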

Assuming you have a GNU version of sed on your computer, you could use it to parse your directory names and execute a command.
Say we have the following input file:
2014_04_21_01
asdf
2012_01_19_10
2010_01
2012_01_19_10_09
62012_01_19_10
You can search for your regex with sed and replace it with a mv command as follows:
sed 's/^[0-9]\{4\}\(_[0-9]\{2\}\)\{3\}$/mv "&" "other_dir"/' file_list
will output:
mv "2014_04_21_01" "other_dir" # We want to run this
asdf
mv "2012_01_19_10" "other_dir" # and this
2010_01
2012_01_19_10_09
62012_01_19_10
Now if you add the (GNU sed) e flag at the end of the sed substitution, the generated command is executed by your shell whenever a substitution succeeds. Adding the -n option before the sed script keeps the lines that did not match from being printed:
sed -n 's/^[0-9]\{4\}\(_[0-9]\{2\}\)\{3\}$/mv "&" "other_dir"/e' file_list
# note the added -n option and the trailing e flag
I would recommend running it first without the e flag so as to check that the mv commands are properly formatted.

Why make a separate file for the file list? Just go into that directory and execute the following command. I have taken the destination directory as /home/newdir/
ls | grep '^[0-9][0-9][0-9][0-9]_[01][0-9]_[0123][0-9]_[012][0-9]$' | awk '{print $0" /home/newdir/"}' | xargs -n 2 mv
The pattern is quoted so the shell does not expand it, and xargs -n 2 makes mv run once per directory/destination pair.
Be careful when working with dates. Since the name is in the format YYYY_MM_DD_HH, there are restrictions on MM, DD and HH that follow from how a calendar is constructed. So 9999_99_99_99 is an invalid name: it does not satisfy YYYY_MM_DD_HH.
The pattern above only narrows the first digit of each field, so a full check needs a script that encodes those restrictions, in effect the whole calendar. Still working on it.
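One possible shape for such a calendar check, as a rough sketch in bash rather than a finished validator: match the four fields with a regex, then compare month, day and hour against their ranges, including the Gregorian leap-year rule for February. The function name valid_stamp is made up for this example.

```shell
#!/usr/bin/env bash
# valid_stamp NAME: succeed only if NAME is YYYY_MM_DD_HH and denotes a real
# calendar date and an hour between 00 and 23. (Hypothetical helper name.)
valid_stamp() {
    local re='^([0-9]{4})_([0-9]{2})_([0-9]{2})_([0-9]{2})$'
    [[ $1 =~ $re ]] || return 1
    # 10# forces base 10 so that "08" and "09" are not read as octal
    local y=$((10#${BASH_REMATCH[1]})) m=$((10#${BASH_REMATCH[2]}))
    local d=$((10#${BASH_REMATCH[3]})) h=$((10#${BASH_REMATCH[4]}))
    (( m >= 1 && m <= 12 && d >= 1 && h <= 23 )) || return 1
    # days per month, index 0 unused
    local days=(0 31 28 31 30 31 30 31 31 30 31 30 31)
    local max=${days[m]}
    # February has 29 days in a leap year
    if (( m == 2 && (y % 4 == 0 && y % 100 != 0 || y % 400 == 0) )); then
        max=29
    fi
    (( d <= max ))
}
```

With this, valid_stamp 2014_04_21_01 succeeds while valid_stamp 9999_99_99_99 and valid_stamp 2013_02_29_10 fail, so it can stand in for the plain pattern test before a mv.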

Example:
perl -nle 'system("mv $_ dir/year$1") if /^(\d{4})_\d\d_\d\d_\d\d$/' flist
would extract the year and move the directory 2014_04_21_01 to dir/year2014

This single find command with -regex option should take care of this:
cd /base/path/of/these/dirs
find . -type d -regextype posix-egrep -regex '.*/[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]{2}$' \
-exec mv '{}' /dest/dir/ \;

Related

How to find many files from txt file in directory and subdirectories, then copy all to new folder

I can't find posts that help with this exact problem:
On Mac Terminal I want to read a txt file (example.txt) containing file names such as:
20130815 144129 865 000000 0172 0780.bmp
20130815 144221 511 000003 1068 0408.bmp
....100 more
And I want to search for them in a certain folder/subfolders (example_folder). After each find, the file should be copied to a new folder x (new_destination).
Your help would be much appreciated!
Cheers,
Mo
You could use a piped command with a combination of ls, grep, xargs and cp.
So basically you start with getting the list of files
ls
then you filter them with egrep -e, grep -e or whatever flavor of grep is available in the Mac terminal. If you want to find all files ending with .txt you can use the regex \.txt$ (which means ends with '.txt')
ls | egrep -e "yourRegexExpression"
After that you get an output stream, but cp doesn't read file names from standard input and only takes arguments; that's why we use xargs to convert the stream into arguments. The final step is to add the flag -t to cp to signify that the next argument is the target directory. (-t is a GNU cp option; the cp that ships with macOS does not support it.)
ls | egrep -e "yourRegexExpression" | xargs cp -t DIRECTORY
I hope this helps!
Edit
Sorry, I didn't read the question well enough; I have updated the answer to match your problem. Here the egrep command is given a rather large regex built from all the file names, of the form (filename1|filename2|...|fileN). The $() evaluates the command inside, which uses tr to translate newlines to "|" for the regex and sed to drop the trailing "|" (an empty last alternative would match every line).
ls | egrep -e "($(tr "\n" "|" < yourtextfile.txt | sed 's/|$//'))" | xargs cp -t DIRECTORY
You could do something like:
$ while IFS= read -r i; do
find /search/path -type f -name "$i" -exec cp "{}" /new/path \;
done < example.txt
This is how it works. For every line within example.txt (read line by line, so that names containing spaces stay intact):
while IFS= read -r i; do ... done < example.txt
it will try to find a file matching the line $i in the defined path:
find /search/path -type f -name "$i"
And if found it will copy it to the desired location:
-exec cp "{}" /new/path \;
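For a list of a hundred names it can help to know which ones were never found. A rough sketch of the same read-then-find idea wrapped in a function follows; copy_listed and its arguments are names invented here, and it relies on find printing a path only after cp succeeded for it.

```shell
#!/usr/bin/env bash
# copy_listed LIST SRC DEST: copy every file named in LIST (one name per line)
# from anywhere under SRC into DEST and report names with no match on stderr.
# Hypothetical helper; DEST should not be inside SRC.
copy_listed() {
    local list="$1" src="$2" dest="$3" name found
    mkdir -p "$dest"
    # "|| [ -n ... ]" also handles a last line without a trailing newline
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue        # skip blank lines
        # -name takes a glob pattern, so names containing * ? [ are treated as patterns
        found=$(find "$src" -type f -name "$name" -exec cp {} "$dest"/ \; -print)
        [ -n "$found" ] || printf 'not found: %s\n' "$name" >&2
    done < "$list"
}
```

For the question's layout this would be run as copy_listed example.txt example_folder new_destination. If the same name exists in several subfolders, later copies overwrite earlier ones in the destination.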

Shell Script for Bulk renaming of files

I want to recursively rename all files in directory path by changing their prefix.
For Example
XYZMyFile.h
XYZMyFile.m
XYZMyFile1.h
XYZMyFile1.m
XYZMyFile2.h
XYZMyFile2.m
TO
ABCMyFile.h
ABCMyFile.m
ABCMyFile1.h
ABCMyFile1.m
ABCMyFile2.h
ABCMyFile2.m
These files are under a directory structure with many layers. Can someone help me with a shell script for this bulk task?
A different approach maybe (this handles the current directory only, not subdirectories):
ls XYZ*.{h,m} | while IFS= read -r a; do n="ABC$(echo "$a" | sed -e 's/^XYZ//')"; mv "$a" "$n"; done
Description:
ls XYZ*.{h,m} --> Find all files with the XYZ prefix and a .h or .m extension
n="ABC..." --> Add an ABC prefix to the file name
sed -e 's/^XYZ//' --> Removes the XYZ prefix from the file name
mv "$a" "$n" --> Performs the rename
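Since the question asks for many directory levels, the same strip-and-prepend idea can be combined with find. This is only a sketch under the assumption that the prefix should change on .h and .m files anywhere below a starting directory; the function name reprefix is invented for the example.

```shell
#!/usr/bin/env bash
# reprefix ROOT OLD NEW: below ROOT, rename every *.h / *.m file whose name
# starts with OLD so that it starts with NEW instead. (Hypothetical helper.)
reprefix() {
    local root="$1" old="$2" new="$3" path dir base
    find "$root" -type f \( -name "$old*.h" -o -name "$old*.m" \) -print0 |
    while IFS= read -r -d '' path; do
        dir="${path%/*}"                  # directory part of the path
        base="${path##*/}"                # file name without the directory
        mv "$path" "$dir/$new${base#"$old"}"
    done
}
```

Called as reprefix . XYZ ABC from the top of the tree, it leaves names such as nXYZ1.h alone because only a leading XYZ is matched.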
Set globstar first and then use rename like below:
# shopt -s globstar # This will cause '**' to expand to each and everything
# ls -R
.:
nXYZ1.c nXYZ2.c nXYZ3.c subdir XYZ1.m XYZ2.m XYZ3.m
nXYZ1.h nXYZ2.h nXYZ3.h XYZ1.c XYZ2.c XYZ3.c
nXYZ1.m nXYZ2.m nXYZ3.m XYZ1.h XYZ2.h XYZ3.h
./subdir:
nXYZ1.c nXYZ1.m nXYZ2.h nXYZ3.c nXYZ3.m XYZ1.h XYZ2.c XYZ2.m XYZ3.h
nXYZ1.h nXYZ2.c nXYZ2.m nXYZ3.h XYZ1.c XYZ1.m XYZ2.h XYZ3.c XYZ3.m
# rename 's/^XYZ(.*.[mh])$/ABC$1/;s/^([^\/]*\/)XYZ(.*.[mh])$/$1ABC$2/' **
# ls -R
.:
ABC1.h ABC2.m nXYZ1.c nXYZ2.c nXYZ3.c subdir XYZ3.c
ABC1.m ABC3.h nXYZ1.h nXYZ2.h nXYZ3.h XYZ1.c
ABC2.h ABC3.m nXYZ1.m nXYZ2.m nXYZ3.m XYZ2.c
./subdir:
ABC1.h ABC2.h ABC3.h nXYZ1.c nXYZ1.m nXYZ2.h nXYZ3.c nXYZ3.m XYZ2.c
ABC1.m ABC2.m ABC3.m nXYZ1.h nXYZ2.c nXYZ2.m nXYZ3.h XYZ1.c XYZ3.c
# shopt -u globstar # Unset globstar
This may be the simplest way to achieve your objective.
Note1 : Here I am not changing nXYZ to nABC as you have noticed. If they are meant to be changed the simplified rename command would be
rename 's/XYZ(.*.[mh])$/ABC$1/' **
Note2 : The question has mentioned nothing about multiple occurrences of XYZ. So nothing done in this regard.
Easy find and rename (the binary in /usr/bin, not the Perl function mentioned)
Yes, there is a command to do this non-recursive already.
rename XYZ ABC XYZ*
rename --help
Usage:
rename [options] expression replacement file...
Options:
-v, --verbose explain what is being done
-s, --symlink act on symlink target
-h, --help display this help and exit
-V, --version output version information and exit
For more details see rename(1).
edit: missed the "many layers of directory" part of the question, b/c it's a little messy. Adding the find.
Easiest to remember:
find . -type f -name "*.pdf" -exec rename XYZ ABC {} \;
Probably faster to finish:
find . -type d -not -path "*/\.*" -not -name ".*" -exec sh -c 'rename XYZ ABC "$1"/*.pdf' _ {} \;
I'm not sure how to get easier than one command line of code.
For non-recursive, you can use rename which is a perl script:
rename -v -n 's/^.+(?=MyFile)/what-you-want/' *.{h,m}
test:
dir > ls | cat -n
1 XYZMyFile1.h
2 XYZMyFile1.m
3 XYZMyFile.h
4 XYZMyFile.m
dir >
dir > rename -v -n 's/^.+(?=MyFile)/what-you-want/' *.{h,m}
rename(XYZMyFile1.h, what-you-wantMyFile1.h)
rename(XYZMyFile1.m, what-you-wantMyFile1.m)
rename(XYZMyFile.h, what-you-wantMyFile.h)
rename(XYZMyFile.m, what-you-wantMyFile.m)
dir >
and for recursive,use find + this command
If you do not have access to rename, you can use perl directly like so:
perl -le '($old=$_) && s/^XYZ/ABC/ && rename($old,$_) for <*.[mh]>'
and with renrem, a CLI I developed using C++, specifically for renaming

Trying to rename certain file types within recursive directories

I have a bunch of files within a directory structure as such:
Dir
SubDir
File
File
Subdir
SubDir
File
File
File
Sorry for the messy formatting, but as you can see there are files at all different directory levels. All of these file names have a string of 7 digits prepended to them, like so: 1234567_filename.ext. I am trying to remove the digits and the underscore at the start of the filename.
Right now I am using bash and using this oneliner to rename the files using mv and cut:
for i in *; do mv "$i" "$(echo $i | cut -d_ -f2-10)"; done
This is being run while I am CD'd into the directory. I would love to find a way to do this recursively, so that it only renamed files, not folders. I have also used a foreach loop in the shell, outside of bash for directories that have a bunch of folders with files in them and no other subdirectories as such:
foreach$ set p=`echo $f | cut -d/ -f1`
foreach$ set n=`echo $f | cut -d/ -f2 | cut -d_ -f2-10`
foreach$ mv $f $p/$n
foreach$ end
But that only works when there are no other subdirectories within the folders.
Is there a loop or oneliner I can use to rename all files within the directories? I even tried using find but couldn't figure out how to incorporate cut into the code.
Any help is much appreciated.
With Perl's rename (standalone command):
shopt -s globstar
rename -n 's|/[0-9]{7}_([^/]+$)|/$1|' **/*
If everything looks fine remove -n.
globstar: If set, the pattern ** used in a pathname expansion context will
match all files and zero or more directories and subdirectories. If
the pattern is followed by a /, only directories and subdirectories
match.
bash does provide functions, and these can be recursive, but you don't need a recursive function for this job. You just need to enumerate all the files in the tree. The find command can do that, but turning on bash's globstar option and using a shell glob to do it is safer:
#!/bin/bash
shopt -s globstar
# enumerate all the files in the tree rooted at the current working directory
for f in **; do
# ignore directories
test -d "$f" && continue
# separate the base file name from the path
name=$(basename "$f")
dir=$(dirname "$f")
# perform the rename, using a pattern substitution on the name part
mv "$f" "${dir}/${name/#???????_/}"
done
Note that that does not verify that file names actually match the pattern you specified before performing the rename; I'm taking you at your word that they do. If such a check were wanted then it could certainly be added.
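If that check is wanted, one way to add it is to test each base name against a regex and rename only on a match. The following is a sketch, not the answerer's code: strip_prefix_tree is an invented name, globstar needs bash 4 or later, and mv -n is used so an existing file is never overwritten.

```shell
#!/usr/bin/env bash
shopt -s globstar nullglob
# strip_prefix_tree ROOT: below ROOT, rename regular files named
# NNNNNNN_rest (exactly seven digits, then an underscore) to rest.
strip_prefix_tree() {
    local root="$1" f name
    local re='^[0-9]{7}_(.+)$'
    for f in "$root"/**; do
        [[ -f $f ]] || continue            # ignore directories
        name="${f##*/}"                    # base name without the path
        [[ $name =~ $re ]] || continue     # leave non-matching names alone
        mv -n "$f" "${f%/*}/${BASH_REMATCH[1]}"
    done
}
```

Running strip_prefix_tree . from the top directory covers every level; directories whose names happen to match the pattern are skipped by the -f test.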
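How about this small tweak to what you have already:
find . -type f | while IFS= read -r i; do mv "$i" "$(dirname "$i")/$(basename "$i" | cut -d_ -f2-)"; done
Basically this swaps the * for find . -type f, and applies cut only to the file name so that the directory part of each path is kept.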
Should be possible to do this using find...
find -E . -type f \
-regex '.*/[0-9]{7}_.*\.txt' \
-exec sh -c 'f="${0##*/}"; mv -v "$0" "${0%/*}/${f#*_}"' {} \;
Your find options may be different -- I'm doing this in FreeBSD. The idea here is:
-E instructs find to use extended regular expressions.
-type f causes only normal files (not directories or symlinks) to be found.
-regex ... matches the files you're looking for. You can make this more specific if you need to.
-exec ... \; runs a command, using {} (the file we've found) as an argument.
The command we're running uses parameter expansion first to grab the target directory and second to strip the filename. Note the temporary variable $f, which is used to address the possibility of extra underscores being part of the filename.
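The expansions involved can be tried on their own. The sketch below uses ${path##*/} (longest match) to isolate the file name, so that underscores in directory names are left alone, and ${base#*_} (shortest match) to drop only the leading digits. The function name and the sample path are made up.

```shell
#!/usr/bin/env bash
# split_target PATH: print the path that PATH would be renamed to.
split_target() {
    local path="$1"
    local dir="${path%/*}"       # everything before the last slash
    local base="${path##*/}"     # everything after the last slash
    # drop up to the FIRST underscore of the file name only
    printf '%s/%s\n' "$dir" "${base#*_}"
}

split_target './sub_dir/1234567_my_file.txt'   # prints ./sub_dir/my_file.txt
```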
Note that this is NOT a bash command, though you can of course run it from the bash shell. If you want a bash solution that does not require use of an external tool like find, you may be able to do the following:
$ shopt -s extglob # use extended glob format
$ shopt -s globstar # recurse using "**"
$ for f in **/+([0-9])_*.txt; do f="./$f"; n="${f##*/}"; echo mv "$f" "${f%/*}/${n#*_}"; done
This uses the same logic as the find solution, but uses bash v4 extglob to provide better filename matching and globstar to recurse through subdirectories.
Hope these help.

shell script does not find the directory

I'm just starting out with shell scripting. I need to compute checksums for a lot of files, so I thought I would automate the process with a shell script.
I wrote two scripts. The first runs a recursive ls piped through egrep -v on the path I pass in as a parameter and stores the output as a string in a variable; a for loop then splits that string into lines and passes each line as a parameter to the second script. The second script passes that parameter to the hashdeep command and stores its output in another variable, which, as in the first script, is turned into a single string and split using IFS. Finally, I take the field of interest and append it to a text file.
The output is:
/home/douglas/Trampo/shell_scripts/2016-10-27-001757.jpg: No such file or directory
----Checksum FILE: 2016-10-27-001757.jpg
----Checksum HASH:
The issue is: I pass the directory ~/Pictures as the parameter, but the error message mentions another directory, /home/douglas/Trampo/shell_scripts/ (the scripts' own directory). The file 2016-10-27-001757.jpg is in the ~/Pictures directory, so why is the script looking in its own directory?
First script:
#/bin/bash
arquivos=$(ls -R $1 | egrep -v '^d')
for linha in $arquivos
do
bash ./task2.sh $linha
done
second script:
#/bin/bash
checksum=$(hashdeep $1)
concatenado=''
for i in $checksum
do
concatenado+=$i
done
IFS=',' read -ra ADDR <<< "$concatenado"
echo
echo '----Checksum FILE:' $1
echo '----Checksum HASH:' ${ADDR[4]}
echo
echo ${ADDR[4]} >> ~/Trampo/shell_scripts/txt2.txt
I think that's all... sorry about the English grammar errors.
I hope the question is clear.
Thanks in advance!
There are several things wrong in the first script alone.
When running ls in recursive mode using -R, the output is listed per directory and each file is listed relative to their parent instead of full pathname.
ls -R doesn't list the directory in long format as implied by | grep -v ^d where it seems you are looking for files (non directories).
In your specific case, the missing file 2016-10-27-001757.jpg is in a subdirectory but you lost the location by using ls -R.
Do not parse the output of ls. Use find and you won't have the same issue.
First script can be replaced by a single line.
Try this:
#!/bin/bash
find "$1" -type f -exec ./task2.sh "{}" \;
Or if you prefer using xargs, try this:
#!/bin/bash
find "$1" -type f -print0 | xargs -0 -I{} ./task2.sh "{}"
Note: enclosing {} in quotes ensures that task2.sh receives a complete filename even if it contains spaces.
In task2.sh the parameter $1 should also be quoted "$1".
If task2.sh is executable, you are all set. If not, add bash in the line so it reads as:
find "$1" -type f -exec bash ./task2.sh "{}" \;
task2.sh, though not posted in the original question, is not executable. It has a missing execute permission.
Add execute permission to it by running chmod like:
chmod a+x task2.sh
Good luck.
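To see the find-based approach end to end without the second script, here is a small sketch that prints one checksum per file with its full path. It uses the POSIX cksum utility purely as a stand-in for hashdeep (an assumption made so the example runs anywhere), and checksum_tree is an invented name.

```shell
#!/usr/bin/env bash
# checksum_tree DIR: print "<crc>  <path>" for every regular file below DIR.
# cksum stands in for hashdeep here; swap in the real tool as needed.
checksum_tree() {
    local dir="$1" file sum
    find "$dir" -type f -print0 |
    while IFS= read -r -d '' file; do
        sum=$(cksum < "$file")            # on stdin, cksum prints "<crc> <size>"
        printf '%s  %s\n' "${sum%% *}" "$file"
    done
}
```

Because find hands over each path including its directory, checksum_tree ~/Pictures reaches files in subdirectories, which is where the ls -R version lost track of them.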

Move all files from subdirectory into a new directory without overwriting

I want to consolidate into 1 directory files that are in multiple subdirectories.
The following comes close except that the random string is added after the extension; I want it before the extension:
find . -type f -iname "[a-z,0-9]*" -exec bash -c 'mv -v "$0" "./$( mktemp "$( basename "$0" ).XXX" )"' '{}' \;
I've searched through dozens of other posts but nothing addressed the specifics of my situation:
I'm on OS X (so it's a BSD flavor of Bash; for ex. there's no -t option for mv)
Many of the files have identical names, so I need to rename them during the mv (and I can't just use the -n option for mv because too many files would then not get moved)
The files are not all the same kind, so I need to use a find -type f
I want to exclude .DS_Store files, so it seems like a good option is find -type f -iname "[a-z,0-9]*"
I want the renamed files' names to be in the form of: oldname-random_string.xyz (but I'm also OK with having the files being renamed as a sequential list: 00001.xyz, 00002.xyz, etc.)
The files are buried 4 levels down from my master directory:
Master/Top dir
Dir 2
Dir 3
Dir 4
Dir 5
file
For the sake of simplicity I prefer a bash command to a .sh script (but I'm happy with either)
GNU Solution
This uses basically the same command that you were using but I supply a template to mktemp so that the XXX pattern appears just before the suffix. With GNU sed:
find . -type f -iname "[a-z,0-9]*" -exec bash -c 'mv -v "$1" "./$(mktemp -u "$(basename "$1" | sed -E -e '\''s/\.([^.]+)$/.XXX.\1/'\'' -e '\''/XXX/ !s/$/.XXX/'\'')" )"' _ '{}' \;
The key addition above is the use of sed to insert XXX before the suffix in the file name:
sed -E -e 's/\.([^.]+)$/.XXX.\1/' -e '/XXX/ !s/$/.XXX/'
This has two commands. The first puts .XXX before the extension. The second command is run only if the file name has no extension in which case it adds .XXX to the end of the file name.
In the first command, the source regex consists of two parts. The first is \. which matches a period. The second is ([^.]+)$ which captures the extension into group 1. The substitution replaces this with .XXX.\1 where \1 is sed notation for group 1 which, in our case, is the file's extension.
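To check what the two sed commands produce, they can be run on a few sample names without touching any files. The wrapper name template_for is made up for this illustration; the sed expressions are the ones quoted above.

```shell
#!/usr/bin/env bash
# template_for NAME: print the mktemp template that the sed commands build.
template_for() {
    printf '%s\n' "$1" | sed -E -e 's/\.([^.]+)$/.XXX.\1/' -e '/XXX/ !s/$/.XXX/'
}

template_for photo.jpg        # prints photo.XXX.jpg
template_for README           # prints README.XXX
template_for archive.tar.gz   # prints archive.tar.XXX.gz
```

Only the last extension is treated as the suffix, as the third call shows.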
OSX Solution
Under OSX, mktemp is not useful because it only supports templates with the XXX part trailing. As a workaround, we can use a bash script that generates non-overlapping file names:
#!/bin/bash
find . -type f -iname "[a-z,0-9]*" -print0 |
while IFS= read -r -d '' fname
do
new=$(basename "$fname")
[ "$fname" = "./$new" ] && continue
[ "$new" = .DS_Store ] && continue
name=${new%.*}
ext=${new#"$name"}
n=0
new=$(printf '%s.%03i%s' "$name" "$n" "$ext")
while [ -f "$new" ]
do
n=$(($n + 1))
new=$(printf '%s.%03i%s' "$name" "$n" "$ext")
done
mv -v "$fname" "$new"
done
The above uses the find command to get the file names. The option -print0 is used to assure that it works with difficult file names. The while loop reads these file names one by one, into the variable fname. fname includes the full path to the source file. The file name without the path is then stored in new. Then two checks are performed. If the source file is already in the current directory, the script continues on to the next loop. Similarly, if the file name is .DS_Store, it is also skipped. (The find command, as given, already skips these files. This line is there just for future flexibility.) Next, the file name is split into two parts: the name and ext, the extension. ext includes the leading period. Next, a loop checks for files of the form name.NNN.ext and stops at the first one that doesn't yet exist. The source file is moved to a file of that name.
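The question's first preference was the form oldname-random_string.xyz. A sketch of how such a name could be generated without mktemp is shown below; it relies on bash's $RANDOM, the function name random_name is invented, and unlike mktemp the existence check is not atomic, so it is best paired with mv -n.

```shell
#!/usr/bin/env bash
# random_name BASE: print BASE with "-<8 hex digits>" inserted before the
# extension, choosing a name that does not exist in the current directory.
random_name() {
    local base="$1" name ext candidate
    name="${base%.*}"                 # part before the last period
    ext="${base#"$name"}"             # extension including the period, or empty
    # a dot-file such as ".profile" has an empty name part: treat it as no extension
    if [ -z "$name" ]; then name="$base"; ext=""; fi
    while :; do
        candidate=$(printf '%s-%04x%04x%s' "$name" "$RANDOM" "$RANDOM" "$ext")
        [ -e "$candidate" ] || break  # retry only if the name is already taken
    done
    printf '%s\n' "$candidate"
}
```

Inside the loop of the script above, new=$(random_name "$(basename "$fname")") could then replace the numbered-name logic.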
Related Notes Regarding the GNU Solution and its Compatibility
Quoting in the above GNU command is complex. The argument to bash -c needs to be in single-quotes to prevent the calling bash from performing premature variable substitution. In addition, the sed commands need to be in single-quotes when executed by the bash subshell to prevent history expansion from interfering with the use of negation, !, within the sed command.
The OSX (BSD) sed does not support combining commands together with semicolons. Consequently, each command is supplied to sed via a separate -e option.
The OSX (BSD) sed seems to treat + differently from the GNU sed. This incompatibility seems to go away when using the -E (extended regex) option. (The corresponding GNU option is -r but, as an undocumented compatibility feature, GNU sed supports -E also.)

Resources