Bash script delete file inside another folder if not present in both - bash

The goal of the script is to check whether a file name from one folder also exists inside another folder. If the file name does NOT exist there, the script should delete the file.
This is the script I have so far:
#!/bin/bash
echo "What's the folder name?"
read folderName
$fileLocation="/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"
for files in "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName"/*
do
fileNameWithFormatFiles=${files##*$folderName/}
fileNameFiles=${fileNameWithFormat%%.png*}
for entry in "/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"/*
do
fileNameWithFormat=${entry##*$folderName/}
fileName=${fileNameWithFormat%%.png*}
if [ -f "/home/daniel/Dropbox/Code/Python/FR/alignedImages/$fileNameFiles.jpg" ]
then
echo "Found File"
else
echo $files
rm -f $files
fi
done
done
read
I have two folders, alignedImages and trainingImages.
Every image in alignedImages has a counterpart in trainingImages, but not the other way around. So if alignedImages does not contain a file with the same name as a file in trainingImages, I want to delete that file from trainingImages.
Also, the pictures are not identical, so I can't just compare MD5 sums or other hashes. Only the file names are the same, except that they are .jpg instead of .png.

echo "What's the folder name?"
read -r folderName
fileLocation="/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName"
rsync -r --delete --ignore-existing "$fileLocation/" "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName/"
The rsync command is what you are looking for. With the --delete option it deletes from the destination directory any file that doesn't exist in the source directory, and --ignore-existing makes rsync skip copying a source file when a file with the same name already exists in the destination directory.
The side effect is that it copies any file that is in the source directory but not in the destination. You say all files in the source are in the destination, so I guess that's OK. Be aware that rsync compares complete file names, extension included, so this only works when the names are identical in both folders; try it with --dry-run first.

There is a better way: compare lists of file names instead of nesting for loops!
#!/bin/bash
echo "What's the folder name?"
read -r folderName
cd "/home/daniel/Dropbox/Code/Python/FR/alignedImages/$folderName" || exit 1
find . -type f -name "*.png" | sed 's/\.png$//' > /tmp/align.list
cd "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName" || exit 1
find . -type f -name "*.jpg" | sed 's/\.jpg$//' > /tmp/train.list
Here's how to find the files that are in both lists (-F treats each pattern as a fixed string and -x matches whole lines only, so ./img1 does not match ./img10):
grep -Fx -f /tmp/align.list /tmp/train.list | sed 's/$/.jpg/' > /tmp/train_and_align.list
grep -v prints non-matches instead of matches, which gives the files in train but not in align:
grep -Fxv -f /tmp/align.list /tmp/train.list | sed 's/$/.jpg/' > /tmp/train_not_align.list
Test the deletion of all files in train_not_align.list:
cd "/home/daniel/Dropbox/Code/Python/FR/trainingImages/$folderName" || exit 1
tr '\n' '\0' < /tmp/train_not_align.list | xargs -0 echo rm -f
(If this produces good output, remove the echo to actually delete those files.)
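For comparison, the same pruning can be sketched without temporary list files. This is only a sketch under stated assumptions: prune_unmatched is a hypothetical helper (not from the answers above), and it assumes, as the answer above does, that the training images are .jpg, the aligned images are .png, and both sit directly inside their folders.

```shell
#!/bin/bash
# prune_unmatched TRAIN_DIR ALIGN_DIR   (hypothetical helper)
# Deletes every TRAIN_DIR/*.jpg that has no ALIGN_DIR/<same name>.png.
prune_unmatched() (
    train_dir=$1
    align_dir=$2
    for f in "$train_dir"/*.jpg; do
        [ -e "$f" ] || continue              # the glob matched nothing
        base=$(basename "$f" .jpg)           # e.g. img01.jpg -> img01
        if [ ! -f "$align_dir/$base.png" ]; then
            echo "removing $f"
            rm -f "$f"
        fi
    done
)
```

It would be called as prune_unmatched ".../trainingImages/$folderName" ".../alignedImages/$folderName". For a trial run, change rm -f to echo rm -f first.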

Related

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files out of a big folder with many files, selecting the files that contain a specific string. For this I use grep, ack or (in this example) ag. When I run the search inside the folder it matches without problems, but when I loop over the results in the following script, the loop never sees the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the String I want to find inside the files, INPUT_DIR is the folder where the files are located, OUTPUT_DIR is the folder where the found files should be copied to. Is there something wrong with the while do?
EDIT:
Thanks for the suggestions! I took this one now, because it also looks for files in subfolders and saves a list with all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read file
do
echo "${file##*/}"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
A better way is to implement it with a find command, as below:
find "${INPUT_DIR}" -type f -name "*.*" -print0 | xargs -0 grep -l -- "${SEARCH_QUERY}" > /tmp/file_list.txt
while IFS= read -r file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
Or, as another option when all the files sit directly in INPUT_DIR (the glob must stay outside the quotes so that the shell expands it):
grep -l -- "${SEARCH_QUERY}" "${INPUT_DIR}"/*.* > /tmp/file_list.txt
while IFS= read -r file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
If you do not mind doing it in a single line, then:
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
Guide:
-l prints only the names of matching files and nothing else
-r searches the current working directory and all subdirectories recursively
'ONE\|TWO\|THREE' matches any one of these words: ONE, TWO or THREE
| pipes the output of grep to xargs
-I xxx makes xxx a placeholder that is replaced by each file name
-P 0 runs as many of the commands (= cp) in parallel as possible
cp copies each file xxx to the dist directory
If I understand the behavior of ag correctly, then to solve the problem in your loop you have to either
adjust the read delimiter to '\n', or
use ag -0 -l to force delimiting by '\0'.
Alternatively, you can use the following script, which is based on find instead of ag. Note that it matches SEARCH_QUERY against file names, not file contents.
while IFS= read -r file; do
echo "$file"
cp "$file" "$OUTPUT_DIR/${file##*/}"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print)
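The NUL-delimited idea discussed above can be sketched with plain grep and xargs. This is a minimal sketch, not the asker's ag setup: copy_matching is a hypothetical helper, it treats the query as a fixed string, and it assumes a grep that supports --null and an xargs that supports -0 (both GNU and BSD versions do).

```shell
#!/bin/bash
# copy_matching QUERY INPUT_DIR OUTPUT_DIR   (hypothetical helper)
# Copies every file under INPUT_DIR whose contents contain the fixed
# string QUERY into OUTPUT_DIR, flattening the directory structure.
copy_matching() (
    query=$1
    input_dir=$2
    output_dir=$3
    mkdir -p "$output_dir" || exit 1
    # --null ends each file name with a NUL byte and xargs -0 splits on NUL,
    # so names containing spaces or newlines stay intact.
    grep -rlF --null -e "$query" "$input_dir" |
        xargs -0 -I{} cp {} "$output_dir/"
)
```

Because the directory structure is flattened, two matching files with the same name in different subfolders would overwrite each other in OUTPUT_DIR.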

How can I recursively replace file and directory names using Terminal?

Using the Terminal on macOS, I want to recursively replace a word in the names of both directories and files. For instance, I have an Angular app whose module name is article, and all of the file names and directory names contain the word article. I've already done a find and replace to change articles to apples in the code. Now I want to do the same with the file structure, so that the file names and the directories follow the same convention.
Just for information, I've already tried to use the newest Yeoman generator to create new files, but there seems to be an issue with it. The alternative is to duplicate a directory and rename all of the files by hand, which is quite time consuming.
I got it to work with the following script:
var=$1
if [ -n "$var" ]; then
CRUDNAME=$1
# Capitalize the first letter, e.g. apple -> Apple
CRUDNAMEUPPERCASE=$(echo "${CRUDNAME:0:1}" | tr '[:lower:]' '[:upper:]')${CRUDNAME:1}
FOLDERNAME="${CRUDNAME}s"
# Create new folder
cp -R modules/articles "modules/$FOLDERNAME"
# Do the find/replace in all the files
# (BSD sed on macOS reads the "-e" after -i as a backup suffix, so every edited file leaves a copy ending in -e)
find "modules/$FOLDERNAME" -type f -print0 | xargs -0 sed -i -e 's/Article/'"$CRUDNAMEUPPERCASE"'/g'
find "modules/$FOLDERNAME" -type f -print0 | xargs -0 sed -i -e 's/article/'"$CRUDNAME"'/g'
# Delete the useless backup files left by sed
rm modules/$FOLDERNAME/**/*-e
rm modules/$FOLDERNAME/**/**/*-e
rm modules/$FOLDERNAME/**/**/**/*-e
# Rename all the files, one directory level at a time
for file in modules/$FOLDERNAME/**/*article* ; do mv "$file" "${file//article/$CRUDNAME}" ; done
for file in modules/$FOLDERNAME/**/**/*article* ; do mv "$file" "${file//article/$CRUDNAME}" ; done
for file in modules/$FOLDERNAME/**/**/**/*article* ; do mv "$file" "${file//article/$CRUDNAME}" ; done
else
echo "Usage: sh rename-module.sh [crud-name]"
fi
Apparently I'm not the only one to encounter this issue:
https://github.com/meanjs/generator-meanjs/issues/79
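Renaming directories as well as files at any depth can be sketched with find -depth. This is only a sketch under assumptions: rename_tree is a hypothetical helper, and the old and new words are assumed to be plain words without slashes or regex metacharacters, because they are passed to sed.

```shell
#!/bin/bash
# rename_tree ROOT OLD NEW   (hypothetical helper)
# Renames every file and directory under ROOT whose name contains OLD,
# replacing OLD with NEW in that name.
rename_tree() (
    root=$1
    old=$2
    new=$3
    # -depth lists the contents of a directory before the directory itself,
    # so children are renamed while their parent path is still valid.
    find "$root" -depth -name "*$old*" -exec sh -c '
        old=$1; new=$2; path=$3
        dir=$(dirname "$path")
        base=$(basename "$path")
        newbase=$(printf "%s\n" "$base" | sed "s/$old/$new/g")
        mv "$path" "$dir/$newbase"
    ' sh "$old" "$new" {} \;
)
```

For the example in the question this would be run as rename_tree modules/apples article apple after copying the module folder.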

c shell script: find directory and rename the output of find

I am still new to shell scripting. I have been given a task that I am having difficulty with.
I have a lot of directories whose names are dates in ddMmmyyyy format:
03Mar2014 08Aug2013 11Jan2015 16Jan2014 22Feb2014 26Mar2014
03Nov2013 08Jan2014 11Jul2013 16Jul2013 22Jul2013 26Oct2014
03Oct2013 08Jan2015 11Nov2014 16May2014 22Mar2014 26Sep2013
The task is to rename each directory to Mmmyyyy.
So far, my code is:
foreach file(`find . -type d | awk -F/ 'NF == 3'`)
echo $file
set newmove = `echo $file | cut -c 1-2,5-`
echo $newmove
mv $file $newmove
output:
for find . -type d | awk -F/ 'NF == 3':
./24Jan2015/W51A
`echo $file | cut -c 1-2,5-`
./Jan2015/W51A
mv $file $newmove
mv: cannot rename ./24Jan2015/W51A to ./Jan2015/W51A: No such file or directory
But the script didn't work.
Do you have any idea how to do this?
First of all, the issue is that your find returns the subfolders (./24Jan2015/W51A) while what you actually want to rename is the dated directory above them. mv cannot move the subfolder to ./Jan2015/W51A because ./Jan2015 does not exist yet, hence the error.
As I understand it, the idea is to rename the folders in the current working directory to the desired format and, by doing so, merge the contents of all folders that share the same MmmYYYY name into one new folder (for example, 08Jan2014 and 16Jan2014 will both be renamed to Jan2014).
You can try this:
foreach dir ( `ls` )
set newdir = `echo $dir| cut -c 3-`
mkdir -p $newdir
mv -f $dir/* $newdir
rmdir $dir
end
mkdir -p creates the folder, and does nothing if the folder already exists.
Some assumptions:
1. The folders are all in the same place, which is pwd
2. There are always two leading digits in the date
3. You want to merge the contents of folders that share the same MmmYYYY format
4. Files with the same name in different folders will be overwritten
5. There are only folders in pwd (add a check if that's not the case)
In case these folders are in different places (which is not the case according to your output: ./24Jan2015) and collisions are not an issue, the code should be changed to:
Use find
Create the new folder with the correct path
No merge or overwrite will occur, so 1, 3, 4 and 5 are not relevant.
UPDATE:
After additional input: if I understand correctly, your find is looking only for folders whose path has three /-separated fields, i.e. the subfolders directly below the dated folders. I can't say why, but you can achieve the same much faster with
find . -mindepth 2 -maxdepth 2 -type d
The output is the list of subfolders one level below the dated folders.
Then you need the name of the dated parent folder (assuming it is always in the expected format) and the new name derived from it:
set olddir = `echo $file | cut -f 1-2 -d '/'`
set newdir = `echo $olddir | cut -c 1-2,5-`
and finally
foreach file (`find . -mindepth 2 -maxdepth 2 -type d`)
set olddir = `echo $file | cut -f 1-2 -d '/'`
set newdir = `echo $olddir | cut -c 1-2,5-`
mkdir -p $newdir
mv -f $file $newdir
end
This will also handle the case where two folders are found under the same path.
UPDATE 2:
Since the script will run on Unix, the following changes should be made:
The original find is restored, since Unix find lacks the -mindepth/-maxdepth options
We should try to remove olddir to clean up the empty folders; rmdir will fail if the folder is not empty, but the script will continue to run
foreach file(`find . -type d | awk -F/ 'NF == 3'`)
set olddir = `echo $file| cut -f 1-2 -d '/'`
set newdir = `echo $olddir | cut -c 1-2,5-`
set dir_name=`basename "$file"`
if ( -d "$newdir/$dir_name" ) then
mv -f $file/* $newdir/$dir_name/
else
mkdir -p $newdir
mv -f $file $newdir
endif
rmdir $olddir
end
I really think c shell is the wrong tool for just about anything that involves programming. That said, this looks like it would do what you want with only a little help from an external tool:
#!/bin/csh
foreach file ([0-9][0-9][A-Z][a-z][a-z][0-9][0-9][0-9][0-9])
set new = `echo $file:q | cut -c 3-`
if ( -d "$new" ) then
echo "skipping $file because $new already exists"
continue
endif
mv -v "$file" "$new"
end
Note the glob that matches your list of directories to rename. This script isn't bothering to confirm whether the matched files ARE in fact directories, so if there's the possibility they might not be, you should account for that somehow.
Note that we are using a back-quoted expression to use an external tool, cut to grab a substring from each directory name. We use this (as you did in your question) because CSH IS NOT A PROGRAMMING LANGUAGE, and has no string processing capabilities of its own.
The if statement within the loop will skip any directory whose target already exists. So for example, if you were to go through the top row of your input in your question, 26Mar2014 would be converted to Mar2014, which already exists due to 03Mar2014. Since you haven't specified how this should be handled, this script skips that condition.
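For readers who are not tied to csh, the merge variant discussed in the first answer can be sketched in Bourne-style shell, where the date prefix is stripped with parameter expansion instead of cut. This is a sketch under assumptions: merge_by_month is a hypothetical helper, it expects two-digit days as in the listing, it ignores hidden files, and it skips name collisions rather than overwriting.

```shell
#!/bin/bash
# merge_by_month TOP_DIR   (hypothetical helper)
# Moves the contents of each ddMmmyyyy directory in TOP_DIR into Mmmyyyy,
# e.g. 03Mar2014/* and 26Mar2014/* both end up in Mar2014/.
merge_by_month() (
    cd "$1" || exit 1
    for dir in [0-9][0-9][A-Z][a-z][a-z][0-9][0-9][0-9][0-9]; do
        [ -d "$dir" ] || continue            # skip non-directories and an unmatched glob
        new=${dir#??}                        # drop the two leading day digits
        mkdir -p -- "$new"
        for entry in "$dir"/*; do
            [ -e "$entry" ] || continue      # the directory was empty
            if [ -e "$new/${entry##*/}" ]; then
                echo "skipping $entry: already exists in $new" >&2
            else
                mv -- "$entry" "$new/"
            fi
        done
        # Remove the old directory only if nothing was left behind.
        rmdir -- "$dir" 2>/dev/null || echo "left non-empty: $dir" >&2
    done
)
```

Running merge_by_month . in the directory that holds the dated folders would perform the merge in place.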

How to copy and rename files in shell script

I have a folder "test" that contains 20 other folders with different names like A, B, ... (actually they are people's names, not A, B, ...). I want to write a shell script that goes into each folder, such as test/A, renames all the .c files to A1, A2, ... and copies them to the "test" folder. I started like this, but I have no idea how to complete it!
#!/bin/sh
for file in `find test/* -name '*.c'`; do mv $file $*; done
Can you help me please?
This code should get you close. I tried to document exactly what I was doing.
It relies on Bash and the GNU version of find to handle spaces in file names. I tested it on a directory full of .doc files, so you'll want to change the extension as well.
#!/bin/bash
V=1
SRC="."
DEST="/tmp"
# The last path we saw -- make it garbage, but not blank (or it will break the '[' test command)
LPATH="/////"
# Let us find the files we want
find "$SRC" -iname "*.doc" -print0 | while IFS= read -r -d '' i
do
echo "We found the file name... $i"
# Now we strip the directory and the extension, leaving just the file name
FNAME=$(basename "$i" .doc)
echo "And the basename is $FNAME"
# Now we get the last chunk of the directory
ZPATH=$(dirname "$i" | awk -F'/' '{ print $NF }')
echo "And the last chunk of the path is... $ZPATH"
# If we are down a new path, reset our counter; otherwise count up
if [ "$LPATH" != "$ZPATH" ]; then
V=1
else
V=$((V + 1))
fi
LPATH=$ZPATH
# Eat the error message
mkdir "$DEST/$ZPATH" 2> /dev/null
echo cp \"$i\" \"$DEST/${ZPATH}/${FNAME}${V}\"
cp "$i" "$DEST/${ZPATH}/${FNAME}${V}"
done
#!/bin/bash
## Find the folders directly under test. This assumes you are already where test exists; otherwise give the path before "test".
folders="$(find test -mindepth 1 -maxdepth 1 -type d)"
## Look into each folder in $folders, find the folder[0-9]*.c files and move them to the test folder.
for folder in $folders
do
## Find the folder-named .c files.
leaf_folder="${folder##*/}"
folder_named_c_files="$(find "$folder" -type f -name "*.c" | grep "${leaf_folder}[0-9]")"
## Move these files to the test folder; basename holds just the file name.
## You didn't mention what the files should be renamed to, so tweak the mv command accordingly.
for file in $folder_named_c_files; do basename="${file##*/}"; mv "$file" "test/$basename"; done
done
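The naming scheme from the question (A1, A2, ... for the files in test/A) can be sketched directly with a per-folder counter. This is a sketch under assumptions: collect_c_files is a hypothetical helper, it keeps the .c extension on the copies, and it numbers the files in glob (alphabetical) order.

```shell
#!/bin/bash
# collect_c_files TOP_DIR   (hypothetical helper)
# For every subfolder NAME of TOP_DIR, copies its .c files into TOP_DIR
# as NAME1.c, NAME2.c, ... The originals are left in place.
collect_c_files() (
    top=$1
    for dir in "$top"/*/; do
        [ -d "$dir" ] || continue            # no subfolders at all
        name=$(basename "$dir")              # test/A/ -> A
        n=1
        for src in "$dir"*.c; do
            [ -f "$src" ] || continue        # this folder has no .c files
            cp -- "$src" "$top/$name$n.c"
            n=$((n + 1))
        done
    done
)
```

Called as collect_c_files test, it handles folder and file names that contain spaces, since nothing is parsed from find output.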

How can I manipulate file names using bash and sed?

I am trying to loop through all the files in a directory.
I want to do some stuff on each file (convert it to xml, not included in example), then write the file to a new directory structure.
for file in `find /home/devel/stuff/static/ -iname "*.pdf"`;
do
echo $file;
sed -e 's/static/changethis/' $file > newfile +".xml";
echo $newfile;
done
I want the results to be:
$file => /home/devel/stuff/static/2002/hello.txt
$newfile => /home/devel/stuff/changethis/2002/hello.txt.xml
How do I have to change my sed line?
If you need to rename multiple files, I suggest using the rename command:
# remove "-n" after you verify it does what you need
rename -n 's/hello/hi/g' $(find /home/devel/stuff/static/ -type f)
or, if you don't have rename, try this:
find /home/devel/stuff/static/ -type f | while IFS= read -r FILE
do
# modify the line below to do what you need, then remove the leading "echo"
echo mv "$FILE" "$(echo "$FILE" | sed 's/hello/hi/g')"
done
Are you trying to change the filename? Then
for file in /home/devel/stuff/static/*/*.txt
do
echo "Moving $file"
mv "$file" "${file/static/changethis}.xml"
done
Please make sure /home/devel/stuff/static/*/*.txt is what you want before using the script.
First, you have to create the name of the new file based on the name of the initial file. The obvious solution is:
newfile=${file/static/changethis}.xml
Second, you have to make sure that the new directory exists, or create it if it does not:
mkdir -p "$(dirname "$newfile")"
Then you can do something with your file:
doSomething < "$file" > "$newfile"
I wouldn't use the for loop over the output of find. The shell has to collect the entire list before the loop starts, and it splits that list on whitespace, so any file name containing a space is silently broken into several words without any warning. It might work if your find returns 100 files, it might work if it returns 1000 files, but it fails on the first name with a space in it and you'll never know.
The best way to handle this is to pipe the find into a while read statement, as glenn jackman does.
The sed command works only on STDIN and on the contents of files, not on file names, so if you want to munge your file name, you'll have to do something like this:
newname="$(echo "$oldname" | sed 's/old/new/')"
to get the new name of the file. The $() construct executes the command and substitutes whatever the command writes to STDOUT.
So, your script will look something like this:
find /home/devel/stuff/static/ -name "*.pdf" | while IFS= read -r file
do
echo "$file"
newfile="$(echo "$file" | sed -e 's/static/changethis/')"
newfile="$newfile.xml"
echo "$newfile"
done
Now, since you're changing the directory part of the file name, you'll have to make sure the directory exists before you do your move or copy:
find /home/devel/stuff/static/ -name "*.pdf" | while IFS= read -r file
do
echo "$file"
newfile="$(echo "$file" | sed -e 's/static/changethis/')"
newfile="$newfile.xml"
echo "$newfile"
# Check for the directory and create it if it doesn't exist
dirname=$(dirname "$newfile")
if [ ! -d "$dirname" ]
then
mkdir -p "$dirname"
fi
# The directory now exists, so you can do the move
mv "$file" "$newfile"
done
Note the quotation marks, which handle the case where there's a space in the file name.
By the way, instead of doing this:
if [ ! -d "$dirname" ]
then
mkdir -p "$dirname"
fi
You can do this:
[ -d "$dirname" ] || mkdir -p "$dirname"
The || means to execute the following command only if the preceding test isn't true. Thus, if [ -d "$dirname" ] is a false statement (the directory doesn't exist), you run mkdir.
It's a fairly common shortcut when you see shell scripts.
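The steps above (derive the new name, create its directory, then write the file) can be sketched without parsing find output at all. This is a sketch under assumptions: mirror_as_xml is a hypothetical helper, cp stands in for whatever conversion you actually run, and the source root is given without a trailing slash.

```shell
#!/bin/bash
# mirror_as_xml SRC_ROOT DST_ROOT   (hypothetical helper)
# For every *.pdf under SRC_ROOT, creates DST_ROOT/<same relative path>.xml,
# e.g. static/2002/hello.pdf -> changethis/2002/hello.pdf.xml
mirror_as_xml() (
    src_root=$1
    dst_root=$2
    # find hands the file names to the inner shell as arguments,
    # so spaces in names need no special treatment.
    find "$src_root" -type f -name '*.pdf' -exec sh -c '
        src_root=$1; dst_root=$2; shift 2
        for file in "$@"; do
            rel=${file#"$src_root"/}             # path relative to src_root
            newfile=$dst_root/$rel.xml
            mkdir -p "$(dirname "$newfile")"     # make sure the target directory exists
            cp "$file" "$newfile"                # placeholder for the real conversion
        done
    ' sh "$src_root" "$dst_root" {} +
)
```

For the paths in the question it would be called as mirror_as_xml /home/devel/stuff/static /home/devel/stuff/changethis.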
find ... | while read file; do
newfile=$(basename "$file").xml;
do something to "$file" > "$somedir/$newfile"
done
OUTPUT="$(pwd)"
for file in `find . -iname "*.pdf"`
do
echo "$file"
cp "$file" "$file.xml"
echo "file created in directory = ${OUTPUT}"
done
This creates a new file named yourfilename.xml; for hello.pdf the new file is hello.pdf.xml. In other words, it makes a copy with .xml appended to the name.
Remember that the script searches the present working directory (so run it from /home/devel/stuff/static/) for files whose names match the pattern given to find (in this case *.pdf), and writes each copy next to the original.
The find command in this script only finds files with names ending in .pdf. If you want to run the script on files with names ending in .txt, change the find command to find . -iname "*.txt".
Once I wanted to remove a trailing -min from my file names, i.e. I wanted alg-min.jpg to turn into alg.jpg. After some struggle, I came up with something like this:
for f in *-min*; do echo "$f"; mv "$f" "$(echo "$f" | sed 's/-min//g')"; done
Hope this helps someone who wants to REMOVE or SUBSTITUTE some part of their file names.

Resources