How to identify files which are not in list using bash? - bash

Unfortunately, my knowledge of bash is limited, and I have a rather non-standard task.
I have a file containing a list of files.
Example: /tmp/my/file1.txt /tmp/my/file2.txt
How can I write a script that checks whether the listed files exist in the folder /tmp/my and prints two kinds of messages when it is done?
1 - These files exist:
/tmp/my/file1.txt
/tmp/my/file2.txt
2 - The folder /tmp/my contains files and folders which are not in your list. The files and folders:
/tmp/my/test
/tmp/my/1.txt

You speak of files and folders, which seems unclear.
Anyway, I wanted to try it with arrays, so here we go:
unset valid_paths; declare -a valid_paths
unset invalid_paths; declare -a invalid_paths
while IFS= read -r line
do
    if [ -e "$line" ]
    then
        valid_paths+=("$line")      # append the path to the array
    else
        invalid_paths+=("$line")
    fi
done < files.txt
echo "VALID PATHS:"; printf '%s\n' "${valid_paths[@]}"
echo "INVALID PATHS:"; printf '%s\n' "${invalid_paths[@]}"

You can check for the files' existence (assuming a list of files, one filename per line) and print each one with a prefix like this:
# Part 1 - check list contents for files
while IFS= read -r thefile; do
    if [[ -n "$thefile" ]] && [[ -f "/tmp/my/$thefile" ]]; then
        echo "Y: $thefile"
    else
        echo "N: $thefile"
    fi
done < filelist.txt | sort
# Part 2 - check existing files against list
for filepath in /tmp/my/* ; do
    filename="$(basename "$filepath")"
    # -x: match the whole line, -F: treat the name as a fixed string, not a regex
    grep -qxF -- "$filename" filelist.txt || echo "U: $filename"
done
Files that exist are prefixed with Y:, all others with N:.
In the second part, entries in /tmp/my that are not in the file list are labelled with U: (unaccounted for/unexpected).
You can swap the -f test, which checks that a path exists and is a regular file, for -d (exists and is a directory) or -e (exists).
See
man test
for more options.
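The question's example list holds full paths rather than bare filenames, so the same idea can be sketched with the list loaded into an associative array. This is only a minimal sketch: the function name check_against_list is hypothetical, it assumes bash 4 or later, and it assumes the list has one full path per line.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare the entries of a directory against a list file.
# Usage: check_against_list DIR LISTFILE   (LISTFILE: one full path per line)
check_against_list() {
    local dir=$1 list=$2 line path
    local -A listed=()                      # set of paths named in the list

    while IFS= read -r line || [[ -n $line ]]; do
        [[ -n $line ]] || continue          # skip blank lines
        listed[$line]=1
        if [[ -e $line ]]; then
            printf 'Y: %s\n' "$line"        # listed and present
        else
            printf 'N: %s\n' "$line"        # listed but missing
        fi
    done < "$list"

    for path in "$dir"/*; do
        [[ -e $path ]] || continue          # an unmatched glob stays literal
        # Report entries of DIR that the list never mentioned.
        [[ -n ${listed[$path]+set} ]] || printf 'U: %s\n' "$path"
    done
}
```

It would be called as check_against_list /tmp/my files.txt. Hidden files are not reported, because * does not match names that start with a dot.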

Related

Check if files exist in case some files contain [ Bash

I've got a set of files, let say
file1.txt
File2.txt
File [3].txt
file 4.txt
In my script, I store the path of each file in a var called $file.
Here is my issue:
In bash, testing for its existence with the following command
[[ ! -f "$file" ]]
WORKS (= the system sees that the file exists) for regular files like
file1.txt
File2.txt
file 4.txt
BUT DOES NOT WORK (= the system doesn't find the file, as if it did not exist) for a file with [ ] in its name, like File [3].txt.
I assume it is because the [ ] interferes with the double [[. Testing with
test ! -f "$file"
gives the same result: the system does not see the file and reports it as missing.
What can I do to escape the [ or to avoid this behaviour? I've tried to find the solution on the net, but when I search for "check if file exist with filename containing [" the results are biased, because [ / [[ is what is used to check for existence.
Thanks for your help!
EDIT - 2022-01-15
Here is the loop I'm using
while read -r file; do
if [[ ! -f "$file" ]]; then
echo "Missing file $file"
fi
done < Compil.all ;
where Compil.all is a text file containing the file paths:
$ cat Compil.all
/media/veracrypt1/file1.txt
/media/veracrypt1/File2.txt
/media/veracrypt1/File [3].txt
/media/veracrypt1/file 4.txt
$
As I don't want to have issues with spaces in filenames, I've put the following code at the beginning of the script. Could it be the reason?
IFS=$(echo -en "\n\b")
How are you storing the file var?
Simply iterating works as shown below:
$ ls
file1.txt File2.txt 'File [3].txt' 'file 4.txt'
$ for file in ./* ;do if [[ -f "$file" ]];then echo "$file"; fi; done
./file1.txt
./File2.txt
./File [3].txt
./file 4.txt
This also works:
$ [[ ! -f "File [3].txt" ]]
$ echo $?
1
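For the loop over Compil.all, a minimal sketch along these lines reads each path verbatim, so brackets and spaces reach the test unchanged. The function name report_missing is hypothetical. Stripping a trailing carriage return is an assumption about one common cause of "missing" files (DOS-style line endings in the list), not a diagnosis of the original problem.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a message for every path in LISTFILE
# that is not an existing regular file.
report_missing() {
    local list=$1 file
    # IFS= and -r keep leading/trailing blanks and backslashes intact.
    while IFS= read -r file || [[ -n $file ]]; do
        file=${file%$'\r'}                  # assumption: drop a trailing CR if present
        [[ -n $file ]] || continue          # skip blank lines
        # The quoted/[[ ]] expansion is not glob-expanded, so [ ] are literal here.
        [[ -f $file ]] || printf 'Missing file %s\n' "$file"
    done < "$list"
}
```

It would be called as report_missing Compil.all. Because IFS is set only for the read command, no global IFS change is needed to handle spaces.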

Bash script backup, check if directory contains the files from another directory

I am making a bash backup script and I want to implement a feature that checks whether the files from one directory are already contained in another directory. If they are not, I want to output the names of the missing files.
#!/bin/bash
TARGET_DIR=$1
INITIAL_DIR=$2
TARG_ls=$(ls -A $1)
INIT_ls=$(ls -A $2)
if [[ "$(ls -A $2)" ]]; then
if [[ ! -n "$(${TARG_ls} | grep ${INIT_ls})" ]]; then
echo All files in ${INITIAL_DIR} have backups for today in ${TARGET_DIR}
exit 0
else
#code for listing the missing files
fi
else
echo Error!! ${INITIAL_DIR} has no files
exit 1
fi
I have thought about storing the ls output of both directories inside strings and comparing them, as it is shown in the code, but in the event where I have to list the files from INITIAL_DIR that are missing in TARGET_DIR, I just don't know how to proceed.
I tried using the diff command comparing the two directories but that takes into account the preexisting files of TARGET_DIR.
In the above code, if [[ "$(ls -A $2)" ]]; checks whether INITIAL_DIR contains any files, and if [[ ! -n "$(${TARG_ls} | grep ${INIT_ls})" ]]; is meant to check whether the target directory contains all the files of the initial directory.
Anyone have a suggestion, hint?
You can use the comm command:
$ comm <(ls -A a) <(ls -A b)
This prints three columns: files only in a, files in both a and b, and files only in b. For example, to list the files that are only in a:
$ comm -23 <(ls -A a) <(ls -A b)
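If filenames may contain newlines or other unusual characters, parsing ls output becomes fragile. A minimal alternative sketch uses a glob loop instead of comm. The function name missing_in_target is hypothetical, and the comparison is by name only, not by content.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the names found in INITIAL that have no
# entry of the same name in TARGET.
# The body runs in a subshell ( ... ) so the shopt changes do not leak out.
missing_in_target() (
    initial=$1 target=$2
    shopt -s nullglob dotglob   # empty dir gives no iterations; include dotfiles like ls -A
    for path in "$initial"/*; do
        name=${path##*/}                    # strip the directory part
        [[ -e $target/$name ]] || printf '%s\n' "$name"
    done
)
```

It would be called as missing_in_target "$INITIAL_DIR" "$TARGET_DIR". Empty output means every name in the initial directory is present in the target.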
rsync has a --dry-run switch that will show you which files differ between two directories. Before making rsync copies of my home directory, I preview the changes this way to check for signs of mass malicious encryption or corruption before proceeding.

Parse CSV to find names corresponding to code, then copying folders with matching code to folders with corresponding name

I'm trying to automate the packaging of files and contents from various sources using a bash script.
I have a main directory which contains pdf files, a csv file, and various folders with additional contents. The folders are named with the location code they pertain to, e.g. 190, 191, etc.
A typical row in my csv file looks like this: form_letters_Part1.pdf,PX_A31_smith.adam.pdf,190,
Where the first column is the original pdf name, the second is what it will be renamed to, and the third column is the location code the person belongs to.
The first part of my script renames the pdf files from the cover letters format to the PX_A31... format, and then creates a directory for each file and moves them into it.
#!/usr/bin/tcsh bash
sed 's/"//g' rename_list_lab.csv | while IFS=, read orig new num; do
mv "$orig" "$new"
done
echo 'Rename Done.'
for file in *.pdf; do
mkdir "${file%.*}"
mv "$file" "${file%.*}"
done
echo 'Directory creation done.'
What needs to happen next is the folders with the location-specific contents get copied into those new directories just created, corresponding to the location code from the csv file.
So I tried this after the above echo 'Directory Creation Done.' line:
echo 'Directory Creation Done.'
sed 's/"//g' rename_list.csv | while IFS=, read orig new num; do
for folder in *; do
if [[ -d .* = "$num" ]]; then
cp -R "$folder" "${file%.*}"
fi
done
echo 'Code Folder Contents Sort Done.'
However this results in a syntax error:
syntax error in conditional expression
syntax error near `='
` if [[ -d .* = "$num" ]]; then'
EDIT: To clarify the second part if statement, the intended logic of the statement is as follows: For the items in the current directory, if it is a directory, and the name of the directory matches the location code from the csv, that directory should be copied to any directories which have that same corresponding location code in the csv.
In other words, if the newly created directory from the first part is PX_A31_smith.adam whose location code in the csv line above is 190, then the folder called 190 should be copied into the directory PX_A31_smith.adam.
If three other people also have the 190 code in the csv, the 190 directory should also be copied to those as well.
EDIT 2: I resolved the syntax error, and also realized I had an nonterminated do statement. Fixing those, still seem to be having trouble with the evaluation of the if statement. Updated script below:
#!/usr/bin/tcsh bash
sed 's/"//g' rename_list.csv | while IFS=, read orig new num; do
mv "$orig" "$new"
done
echo '1 Done.'
for file in *.pdf; do
mkdir "${file%.*}"
mv "$file" "${file%.*}"
done
echo '2 done.'
sed 's/"//g' rename_list.csv | while IFS=, read orig new num; do
for folder in * ; do
if [[ .* = "$num" ]]; then
cp -R "$folder" "${file%.*}"
else echo "No matches found."
fi
done
done
echo '3 Done.'
I'm not really sure if this answers your question, but I think it will at least set you on the right track. Structurally, I just combined all of the loops into one. This removes some possible logic errors that are not syntax errors, such as the use of $file in the second part. That variable is set by the for loop in the first part; once that loop has finished, it only holds the last name the loop saw, so "${file%.*}" no longer refers to the directory that belongs to the current CSV row.
#!/usr/bin/bash
#^Fixed shebang line.
sed 's/"//g' rename_list.csv | while IFS=, read -r orig new num; do
if [[ -f $orig ]]; then #If the file we want to rename is indeed a file.
mkdir "${new%.*}" #make the directory from the file name you want
mv "$orig" "${new%.*}/$new" #Rename when we move the file into the new directory
if [[ -d $num ]]; then #If the number directory exists
cp -R "$num" "${new%.*}" #Fixed this based on your edit.
else
#Here you can handle what to do if the number directory does not exist.
echo "$num is not a directory."
fi
else
#Here you can handle what to do if the file does not exist.
echo "The file $orig does not exist."
fi
done
Edited based on your clarification
Note: This is pretty lacking as far as error checking goes. Remember, any of these commands could fail, which would lead to unwanted behavior. You can either test [[ $? != 0 ]] to check the exit status of the last command (0 being success), or do something like mkdir somedir || exit 2 to exit on failure.
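The || pattern from the note can be sketched as a small function that stops at the first failing step. The name move_into_dir and the return codes 2 and 3 are hypothetical choices for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create DIR if needed and move SRC into it,
# returning a distinct non-zero status for each step that can fail.
move_into_dir() {
    local src=$1 dir=$2
    mkdir -p -- "$dir" || { echo "cannot create directory $dir" >&2; return 2; }
    mv -- "$src" "$dir/" || { echo "cannot move $src into $dir" >&2; return 3; }
}
```

Inside the CSV loop it could be used as move_into_dir "$orig" "${new%.*}" || continue, so that a failed row is skipped instead of letting later commands run on a half-finished state.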

bash - find duplicate file in directory and rename

I have a directory that has thousands of files in it with various extensions. I also have a drop location where users drop files to be migrated to this directory. I'm looking for a script that will scan the target directory for a duplicate file name, if found, rename the file in the drop folder, then move it to the target directory.
Example:
/target/file.doc
/drop/file.doc
Script will rename file.doc to file1.doc then move it to /target/.
It needs to maintain the file extension too.
for path in /drop/*
do
    fil=${path##*/}    # the glob yields /drop/name, so strip the directory part
    if [ -f "/target/$fil" ]
    then
        suff=$(awk -F\. '{ print "."$NF }' <<<"$fil")   # extension, including the dot
        bdot=$(basename -s "$suff" "$fil")              # name without the extension
        mv "/drop/$fil" "/drop/${bdot}1$suff"
        cp "/drop/${bdot}1$suff" "/target/${bdot}1$suff"
    fi
done
Take each file in the drop directory and check whether it already exists in /target using test -f. If it does, rename it in the drop directory and then copy it to the target.
You have to take a bit more care than simply checking whether a file exists before moving it if you want a flexible solution that can handle files with or without extensions. You may also want to form duplicate filenames in a way that preserves sort order. For example, if file.txt already exists, you may want to use file_001.txt as the duplicate in target rather than file1.txt, because once you reach 10 the names will no longer sort in order.
Also, you never want to iterate with for i in $(ls dir), which is fraught with pitfalls. See Bash Pitfalls No. 1.
Putting those pieces together, with details in the comments below, you could do something similar to the following and have a reasonably flexible solution that lets you specify either just filename.ext or /path/to/drop/filename.ext. You must set the drop and target directories in the script to match your circumstances, e.g.
#!/bin/bash
tgt=target ## set target and drop directories as required
drp=drop
declare -i cnt=1 ## counter for filename_$cnt
test -z "$1" && { ## validate one argument given
printf "error: insufficient input\nusage: %s filename\n" "${0##*/}"
exit 1
}
test -w "$1" || test -w "$drp/$1" || { ## validate file exists and is writable
printf "error: file not found or lack permission to move '%s'.\n" "$1"
exit 1
}
fn="${1##*/}" ## strip any path info from filename
if test "$fn" != "${fn%.*}" ; then ## filename (not the path) contains a dot
ext="${fn##*.}" ## get file extension
fnwoe="${fn%."$ext"}" ## get filename without extension
test "$fnwoe" = '' && ext= ## was a dotfile, reset ext
fi
vfn="$fn" ## set valid filename = filename
## form valid filename e.g. "$fn_001.$ext" if duplicate found
while test -e "$tgt/$vfn"; do
if test -n "$ext" ## did we have an extension?
then
printf -v vfn "%s_%03d.%s" "$fnwoe" "$((cnt++))" "$ext"
else
printf -v vfn "%s_%03d" "$fn" "$((cnt++))"
fi
done
mv "$drp/$fn" "$tgt/$vfn" ## move file under non-conflicting name
Example drop and target
$ ls -1 drop
file
file.txt
$ ls -1 target
file.txt
file_001.txt
file_002.txt
Example Use
$ bash mvdrop.sh file
$ bash mvdrop.sh drop/file.txt
Resulting drop and target
$ ls -1 drop
$ ls -1 target
file
file.txt
file_001.txt
file_002.txt
file_003.txt
This will test whether the file exists in the target, preserve the extension (along with any structure before the extension, such as in the case of FILE.tar.gz), and move it to the target directory.
#!/bin/bash
TARGET="/target/"
DROP="/drop/"
for F in $(ls "$DROP"); do    # note: this splits names that contain whitespace
    if [[ -f $TARGET$F ]]; then
        EXT=$(echo "$F" | awk -F "." '{print $NF}')
        PRE=$(echo "$F" | awk -F "." '{$NF="";print $0}' | sed -e 's/ $//g;s/ /./g')
        mv "$DROP$F" "$DROP${PRE}1.$EXT"
        F="${PRE}1.$EXT"
    fi
    mv "$DROP$F" "$TARGET"
done
Additionally, you may want to filter the output of the ls command so that you aren't moving entire directories.
Display everything except directories:
ls -p $DROP | grep -v /

Bash: Find any subdirectories without a given file present

I want to know whether my file exists in each of the subdirectories below. The subdirectories are created in earlier steps of my shell script. The code below always tells me the file does not exist (even if it does), and I want the path to be printed as well.
#!/bin/bash
....
if ! [[ -e [ **/**/somefile.txt && -s **/**/somefile.txt ]]; then
echo "===> Warn: somefile.txt was not created in the following path: "
# I want to be able to print the path in which file is not generated
exit 1
fi
I know the file name is somefile.txt which is to be created in all sub-directories, but the subdirectory names change a lot.. Hence globbing.
#!/bin/bash
shopt -s globstar ## enable **, which by default has no special behavior
for d in **/; do
    if ! [[ -s "${d}somefile.txt" ]]; then
        echo "===> WARN: somefile.txt was not created (or is empty) in $d" >&2
        exit 1
    fi
done
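To print every affected path instead of stopping at the first one, the loop can collect results and set the exit status at the end. This is a minimal sketch: the function name find_dirs_missing_file is hypothetical, it relies on the globstar shell option (bash 4 or later), and it checks every subdirectory at any depth.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list every subdirectory of ROOT (at any depth) that
# lacks a non-empty file called NAME. Exits 1 if any such directory is found.
# The body runs in a subshell ( ... ) so cd and shopt do not affect the caller.
find_dirs_missing_file() (
    root=$1 name=$2
    shopt -s globstar nullglob      # ** recurses; no subdirectories gives no iterations
    cd -- "$root" || exit 2
    status=0
    for d in **/; do                # each match ends with a slash
        if ! [[ -s ${d}${name} ]]; then
            printf '%s\n' "${d%/}"  # path relative to ROOT, without the trailing slash
            status=1
        fi
    done
    exit "$status"
)
```

It would be called as find_dirs_missing_file . somefile.txt, and the printed paths can be included in the warning message.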
