To move blocks of folders within a specific range in bash script - bash

I have to handle around 2775 folders, each of which contains some files. Each folder is named conf+n, where n is a number ranging from 1 to 2775 (e.g. conf1, ... conf250, ... conf2775). I created some parent folders named A10, A20, ... A100, A200, ... A1000, A2000, etc. The problem is the following: I would like to move the first 100 folders, conf1-conf100, to A100, then conf101-conf200 to A200, etc.
A further complication is that the total number of folders (2775) is not fixed, since I would like my script to adapt to different ranges of folders.
In my trial script I create 250 conf# folders as well as 25 A# folders, but I got stuck when moving conf1-conf10 to A10, conf11-conf20 to A20, etc. I tried to write an iterative for loop for that, but it failed.
#!/bin/bash
for d in ./test_abc/
do (cd "$d" &&
for ((i=1;i<=250;i++))
do
mkdir $i
done);
done
while true; do
if [ "250" -le "500" ];
then
pfldrs=$((250 / 10))
echo -en '\n'
echo -e "\x1B[31m CREATING BLOCKS ...\x1B[0m \x1B[32m $op \x1B[0m"
echo -en '\n'
for d in ./test_abc/
do (cd "$d" &&
for ((k=1;k<=$pfldrs;k++))
do
m=$(($k*10))
mkdir A$m
done);
done
break
fi
done
I expect the A10 folder to contain folders conf1-conf10, folder A20 to contain folders conf11-conf20, etc.
Thanks in advance for the help!

I just made a simple script, but I think there may be better ideas.
move.sh
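#!/bin/bash
source_dir=$1
target_dir=$2
interval=$3
# highest N among the confN directories (sort -V is a version sort, so conf10 sorts after conf9)
total_dir=$(basename "$(find "$source_dir" -type d -name "conf*" | sort -Vr | head -1)" | sed "s/conf//g")
echo "total $total_dir interval $interval"
target=0
for i in $(seq 1 "$total_dir"); do
    # start a new block folder whenever i passes the current block's upper bound
    if [ "$i" -gt "$target" ]; then
        target=$((target + interval))
        mkdir -p "$target_dir/A$target"
    fi
    mv "$source_dir/conf$i" "$target_dir/A$target/"
done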
Run it as:
./move.sh <source_dir containing the conf* dirs> <target_dir where the A* dirs will be created> <interval>
e.g. ./move.sh ./conf_dir ./move_dir 10
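As a possible variation, the block folder can be computed directly from each folder's number instead of tracking a running target. The following is only a sketch under that assumption; move_blocks is a hypothetical helper name, and it takes the same three arguments as move.sh. It also skips gaps in the numbering, since it only looks at the conf* folders that actually exist.

```shell
#!/bin/bash
# Sketch: move conf1..confN from a source dir into block folders A<k*interval>.
# move_blocks is a hypothetical helper name (not part of move.sh above).
#   $1 = source dir, $2 = target dir, $3 = interval
move_blocks() {
    local src=$1 dst=$2 interval=$3
    local dir n target
    for dir in "$src"/conf*/; do
        [ -d "$dir" ] || continue                    # glob matched nothing
        dir=${dir%/}
        n=${dir##*/conf}                             # numeric suffix, e.g. 17
        case $n in ''|*[!0-9]*) continue ;; esac     # skip names like confX
        # round n up to the next multiple of interval: 1..10 -> 10, 11..20 -> 20
        # (10# forces base 10 so a suffix like 08 is not read as octal)
        target=$(( (10#$n + interval - 1) / interval * interval ))
        mkdir -p "$dst/A$target"
        mv "$dir" "$dst/A$target/"
    done
}
```

With an interval of 10, conf1-conf10 end up in A10 and conf11-conf20 in A20; a partial last block (say conf21-conf25) goes to A30.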

Related

check whether a directory contains all (and only) the listed files

I'm writing unit-tests to test file-IO functions.
There's no formalized test-framework in my target language, so my idea is to run a little test program that somehow manipulates files in a test-directory, and after that checks the results in a little shell script.
To evaluate the output, I want to check a given directory whether all expected files are there and no other files have been created during the test.
My first attempt goes like this:
set -e
test -e "${test_dir}/a.txt"
test -e "${test_dir}/b.txt"
test -d "${test_dir}/dir"
find "${test_dir}" -mindepth 1 \
-not -wholename "${test_dir}/a.txt" \
-not -wholename "${test_dir}/b.txt" \
-not -wholename "${test_dir}/dir" \
| grep . && exit 1 || true
This properly detects whether there are two files a.txt and b.txt, and a subdirectory dir/ in the ${test_dir}.
If there happens to be a file c.txt, the test should and will fail.
However, this doesn't scale well.
There are dozens of unit-tests and each has a different set of files/directories, so I find myself repeating lines very similar to the above again and again.
So I'd rather wrap the above into a function call like so:
if checkdirectory "${test_dir}" a.txt b.txt dir/ dir/subdir/ dir/.hidden.txt; then
echo "ok"
else
echo "ko"
fi
Unfortunately I have no clue how to implement checkdirectory (esp. the find invocation with multiple -not -wholename ... stanzas gives me headache).
To add a bit of fun, the constraints are:
support both (and differentiate between) files and directories
must (EDITed from should) run on Linux, macOS & MSYS2/MinGW, therefore:
POSIX if possible (in reality it will be bash, but probably bash<<4! so no fancy features)
EDIT
some more constraints (these didn't make it into my original late-night question; so just consider them "extra constraints for bonus points")
the test-directory may contain subdirectories and files in subdirectories (up to an arbitrary depth), so any check needs to operate on more than just the top-level directory
ideally, the paths may contain weirdo characters like spaces, linebreaks,... (this is really unit-testing. we do want to test for such cases)
the testdir is more often than not some randomly generated directory using mktemp -d, so it would be nice if we could avoid hardcoding it in the tests
no assumptions about the underlying filesystem can be made.
Assuming we have a directory tree as an example:
$test_dir/a.txt
$test_dir/b.txt
$test_dir/dir/c.txt
$test_dir/dir/"d e".txt
$test_dir/dir/subdir/
then would you please try:
#!/bin/sh
checkdirectory() {
local i
local count
local testdir=$1
shift
for i in "$@"; do
case "$i" in
*/) [ -d "$testdir/$i" ] || return 1 ;; # check if the directory exists
*) [ -f "$testdir/$i" ] || return 1 ;; # check if the file exists
esac
done
# convert each filename to just a newline, then count the lines
count=`find "$testdir" -mindepth 1 -printf "\n" | wc -l`
[ "$count" -eq "$#" ] || return 1
return 0
}
if checkdirectory "$test_dir" a.txt b.txt dir/ dir/c.txt "dir/d e.txt" dir/subdir/; then
echo "ok"
else
echo "ko"
fi
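One caveat for the macOS requirement: -printf is a GNU find extension, and the BSD find shipped with macOS does not provide it. A minimal sketch of a counting step that avoids it is shown here; count_entries is a hypothetical helper name, and -mindepth is assumed to be available (GNU and BSD find both have it, although it is not in POSIX).

```shell
#!/bin/sh
# count_entries is a hypothetical helper: prints the number of files and
# directories below $1 (not counting $1 itself).
count_entries() {
    # Run printf once per entry, so exactly one newline is emitted per entry
    # even when a name contains spaces or line breaks. tr strips the padding
    # that some wc implementations put in front of the number.
    find "$1" -mindepth 1 -exec printf '\n' \; | wc -l | tr -d ' '
}
```

Inside checkdirectory this could replace the count=... line, e.g. count=$(count_entries "$testdir"). It starts one process per entry, which should be acceptable for small test directories.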
One easy and fast way would be to compare the output of find with a reference string:
Let's start with an expected directory and file structure:
d/FolderA/filexx.csv
d/FolderA/filexx.doc
d/FolderA/Sub1
d/FolderA/Sub2
testassert
#!/usr/bin/env bash
assertDirContent() {
read -r -d '' s < <(find "$1" -printf '%y %p\n')
[ "$2" = "$s" ]
}
testref='d d/FolderA/
f d/FolderA/filexx.csv
f d/FolderA/filexx.doc
d d/FolderA/Sub1
d d/FolderA/Sub2'
if assertDirContent 'd/FolderA/' "$testref"; then
echo 'ok'
else
echo 'Directory content assertion failed'
fi
Testing it:
$ ./testassert
ok
$ touch d/FolderA/unwantedfile
$ ./testassert
Directory content assertion failed
$ rm d/FolderA/unwantedfile
$ ./testassert
ok
$ rmdir d/FolderA/Sub1
$ ./testassert
Directory content assertion failed
$ mkdir d/FolderA/Sub1
$ ./testassert
ok
$ rmdir d/FolderA/Sub2
# Replace with a file instead of a directory
$ touch d/FolderA/Sub2
$ ./testassert
Directory content assertion failed
Now if you add timestamps and other info such as permissions, owner, and group to the find -printf output, you can also check that all of these match the asserted string.
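As a rough illustration of that idea, here is a sketch that adds the octal permission bits to each line. It assumes GNU find (for -printf); assertDirContentMode is a hypothetical name, and the output is sorted so the comparison does not depend on the order in which find walks the directory.

```shell
#!/usr/bin/env bash
# assertDirContentMode is a hypothetical variant of assertDirContent.
# Assumes GNU find: %y = file type, %m = octal permissions, %p = path.
#   $1 = directory to check, $2 = expected listing (sorted, one entry per line)
assertDirContentMode() {
    local actual
    # sort so the result does not depend on directory traversal order
    actual=$(find "$1" -mindepth 1 -printf '%y %m %p\n' | LC_ALL=C sort)
    [ "$2" = "$actual" ]
}
```

The reference string then has lines such as "f 644 d/FolderA/filexx.csv", so a file that exists with the wrong mode fails the assertion as well.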
I don't know what you mean by differentiating between files and directories since your last if statement is somehow binary. Here's what worked for me:
#! /bin/bash
function checkdirectory()
{
    local test_dir="${1}"
    shift
    local file
    local ignore=()
    for file in "$@"
    do
        # fail if an expected entry is missing
        [[ -e "${test_dir}/${file}" ]] || return 1
        # -I is meant to be appended to "ls" to ignore the files in order to check if other files exist.
        ignore+=(-I "${file%/}")
    done
    # anything ls still lists is an unexpected top-level entry (-I requires GNU ls)
    [[ -n "$(ls -A "${ignore[@]}" -- "${test_dir}")" ]] && return 1
    return 0
}
if checkdirectory /some/directory a.txt b.txt dir; then
echo "ok"
else
echo "ko"
fi
Here's a possible solution I dreamed up during the night.
It destroys the test-data, so might not be usable in many cases (though it might just work for paths generated on-the-fly during unit tests):
checkdirectory() {
local i
local testdir=$1
shift
# try to remove all the listed files
for i in "$@"; do
if [ "x${i}" = "x${i%/}" ]; then
rm "${testdir}/${i}" || return 1
fi
done
# the directories should now be empty,
# so try to remove those dirs that are listed
for i in "$@"; do
if [ "x${i}" != "x${i%/}" ]; then
rmdir "${testdir}/${i}" || return 1
fi
done
# finally ensure that no files are left
if find "${testdir}" -mindepth 1 | grep . >/dev/null ; then
return 1
fi
return 0
}
When invoking the checkdirectory function, deeper directories must come first (that is checkdirectory foo/bar/ foo/ rather than checkdirectory foo/ foo/bar/).

Move directory from queue directory Bash script

I'm trying to implement a directory queue.
I have the following directories:
Q_Dir
folder1
subfolder1
...
subfolderN
files1....filesN+X
....
Target_Dir
folder1
subfolder1
....
subfolderN
files1...filesN
....
I want to move a maximum of X files from Q_Dir to Target_Dir.
Pseudo code:
While True:
totalFiles = Count of total files in Target_Dir
If totalFiles < X then:
Move X-totalFiles files From Q_Dir to Target_Dir
Else
Sleep 5 seconds
I am looking for the best way to do this in a Linux bash script.
Any suggestions?
Consider the following implementation of the pseudo-code. It is a one-to-one implementation and could be improved if more details become available.
S=Q_Dir
T=Target_Dir
X=6
mkdir -p "$T"
while true ; do
    t_count=$(find "$T" -type f | wc -l)
    if [[ "$t_count" -lt "$X" ]] ; then
        # Read at most X-t_count paths (readarray needs bash 4+). Process
        # substitution leaves the array empty when the queue has no files.
        readarray -t -n "$((X-t_count))" files < <(cd "$S" && find . -type f)
        echo "F=${#files[@]}"
        for f in "${files[@]}" ; do
            d=${f%/*}
            mkdir -p "$T/$d"
            mv "$S/$f" "$T/$d/"
        done
        sleep 3
    else
        sleep 5
    fi
done
Note that the code does not provide atomic updates: if multiple instances of the script run against the same source or target folder, they can interfere with each other.
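If several instances might run at once, one common way to serialize them is a lock directory, because mkdir either creates the directory or fails. This is only a sketch of that technique, not something the loop above already does; acquire_lock, release_lock, and the lock path are hypothetical names.

```shell
#!/bin/bash
# acquire_lock / release_lock are hypothetical helper names.
# mkdir succeeds for exactly one caller; everyone else gets a non-zero status.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1"
}

# Possible use around the move step (the lock path is an assumption):
#   lock=/tmp/q_dir.lock
#   if acquire_lock "$lock"; then
#       ... count files and move them ...
#       release_lock "$lock"
#   else
#       sleep 5    # another instance is working, try again later
#   fi
```

A script that is killed while holding the lock leaves the directory behind, so a real setup would need some cleanup, for example a trap on EXIT.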

Nested for loop to enter and exit multiple directories Bash script

As an example, I have 7 directories each containing 4 files. The 4 files follow the following naming convention name_S#_L001_R1_001.fastq.gz. The sed command is to partially keep the unique file name.
I have a nested for loop in order to enter a directory and perform a command, exit the directory and proceed to the next directory. Everything seems to be working beautifully, however the code gets stuck on the last directory looping 4 times.
for f in /completepath/*
do
[ -d $f ] && cd "$f" && echo Entering into $f
for y in `ls *.fastq.gz | sed 's/_L00[1234]_R1_001.fastq.gz//g' | sort -u`
do
echo ${y}
done
done
Example output-
Entering into /completepath/m_i_cast_avpv_1
iavpvcast1_S6
Entering into /completepath/m_i_cast_avpv_2
iavpvcast2_S6
Entering into /completepath/m_i_int_avpv_1
iavpvint1_S5
Entering into /completepath/m_i_int_avpv_2
iavpvint2_S5
Entering into /completepath/m_p_cast_avpv_1
pavpvcast1_S8
Entering into /completepathd/m_p_int_avpv_1
pavpvint1_S7
Entering into /completepath/m_p_int_avpv_2
pavpvint2_S7
pavpvint2_S7
pavpvint2_S7
pavpvint2_S7
Any recommendations of how to correctly exit the inner loop?
It looks like /completepath/ contains some entries that are not directories. When the loop over /completepath/* sees something that's not a directory, it doesn't enter it, thanks to the [ -d $f ] check.
But it still continues to run the next for y in ... loop.
At that point the script is still in the previous directory it has seen.
One way to solve that is to skip the rest of the loop when $f is not a directory:
if [ -d "$f" ]; then
cd "$f" && echo "Entering into $f"
else
continue
fi
There's an even better way. By writing /completepath/*/ only directory entries will be matched, so you can simplify your loop to this:
for f in /completepath/*/
do
cd "$f" && echo "Entering into $f" || { echo "Error: could not enter into $f"; continue; }
for y in $(ls *.fastq.gz | sed 's/_L00[1234]_R1_001.fastq.gz//g' | sort -u)
do
echo ${y}
done
done
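Going one step further, the loop can avoid both cd and parsing the output of ls by globbing the files directly and stripping the suffix with parameter expansion. The following is a sketch along those lines; list_samples is a hypothetical helper name, and it only prints the prefixes, as the loops above do.

```shell
#!/bin/bash
# list_samples is a hypothetical helper: prints the unique sample prefixes
# found in each subdirectory of $1, without cd and without parsing ls.
list_samples() {
    local dir file name
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue              # glob matched nothing
        echo "Entering into ${dir%/}"
        for file in "$dir"*.fastq.gz; do
            [ -e "$file" ] || continue         # directory has no matching files
            name=${file##*/}                   # drop the directory part
            # strip the lane/read suffix, e.g. _L003_R1_001.fastq.gz
            echo "${name%_L00[1234]_R1_001.fastq.gz}"
        done | sort -u
    done
}
```

Called as list_samples /completepath, it never changes the working directory, so a failed cd cannot leave the loop in the wrong place.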

How to split the file path to extract the various subfolders into variables? (Ubuntu Bash)

I need help with a bash script on Ubuntu Precise.
I have several tiff files in various folders
masterFOlder--masterSub1 --masterSub1-1 --file1.tif
|--masterSub1-2 --masterSub1-2-1 --file2.tif
|
|--masterSub2 --masterSub1-2 .....
I need to run an Imagemagick command and save them to new folder "converted" while retaining the sub folder tree i.e. the new tree will be
converted --masterSub1 --masterSub1-1 --file1.png
|--masterSub1-2 --masterSub1-2-1 --file2.png
|
|--masterSub2 --masterSub1-2 .....
How do I split the file path into folders, replace the first folder (masterFOlder with converted), and build the new file path?
Thanks to everyone reading this.
This script should work.
#!/bin/bash
shopt -s extglob && [[ $# -eq 2 && -n $1 && -n $2 ]] || exit
MASTERFOLDER=${1%%+(/)}/
CONVERTFOLDER=$2
OFFSET=${#MASTERFOLDER}
while read -r FILE; do
CPATH=${FILE:OFFSET}
CPATH=${CONVERTFOLDER}/${CPATH%.???}.png
CDIR=${CPATH%/*}
echo "Converting $FILE to $CPATH."
[[ -d $CDIR ]] || mkdir -p "$CDIR" && echo convert "$FILE" "$CPATH" || echo "Conversion failed."
done < <(exec find "${MASTERFOLDER}" -mindepth 1 -type f -iname '*.tif')
Just replace echo convert "$FILE" "$CPATH" with the actual command you use and run bash script.sh masterfolder convertedfolder
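The path handling on its own can also be written with prefix and suffix removal rather than a character offset. This is a small sketch of that step only; map_path is a hypothetical helper name, and it just prints the new path without converting anything.

```shell
#!/bin/bash
# map_path is a hypothetical helper: prints the output path for one input file.
#   $1 = source root, $2 = destination root, $3 = a file below the source root
map_path() {
    local src=${1%/} dst=${2%/} file=$3
    local rel=${file#"$src"/}              # strip the leading source root
    printf '%s\n' "$dst/${rel%.*}.png"     # swap the extension for .png
}

# Possible use (convert is the ImageMagick command from the script above):
#   out=$(map_path masterFOlder converted "$FILE")
#   mkdir -p "${out%/*}" && convert "$FILE" "$out"
```

For masterFOlder/masterSub1/masterSub1-1/file1.tif this prints converted/masterSub1/masterSub1-1/file1.png.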

BASH user drive selection

I am creating a simple script for Mac OS X to provide a user with a list of available drives to back up from, based on the contents of /Volumes, but I am running into an issue with handling the output of the 'find' command if the drive name contains a space. The find command outputs each drive on a separate line, but the 'for' loop breaks the name into parts. Example:
Script:
#!/bin/bash
find /Volumes -maxdepth 1 -type d
echo ""
i=1
for Output in $(find /Volumes -maxdepth 1 -type d)
do
DriveChoice[$i]=$Output
echo $i"="${DriveChoice[$i]}
i=$(( i+1 ))
done
Output:
/Volumes
/Volumes/backup
/Volumes/EZBACKUP DRIVE
/Volumes/Tech
1=/Volumes
2=/Volumes/backup
3=/Volumes/EZBACKUP
4=DRIVE
5=/Volumes/Tech
logout
[Process completed]
This seems like it should be fairly straight-forward. Is there a better way for me to accomplish this?
Update: Thank you chepner, that works perfectly. It is a simple script to generate a ditto command, but I will post it here anyway in case someone finds any part of it useful:
#!/bin/bash
#Get admin rights
sudo -l -U administrator bash
#Set the path to the backup drive
BackupPath="/Volumes/backup/"
#Generate a list of source drives, limiting out invalid options
i=1
while read -r Output; do
if [ "$Output" != "/Volumes" ] && [ "$Output" != "/Volumes/backup" ] && [ "$Output" != "/Volumes/Tech" ] ; then
DriveChoice[$i]=$Output
echo "$i=${DriveChoice[$i]}"
i=$(( i+1 ))
fi
done < <( find /Volumes -maxdepth 1 -type d)
#Have the user select from valid drives
echo "Source Drive Number?"
read DriveNumber
#Ensure the user input is in range
if [ $DriveNumber -lt $i ] && [ $DriveNumber -gt 0 ]; then
Source=${DriveChoice[$DriveNumber]}"/"
#Get the user's NetID for generating the folder structure
echo "User's NetID?"
read NetID
NetID=$NetID
#Grab today's date for generating folder structure
Today=$(date +"%m_%d_%Y")
#Destination for the Logfile
Destination=$BackupPath$NetID"_"$Today"/"
#Full path for the LogFile
LogFile=$Destination$NetID"_log.txt"
mkdir -p $Destination
touch $LogFile
#Destination for the backup
Destination=$Destination"ditto/"
#Execute the command
echo "Processing..."
sudo ditto "$Source" "$Destination" 2>&1 | tee "$LogFile"
else
#Fail if the drive selection was out of range
echo "Drive selection error!"
fi
You cannot safely iterate over the output of find using a for loop, because of the space issue you are seeing. Use a while loop with the read built-in instead:
#!/bin/bash
find /Volumes -maxdepth 1 -type d
echo ""
i=1
while read -r output; do
DriveChoice[$i]=$output
echo "$i=${DriveChoice[$i]}"
i=$(( i+1 ))
done < <( find /Volumes -maxdepth 1 -type d)
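Another option, if find is not needed for anything else, is to let the shell glob the directories, since glob results are not split on spaces. This is a sketch under that assumption; list_dirs is a hypothetical helper name, and it fills the same 1-based DriveChoice array as the script above.

```shell
#!/bin/bash
# list_dirs is a hypothetical helper: fills the global array DriveChoice
# (1-based, as in the script above) with the subdirectories of $1.
list_dirs() {
    local base=${1%/} d n=1
    DriveChoice=()
    for d in "$base"/*/; do
        [ -d "$d" ] || continue        # glob matched nothing
        DriveChoice[$n]=${d%/}         # a name with spaces stays one element
        n=$(( n+1 ))
    done
}

# Possible use:
#   list_dirs /Volumes
#   for k in "${!DriveChoice[@]}"; do echo "$k=${DriveChoice[$k]}"; done
```

Unlike find /Volumes -maxdepth 1 -type d, the glob does not return /Volumes itself, skips hidden directories, and also matches symlinks that point to directories.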

Resources