Sort files based on filename into folders and concat files within each folder based on folder name - bash

Any help would be VERY appreciated! I have hundreds of video files named in the following format (see below). The first 4 characters are random, but there are always 4 of them, and 3000 is always present.
Can someone please help me create folders based on the center of the filename (i.e. 000, 001, 002, 003 and so on)?
Then concatenate all the files in each of the folders using ffmpeg, in the order of their filenames (0000.ts, 0001.ts, 0002.ts and so on), into a file named 000merged.ts, 001merged.ts, 002merged.ts and so on.
This is close to what I need
find . -maxdepth 1 -type f -name "*jpg" -exec bash -c 'mkdir -p "${0%%_*}"' {} \; \
-exec bash -c 'mv "$0" "${0%%_*}"' {} \;

mkdir /tmp/test && cd $_ #or cd ~/Desktop
echo A > 1e98_3000_000_000_0000.ts #create some small test files
echo B > 1e98_3000_000_000_0001.ts
echo C > 1e98_3000_000_000_0002.ts
echo D > 1e98_3000_000_000_0003.ts
echo E > d82j_3000_001_000_0000.ts
echo F > d82j_3000_001_000_0001.ts
echo G > d82j_3000_001_000_0002.ts
echo H > d82j_3000_001_000_0003.ts
echo I > a03l_3000_002_000_0000.ts
echo J > a03l_3000_002_000_0001.ts
echo K > a03l_3000_002_000_0002.ts
echo L > a03l_3000_002_000_0003.ts
# mkdir and copy each *.ts into its dir plus rename file:
perl -E'/^...._3000_(...)_..._(....\.ts)$/&&qx(mkdir -p $1;cp -p $_ $1/$2)for@ARGV' *.ts
ls -rtl
find ??? -type f -ls
for dir in ???;do cat $dir/????.ts > $dir/${dir}merged.ts; done
ls -rtl */*merged.ts
Cleanup test:
rm -rf /tmp/test/??? #cleanup new dirs with files
rm -rf /tmp/test #cleanup all
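For comparison, here is a minimal pure-bash sketch of the same two steps (sort into folders by the third underscore-separated field, then merge in filename order). The scratch directory and the one-byte test files are stand-ins, not real video. It uses cat for the merge, like the answer above; whether a plain byte-level join is good enough for your .ts files is an assumption you should check, and ffmpeg could be substituted in the second loop if remuxing is needed.

```bash
#!/usr/bin/env bash
# Scratch directory with tiny stand-in files (hypothetical test data).
work=$(mktemp -d)
cd "$work" || exit 1
printf 'A' > 1e98_3000_000_000_0000.ts
printf 'B' > 1e98_3000_000_000_0001.ts
printf 'C' > d82j_3000_001_000_0000.ts
printf 'D' > d82j_3000_001_000_0001.ts

# Step 1: move each file into a folder named after its third field,
# keeping only the last field (e.g. 0000.ts) as the new file name.
for f in ????_3000_???_???_????.ts; do
    IFS=_ read -r _ _ group _ seq <<< "$f"   # group=000, seq=0000.ts
    mkdir -p -- "$group"
    mv -- "$f" "$group/$seq"
done

# Step 2: concatenate per folder; the glob expands in sorted order,
# so 0000.ts comes before 0001.ts and so on.
for dir in ???/; do
    dir=${dir%/}
    cat "$dir"/????.ts > "$dir/${dir}merged.ts"
done

ls -- */*merged.ts
```

Unlike the perl one-liner this moves rather than copies, so try it on a copy of the data first.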

Related

moving files to their respective folders using bash scripting

I have files in this format:
2022-03-5344-REQUEST.jpg
2022-03-5344-IMAGE.jpg
2022-03-5344-00imgtest.jpg
2022-03-5344-anotherone.JPG
2022-03-5343-kdijffj.JPG
2022-03-5343-zslkjfs.jpg
2022-03-5343-myimage-2010.jpg
2022-03-5343-anotherone.png
2022-03-5342-ebee5654.jpeg
2022-03-5342-dec.jpg
2022-03-5341-att.jpg
2022-03-5341-timephoto_december.jpeg
....
about 13k images like these.
I want to create folders like:
2022-03-5344/
2022-03-5343/
2022-03-5342/
2022-03-5341/
....
I started manually moving them like:
mkdir name
mv name-* name/
But of course I'm not gonna repeat this process for 13k files.
So I want to do this using bash scripting, and since I am new to bash, and I am working on a production environment, I want to play it safe, but it doesn't give me my results. This is what I did so far:
#!/bin/bash
name = $1
mkdir "$name"
mv "${name}-*" $name/
and all I can do is: ./move.sh name for every folder, I didn't know how to automate this using loops.
With bash and a regex. I assume that the files are all in the current directory.
for name in *; do
if [[ "$name" =~ (^....-..-....)- ]]; then
dir="${BASH_REMATCH[1]}"; # dir contains 2022-03-5344, e.g.
echo mkdir -p "$dir" || exit 1;
echo mv -v "$name" "$dir";
fi;
done
If output looks okay, remove both echo.
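To see what the regex captures before touching any files, a small sketch like the following may help. The function name target_dir is made up for illustration, and the pattern is a slightly stricter, digits-only variant of the one above.

```bash
#!/usr/bin/env bash
# target_dir (hypothetical helper): print the folder a file name would go to,
# or return non-zero if the name does not start with NNNN-NN-NNNN-.
target_dir() {
    local name=$1
    if [[ $name =~ ^([0-9]{4}-[0-9]{2}-[0-9]{4})- ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

for name in 2022-03-5343-myimage-2010.jpg README.txt; do
    if dir=$(target_dir "$name"); then
        echo "$name -> $dir/"
    else
        echo "$name -> skipped"
    fi
done
```

Names with extra hyphens after the prefix (like myimage-2010) still map to the first three fields only.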
Try this
xargs -i sh -c 'mkdir -p {}; mv {}-* {}' < <(ls *-*-*-*|awk -F- -vOFS=- '{print $1,$2,$3}'|uniq)
Or:
find . -maxdepth 1 -type f -name "*-*-*-*" | \
awk -F- -vOFS=- '{print $1,$2,$3}' | \
sort -u | \
xargs -i sh -c 'mkdir -p {}; mv {}-* {}'
Or find with regex:
find . -maxdepth 1 -type f -regextype posix-extended -regex ".*/[0-9]{4}-[0-9]{2}-[0-9]{4}.*"
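The find above only lists the matching files. A sketch of how its output could drive the actual move, assuming GNU find (for -regextype) and bash, is shown here on a scratch directory with a few made-up file names. Using -print0 with read -d '' keeps names with spaces intact.

```bash
#!/usr/bin/env bash
# Scratch directory with sample names (hypothetical test data).
work=$(mktemp -d)
cd "$work" || exit 1
touch 2022-03-5344-REQUEST.jpg 2022-03-5344-IMAGE.jpg \
      2022-03-5343-myimage-2010.jpg notes.txt

find . -maxdepth 1 -type f -regextype posix-extended \
     -regex '.*/[0-9]{4}-[0-9]{2}-[0-9]{4}-.*' -print0 |
while IFS= read -r -d '' path; do
    file=${path#./}          # strip the leading ./ that find adds
    dir=${file:0:12}         # first 12 characters, e.g. 2022-03-5344
    mkdir -p -- "$dir"
    mv -- "$file" "$dir/"
done

find . -type f | sort
```

Files that do not match the pattern (notes.txt here) are left where they are.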
You could use awk
$ cat awk.script
/^[[:digit:]-]/ && ! a[$1]++ {
dir=$1
} /^[[:digit:]-]/ {
system("sudo mkdir " dir )
system("sudo mv " $0" "dir"/"$0)
}
To call the script and use for your purposes;
$ awk -F"-([0-9]+)?[[:alpha:]]+.*" -f awk.script <(ls)
You will see some errors such as:
mkdir: cannot create directory ‘2022-03-5341’: File exists
once the directory has been created; you can safely ignore these, as the directory now exists.
The content of each directory will now have the relevant files
$ ls 2022-03-5344
2022-03-5344-00imgtest.jpg 2022-03-5344-IMAGE.jpg 2022-03-5344-REQUEST.jpg 2022-03-5344-anotherone.JPG

Move all files in a folder to a new location if the same existing Folder name exists at remote location

Looking for a bash script:
Here's the situation:
I have 1000's folders and subfolders on my Backup Directory Drive
lets say.....
/backup
/backup/folderA
/backup/folderA/FolderAA
/backup/folderB
/backup/folderB/FolderBB
I have Dozens of similar folders in a secondary location (with files in them) and the Folder names will match one of the folders or subfolders in the main backup drive.
I would like to move all contents of specific extension types from my secondary location $FolderName to the Backup location + matching subfolder ONLY if the $FolderName matches exactly and remove the folders from my secondary location!
If there is no corresponding folder or subfolder in the backup location, then leave the source folders and files alone.
looking forward to getting some help/guidance.
Mike
Additional info requested. Expected input and output:
Lets say i have the following:
Backup Folder
/backup/test/file.bak
And for my secondary folder location:
/secondarylocation/mike/test/hello/john.bak
/secondarylocation/mike/test/hello/backup.zip
i would like this as the end result
/backup/test/file.bak
/backup/test/john.bak
/backup/test/backup.zip
and /secondarylocation/mike/test *and sub folders and files removed
run this script with quoted folders and file types:
./merge.sh "backup" "secondarylocation/mike" "*.zip" "*.bak"
replace -iname with -name if you want to search for suffix case sensitive
replace mv -fv with mv -nv when you don't want to overwrite duplicate file names
add -mindepth 1 to last find if you want to keep empty folder test
merge.sh
#!/bin/bash
# read folders from positional parameters
[ -d "$1" ] && targetf="$1" && shift
[ -d "$1" ] && sourcef="$1" && shift
if [ -z "$targetf" ] || [ -z "$sourcef" ]
then
echo -e "usage: ./merge.sh <targetfolder> <sourcefolder> [PATTERN]..."
exit 1
fi
# add prefix -iname for each pattern
while [ ${pattern:-1} -le $# ]
do
set -- "$@" "-iname \"$1\""
shift
pattern=$((${pattern:-1}+1))
done
# concatenate all prefix+patterns with -o and wrap in parentheses ()
if (( $# > 1 ))
then
pattern="\( $1"
while (( $# > 1 ))
do
pattern="$pattern -o $2"
shift
done
pattern="$pattern \)"
else
pattern="$1"
fi
# move files from searchf to destf
find "$targetf" -mindepth 1 -type d -print0 | sort -z | while IFS=$'\0' read -r -d $'\0' destf
do
find "$sourcef" -mindepth 1 -type d -name "${destf##*/}" -print0 | sort -z | while IFS=$'\0' read -r -d $'\0' searchf
do
if (( $# ))
then
# search with pattern
eval find "\"$searchf\"" -depth -type f "$pattern" -exec mv -fv {} "\"$destf\"" \\\;
else
# all files
find "$searchf" -depth -type f -exec mv -fv {} "$destf" \;
fi
# delete empty folders
find "$searchf" -depth -type d -exec rmdir --ignore-fail-on-non-empty {} +
done
done
exit 0
this will merge hello into test (earn the fruits and cut the tree)
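As an aside, the pattern-building step can also be done without eval by collecting the find arguments in a bash array. The following is only a sketch of that one step, not a replacement for merge.sh: the scratch paths, the patterns list and the fixed source/target pair are assumptions for illustration, and it uses mv -nv so nothing is overwritten.

```bash
#!/usr/bin/env bash
# Scratch layout mirroring the example in the question (hypothetical paths).
work=$(mktemp -d)
mkdir -p "$work/backup/test" "$work/secondary/mike/test/hello"
touch "$work/backup/test/file.bak" \
      "$work/secondary/mike/test/hello/john.bak" \
      "$work/secondary/mike/test/hello/backup.zip" \
      "$work/secondary/mike/test/hello/notes.txt"

patterns=('*.zip' '*.bak')

# Build "-iname P1 -o -iname P2 ..." as separate array words.
expr=()
for p in "${patterns[@]}"; do
    if (( ${#expr[@]} )); then
        expr+=(-o)
    fi
    expr+=(-iname "$p")
done

# The quoted parentheses group the alternatives for find.
find "$work/secondary/mike/test" -type f '(' "${expr[@]}" ')' \
     -exec mv -nv {} "$work/backup/test/" ';'

ls "$work/backup/test"
```

Because each pattern stays a single array element, patterns containing spaces or quotes need no extra escaping.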

Copying files with specific name to respective directories

I have a source directory in UNIX containing the files below
20180401abc.txt,20180402acb.txt,20180402def.txt
and in target having directories like 20180401,20180402
How can i move 20180401abc.txt to 20180401 & 20180402acb.txt,20180402def.txt to 20180402 directories respectively.
using below code ,
ls /home/source/ > filelist.txt
for line in `cat filelist.txt`
do
dir_path=`echo $line|cut -c1-8`
mkdir -p "/home/target/${dir_path}"
find /home/source/ -type f -exec cp {} /home/target/${dir_path} \;
done
#rm filelist.txt
Just use below script, it will solve your issue:-
ls /home/source/ > filelist.txt
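while read -r filename
do
# first 8 characters of the file name, e.g. 20180401
dir_name=$(echo "$filename" | cut -c1-8)
dir_path="/home/target/$dir_name"
mkdir -p "$dir_path"
# a directory needs the execute bit to be entered
chmod 755 "$dir_path"
mv "/home/source/$filename" "$dir_path"
done < filelist.txt
rm -f filelist.txt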
$ touch 20180401abc.txt 20180402acb.txt 20180402def.txt
$ for F in *.txt; do
DIR=$(echo $F | sed -r 's/[a-z]{3}.txt$//');
mkdir -p $DIR;
mv $F $DIR/$F;
done
$ find
> ./20180402
> ./20180402/20180402acb.txt
> ./20180402/20180402def.txt
> ./20180401
> ./20180401/20180401abc.txt
Code listed below is working:
for filename in `ls /home/source/`
do
dir_name=$(echo $filename | cut -c1-8)
dir_path="/home/target/${dir_name}"
mkdir -p $dir_path
cp /home/source/$filename $dir_path
done
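A variant that avoids parsing ls output is sketched below, using a glob and bash substring expansion instead of cut. The two temporary directories stand in for /home/source and /home/target, and it moves the files (swap mv for cp if you want copies).

```bash
#!/usr/bin/env bash
src=$(mktemp -d)    # stand-in for /home/source
dst=$(mktemp -d)    # stand-in for /home/target
touch "$src/20180401abc.txt" "$src/20180402acb.txt" "$src/20180402def.txt"

for path in "$src"/*.txt; do
    name=${path##*/}        # file name without the directory part
    day=${name:0:8}         # first 8 characters, e.g. 20180401
    mkdir -p -- "$dst/$day"
    mv -- "$path" "$dst/$day/"
done

find "$dst" -type f | sort
```

The glob handles file names with spaces, which the backtick loops above would split.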

Merge two directories keeping larger files

Consider for example
mkdir dir1
mkdir dir2
cd dir1
echo "This file contains something" > a
touch b
echo "This file contains something" > c
echo "This file contains something" > d
touch e
cd ../dir2
touch a
echo "This file contains something" > b
echo "This file contains something" > c
echo "This file contains more data than the other file that has the same name but is in the other directory. BlaBlaBlaBlaBlaBlaBlaBlaBla BlaBlaBlaBlaBlaBlaBlaBlaBla BlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBlaBla. bla!" > d
I would like to merge dir1 and dir2. If two files have the same name, then only the one which size is the largest must be kept. Here is the expected content of the merged directory
a # Comes from `dir1`
b # Comes from `dir2`
c # Comes from either `dir1` or `dir2`
d # Comes from `dir2`
e # Comes from `dir1`(is empty)
Assuming that no file name contains a newline:
find . -type f -printf '%s %p\n' \
| sort -nr \
| while read -r size file; do
if ! [ -e "dest/${file#./*/}" ]; then
cp "$file" "dest/${file#./*/}";
fi;
done
The output of find is a list of "filesize path":
221 ./dir1/a
1002 ./dir1/b
11 ./dir2/a
Then we sort the list numerically, largest first:
1002 ./dir1/b
221 ./dir1/a
11 ./dir2/a
And finally we reach the while read -r size file loop, where each file is copied to the destination dest/${file#./*/} unless a file already exists there. Because the larger files come first, the largest copy of each name wins.
${file#./*/} expands to the value of the parameter file with the leading directory removed:
./abc/def/foo/bar.txt -> def/foo/bar.txt, which means you might need to create the directory def/foo in the dest directory:
| while read -r size file; do
dest=dest/${file#./*/}
destdir=${dest%/*}
[ -e "$dest" ] && continue
[ -e "$destdir" ] || mkdir -p -- "$destdir"
cp -- "$file" "$dest"
done
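The two parameter expansions used in that loop can be checked in isolation. This is just a sketch on the sample path from the explanation above; the variable rel is introduced here for readability.

```bash
#!/usr/bin/env bash
file=./abc/def/foo/bar.txt

rel=${file#./*/}      # shortest leading match of ./*/ removed
dest=dest/$rel
destdir=${dest%/*}    # shortest trailing match of /* removed

printf '%s\n' "$rel" "$dest" "$destdir"
# prints:
# def/foo/bar.txt
# dest/def/foo/bar.txt
# dest/def/foo
```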
I cannot comment on the other answer due to not enough reputation, but I was getting a syntax error due to missing fi. I also got an error where the target directory needed to be created before copying. So:
find . -type f -printf '%s %p\n' | sort -nr | while read -r size file; do if ! [ -e "dest/${file#./*/}" ]; then mkdir -p "$(dirname "dest/${file#./*/}")" && cp "$file" "dest/${file#./*/}"; fi; done
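Here is a self-contained sketch you can run to convince yourself the approach works, assuming GNU find (for -printf). The file contents are made up, and find is pointed at dir1 and dir2 explicitly rather than at ".", so that a dest directory living next to them is not scanned as well; the prefix is therefore stripped with ${file#*/} instead of ${file#./*/}.

```bash
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1 dir2 dest

printf 'something\n' > dir1/a;  : > dir2/a                     # dir1 is larger
: > dir1/b;  printf 'something\n' > dir2/b                     # dir2 is larger
printf 'short\n' > dir1/d;  printf 'much longer content\n' > dir2/d
: > dir1/e                                                     # only in dir1

find dir1 dir2 -type f -printf '%s %p\n' | sort -nr |
while read -r size file; do
    dest=dest/${file#*/}            # drop the leading dir1/ or dir2/
    [ -e "$dest" ] && continue      # a larger (or equal) copy is already there
    mkdir -p -- "${dest%/*}"
    cp -- "$file" "$dest"
done

ls -l dest
```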

Destroy Hierarchy of directory?

I have a folder with many folders in it and many folders below that and so on and so forth. In those final folders are small clusters of files. I am attempting to move those files to the main folder and delete the now empty folder hierarchy. This is what I have so far.
#!/bin/bash
NAME=`whoami`
DEST="/Users/"$NAME"/Desktop/Music 2"
FILES=`find "$DEST" -type f`
for F in "$FILES"
do
mv "${F}" "${DEST}"
done
If I replace the mv command with "echo" it will catch all the right names but when I run this it gives me an error saying that the name is too long. Help will be greatly appreciated.
So say I have
/foo/bar/in/side/test1.txt
/foo/bar/in/down/test2.doc
/foo/bar/last/dog/test3.mp3
I want test1.txt, test2.doc, and test3.mp3 to be in /foo, and for each of the (now empty) directories /foo/bar, /foo/bar/in, /foo/bar/in/side, /foo/bar/in/down, /foo/bar/last, and /foo/bar/last/dog to be deleted.
End result:
/foo/test1.txt
/foo/test2.doc
/foo/test3.mp3
Try doing this :
find "$DEST" -type f -exec bash -c '
mv "$1" "$2"; rmdir "${1%/*}" &>/dev/null
' -- {} "$DEST" \;
Especially when the path names have spaces in them, using FILES=$(find ...) really doesn't work. You got the file name too long message because "$FILES" in the for loop treats all the names as a single file name; ${F} contains everything, and the mv command is trying to move a single file to ${DEST}.
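The difference is easy to demonstrate. The sketch below counts loop iterations for the quoted "$FILES" form and for a null-delimited read loop; the directory and file names are made up, and the second loop needs bash for process substitution and read -d ''.

```bash
#!/usr/bin/env bash
work=$(mktemp -d)
mkdir -p "$work/Music 2/artist one/album"
touch "$work/Music 2/artist one/album/track 1.mp3" \
      "$work/Music 2/artist one/album/track 2.mp3"

# Quoted: the whole find output is one word, so the loop body runs once
# with both path names glued together by a newline.
FILES=$(find "$work/Music 2" -type f)
count=0
for F in "$FILES"; do
    count=$((count + 1))
done
echo "for F in \"\$FILES\": $count iteration(s)"

# Null-delimited: one iteration per file, spaces preserved.
n=0
while IFS= read -r -d '' F; do
    n=$((n + 1))
done < <(find "$work/Music 2" -type f -print0)
echo "read -d '': $n iteration(s)"
```

Dropping the quotes around $FILES does not help either, because the names would then be split at every space.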
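The rmdir -p command removes an empty directory and then each parent directory in its path that has become empty as a result (think of -p for 'prune'; it stands for 'parents'). Combined with find -depth, the directories are processed depth first.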
GNU mv has a very useful option -t target for use in this context:
DEST=/foo
find /foo/bar -type f -exec mv -t "${DEST}" {} +
find /foo/bar -depth -type d -exec rmdir -p {} +
Given that you're on Mac OS X, you don't have quite that convenience, so your best bet is a slower (but equally effective) variant:
DEST=/foo
find /foo/bar -type f -exec mv {} "${DEST}" ';'
find /foo/bar -depth -type d -exec rmdir -p {} +
This executes the mv command once per file (whereas with GNU mv, many files may be moved with a single invocation). Otherwise, it is equivalent.
Both sets of commands avoid the issues with spaces in file names.
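To see the difference between the two -exec terminators without moving anything, you can substitute echo for mv and count the invocations. This is a small sketch on three made-up files; each echo prints one line per invocation.

```bash
#!/usr/bin/env bash
work=$(mktemp -d)
touch "$work/a" "$work/b" "$work/c"

# ';' runs the command once per file, '+' batches many files per run.
per_file=$(find "$work" -type f -exec echo run {} ';' | wc -l)
batched=$(find "$work" -type f -exec echo run {} + | wc -l)

echo "with ';': $per_file invocation(s)"
echo "with '+': $batched invocation(s)"
```

With only a handful of files the batched form runs the command a single time; for very long file lists find splits the batch into several runs as needed.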
Note that if $DEST is the same as the directory you're searching in, you'll run into problems moving the files already in $DEST over themselves. As written, the code does not avoid that problem. If necessary, you can avoid that with:
find "$DEST"/*/ -type f ...
The trailing slash enforces 'only directories' (think of it as equivalent to "$DEST"/*/.).
Proof of concept script
Remember: always test destructive scripts (scripts that delete stuff) on copies of the live material, unless you've got good backups on hand. Actually, test them on copies anyway; it is almost invariably quicker to make a copy than to recover from a backup (but you should have a backup anyway if the data is crucial).
echo "Before"
du -a .
filelist="./foo/bar/in/side/test1.txt
./foo/bar/in/down/test2.doc
./foo/bar/last/dog/test3.mp3"
for file in $filelist
do
mkdir -p $(dirname $file)
cp script $file
done
FIFO=./foo/bar/first/installment
mkdir $(dirname $FIFO)
mkfifo $FIFO
echo "Created"
du -a .
echo "Clean up"
DEST=./foo
find ./foo/bar -type f -exec mv {} "${DEST}" ';'
find ./foo/bar -depth -type d -exec rmdir -p {} + 2>/dev/null
echo "After"
du -a .
rm -fr ./foo
The rmdir -p process is noisy. It reports on directories it can't remove. The GNU version of rmdir provides an option to suppress some errors (--ignore-fail-on-non-empty), but in the context, you end up with some errors about non-existent directories, too (they were removed by the pruning process before the entry that listed the directory on its own). So, after putting up with the noise for a bit, I redirected all errors from rmdir to /dev/null. Remove that redirection until you're satisfied things work as intended.
This script should be run in an empty directory you've just created and made your current directory:
mkdir junk
cd junk
cp ../script .
sh -x ./script
Sample output:
$ sh -x script
+ echo Before
Before
+ du -a .
4 ./script
8 .
+ filelist='./foo/bar/in/side/test1.txt
./foo/bar/in/down/test2.doc
./foo/bar/last/dog/test3.mp3'
+ for file in '$filelist'
++ dirname ./foo/bar/in/side/test1.txt
+ mkdir -p ./foo/bar/in/side
+ cp script ./foo/bar/in/side/test1.txt
+ for file in '$filelist'
++ dirname ./foo/bar/in/down/test2.doc
+ mkdir -p ./foo/bar/in/down
+ cp script ./foo/bar/in/down/test2.doc
+ for file in '$filelist'
++ dirname ./foo/bar/last/dog/test3.mp3
+ mkdir -p ./foo/bar/last/dog
+ cp script ./foo/bar/last/dog/test3.mp3
+ FIFO=./foo/bar/first/installment
++ dirname ./foo/bar/first/installment
+ mkdir ./foo/bar/first
+ mkfifo ./foo/bar/first/installment
+ echo Created
Created
+ du -a .
4 ./foo/bar/in/side/test1.txt
8 ./foo/bar/in/side
4 ./foo/bar/in/down/test2.doc
8 ./foo/bar/in/down
20 ./foo/bar/in
4 ./foo/bar/last/dog/test3.mp3
8 ./foo/bar/last/dog
12 ./foo/bar/last
0 ./foo/bar/first/installment
4 ./foo/bar/first
40 ./foo/bar
44 ./foo
4 ./script
52 .
+ echo 'Clean up'
Clean up
+ DEST=./foo
+ find ./foo/bar -type f -exec mv '{}' ./foo ';'
+ find ./foo/bar -depth -type d -exec rmdir -p '{}' +
+ echo After
After
+ du -a .
0 ./foo/bar/first/installment
4 ./foo/bar/first
8 ./foo/bar
4 ./foo/test1.txt
4 ./foo/test3.mp3
4 ./foo/test2.doc
24 ./foo
4 ./script
32 .
+ rm -fr ./foo
Note that this script carefully creates a non-file (a FIFO) in a separate directory under ./foo/bar and shows that it is left behind. Comment out the mkfifo line that creates the FIFO and the run looks like:
$ sh -x script
+ echo Before
Before
+ du -a .
4 ./script
8 .
+ filelist='./foo/bar/in/side/test1.txt
./foo/bar/in/down/test2.doc
./foo/bar/last/dog/test3.mp3'
+ for file in '$filelist'
++ dirname ./foo/bar/in/side/test1.txt
+ mkdir -p ./foo/bar/in/side
+ cp script ./foo/bar/in/side/test1.txt
+ for file in '$filelist'
++ dirname ./foo/bar/in/down/test2.doc
+ mkdir -p ./foo/bar/in/down
+ cp script ./foo/bar/in/down/test2.doc
+ for file in '$filelist'
++ dirname ./foo/bar/last/dog/test3.mp3
+ mkdir -p ./foo/bar/last/dog
+ cp script ./foo/bar/last/dog/test3.mp3
+ FIFO=./foo/bar/first/installment
++ dirname ./foo/bar/first/installment
+ mkdir ./foo/bar/first
+ echo Created
Created
+ du -a .
4 ./foo/bar/in/side/test1.txt
8 ./foo/bar/in/side
4 ./foo/bar/in/down/test2.doc
8 ./foo/bar/in/down
20 ./foo/bar/in
4 ./foo/bar/last/dog/test3.mp3
8 ./foo/bar/last/dog
12 ./foo/bar/last
4 ./foo/bar/first
40 ./foo/bar
44 ./foo
4 ./script
52 .
+ echo 'Clean up'
Clean up
+ DEST=./foo
+ find ./foo/bar -type f -exec mv '{}' ./foo ';'
+ find ./foo/bar -depth -type d -exec rmdir -p '{}' +
+ echo After
After
+ du -a .
4 ./foo/test1.txt
4 ./foo/test3.mp3
4 ./foo/test2.doc
16 ./foo
4 ./script
24 .
+ rm -fr ./foo
$
This strongly suggests that if written properly and handled carefully, the code above does work correctly without damaging the system. But you should still be cautious before using any variant of this in production (and even in test).
Tests run on an Ubuntu 12.04 derivative.

Resources