How to write every Nth file to new folder - bash

I have this code which scans folders and moves all files in each folder to a new one.
How do I make it so only every Nth file is moved?
#!/bin/bash
# Save this file in the directory containing the folders (bb in this case)
# Then to run it, type:
# ./rencp.sh
# The first output frame number
let "frame=1"
# this is where files will go. A new directory will be created if it doesn't exist
outFolder="collected"
# print info every so many files.
feedbackFreq=250
# prefix for new files
namePrefix="ben_timelapse"
#new extension (uppercase is so ugly)
ext="jpg"
# this will make sure we only get files from camera directories
srcPattern="ND850"
mkdir -p $outFolder
for f in *${srcPattern}/*
do
mv $f `printf "$outFolder/$namePrefix.%05d.$ext" $frame`
if ! ((frame % $feedbackFreq)); then
echo "moved and renamed $frame files to $outFolder"
fi
let "frame++"
done
Pretty sure I need to edit the line for f in *${srcPattern}/* but not sure of the correct syntax

If files in the ND850 folders are sequential when listed (i.e. padded frame numbers), and the folders themselves are in order, then the following code should work.
#!/bin/bash
# Maintain a counter, and the output frame number
let "frame=1"
let "outframe=1"
outFolder="collected"
# frequency
gap=5
namePrefix="ben_timelapse"
#new extension (uppercase is so ugly)
ext="jpg"
srcPattern="ND850"
echo "Copying and renaming 1 in every $gap files"
mkdir -p "$outFolder"
for f in *${srcPattern}/*
do
if ! ((frame % $gap)); then
outfile=`printf "$outFolder/$namePrefix.%05d.$ext" $outframe`
cp "$f" "$outfile"
echo "copied $f to $outfile"
let "outframe++"
fi
let "frame++"
done

Try this instead of your mv command after do:
if ! ((frame % 5)); then
a=$((frame / 5));
mv "$f" "$(printf "$outFolder/$namePrefix.%05d.$ext" "$a")"
fi
It will move frames 5, 10, and so on to $outFolder/$namePrefix.00001.$ext, $outFolder/$namePrefix.00002.$ext, and so on.
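As an alternative to counting with a modulo test, the file list can be loaded into an array and indexed in steps of N. The following is only a sketch: move_every_nth is a hypothetical helper name, and it assumes the glob lists the files in the order you want (padded frame numbers, folders in order).

```bash
#!/bin/bash
# Sketch: move every Nth file, renumbering the output frames from 1.
# move_every_nth is a hypothetical helper name, not part of the original script.
move_every_nth() {
    local gap=$1 outFolder=$2 namePrefix=$3 ext=$4
    shift 4
    local -a files=("$@")   # remaining arguments are the source files, in order
    local i outframe=1 outfile
    mkdir -p "$outFolder"
    # Start at index gap-1 so that files N, 2N, 3N, ... (counting from 1) are picked.
    for ((i = gap - 1; i < ${#files[@]}; i += gap)); do
        printf -v outfile '%s/%s.%05d.%s' "$outFolder" "$namePrefix" "$outframe" "$ext"
        mv -- "${files[i]}" "$outfile"
        outframe=$((outframe + 1))
    done
}
```

With the variables from the question, a call might look like: move_every_nth 5 collected ben_timelapse jpg *ND850/*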

Related

Bash/sh: Move folder + subfolder(s) recursively, renaming files if they exist

I'm trying to create a bash script that moves all files recursively from a source folder to a target folder and renames a file if it already exists at the target. This is similar to what MS Windows does: when a file exists, it auto-renames it to "<filename> (X).<ext>", except that I want this for ALL files.
I've created the script below, which works fine for almost all scenarios except when a folder has a period (.) in its name and a file within that folder has no extension (no period in its name).
eg a folder-path-file such as: "./oldfolder/this.folder/filenamewithoutextension"
I get (incorrectly):
"./newfolder/this (1).folder/filenamewithoutextension"
if "./newfolder/this.folder/filenamewithoutextension" already exists in the target location (./newfolder),
instead of the correct new name: "./newfolder/this.folder/filenamewithoutextension (1)"
#!/bin/bash
source=$1 ; target=$2 ;
if [ "$source" != "" ] && [ "$target" != "" ] ; then
#recursive file search
find "$source" -type f -exec bash -c '
#setup variables
oldfile="$1" ; osource='"${source}"' ; otarget='"${target}"' ;
#set new target filename with target path
newfile="${oldfile/${osource}/${otarget}}" ;
#check if file already exists at target
[ -f "${newfile}" ] && {
#get the filename and fileextension for numbering - ISSUE HERE?
filename="${newfile%/}" ; newfileext="${newfile##*.}" ;
#compare filename and file extension for missing extension
if [ "$filename" == "$newfileext" ] ; then
#filename has no ext - perhaps fix the folder with a period issue here?
newfileext="" ;
else
newfileext=".$newfileext" ;
fi
#existing files counter
cnt=1 ; while [ -f "${newfile%.*} (${cnt})${newfileext}" ] ; do ((cnt+=1)); done
#set new filename with counter - New Name created here *** Needs re-work, as folder with a period = fail
newfile="${newfile%.*} (${cnt})${newfileext}";
}
#show mv command
echo "mv \"$oldfile\" \"${newfile}\""
' _ {} \;
else
echo "Requires source and target folders";
fi
I suspect the issue is, how to properly identify the filename and extension, found in this line:
filename="${newfile%/}" ; newfileext="${newfile##*.}" which doesn't identify a filename properly (files are always after the last /).
Any suggestion on how to make it work properly?
UPDATE: Some completion notes. The issues were fixed by:
Initially splitting each full path into: path - filename - (optional ext)
Reconstructing the full path as: path - filename - counter - (optional ext)
Making sure the directory structure exists with mkdir -p before the move (mv does not create folders that are missing in the target location).
Maybe you could try this instead?
filename="${newfile##*/}" ; newfileext="${filename#*.}"
The first pattern means: remove the longest prefix (in a greedy way) up to the last /.
The second one: remove the prefix up to the first dot (the greedy mode seems unnecessary here) − and as you already noted, in case the filename contains no dot, you will get newfileext == filename…
Example session:
newfile='./oldfolder/this.folder/filenamewithoutextension'
filename="${newfile##*/}"; newfileext="${filename#*.}"
printf "%s\n" "$filename"
#→ filenamewithoutextension
printf "%s\n" "$newfileext"
#→ filenamewithoutextension
newfile='./oldfolder/this.folder/file.tar.gz'
filename="${newfile##*/}"; newfileext="${filename#*.}"
printf "%s\n" "$filename"
#→ file.tar.gz
printf "%s\n" "$newfileext"
#→ tar.gz
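Putting the split and the counter together, a collision-free target name could be built along these lines. This is a sketch under assumptions, not the asker's final script: unique_name is a hypothetical helper, and unlike the session above it treats only the part after the last dot as the extension (so file.tar.gz becomes "file.tar (1).gz") and treats a leading dot, as in .bashrc, as part of the name.

```bash
#!/bin/bash
# Hypothetical helper: print a free target name of the form "dir/name (N).ext".
unique_name() {
    local path=$1
    local dir base name ext cnt=1
    # Split off the directory first, so dots in folder names are never touched.
    if [[ $path == */* ]]; then dir=${path%/*}/; else dir=; fi
    base=${path##*/}
    if [[ $base == ?*.* ]]; then    # a dot that is not the first character
        name=${base%.*}
        ext=.${base##*.}
    else
        name=$base
        ext=
    fi
    # Nothing at the target yet: the original name is free.
    [[ -e $path ]] || { printf '%s\n' "$path"; return; }
    while [[ -e "${dir}${name} (${cnt})${ext}" ]]; do
        cnt=$((cnt + 1))
    done
    printf '%s\n' "${dir}${name} (${cnt})${ext}"
}
```

Inside the find loop, the result could be used as newfile=$(unique_name "$newfile") before the mv.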

Renaming files with a specific scheme

I have a FTP folder receiving files from a remote camera. The camera stores the video file name always as ./rec_YYYY-MM-DD_HH-MM.mkv. The video files are stored all in the same folder, the root folder from the FTP server.
I need to move these files to another folder, with this new scheme:
Remove rec_ from the file name.
Change date format to DD-MM-YYYY.
Remove date from the file name and make it a folder instead, where that same file and all the others in the same date will be stored in.
Final file path would be: ./DD-MM-YYYY/HH-MM.mkv.
The process would continue to all the files, putting them in the folder corresponding to the day it was created.
Summing up: ./rec_YYYY-MM-DD_HH-MM.mkv >> ./DD-MM-YYYY/HH-MM.mkv. The same should apply to all files that are in the same folder.
As I can't make it happen directly from the camera, this needs to be done with Bash on the server that is receiving the files.
So far, what I have is a script that gets the file's creation date and uses it to make a folder, then uses the creation time to move the file under its new name:
for f in *.mp4
do
mkdir "$f" "$(date -r "$f" +"%d-%m-%Y")"
mv -n "$f" "$(date -r "$f" +"%d-%m-%Y/%H-%M-%S").mp4"
done
I'm getting this output (with testfile 1.mp4):
It creates the folder based on the file's creation date;
it renames the file to its creation time;
Then, it returns mkdir: cannot create directory ‘1.mp4’: File exists
If two or more files, only one gets renamed and moved as described. The others stay the same and terminal returns:
mkdir: cannot create directory ‘1.mp4’: File exists
mkdir: cannot create directory ‘2.mp4’: File exists
mkdir: cannot create directory ‘12-12-2018’: File exists
Could someone help me out? Better suggestions? Thanks!
Honestly I would just use Perl or Python for this. Here's how to embed either in a shell script.
Here's a perl script that doesn't use any libraries, even ones that ship with Perl (so it'll work without extra packages on distributions like CentOS that don't ship with the entire Perl library). The perl script launches one new process per file in order to perform the copy.
perl -e '
while (<"*.m{p4,kv}">) {
    my $path = $_;
    # "-" goes first in the class so it is a literal, not a range
    my ($prefix, $year, $month, $day, $hour, $minute, $ext) =
        split /[-._]/, $path;
    my $sec = q[00];
    die "unexpected prefix ($prefix) in $path"
        unless $prefix eq q[rec];
    die "unexpected extension ($ext) in $path"
        unless $ext eq q[mp4] or $ext eq q[mkv];
    my $dir = "$day-$month-$year";
    my $name = "$hour-$minute-$sec" . q[.] . $ext;
    my $destpath = $dir . q[/] . $name;
    die "$destpath is unexpectedly a directory" if (-d $destpath);
    # create the per-day directory on first use
    mkdir $dir unless -d $dir;
    system("cp", "--", $path, $destpath);
}
'
Here's a Python example. It uses only the standard library and needs Python 3, since it relies on FileExistsError. The Python script does not spawn any additional processes.
python3 -c '
import os.path as path
import re
from glob import iglob
from itertools import chain
from os import mkdir
from shutil import copyfile
for p in chain(iglob("*.mp4"), iglob("*.mkv")):
    fields = re.split("[-]|[._]", p)
    prefix = fields[0]
    year = fields[1]
    month = fields[2]
    day = fields[3]
    hour = fields[4]
    minute = fields[5]
    ext = fields[6]
    sec = "00"
    assert prefix == "rec"
    assert ext in ["mp4", "mkv"]
    directory = "".join([day, "-", month, "-", year])
    name = "".join([hour, "-", minute, "-", sec, ".", ext])
    destpath = "".join([directory, "/", name])
    assert not path.isdir(destpath)
    try:
        mkdir(directory)
    except FileExistsError:
        pass
    copyfile(src=p, dst=destpath)
'
Finally, here's a bash solution. It splits paths using -, ., and _ and then extracts various subfields by indexing into $@ inside a function. The indexing trick is portable, although regex substitution on variables is a bash extension.
#!/bin/bash
# $1 $2 $3 $4 $5 $6 $7 $8
# path rec YY MM DD HH MM ext
process_file() {
mkdir "$5-$4-$3" &> /dev/null
cp -- "$1" "$5-$4-$3"/"$6-$7-00.$8"
}
for path in *.m{p4,kv}; do
[ -e "$path" ] || continue
# NOTE: two slashes are needed in the substitution to replace everything
# read -a ARRAYVAR <<< ... reads the words of a string into an array
IFS=' ' read -a f <<< "${path//[-_.]/ }"
process_file "$path" "${f[@]}"
done
If you cd /to/some/directory/containing_your_files then you could use the following script
#!/usr/bin/env bash
for f in rec_????-??-??_??-??.m{p4,kv} ; do
dir=${f:4:10} # skip 4 chars ('rec_') take 10 chars ('YYYY-MM-DD')
fnm=${f:15} # skip 15 chars, take the remainder
test -d "$dir" || mkdir "$dir"
mv "$f" "$dir"/"$fnm"
done
Note ① that I have not swapped the years and the days; if you absolutely need the swap, you can extract the year like this: year=${dir::4}, and so on. Note ② that this substring form of parameter expansion is a Bash-ism; e.g., it doesn't work in dash.
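If the day/year swap is needed, the same substring expansion can pull out each date field separately. A minimal sketch, assuming every file name matches rec_YYYY-MM-DD_HH-MM.ext exactly; target_path is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Sketch: rec_YYYY-MM-DD_HH-MM.ext -> DD-MM-YYYY/HH-MM.ext
# target_path is a hypothetical helper name.
target_path() {
    local f=$1
    local year=${f:4:4} month=${f:9:2} day=${f:12:2}
    local rest=${f:15}          # HH-MM.ext
    printf '%s-%s-%s/%s\n' "$day" "$month" "$year" "$rest"
}

for f in rec_????-??-??_??-??.m{p4,kv}; do
    [ -e "$f" ] || continue     # skip the literal pattern when nothing matches
    dest=$(target_path "$f")
    mkdir -p "${dest%/*}"       # create the per-day folder if needed
    mv -n -- "$f" "$dest"       # -n: never overwrite an existing file
done
```

The offsets are fixed, so a file whose name deviates from the pattern would produce a wrong folder name; the glob in the loop is what guards against that.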
Your problem is that mkdir creates directories, and you are passing it the file name as one of the directories to create.
Because a file with that name already exists, mkdir fails with "File exists".
If you want to use the file name for the folder, use it without the extension.

A bash script to split a data file into many sub-files as per an index file using dd

I have a large data file that contains many files joined together.
A separate index file holds the file name and the start and end byte of each file within the data file.
I need help creating a bash script to split the large file into its thousands of sub-files.
Data File : fileafilebfilec etc
Index File:
filename.png<0>3049
folder\filename2.png<3049>6136.
I guess this needs to loop through each line of the index file, then use dd to extract the relevant bytes into a file. One fiddly part might be that the folder separator is Windows style (backslash) rather than Linux style.
Any help much appreciated.
while read p; do
q=${p#*<}
startbyte=${q%>*}
endbyte=${q#*>}
filename=${p%<*}
count=$(($endbyte - $startbyte))
toprint="processing $filename startbyte: $startbyte endbyte: $endbyte count: $c$
echo $toprint
done <indexfile
Worked it out :-) FYI:
while read p; do
#sort out variables
q=${p#*<}
startbyte=${q%>*}
endbyte=${q#*>}
filename=${p%<*}
count=$(($endbyte - $startbyte))
#let it know we're working
toprint="processing $filename startbyte: $startbyte endbyte: $endbyte count: $c$
echo $toprint
if [[ $filename == *"/"* ]]; then
echo "have found /"
directory=${filename%/*}
#if no directory exists, create it
if [ ! -d "$directory" ]; then
# Control will enter here if $directory doesn't exist.
echo "directory not found - creating one"
mkdir ~/etg/$directory
fi
fi
dd skip=$startbyte count=$count if=~/etg/largefile of=~/etg/$filename bs=1
done <indexfile
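A variant of the same loop, offered as a sketch rather than a drop-in replacement. It assumes index lines have exactly the form name<start>end with a 0-based start and an exclusive end, and that head -c is available (it is in GNU and BSD userlands). split_by_index is a hypothetical helper name. It reads with read -r so backslashes in the index survive, converts them to forward slashes, and uses tail and head instead of dd with bs=1, which copies one byte per call.

```bash
#!/bin/bash
# Sketch: split a data file using index lines of the form  name<start>end
# split_by_index is a hypothetical helper; end is assumed to be exclusive.
split_by_index() {
    local data=$1 index=$2 outdir=$3
    local line name range start end count
    # read -r keeps the backslashes in Windows-style paths intact
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%$'\r'}            # drop a trailing CR from DOS line endings
        [ -n "$line" ] || continue
        name=${line%<*}
        range=${line##*<}
        start=${range%>*}
        end=${range#*>}
        count=$((end - start))
        name=${name//\\//}            # folder\file.png -> folder/file.png
        mkdir -p "$outdir/$(dirname "$name")"
        # tail -c +K starts at byte K (1-based), so add 1 to the 0-based offset
        tail -c "+$((start + 1))" "$data" | head -c "$count" > "$outdir/$name"
    done < "$index"
}
```

A call might look like: split_by_index ~/etg/largefile indexfile ~/etg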

linux for loop two variables each time

I have several files in a directory and I want to run some linux packages on these files by every two of them, like ERR1045141_1 with ERR1045141_2 and ERR1045144_1 with ERR1045144_2 and so on. So I write a for loop for this but it is not working.
files:
ERR1045141_1.fastq.gz
ERR1045141_2.fastq.gz
ERR1045144_1.fastq.gz
ERR1045144_2.fastq.gz
ERR1045145_1.fastq.gz
ERR1045145_2.fastq.gz
ERR1045146_1.fastq.gz
ERR1045146_2.fastq.gz
ERR1045148_1.fastq.gz
ERR1045148_2.fastq.gz
ERR1045149_1.fastq.gz
ERR1045149_2.fastq.gz
ERR1045151_1.fastq.gz
ERR1045151_2.fastq.gz
ERR1045152_1.fastq.gz
ERR1045152_2.fastq.gz
ERR1045154_1.fastq.gz
ERR1045154_2.fastq.gz
codes:
files=ls
for (( i=0; i<${#files[@]} ; i+=2 )) ; do
echo "${files[i]}" "${files[i+1]}"
done
It did not work, and I am not sure whether files=ls is wrong or whether there is a better way to do it. Please advise.
Try the following if you are sure about the existence of the second file:
for file1 in ERR*_1*
do
file2=`echo $file1 | sed 's/_1/_2/g'`
echo $file1 $file2
done
No, what you really want to do is to process all the 1 files, performing some action on it and its associated 2 file.
You can do that with something as simple as the for loop in this complete test program:
#!/usr/bin/env bash
doSomethingWith() {
echo "[$1] [$2]"
}
touch 'xERR1045141_1.fastq.gz' 'xERR1045141_2.fastq.gz'
touch 'xERR1045144_1.fastq.gz' 'xERR1045144_2.fastq.gz'
touch 'xERR1045145_1.fastq.gz' 'xERR1045145_2.fastq.gz'
touch 'xERR1045146_1.fastq.gz' 'xERR1045146_2.fastq.gz'
touch 'xERR1045148_1.fastq.gz' 'xERR1045148_2.fastq.gz'
touch 'xERR1045149_1.fastq.gz' 'xERR1045149_2.fastq.gz'
touch 'xERR1045151_1.fastq.gz' 'xERR1045151_2.fastq.gz'
touch 'xERR1045152_1.fastq.gz' 'xERR1045152_2.fastq.gz'
touch 'xERR1045154_1.fastq.gz' 'xERR1045154_2.fastq.gz'
touch 'xERR 45154_1.fastq.gz' 'xERR 45154_2.fastq.gz'
for file1 in xERR*_1.fastq.gz ; do
file2="${file1/_1/_2}"
doSomethingWith "${file1}" "${file2}"
done
rm -rf xERR*.fastq.gz
This program outputs:
[xERR1045141_1.fastq.gz] [xERR1045141_2.fastq.gz]
[xERR1045144_1.fastq.gz] [xERR1045144_2.fastq.gz]
[xERR1045145_1.fastq.gz] [xERR1045145_2.fastq.gz]
[xERR1045146_1.fastq.gz] [xERR1045146_2.fastq.gz]
[xERR1045148_1.fastq.gz] [xERR1045148_2.fastq.gz]
[xERR1045149_1.fastq.gz] [xERR1045149_2.fastq.gz]
[xERR1045151_1.fastq.gz] [xERR1045151_2.fastq.gz]
[xERR1045152_1.fastq.gz] [xERR1045152_2.fastq.gz]
[xERR1045154_1.fastq.gz] [xERR1045154_2.fastq.gz]
[xERR 45154_1.fastq.gz] [xERR 45154_2.fastq.gz]
to show that the names are being handled correctly.
Note that I've named the files xERR* so as not to clash with your own files. You should adjust the loop to handle your own files once you're satisfied it will work okay.
And, just as an aside, if you don't want to do anything except for those cases where both files exist, you can simply replace the "action" line with something like:
[[ -f "${file2}" ]] && doSomethingWith "${file1}" "${file2}"
This will bypass those where the 2 file is not a regular file.
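For completeness, the index-based loop from the question can also work once the list is a real array. files=ls only stores the two-character string "ls", whereas a glob assignment such as files=(ERR*.fastq.gz) fills an array in sorted order. A minimal sketch, assuming every _1 file is directly followed by its _2 partner in that order; print_pairs is a hypothetical helper name:

```bash
#!/bin/bash
# Sketch: walk a sorted file list two entries at a time.
# print_pairs is a hypothetical helper name.
print_pairs() {
    local -a files=("$@")
    local i
    # Stop before a trailing unpaired entry (i + 1 must be a valid index).
    for ((i = 0; i + 1 < ${#files[@]}; i += 2)); do
        printf '%s %s\n' "${files[i]}" "${files[i+1]}"
    done
}

# Possible usage in the directory with the data:
# files=(ERR*.fastq.gz)
# print_pairs "${files[@]}"
```

This relies on position alone, so a single missing file shifts every later pair; deriving the second name from the first, as in the answers above, is more robust.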

accessing newly created directory in shell script

I'm attempting to make a new folder, a duplicate of the input, and then tar the contents of that folder. I can't figure out why - but it seems like instead of searching the contents of my newly created directory - it is searching my entire computer... returning lines such as
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Sine - Vocal 1.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Sine - Vocal 2.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Triangle - Arp.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Triangle - Asym 4.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Triangle - Eml.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Square is a folder
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Square/Square - Arp.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Square/Square - Bl Saw.raw is a file
can you guys spot a simple error?
BTW, I know that the script to tar isn't present yet, but that will be easy once i can navigate the new folder.
#!/bin/bash
##--- deal with help args ------------------
##
print_help_message() {
printf "Usage: \n"
printf "\t./`basename $0` <input_dir> <output_dir>\n"
printf "where\n"
printf "\tinput_dir : (required) the input directory.\n"
printf "\toutput_dir : (required) the output directory.\n"
}
if [ "$1" == "help" ]; then
print_help_message
exit 1
fi
## ------ get cli args ----------------------
##
if [ $# == 2 ]; then
INPUT_DIR="$1"
OUTPUT_DIR="$2"
fi
## ------ tree traversal function -----------
##
mkdir "$2"
cp -r "$1"/* "$2"/
## ------ return output dir name ------------
##
return_output_dir() {
echo $OUTPUT_DIR/$(basename $(basename $(dirname $1)))
}
bt() {
output_dir="$1"
for filename in $output_dir/*; do
if [ -d "${filename}" ]; then
echo "$filename is a folder"
bt $filename
else
echo "$filename is a file"
fi
done
}
## ------ main ------------------------------
##
main() {
bt $return_output_dir
exit 0
}
main
Well, I can tell you why it's doing that, but I'm not clear on what it's supposed to be doing, so I'm not sure how to fix it. The immediate problem is that return_output_dir is a function, not a variable, so in the command bt $return_output_dir the $return_output_dir part expands to ... nothing, and bt gets run with no argument. That means that inside bt, output_dir gets set to the empty string, so for filename in $output_dir/* becomes for filename in /*, which iterates over the top-level items on your boot volume.
There are a number of other things that're confusing/weird about this code:
The function main() doesn't seem to serve any purpose -- some of the main-line code is outside it (notably, the argument parsing stuff), some inside, for no apparent reason. Having a main function is required in some languages, but in a shell script it generally makes more sense to just put the main code inline. (Also, functions shouldn't exit, they should return.)
You have variables named both OUTPUT_DIR and output_dir. Use distinct names. Also, it's generally best to stick to lowercase (or mixed-case) variable names, to avoid conflicts with the variables that're used by the shell and other programs.
You copy $1 and $2 into INPUT_DIR and OUTPUT_DIR, then continue to use $1 and $2 rather than the more-clearly-named variables you just copied them into.
output_dir is changed in the recursive function, but not declared as local; this means that inner invocations of bt will be changing the values that outer ones might try to use, leading to weirdness. Declare function-local variables as local to avoid trouble.
$(basename $(basename $(dirname $1))) doesn't make sense. Suppose $1 is "/foo/bar/baz/quux": then dirname $1 returns /foo/bar/baz, basename /foo/bar/baz returns "baz", and basename baz returns "baz" again. The second basename isn't doing anything! And in any case, I'm pretty sure the whole thing isn't doing what you expect it to.
What directory is bt supposed to be recursing through? Nothing in how you call it has any reference to either INPUT_DIR or OUTPUT_DIR.
As a rule, you should put variable references in double-quotes (e.g. for filename in "$output_dir"/* and bt "$filename"). You do this in some places, but not others.
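Several of these points can be seen together in a reduced version of the script. This is a sketch under assumptions, since the intended behaviour is not fully clear from the question: walk is a hypothetical name for the bt function, and it is assumed that the copy in the output directory is the tree to traverse.

```bash
#!/bin/bash
# Sketch of the traversal with a local variable, quoting, and an explicit argument.
# walk is a hypothetical name for the bt function in the question.
walk() {
    local dir=$1 entry      # local, so recursion does not clobber the caller's value
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue    # empty directory: the glob stays literal
        if [ -d "$entry" ]; then
            echo "$entry is a folder"
            walk "$entry"
        else
            echo "$entry is a file"
        fi
    done
}

# Main code inline: copy the input tree, then walk the copy.
if [ "$#" -eq 2 ]; then
    input_dir=$1
    output_dir=$2
    mkdir -p "$output_dir"
    cp -R "$input_dir"/. "$output_dir"/
    walk "$output_dir"
fi
```

Because walk always receives a quoted, non-empty argument here, the loop can no longer degrade to /* and wander over the whole disk.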

Resources