Bash script to obtain the newest file X in a folder and create a new variable called X+1 - bash

I am trying to create a loop in a Bash script for a series of data migrations:
At the beginning of every step, the script should get the name of the newest file in a folder
called "migrationfiles/", store it in the variable "migbefore", and create a new variable "migafter" holding the same name with the number incremented by 1.
Example: if the "migrationfiles/" folder contains the following files:
migration.pickle1 migration.pickle2 migration.pickle3
the variables "migbefore" and "migafter" should have the following values:
migbefore=migration.pickle3
migafter=migration.pickle4
At the end of every step, the function "metl", which is in charge of performing the data migration, uses the file named by "migbefore" to load the data and creates one new file, named by "migafter", in the "migrationfiles/" folder. So in this case, the new file created will be called:
"migration.pickle4"
The code I intend to use is the following:
#!/bin/bash
migbefore=0
migafter=0
for y in testappend/*
    for x in migrationfiles/*
    do
        migbefore=migration.pickle(oldest)
        migafter=migbefore+1
    done
do
    metl -m migrationfiles/"${migbefore}"
         -t migrationfiles/"${migafter}"
         -s "${y}"
         config3.yml
done
Does anyone know how I could write the first loop (the one that searches for the newest file in the "migrationfiles/" folder) and then set "migafter" to "migbefore" plus 1?

I think this might do what you want.
#!/bin/bash
count=0
prefix=migration.pickle
migbefore=$prefix$((count++))
migafter=$prefix$((count++))
for y in testappend/*; do
    echo metl -m migrationfiles/"${migbefore}" \
        -t migrationfiles/"${migafter}" \
        -s "${y}" \
        config3.yml
    migbefore=$migafter
    migafter=$prefix$((count++))
done
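Note that the echo only prints each metl command so you can check it first; drop the echo to actually run the commands. If the counter should instead start from the newest file already present (as the question asks), you could seed count before the loop. A minimal sketch, assuming the file names end in a plain number:
count=$(ls migrationfiles/migration.pickle* 2>/dev/null | sed 's/.*pickle//' | sort -n | tail -1)
count=${count:-0}   # folder empty: fall back to 0
With the three files from the question this sets count=3, so the first step gets migbefore=migration.pickle3 and migafter=migration.pickle4.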

Copy with Numbered Backups
It's really hard to tell what you're trying to do here, and why. However, you might be able to make life simpler by using the --backup flag of the cp command. For example:
cp --backup=numbered testappend/migration.pickle migrationfiles/
This will ensure that you have a sequence of migration files like:
migration.pickle
migration.pickle.~1~
migration.pickle.~2~
migration.pickle.~3~
where the older versions have smaller ordinal numbers (.~1~ is the oldest backup) and the latest version has no ordinal extension. It's a pretty simple system, but it works well for a wide variety of use cases. YMMV.
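For instance, repeating the same copy grows the chain (GNU cp; paths taken from the question):
cp --backup=numbered testappend/migration.pickle migrationfiles/   # first copy: migration.pickle
cp --backup=numbered testappend/migration.pickle migrationfiles/   # old copy becomes migration.pickle.~1~
cp --backup=numbered testappend/migration.pickle migrationfiles/   # previous copy becomes migration.pickle.~2~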

# configuration:
path=migrationfiles
prefix=migration.pickle
# determine number of last file:
last_number=$( find "${path}" -name "${prefix}*" | sed -e "s/.*${prefix}//g" | sort -n | tail -1 )
# put together the file names:
migbefore=${prefix}${last_number}
migafter=${prefix}$(( last_number + 1 ))
# test it:
echo $migbefore $migafter
This should work even if there are no migration files yet. In that case, the value of migbefore is just the prefix and does not point to a real file.
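For completeness, here is a sketch of how this detection step could drive the loop from the question. metl and config3.yml are the asker's own tool and config, so this is untested scaffolding rather than a verified pipeline:
#!/bin/bash
path=migrationfiles
prefix=migration.pickle
for y in testappend/*; do
    # Recompute the newest file before every step, since metl adds one file per step.
    last_number=$( find "${path}" -name "${prefix}*" | sed -e "s/.*${prefix}//" | sort -n | tail -1 )
    migbefore=${prefix}${last_number}
    migafter=${prefix}$(( last_number + 1 ))
    metl -m "${path}/${migbefore}" \
         -t "${path}/${migafter}" \
         -s "${y}" \
         config3.yml
done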

Related

How to create a list of sequentially numbered folders using an existing folder name as the base name

I've done a small amount of bash scripting. Mostly modifying a script to my needs.
On this one I am stumped.
I need a script that will read a sub-folder name inside a folder and make a numbered list of folders based on that sub-folder name.
Example:
I make a folder named “Pictures”.
Then inside I make a sub-folder named “picture-set”
I want a script to see the existing sub-folder name (picture-set) and make 10 more folders with sequential numbers appended to the end of the folder names.
ex:
folder is: Pictures
sub-folder is: picture-set
want to create:
“picture-set-01”
“picture-set-02”
“picture-set-03”
and so forth up to 10. Or a number specified in the script.
The folder structure would look like this:
/home/Pictures/picture-set
/home/Pictures/picture-set-01
/home/Pictures/picture-set-02
/home/Pictures/picture-set-03
... and so on
I am unable to tell the script how to find the base folder name to make additional folders.
ie: “picture-set”
or a better option:
Would be to create a folder and then create a set of numbered sub-folders based on the parent folder name.
ex:
/home/Songs - would become:
/home/Songs/Songs-001
/home/Songs/Songs-002
/home/Songs/Songs-003
and so on.
Please pardon my bad formatting... this is my first time asking a question on a forum such as this. Any links or pointers as to proper formatting are welcome.
Thanks for the help.
Bash has brace expansion you can use to generate folder names as arguments to the mkdir command:
#!/usr/bin/env bash
# Creates all directories up to 10
mkdir -p -- /home/Songs/Songs-{001..010}
This method is not very flexible if you need to dynamically change the range of numbers using variables, because brace expansion happens before parameter expansion.
So you may use a Bash for loop instead, printf-format the names with the desired number of digits, and create each directory in the loop:
#!/usr/bin/env bash
start_index=1
end_index=10
for ((i=start_index; i<=end_index; i++)); do
    # format a dirpath with the 3-digit index
    printf -v dirpath '/home/Songs/Songs-%03d' $i
    mkdir -p -- "$dirpath"
done
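The same loop adapts to the Pictures example from the question by changing only the format string; a hypothetical variation:
printf -v dirpath '/home/Pictures/picture-set-%02d' $i
which yields picture-set-01 through picture-set-10 for the default range.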
# Prerequisite:
mkdir Pictures
cd Pictures
# Your script:
min=1
max=12
name="$(basename "$(realpath .)")"
for num in $(seq -w $min $max); do mkdir "$name-$num"; done
# Result
ls
Pictures-01 Pictures-03 Pictures-05 Pictures-07 Pictures-09 Pictures-11
Pictures-02 Pictures-04 Pictures-06 Pictures-08 Pictures-10 Pictures-12

How to recursively rename all files and folder including specific part of the filename with Windows Bash?

This has to be a duplicate, but I have read and tried at least a dozen Q&As here on SO, and I cannot get any of them working for my case.
I really hope this won't result in downvotes because of it.
So I'm on Windows (10) and have a Bash terminal that I want to use for my task: the MINGW64 one I downloaded when I started working with Git.
I would prefer the solution with this program, but will be perfectly happy with one in Command Prompt Terminal or even PowerShell.
I created a TemplateApp which is in C:\Apps\TemplateApp folder which has multiple folders and subfolders named TemplateApp or TemplateApp.something as well as a lot of files that have TemplateApp as a part of their name.
Could be:
TemplateApp.ext
TemplateApp.something.ext
something.TemplateApp.something.ext
Then I copied the uppermost folder to C:\Apps\TemplateApp - Copy and in turn renamed it to C:\Apps\ProductionApplication.
Now, for the love of whomever, I cannot make any of the scripts I found on SO work for my case, i.e. rename all the above-mentioned files and folders by replacing TemplateApp with ProductionApplication.
Here is a bash function I wrote that I think does something very much like what you are trying to do.
function func_CreateSourceAndDestination() {
    for (( i = 0 ; i < ${#files_syncSource[@]} ; i++ )) ; do
        files_syncDestination[${i}]="${files_syncSource[${i}]#${directory_MusicLibraryRoot_source}}"
        file_destinationPath="$( dirname -- "${directory_PMPRoot_destination}${files_syncDestination[${i}]}" )"
        if [ ! -d "${file_destinationPath}" ] ; then
            mkdir -p "${file_destinationPath}"
        fi
        rsync -rltDvPmz "${files_syncSource[${i}]}" "${directory_PMPRoot_destination}${files_syncDestination[${i}]}"
    done
}
In my case I'm feeding into rsync for a source and a destination. I'm pulling all the file paths from an array that has been split into path segments. I have to make certain character substitutions for FAT and NTFS file systems. I do this recursively.
files_syncDestination[${i}]="${files_syncDestination[${i}]//\:/__}"
That's the magic. I load a new array with the character substituted. You could do the same with a loaded variable including your phrases for change.
files_syncDestination[${i}]="${files_syncDestination[${i}]//${targetPhrase}/${subPhrase}}"
After that change in the function, you could use rsync or cp or mv as you prefer to go from your source array to your destination array.
(The double-slash in the substitution makes the substitution global.)
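For the rename task in the question itself, a more direct route is find combined with the same substitution. A minimal sketch, untested on MINGW64 but built only from standard bash and GNU find constructs:
#!/usr/bin/env bash
# Rename deepest entries first (-depth) so that directories are renamed
# only after everything inside them has already been handled.
target="TemplateApp"
replacement="ProductionApplication"
find . -depth -name "*${target}*" | while IFS= read -r path; do
    dir=$(dirname -- "$path")
    base=$(basename -- "$path")
    # Substitute only in the last path component; the parent path is still valid.
    mv -- "$path" "${dir}/${base//${target}/${replacement}}"
done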

Split a folder which has 100s of subfolders with each having few files into one more level of subfolder using shell

I have a following data dir:
root/A/1
root/A/2
root/B/1
root/B/2
root/B/3
root/C/1
root/C/2
And I want to convert it into following file structure:
root2/I/A/1
root2/I/A/2
root2/I/B/1
root2/I/B/2
root2/I/B/3
root2/II/C/1
root2/II/C/2
The purpose of doing this is that I want to run some script which takes a home folder (root here) and runs on it, and I want to run it in parallel on many folders (I, II) to speed up the process.
Simple assumption about file and folder name is that all are alphanumeric, even no period or underscore.
Edit: I tried the following:
for i in `seq 1 30`; do mkdir -p "root2/folder$i"; find root -type f | head -n 4000 | xargs -i cp "{}" "root2/folder$i"; done
Problem is that it creates something like the following, which is not what I wanted.
root2/I/1
root2/I/2
root2/I/1
root2/I/2
root2/I/3
root2/II/1
root2/II/2
You may wish to use a lesser-known command called dirsplit, whose usual application is splitting a directory into multiple directories for disc-burning purposes.
Use it like this:
dirsplit -m -s 300M /root/ -p /backup/folder1
The options mean the following:
-m|--move Move files to target dirs
-e 2 special exploration mode; 2 means files in a directory are kept together
-p prefix to be attached to each directory created, in your case I, II, etc.
-s maximum size allowed for each new folder created
For more information see:
dirsplit -H
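If dirsplit is not available, the same split can be approximated in plain bash. A minimal sketch; the bucket names are hypothetical and subfolders are distributed round-robin rather than by size:
#!/bin/bash
# Distribute the top-level subfolders of "root" round-robin into
# bucket directories under "root2", keeping each subfolder intact.
buckets=(I II)
i=0
for dir in root/*/; do
    bucket=${buckets[i % ${#buckets[@]}]}
    mkdir -p "root2/${bucket}"
    mv "$dir" "root2/${bucket}/"
    i=$((i + 1))
done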

Adding a status (file integrity)check to a cbr cbz converting bash script

First post, so Hi! Let me start by saying I'm a total noob regarding programming. I understand very basic stuff, but when it comes to checking exit codes or what the adequate term is, I'm at a loss. Apparently my searchfoo is really weak in this area, I guess it's a question of terminology.
Thanks in advance for taking your time to reading this/answering my question!
Description: I found a script that converts/repacks .cbr files to .cbz files. These files are basically your average rar and zip files, renamed to another extension, as they are used by (comic)book applications such as comicrack, qcomicbook and whatnot. Surprisingly enough there are no cbr -> cbz converters out there. The advantage of .cbz is, besides escaping the proprietary rar file format, that one can store the metadata from Comic Vine with e.g. comictagger.
Issue: Sometimes the repackaging of the files doesn't end well, which would hopefully be alleviated by an integrity check and another go. I modified said script slightly to use p7zip, as it can both pack and unpack 7z, zip and some other formats, i.e. great for options. p7zip can test an archive like this:
7z t comicfile.cbz tmpworkingdir
I guess it's a matter of using if and else here(?) to check the integrity and then give it another go if there are any errors.
Question/tl;dr: What would be the "best"/adequate approach to add an integrity check to the script below?
#!/bin/bash
#Source: http://comicrack.cyolito.com/forum/13-scripts/30013-cbr3cbz-rar-to-zip-conversion-for-linux
echo "Converting CBRs to CBZs"
# Set the "field separator" to something other than spaces/newlines" so that spaces
# in the file names don't mess things up. I'm using the pipe symbol ("|") as it is very
# unlikely to appear in a file name.
IFS="|"
# Set working directory where to create the temp dir. The user you are using must have permission
# to write into this directory.
# For performance reasons I'm using ram disk (/dev/shm/) in Ubuntu server.
WORKDIR="/dev/shm/"
# Set name for the temp dir. This directory will be created under WORKDIR
TEMPDIR="cbr2cbz"
# The script should be invoked as "cbr2cbz {directory}", where "{directory}" is the
# top-level directory to be searched. Just to be paranoid, if no directory is specified,
# then default to the current working directory ("."). Let's put the name of the
# directory into a shell variable called SOURCEDIR.
# Note: "$1" = "The first command line argument"
if test -z "$1"; then
SOURCEDIR=`pwd`
else
SOURCEDIR="$1"
fi
echo "Working from directory $SOURCEDIR"
# We need an empty directory to work in, so we'll create a temp directory here
cd "$WORKDIR"
mkdir "$TEMPDIR"
# and step into it
cd "$TEMPDIR"
# Now, execute a loop, based on a "find" command in the specified directory. The
# "-printf "$p|" will cause the file names to be separated by the pipe symbol, rather than
# the default newline. Note the backtics ("`") (the key above the tab key on US
# keyboards).
for CBRFILE in `find "$SOURCEDIR" -name "*.cbr" -printf "%p|"`; do
# Now for the actual work. First, extract the base file name (without the extension)
# using the "basename" command. Warning: more backtics.
BASENAME=`basename $CBRFILE ".cbr"`
# And the directory path for that file, so we know where to put the finished ".cbz"
# file.
DIRNAME=`dirname $CBRFILE`
# Now, build the "new" file name,
NEWNAME="$BASENAME.cbz"
# We use RAR file's name to create folder for unpacked files
echo "Processing $CBRFILE"
mkdir "$BASENAME"
# and unpack the rar file into it
7z x "$CBRFILE" -o"$BASENAME"
cd "$BASENAME"
# Let's ensure the permissions allow us to pack everything
sudo chmod 777 -R ./*
# Put all the extracted files into new ".cbz" file
7z a -tzip -mx=9 "$NEWNAME" *
# And move it to the directory where we found the original ".cbr" file
mv "$NEWNAME" $DIRNAME/"$NEWNAME"
# Finally, "cd" back to the original working directory, and delete the temp directory
# created earlier.
cd ..
rm -r "$BASENAME"
# Delete the RAR file also
rm "$CBRFILE"
done
# At the end we cleanup by removing the temp folder from ram disk
cd ..
echo "Conversion Done"
rm -r "$TEMPDIR"
Edit 2: I removed unrar as a dependency and use p7zip instead, as it can extract rar files.
You will need two checks:
7z t will test the integrity of the archive
You should also test the integrity of all the image files in the archive. You can use tools like ImageMagick for this.
A simple test would be identify file, but that might read only the header. I'd use: convert file -resize 5x5 png:- > /dev/null
This scales the image down to 5x5 pixels, converts it to PNG and then pipes the result to /dev/null (discarding it). For the scaling, the whole image has to be read. If this command fails with an error, something is wrong with the image file.
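Wired into the script above, the check could sit right after the 7z a step. A minimal sketch of the if/else the question asks about ($NEWNAME is the freshly packed archive; retrying just once is an assumption, not a rule):
# Test the new archive; on failure, drop it and pack one more time.
if ! 7z t "$NEWNAME" > /dev/null; then
    echo "Integrity check failed for $NEWNAME, repacking..." >&2
    rm -f "$NEWNAME"
    7z a -tzip -mx=9 "$NEWNAME" *
fi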

Sync File Modification Time Across Multiple Directories

I have a computer A with two directory trees. The first directory contains the original mod dates that span back several years. The second directory is a copy of the first with a few additional files. There is a second computer, B, which contains a directory tree that is the same as the second directory on computer A (new mod times and additional files). How do I update the files in the two newer directories on both machines so that the mod times on the files are the same as in the original? Note that these directory trees are on the order of tens of gigabytes, so the solution would have to include some method of sending only the date information to the second computer.
The answer by Paul is partly correct: rsync is able to do this, but with different parameters. The correct command is
rsync -Prt --size-only original_dir copy_dir
where -P enables partial transfers and displays a progress indicator, -r recurses through subdirectories, -t preserves time stamps and --size-only doesn't transfer files that match in size.
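One caveat worth noting: a trailing slash on the source changes what rsync copies. For instance:
rsync -Prt --size-only original_dir/ copy_dir   # syncs the contents of original_dir into copy_dir
rsync -Prt --size-only original_dir copy_dir    # recreates original_dir itself inside copy_dir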
The following command will make sure that TEST2 gets the same date assigned that TEST1 has
touch -t `stat -t '%Y%m%d%H%M.%S' -f '%Sm' TEST1` TEST2
Now instead of using hard-coded values here, you could find the files using the "find" utility and then run touch via SSH on the remote machine. However, that means you may have to enter the password for each file, unless you switch SSH to key-based authentication. I'd rather not do it all in a super fancy one-liner. Instead, let's work with temp files. First go to the directory in question and run a find (you can filter by file type, size, extension, whatever pleases you; see "man find" for details. I'm just filtering by type f here to exclude any directories):
find . -type f -print -exec stat -t '%Y%m%d%H%M.%S' -f '%Sm' "{}" \; > /tmp/original_dates.txt
Now we have a file that looks like this (in my example there are only two entries there):
# cat /tmp/original_dates.txt
./test1
200809241840.55
./test2
200809241849.56
Now just copy the file over to the other machine and place it in the directory (so the relative file paths match) and apply the dates:
cat original_dates.txt | (while read FILE && read DATE; do touch -t $DATE "$FILE"; done)
Will also work with file names containing spaces.
One note: I used the last "modification" date in stat, as that's what you wrote in the question. However, if you actually want to use the "creation" date (every file has a creation date, a last modification date and a last access date), you need to alter the stat call a bit.
'%Sm' - last modification date
'%Sc' - creation date
'%Sa' - last access date
However, touch can only change the modification time and access time, I think it can't change the creation time of a file ... so if that was your real intention, my solution might be sub-optimal... but in that case your question was as well ;-)
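For example, gathering what the list above calls the "creation" date only changes the format letter in the stat call shown earlier:
find . -type f -print -exec stat -t '%Y%m%d%H%M.%S' -f '%Sc' "{}" \; > /tmp/original_dates.txt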
I would go through all the files in the source directory tree and gather the modification times from them into a script that I could run on the other directory trees. You will need to be careful about a few 'gotchas'. First, make sure that your output script has relative paths, and make sure you run it from the proper target directory, which should be the root directory of the target tree. Also, when changing machines make sure you are using the same timezone as you were on the machine where you generated the script.
Here's a Perl script I put together that will output the touch commands needed to update the times on the other directory trees. Depending on the target machines, you may need to tweak the date formats or command options, but this should give you a place to start.
#!/usr/bin/perl
my $STARTDIR="$HOME/test";
chdir $STARTDIR;
my @files = `find . -type f`;
chomp @files;
foreach my $file (@files) {
    my $mtime = localtime((stat($file))[9]);
    print qq(touch -m -d "$mtime" "$file"\n);
}
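A possible way to use it (the script name is hypothetical): generate the commands on the source machine, then replay them from the root of the target tree:
perl gen_touch_times.pl > restore_times.sh
sh restore_times.sh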
The other approach you could try is to attach the remote directory using NFS and then copy the times using find and touch -r.
I think rsync (with the right options) will do this: it claims to only send file differences, so presumably it will work out that there are no differences to be transferred.
--times preserves the modification times, which is what you want.
See (for instance) http://linux.die.net/man/1/rsync
Also add -I (--ignore-times: "don't skip files that match size and time") so that all files are "transferred", and trust rsync's file-differences optimisation to make it "fairly efficient" - see the excerpt from the man page below.
-t, --times
This tells rsync to transfer modification times along with the files and update them on the remote system. Note that if this option is not used, the optimization that excludes files that have not been modified cannot be effective; in other words, a missing -t or -a will cause the next transfer to behave as if it used -I, causing all files to be updated (though the rsync algorithm will make the update fairly efficient if the files haven't actually changed, you're much better off using -t).
I used the following Python scripts instead.
Python scripts run much faster than an approach creating new processes for each file (like using find and stat). The solution below also works in case of timezone differences between systems, as it uses UTC times. It also works with paths containing spaces (but not paths containing newline!). It doesn't set times for symlinks, because the operating system provides no mechanism to modify the timestamp of a symlink, but in a file manager the time of the file the symlink points at is shown instead anyway. It uses a maxTime parameter to avoid resetting dates for files that are actually modified after copying from the original directory.
listMTimes.py:
import os
from datetime import datetime
from pytz import utc

for dirpath, dirnames, filenames in os.walk('./'):
    for name in filenames+dirnames:
        path = os.path.join(dirpath, name)
        # Avoid symlinks because os.path.getmtime and os.utime get and
        # set the time of the pointed file, and in the new directory,
        # the link may have been redirected.
        if not os.path.islink(path):
            mtime = datetime.fromtimestamp(os.path.getmtime(path), utc)
            print(mtime.isoformat()+" "+path)
setMTimes.py:
import datetime, fileinput, os, sys, time
import dateutil.parser
from pytz import utc

# Based on
# http://stackoverflow.com/questions/6999726/python-getting-millis-since-epoch-from-datetime
def unix_time(dt):
    epoch = datetime.datetime.fromtimestamp(0, utc)
    delta = dt - epoch
    return delta.total_seconds()

if len(sys.argv) != 2:
    print('Syntax: '+sys.argv[0]+' <maxTime>')
    print('    where <maxTime> is an ISO time, e.g. "2013-12-02T23:00+02:00".')
    exit(1)

# A file with a modification time newer than maxTime is not reset to
# its original modification time.
maxTime = unix_time(dateutil.parser.parse(sys.argv[1]))

for line in fileinput.input([]):
    (datetimeString, path) = line.rstrip('\r\n').split(' ', 1)
    mtime = dateutil.parser.parse(datetimeString)
    if os.path.exists(path) and not os.path.islink(path):
        if os.path.getmtime(path) <= maxTime:
            os.utime(path, (time.time(), unix_time(mtime)))
Usage: in the first directory (the original) run
python listMTimes.py >/tmp/original_dates.txt
Then in the second directory (a copy of the original, possibly with some files modified/added/deleted) run something like this:
python setMTimes.py 2013-12-02T23:00+02:00 </tmp/original_dates.txt
