Bash rsync script to recursively copy video & dir structure from FUSE? - bash

I am trying to copy all video files recursively from a locally synced FUSE file system (GDrive via Insync and Cryptomator, Ubuntu 20.04), recreating the source directory structure in the destination for the copied files only. Although it works for some videos, others are left behind even though they have the same file extension as files that were copied successfully.
Can I adjust my script somehow to make this work properly, please? I was also using --ignore-existing, but I removed it and have been testing with an empty source directory just in case that helped; it did not.
#!/bin/bash
# Define source and destination directories
src_dir="/home/user/cloud"
dest_dir="/home/user/testmove"
# Use rsync to move files and create directories recursively
rsync --progress --ignore-existing -avm -f'+ *[mM][pP]3$' -f'+ *[fF][lL][aA][cC]$' -f'+ *[mM][4][aA]$' -f'+ *[wW][aA][vV]$' -f'+ *[aA][iI][fF][fF]$' -f'+ *[pP][cC][mM]$' -f'+ *[aA][aA][cC]$' -f'+ *[oO][gG][gG]$' -f'+ *[aA][lL][aA][cC]$' -f'+ */' -f'- *' "$src_dir"/ "$dest_dir"/ --prune-empty-dirs
I found the trailing $ stopped any files from being copied (rsync filter rules are shell-style globs, so $ matches a literal $ rather than anchoring the end of the name). I am now testing the following, which seems to work, along with Eric's much more comprehensive and flexible version from his answer below:
#!/bin/bash
# Define source and destination directories
src_dir="/home/user/cloud"
dest_dir="/home/user/testmove"
# Use rsync to copy files and create directories recursively
rsync -ahPv \
-f'+ *.[mM][pP]3' \
-f'+ *.[mM][4][aA]' \
-f'+ *.[wW][aA][vV]' \
-f'+ *.[oO][gG][gG]' \
-f'+ *.[wW][mM][aA]' \
-f'+ *.[fF][lL][aA][cC]' \
-f'+ *.[aA][iI][fF][fF]' \
-f'+ *.[pP][cC][mM]' \
-f'+ *.[aA][aA][cC]' \
-f'+ *.[aA][lL][aA][cC]' \
-f'+ */' \
-f'- *' \
"$src_dir"/ "$dest_dir" --prune-empty-dirs --stats -n
# To delete empty directories left in source when transferring:
# find "$src_dir" -depth -type d -empty -delete
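As a side note for anyone testing filter rules like these: they can be exercised in a throwaway sandbox first. Below is a minimal sketch using made-up paths; the -n flag keeps it a dry run:
#!/bin/bash
# Throwaway sandbox for testing rsync filter rules (paths are examples only).
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/albums"
touch "$src/albums/track.mp3" "$src/albums/notes.txt"
# -n = dry run: lists what would transfer without copying anything.
# '+ */' keeps directories traversable; '- *' excludes everything unmatched.
rsync -avmn -f'+ *.[mM][pP]3' -f'+ */' -f'- *' "$src"/ "$dst"/
# Expected: albums/track.mp3 is listed, notes.txt is not.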

There were some elements of the command structure that conflicted, preventing the desired restrictions and causing the inclusions to be overlooked. The following form produces the desired output.
Note that I have some hardcoded values that you need to remove.
#!/bin/bash
audio=1
video=0
DryRun=""
while [ $# -gt 0 ]
do
case $1 in
--source ) src_dir="$2" ; shift ; shift ;;
--target ) dest_dir="$2" ; shift ; shift ;;
--audio ) audio=1 ; video=0 ; shift ;;
--video ) audio=0 ; video=1 ; shift ;;
--dry ) DryRun="--dry-run --stats" ; shift ;;
* ) echo -e "\n Invalid option used on command line. Only valid options: [ --source {src_dir} | --target {dest_dir} | --audio | --video | --dry ]\n Bye!\n" ; exit 1 ;;
esac
done
# Define source and destination directories
#src_dir="/home/user/cloud"
src_dir="$(pwd)/Output"
#dest_dir="/home/user/testmove"
dest_dir="$(pwd)/Dupl"
# Define rules for include or exclude
cat >"EXCLUDES.filters" <<"EnDoFiNpUt"
- *
EnDoFiNpUt
#filters="--filter \"merge AUDIO_TYPES.filters\" " ### THIS FORM DOES NOT WORK !!!
if [ ${audio} -eq 1 ]
then
### NOTE: Trailing "$" to match end of string is implicit, unless there is a trailing "*"
cat >"AUDIO_TYPES.filters" <<"EnDoFiNpUt"
+ */
+ *\.[mM][pP][3]
+ *\.[fF][lL][aA][cC]
+ *\.[mM][4][aA]
+ *\.[wW][aA][vV]
+ *\.[aA][iI][fF][fF]
+ *\.[pP][cC][mM]
+ *\.[aA][aA][cC]
+ *\.[oO][gG][gG]
+ *\.[aA][lL][aA][cC]
EnDoFiNpUt
filters="--include-from=AUDIO_TYPES.filters --exclude-from=EXCLUDES.filters"
fi
if [ ${video} -eq 1 ]
then
### NOTE: Trailing "$" to match end of string is implicit, unless there is a trailing "*"
cat >"VIDEO_TYPES.filters" <<"EnDoFiNpUt"
+ */
+ *\.[mM][pP][4]
+ *\.[aA][vV][iI]
+ *\.[mM][kK][vV]
+ *\.[fF][lL][vV]
+ *\.[wW][mM][vV]
+ *\.[mM][oO][vV]
+ *\.[Aa][Vv][Cc][Hh][Dd]
+ *\.[wW][eE][bB][mM]
+ *\.[hH][2][6][4]
+ *\.[mM][pP][eE][gG][4]
EnDoFiNpUt
filters="--include-from=VIDEO_TYPES.filters --exclude-from=EXCLUDES.filters"
fi
# Use rsync to move files and create directories recursively
### NOTE: I prefer long-form options for understanding code at first glance; shortform doesn't do that for me.
#rsync -rlptgoDvP ### shortform for universal options
rsync \
--verbose \
--recursive \
--links \
--owner \
--group \
--perms \
--times \
--devices \
--specials \
--prune-empty-dirs \
--partial \
--progress \
${DryRun} \
${filters} \
"${src_dir}/" "${dest_dir}/"
Log of session output:
building file list ...
3 files to consider
created directory /0__WORK/Dupl
./
DEF.mP4
0 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=1/3)
abc.mp4
0 100% 0.00kB/s 0:00:00 (xfr#2, to-chk=0/3)
sent 196 bytes received 126 bytes 644.00 bytes/sec
total size is 0 speedup is 0.00
Session log for a version of the script modified per the requestor's (saltyeggs) reformulated command (it also works, differently, but as expected):
building file list ...
3 files to consider
created directory /0__WORK/Dupl
./
DEF.mP4
abc.mp4
Number of files: 3 (reg: 2, dir: 1)
Number of created files: 3 (reg: 2, dir: 1)
Number of deleted files: 0
Number of regular files transferred: 2
Total file size: 0 bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 128
Total bytes received: 94
sent 128 bytes received 94 bytes 444.00 bytes/sec
total size is 0 speedup is 0.00 (DRY RUN)
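For reference, once the hardcoded src_dir/dest_dir overrides are removed, the script is driven entirely by its options; invocation would look like this (the script name sync_media.sh is hypothetical):
# Dry-run an audio copy:
./sync_media.sh --source /home/user/cloud --target /home/user/testmove --audio --dry
# Copy video for real:
./sync_media.sh --source /home/user/cloud --target /home/user/testmove --video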

Related

How can I run some custom commands in a loop sequentially

There are variables that I need to change and run sequentially within certain code. For example, the ionex_list variable looks like this, and I need to change it in a loop. I have 3 variable files like this (ionex_list, receiver_ids, data_record), each with 9 entries:
ionex_list data_record
jplg0910.19i.Z ISTA00TUR_R_20190910000_01D_30S_MO.crx.gz
jplg0920.19i.Z ISTA00TUR_R_20190920000_01D_30S_MO.crx.gz
... ...
jplg0980.19i.Z ISTA00TUR_R_20190980000_01D_30S_MO.crx.gz
jplg0990.19i.Z ISTA00TUR_R_20190990000_01D_30S_MO.crx.gz
For example, I need to pass ISTA00TUR_R_20190910000_01D_30S_MO.crx.gz as -dataFile to rnxEditGde.py, but I get the error "File: "/y2019/d091" does not exist" and I cannot work out how to supply the path. These .gz files are a few directories below the directory I'm working in. Since I can't get past the rnxEditGde.py step, I can't tell whether the code below it works properly.
dir=$(pwd)
# pl: print line $1 of file $2
function pl {
sed -n "$1p" "$2"
}
for j in {091..099}; do
ids=$(pl $j receiver_ids)
ionex=$(pl $j ionex_list)
dataRecordFile=$(pl $j data_record)
gd2e.py -mkTreeS Trees
sed -i "s/jplg.*/$ionex/g" $dir/Trees/ppp_0.tree
rnxEditGde.py -dataFile "/y2019/d${j}/$dataRecordFile" -o dataRecordFile.Orig.gz
gd2e.py \
-treeSequenceDir Trees \
-drEditedFile $dataRecordFile \
-antexFile goa-var/etc/antennaCalsGNSS/igs14_2038.atx \
-GNSS https://sideshow.jpl.nasa.gov/pub/JPL_GNSS_Products/MultiGNSS_Rapid_Test \
-runType PPP \
-recList $ids \
-staDb /goa-var/sta_info/sta_db_qlflinn \
>gd2e.log 2>gd2e.err
done
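The "File: "/y2019/d091" does not exist" error suggests rnxEditGde.py is resolving the path relative to the wrong directory. A minimal sketch of building an absolute path first (the y2019/d${j} layout under $dir is an assumption taken from the error message):
# Resolve the data file to an absolute path before handing it to rnxEditGde.py.
# The y2019/d${j} subdirectory layout is assumed, not confirmed.
dataPath="$dir/y2019/d${j}/$dataRecordFile"
if [ -f "$dataPath" ]; then
rnxEditGde.py -dataFile "$dataPath" -o dataRecordFile.Orig.gz
else
echo "Missing: $dataPath" >&2
fi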

Join split -001.mkv files with bash

I have some -001.mkv files that were split in this way:
the rings + of power-001.mkv
the rings + of power-002.mkv
..
nightmare.return-001.mkv
nightmare.return-002.mkv
..
I need to join the -001.mkv files to obtain files like this:
the rings + of power.mkv
nightmare.return.mkv
I thought of code like this, but it doesn't work:
for file in "./source/*001.mkv;" \
do \
echo mkvmerge --join \
--clusters-in-meta-seek -o "./joined/$(basename "$file")" "$file"; \
done
source is the source folder where -001.mkv files are located
joined is the target folder
Can you try this?
for f in ./source/*-001.mkv
do
p=${f%-001.mkv}
echo mkvmerge --clusters-in-meta-seek \
-o ./joined/"${p##*/}".mkv "$p"-*.mkv
done
Given your example, the for f in ./source/*-001.mkv should loop though nightmare.return-001.mkv and the rings + of power-001.mkv.
Let's assume that we're in the first iteration and $f expands to ./source/nightmare.return-001.mkv.
${f%-001.mkv} right-side-strips -001.mkv from $f; that would give ./source/nightmare.return. We store this prefix in the variable p.
${p##*/} left-side-strips all characters up to the last / from $p; that would give nightmare.return.
So ... -o ./joined/"${p##*/}".mkv "$p"-*.mkv would be equivalent to ... -o ./joined/nightmare.return.mkv ./source/nightmare.return-*.mkv
PS: Once you've checked that the displayed commands correspond to what you expect, you can remove the echo.
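Both expansions are plain POSIX parameter expansion, easy to check interactively (the filename below is just an example):
# Demo of the expansions used above:
f="./source/nightmare.return-001.mkv"
p=${f%-001.mkv}    # strip the suffix -> ./source/nightmare.return
echo "${p##*/}"    # strip everything up to the last / -> nightmare.return
# "$p"-*.mkv then globs every part: ./source/nightmare.return-001.mkv, -002.mkv, ...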

Qsub script - Unable to run job: Script length does not match declared length

I have a script that submits processing jobs to a queue. Before I submit the jobs, I assign string variables to each respective data point so I can use them as arguments when I submit the jobs through qsub.
I had to fix up the module I'm loading first by putting in a -v variable to set up my working environment. However, I got the error message that is in the title, and looking around, there are very limited resources for debugging it. One resource I found pointed toward an extraneous space in the qsub command itself as a likely cause. Has anyone run into this?
I also ran echo on my qsub command to make sure it was being input correctly, and it was.
Here's my script:
#!/bin/bash
# This script is for submitting the initial registration subjects for Greedy registration.
# It can serve as a template for later studies when multiple submissions could be handy
# GO_HOME = Origin directory for all niftis of interest
GO_NIFTI="/gpfs/fs001/medorg/comp_space/myname/Test-Retest/Nifti/"
GO_B0="/gpfs/fs001/medorg/comp_space/myname/Test-Retest/Protocols/ants_SyNBaseline/W_Registration_antsSyN_Baseline7/"
GO_FM="/gpfs/fs001/medorg/comp_space/myname/Test-Retest/Protocols/brainmage_batch_t1/"
FINAL_DESTINATION="/gpfs/fs001/cbica/comp_space/wingerti/Test-Retest/Protocols/Registration_greedy_Rigid/"
cd $GO_NIFTI
nii_directories=($(find . -maxdepth 1 -type d \( -name "*t1*" -o -name "*t0*" -o -name "*t2*" \)))
module load greedy
# Will look at these subjects individually, taking them out of the list so they don't run DTI_Preprocess
unset nii_directories[27] # 1000009_t0_test
unset nii_directories[17] # 1000001_t0
unset nii_directories[4] # 1000009_t2
# With directories, navigate into each, and find where the suitable niis are (31dir and 33dir)
for g in "${nii_directories[@]}";
do
# Subject ID argument
subjid=${g:2:9}
echo "$subjid is the subject ID..."
# -i argument (T1 NIFTI File and DTI)
cd $GO_NIFTI
nii_is=$(find $subjid -type f -name ${subjid}_T1.nii.gz)
nii_i=${GO_NIFTI}${nii_is}
cd $GO_B0
cd $subjid
GO_B0_2=$PWD
b0_is=$(find . -type f -name b0.nii.gz)
b0_i=${GO_B0_2}${b0_is}
echo "-i arguments for $subjid is $nii_i and $b0_i"
# -m argument (Mask File)
#cd $GO_DTI
#mask_ms=$(find $subjid -type f -name ${subjid}_tensor_mask.nii.gz)
#mask_m=${GO_DTI}${mask_ms}
#echo "-m argument for $subjid is $mask_m"
# -fm argument (T1 mask)
cd $GO_FM
mask_fms=$(find $subjid -type f -name ${subjid}_t1_brain_mask.nii.gz)
mask_fm=${GO_FM}${mask_fms}
echo "-fm argument for $subjid is $mask_fm"
# -o argument (Working Directory for possible debugging and tmp dir organization among experiments)
cd $FINAL_DESTINATION
g=${FINAL_DESTINATION:73:-1}
experiment_name="${subjid}_${g}"
mkdir $experiment_name
output_o=${FINAL_DESTINATION}${experiment_name}/${experiment_name}_rigid.txt
echo "-o argument for $g is $output_o"
#
printf "\nSubmitting the following command: \n
qsub -m beas -M myname@medschool.edu -N Registration_${experiment_name} "$(which greedy)" -d3 -i $nii_i $b0_i -o $output_o -a -m MI -n 100x100 -fm $mask_fm -dof 6 -ia-identity\n
as JobID: Registration_${experiment_name}\n\n"
qsub -v /medorg/software/external/greedy/centos7/c6dca2e -m beas -M myname@medschool.edu -N Registration_${experiment_name} "$(which greedy)" -d3 -i $nii_i $b0_i -o $output_o -a -m MI -n 100x100 -fm $mask_fm -dof 6 -ia-identity
# --- Above line submits Greedy Rigid jobs (dof 6) with
# --- "-m" for emailing updates on jobs, inbox sorts job submission emails
# --- "-N" names the job for book-keeping
cd $GO_NIFTI
done
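One aside on the submission line above: in SGE/PBS, qsub -v expects comma-separated NAME=VALUE pairs rather than a bare path, so passing the greedy install path alone may not set up the environment as intended. A sketch of the usual form (GREEDY_HOME is a hypothetical variable name):
# qsub -v takes NAME=VALUE pairs; GREEDY_HOME here is only an example name.
qsub -v GREEDY_HOME=/medorg/software/external/greedy/centos7/c6dca2e \
-m beas -M myname@medschool.edu -N "Registration_${experiment_name}" \
"$(which greedy)" -d3 -i "$nii_i" "$b0_i" -o "$output_o" \
-a -m MI -n 100x100 -fm "$mask_fm" -dof 6 -ia-identity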

GhostScript auto pagenumbering

I want to export one certain page from a pdf document to an image and automatically fill the page number in the file name. When I run the following code:
gs \
-sDEVICE=jpeg \
-o outfile-%03d.jpeg \
-dFirstPage=12 \
-dLastPage=12 \
wo.pdf
I get: outfile-001.jpeg instead of outfile-012.jpeg.
I wrote a bash script for the job:
function extract_nth_page(){
# Build a zero-padded output name, then render just that one page at 600 dpi.
printf -v j "outfile-%05g.png" "$1"
echo "$j"
gs -q -dNOPAUSE -sDEVICE=png16m -r600 -dFirstPage="$1" -dLastPage="$1" -sOutputFile="$j" "$2" -c quit
return 0
}
# Extracts page number 42 from myFile.pdf to outfile-00042.png
extract_nth_page 42 myFile.pdf
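Extracting a whole run of pages is then just a loop (the page range below is an arbitrary example):
# Extract pages 10 through 12 from wo.pdf:
for p in {10..12}; do
extract_nth_page "$p" wo.pdf
done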

Jenkins Reload Configuration Removes Changes

Currently using Jenkins at my company, I set up a server for all of our engineers to plug into. In doing so, I made some server-management jobs to make my life a little easier. One of them is a config editor that edits the $JENKINS_HOME/config.xml file and triggers a configuration reload to reflect the new changes.
However, today when I went to use that job, the changes were no longer taking effect, nor were they shown when ssh'ing into the server and cat-ing the config.xml file.
I did some debugging: I made sure the file contents were being replaced correctly, and I even put checks into the build executor, comparing md5 sums, to confirm everything was correct prior to running the reload-configuration command, since my script replaces the entire content. I even added a sleep 15 before the reload so I could cat the config.xml file and confirm my changes were there, and they always were.
However, as soon as the reload command runs, all of my changes are replaced with what the config contents were just before I made my changes (I also confirmed this via md5 sums of the file during my debugging).
Here's the executor of my job if that helps at all:
$CONFIG_FILE is always $JENKINS_HOME/config.xml
#!/bin/bash
set -o pipefail -e -u -x
cp "$CONFIG_FILE" "$WORKSPACE/config_backup.xml"
printf "Creating an AMI profile with these parameters: \n\n\
Config File: | $CONFIG_FILE \n\
AMI ID: | $AMI_ID \n\
Description: | $DESCRIPTION \n\
Instance Type: | $INSTANCE_TYPE \n\
Security Groups: | $SECURITY_GROUPS \n\
Remote Workspace: | $REMOTE_WORKSPACE \n\
Label(s): | $LABELS \n\
Subnet ID: | $SUBNET_ID \n\
IAM Profile: | $IAM_INSTANCE_PROFILE \n\
Instance Tags: | $TAGS \n\
Executors: | $EXECUTORS \
\n\n\
"
new_xml="$(python "$WORKSPACE/<scriptname removed for security reasons>" \
--file $CONFIG_FILE \
--ami $AMI_ID \
--description $DESCRIPTION \
--type $INSTANCE_TYPE \
--security-groups $SECURITY_GROUPS \
--remote-workspace $REMOTE_WORKSPACE \
--labels $LABELS \
--iam-instance-profile $IAM_INSTANCE_PROFILE \
--subnet-id $SUBNET_ID \
--tags $TAGS \
--executors $EXECUTORS)" || true
if [ -z "$new_xml" ]; then
echo "Ran into an error..."
cat "xml_ami_profile_parser.log"
exit 1
fi
echo "setting new config file content..."
echo "$new_xml" > "$CONFIG_FILE"
echo "config file set!"
CONFIG_MD5="$(md5sum "$CONFIG_FILE" | awk '{print $1}')"
NEW_MD5="$(echo "$new_xml" | md5sum | awk '{print $1}')"
printf "comparing MD5 Sums: \n\
[ $CONFIG_MD5 ] \n\
[ $NEW_MD5 ]\n\n"
if [[ "$CONFIG_MD5" != "$NEW_MD5" ]]; then
echo "Config File ($CONFIG_FILE) was not overwritten successfully. Restoring backup..."
cp "$WORKSPACE/config_backup.xml" "$CONFIG_FILE"
exit 1
fi
# use jenkins api user info
USERNAME="$(cat <scriptname removed for security reasons> | awk '{print $8}')"
PASSWORD="$(cat <scriptname removed for security reasons> | awk '{print $9}')"
curl -X POST -u "$USERNAME:$PASSWORD" "<url removed for security reasons>"
sleep 10
NEW_MD5="$(md5sum "$CONFIG_FILE" | awk '{print $1}')"
printf "comparing MD5 Sums: \n\
[ $CONFIG_MD5 ] \n\
[ $NEW_MD5 ]\n\n"
if [[ "$CONFIG_MD5" != "$NEW_MD5" ]]; then
echo "Config file reverted after reload, marking build as error."
exit 1
fi
Any help at all is greatly appreciated!
EDIT:
Here's the output I keep getting now and can't get past:
setting new config file content...
config file set!
comparing MD5 Sums:
[ 58473de6acbb48b2e273e3395e64ed0f ]
[ 58473de6acbb48b2e273e3395e64ed0f ]
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
comparing MD5 Sums:
[ 58473de6acbb48b2e273e3395e64ed0f ]
[ f521cec2a2e376921995f773522f78e1 ]
Config file reverted after reload, marking build as error.
Build step 'Execute shell' marked build as failure
Finished: FAILURE
For everyone coming to this later: I solved my own problem. Jenkins has its own failsafe to preserve uptime, but it doesn't give you any notice when it kicks in. If you replace config.xml with something that a plugin can't parse correctly (in my case the Amazon EC2 Plugin), the plugin tells Jenkins that the config file is bad, and Jenkins reverts to the last good XML it was using (usually the one it has in memory).
If this happens to you, double-check that you aren't using special characters.
The offending code in mine was an output of the tags section containing HTML-entity-converted quotation marks (&quot; turned into literal " characters), which the plugin couldn't parse. It was solely the difference between:
<tags>
<hudson.plugins.ec2.EC2Tag>
<name>"Email</name>
<value><removed for security reasons>"</value>
</hudson.plugins.ec2.EC2Tag>
<hudson.plugins.ec2.EC2Tag>
<name>"Name</name>
<value><removed for security reasons>"</value>
</hudson.plugins.ec2.EC2Tag>
</tags>
and
<tags>
<hudson.plugins.ec2.EC2Tag>
<name>Email</name>
<value><removed for security reasons></value>
</hudson.plugins.ec2.EC2Tag>
<hudson.plugins.ec2.EC2Tag>
<name>Name</name>
<value><removed for security reasons></value>
</hudson.plugins.ec2.EC2Tag>
</tags>
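In hindsight, a cheap guard before triggering the reload would have caught this. The following is only a sketch, not part of the original job, and it assumes xmllint is available on the server:
# Hypothetical pre-reload guard (not in the original job):
# xmllint --noout exits non-zero if the XML is not well-formed,
# and the grep catches stray quotes like the ones shown above.
if ! xmllint --noout "$CONFIG_FILE"; then
echo "Malformed XML, restoring backup..." >&2
cp "$WORKSPACE/config_backup.xml" "$CONFIG_FILE"
exit 1
fi
if grep -q '<name>"' "$CONFIG_FILE"; then
echo "HTML-escaped quotes leaked into $CONFIG_FILE, aborting reload." >&2
exit 1
fi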
