Creating variables in running script - bash

I'm trying to convert some files to read-only in a backup environment. Data Domain has a retention-lock feature that can lock files via an external trigger, namely touch -a -t "dateuntillocked" /backup/foo.
In this situation there are also metadata files in the folder that should not be locked, otherwise the next backup job cannot update them and fails.
I extracted the metadata file names, but the file count can change. For example:
foo1.meta foo2.meta . . fooN.meta
Is it possible to create a variable for each entry and add it to the command dynamically?
Like:
var1=/backup/foo234.meta
var2=/backup/foo322.meta
.
.
varN=/backup/fooNNN.meta
<find command> | grep -v $var1 $var2....varN | while read line; do touch -a -t "$dateuntillocked" "$line"; done
another elaboration of the case is
Another elaboration of the case: suppose you run ls in a folder, but the number of files differs over time. The script should create a variable for every file and use them in a touch command within a while loop. If there are 3 files in the folder, the script creates 3 variables and uses those 3 variables with touch in the while loop; if ls finds 4 files, the script dynamically creates 4 variables for the files and uses them all in the while loop, and so on. I am not a programmer, so my logic may differ; maybe there is an easier way to do this.

Just guessing what your intentions might be.
You can combine find | grep | command into a single command:
find /backup -name 'foo*.meta' -exec touch -a -t "$dateuntillocked" {} +
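Since the question says the metadata files must not be locked, a variant that locks everything except them (assuming they all end in .meta) could look like this:
find /backup -type f ! -name '*.meta' -exec touch -a -t "$dateuntillocked" {} +
The ! -name '*.meta' test replaces the per-file variables and the grep -v step entirely.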

Related

Bash script to check if a new file has been created on a directory after run a command

Using a bash script, I'm trying to detect whether a file has been created in a directory while running commands. Let me illustrate the problem:
#!/bin/bash
# give base directory to watch file changes
WATCH_DIR=./tmp
# get list of files on that directory
FILES_BEFORE= ls $WATCH_DIR
# actually a command is running here but lets assume I've created a new file there.
echo >$WATCH_DIR/filename
# and I'm getting new list of files.
FILES_AFTER= ls $WATCH_DIR
# detect changes and if any changes has been occurred exit the program.
After that I tried to compare FILES_BEFORE and FILES_AFTER, but I couldn't accomplish it. I've tried:
comm -23 <($FILES_AFTER |sort) <($FILES_BEFORE|sort)
diff $FILES_AFTER $FILES_BEFORE > /dev/null 2>&1
cat $FILES_AFTER $FILES_BEFORE | sort | uniq -u
None of them gave me a result I could use to tell whether there was a change or not. What I need is to detect the change and exit the program if there is one. I am not really good at bash scripting; I searched a lot on the internet but couldn't find what I need. Any help will be appreciated. Thanks.
Thanks to the informative comments, I've realized that I had missed the basics of bash scripting, but I finally made it work. I'll leave my solution here as an answer for those who struggle like me:
WATCH_DIR=./tmp
FILES_BEFORE=$(ls $WATCH_DIR)
echo >$WATCH_DIR/filename
FILES_AFTER=$(ls $WATCH_DIR)
if diff <(echo "$FILES_AFTER") <(echo "$FILES_BEFORE")
then
    echo "No changes"
else
    echo "Changes"
fi
It outputs "Changes" on the first run and "No changes" on later runs, unless you delete the newly added file.
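Since the question also asks to exit when a change is detected, a small variant of the above (a sketch, not part of the original answer) silences diff's own output and exits with a non-zero status on changes:
if diff <(echo "$FILES_AFTER") <(echo "$FILES_BEFORE") > /dev/null
then
    echo "No changes"
else
    echo "Changes"
    exit 1
fi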
I'm trying to interpret your script (which contains some errors) into an understanding of your requirements.
I think the simplest way is simply to redirect the ls command output to named files and then diff those files:
#!/bin/bash
# give base directory to watch file changes
WATCH_DIR=./tmp
# get list of files on that directory
ls $WATCH_DIR > /tmp/watch_dir.before
# actually a command is running here but lets assume I've created a new file there.
echo >$WATCH_DIR/filename
# and I'm getting new list of files.
ls $WATCH_DIR > /tmp/watch_dir.after
# detect changes and if any changes has been occurred exit the program.
diff -c /tmp/watch_dir.after /tmp/watch_dir.before
If any files are modified by the 'commands', i.e. the file already exists in the 'before' list but its contents change, the above will not show that as a difference.
In this case you might be better off using a 'marker' file created to mark the instant the monitoring started, then using the find command to list any files newer than (modified since) the marker file. Something like this:
#!/bin/bash
# give base directory to watch file changes
WATCH_DIR=./tmp
# record the current listing; this file also serves as the timestamp marker
ls $WATCH_DIR > /tmp/watch_dir.before
# actually a command is running here but lets assume I've created a new file there.
echo >$WATCH_DIR/filename
# list anything modified since the marker file was written
find $WATCH_DIR -type f -newer /tmp/watch_dir.before -exec ls -l {} \;
What this won't do is show any files that were deleted, so perhaps a hybrid list could be used.
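A possible hybrid, sketched here on the assumption that both listings are kept: diff the before/after listings to catch added and deleted names, and use find -newer against the 'before' file to catch files changed in place.
#!/bin/bash
WATCH_DIR=./tmp
ls $WATCH_DIR > /tmp/watch_dir.before
# the monitored command runs here
echo >$WATCH_DIR/filename
ls $WATCH_DIR > /tmp/watch_dir.after
# added or removed names
diff /tmp/watch_dir.after /tmp/watch_dir.before
# files changed in place since the 'before' listing was written
find $WATCH_DIR -type f -newer /tmp/watch_dir.before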
Here is how I got it to work. It's also set up so that you can have multiple watched directories with the same script via cron.
For example, if you wanted one to run every minute:
* * * * * /usr/local/bin/watchdir.sh /makepdf
and one every hour:
0 * * * * /user/local/bin/watchdir.sh /incoming
#!/bin/bash
WATCHDIR="$1"
NEWFILESNAME=.newfiles$(basename "$WATCHDIR")
if [ ! -f "$WATCHDIR"/.oldfiles ]
then
    ls -A "$WATCHDIR" > "$WATCHDIR"/.oldfiles
fi
ls -A "$WATCHDIR" > "$NEWFILESNAME"
# diff lines look like "> filename", so field 2 (splitting on the space) is the name
DIRDIFF=$(diff "$WATCHDIR"/.oldfiles "$NEWFILESNAME" | cut -f 2 -d " ")
for file in $DIRDIFF
do
    if [ -e "$WATCHDIR"/"$file" ]; then
        #do what you want to the file(s) here
        echo "$file"
    fi
done
rm "$NEWFILESNAME"

Is there a way to find all the files that got created at a specific time?

I want to make a bash script in Unix where the user gives a number between 1 and 24. Then it has to scan every file in the directory the user is in and find all the files that were created in the hour given by that number.
I know that Unix won't store the birth time for most files. So I found that each file has a crtime, which you can find with this command: debugfs -R 'stat /path/to/file' /dev/sda2
The problem is that I have to know every crtime so I can search them by the hour.
Thanks in advance, and sorry for the complicated explanation and bad English.
Use a loop to execute stat for each file, then use grep -q to check whether the crtime matches the time given by the user. Since paths in debugfs -R 'stat path' do not necessarily correspond to paths on your system and there might be quoting issues, we use inode numbers instead.
#! /usr/bin/env bash
hour="$1"
for f in ./*; do
    debugfs -R "stat <$(stat -c %i "$f")>" /dev/sda2 2> /dev/null |
        grep -Eq "^crtime: .* 0?$hour:" &&
        echo "$f"
done
The above script assumes that your working directory is on an ext4 file system on /dev/sda2. Because of debugfs you have to run the script as the superuser (for instance by using sudo). Depending on your system there might be an alternative to debugfs which can be run as a regular user.
Example usage:
Print all files and directories which were created between 8:00:00 am and 8:59:59 am.
$ ./scriptFromAbove 8
./some file
./another file
./directories are matched too
If you want to exclude directories you can add [ -f "$f" ] && in front of debugfs.
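For example, the loop body would then read (same script as above, just with the test added):
for f in ./*; do
    [ -f "$f" ] &&
        debugfs -R "stat <$(stat -c %i "$f")>" /dev/sda2 2> /dev/null |
        grep -Eq "^crtime: .* 0?$hour:" &&
        echo "$f"
done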
The OP has already identified the issue that Unix system calls do not retrieve the file creation time, just the modification times (mtime or ctime). Assuming this is an acceptable substitution, an EFFICIENT way to find all the files in the current directory whose time falls in a specific hour is to leverage ls and an awk filter.
#!/bin/bash
# printf -v is a bashism, so run this with bash rather than plain sh
printf -v hh '%02d' "$1"
# with --time-style=long-iso, field 7 is the HH:MM time; the unary + turns "08:15" into 8
ls -l --time-style=long-iso | awk -v hh="$hh" '+$7 == hh'
While this solution does not access the actual creation time, it does not require superuser access (which is usually needed for debugfs).

How to make folders for individual files within a directory via bash script?

So I've got a movie collection that's dumped into a single folder (I know, bad practice in retrospect.) I want to organize things a bit so I can use Radarr to grab all the appropriate metadata, but I need all the individual files in their own folders. I created the script below to try and automate the process a bit, but I get the following error.
Script
#! /bin/bash
for f in /the/path/to/files/* ;
do
    [[ -d $f ]] && continue
    mkdir "${f%.*}"
    mv "$f" "${f%.*}"
done
EDIT
So I've now run the script through Shellcheck.net per the suggestion of Benjamin W. It doesn't throw any errors according to the site, though I still get the same errors when I try running the command.
EDIT 2
No errors now, but the script does nothing when executed.
Assignments are evaluated only once, and not whenever the variable being assigned to is used, which I think is what your script assumes.
You could use a loop like this:
for f in /path/to/all/the/movie/files/*; do
    mkdir "${f%.*}"
    mv "$f" "${f%.*}"
done
This uses parameter expansion instead of cut to get rid of the file extension.
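For instance (a hypothetical file name, just to illustrate the expansion):
f='/path/to/all/the/movie/files/Some Movie (2001).mkv'
echo "${f%.*}"    # -> /path/to/all/the/movie/files/Some Movie (2001)
${f%.*} strips the shortest trailing .* pattern, i.e. the extension, so the same expression names both the directory to create and the mv destination.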

How to iterate over files in many folders

I have 15 folders and each folder contains a *.gz file. I would like to use that file with one of the packages to do some filtering.
For this I would like to write something that can open each folder, read that specific file, do the actions mentioned, and then save the results in the same folder with a different extension.
What I did is (PBS script):
#!/bin/bash
#PBS -N Trimmomatics_filtering
#PBS -l nodes=1:ppn=8
#PBS -l walltime=04:00:00
#PBS -l vmem=23gb
#PBS -q ext_chem_guest
# Go to the Trimmomatics directory
cd /home/tb44227/bioinfo_packages/Trimmomatic/Trimmomatic-0.36
# Java module load
module load java/1.8.0-162
# Input File (I have a list of 15 folders and each contained fastq.gz file)
**inputFile= for f in /home/tb44227/nobackup/small_RNAseq_260917/support.igatech.it/sequences-export/536-RNA-seq_Disco_TuDO/delivery_25092017/754_{1..15}/*fastq.gz; $f**
# Start the code to filter the file and save the results in the same folder where the input file is
java -jar trimmomatic-0.36.jar SE -threads ${PBS_NUM_PPN} -phred33 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:17 $inputFile $outputFile
# Output File
outputFile=$inputFile{.TRIMMIMG}
My question is: how could I define $inputFile and $outputFile so that it works for all the 15 files?
Thanks
If your application only processes a single input file at a time, you have two options:
Process all files in one single job
Process each file in a different job
From the user's perspective you are usually more interested in the second option, as multiple jobs may run simultaneously if there are resources available. However, this depends on the number of files you need to process and your system's usage policy, as sending too many jobs in a short amount of time can cause problems in the job scheduler.
The first option is, more or less, what you already have. You can use the find program and a simple bash loop: you basically store the find output in a variable and then iterate over it, as in this example:
#!/bin/bash
# PBS job parameters
module load java
root_dir=/home/tb44227/nobackup/small_RNAseq_260917/support.igatech.it/sequences-export/536-RNA-seq_Disco_TuDO/delivery_25092017
# Get all files to be processed
files=$(find $root_dir -type f -name "*fastq.gz")
for inputfile in $files; do
    # derive the output name from the input name
    outputfile="${inputfile}.TRIMMING"
    # Process one file at a time
    java -jar ... "$inputfile" "$outputfile"
done
Then, you just submit your job script, which will generate a single job.
$ qsub myjobscript.sh
The second option is more powerful, but requires you to change the job script for each file. Most job managers let you pass the job script on standard input. This is really helpful because it avoids generating intermediate files, which would pollute your directories.
#!/bin/bash
function submit_job() {
# Submit job. Jobscript passed through standard input using a HEREDOC.
# Must define $inputfile and $outputfile before calling the function.
qsub - <<- EOF
# PBS job parameters
module load java
# Process a single file only
java -jar ... $inputfile $outputfile
EOF
}
root_dir=/home/tb44227/nobackup/small_RNAseq_260917/support.igatech.it/sequences-export/536-RNA-seq_Disco_TuDO/delivery_25092017
# Get all files to be processed
files=$(find $root_dir -type f -name "*fastq.gz")
for inputfile in $files; do
    outputfile="${inputfile}.TRIMMING"
    submit_job
done
Since you are calling qsub inside the script, you just need to call the script itself, like any regular shell script file.
$ bash multijobscript.sh

shell create new folder

I have many file paths, and I need to copy all the files into another location, /sample, putting them into different folders:
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_2.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info
I want to copy those files into the AS34_59329 folder inside /sample.
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_2.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info
I want to copy those files into the AS34_59328 folder inside /sample.
I wrote code to scp all the files into the /sample folder, but I don't know how to put each file into a different sub-directory, like:
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info
put into AS34_59328
#! /bin/bash
while read myline
do
    for i in $myline
    do
        if [ -f $i ]; then
            #how to put different files into different sub-directory
            scp -r $i xxx#191.168.174.43:/sample
        fi
    done
done < data.list
new changed part
#! /bin/bash
while read myline
do
    for i in $myline
    do
        if [ -f $i ]
        then
            relname=$(echo $i | sed 's%\(/[^/][^/]*\)\{5\}/%%')
            echo $relname
        fi
    done
done < /home/jesse/T11073_all_3254.fq.list
It appears you need to strip the leading 5 components of the pathname off the filename. Since you don't have spaces in your names (the way you're using for i in $myline precludes that possibility), you can use:
#! /bin/bash
while read myline
do
    for i in $myline
    do
        if [ -f $i ]
        then
            relname=$(echo $i | sed 's%\(/[^/][^/]*\)\{5\}/%%')
            scp -r $i xxx#191.168.174.43:/sample/$relname
        fi
    done
done < data.list
The regex is just a way of looking for a sequence of five sets of slash followed by one or more non-slashes plus one more slash and deleting them. Since slashes figure prominently in the search, I used % to mark the sections of the s/// operation instead.
For example, given the input:
/a/b/c/d/e/f/g
the output from the sed is:
f/g
Note that this code does not explicitly create directories on the remote machine; it just specifies where the file is to go. If you need to create them too, you will have to investigate ssh, probably, to run mkdir -p /sample/$(dirname $relname) on the remote machine (where the dirname operation can be run either locally or remotely).
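A sketch of how that could slot into the loop above (the ssh call is an assumption about how the remote directory would be created; dirname is run locally here):
relname=$(echo $i | sed 's%\(/[^/][^/]*\)\{5\}/%%')
ssh xxx#191.168.174.43 "mkdir -p /sample/$(dirname $relname)"
scp -r $i xxx#191.168.174.43:/sample/$relname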
Note that scp has a recursive copy mode (-r) which would simplify things considerably if you knew you needed to copy all the files from the local directory to the remote.
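For example, to push one of the sample directories wholesale (using a path from the question):
scp -r /ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329 xxx#191.168.174.43:/sample/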
