How to create a bash script to make directories and specific files inside each directory - bash

I wrote a bash script trying to generate one directory named after each file inside the directory from which I run the script.
Original directory = /home/agalvez/data/sims/phylip_format
sim1.phylip
sim2.phylip
Directories to create = sim1 sim2
The contents of these new directories should be a copy of the original file that names the new directory and an extra file called "input". This file should contain the name of the .phylip file as well as the following:
"Name of original file"
U
5
Y
/home/agalvez/data/sims/trees/tree_nodenames.txt
After that I want to run the following command (sequentially) in all these new directories:
phylip dollop < input > screenout
My approach is the following one but it is not working:
!/bin/bash
for f in *.phylip;
mkdir /home/agalvez/data/sims/dollop/$f;
cp $f /home/agalvez/data/sims/dollop/$f;
cd /home/agalvez/data/sims/dollop/$f;
echo "$f" | cat > input;
echo "U" | cat >> input;
echo "5" | cat >> input;
echo "Y" | cat >> input;
echo "/home/agalvez/data/sims/trees/tree_nodenames.txt" | cat >> input;
phylip dollop < input > screenout;
;done
Edit: The error message looks like this:
line 4: syntax error near unexpected token `mkdir'
line 4: ` mkdir /home/agalvez/data/sims/dollop/$f;'
FINAL SOLUTION:
#!/bin/bash
for f in *.phylip;
do
mkdir /home/agalvez/data/sims/dollop/$f;
cp /home/agalvez/data/sims/phylip_format/$f /home/agalvez/data/sims/dollop/$f;
cd /home/agalvez/data/sims/dollop/$f;
echo "$f" | cat > input;
echo "U" | cat >> input;
echo "5" | cat >> input;
echo "Y" | cat >> input;
echo "/home/agalvez/data/sims/trees/tree_nodenames.txt" | cat >> input;
phylip dollop < input > screenout;
done

The immediate problem is that you are lacking a do at the beginning of the loop body; but you'll want to refactor this code to avoid hardcoding the directory structure etc.
The first line needs to start with literally the two characters # and ! in order to be a valid shebang.
See also "When to wrap quotes around a shell variable?"
The printf could be replaced with a here document; I like the compactness of printf here.
#!/bin/bash
for f in *.phylip; do
mkdir -p dollop/"$f"
cp "$f" dollop/"$f"
# Run in a subshell so the cd does not carry over to the next iteration
(
cd dollop/"$f" || exit
printf "%s\n" "$f" "U" "5" "Y" \
"/home/agalvez/data/sims/trees/tree_nodenames.txt" |
phylip dollop > screenout
)
done
Going forward, try http://shellcheck.net/ for diagnosing many common beginner problems in shell scripts.
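The here-document alternative mentioned above might look like the following sketch. The helper name write_dollop_input is hypothetical, and the answers U, 5, Y and the tree path are simply the ones from the question; nothing here has been checked against PHYLIP itself.

```shell
#!/bin/bash
# Hypothetical helper: write the answers that dollop reads from stdin
# into a file named "input" in the current directory.
# $1 = alignment file name, $2 = path to the tree file
write_dollop_input() {
    cat > input <<EOF
$1
U
5
Y
$2
EOF
}
```

Inside the loop, after changing into the new directory, one would call write_dollop_input "$f" /home/agalvez/data/sims/trees/tree_nodenames.txt and then run phylip dollop < input > screenout as in the question.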

Assuming you have a directory named pingping in your ${HOME} folder containing the files 1.txt, 2.txt, and 3.txt, you can accomplish that like this. Modify the code to suit your needs.
#! /bin/bash
working_directory="${HOME}/pingping/"
cd "$working_directory" || exit 1
for f in *.txt
do
mkdir "${f%%.*}"
if [ -f "${f%%.*}.txt" ]
then
if [ -d "${f%%.*}" ]
then
cp "${f%%.*}.txt" "${f%%.*}"
echo "Done copying"
#phylip dollop < input > screenout
#echo "Successfully ran the command"
fi
else
echo "not found"
fi
done
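The ${f%%.*} expansion used above strips everything from the first dot onward. A small sketch of how it differs from the single-% form, using made-up file names (sim1.tar.gz is purely illustrative):

```shell
#!/bin/bash
f="sim1.tar.gz"      # hypothetical name with two dots
echo "${f%.*}"       # removes the shortest suffix matching .*
echo "${f%%.*}"      # removes the longest suffix matching .*

g="sim1.phylip"      # name from the original question
echo "${g%.phylip}"  # removes one specific extension
```

For names with a single dot, such as sim1.phylip, both forms give the same result, so either would produce the sim1 directory name the question asks for.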

Related

Move files from directories listed in file

I have a directory structure like the following toy example
DirectoryTo
DirectoryFrom
-Dir1
---File1.txt
---File2.txt
---File3.txt
-Dir2
---File4.txt
---File5.txt
---File6.txt
-Dir3
---File1.txt
---File5.txt
---File7.txt
I'm trying to copy all the files from DirectoryFrom to DirectoryTo, keeping the newer file if there are duplicates.
DirectoryTo
-File1.txt
-File2.txt
-File3.txt
-File4.txt
-File5.txt
-File6.txt
-File7.txt
DirectoryFrom
-Dir1
---File1.txt
---File2.txt
---File3.txt
-Dir2
---File4.txt
---File5.txt
---File6.txt
-Dir3
---File1.txt
---File5.txt
---File7.txt
I've created a text file with a list of all the subdirectories. This list is in the order such that the NEWEST files will be listed first:
Filelist.txt
C:/DirectoryFrom/Dir1
C:/DirectoryFrom/Dir2
C:/DirectoryFrom/Dir3
So what I'd like to do is loop through each directory in Filelist.txt, copy the files, and NOT replace if the file already exists.
I'd like to do this at the command line, in a shell script, or possibly in Python. I'm pretty new to Python, but have a little experience with the command line. However, I've never done something this complicated.
In reality, I have ~60 folders, each with 50-200 files in them, to give you a feel for how many I have. Also, each file is ~75MB.
I've done something similar in R before, but it's slow and not really meant for this. But here's what I've tried for a shell script, edited to fit this toy example:
#!/bin/bash
for line in Filelist.txt
do
cp -n line C:/DirectoryTo/
done
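The attempt above loops over the literal word Filelist.txt and tries to copy a file literally named line. A minimal sketch of what the loop seems intended to do follows; the function name copy_listed_dirs is hypothetical, and the sketch uses an explicit existence test rather than cp -n, so it does not depend on how a particular cp reports skipped files.

```shell
#!/bin/bash
# copy_listed_dirs LIST DEST
# LIST holds one source directory per line, newest first.
# A file is copied only if DEST has no entry of that name yet.
copy_listed_dirs() {
    local list=$1 dest=$2 dir file
    mkdir -p "$dest"
    while IFS= read -r dir || [ -n "$dir" ]; do
        [ -d "$dir" ] || continue
        for file in "$dir"/*; do
            [ -f "$file" ] || continue
            # ${file##*/} is the file name without its directory part
            [ -e "$dest/${file##*/}" ] || cp -p -- "$file" "$dest/"
        done
    done < "$list"
}
```

With the toy example it would be called as copy_listed_dirs Filelist.txt C:/DirectoryTo. Because the directories are read newest first, the first copy of each file name wins and later duplicates are skipped.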
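If you have only one directory level in your DirectoryFrom, you can use:
cp -n DirectoryFrom/*/* DirectoryTo
Explanation: this copies every file in the subdirectories of DirectoryFrom to DirectoryTo unless a file of that name already exists there.
The -n flag tells cp not to overwrite files that already exist.
cp also skips any directories it finds inside those subdirectories, because -r is not given.
The glob expands in sorted order, so this keeps the newest copy only if that order matches the newest-first order of Filelist.txt, as it does in the toy example.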
# Create test environment:
mkdir C:/DirectoryTo
mkdir C:/DirectoryFrom
cd C:/DirectoryFrom
mkdir Dir1 Dir2 Dir3
(
cat << EOF
Dir1/File1.txt
Dir1/File2.txt
Dir1/File3.txt
Dir2/File4.txt
Dir2/File5.txt
Dir2/File6.txt
Dir3/File1.txt
Dir3/File5.txt
Dir3/File7.txt
EOF
)| while read f
do
echo "$f : `date`"
echo "$f : `date`" > $f
sleep 1
done
# create Filelist.txt file :
(
cat << EOF
C:/DirectoryFrom/Dir1
C:/DirectoryFrom/Dir2
C:/DirectoryFrom/Dir3
EOF
) > Filelist.txt
# Generate the list of all file names:
cd C:/DirectoryFrom
cat Filelist.txt | while read f; do ls -1 $f; done | sort -u > filenames.txt
cat filenames.txt
# List of all file paths, sorted by modification time (oldest first):
cd C:/DirectoryFrom
ls -1tr */* > all_filespath_sorted.txt
cat all_filespath_sorted.txt
# selected files to be copied :
cat filenames.txt | while read f; do cat all_filespath_sorted.txt | grep $f | tail -1 ; done
# copy of selected files:
cat filenames.txt | while read f; do cat all_filespath_sorted.txt | grep $f | tail -1 ; done | while read c
do
echo $c
cp -p $c C:/DirectoryTo
done
# verifying :
cd C:/DirectoryTo
ls -ltr
# or
ls -1 | while read f; do echo -e "\n$f\n-------"; cat $f; done
#------------------------------------------------
# Other solution for a limited number of files :
#------------------------------------------------
# To list files by order :
find `cat Filelist.txt | xargs` -type f | xargs ls -1tr
# To copy files, the newer will replace the older :
find `cat Filelist.txt | xargs` -type f | xargs ls -1tr | while read c
do
echo $c
cp -p $c C:/DirectoryTo
done

Adapting a bash script to make a Nautilus-Actions script

I have made (with my little knowledge of Bash and an extensive use of a search engine) a Bash script to reorder the pages of a big PDF file:
#!/bin/bash
file=originalfile.pdf;
newfile=$(basename $file .pdf)-2.pdf;
tmpfile=$(mktemp --suffix=.pdf);
blankfile=$(mktemp --suffix=.pdf);
cp -f $file $newfile;
cp -f $file $tmpfile;
numberofpages=`pdftk $file dump_data | grep "NumberOfPages" | sed 's:.*\([0-9][0-9*]\).*:\1:'`;
echo "" | ps2pdf -sPAPERSIZE=a4 - $blankfile;
while (( $numberofpages % 4 != 0 ));
do
((numberofpages++));
pdftk A=$newfile B=$blankfile cat A B output $tmpfile;
cp -f $tmpfile $newfile;
done;
neworder=`
for (( a=1, b=3, c=4, d=2 ;
a <=numberofpages ;
((a+=4)), ((b+=4)), ((c+=4)), ((d+=4))
));
do
echo -n "$a $b $c $d ";
done`;
pdftk $tmpfile cat $neworder output $newfile;
I wanted to make a Nautilus-Actions script out of it so it could be "installed" and used by a regular user. By regular user, I mean someone unable to type any command-line and unable to follow a few steps to copy the script at a specified place.
Unfortunately the script didn't work and I came up with this new script thanks to the help of people commenting below:
#!/bin/bash
file=originalfile.pdf;
newfile=$(basename $file .pdf)-2.pdf;
tmpfile=$(mktemp --suffix=.pdf);
blankfile=$(mktemp --suffix=.pdf);
cp -f $file $newfile;
cp -f $file $tmpfile;
numberofpages=`pdftk $file dump_data | grep "NumberOfPages" | sed 's:.*\([0-9][0-9*]\).*:\1:'`;
echo "" | ps2pdf -sPAPERSIZE=a4 - $blankfile;
while (( $numberofpages % 4 != 0 )); # NOTE: replace % by %% in Nautilus-Actions
do
((numberofpages++));
pdftk A=$newfile B=$blankfile cat A B output $tmpfile;
cp -f $tmpfile $newfile;
done;
a=0;
neworder=$(
while [ $a -lt $numberofpages ];
do
echo -n "$(($a + 1)) $(($a + 3)) $(($a + 4)) $(($a + 2)) ";
((a+=4));
done;
);
pdftk $tmpfile cat $neworder output $newfile;
I pasted everything into the Path entry of Nautilus-Actions and it finally worked. The newly created Nautilus-Actions action could then be exported as a .desktop file (and therefore imported very easily by any user).
If I ask Nautilus-Actions to display the output, it seems that Nautilus-Actions executes the command line inside a /bin/sh -c 'myscript...' command.
Could you explain why I had to change so many things to make it work? In particular, why did I have to change the for into a while?
Note: I completely revamped the question since it was a mess.

create and rename multiple copies of files

I have a file input.txt that looks as follows.
abas_1.txt
abas_2.txt
abas_3.txt
1fgh.txt
3ghl_1.txt
3ghl_2.txt
I have a folder ff. The filenames of this folder are abas.txt, 1fgh.txt, 3ghl.txt. Based on the input file, I would like to create and rename the multiple copies in ff folder.
For example in the input file, abas has three copies. In the ff folder, I need to create the three copies of abas.txt and rename it as abas_1.txt, abas_2.txt, abas_3.txt. No need to copy and rename 1fgh.txt in ff folder.
Your valuable suggestions would be appreciated.
You can try something like this (to be run from within your folder ff):
#!/bin/bash
while IFS= read -r fn; do
[[ $fn =~ ^(.+)_[[:digit:]]+\.([^\.]+)$ ]] || continue
fn_orig=${BASH_REMATCH[1]}.${BASH_REMATCH[2]}
echo cp -nv -- "$fn_orig" "$fn"
done < input.txt
Remove the echo if you're happy with it.
If you don't want to run from within the folder ff, just replace the line
echo cp -nv -- "$fn_orig" "$fn"
with
echo cp -nv -- "ff/$fn_orig" "ff/$fn"
The -n option tells cp not to overwrite existing files, and the -v option makes it verbose. The -- tells cp that there are no more options beyond this point, so that it will not be confused if one of the file names starts with a hyphen.
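To see what the regular expression in that loop does before copying anything, the match can be tried on the sample names from the question. This is only a sketch; the helper name orig_name is hypothetical and not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: print the source file name for a numbered copy,
# e.g. abas_1.txt -> abas.txt. Returns non-zero for names that do not
# end in _<digits>.<extension>.
orig_name() {
    local re='^(.+)_[[:digit:]]+\.([^.]+)$'
    [[ $1 =~ $re ]] || return 1
    printf '%s.%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
}

for fn in abas_1.txt 1fgh.txt 3ghl_2.txt; do
    if src=$(orig_name "$fn"); then
        echo "$fn would be copied from $src"
    else
        echo "$fn is skipped"
    fi
done
```

Here abas_1.txt and 3ghl_2.txt map back to abas.txt and 3ghl.txt, while 1fgh.txt has no numeric suffix and is skipped, which is the behavior the question asks for.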
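Using for and grep (run from within ff, with the list from the question in input.txt):
#!/bin/bash
for i in *
do
# File name without its extension, followed by an underscore
x="${i%.*}_"
for j in $(grep -- "^$x" input.txt)
do
cp -n -- "$i" "$j"
done
done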
Try this one
#!/bin/bash
while IFS= read -r newFileName; do
#split the string by _ delimiter
arr=(${newFileName//_/ })
extension="${newFileName##*.}"
fileToCopy="${arr[0]}.$extension"
#check for empty : '1fgh.txt' case
if [ -n "${arr[1]}" ]; then
#check if file exists
if [ -f "$fileToCopy" ]; then
echo "copying $fileToCopy -> $newFileName"
cp "$fileToCopy" "$newFileName"
#else
# echo "File $fileToCopy does not exist, so it can't be copied"
fi
fi
done
You can call your script like this:
cat input.txt | ./script.sh
If you could change the format of input.txt, I suggest you adjust it in order to make your task easier. If not, here is my solution:
#!/bin/bash
SRC_DIR=/path/to/ff
INPUT=/path/to/input.txt
BACKUP_DIR=/path/to/backup
for cand in "$SRC_DIR"/*; do
# Keep only the file name, without the directory part
cand=${cand##*/}
grep "^${cand%.*}_" "$INPUT" | while IFS= read -r new
do
cp -fv "$SRC_DIR/$cand" "$BACKUP_DIR/$new"
done
done

shell get string

I have some lines that all have the same structure, like
1000 AS34_59329 RICwdsRSYHSD11-2-IPAAPEK-93 /ifshk5/BC_IP/PROJECT/T1
1073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IP
AAPEK-93_1.fq.gz /ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_5932
9/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_2.fq.gz /ifshk5/
BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/clean_111220_I631_FCC0E5EACXX_
L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info 11.824 0.981393
43.8283 95.7401 OK
I want to extract the second field (AS34_59329 here, shown in bold in the original post) and check whether /home/jesse/ already contains a folder with that name; if not, create it with mkdir /home/jesse/AS34_59329
I use this code
! /bin/bash
myPath="/home/jesse/"
while read myline
do
dirname= echo "$myline" | awk -F ' ' '{print $2}'
echo $dirname
myPath= $myPath$dirname
echo $myPath
mkdir -p "$myPath"
done < T11073_all_3254.fq.list
But it neither creates the directory nor shows the path name. Instead it prints
-bash: /home/jesse/: is a directory
/home/jesse/
AS39_59324
read can read each field into a separate variable, and mkdir -p will create a dir only if it doesn't exist:
path="/home/jesse"
while read _ dir _
do
mkdir -p "$path/$dir"
done < T11073_all_3254.fq.list
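As for why the original attempt failed: in dirname= echo ..., the space after = makes the shell run echo with dirname set to an empty string, so nothing is stored, and myPath= $myPath$dirname then tries to execute /home/jesse/ as a command, which produces the "is a directory" error. A sketch of the assignment with command substitution, using a shortened sample line and leaving the mkdir commented out:

```shell
#!/bin/bash
base="/home/jesse"   # base directory from the question
line='1000 AS34_59329 RICwdsRSYHSD11-2-IPAAPEK-93'   # shortened sample line

# No spaces around "=", and $( ) captures the command's output
dir=$(printf '%s\n' "$line" | awk '{print $2}')
target="$base/$dir"
echo "$target"
# mkdir -p "$target"   # uncomment to actually create the directory
```

Building a fresh target variable on each pass also avoids the second problem in the original loop, where myPath would have kept growing with every line read.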
for will iterate over each whitespace separated token. Try this instead.
#!/usr/bin/env bash
# Invoke with first arg as file containing the lines
# foo.sh <input_filename>
for i in $(cut -d " " -f2 "$1")
do
if [ -d "/home/jesse/$i" ]
then
echo "Directory /home/jesse/$i exists"
else
mkdir "/home/jesse/$i"
echo "Directory /home/jesse/$i created"
fi
done

Redirect grep output to file

I am not sure as to why that redirection provided in the code does not work. Every time I run the script, the output file is always empty. Does anyone have an idea on that?
Thanks.
#!/bin/sh
LOOK_FOR="DefaultProblem"
FILES=`ls plugins/*source*.jar`
for i in $FILES
do
# echo "Looking in $i ..."
unzip -p $i | grep -i $LOOK_FOR > output #> /dev/null
if [ $? == 0 ]
then
echo ">>>> Found $LOOK_FOR in $i <<<<"
fi
done
You may want to use >> (append) instead of > (overwrite) for redirection as:
unzip -p $i | grep -iF "$LOOK_FOR" >> output
Since you're executing this command in a loop and overwriting the output file every time, it may be empty at the end if the very last grep finds no matching line in the unzip output.
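The effect can be reproduced without any jar files. In this sketch the three words stand in for the unzip output of three archives, and the file names under the temporary directory are made up for the demonstration:

```shell
#!/bin/bash
tmp=$(mktemp -d)
for word in apple banana cherry; do
    # grep exits 1 when nothing matches; "|| true" only keeps the demo
    # running if it is executed under set -e
    echo "$word" | grep a >  "$tmp/overwritten" || true   # truncated on every pass
    echo "$word" | grep a >> "$tmp/appended"    || true   # earlier matches are kept
done
wc -l < "$tmp/overwritten"
wc -l < "$tmp/appended"
```

The first file ends up empty because the final word, cherry, contains no a and the redirection truncates the file anyway, while the second file still holds the two earlier matches.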
You have three problems:
Don't try to parse the output of ls. Instead, just use for i in plugins/*source*.jar. The main reason is that your script will break completely on any file names that contain spaces; there is a long list of further reasons not to parse ls.
You need to use >> instead of >, as the latter overwrites the output file on each iteration of the loop, while the former appends to it.
Use more quotes! Quote your variables so that they are not subjected to word splitting.
Also, you can inline the if test. So putting it all together we have:
#!/bin/sh
LOOK_FOR="DefaultProblem"
for i in plugins/*source*.jar
do
# echo "Looking in $i ..."
if unzip -p "$i" | grep -i "$LOOK_FOR" >> output #> /dev/null
then
echo ">>>> Found $LOOK_FOR in $i <<<<"
fi
done
You can redirect the output of the entire loop:
#!/bin/sh
LOOK_FOR="DefaultProblem"
FILES=`ls plugins/*source*.jar`
for i in $FILES ; do
# echo "Looking in $i ..." 1>&2
unzip -p $i | grep -i $LOOK_FOR
if [ $? -eq 0 ] ; then
echo ">>>> Found $LOOK_FOR in $i <<<<" 1>&2
fi
done > output
Note that I've redirected the diagnostic messages to stderr.
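A stripped-down sketch of the same idea, with echo standing in for the unzip and grep pipeline and a temporary file standing in for output:

```shell
#!/bin/bash
tmp=$(mktemp -d)
for n in 1 2 3; do
    echo "result $n"            # stdout: collected by the redirection after "done"
    echo "processing $n" >&2    # stderr: not captured, still shown on the terminal
done > "$tmp/output"
cat "$tmp/output"
```

The file is opened once for the whole loop, so each iteration adds to it without any need for >>, and the diagnostic lines never reach the file.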
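Instead of a for loop and an if conditional, you can do everything in one find command, which prints the path of each matching jar. Note that unzip -l searches the names of the archived files rather than their contents; use unzip -p to search the contents as in the question.
find /path/to/plugins -name "*source*.jar" -exec sh -c 'unzip -l "$1" | grep -q DefaultProblem' sh {} \; -print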

Resources