shell create new folder - bash

I have many files' path, but I need to copy all files into other location /sample, and I want to copy files into different folders:
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_2.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info
I want to copy those files into AS34_59329 folder inside /sample
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_2.fq.gz
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info
I want to copy those file into AS34_59328 folder inside /sample
I write codes to scp all file into /sample folder, but I don't know how to put each files into different sub-directory, like:
/ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59328/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info
put into AS34_59328
#! /bin/bash
while read myline
do
for i in $myline
do
if [ -f $i]; then
#how to put different files into different sub-directory
scp -r $i xxx#191.168.174.43:/sample
fi
done
done < data.list
new changed part
#! /bin/bash
while read myline
do
for i in $myline
do
if [ -f $i ]
then
relname=$(echo $i | sed 's%\(/[^/][^/]*\)\{5\}/%%')
echo $relname
fi
done
done < /home/jesse/T11073_all_3254.fq.list

It appears you need to strip the leading 5 components of the pathname off the filename. Since you don't have spaces in your names (the way you're using for i in $myline precludes that possibility), you can use:
#! /bin/bash
while read myline
do
for i in $myline
do
if [ -f $i ]
then
relname=$(echo $i | sed 's%\(/[^/][^/]*\)\{5\}/%%')
scp -r $i xxx#191.168.174.43:/sample/$relname
fi
done
done < data.list
The regex is just a way of looking for a sequence of five sets of slash followed by one or more non-slashes plus one more slash and deleting them. Since slashes figure prominently in the search, I used % to mark the sections of the s/// operation instead.
For example, given the input:
/a/b/c/d/e/f/g
the output from the sed is:
f/g
Note that this code does not explicitly create directories on the remote machine; it just specifies where the file is to go. If you need to create them too, you will have to investigate ssh, probably, to run mkdir -p /sample/$(dirname $relname) on the remote machine (where the dirname operation can be run either locally or remotely).
Note that scp has a recursive copy mode (-r) which would simplify things considerably if you knew you needed to copy all the files from the local directory to the remote.

Related

mv command and rename not working on multiple flies

Below is a bash script to move files around and rename them. The problem is it doesn't work when there is more than one file in the directory. I'm assuming because the last parameter in the mv command is a file. Any suggestions?
'#!/bin/bash'
'INPUTDIR="/home/southern-uniontn/S001007420"'
'OUTPUTDIR="/mnt/edi-06/southern-uniontn/flats-in"'
'BACKUPDIR="/backup/southern-uniontn/S001007420"'
YEAR=`date +%Y`
MONTH=`date +%m`
DAY=`date +%d`
HOUR=`date +%H`
MINUTE=`date +%M`
######## Do some error checking #########
# Does backup dir exist?
if [ ! -d $BACKUPDIR/$YEAR ]
then
mkdir $BACKUPDIR/$YEAR
fi
if [ ! -d $BACKUPDIR/$YEAR/$MONTH ]
then
mkdir $BACKUPDIR/$YEAR/$MONTH
fi
if [ ! -d $BACKUPDIR/$YEAR/$MONTH/$DAY ]
then
mkdir $BACKUPDIR/$YEAR/$MONTH/$DAY
fi
if [[ $(find $INPUTDIR -type f | wc -l) -gt 0 ]];
then
###### Rename the file, move it to Backup, then copy to the Output Directory #####
for f in $INPUTDIR/*
do
echo "`date` - Move recurring txt flat file to BackupDir for Union TN from Southern"
mv $INPUTDIR/* $BACKUPDIR/$YEAR/$MONTH/$DAY/UnionTN-S001007420-$YEAR$MONTH$DAY-$HOUR$MINUTE.txt
sleep 2
echo "`date` - Copy backup file to the Union TN Output Directory"
cp $BACKUPDIR/$YEAR/$MONTH/$DAY/UnionTN-S001007420-$YEAR$MONTH$DAY-$HOUR$MINUTE.txt $OUTPUTDIR/
done;
fi
Some notes:
Get out of the habit of using ALLCAPS variable names, leave those as reserved
by the shell. One day you'll write PATH=something and then wonder
why your script is
broken.
mkdir -p can create parent directories, and will not error if the dir already exists
store the filenames in an array. Then the shell does not have to duplicate
the work, and you don't need to count how many there are: if there are no
files, the loop has zero iterations
if you want to keep the same directory hierarchy in the outputdir,
you need to do that by hand.
use read to get the date parts
with bash v4.2+, printf can be used instead of calling out to date
use magic value "-1" to mean "now".
printf '%(%Y-%m-%d)T\n' -1 prints "2021-10-25" (as of the day I write this)
This is, I think, what you want:
#!/bin/bash
inputdir='/home/southern-uniontn/S001007420'
outputdir='/mnt/edi-06/southern-uniontn/flats-in'
backupdir='/backup/southern-uniontn/S001007420'
read year month day hour minute < <(printf '%(%Y %m %d %H %M)T\n' -1)
# create backup dirs if not exists
date_dir="$year/$month/$day"
mkdir -p "$backupdir/$date_dir"
mkdir -p "$outputdir/$date_dir"
mapfile -t files < <(find $inputdir -type f)
for f in "${files[#]}"
do
###### Rename the file, move it to Backup, then copy to the Output Directory #####
backup_file="UnionTN-S001007420-$year$month$day-$hour$minute.txt"
printf '%(%c)T - Move recurring txt flat file to backupdir for Union TN from Southern\n' -1
mv "$f" "$backupdir/$date_dir/$backup_file"
printf '%(%c)T - Copy backup file to the Union TN Output Directory\n' -1
cp "$backupdir/$date_dir/$backup_file" "$outputdir/$date_dir/$backup_file"
done
When using a glob with mv, the target must be an existing directory, and all matching files will be moved inside that directory.
In your case,
mv $INPUTDIR/* $BACKUPDIR/$YEAR/$MONTH/$DAY/UnionTN-S001007420-$YEAR$MONTH$DAY-$HOUR$MINUTE.txt
tells mv to move all file inside the $INPUTDIR/* directory to a directory named $BACKUPDIR/$YEAR/$MONTH/$DAY/UnionTN-S001007420-$YEAR$MONTH$DAY-$HOUR$MINUTE.txt.
I'm not sure what you're trying to do, but I hope this help.
Some more advice you could use:
Don't put the shebang (the first line beginning with "#") and the first three variable declarations inside single-quotes.
Some argue it is more portable and better to write /usr/bin/env bash instead of /bin/bash in the shebang
if [ CONDITION ] /then ACTION /fi statements can be simplified by writing [ CONDITION ] && ACTION
You reduce your likely hood of encountering unexpected behaviour when double-quoting your strings and variable (i.e. write "${year}/${month}/" instead of $year/$month.
No need to call mkdir a, followed by mkidr a/b, then mkdir a/b/c and so on, you can just call mkdir -p a/b/c. The p flag tells mkdir to create parent directories if they don't already exist.
It is unnecessary to validate the existence of a directory before calling mkdir since mkdir already validates that for you.
As pointed out by commenters, all-caps variables are conventions for special POSIX related variables. You should use another type of casing.
You could use date to do the formatting for you: date +%Y/%m/%d will print 2021/10/25
Strings without interpolation can have single-quotes.
(Optional, prevent undesired behaviors) Put set -e at the beginning of your scripts, after the shebang, to tell bash to halt if an error is encountered
And finally, use man <command_name> for built-in documentation!

Extracting certain files from a tar archive on a remote ssh server

I am running numerous simulations on a remote server (via ssh). The outcomes of these simulations are stored as .tar archives in an archive directory on this remote server.
What I would like to do, is write a bash script which connects to the remote server via ssh and extracts the required output files from each .tar archive into separate folders on my local hard drive.
These folders should have the same name as the .tar file from which the files come (To give an example, say the output of simulation 1 is stored in the archive S1.tar on the remote server, I want all '.dat' and '.def' files within this .tar archive to be extracted to a directory S1 on my local drive).
For the extraction itself, I was trying:
for f in *.tar; do
(
mkdir ../${f%.tar}
tar -x -f "$f" -C ../${f%.tar} "*.dat" "*.def"
)
done
wait
Every .tar file is around 1GB and there is a lot of them. So downloading everything takes too much time, which is why I only want to extract the necessary files (see the extensions in the code above).
Now the code works perfectly when I have the .tar files on my local drive. However, what I can't figure out is how I can do it without first having to download all the .tar archives from the server.
When I first connect to the remote server via ssh username#host, then the terminal stops with the script and just connects to the server.
Btw I am doing this in VS Code and running the script through terminal on my MacBook.
I hope I have described it clear enough. Thanks for the help!
Stream the results of tar back with filenames via SSH
To get the data you wish to retrieve from .tar files, you'll need to pass the results of tar to a string of commands with the --to-command option. In the example below, we'll run three commands.
# Send the files name back to your shell
echo $TAR_FILENAME
# Send the contents of the file back
cat /dev/stdin
# Send EOF (Ctrl+d) back (note: since we're already in a $'' we don't use the $ again)
echo '\004'
Once the information is captured in your shell, we can start to process the data. This is a three-step process.
Get the file's name
note that, in this code, we aren't handling directories at all (simply stripping them away; i.e. dir/1.dat -> 1.dat)
you can write code to create directories for the file by replacing the forward slashes / with spaces and iterating over each directory name but that seems out-of-scope for this.
Check for the EOF (end-of-file)
Add content to file
# Get the files via ssh and tar
files=$(ssh -n <user#server> $'tar -xf <tar-file> --wildcards \'*\' --to-command=$\'echo $TAR_FILENAME; cat /dev/stdin; echo \'\004\'\'')
# Keeps track of what state we're in (filename or content)
state="filename"
filename=""
# Each line is one of these:
# - file's name
# - file's data
# - EOF
while read line; do
if [[ $state == "filename" ]]; then
filename=${line/*\//}
touch $filename
echo "Copying: $filename"
state="content"
elif [[ $state == "content" ]]; then
# look for EOF (ctrl+d)
if [[ $line == $'\004' ]]; then
filename=""
state="filename"
else
# append data to file
echo $line >> <output-folder>/$filename
fi
fi
# Double quotes here are very important
done < <(echo -e "$files")
Alternative: tar + scp
If the above example seems overly complex for what it's doing, it is. An alternative that touches the disk more and requires to separate ssh connections would be to extract the files you need from your .tar file to a folder and scp that folder back to your workstation.
ssh -n <username>#<server> 'mkdir output/; tar -C output/ -xf <tar-file> --wildcards *.dat *.def'
scp -r <username>#<server>:output/ ./
The breakdown
First, we'll make a place to keep our outputted files. You can skip this if you already know the folder they'll be in.
mkdir output/
Then, we'll extract the matching files to this folder we created (if you don't want them to be in a different folder remove the -C output/ option).
tar -C output/ -xf <tar-file> --wildcards *.dat *.def
Lastly, now that we're running commands on our machine again, we can run scp to reconnect to the remote machine and pull the files back.
scp -r <username>#<server>:output/ ./

Bash - Moving files from subdirectories

I am relatively new to bash scripting.
I need to create a script that will loop through a series of directories, go into subdirectories with a certain name, and then move their file contents into a common folder for all of the files.
My code so far is this:
#!/bin/bash
#used to gather usable pdb files
mkdir -p usable_pdbFiles
#loop through directories in "pdb" folder
for pdbDirectory in */
do
#go into usable_* directory
for innerDirectory in usable_*/
do
if [ -d "$innerDirectory" ] ; then
for file in *.ent
do
mv $file ../../usable_pdbFiles
done < $file
fi
done < $innerDirectory
done
exit 0
Currently I get
usable_Gather.sh: line 7: $innerDirectory: ambiguous redirect
when I try and run the script.
Any help would be appreciated!
The redirections < $innerDirectory and < $file are invalid and this is causing the problem. You don't need to use a loop for this, you can instead rely on the shell's filename expansion and use mv directly:
mkdir -p usable_pdbFiles
mv */usable_*/*.ent usable_pdbFiles
Bear in mind that this solution, and the loop based one that you are working on, will overwrite files with the same name in the destination directory.

Bash If then that reads a list in a file condition

Here is the condition:
I have a file with all packages installed.
I have a folder with all kinds of other packages, but they include all of the ones in the list, plus more.
I need a bash script that will read the file and check a folder for packages that don't exist in the list then remove them, they are not needed, but keep the packages that are on the list in that folder.
Or perhaps the bash should read folder then if packages in the folder aren't on the list them rm -f that or those packages.
I am familiar with writing if then conditional statements, I just don't know how to do if making the items in the list a variable or variables (in a loop).
thanks!
I would move the packages on the list to a new folder, delete the original folder, and move the temporary folder back:
DIR=directory-name
mkdir "$DIR-tmp"
while read pkgname; do
if [[ -f "$DIR/$pkgname" ]]; then
mv "$DIR/$pkgname" "$DIR-tmp"
fi
done < package-list.txt
# Confirm $DIR-tmp has the files you want first!
rm -rf "$DIR"
mv "$DIR-tmp" "$DIR"
I think you want something like this:
for file in $(ls folder) ; do
grep -E "$file" install-list-file >/dev/null || \
echo $file
done > rm-list
vi rm-list # view file to ensure correct
rm $(<rm_list)
There are ways to make this faster (using parameter substitution to avoid fork/exec's), but I recommend avoiding fancy shell stuff [${file##*/}] until you've got the basics down. Also, this script basically translates the description into a script and is not intended to be much more than a guide on how to approach the problem.

shell bash get file path and put into sub-directory

i have a long string in each line, one line like,
1000 AS34_59329 RICwdsRSYHSD11-2-IPAAPEK-93 /ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz /ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_2.fq.gz /ifshk5/BC_IP/PROJECT/T11073/T11073_RICekkR/Fq/AS34_59329/clean_111220_I631_FCC0E5EACXX_L4_RICwdsRSYHSD11-2-IPAAPEK-93_1.fq.gz.total.info 11.824 0.981393 43.8283 95.7401 OK
this line contains three file locations(bold parts), i need to scp those files to another location like /sample . and also create sub-directory to put files, like this line files put into AS34_59329. so need create /sample/AS34_59329
Maybe many lines' sub-directory name is the same, so it need to judge whether the sub-directory has already create.
how to auto create the sub-directory?
#! /bin/bash
while read myline
do
for i in $myline
do
if [ -f $i]; then
scp -r $i xxxx#192.168.174.33:/sample
fi
done
done < data.list
It looks like you have ssh keys, so if you ssh then remote commands will work for you
if [ -f $i]; then
ssh xxxx#192.168.174.33 '[ -d /sample ] && echo "OK" || mkdir /sample'
scp -r $i xxxx#192.168.174.33:/sample
fi
This will only work if you have privilege on the remote box to create /sample.

Resources