Shell script to create directories - bash

I'm trying to create a simple shell script for recursively creating directories inside a list of directories.
I have the next file structure:
A directory called v_79, containing a list of "dirs" (from dir_0 to dir_210), and inside each of them there are several directories called ENSG00000??????, where '?' stands for a character between [0-9].
I would like to create a directory called "my_dir" inside every one of the ENSG00000?????? dirs.
I know how to create a directory once being inside each of the dir_XX 's,
for i in ENSG00000??????; do mkdir $i/my_dir; done
but I don't know how to create the directory that I need, in the v_79 directory.

If current dir is v_79, you can use a combination of find and xargs:
find . -name 'ENSG00000??????' -type d | xargs -I DIR mkdir DIR/my_dir
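A variant of the same idea that avoids the pipe, shown as a minimal sketch rather than the answer's own method: find hands the matching directories to a small sh loop. The scratch tree and the directory names in it are made up for the demo.

```bash
#!/bin/bash
# Throwaway tree that mimics the layout from the question (names are invented).
v79=$(mktemp -d)/v_79
mkdir -p "$v79"/dir_0/ENSG00000000001 "$v79"/dir_1/ENSG00000000002 "$v79"/dir_1/other

# -name takes a shell glob, so each ? matches exactly one character.
# The matches are passed to sh as arguments, so names with spaces stay intact.
find "$v79" -type d -name 'ENSG00000??????' -exec sh -c '
  for d in "$@"; do
    mkdir -p "$d/my_dir"
  done' sh {} +
```

Because the paths never go through a text pipe, no quoting or delimiter handling is needed.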

If your current directory contains the directory "v_79", then
for dir in v_79/dir_{0..210}/ENSG00000??????; do mkdir "$dir/my_dir"; done
I wonder if that might give you an "argument list too long" error, in which case find is the way to go.

mkdir -p v_79/dir{0,1}{1,2,3}
will create the directories v_79/dir01, v_79/dir02, v_79/dir03, v_79/dir11, v_79/dir12 and v_79/dir13 even if v_79 does not exist.
The -p option will create all required parent directories recursively.
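As a quick check of what the brace expansion produces, the sketch below runs the same mkdir -p inside a throwaway directory. The temporary base path is an assumption for the demo and not part of the answer.

```bash
#!/bin/bash
# Work in a scratch directory so nothing real is touched.
base=$(mktemp -d)

# Bash expands dir{0,1}{1,2,3} into six words before mkdir runs;
# -p also creates the missing parent v_79.
mkdir -p "$base"/v_79/dir{0,1}{1,2,3}

# List the result, one name per line.
ls "$base/v_79"
```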

You can do so from your v_79 directory:
for i in `find . -type d -name "ENSG00000??????"`; do mkdir $i/my_dir; done

This is a dry run; if you are satisfied, delete the echo before mkdir.
printf '%s\n' ./v_79/**/ENSG* | xargs -I% echo mkdir %/my_dir # or
printf '%s\n' ./v_79/**/dir_*/ENSG* | xargs -I% echo mkdir %/my_dir
printf puts one path on each line, which is what xargs -I expects. You need bash 4 and "shopt -s globstar" for this (e.g. in your profile).
If you have too many directories, you may get an "argument list too long" error (for the first command). In that case the solution with find is better:
find v_79 -type d -print | grep '/ENSG' | xargs -I% echo mkdir %/my_dir
This finds all directories in v_79,
keeps only those whose path contains /ENSG (you can add more "filters"),
and runs (echoes) mkdir for each result.
If a path may contain a space somewhere, modify the above to:
find v_79 -type d -print0 | grep -z '/ENSG' | xargs -0 -I% echo mkdir %/my_dir
Also, you can limit the depth of the find command (-depth 2 is BSD/macOS find syntax; with GNU find use -mindepth 2 -maxdepth 2 instead), e.g.:
find v_79 -depth 2 -type d -print0 | grep -z '/ENSG' | xargs -0 -I% echo mkdir %/my_dir
Again, all of the above is a dry run; remove the echo for the real run. ;)

Just add the -p option and your work will be done.
BTW: the -p option of the mkdir command means "no error if existing, make parent directories as needed".

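You want to loop over the matches. A glob such as v_79/dir_*/ENSG00000??????/my_dir only expands to paths that already exist, so it cannot be handed to mkdir directly:
for d in v_79/dir_*/ENSG00000??????/; do mkdir "${d}my_dir"; done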

Related

How to cd into grep output?

I have a shell script which basically searches all folders inside a location and I use grep to find the exact folder I want to target.
for dir in /root/*; do
grep "Apples" "${dir}"/*.* || continue
done
While grep successfully finds my target directory, I'm stuck on how I can move the folders I want to move in my target directory. An idea I had was to cd into grep output but that's where I got stuck. Tried some Google results, none helped with my case.
Example grep output: Binary file /root/ant/containers/secret/Documents/2FD412E0/file.extension matches
I want to cd into 2FD412E0 and move two folders inside that directory.
dirname is the key to that. Combined with grep -l, which prints only the names of the matching files:
cd "$(dirname "$(grep -l "...." ...)")"
will let you enter the directory.
As people mentioned, dirname is the right tool to strip off the file name from the path.
I would use find for such kind of task:
while read -r file
do
target_dir=$(dirname "$file")
# do something with "$target_dir"
done < <(find /root/ -type f \
-exec grep "Apples" --files-with-matches {} \;)
Consider using find's -maxdepth option. See the man page for find.
Well, there is actually a simpler solution :) I just like to write bash scripts. You might simply use a single find command like this:
find /root/ -type f -exec grep Apples {} ';' -exec ls -l {} ';'
Note the second -exec. It will be executed, if the previous -exec command exited with status 0 (success). From the man page:
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ; is encountered. The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find.
Replace the ls -l command with your stuff.
And if you want to execute dirname within the -exec command, you may do the following trick:
find /root/ -type f -exec grep -q Apples {} ';' \
-exec sh -c 'cd "$(dirname "$0")" && pwd' {} ';'
Replace pwd with your stuff.
When find is not available
In the comments you write that find is not available on your system. The following solution works without find:
grep -R --files-with-matches Apples "${dir}" | while read -r file
do
target_dir=$(dirname "$file")
# do something with "$target_dir"
echo "$target_dir"
done
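Putting the pieces together, here is a minimal end-to-end sketch of "cd into the directory of the first matching file". The scratch tree, its file contents and the second directory name are assumptions for the demo; only the 2FD412E0 folder name comes from the question.

```bash
#!/bin/bash
# Scratch tree standing in for /root (contents are invented for the demo).
root=$(mktemp -d)
mkdir -p "$root/ant/Documents/2FD412E0" "$root/bee/Documents/0000AAAA"
printf 'Apples\n' > "$root/ant/Documents/2FD412E0/file.extension"
printf 'Pears\n'  > "$root/bee/Documents/0000AAAA/file.extension"

# grep -r searches recursively; -l prints only the names of matching files.
match=$(grep -rl "Apples" "$root" | head -n 1)
[ -n "$match" ] || { echo "no match" >&2; exit 1; }

# Strip the file name to get the directory, then enter it.
target_dir=$(dirname "$match")
cd "$target_dir" && pwd
```

Once inside the directory, the mv commands from the question can use relative paths.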

How to remove intermediate folders containing only one folder each?

I had been playing around with mv, and now I have a situation.
Earlier, say
Folder1 had file1,2,3.
Now Folder1 has Folder2 which has Folder3 which has Folder4 which contains file1,2,3.
I am trying to write a bash script such that it identifies intermediate folders containing only 1 directory and moves all its contents up one level, ultimately giving back only Folder1->file1,2,3, and rest folders deleted.
I tried to write something like the below, but I am:
1. unable to distinguish between a file and a folder
2. unable to find the file/directory name stored inside the current folder
3. not sure how to do this recursively.
#!/bin/bash
echo "Directory Name?"
read dir_name
no_files=`ls -A| wc -l`
if [ $no_file==1 ] && [ itisaDirectory()];
then `mv folder_name/* dir_name`
fi
When you do not care for error messages and want to move all files in subdirs to the current dir and remove the remaining empty dir, do something like
find . -type f -exec mv {} "${dir_name}" \; 2>/dev/null
rm -r */
You ask for something else: only move files up when the intermediate directory is unique. That is the case if exactly one subdir has that dir as a parent. The parent of a dir can be found with dirname.
When a dir has one subdir, only one subdir will have it as a parent. You can list all dirs, look for the parent and select the unique paths.
find . -type d -exec dirname {} \; | sort | uniq -u | while read dir; do
echo "${dir} has exactly one subdir"
done
The problem is that the dir can have files as well. We try to improve the above solution:
find . -exec dirname {} \; | sort | uniq -u | while read dir; do
echo "${dir} has exactly one subdir or one file"
done
You can test whether the single entry is a directory with if [ -d "${dir}"/*/ ] (the glob has to stay outside the quotes to expand), but I do not need to know:
find . -exec dirname {} \; | sort | uniq -u | while read dir; do
echo "${dir} has exactly one subdir or one file"
find "${dir}"/*/ -type f -exec mv {} "${dir_name}" \; 2>/dev/null
done
The path ${dir}/*/ only matches when ${dir} has a subdirectory in it; in that case find moves the files beneath it. When ${dir} only contains one file, the find command finds nothing.
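For the exact case in the question (a chain of folders that each hold only one folder), a different approach from the find pipeline above is a plain loop. The following is a minimal sketch under stated assumptions: collapse_chain is a hypothetical helper name, it only collapses the chain starting at the directory you pass in, and it relies on bash's nullglob and dotglob options.

```bash
#!/bin/bash
shopt -s nullglob dotglob   # empty globs expand to nothing; * also matches dotfiles

# collapse_chain DIR (hypothetical helper): while DIR holds exactly one entry
# and that entry is a real directory, pull its contents up into DIR.
collapse_chain() {
  local top=$1 entries contents inner
  while :; do
    entries=("$top"/*)
    # Stop unless there is exactly one entry and it is a non-symlink directory.
    if [ "${#entries[@]}" -ne 1 ] || [ ! -d "${entries[0]}" ] || [ -L "${entries[0]}" ]; then
      break
    fi
    # Rename first so a child named like its parent cannot clash.
    inner="$top/.collapse_tmp.$$"
    mv -- "${entries[0]}" "$inner" || return
    contents=("$inner"/*)
    if [ "${#contents[@]}" -gt 0 ]; then
      mv -- "${contents[@]}" "$top"/ || return
    fi
    rmdir -- "$inner" || return
  done
}

# Demo on a scratch copy of the layout from the question.
demo=$(mktemp -d)
mkdir -p "$demo/Folder1/Folder2/Folder3/Folder4"
touch "$demo/Folder1/Folder2/Folder3/Folder4"/file{1,2,3}
collapse_chain "$demo/Folder1"
ls "$demo/Folder1"
```

The -d and -L tests answer the "file or folder" part of the question, and the loop replaces explicit recursion because each pass removes one level.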

Using find and xargs how can I stop execution on errors without crapping out

In my script I have the following 3 commands
Basically what it is trying to do is:
create a symlink to a certain bunch of files based on their filenames, in a temp directory.
change the name of the symlink to match the current date
move the symlinks from a temp directory to their proper location
find . -type f -name "*${regex}-*" -exec ln -s {} "${DataTempPath}/"{} \;
find "$DataTempPath" -type l | sed -e "p;s/A[0-9]*/A${today}/" | xargs -n2 mv
mv $DataTempPath/* $DataSetPath
This will be inserted as a cron job to run every 15 mins, which is not a problem when the source directory contains valid data.
However when it doesn't contain any files I get errors on the second find command and the mv command
What I want I guess is a way of not executing the last two lines of the script if the first one does not create any new links
GNU xargs supports a --no-run-if-empty parameter that, to quote the documentation "If the standard input is completely empty, do not run the command. By default, the command is run once even if there is no input".
This should help avoid the xargs error (assuming you are running GNU xargs)
Check the status of the command:
find . -type f -name "*${regex}-*" -exec ln -s {} "${DataTempPath}/"{} \;
if [[ $? == 0 ]]; then
find "$DataTempPath" -type l | sed -e "p;s/A[0-9]*/A${today}/" | xargs -n2 mv
mv $DataTempPath/* $DataSetPath
fi
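One caveat with the status check: find normally exits with 0 even when nothing matched, so $? alone may not tell you whether any link was created. A minimal sketch of an alternative is to test for symlinks directly before moving them. The move_links helper, the scratch paths and the link name are assumptions for the demo, and the sed rename step from the question is left out for brevity.

```bash
#!/bin/bash
# Scratch stand-ins for the script's variables (paths are invented for the demo).
work=$(mktemp -d)
DataTempPath=$work/temp
DataSetPath=$work/set
mkdir -p "$DataTempPath" "$DataSetPath"

# move_links (hypothetical helper): move symlinks only if at least one exists.
move_links() {
  local f links=()
  for f in "$DataTempPath"/*; do
    # An unmatched glob stays literal and fails -L, so it is skipped too.
    if [ -L "$f" ]; then links+=("$f"); fi
  done
  if [ "${#links[@]}" -eq 0 ]; then
    echo "no new links, nothing to do"
    return 0
  fi
  mv -- "${links[@]}" "$DataSetPath"/
}

first=$(move_links)                            # temp dir is still empty here
ln -s "$work/source" "$DataTempPath/A123-data" # -L only inspects the link itself
move_links
```

In a cron job this keeps the quiet case quiet: an empty source directory produces a message instead of errors from mv.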

Modifying replace string in xargs

When I am using xargs sometimes I do not need to explicitly use the replacing string:
find . -name "*.txt" | xargs rm -rf
In other cases, I want to specify the replacing string in order to do things like:
find . -name "*.txt" | xargs -I '{}' mv '{}' /foo/'{}'.bar
The previous command would move all the text files under the current directory into /foo and it will append the extension bar to all the files.
If instead of appending some text to the replace string, I wanted to modify that string such that I could insert some text between the name and extension of the files, how could I do that? For instance, let's say I want to do the same as in the previous example, but the files should be renamed/moved from <name>.txt to /foo/<name>.bar.txt (instead of /foo/<name>.txt.bar).
UPDATE: I managed to find a solution:
find . -name "*.txt" | xargs -I{} \
sh -c 'base=$(basename $1) ; name=${base%.*} ; ext=${base##*.} ; \
mv "$1" "foo/${name}.bar.${ext}"' -- {}
But I wonder if there is a shorter/better solution.
The following command constructs the mv commands with xargs, replaces the second occurrence of '.' on each line with '.bar.', then executes the commands with bash. It works on Mac OS X.
ls *.txt | xargs -I {} echo mv {} foo/{} | sed 's/\./.bar./2' | bash
It is possible to do this in one pass (tested in GNU) avoiding the use of the temporary variable assignments
find . -name "*.txt" | xargs -I{} sh -c 'mv "$1" "foo/$(basename ${1%.*}).new.${1##*.}"' -- {}
In cases like this, a while loop would be more readable:
find . -name "*.txt" | while IFS= read -r pathname; do
base=$(basename "$pathname"); name=${base%.*}; ext=${base##*.}
mv "$pathname" "foo/${name}.bar.${ext}"
done
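The same parameter expansions also work without a pipe. Below is a minimal sketch, not one of the posted answers, in which find passes the paths straight to a small sh loop. The scratch directories and file names are assumptions for the demo; foo is the destination from the question.

```bash
#!/bin/bash
# Scratch source and destination directories (names are invented for the demo).
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub" "$tmp/foo"
touch "$tmp/src/a.txt" "$tmp/src/sub/b c.txt"

# The first argument after "sh" is the destination; the rest are found files.
find "$tmp/src" -type f -name '*.txt' -exec sh -c '
  dest=$1; shift
  for path in "$@"; do
    base=${path##*/}                                  # strip the directory part
    mv -- "$path" "$dest/${base%.*}.bar.${base##*.}"  # name + .bar. + extension
  done' sh "$tmp/foo" {} +

ls "$tmp/foo"
```

${base%.*} drops the last extension and ${base##*.} keeps only it, so the inserted text lands between the two. Files with the same base name in different subdirectories would still overwrite each other.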
Note that you may find files with the same name in different subdirectories. Are you OK with duplicates being over-written by mv?
If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:
find . -name "*.txt" | parallel 'ext={/} ; mv -- {} foo/{/.}.bar."${ext##*.}"'
Watch the intro videos for GNU Parallel to learn more:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
If you're allowed to use something other than bash/sh, AND this is just for a fancy "mv"... you might try the venerable "rename.pl" script. I use it on Linux and cygwin on windows all the time.
http://people.sc.fsu.edu/~jburkardt/pl_src/rename/rename.html
rename.pl 's/^(.*?)\.(.*)$/\1-new_stuff_here.\2/' list_of_files_or_glob
You can also use a "-p" parameter to rename.pl to have it tell you what it WOULD HAVE DONE, without actually doing it.
I just tried the following in my c:/bin (cygwin/windows environment). I used the "-p" so it spit out what it would have done. This example just splits the base and extension, and adds a string in between them.
perl c:/bin/rename.pl -p 's/^(.*?)\.(.*)$/\1-new_stuff_here.\2/' *.bat
rename "here.bat" => "here-new_stuff_here.bat"
rename "htmldecode.bat" => "htmldecode-new_stuff_here.bat"
rename "htmlencode.bat" => "htmlencode-new_stuff_here.bat"
rename "sdiff.bat" => "sdiff-new_stuff_here.bat"
rename "widvars.bat" => "widvars-new_stuff_here.bat"
the files should be renamed/moved from <name>.txt to /foo/<name>.bar.txt
You can use rename utility, e.g.:
rename 's/\.txt$/.txt.bar/' *.txt
Hint: The substitution syntax is similar to sed or vim. Quote the expression so the shell does not strip the backslashes.
Then move the files to some target directory by using mv:
mkdir /some/path
mv *.bar /some/path
To rename files into subdirectories based on some part of their name, check the option:
-p/--mkpath/--make-dirs Create any non-existent directories in the target path.
Testing:
$ touch {1..5}.txt
$ rename --dry-run "s/.txt$/.txt.bar/g" *.txt
'1.txt' would be renamed to '1.txt.bar'
'2.txt' would be renamed to '2.txt.bar'
'3.txt' would be renamed to '3.txt.bar'
'4.txt' would be renamed to '4.txt.bar'
'5.txt' would be renamed to '5.txt.bar'
Adding to that, the Wikipedia article is surprisingly informative.
For example:
Shell trick
Another way to achieve a similar effect is to use a shell as the launched command, and deal with the complexity in that shell, for example:
$ mkdir ~/backups
$ find /path -type f -name '*~' -print0 | xargs -0 bash -c 'for filename; do cp -a "$filename" ~/backups; done' bash
Inspired by an answer by @justaname above, this command, which incorporates a Perl one-liner, will do it:
find ./ -name \*.txt | perl -p -e 's/^(.*\/(.*)\.txt)$/mv $1 .\/foo\/$2.bar.txt/' | bash

How to go to each directory and execute a command?

How do I write a bash script that goes through each directory inside a parent_directory and executes a command in each directory.
The directory structure is as follows:
parent_directory (name could be anything - doesn't follow a pattern)
    001 (directory names follow this pattern)
        0001.txt (filenames follow this pattern)
        0002.txt
        0003.txt
    002
        0001.txt
        0002.txt
        0003.txt
        0004.txt
    003
        0001.txt
The number of directories is unknown.
This answer posted by Todd helped me.
find . -maxdepth 1 -type d \( ! -name . \) -exec bash -c "cd '{}' && pwd" \;
The \( ! -name . \) avoids executing the command in current directory.
You can do the following, when your current directory is parent_directory:
for d in [0-9][0-9][0-9]
do
( cd "$d" && your-command-here )
done
The ( and ) create a subshell, so the current directory isn't changed in the main script.
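To make the subshell pattern concrete, here is a minimal runnable sketch on a scratch copy of the layout from the question. Counting the .txt files stands in for "your-command-here"; the report file name is an assumption for the demo.

```bash
#!/bin/bash
shopt -s nullglob
# Scratch tree mirroring the question's layout.
parent=$(mktemp -d)
mkdir "$parent"/001 "$parent"/002 "$parent"/003
touch "$parent"/001/000{1,2,3}.txt "$parent"/002/000{1,2,3,4}.txt "$parent"/003/0001.txt
report=$parent/report.log

cd "$parent" || exit 1
for d in [0-9][0-9][0-9]; do
  [ -d "$d" ] || continue
  # Subshell: the cd is undone automatically when the parentheses close.
  (
    cd "$d" || exit 1
    set -- *.txt        # stand-in for the real command: collect the .txt files
    echo "$d $#"        # directory name and number of files found
  ) >> "$report"
done
cat "$report"
```

Because each cd happens inside the parentheses, the loop never needs a matching cd .. and a failed cd cannot send later commands to the wrong directory.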
You can achieve this by piping into xargs. The catch is that you need the -I flag, which replaces each occurrence of the placeholder ({} here) in your bash command with the item read from the input.
ls -d */ | xargs -I {} bash -c "cd '{}' && pwd"
You may want to replace pwd with whatever command you want to execute in each directory.
If you're using GNU find, you can try -execdir parameter, e.g.:
find . -type d -execdir realpath "{}" ';'
or (as per @gniourf_gniourf's comment):
find . -type d -execdir sh -c 'printf "%s/%s\n" "$PWD" "$0"' {} \;
Note: You can use ${0#./} instead of $0 to fix ./ in the front.
or more practical example:
find . -name .git -type d -execdir git pull -v ';'
If you want to include the current directory, it's even simpler by using -exec:
find . -type d -exec sh -c 'cd -P -- "{}" && pwd -P' \;
or using xargs:
find . -type d -print0 | xargs -0 -L1 sh -c 'cd "$0" && pwd && echo Do stuff'
Or a similar example suggested by @gniourf_gniourf:
find . -type d -print0 | while IFS= read -r -d '' file; do
# ...
done
The above examples support directories with spaces in their name.
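As an illustration of the NUL-delimited loop with its body filled in, the sketch below collects the directories into an array and then visits each one. The scratch tree and its directory names are assumptions for the demo, and -mindepth 1 is added here only to skip the starting directory.

```bash
#!/bin/bash
# Scratch tree with a space in one name (names are invented for the demo).
root=$(mktemp -d)
mkdir -p "$root/plain" "$root/with space/nested"

dirs=()
# -print0 ends each name with a NUL byte; read -d '' splits on it.
while IFS= read -r -d '' dir; do
  dirs+=("$dir")
done < <(find "$root" -mindepth 1 -type d -print0)

# Visit each directory in a subshell so the cd does not leak.
for dir in "${dirs[@]}"; do
  ( cd "$dir" && pwd )
done
```

Feeding the loop through process substitution rather than a pipe keeps the array in the current shell, so it is still populated after the loop ends.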
Or by assigning into bash array:
dirs=($(find . -type d))
for dir in "${dirs[@]}"; do
( cd "$dir" && echo "$PWD" )
done
Change . to your specific folder name. If you don't need to run recursively, you can use: dirs=(*) instead. The above example doesn't support directories with spaces in the name.
So, as @gniourf_gniourf suggested, the only proper way to put the output of find in an array without using an explicit loop is available since Bash 4.4 with:
mapfile -t -d '' dirs < <(find . -type d -print0)
Or, though it is not recommended because it involves parsing ls:
ls -d */ | awk '{print $NF}' | xargs -n1 sh -c 'cd $0 && pwd && echo Do stuff'
The above example would ignore the current dir (as requested by OP), but it'll break on names with spaces.
See also:
Bash: for each directory at SO
How to enter every directory in current path and execute script? at SE Ubuntu
If the toplevel folder is known you can just write something like this:
for dir in "$YOUR_TOP_LEVEL_FOLDER"/*/;
do
for subdir in "$dir"*/;
do
: # your commands go here; "$subdir" is the current subdirectory
done
done
In place of the : placeholder you can put as much code as you want.
Note that I didn't "cd" into any directory.
Cheers,
for dir in PARENT/*
do
test -d "$dir" || continue
# Do something with $dir...
done
While one-liners are good for quick and dirty usage, I prefer the more verbose version below for writing scripts. This is the template I use; it takes care of many edge cases and allows you to write more complex code to execute in a folder. You can write your bash code in the function dir_command. Below, dir_command tags each repository in git as an example. The rest of the script calls dir_command for each folder in the directory. An example of iterating through only a given set of folders is also included.
#!/bin/bash
#Use set -x if you want to echo each command while getting executed
#set -x
#Save current directory so we can restore it later
cur=$PWD
#Save command line arguments so functions can access it
args=("$@")
#Put your code in this function
#To access command line arguments use syntax ${args[1]} etc
function dir_command {
#This example command tags the git repository in the folder and pushes the tag
cd "$1" || return
echo "$(tput setaf 2)$1$(tput sgr 0)"
git tag -a "${args[0]}" -m "${args[1]}"
git push --tags
cd "$cur" || return
}
#This loop will go to each immediate child and execute dir_command
find . -maxdepth 1 -type d \( ! -name . \) | while read dir; do
dir_command "$dir/"
done
#This example loop only loops through a given set of folders
declare -a dirs=("dir1" "dir2" "dir3")
for dir in "${dirs[@]}"; do
dir_command "$dir/"
done
#Restore the folder
cd "$cur"
I don't get the point of the file formatting, since you only want to iterate through folders... Are you looking for something like this?
cd parent
find . -type d | while IFS= read -r d; do
ls "$d"/
done
You can use
find .
to search all files/dirs in the current directory recursively.
Then you can pipe the output to the xargs command like so:
find . | xargs 'command here'
#!/bin/bash
for folder_to_go in $(find . -mindepth 1 -maxdepth 1 -type d \( -name "*" \) ) ;
# you can add a pattern instead of * , here it goes to any folder
# -mindepth / -maxdepth 1 means one folder depth
do
cd "$folder_to_go" || continue
echo "$folder_to_go" "########################################## "
# whatever you want to do goes here
cd ../ # if mindepth/maxdepth = 2, cd ../../
done
# you can try adding many internal for loops with many patterns, this will sneak anywhere you want
You could run a sequence of commands in each folder in one line, like:
for d in PARENT_FOLDER/*; do (cd "$d" && tar -cvzf "../${d##*/}.tar.gz" *.*); done
for p in [0-9][0-9][0-9];do
(
cd $p
for f in [0-9][0-9][0-9][0-9]*.txt;do
ls $f; # Your operands
done
)
done
