Search up through directory path to find a specific directory - bash

I understand how to recursively search down a hierarchy for a file or directory, but can't figure out how to search up the hierarchy and find a specific directory.
Given paths and files such as these:
/Users/username/projects/project_name/lib/sub_dir/file.rb
/Users/username/projects/project_name/lib/sub_dir/2nd_sub_dir/3rd_sub_dir/file.rb
/Users/username/projects/project_name/spec/sub_dir/file.rb
How, using the terminal, can I get:
/Users/username/projects/project_name
N.B. I know that the next directory down from project_name is spec/ or lib/

Pure bash (no sub-process spawning, no other commands). Depending on how flexible you want it to be, you may want to consider running the argument of the rootdir() function through readlink -fn first. Explained here.
#!/bin/bash
function rootdir {
    local filename=$1
    local parent=${filename%%/lib/*}
    if [[ $filename == "$parent" ]]; then
        parent=${filename%%/spec/*}
    fi
    echo "$parent"
}
# test:
# rootdir /Users/username/projects/project_name/lib/sub_dir/file.rb
# rootdir /Users/username/projects/project_name/spec/sub_dir/file.rb
# rootdir /Users/username/projects/project_name/lib/sub_dir/2nd_sub_dir/3rd_sub_dir/file.rb
# output:
# /Users/username/projects/project_name
# /Users/username/projects/project_name
# /Users/username/projects/project_name
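As noted above, if the argument might be a relative path or contain symlinks, you can canonicalize it first; a minimal sketch, assuming GNU readlink is available and using a purely illustrative path:
# resolve symlinks and relative components before stripping
canonical=$(readlink -fn "lib/sub_dir/file.rb")
rootdir "$canonical"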

You can use perl:
cat file | perl -pe 's#(.+)/(?:spec|lib)/.*#$1#'
where file contains:
/Users/username/projects/project_name/lib/sub_dir/file.rb
/Users/username/projects/project_name/lib/sub_dir/2nd_sub_dir/3rd_sub_dir/file.rb
/Users/username/projects/project_name/spec/sub_dir/file.rb
or you can use sed:
cat file | sed 's#\(.*\)/\(spec\|lib\)/.*#\1#'

Related

How to apply a command to subfolders in a folder, and create a new folder with each output?

I have several subfolders in a folder which each contain a file called "README.txt". I would like the following two commands to be applied to each subfolder:
#identify sequence from README.txt where a line contains 'xyz'
seq= sed -nE '/xyz/{ s/^.*Series:([^,]+),.*/\1/; p; }' README.txt
echo $seq
#apply this command to convert filetype and copy the file of interest
dcm2niix_afni $seq -o /subfolder-directory /my-directory
How do I apply these commands to the subfolders in my folder?
I would like the output of the commands to be placed into a new folder, of the same name as the subfolder from which the file is from.
Thank you!
@sk23 It would be much better to provide:
An example of the README.txt file content.
What output exactly do you expect from the sed command?
What is the output of the dcm2niix_afni command? Is it a newly generated file?
Clarify in detail what you mean by "I would like the output of the commands to be placed into a new folder, of the same name as the subfolder from which the file is from."
But until then, you may review the following script as a hint, to be completed to meet your requirements.
#!/bin/bash
## User defined variables
# Main folder path
_mainFolder='./main_folder'
# File name to process
_fileName='README.txt'

while IFS='' read -r _file; do
    seq="$(sed -nE '/xyz/{ s/^.*Series:([^,]+),.*/\1/; p; }' "${_file}")"
    ## If you need to redirect the `sed` command output to a file,
    ## let's say `README.txt-seq.log` in each relevant folder, un-comment the next line.
    # echo "${seq}" > "${_file}-seq.log"
    echo "Sed output for ${_file}:"
    echo "${seq}"
    # Any other relevant command may be added from here.
    # To here.
done <<< "$(find "${_mainFolder}" -name "${_fileName}" -type f)"
exit 0
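As a hedged follow-up on the second part of the question (placing results in a new folder named after the subfolder the README.txt came from), something like the following could be added inside the loop; output_root is a hypothetical destination and the dcm2niix_afni line is copied from the question, so adjust both to your setup:
# derive an output folder named after the README's parent subfolder
output_root='./converted'
subdir_name="$(basename "$(dirname "${_file}")")"
mkdir -p "${output_root}/${subdir_name}"
# then send the command's output there, for example:
# dcm2niix_afni "$seq" -o "${output_root}/${subdir_name}" /my-directory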

Bash go to level 1 subdir if inside parentsubdir

Okay,
I would like to do the following in the shell.
If I am in a subdir like css which is inside a subdir like mypage (e.g. projects/mypage/htdocs/css), I would like to go into the root dir of the project, which is mypage. I would like to write this as a function to use as a command. The only "fixed" value is projects.
So basically, if I am within any subdir of projects in the shell and I type the command goroot (or whatever), I want the function to check that it is in fact inside a subdir of projects and, if so, go to the root dir of that project.
E.g.
~/projects/mypage/htdocs/css › goroot [hit return]
~/projects/mypage > [jumped to here]
Is this at all possible and if so how could I achieve this?
Assuming I am understanding correctly, this should work:
goroot() { cd "$(sed -r 's#(~/projects/[^/]*)/.*#\1#' <<< "$PWD")"; }
This sed command effectively strips off everything after ~/projects/SOMETHING and then changes to that directory. If you're not in ~/projects/ then it will leave you in the current directory.
Note: this assumes that $PWD uses the ~ to denote home, if it is something like /home/user/ then amend the sed command appropriately.
projroot=/home/user/projects

goroot() {
    local d=$PWD
    # Strip off the project root prefix.
    local m=${d#$projroot/}
    if [ "$m" = "$d" ]; then
        echo "Not in ~/projects"
        return
    fi
    # Strip off the project directory.
    local suf=${m#*/}
    if [ "$suf" = "$m" ]; then
        echo "Already in project root."
        return
    fi
    # cd to the concatenation of the project root and the project directory (stripped of the sub-project path).
    cd "$projroot/${m%/$suf}"
}
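A quick usage illustration, assuming the function above is loaded in the shell and your home directory is /home/user:
~/projects/mypage/htdocs/css > goroot
~/projects/mypage > pwd
/home/user/projects/mypage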

BASH filepath and file commands

I am trying to recreate the folder structure from a source in a target location and perform a command on each file found in the process, using bash. Based on some feedback and some searches, I am trying to get this solution to work properly. Right now it is breaking because the Windows folders have directories with spaces that it refuses to find.
I was able to get this to work after installing some additional features for my Cygwin.
source='/cygdrive/z/austin1/QA/Platform QA/8.0.0/Test Cases'
target='/cygdrive/c/FullBashScripts'

# let ** be recursive
shopt -s globstar

for file in "$source"/**/*.restomatic; do
    cd "${file%/test.restomatic}"
    locationNew="$target${file#$source}"
    mkdir -p "$(dirname "$locationNew")"
    sed -e 's/\\/\//g' test.restomatic \
        | awk '{if ($1 ~ /^(LOAD|IMPORT)/) system("cat " $2); else print;}' \
        | sed -e 's/\\/\//g' \
        | awk '{if ($1 ~ /^(LOAD|IMPORT)/) system("cat " $2); else print;}' > "$locationNew"
done
If your bash version is 4 or above, this should work:
source="testing/web testing/"
target="c:/convertedFiles/"
# let ** be recursive
shopt -s globstar
for file in "$source"/**/*.test; do
newfile= "$target/${file#$source}"
mkdir -p "$(dirname "$newfile")"
conversion.command "$file" > "$newfile"
done
${file#$source} lops $source off the beginning of $file.
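A small hypothetical illustration of that prefix-stripping expansion:
source="testing/web testing/"
file="testing/web testing/suite1/login.test"
echo "${file#$source}"    # prints: suite1/login.test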
If you can guarantee that no files have newlines in their name, you can use find to get the names:
source="testing/web testing/"
target="c:/convertedFiles/"
find "$source" -name \*.test | while read file; do
newfile= "$target/${file#$source}"
mkdir -p "$(dirname "$newfile")"
conversion.command "$file" > "$newfile"
done
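If names might contain newlines after all, a NUL-delimited variant avoids the problem; a sketch, assuming GNU find and bash:
find "$source" -name '*.test' -print0 | while IFS= read -r -d '' file; do
    newfile="$target/${file#$source}"
    mkdir -p "$(dirname "$newfile")"
    conversion.command "$file" > "$newfile"
done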
Your best bet would be to use find to get the list of files. You can do it as follows:
export IFS=$'\n'                          # set field separator to newlines only
cd testing                                # change to the source directory
find . -type d > /tmp/test.dirs           # make a list of local directories
for i in `cat /tmp/test.dirs`; do         # for each directory
    mkdir -p "c:/convertedFiles/$i"       # create it in the new location
done
find . -iname '*.test' > /tmp/test.files  # record local file paths as needed
for i in `cat /tmp/test.files`; do        # for each test file
    process "$i" > "c:/convertedFiles/$i" # process it and store in the new dir
done
Note that this is not the most optimal way -- but it is the easiest to understand and follow. It should work with spaces in filenames. You may have to tweak it further to get it to work under Windows.
I would look into a tool called sshfs, or Secure Shell File System. It lets you mount a portion of a remote file system to somewhere local to you.
Once you have the remote fs mounted locally, you can run the follow shell script:
for f in *.*;
do
    echo "do something to $f file..";
done
EDIT: I initially did not realize that target was always local anyway.
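For completeness, a minimal sshfs sketch (assuming sshfs is installed, you have SSH access to the remote host, and the paths shown are hypothetical):
mkdir -p ~/remote_target
sshfs user@remotehost:/path/to/target ~/remote_target
# run the conversion loop with target=~/remote_target, then unmount:
fusermount -u ~/remote_target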

Shell scripting debug help - Iterating through files in a directory

#!/bin/sh
files = `ls /myDir/myDir2/myDir3/`
for file in $files do
echo $file
java myProg $file /another/directory/
done
What i'm trying to do is iterate through every file name under /myDir/myDir2/myDir3/, then use that file name as the first argument in calling a java program (second argument is "/another/directory")
When I run this script: . myScript.sh
I get this error:
-bash: files: command not found
What did I do wrong in my script? Thanks!
Per Neeaj's answer, strip off the whitespace from files =.
Better yet, use:
#!/bin/sh
dir=/myDir/MyDir2/MyDir3
for path in "$dir"/*; do
    file=$(basename "$path")
    echo "$file"
    java myProg "$file" arg2 arg3
done
Bash is perfectly capable of expanding the * wildcard itself, without spawning a copy of ls to do the job for it!
EDIT: changed to call basename rather than echo to meet OP's (previously unstated) requirement that the path echoed be relative and not absolute. If the cwd doesn't matter, then even better I'd go for:
#!/bin/sh
cd /myDir/MyDir2/MyDir3
for file in *; do
    echo "$file"
    java myProg "$file" arg2 arg3
done
and avoid the calls to basename altogether.
Strip off the whitespace in and after files = so that it reads files=RHS-of-assignment.
Remove the space surrounding the '=': change
files = `ls /myDir/myDir2/myDir3/`
into:
files=`ls /myDir/myDir2/myDir3/`
and move the 'do' statement to its own line:
for file in $files
do
....
Quote your variables, and there's no need to use ls:
#!/bin/sh
for file in /myDir/myDir2/*
do
    java myProg "$file" /another/directory/
done
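One hedged refinement: if the directory can be empty, the glob stays unexpanded and the loop runs once on the literal pattern, so a portable guard is to skip non-existent entries:
#!/bin/sh
for file in /myDir/myDir2/*
do
    [ -e "$file" ] || continue   # skip the unexpanded pattern when nothing matches
    java myProg "$file" /another/directory/
done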

How best to include other scripts?

The way you would normally include a script is with "source"
eg:
main.sh:
#!/bin/bash
source incl.sh
echo "The main script"
incl.sh:
echo "The included script"
The output of executing "./main.sh" is:
The included script
The main script
... Now, if you attempt to execute that shell script from another location, it can't find the include unless it's in your path.
What's a good way to ensure that your script can find the include script, especially if for instance, the script needs to be portable?
I tend to make my scripts all be relative to one another.
That way I can use dirname:
#!/bin/sh
my_dir="$(dirname "$0")"
"$my_dir/other_script.sh"
I know I am late to the party, but this should work no matter how you start the script and uses builtins exclusively:
DIR="${BASH_SOURCE%/*}"
if [[ ! -d "$DIR" ]]; then DIR="$PWD"; fi
. "$DIR/incl.sh"
. "$DIR/main.sh"
. (dot) is an alias for source; $PWD is the path of the working directory; BASH_SOURCE is an array variable whose members are the source filenames; ${string%substring} strips the shortest match of $substring from the back of $string.
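A quick hypothetical illustration of that ${string%substring} expansion:
path="/opt/tools/bin/run.sh"
echo "${path%/*}"    # prints: /opt/tools/bin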
An alternative to:
scriptPath=$(dirname $0)
is:
scriptPath=${0%/*}
.. the advantage being not having the dependence on dirname, which is not a built-in command (and not always available in emulators)
If it is in the same directory you can use dirname $0:
#!/bin/bash
source "$(dirname "$0")/incl.sh"
echo "The main script"
I think the best way to do this is to use Chris Boran's way, BUT you should compute MY_DIR this way:
#!/bin/sh
MY_DIR=$(dirname "$(readlink -f "$0")")
"$MY_DIR/other_script.sh"
To quote the man pages for readlink:
readlink - display value of a symbolic link
...
-f, --canonicalize
canonicalize by following every symlink in every component of the given
name recursively; all but the last component must exist
I've never encountered a use case where MY_DIR is not correctly computed. If you access your script through a symlink in your $PATH it works.
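A hypothetical illustration of the symlink case:
#   /usr/local/bin/mytool -> /opt/mytool/bin/mytool.sh
readlink -f /usr/local/bin/mytool    # prints /opt/mytool/bin/mytool.sh
# so MY_DIR ends up as /opt/mytool/bin, next to the real script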
A combination of the answers to this question provides the most robust solution.
It worked for us in production-grade scripts with great support of dependencies and directory structure:
#!/bin/bash
# Full path of the current script
THIS=`readlink -f "${BASH_SOURCE[0]}" 2>/dev/null||echo $0`
# The directory where current script resides
DIR=`dirname "${THIS}"`
# 'Dot' means 'source', i.e. 'include':
. "$DIR/compile.sh"
The method supports all of these:
Spaces in path
Links (via readlink)
${BASH_SOURCE[0]} is more robust than $0
SRC=$(cd "$(dirname "$0")"; pwd)
source "${SRC}/incl.sh"
1. Neatest
I explored almost every suggestion and here is the neatest one that worked for me:
script_root=$(dirname $(readlink -f $0))
It works even when the script is symlinked to a $PATH directory.
See it in action here: https://github.com/pendashteh/hcagent/blob/master/bin/hcagent
2. The coolest
# Copyright https://stackoverflow.com/a/13222994/257479
script_root=$(ls -l /proc/$$/fd | grep "255 ->" | sed -e 's/^.\+-> //')
This is actually from another answer on this very page, but I'm adding it to my answer too!
3. The most reliable
Alternatively, in the rare case that those didn't work, here is the bullet proof approach:
# Copyright http://stackoverflow.com/a/7400673/257479
myreadlink() { [ ! -h "$1" ] && echo "$1" || (local link="$(expr "$(command ls -ld -- "$1")" : '.*-> \(.*\)$')"; cd $(dirname $1); myreadlink "$link" | sed "s|^\([^/].*\)\$|$(dirname $1)/\1|"); }
whereis() { echo $1 | sed "s|^\([^/].*/.*\)|$(pwd)/\1|;s|^\([^/]*\)$|$(which -- $1)|;s|^$|$1|"; }
whereis_realpath() { local SCRIPT_PATH=$(whereis $1); myreadlink ${SCRIPT_PATH} | sed "s|^\([^/].*\)\$|$(dirname ${SCRIPT_PATH})/\1|"; }
script_root=$(dirname $(whereis_realpath "$0"))
You can see it in action in taskrunner source: https://github.com/pendashteh/taskrunner/blob/master/bin/taskrunner
Hope this helps someone out there :)
Also, please leave it as a comment if one did not work for you and mention your operating system and emulator. Thanks!
This works even if the script is sourced:
source "$( dirname "${BASH_SOURCE[0]}" )/incl.sh"
You need to specify the location of the other scripts, there is no other way around it. I'd recommend a configurable variable at the top of your script:
#!/bin/bash
installpath=/where/your/scripts/are
. $installpath/incl.sh
echo "The main script"
Alternatively, you can insist that the user maintain an environment variable indicating where your program home is at, like PROG_HOME or somesuch. This can be supplied for the user automatically by creating a script with that information in /etc/profile.d/, which will be sourced every time a user logs in.
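A minimal sketch of that approach, with a hypothetical PROG_HOME and install path:
# /etc/profile.d/myprog.sh (sourced at login)
export PROG_HOME=/opt/myprog

# then, in your scripts:
. "$PROG_HOME/incl.sh"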
I'd suggest that you create a setenv script whose sole purpose is to provide locations for various components across your system.
All other scripts would then source this script so that all locations are common across all scripts using the setenv script.
This is very useful when running cronjobs. You get a minimal environment when running cron, but if you make all cron scripts first include the setenv script then you are able to control and synchronise the environment that you want the cronjobs to execute in.
We used such a technique on our build monkey that was used for continuous integration across a project of about 2,000 kSLOC.
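A minimal sketch of the setenv idea, with hypothetical paths:
# setenv.sh -- the single place that records component locations
export SCRIPTS_HOME=/opt/build/scripts
export TOOLS_HOME=/opt/build/tools

# every other script (including cron scripts) starts with:
. /opt/build/setenv.sh
. "$SCRIPTS_HOME/incl.sh"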
Shell Script Loader is my solution for this.
It provides a function named include() that can be called many times in many scripts to refer a single script but will only load the script once. The function can accept complete paths or partial paths (script is searched in a search path). A similar function named load() is also provided that will load the scripts unconditionally.
It works for bash, ksh, pd ksh and zsh with optimized scripts for each one of them; and other shells that are generically compatible with the original sh like ash, dash, heirloom sh, etc., through a universal script that automatically optimizes its functions depending on the features the shell can provide.
[Forwarded example]
start.sh
This is an optional starter script. Placing the startup calls here is just a convenience; they can be placed in the main script instead. This script is also not needed if the scripts are to be compiled.
#!/bin/sh
# load loader.sh
. loader.sh
# include directories to search path
loader_addpath /usr/lib/sh deps source
# load main script
load main.sh
main.sh
include a.sh
include b.sh
echo '---- main.sh ----'
# remove loader from shellspace since
# we no longer need it
loader_finish
# main procedures go from here
# ...
a.sh
include main.sh
include a.sh
include b.sh
echo '---- a.sh ----'
b.sh
include main.sh
include a.sh
include b.sh
echo '---- b.sh ----'
output:
---- b.sh ----
---- a.sh ----
---- main.sh ----
Best of all, scripts based on it may also be compiled to form a single script with the available compiler.
Here's a project that uses it: http://sourceforge.net/p/playshell/code/ci/master/tree/. It can run portably with or without compiling the scripts. Compiling to produce a single script can also happen, and is helpful during installation.
I also created a simpler prototype for any conservative party that may want to have a brief idea of how an implementation script works: https://sourceforge.net/p/loader/code/ci/base/tree/loader-include-prototype.bash. It's small, and anyone can just include the code in their main script if they want to, provided their code is intended to run with Bash 4.0 or newer; it also doesn't use eval.
Steve's reply is definitely the correct technique but it should be refactored so that your installpath variable is in a separate environment script where all such declarations are made.
Then all scripts source that script and should installpath change, you only need to change it in one location. Makes things more, er, futureproof. God I hate that word! (-:
BTW You should really refer to the variable using ${installpath} when using it in the way shown in your example:
. ${installpath}/incl.sh
If the braces are left out, some shells will try and expand the variable "installpath/incl.sh"!
I put all my startup scripts in a .bashrc.d directory.
This is a common technique in such places as /etc/profile.d, etc.
while read file; do source "${file}"; done <<HERE
$(find ${HOME}/.bashrc.d -type f)
HERE
The problem with the solution using globbing...
for file in ${HOME}/.bashrc.d/*.sh; do source ${file};done
...is you might have a file list which is "too long".
An approach like...
find ${HOME}/.bashrc.d -type f | while read file; do source ${file}; done
...runs but doesn't change the environment as desired.
This should work reliably:
source_relative() {
    local dir="${BASH_SOURCE%/*}"
    [[ -z "$dir" ]] && dir="$PWD"
    source "$dir/$1"
}

source_relative incl.sh
Using source or $0 will not give you the real path of your script. You could use the process id of the script to retrieve its real path
ls -l /proc/$$/fd |
grep "255 ->" |
sed -e 's/^.\+-> //'
I am using this script and it has always served me well :)
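For example, a hedged way to fold that into variables (Linux-specific; fd 255 is a bash implementation detail):
script_path=$(ls -l /proc/$$/fd | grep "255 ->" | sed -e 's/^.\+-> //')
script_dir=$(dirname "$script_path")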
Of course, to each their own, but I think the block below is pretty solid. I believe this involves the "best" way to find a directory, and the "best" way to call another bash script:
scriptdir=`dirname "$BASH_SOURCE"`
source "$scriptdir/incl.sh"
echo "The main script"
So this may be the "best" way to include other scripts. This is based off another "best" answer that tells a bash script where it is stored
Personally, I put all libraries in a lib folder and use an import function to load them.
folder structure:
script.sh
lib/
  utils.sh
  requirements.sh
script.sh contents
# Imports '.sh' files from 'lib' directory
function import()
{
    local file="./lib/$1.sh"
    local error="\e[31mError: \e[0mCannot find \e[1m$1\e[0m library at: \e[2m$file\e[0m"
    if [ -f "$file" ]; then
        source "$file"
        if [ -z "$IMPORTED" ]; then
            echo -e "$error"
            exit 1
        fi
    else
        echo -e "$error"
        exit 1
    fi
}
Note that this import function should be at the beginning of your script and then you can easily import your libraries like this:
import "utils"
import "requirements"
Add a single line at the top of each library (i.e. utils.sh):
IMPORTED="$BASH_SOURCE"
Now you have access to functions inside utils.sh and requirements.sh from script.sh
TODO: Write a linker to build a single sh file
We just need to find out the folder where our incl.sh and main.sh are stored; just change your main.sh to this:
main.sh
#!/bin/bash
SCRIPT_NAME=$(basename "$0")
SCRIPT_DIR="$(echo "$0" | sed "s/$SCRIPT_NAME//g")"
source "$SCRIPT_DIR/incl.sh"
echo "The main script"
According to man hier, a suitable place for script includes is /usr/local/lib/:
/usr/local/lib
Files associated with locally installed programs.
Personally I prefer /usr/local/lib/bash/includes for includes.
There is a bash-helpers lib for including libs in that way:
#!/bin/bash
. /usr/local/lib/bash/includes/bash-helpers.sh
include api-client || exit 1 # include shared functions
include mysql-status/query-builder || exit 1 # include script functions
# include script functions with status message
include mysql-status/process-checker; status 'process-checker' $? || exit 1
include mysql-status/nonexists; status 'nonexists' $? || exit 1
Most of the answers I saw here seem to overcomplicate things. This method has always worked reliably for me:
FULLPATH=$(readlink -f "$0")
INCPATH=${FULLPATH%/*}
INCPATH will hold the complete path of the script excluding the script filename, regardless of how the script is called (by $PATH, relative or absolute).
After that, one only needs to do this to include files in the same directory:
. "$INCPATH/file_to_include.sh"
Reference: TecPorto / Location independent includes
Here is a nice function you can use. It builds on what @sacii made; thank you.
It will let you list any number of space-separated script names to source (relative to the script calling source_files).
Optionally, you can pass an absolute or relative path as the first argument and it will source from there instead.
You can call it multiple times (see the example below) to source scripts from different dirs.
#!/usr/bin/env bash

function source_files() {
    local scripts_dir
    scripts_dir="$1"
    if [ -d "$scripts_dir" ]; then
        shift
    else
        scripts_dir="${BASH_SOURCE%/*}"
        if [[ ! -d "$scripts_dir" ]]; then scripts_dir="$PWD"; fi
    fi
    for script_name in "$@"; do
        # shellcheck disable=SC1091 disable=SC1090
        . "$scripts_dir/$script_name.sh"
    done
}
Here is an example you can run to show how it's used:
#!/usr/bin/env bash

function source_files() {
    local scripts_dir
    scripts_dir="$1"
    if [ -d "$scripts_dir" ]; then
        shift
    else
        scripts_dir="${BASH_SOURCE%/*}"
        if [[ ! -d "$scripts_dir" ]]; then scripts_dir="$PWD"; fi
    fi
    for script_name in "$@"; do
        # shellcheck disable=SC1091 disable=SC1090
        . "$scripts_dir/$script_name.sh"
    done
}
## -- EXAMPLE -- ##

# assumes dir structure:
# /
#   source_files.sh
#   sibling.sh
#   scripts/
#     child.sh
#   nested/
#     scripts/
#       grandchild.sh
cd /tmp || exit 1
# sibling.sh
tee sibling.sh <<- EOF > /dev/null
#!/usr/bin/env bash
export SIBLING_VAR='sibling var value'
EOF
# scripts/child.sh
mkdir -p scripts
tee scripts/child.sh <<- EOF > /dev/null
#!/usr/bin/env bash
export CHILD_VAR='child var value'
EOF
# nested/scripts/grandchild.sh
mkdir -p nested/scripts
tee nested/scripts/grandchild.sh <<- EOF > /dev/null
#!/usr/bin/env bash
export GRANDCHILD_VAR='grandchild var value'
EOF
source_files 'sibling'
source_files 'scripts' 'child'
source_files 'nested/scripts' 'grandchild'
echo "$SIBLING_VAR"
echo "$CHILD_VAR"
echo "$GRANDCHILD_VAR"
rm sibling.sh
rm -rf scripts nested
cd - || exit 1
prints:
sibling var value
child var value
grandchild var value
You can also use:
PWD=$(pwd)
source "$PWD/inc.sh"
