How to obtain the full PATH, *allowing* for symbolic links - bash

I have written bash scripts that accept a directory name as an argument. A single dot ('.') is a valid directory name, but I sometimes need to know where '.' is. The readlink and realpath commands provide a resolved path, which does not help because I need to allow for symbolic links.
For example, the resolved path to the given directory might be something like /mnt/vol_01/and/then/some, whereas the script is called with '.' where '.' is /app/then/some (a sym link which would resolve to the first path I gave).
What I have done to solve my problem is use cd and pwd in combination to provide the full path I want, and it seems to have worked OK so far.
A simplified example of a script:
DEST_DIR=$1
# Convert the given destination directory to a full path, ALLOWING
# for symbolic links. This is necessary in cases where '.' is
# given as the destination directory.
DEST_DIR=$(cd $DEST_DIR && pwd -L)
# Do stuff in $DEST_DIR
My question is: is my use of cd and pwd the best way to get what I want? Or is there a better way?

If all you want to do is make an absolute path with minimal changes from a relative path, then a simple, safe, and fast way to do it is:
[[ $dest_dir == /* ]] || dest_dir=$PWD/$dest_dir
(See Correct Bash and shell script variable capitalization for an explanation of why dest_dir is preferable to DEST_DIR.)
The code above will work even if the directory doesn't exist (yet) or if it's not possible to cd to it (e.g. because its permissions don't allow it). It may produce paths with redundant '.' components, '..' components, and redundant slashes ('/a//b', '//a/b/', ...).
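As a rough illustration of the one-liner above, here is a minimal sketch wrapped in a function. The name make_absolute is hypothetical and only used for this example; the function never touches the filesystem, so it keeps '.' and '..' components exactly as given.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): make a path absolute
# without resolving symlinks and without requiring the directory to exist.
make_absolute() {
    local path=$1
    # Paths that already start with '/' are left alone;
    # anything else is anchored at the current logical directory.
    [[ $path == /* ]] || path=$PWD/$path
    printf '%s\n' "$path"
}
```

Calling make_absolute with '.' simply yields "$PWD/.", which shows the kind of redundant component mentioned above.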
If you want a minimally cleaned path (leaving symlinks unresolved), then a modified version of your original code may be a reasonable option:
dest_dir=$(cd -- "$dest_dir"/ && pwd)
The -- is necessary to handle directory names that begin with '-'.
The quotes in "$dest_dir" are necessary to handle names that contain whitespace (actually $IFS characters) or glob characters.
The trailing slash on "$dest_dir"/ is necessary to handle a directory whose relative name is simply -.
Plain pwd is sufficient because it behaves as if -L was specified by default.
Note that the code will set dest_dir to the empty string if the cd fails. You probably want to check for that before doing anything else with the variable.
Note also that $(cd ...) will create a subshell with Bash. That's good in one way because there's no need to cd back to the starting directory afterwards (which may not be possible), but it could cause a performance problem if you do it a lot (e.g. in a loop).
Finally, note that the code won't work if the directory name contains one or more trailing newlines (e.g. as created by mkdir $'dir\n'). It's possible to fix the problem (in case you really care about it), but it's messy. See How to avoid bash command substitution to remove the newline character? and shell: keep trailing newlines ('\n') in command substitution. One possible way to do it is:
dest_dir=$(cd -- "$dest_dir"/ && printf '%s.' "$PWD") # Add a trailing '.'
dest_dir=${dest_dir%.} # Remove the trailing '.'
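The notes above can be combined into a small function. This is only a sketch under a few assumptions: the name abs_logical_dir is hypothetical, the empty-argument check and the CDPATH= prefix are additions that are not part of the original answer, and trailing newlines in directory names are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the absolute logical path (symlinks left
# unresolved) of the directory given as $1, or fail if it cannot be entered.
abs_logical_dir() {
    # Assumption: an empty argument is an error. Without this check,
    # "$1"/ would expand to "/" and silently select the root directory.
    [[ -n $1 ]] || return 1
    # The subshell keeps the caller's working directory unchanged.
    # CDPATH= stops cd from searching CDPATH for relative names.
    ( CDPATH= cd -- "$1"/ && pwd -L )
}

# Possible use inside a script:
#   dest_dir=$(abs_logical_dir "$1") || { echo "cannot enter: $1" >&2; exit 1; }
```

Checking the function's exit status avoids carrying on with an empty dest_dir when the cd fails.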

Related

Organizing Files In Directories with Terminal

So I am wondering if there is any way to organize a directory on a mac with the terminal. I am a beginner with using the terminal and just seeing if this is possible.
I have a script that will scrape various pages and save certain data to a file (data irrelevant), such as this picture.
[Image: directory that needs organizing]
I would like to know if I can write something that will read the file names and create directories that correspond. For example, it runs a loop that will read all files with "Year2014", create a folder named "Year2014", then place the files inside.
If you have any other questions, let me know!
The short answer is "Yes", and the longer answer is there are many ways to do it. Since you are using bash (or any POSIX shell), you have parameter expansion with substring removal available to help you trim text from the end of each filename to isolate the "YearXXXX" part of the filename that you can then use to (1) create the directory, and (2) move the file into the newly created directory.
Presuming Filenames Formatted WeekXXYearXXXX.txt
Take, for example, a simple for loop where the loop variable f contains each filename in turn. You can isolate the "WeekXX" part of the name with a parameter expansion that trims from the right of the string through 'Y', leaving "WeekXX", and save the result in a temporary variable. You can then use that temporary variable to remove the "WeekXX" text from the front of the original filename, leaving "YearXXXX.txt". Finally, remove ".txt" from the end to arrive at the name of the directory to put the file in.
Scriptwise it would look like:
for f in *.txt; do ## loop over .txt files using variable $f
    tmp="${f%%Y*}" ## remove through 'Y' from right
    dname="${f#"$tmp"}" ## remove contents of tmp from left (quoted so it is taken literally)
    dname="${dname%.txt}" ## remove .txt
    mkdir -p "$dname" ## create dname (no error if exists)
    mv "$f" "$dname" ## move $f to $dname
done
Where the temporary variable used is tmp and the final directory name is stored in the variable dname.
(note: you may want to use mv -i if you want mv to prompt before overwriting if the filename already exists in the target directory)
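For files that might not follow the pattern, a slightly more defensive variant can help. The following is a sketch only: the function name organize_by_year is hypothetical, and it assumes the literal text "Year" appears in every filename that should be moved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move each *Year*.txt file in directory $1 (default: .)
# into a subdirectory named after its "YearXXXX" part.
organize_by_year() {
    local dir=${1:-.} f name dname
    for f in "$dir"/*Year*.txt; do
        [[ -f $f ]] || continue              # glob matched nothing, or not a regular file
        name=${f##*/}                        # filename without the leading path
        dname=${name#"${name%%Year*}"}       # drop everything before "Year"
        dname=${dname%.txt}                  # drop the extension
        mkdir -p -- "$dir/$dname" && mv -i -- "$f" "$dir/$dname/"
    done
}
```

Restricting the glob to *Year*.txt means a stray .txt file without a year never produces an empty directory name.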
You can refer to man bash under the Parameter Expansion heading to read the specifics of each expansion which (among many more) are described as:
${var#pattern} Strip shortest match of pattern from front of $var
${var##pattern} Strip longest match of pattern from front of $var
${var%pattern} Strip shortest match of pattern from back of $var
${var%%pattern} Strip longest match of pattern from back of $var
Note this set of parameter expansions is POSIX so it will work with any POSIX shell, while most of the remaining expansions are bashisms (bash-only)
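To see the four expansions side by side, here is a small worked example. The filename archive.tar.gz is just an illustrative value, not something from the question.

```shell
# Illustrative value only; any string containing dots would do.
f='archive.tar.gz'

shortest_front=${f#*.}    # strip shortest "*." from the front  -> tar.gz
longest_front=${f##*.}    # strip longest  "*." from the front  -> gz
shortest_back=${f%.*}     # strip shortest ".*" from the back   -> archive.tar
longest_back=${f%%.*}     # strip longest  ".*" from the back   -> archive

printf '%s\n' "$shortest_front" "$longest_front" "$shortest_back" "$longest_back"
```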
Let me know if you have further questions.

How to batch replace part of filenames with the name of their parent directory in a Bash script?

All of my file names follow this pattern:
abc_001.jpg
def_002.jpg
ghi_003.jpg
I want to replace the characters before the numbers and the underscore (not necessarily letters) with the name of the directory in which those files are located. Let's say this directory is called 'Pictures'. So, it would be:
Pictures_001.jpg
Pictures_002.jpg
Pictures_003.jpg
Normally, the way this website works, is that you show what you have done, what problem you have, and we give you a hint on how to solve it. You didn't show us anything, so I will give you a starting point, but not the complete solution.
You need to know what to replace: you have given the examples abc_001 and def_002, but are you sure that the "to-be-replaced" part is always exactly 3 characters long? If so, you could use the basic cut command to delete it. Otherwise, you could use the position of the '_' character, or use grep -o, as in this simple example:
ls -ltra | grep -o "_[0-9][0-9][0-9]\.jpg"
As for the current directory, you can get it from the environment variable $PWD (if Pictures is the deepest subdirectory, you could use cut with '/' as the separator and take the last field).
You can see the current directory with pwd, but also with echo "${PWD}".
With ${x#something} you can delete something from the beginning of the variable. something can have wildcards, in which case # deletes the smallest, and ## the largest match.
First try the next command for understanding above explanation:
echo "The last part of the current directory `pwd` is ${PWD##*/}"
The same construction can be used for cutting the filename, so you can do
for f in *_*.jpg; do
    mv -- "$f" "${PWD##*/}_${f#*_}"
done
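The same idea can be sketched as a function that takes the directory as an argument instead of relying on the current directory. The name rename_to_parent is hypothetical, and the use of mv -n (never overwrite) is a cautious assumption rather than part of the answer above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename every <prefix>_<rest>.jpg in directory $1
# to <dirname>_<rest>.jpg, where <dirname> is the directory's own name.
rename_to_parent() {
    local dir parent f base target
    dir=$(cd -- "$1" && pwd) || return 1     # absolute path, so the last component is a real name
    parent=${dir##*/}                        # e.g. "Pictures"
    for f in "$dir"/*_*.jpg; do
        [[ -e $f ]] || continue              # the glob matched nothing
        base=${f##*/}                        # filename without the path
        target=$dir/${parent}_${base#*_}     # replace everything up to the first '_'
        [[ $f == "$target" ]] && continue    # already has the desired name
        mv -n -- "$f" "$target"              # -n: do not overwrite an existing file
    done
}
```

If two files would end up with the same new name, the second one is left untouched rather than overwriting the first.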

prepending to the $PATH

In order to avoid ad-hoc setting of my PATH by the usual technique of blindly appending - I started hacking some code to prepend items to my path (asdf path for example).
pathprepend() {
for ARG in "$@"
do
export PATH=${${PATH}/:$"ARG"://}
export PATH=${${PATH}/:$"ARG"//}
export PATH=${${PATH}/$"ARG"://}
export PATH=$ARG:${PATH}
done
}
It's invoked like this: pathprepend /usr/local/bin, and /usr/local/bin gets prepended to PATH. The function is also supposed to cleanly remove /usr/local/bin from its original position in PATH (which it does, but not cleanly, thanks to a dodgy regex).
Can anyone recommend a cleaner way to do this? The shell (bash) regex support is a bit limited. I'd much rather split into an array and delete the redundant element, but wonder how portable either that or my implementation is. My feeling is, not particularly.
If you want to split PATH into an array, that can be done like so:
IFS=: eval 'arr=($PATH)'
This creates an array, arr, whose elements are the colon-delimited elements of the PATH string.
However, in my opinion, that doesn't necessarily make it easier to do what you want to do. Here's how I would prepend to PATH:
for ARG in "$@"
do
    # Quoting ARG makes it match literally rather than as a pattern
    while [[ $PATH == *":$ARG:"* ]]
    do
        PATH=${PATH//":$ARG:"/:}
    done
    PATH=${PATH#"$ARG":}
    PATH=${PATH%:"$ARG"}
    export PATH="${ARG}:${PATH}"
done
This uses bash substitution to remove ARG from the middle of PATH, remove ARG from the beginning of PATH, remove ARG from the end of PATH, and finally prepend ARG to PATH. This approach has the benefit of removing all instances of ARG from PATH in cases where it appears multiple times, ensuring the only instance will be at the beginning after the function has executed.
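For comparison, the array-based approach the question asks about might look roughly like the sketch below. It is not the answer's method: it splits PATH with read -a instead of eval, and it has a known limitation that a trailing empty PATH entry (a trailing colon) is dropped by read.

```shell
#!/usr/bin/env bash
# Sketch of an array-based variant: split PATH on ':', drop every element
# equal to the argument, then put the argument at the front.
pathprepend() {
    local arg entry new_path
    local -a parts
    for arg in "$@"; do
        IFS=: read -r -a parts <<< "$PATH"   # split PATH into its elements
        new_path=$arg
        for entry in "${parts[@]}"; do
            # Keep every element except exact duplicates of arg
            [[ $entry == "$arg" ]] || new_path+=:$entry
        done
        PATH=$new_path
    done
    export PATH
}
```

Because elements are compared as whole strings, there is no need for separate handling of matches at the beginning, middle, or end of PATH. Arrays and here-strings make this bash-specific rather than portable to plain sh.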

What is the meaning of '-*' as case parameter in a fish shell script?

The official documentation of fish shell has this example.
function mkdir -d "Create a directory and set CWD"
command mkdir $argv
if test $status = 0
switch $argv[(count $argv)]
case '-*'
case '*'
cd $argv[(count $argv)]
return
end
end
end
I understand case '*' is like default: in C++ switch statement.
What is the meaning or usage of case '-*'?
It's a glob match.
case '-*' will be executed whenever the switched parameter starts with a "-".
And because only the first matching case will be used, case '*' as the last case is like "default:". If you had it earlier, it would swallow all cases after it.
Also the quotes here are necessary because otherwise fish would expand that glob, which would mean case -* would have all matching filenames in the current directory as parameters, so it would be true if the switched parameter is the name of a file in the current directory that starts with "-".
With the help of @faho's answer, I understand the purpose of -*.
-* is a glob pattern. It is no different from patterns like *.pdf or Report_2016_*.
The author added this check to skip directories whose names start with -. The function will still create such a directory, but will not set the CWD to it.
The reason is that - has a special meaning in shells.
For example, cd - does not change into a directory named -. Instead, it switches to the last directory you were in.
Directories or files whose names start with - are a source of trouble. The following questions on SO sister sites give an idea:
How do you enter a directory that's name is only a minus?
How do I delete a file whose name begins with “-” (hyphen a.k.a. dash or minus)?
How to cd into a directory with this name “-2” (starting with the hyphen)?
No wonder the author decided to ignore directories that start with -.

Removing an optional / (directory separator) in Bash

I have a Bash script that takes in a directory as a parameter, and after some processing will do some output based on the files in that directory.
The command would be like the following, where dir is a directory with the following structure inside
dir/foo
dir/bob
dir/haha
dir/bar
dir/sub-dir
dir/sub-dir/joe
> myscript ~/files/stuff/dir
After some processing, I'd like the output to be something like this
foo
bar
sub-dir/joe
The code I have to remove the path passed in is the following:
shopt -s extglob
for file in $files ; do
filename=${file#${1}?(/)}
This gets me to the following, but for some reason the optional / is not being taken care of. Thus, my output looks like this:
/foo
/bar
/sub-dir/joe
The reason I'm making it optional is because if the user runs the command
> myscript ~/files/stuff/dir/
I want it to still work. And, as it stands, if I run that command with the trailing slash, it outputs as desired.
So, why does my ?(/) not work? Based on everything I've read, that should be the right syntax, and I've tried a few other variations as well, all to no avail.
Thanks.
that other guy's helpful answer solves your immediate problem, but there are two things worth noting:
enumerating filenames with an unquoted string variable (for file in $files) is ill-advised, as sjsam's helpful answer points out: it will break with filenames with embedded spaces and filenames that look like globs; as stated, storing filenames in an array is the robust choice.
there is no strict need to change global shell option shopt -s extglob: parameter expansions can be nested, so the following would work without changing shell options:
# Sample values:
file='dir/sub-dir/joe'
set -- 'dir/' # set $1; value 'dir' would have the same effect.
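filename=${file#"${1%/}"/} # -> 'sub-dir/joe'
The inner parameter expansion, ${1%/}, removes a trailing (%) / from $1, if any; the literal / that follows it then strips the separator, so the result is the same whether or not $1 was passed with a trailing slash.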
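Put together with the array advice, this could be sketched as a small function. The name strip_dir_prefix is hypothetical; the first argument plays the role of $1 in the question and the remaining arguments are the file paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each path with the leading directory removed,
# whether or not the directory was given with a trailing slash.
strip_dir_prefix() {
    local prefix=${1%/} file    # normalise: drop one trailing '/' if present
    shift
    for file in "$@"; do
        # Quoting $prefix makes it match literally, even if it contains glob characters
        printf '%s\n' "${file#"$prefix"/}"
    done
}

# Possible use, with the filenames kept in an array:
#   files=(dir/foo dir/sub-dir/joe)
#   strip_dir_prefix dir/ "${files[@]}"
```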
I suggested you change files to an array which is a possible workaround for non-standard filenames that may contain spaces.
files=("dir/A/B" "dir/B" "dir/C")
for filename in "${files[@]}"
do
    echo "${filename##dir/}" # replace dir/ with your param.
done
Output
A/B
B
C
Here's the documentation from man bash under "Parameter Expansion":
${parameter#word}
${parameter##word}
Remove matching prefix pattern. The word is
expanded to produce a pattern just as in pathname
expansion. If the pattern matches the beginning of
the value of parameter, then the result of the
expansion is the expanded value of parameter with
the shortest matching pattern (the ``#'' case) or
the longest matching pattern (the ``##'' case)
deleted.
Since # tries to delete the shortest match, it will never include any trailing optional parts.
You can just use ## instead:
filename=${file##${1}?(/)}
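The difference between the two operators is easy to check with sample values. In this sketch, the variables file and prefix are illustrative stand-ins for the loop variable and $1 from the question.

```shell
#!/usr/bin/env bash
shopt -s extglob                  # needed for the ?(/) pattern

# Illustrative values standing in for $file and $1
file='dir/foo'
prefix='dir'

shortest=${file#${prefix}?(/)}    # '#' stops at the shortest match, "dir"
longest=${file##${prefix}?(/)}    # '##' also consumes the optional slash

printf '%s\n' "$shortest" "$longest"   # prints "/foo" then "foo"
```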
Depending on what your script does and how it works, you can also just rewrite it to cd to the directory to always work with paths relative to .
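A rough sketch of that cd-based alternative follows. The function name list_relative_files is hypothetical, and using find with sed to list the files is an assumption about what the script needs; the point is only that paths produced after the cd are already relative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list regular files under directory $1 with paths
# relative to that directory, regardless of a trailing slash on $1.
list_relative_files() {
    local dir=$1
    [[ -d $dir ]] || return 1
    # The subshell keeps the caller's working directory unchanged.
    # find prints "./name", so sed removes the leading "./".
    ( cd -- "$dir" && find . -type f | sed 's|^\./||' )
}
```

Since every path is generated relative to '.', there is no prefix to strip and the trailing-slash question disappears.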
