$PWD vs. pwd regarding portability - bash

I'm writing a shell script which parses the path of the current working directory (printing a list of all basenames above the current directory).
So far, I've been using the environment variable PWD to parse the path, but I wonder whether I can count on PWD (1) to always be set and (2) to give the same result on every platform.
Would it possibly be better to use the pwd shell-builtin? I need this script to run on as many platforms as possible, so I just wondered...

POSIX requires $PWD to be set in the following fashion:
PWD
This variable shall represent an absolute pathname of the current working directory. It shall not contain any components that are dot or dot-dot. The value is set by the cd utility, and by the sh utility during initialization.
So you can rely on that being set – but do note "... an absolute path...", not the absolute path.
bash (at least recent versions) will remember what symlinks you followed when setting $PWD (and so will the pwd builtin). The external pwd command (for example /bin/pwd or env pwd; note that command pwd still runs the builtin) may not. So you can get different results there, which might, or might not, be important for you. Use pwd -P if you want a path without symlinks.
Do note that the pwd documentation states:
If an application sets or unsets the value of PWD, the behavior of pwd is unspecified.
So, don't do that :)
In short, there is no winner here. The environment variable will be there in POSIX shells, as will the external command and possibly a built-in too. Choose the one that best fits your need, the important thing being whether you care about symlinks or not.
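The symlink difference is easy to see for yourself. The following is a minimal sketch, assuming a system that provides mktemp -d; the function name show_pwd_variants and the directory names real and link are made up for the demonstration.

```sh
# Hypothetical demo: compare the logical and physical working directory.
# The body runs in a subshell (note the parentheses), so the caller's
# working directory is left untouched.
show_pwd_variants() (
    demo_root=$(mktemp -d) || exit 1
    mkdir "$demo_root/real"
    ln -s real "$demo_root/link"       # link -> real
    cd "$demo_root/link" || exit 1

    printf 'PWD:    %s\n' "$PWD"       # logical: keeps the name "link"
    printf 'pwd -L: %s\n' "$(pwd -L)"  # logical as well
    printf 'pwd -P: %s\n' "$(pwd -P)"  # physical: symlinks resolved

    cd / && rm -rf "$demo_root"
)

show_pwd_variants
```

The first two lines should end in /link and the last one in /real. If your script compares or stores paths, pick one of the two forms and use it consistently.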

From the forum article "$PWD vs `pwd`", which compares AIX 4.2.1, AIX 6, Sparc Solaris 10 and Redhat 5 enterprise in this regard:
there is no difference between $PWD and builtin pwd,
there is no difference between builtin pwd -P and /usr/bin/pwd.
The former shows working directory with names of symbolic links whereas the latter displays actual path.
The only discrepancy is that external command is in /usr/bin in most systems and /bin in Redhat.

Another point to note is that command substitutions are not generally safe on trailing newlines.
This is obviously fairly contrived, but if you're really concerned about safely
handling input you should be using "$PWD". See, for example:
$ my_dir=$'/tmp/trailing_newline\n'
$ mkdir -p "$my_dir"
$ cd "$my_dir"
$ pwd
/tmp/trailing_newline
$ printf "%q\n" "$(pwd)" "$PWD"
/tmp/trailing_newline
$'/tmp/trailing_newline\n'
$ cd "$(pwd)"
sh: cd: /tmp/trailing_newline: No such file or directory
$ cd "$PWD"
It is possible to work around the command substitution but it is by no means
pretty. You can append a trailing character and then strip it with a
parameter expansion:
$ pwd_guarded="$(pwd; printf '#')"
$ pwd_fixed="${pwd_guarded%$'\n'#}"
$ printf "%q\n" "$pwd_fixed"
$'/tmp/trailing_newline\n'
$ cd "$pwd_fixed"
This is particularly ugly because you then also have to strip the newline that
pwd adds, which would normally have been stripped by the command substitution.
This becomes a total mess if you don't resort to non-POSIX constructs like
$'', so basically, just use "$PWD" if you care about these things. Of course
it is perfectly reasonable to just not support trailing newlines in directory
names.
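For completeness, the guard can be written without $'' by keeping a literal newline in a variable. This is only a sketch: the helper name get_pwd_fixed is hypothetical, and it stores its result in the variable pwd_fixed rather than printing it, since printing would force the caller back into a command substitution.

```sh
# Hypothetical helper: put the working directory into $pwd_fixed,
# preserving any trailing newlines in the directory name.
get_pwd_fixed() {
    # A variable holding exactly one newline character.
    nl='
'
    pwd_guarded=$(pwd; printf '#')    # the '#' shields trailing newlines
    pwd_fixed=${pwd_guarded%"$nl#"}   # drop pwd's own newline plus the guard
}

get_pwd_fixed
printf '%s\n' "$pwd_fixed"
```

It is still more code than "$PWD", which is the point the answer makes.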

If you know that bash is available and the script is executed with it, PWD is safe.
If, on some systems, only plain sh is available, use pwd.

If it were me, I'd use pwd since it is a built-in both for bash and sh. That does not mean they work identically in all respects, but if you are invoking it without options, that shouldn't matter.
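Whichever source you pick, the parsing step from the question looks the same. Here is a minimal sketch, assuming the goal is one path component per line; list_components is a made-up name, and it takes the path as an argument so that either "$PWD" or the output of pwd can be passed in. Components that themselves contain newlines would be ambiguous in this output.

```sh
# Hypothetical helper: print each component of a path on its own line.
# The body runs in a subshell, so the IFS and globbing changes do not
# leak into the caller.
list_components() (
    IFS=/      # split on slashes only
    set -f     # no filename expansion on the unquoted $1 below
    for part in $1; do
        # A leading slash (or a doubled one) yields an empty field; skip it.
        if [ -n "$part" ]; then
            printf '%s\n' "$part"
        fi
    done
)

list_components "$PWD"
```

For example, list_components /usr/local/bin prints usr, local and bin on three separate lines.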

Related

bash: What is the difference between PWD and CURDIR?

My Problem
I use a Makefile to run a docker run target, which needs the current working directory as one of its parameters.
I use either $(PWD) or $(CURDIR):
build: Dockerfile
	docker run ... <$(PWD) or $(CURDIR)>
They seem to be producing the same value. I don't know if there's a subtle difference that can bite me later, so I want to know the exact definition of each of them.
What Have I Tried
STFW
man make
My Question
What is the difference between $(PWD) and $(CURDIR) in a Makefile?
TL;DR
Use CURDIR.
Why?
First, thanks Renaud Pacalet for his comment.
CURDIR
Quoting the GNU Make Manual:
CURDIR
Set to the absolute pathname of the current working directory.
For your convenience, when GNU make starts (after it has processed any -C options) it sets the variable CURDIR to the pathname of the current working directory. This value is never touched by make again: in particular note that if you include files from other directories the value of CURDIR does not change. The value has the same precedence it would have if it were set in the makefile (by default, an environment variable CURDIR will not override this value). Note that setting this variable has no impact on the operation of make (it does not cause make to change its working directory, for example).
PWD
There's no reference to PWD in the Make manual. A quick env | grep PWD showed that it was set by the environment (in my case, zsh). GNU's notes about Special Shell Variables state that:
PWD
Posix 1003.1-2001 requires that cd and pwd must update the PWD environment variable to point to the logical name of the current directory, but traditional shells do not support this. This can cause confusion if one shell instance maintains PWD but a subsidiary and different shell does not know about PWD and executes cd; in this case PWD points to the wrong directory. Use `pwd` rather than $PWD.
Since CURDIR is guaranteed to work in Make and PWD might merely be inherited from the shell, the former should be preferred.
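One case where the two visibly diverge is make -C: make changes directory and sets CURDIR, but the inherited PWD still names the directory make was started from. Here is a small sketch, assuming GNU make is installed; the function name show_make_dirs and the sub directory are invented for the demo.

```sh
# Hypothetical demo: compare $(CURDIR) and $(PWD) under "make -C".
show_make_dirs() (
    demo_dir=$(mktemp -d) || exit 1
    mkdir "$demo_dir/sub"
    # Write a small Makefile; \t becomes the tab that make requires.
    printf 'show:\n\t@echo "CURDIR=$(CURDIR)"\n\t@echo "PWD=$(PWD)"\n' \
        > "$demo_dir/sub/Makefile"

    cd "$demo_dir" || exit 1
    export PWD                      # make sure make inherits the shell's PWD
    make --no-print-directory -C sub show

    cd / && rm -rf "$demo_dir"
)

# Only run the demo when make is actually available.
if command -v make >/dev/null 2>&1; then
    show_make_dirs
fi
```

The CURDIR line should end in /sub, while the PWD line should still show the parent directory.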

What is the `Cd` command?

I was writing some code, navigating my computer (OSX 10.11.6) via the command line, like I always do, and I made a typo! Instead of typing:
cd USB
I typed
Cd USB
Nothing happened, but it didn't register as an invalid command. Perplexed by this, I did some investigating: I checked the man entry. There was no entry. I found the source file (/usr/bin/Cd) using which Cd, and then cated it:
#!/bin/sh
# $FreeBSD: src/usr.bin/alias/generic.sh,v 1.2 2005/10/24 22:32:19 cperciva Exp $
# This file is in the public domain.
builtin `echo ${0##*/} | tr \[:upper:] \[:lower:]` ${1+"$@"}
What is this, and why is it here? How does it relate to freeBSD?
Any help would be amazing, thanks!
macOS uses a case-insensitive filesystem by default[1], which can be misleading at times:
which Cd is effectively the same as which cd and which CD in terms of returning the (effectively) same file path.
Confusingly, even though all 3 commands refer to the same file, they do so in a case-preserving manner, misleadingly suggesting that the actual case of the filename is whatever you specified.
As a workaround, you can see the true case of the filename if you employ globbing (filename expansion):
$ ls "$(which Cd)"* # could match additional files, but the one of interest is among them
/usr/bin/cd # true case of the filename
Bash (the macOS default shell) is internally case-sensitive.
That is, it recognizes cd as builtin cd (its built-in directory-changing command).
By contrast, it does NOT recognize Cd as that, due to the difference in case.
Given that it doesn't recognize Cd as a builtin, it goes looking for an external utility (in the $PATH), and that is when it finds /usr/bin/cd.
/usr/bin/cd is implemented as a shell script, which is mostly useless, because as an external utility it cannot affect the shell's state, so its attempts to change the directory are simply quietly ignored.
(Keith Thompson points out in a comment that you can use it as a test of whether a given directory can be changed to, because the script's exit code will reflect that.)
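The same check can be done without relying on an external /usr/bin/cd being installed, by running the built-in cd in a subshell. A minimal sketch; the function name can_cd is made up.

```sh
# Hypothetical helper: succeed if the directory can be entered, without
# changing the caller's working directory (the cd happens in a subshell).
can_cd() {
    (cd -- "$1") >/dev/null 2>&1
}

if can_cd /; then
    echo "can change to /"
fi
```

Like the external utility, this only reports whether the change would succeed; the calling shell stays where it is.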
Matt's answer provides history behind the inclusion of the script in FreeBSD and OSX (which mostly builds on FreeBSD), but it's worth taking a closer look at the rationale (emphasis mine):
From the POSIX spec:
However, all of the standard utilities, including the regular built-ins in the table, but not the special built-ins described in Special Built-In Utilities, shall be implemented in a manner so that they can be accessed via the exec family of functions as defined in the System Interfaces volume of POSIX.1-2008 and can be invoked directly by those standard utilities that require it (env, find, nice, nohup, time, xargs).
In essence, the above means: regular built-ins must (also) be callable stand-alone, as executables (whether as scripts or binaries), not just as built-ins from within the shell.
The cited regular built-ins table comprises these utilities:
alias bg cd command false fc fg getopts jobs kill newgrp pwd read true umask unalias wait
Note: special built-in utilities are by definition shell-internal only, and their behavior differs from regular built-in utilities.
As such, to be formally POSIX-compliant an OS must indeed provide cd as an external utility.
At the same time, the POSIX spec is aware that at least some of these regular built-ins, notably cd, only make sense as built-ins:
"Since cd affects the current shell execution environment, it is always provided as a shell regular built-in." - http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cd.html
Among the regular built-in utilities listed, some make sense both as a built-in and as an external utility:
For instance kill needs to be a built-in in order to kill jobs (which are a shell-internal concept), but it is also useful as an external utility, so as to kill processes by PID.
However, among the regular built-in utilities listed, the following never make sense as external utilities, as far as I can tell (do tell me if you disagree), even though POSIX mandates their presence:
alias bg cd command fc fg getopts jobs read umask unalias
Tip of the hat to Matt for helping to complete the list; he also points out that the hash built-in, even though it's not a POSIX utility, also has a pointless script implementation.
[1] As Dave Newton points out in a comment, it is possible to format HFS+, the macOS filesystem, in a case-sensitive manner (even though most people stick with the case-insensitive default). Based on the answer Dave links to, the following command will tell you whether your macOS filesystem is case-insensitive or not:
diskutil info / | grep -iq '^\s*Name.*case-sensitive*' && echo "case-SENSITIVE" || echo "case-INsensitive"
What is this?
The script itself is a portable way to convert a command name, even with random upper casing, into the equivalent shell builtin, based on the file name of the exec path (that is, the part of the string after the final / in the $0 variable). The script then runs the builtin command with the same arguments.
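To make the two steps concrete, here they are applied to a fixed string instead of $0. A sketch only; the function name to_builtin_name is invented.

```sh
# Hypothetical helper mirroring the script's name handling.
to_builtin_name() {
    base=${1##*/}    # drop everything up to and including the last /
    printf '%s\n' "$base" | tr '[:upper:]' '[:lower:]'   # fold to lower case
}

to_builtin_name /usr/bin/Cd    # prints: cd
to_builtin_name CD             # prints: cd
```

The real script then hands the resulting name to builtin, together with the original arguments.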
As OSX file systems are case insensitive by default, /usr/bin/cd converts running Cd, CD, cD and any form of cd with a / fs path (like /usr/bin/cd) back to the shell builtin command cd. This is largely useless in a script as cd only affects the current shell it is running in, which immediately closes when the script ends.
How does it relate to freeBSD?
A similar file exists in FreeBSD, which Apple adapted to do case conversion. Mac file systems by default are case insensitive (but case preserving).
The $FreeBSD: src/usr.bin/alias/generic.sh,v 1.2 2005/10/24 22:32:19 cperciva Exp $ header is the source information in the file.
Most of the underlying OSX system comes directly from FreeBSD or was based on it. The windowing system on top of this and the Cocoa app layer are where OSX becomes truly Apple. Some of the lower-level Apple bits have even made it back into FreeBSD, such as the Clang and LLVM compilers.
Why is it here?
The earlier FreeBSD svn commits shed a bit of light:
A little bit more thought has resulted in a generic script which can
implement any of the useless POSIX-required ``regular shell builtin''
utilities...
Although most builtins aren't very useful when run in a new shell via a script, this compliance script was used for the commands alias bg cd command fc fg getopts hash jobs read type ulimit umask unalias wait. POSIX compliance is fun!
As I recall, MacOS uses a case-insensitive file system by default. The command you saw as /usr/bin/Cd is actually /usr/bin/cd, but it can be referred to by either name.
You can see this by typing
ls /usr/bin/ | grep -i cd
Normally cd is a builtin command in the shell. As you know, it changes the current directory. An external cd command is nearly useless -- but it still exists.
It can be used to detect whether it's possible to change to a specified directory without actually affecting the working directory of your current process.
Your shell (probably bash) tends to assume case-sensitive command names. The builtin command can only be referred to as cd, but since it's able to open the script file named /usr/bin/Cd, it can find and execute it.

${BASH_SOURCE[0]} equivalent in zsh?

The title should say it all. I'm looking for an equivalent to ${BASH_SOURCE[0]} in zsh.
Note: I keep finding "$0 is equivalent to ${BASH_SOURCE[0]}" around the Internet, but this seems to be false: $0 seems to be the name of the executing command. (It's argv[0], which makes sense.) Echoing $0 in my script (.zshrc) gives zsh for $0, which isn't the same as what ${BASH_SOURCE[0]} is. In fact, ${BASH_SOURCE[0]} seems to work in zsh, except for inside .zshrc files.
What I'm really doing in my .zshrc (that isn't working):
echo ${BASH_SOURCE[0]}
source `dirname $0`/common-shell-rc.sh
The source fails ($0 is zsh) and the echo outputs a blank line.
Edit: apparently, for $0 to work, I need the FUNCTION_ARGZERO option set. Any way to test if this is set in a script? (so that I can temporarily set it) It is apparently on unless you set nofunction_argzero, and it is on in my shell. Still get nothing for $0. (I think b/c I'm not in a function.)
The ${BASH_SOURCE[0]} equivalent in zsh is ${(%):-%N}, NOT $0 (as the OP said, the latter failed in .zshrc).
Here % indicates prompt expansion on the value,
%N indicates "The name of the script, sourced file,
or shell function that zsh is currently executing,
whichever was started most recently. If there is none, this is equivalent to the parameter $0."(from man zshmisc)
${(%):-%x} is the closest zsh equivalent to bash's $BASH_SOURCE (and ksh's ${.sh.file}) - not $0.
Tip of the hat to Hui Zheng for providing the crucial pointer and background information in his answer.
It returns the (potentially relative) path of the enclosing script:
regardless of whether the script is being sourced or not; specifically, it also works inside initialization/profile files such as ~/.zshrc (unlike $0, which inexplicably returns the shell's path there);
regardless of whether it is called from inside a function defined in the script or not (unlike $0, which returns the function name inside a function).
The only difference to $BASH_SOURCE I've found is in the following obscure scenario - which may even be a bug (observed in zsh 5.0.5): inside a function nested inside another function in a sourced script, ${(%):-%x} does not return the enclosing script path when that nested function is called (again) later, after having been sourced (returns either nothing or 'zsh').
Background information on ${(%):-%x}:
(%):- in lieu of a variable name in a parameter (variable) expansion (${...}) makes escape sequences available that are normally used to represent environmental information in prompt strings, such as used in the PS1 variable to determine the string displayed as the primary interactive prompt.
% is an instance of a parameter expansion flag, all of which are listed in man zshexpn under the heading Parameter Expansion Flags.
%x is one of the escape sequences that can be used in prompt strings, and it functions as described above; there are many more, such as %d to represent the current dir.
man zshmisc lists all available sequences under the heading SIMPLE PROMPT ESCAPES.
If you want to make your script both bash- and zsh-compatible, you can use ${BASH_SOURCE[0]:-${(%):-%x}}. The resulting value will be taken from BASH_SOURCE[0] when it's defined, and from ${(%):-%x} when BASH_SOURCE[0] is not defined.
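If the file may also be read by a shell that is neither bash nor zsh, the shell-specific expansions can be hidden behind eval so that other parsers never see them. This is a sketch under that assumption; the variable name this_file is made up, and the $0 fallback is only a best effort.

```sh
# Hypothetical variant: pick the expansion that matches the running shell.
# eval keeps each shell-specific syntax away from the other shells' parsers.
if [ -n "${BASH_VERSION-}" ]; then
    eval 'this_file=${BASH_SOURCE[0]:-$0}'
elif [ -n "${ZSH_VERSION-}" ]; then
    eval 'this_file=${(%):-%x}'
else
    this_file=$0    # no better information available in a plain sh
fi

printf 'this file: %s\n' "$this_file"
```

This is longer than the one-liner, so it is only worth it when a plain sh really has to read the same file.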
$0 is correct. In a sourced script, this is the name of a script, as it was passed to the . or source built-in (so if the path_dirs option is set, you may need to do a $path lookup to find the actual location of the script).
.zshrc is not sourced, which explains why $0 is not set to .zshrc. You know the file name and location anyway: it's ${ZDOTDIR-~}/.zshrc.
If you are symlinking to .zshrc in a dotfiles directory and want to reference other files in the directory, then try this:
SOURCE=${(%):-%N}
while [ -h "$SOURCE" ]; do
  DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
  SOURCE="$(readlink "$SOURCE")"
  [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE"
done
DOTFILES_DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
(I got the loop script from here.)
Maybe you're looking for $_?
# foo.sh
source foo2.sh
and
# foo2.sh
echo $_
yields
# ./foo.sh
foo2.sh

Find file's own path

I'm trying to find what the unix equivalent of the Windows/DOS variable %cd% is. What I'm looking for is an environmental variable or a workaround that will let me set a variable to the path of the file currently running.
For example, if the program is in /home/chris/Desktop but the working directory is /home/chris, what would be the command to get ~/Desktop as opposed to pwd which will give me /home/chris.
In BASH, you can look at the $PWD variable. That'll show your Present Working Directory. Getting the relationship between the $PWD and where the program is located is a bit trickier. You can look at the $0 variable, which should give you the name of the file. I ran the following script:
#! /bin/bash
#
echo "PWD = $PWD"
echo "\$0 = $0"
And got the following result:
$ test.sh
PWD = /Users/david
$0 = /Users/david/bin/test.sh
The $0 gives you the name of the file from the root of the OS. Taking the dirname will give you the directory that contains the file. Somehow, if you can filter out the PWD from the $0, you might get what you're looking for. I had some luck with the following:
curPath=$(dirname "${0#$PWD/}")
I didn't test it thoroughly, but from what I can see, it seems to do what you want. What it can't do is something like this:
$ test.sh
PWD = /Users/david/someSubDir
$0 = /Users/david/bin/test.sh
The current path is /Users/david/bin/test.sh
It would have been nice if it could do this:
The current path is ../bin/test.sh
Although the former is correct.
The readlink command doesn't work on non-Linux systems.
How about dirname "$(readlink -f "$0")"
readlink -f $0 returns the canonicalized path to the running script.
dirname removes everything after and including the final /.
This way works, but isn't 100% reliable:
${0%/*}
The way that works is that it reads $0 (the program name), and strips off everything from the final slash onwards. It's not reliable because if your script is invoked via a symlink, you will get the directory containing the symlink, not the directory containing the real script.
Also, it's possible to pass in a "fake" value for $0, for example by using exec -a. So even if you aren't using symlinks, it's still not a 100% solution. In fact, such a solution doesn't exist.
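${0%/*} also misbehaves when the value contains no slash at all, because then nothing is stripped and you get the file name back. Here is a small sketch that covers those cases (it does nothing about the symlink and exec -a caveats); dir_of is a hypothetical name.

```sh
# Hypothetical helper: directory part of a path, using only parameter expansion.
dir_of() {
    case $1 in
        */*) d=${1%/*}                 # strip the final slash and what follows
             printf '%s\n' "${d:-/}"   # "/test.sh" strips to "", which means "/"
             ;;
        *)   printf '.\n' ;;           # no slash at all: the current directory
    esac
}

dir_of /Users/david/bin/test.sh   # prints: /Users/david/bin
dir_of test.sh                    # prints: .
dir_of /test.sh                   # prints: /
```

In a script you would call it as dir_of "$0".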
Working with what Chris suggested, you could use the which command. According to the man page, which reports the full path of the executable that would have been executed if its argument had been entered at the shell prompt. Since we know $0 was entered at the shell prompt, we can use `which $0` to report exactly the path that was used to execute. Unfortunately, this still suffers from the symlink issue, as which does not provide options to avoid symlinks.

In bash2, how do I find the name of a script being sourced?

Here's my situation:
I have multiple versions of a script in source code control where the name differs by a path name component (ex: scc/project/1.0/script and scc/project/1.1/script). In another directory, there is a symlink to one of these scripts. The symlink name is not related to the script name, and in some cases may change periodically. That symlink, in turn, is sourced by bash using the '.' command.
What I need to know: how do I determine the directory of the referenced script, on a 10 year-old system with Bash 2 and Perl 5.5? For various reasons, the system must be used, and it cannot be upgraded.
In Bash 3 or above, I use this:
dir=`perl -MCwd=realpath -MFile::Basename -e 'print dirname(realpath($ARGV[0]))' ${BASH_SOURCE[0]} $0`
Apologies for the Perl one-liner - this was originally a pure-Perl project with a very small amount of shell script glue.
I've been able to work around the fact that the ancient Perl I am using doesn't export "realpath" from Cwd, but unfortunately, Bash 2.03.01 doesn't provide BASH_SOURCE, or anything like it that I've been able to find. As a workaround, I'm providing the path information in a text file that I change manually when I switch branches, but I'd really like to make this figure out which branch I'm using on its own.
Update:
I apologize - apparently, the question as asked is not clear. I don't know in every case what the name of the symlink will be - that's what I'm trying to find out at run time. The script is occasionally executed via the symlink directly, but most often the link is the argument to a "." command running in another script.
Also, $0 is not set appropriately when the script is sourced via ".", which is the entire problem I'm trying to solve. I apologize for bluntness, but no solution that depends entirely upon $0 being set is correct. In the Perl one-liner, I use both BASH_SOURCE and $0 (BASH_SOURCE is only set when the script is sourced via ".", so the one-liner only uses $0 when it's not sourced).
Try using $0 instead of ${BASH_SOURCE[0]}. (No promises; I don't have a bash 2 around.)
$0 has the name of the program/script you are executing.
Is stat ok? something like
stat -c %N $file
bash's cd and pwd builtins have a -P option to resolve symlinks, so:
dir=$(cd -P -- "$(dirname -- "$0")" && pwd -P)
works with bash 2.03
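Note that cd -P resolves symlinks in the directory part only; if the file itself is a symlink, as in the question, the link has to be followed first. Here is a sketch of that extra step, assuming a readlink utility is available (which may well not be the case on a system that old); resolve_dir is a made-up name.

```sh
# Hypothetical helper: print the physical directory of the file that a
# (possibly chained) symlink finally points to.
resolve_dir() {
    target=$1
    while [ -h "$target" ]; do
        # Physical directory that holds the link itself.
        link_dir=$(cd -P -- "$(dirname -- "$target")" && pwd -P)
        target=$(readlink -- "$target")
        case $target in
            /*) ;;                              # absolute link target: use as is
            *)  target=$link_dir/$target ;;     # relative: resolve against the link's dir
        esac
    done
    (cd -P -- "$(dirname -- "$target")" && pwd -P)
}

resolve_dir "$0"
```

As the question points out, $0 is not the sourced file when the script is read with ".", so the argument has to come from somewhere else in that case.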
I managed to get information about the process sourcing my script using this command:
ps -ef | grep $$
This is not perfect, but it tells you which process is invoking your script. It might be possible, with some formatting, to determine the exact source.

Resources