Bash Script variable assigned "-r" when using rm -r $VAR - bash

Heres the code:
function rm {
cd ~/
if [[ -d ./Jam/projects/"$1" ]]; then
echo Removing $1 from projects...
rm -r ./Jam/projects/"$1"
elif [[ -d ./Jam/archive/"$1" ]]; then
echo Removing $1 from archives...
rm -r ./Jam/archive/"$1"
else
echo $1 does not exist \in ./Jam/projects/ or ./Jam/archive
exit
fi
echo Finished\!
}
When this is ran, $1 is "Hello World" (a Directory in i./Jam/archive/)
I get this output:
Removing HelloWorld from archives...
-r does not exist in ./Jam/projects/ or ./Jam/archive
Somehow, $1 is assigned to "-r".
I don't know how on earth this would happen. Any help is much appreciated.

Your function is called "rm" and inside your function "rm" you call rm -r thinking it's "normal rm", but it isn't - it's your function, which perfectly demonstrates the danger of calling your function a name that already has a well known meaning.

Related

Eval error Argument list too long when expandind a command

I got some code from here that works pretty well until I get "Argument list too long"
I am NOT a developer and pretty old too :) so if it is not much to ask please explain.
Is there a way the expand DIRCMD like eval does and pass each of the commands one at the time so eval does not break?
for (( ifl=0;ifl<$((NUMFIRSTLEVELDIRS));ifl++ )) { FLDIR="$(get_rand_dirname)"
FLCHILDREN="";
for (( ird=0;ird<$((DIRDEPTH-1));ird++ )) {
DIRCHILDREN=""; MOREDC=0;
for ((idc=0; idc<$((MINDIRCHILDREN+RANDOM%MAXDIRCHILDREN)); idc++)) {
CDIR="$(get_rand_dirname)" ;
# make sure comma is last, so brace expansion works even for 1 element? that can mess with expansion math, though
if [ "$DIRCHILDREN" == "" ]; then
DIRCHILDREN="\"$CDIR\"" ;
else
DIRCHILDREN="$DIRCHILDREN,\"$CDIR\"" ;
MOREDC=1 ;
fi
}
if [ "$MOREDC" == "1" ] ; then
if [ "$FLCHILDREN" == "" ]; then
FLCHILDREN="{$DIRCHILDREN}" ;
else
FLCHILDREN="$FLCHILDREN/{$DIRCHILDREN}" ;
fi
else
if [ "$FLCHILDREN" == "" ]; then
FLCHILDREN="$DIRCHILDREN" ;
else
FLCHILDREN="$FLCHILDREN/$DIRCHILDREN" ;
fi
fi
}
cd $OUTDIR
DIRCMD="mkdir -p $OUTDIR/\"$FLDIR\"/$FLCHILDREN"
eval "$DIRCMD"
echo "$DIRCMD"
}
I tried echo $DIRCMD but do not get the expanded list of commands
'echo mkdir -p /mnt/nvme-test/rndpath/"r8oF"/{"rc","XXKR","p0H"}/{"5Dw0K","oT","rV","coU","uo"}/{"3m5m","uEdA","w4SJp","49"}'
I had trouble following the code, but if I understood it correctly, you dynamically generate a mkdir -p command with a brace expansion:
'mkdir -p /mnt/nvme-test/rndpath/"r8oF"/{"rc","XXKR","p0H"}/{"5Dw0K","oT","rV","coU","uo"}/{"3m5m","uEdA","w4SJp","49"}'
Which then fails when you eval it due to your OS' maximum argument limit.
To get around that, you can instead generate a printf .. command since this is a Bash builtin and not subject to the argument limit, and feed its output to xargs:
dircmd='printf "%s\0" /mnt/nvme-test/rndpath/"r8oF"/{"rc","XXKR","p0H"}/{"5Dw0K","oT","rV","coU","uo"}/{"3m5m","uEdA","w4SJp","49"}'
eval "$dircmd" | xargs -0 mkdir -p
If your xargs doesn't support -0, you can instead use printf "%s\n" and xargs mkdir -p, though it won't behave as well if your generated names contain spaces and such.
If this is for benchmarking, you may additionally be interested to know that you can now use xargs -0 -n 1000 -P 8 mkdir -p to run 8 mkdirs in parallel, each creating 1000 dirs at a time.

bash: script to identify specific alias causing a bug

[Arch Linux v5.0.7 with GNU bash 5.0.3]
Some .bashrc aliases seem to conflict with a bash shell-scripts provided by pyenv and pyenv-virtualenvwrapper.I tracked down the problem running the script, using set -x and with all aliases enabled, and saw finally that the script exits gracefully with exit code is 0 only when aliases are disabled with unalias -a. So this has to do with aliases... but which one ?
To try to automate that, I wrote the shell-script below:
It un-aliases one alias at a time, reading iteratively from the complete list of aliases,
It tests the conflicting shell script test.sh against that leave-one-out alias configuration, and prints something in case an error is detected,
It undoes the previous un-aliasing,
It goes on to un-aliasing the next alias.
But the two built-ins alias and unalias do not fare well in the script cac.sh below:
#! /usr/bin/bash
[ -e aliases.txt ] && rm -f aliases.txt
alias | sed 's/alias //' | cut -d "=" -f1 > aliases.txt
printf "File aliases.txt created with %d lines.\n" \
"$(wc -l < <(\cat aliases.txt))"
IFS=" "
n=0
while read -r line || [ -n "$line" ]; do
n=$((n+1))
aliasedAs=$( alias "$line" | sed 's/alias //' )
printf "Line %2d: %s\n" "$n" "$aliasedAs"
unalias "$line"
[ -z $(eval "$*" 1> /dev/null) ] \ # check output to stderr only
&& printf "********** Look up: %s\n" "$line"
eval "${aliasedAs}"
done < <(tail aliases.txt) # use tail + proc substitution for testing only
Use the script like so: $ cac.sh test.sh [optional arguments to test.sh] Any test.sh will do. It just needs to return some non-empty string to stderr.
The first anomaly is that the file aliases.txt is empty as if the alias builtin was not accessible from within the script. If I start the script from its 3rd line, using an already populated aliases.txt file, the script fails at the second line within the while block, again as if alias could not be called from within the script. Any suggestions appreciated.
Note: The one liner below works in console:
$ n=0;while read -r line || [ -n "$line" ]; do n=$((n+1)); printf "alias %d : %s\n" "$n" "$(alias "$line" | sed 's/alias //')"; done < aliases.txt
I would generally advise against implementing this as an external script at all -- it makes much more sense as a function that can be evaluated directly in your interactive shell (which is, after all, where all the potentially-involved aliases are defined).
print_result() {
local prior_retval=$? label=$1
if (( prior_retval == 0 )); then
printf '%-30s - %s\n' "$label" WORKS >&2
else
printf '%-30s - %s\n' "$label" BROKEN >&2
fi
}
test_without_each_alias() {
[ "$#" = 1 ] || { echo "Usage: test_without_each_alias 'code here'" >&2; return 1; }
local alias
(eval "$1"); print_result "Unchanged aliases"
for alias in "${!BASH_ALIASES[#]}"; do
(unalias "$alias" && eval "$1"); print_result "Without $alias"
done
}
Consider the following:
rm_in_home_only() { [[ $1 = /home/* ]] || return 1; rm -- "$#"; }
alias rm=rm_in_home_only # alias actually causing our bug
alias red_herring=true # another alias that's harmless
test_without_each_alias 'touch /tmp/foobar; rm /tmp/foobar; [[ ! -e /tmp/foobar ]]'
...which emits something like:
Unchanged aliases - BROKEN
Without rm - WORKS
Without red_herring - BROKEN
Note that if the code you pass executes a function, you'll want to be sure that the function is defined inside the eval'd code; since aliases are parser behavior, they take place when functions are defined, not when functions are run.
#Kamil_Cuk, #Benjamin_W and #cdarke all pointed to the fact that a noninteractive shell (as that spawned from a bash script) does not have access to aliases.
#CharlesDuffy pointed to probable word splitting and glob expansion resulting in something that could be invalid test syntax in the original [ -z $(eval "$*" 1> /dev/null) ] block above, or worse yet in the possibility of $(eval "$*" 1> /dev/null) being parsed as a glob resulting in unpredictable script behavior. Block corrected to: [ -z "$(eval "$*" 1> /dev/null)" ].
Making the shell spawned by cac.sh interactive, with #! /usr/bin/bash -i. make the two built-ins alias and unalias returned non-null result when invoked, and BASH_ALIASES[#] became accessible from within the script.
#! /usr/bin/bash -i
[ -e aliases.txt ] && rm -f aliases.txt
alias | sed 's/alias //' | cut -d "=" -f1 > aliases.txt
printf "File aliases.txt created with %d lines.\n" \
"$(wc -l < <(\cat aliases.txt))"
IFS=" "
while read -r line || [ -n "$line" ]; do
aliasedAs=$( alias "$line" | sed 's/alias //' )
unalias "$line"
[ -z "$(eval "$*" 2>&1 1>/dev/null)" ] \ # check output to stderr only
&& printf "********** Look up: %s\n" "$line"
eval "${aliasedAs}"
done < aliases.txt
Warning: testing test.sh resorts to the eval built-in. Arbitrary code can be executed on your system if test.sh and optional arguments do not come from a trusted source.

Syntax issues in IF construct : shell script

I have only one condition to check. And i would like to use the if-else construct in shell script. How do i write it. Most of the docs show that the construct is doing multiple aspects thats why they use
if []; then
echo "do something"
elif
echo "somethingelse"
else
echo "something 2"
fi
But in my case i am writing my construct like the below. It gives syntax issues.
#!/bin/sh
hadoopFileList=`hadoop fs -ls /app/SmartAnalytics/Apps/service_request_transformed.db/sr_denorm_bug_details/ | sed '1d;s/ */ /g' | cut -d\ -f8`
destination="/apps/phodisvc/creandoStaging"
scpData()
{
for file in $hadoopFileList
do
echo $file
echo "copying file to the staging directory"
hadoop fs -copyToLocal $file $destination
sleep 2s
echo "deleting the file"
echo ${file##*/}
done
}
if [[ -d "$destination" ]]; then
#file exists copy data
scpData()
else
#else create directory and copy data
mkdir -p $destination
scpData()
fi
You should not call your function with (), only definition needes them.
wrong:
myFunction() {
echo "This line is from my Function"
}
myFunction()
Right way:
myFunction() {
echo "This line is from my Function"
}
myFunction
As PS. mentioned, a bash function like scpData should be called without ().
Other than that, your if[];then ... else ... fi syntax is correct.
Thus your if-loop will work as:
if [ -d "$destination" ]; then
scpData
else
mkdir -p $destination
scpData
fi
In this particular case, however, the if-loop is not necessary,
because mkdir's -p option does check the condition.
i.e., it only makes directory if it doesn't exist.
So I re-wrote your if-loop (without using if-else-fi) as following:
mkdir -p $destination
scpData
which does the same thing!

Curl not downloading files correctly

So I have been struggling with this task for eternity and still don't get what went wrong. This program doesn't seem to download ANY pdfs. At the same time I checked the file that stores final links - everything stored correctly. The $PDFURL also checked, stores correct values. Any bash fans ready to help?
#!/bin/sh
#create a temporary directory where all the work will be conducted
TMPDIR=`mktemp -d /tmp/chiheisen.XXXXXXXXXX`
echo $TMPDIR
#no arguments given - error
if [ "$#" == "0" ]; then
exit 1
fi
# argument given, but wrong format
URL="$1"
#URL regex
URL_REG='(https?|ftp|file)://[-A-Za-z0-9\+&##/%?=~_|!:,.;]*[-A-Za-z0-9\+&##/%=~_|]'
if [[ ! $URL =~ $URL_REG ]]; then
exit 1
fi
# go to directory created
cd $TMPDIR
#download the html page
curl -s "$1" > htmlfile.html
#grep only links into temp.txt
cat htmlfile.html | grep -o -E 'href="([^"#]+)\.pdf"' | cut -d'"' -f2 > temp.txt
# iterate through lines in the file and try to download
# the pdf files that are there
cat temp.txt | while read PDFURL; do
#if this is an absolute URL, download the file directly
if [[ $PDFURL == *http* ]]
then
curl -s -f -O $PDFURL
err="$?"
if [ "$err" -ne 0 ]
then
echo ERROR "$(basename $PDFURL)">&2
else
echo "$(basename $PDFURL)"
fi
else
#update url - it is always relative to the first parameter in script
PDFURLU="$1""/""$(basename $PDFURL)"
curl -s -f -O $PDFURLU
err="$?"
if [ "$err" -ne 0 ]
then
echo ERROR "$(basename $PDFURLU)">&2
else
echo "$(basename $PDFURLU)"
fi
fi
done
#delete the files
rm htmlfile.html
rm temp.txt
P.S. Another minor problem I have just spotted. Maybe the problem is with the if in regex? I pretty much would like to see something like that there:
if [[ $PDFURL =~ (https?|ftp|file):// ]]
but this doesn't work. I don't have unwanted parentheses there, so why?
P.P.S. I also ran this script on URLs beginning with http, and the program gave the desired output. However, it still doesn't pass the test.

Bash script absolute path with OS X

I am trying to obtain the absolute path to the currently running script on OS X.
I saw many replies going for readlink -f $0. However since OS X's readlink is the same as BSD's, it just doesn't work (it works with GNU's version).
Is there an out-of-the-box solution to this?
These three simple steps are going to solve this and many other OS X issues:
Install Homebrew
brew install coreutils
grealpath .
(3) may be changed to just realpath, see (2) output
There's a realpath() C function that'll do the job, but I'm not seeing anything available on the command-line. Here's a quick and dirty replacement:
#!/bin/bash
realpath() {
[[ $1 = /* ]] && echo "$1" || echo "$PWD/${1#./}"
}
realpath "$0"
This prints the path verbatim if it begins with a /. If not it must be a relative path, so it prepends $PWD to the front. The #./ part strips off ./ from the front of $1.
I found the answer a bit wanting for a few reasons:
in particular, they don't resolve multiple levels of symbolic links, and they are extremely "Bash-y".
While the original question does explicitly ask for a "Bash script", it also makes mention of Mac OS X's BSD-like, non-GNU readlink.
So here's an attempt at some reasonable portability (I've checked it with bash as 'sh' and dash), resolving an arbitrary number of symbolic links; and it should also work with whitespace in the path(s).
This answer was previously edited, re-adding the local bashism. The point of this answer is a portable, POSIX solution. I have edited it to address variable scoping by changing it to a subshell function, rather than an inline one. Please do not edit.
#!/bin/sh
realpath() (
OURPWD=$PWD
cd "$(dirname "$1")"
LINK=$(readlink "$(basename "$1")")
while [ "$LINK" ]; do
cd "$(dirname "$LINK")"
LINK=$(readlink "$(basename "$1")")
done
REALPATH="$PWD/$(basename "$1")"
cd "$OURPWD"
echo "$REALPATH"
)
realpath "$#"
Hope that can be of some use to someone.
A more command-line-friendly variant of the Python solution:
python -c 'import os, sys; print(os.path.realpath(sys.argv[1]))' ./my/path
Since there is a realpath as others have pointed out:
// realpath.c
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char* argv[])
{
if (argc > 1) {
for (int argIter = 1; argIter < argc; ++argIter) {
char *resolved_path_buffer = NULL;
char *result = realpath(argv[argIter], resolved_path_buffer);
puts(result);
if (result != NULL) {
free(result);
}
}
}
return 0;
}
Makefile:
#Makefile
OBJ = realpath.o
%.o: %.c
$(CC) -c -o $# $< $(CFLAGS)
realpath: $(OBJ)
gcc -o $# $^ $(CFLAGS)
Then compile with make and put in a soft link with:
ln -s $(pwd)/realpath /usr/local/bin/realpath
abs_path () {
echo "$(cd $(dirname "$1");pwd)/$(basename "$1")"
}
dirname will give the directory name of /path/to/file, i.e. /path/to.
cd /path/to; pwd ensures that the path is absolute.
basename will give just the filename in /path/to/file, i.e.file.
I checked every answered, but missed the best one (IMHO) by Jason S Jul 14 '16 at 3:12, left the comment field.
So here it is, in case someone like me having the tendency to check answered and don't have time to go through every single comments:
$( cd "$(dirname "$0")" ; pwd -P )
Help:
NAME
pwd -- return working directory name
SYNOPSIS
pwd [-L | -P]
DESCRIPTION
The pwd utility writes the absolute pathname of the current working
directory to the standard output.
Some shells may provide a builtin pwd command which is similar or identi-
cal to this utility. Consult the builtin(1) manual page.
The options are as follows:
-L Display the logical current working directory.
-P Display the physical current working directory (all symbolic
links resolved).
I was looking for a solution for use in a system provision script, i.e., run before Homebrew is even installed. Lacking a proper solution I'd just offload the task to a cross-platform language, e.g., Perl:
script_abspath=$(perl -e 'use Cwd "abs_path"; print abs_path(#ARGV[0])' -- "$0")
More often what we actually want is the containing directory:
here=$(perl -e 'use File::Basename; use Cwd "abs_path"; print dirname(abs_path(#ARGV[0]));' -- "$0")
Use Python to get it:
#!/usr/bin/env python
import os
import sys
print(os.path.realpath(sys.argv[1]))
realpath for Mac OS X
realpath() {
path=`eval echo "$1"`
folder=$(dirname "$path")
echo $(cd "$folder"; pwd)/$(basename "$path");
}
Example with related path:
realpath "../scripts/test.sh"
Example with home folder
realpath "~/Test/../Test/scripts/test.sh"
So as you can see above, I took a shot at this about 6 months ago. I totally
forgot about it until I found myself in need of a similar thing again. I was
completely shocked to see just how rudimentary it was; I've been teaching
myself to code pretty intensively for about a year now, but I often feel like
maybe I haven't learned anything at all when things are at their worst.
I would remove the 'solution' above, but I really like it sort of being a record of
of how much I really have learnt over the past few months.
But I digress. I sat down and worked it all out last night. The explanation in
the comments should be sufficient. If you want to track the copy I'm continuing
to work on, you can follow this gist. This probably does what you need.
#!/bin/sh # dash bash ksh # !zsh (issues). G. Nixon, 12/2013. Public domain.
## 'linkread' or 'fullpath' or (you choose) is a little tool to recursively
## dereference symbolic links (ala 'readlink') until the originating file
## is found. This is effectively the same function provided in stdlib.h as
## 'realpath' and on the command line in GNU 'readlink -f'.
## Neither of these tools, however, are particularly accessible on the many
## systems that do not have the GNU implementation of readlink, nor ship
## with a system compiler (not to mention the requisite knowledge of C).
## This script is written with portability and (to the extent possible, speed)
## in mind, hence the use of printf for echo and case statements where they
## can be substituded for test, though I've had to scale back a bit on that.
## It is (to the best of my knowledge) written in standard POSIX shell, and
## has been tested with bash-as-bin-sh, dash, and ksh93. zsh seems to have
## issues with it, though I'm not sure why; so probably best to avoid for now.
## Particularly useful (in fact, the reason I wrote this) is the fact that
## it can be used within a shell script to find the path of the script itself.
## (I am sure the shell knows this already; but most likely for the sake of
## security it is not made readily available. The implementation of "$0"
## specificies that the $0 must be the location of **last** symbolic link in
## a chain, or wherever it resides in the path.) This can be used for some
## ...interesting things, like self-duplicating and self-modifiying scripts.
## Currently supported are three errors: whether the file specified exists
## (ala ENOENT), whether its target exists/is accessible; and the special
## case of when a sybolic link references itself "foo -> foo": a common error
## for beginners, since 'ln' does not produce an error if the order of link
## and target are reversed on the command line. (See POSIX signal ELOOP.)
## It would probably be rather simple to write to use this as a basis for
## a pure shell implementation of the 'symlinks' util included with Linux.
## As an aside, the amount of code below **completely** belies the amount
## effort it took to get this right -- but I guess that's coding for you.
##===-------------------------------------------------------------------===##
for argv; do :; done # Last parameter on command line, for options parsing.
## Error messages. Use functions so that we can sub in when the error occurs.
recurses(){ printf "Self-referential:\n\t$argv ->\n\t$argv\n" ;}
dangling(){ printf "Broken symlink:\n\t$argv ->\n\t"$(readlink "$argv")"\n" ;}
errnoent(){ printf "No such file: "$#"\n" ;} # Borrow a horrible signal name.
# Probably best not to install as 'pathfull', if you can avoid it.
pathfull(){ cd "$(dirname "$#")"; link="$(readlink "$(basename "$#")")"
## 'test and 'ls' report different status for bad symlinks, so we use this.
if [ ! -e "$#" ]; then if $(ls -d "$#" 2>/dev/null) 2>/dev/null; then
errnoent 1>&2; exit 1; elif [ ! -e "$#" -a "$link" = "$#" ]; then
recurses 1>&2; exit 1; elif [ ! -e "$#" ] && [ ! -z "$link" ]; then
dangling 1>&2; exit 1; fi
fi
## Not a link, but there might be one in the path, so 'cd' and 'pwd'.
if [ -z "$link" ]; then if [ "$(dirname "$#" | cut -c1)" = '/' ]; then
printf "$#\n"; exit 0; else printf "$(pwd)/$(basename "$#")\n"; fi; exit 0
fi
## Walk the symlinks back to the origin. Calls itself recursivly as needed.
while [ "$link" ]; do
cd "$(dirname "$link")"; newlink="$(readlink "$(basename "$link")")"
case "$newlink" in
"$link") dangling 1>&2 && exit 1 ;;
'') printf "$(pwd)/$(basename "$link")\n"; exit 0 ;;
*) link="$newlink" && pathfull "$link" ;;
esac
done
printf "$(pwd)/$(basename "$newlink")\n"
}
## Demo. Install somewhere deep in the filesystem, then symlink somewhere
## else, symlink again (maybe with a different name) elsewhere, and link
## back into the directory you started in (or something.) The absolute path
## of the script will always be reported in the usage, along with "$0".
if [ -z "$argv" ]; then scriptname="$(pathfull "$0")"
# Yay ANSI l33t codes! Fancy.
printf "\n\033[3mfrom/as: \033[4m$0\033[0m\n\n\033[1mUSAGE:\033[0m "
printf "\033[4m$scriptname\033[24m [ link | file | dir ]\n\n "
printf "Recursive readlink for the authoritative file, symlink after "
printf "symlink.\n\n\n \033[4m$scriptname\033[24m\n\n "
printf " From within an invocation of a script, locate the script's "
printf "own file\n (no matter where it has been linked or "
printf "from where it is being called).\n\n"
else pathfull "$#"
fi
On macOS, the only solution that I've found to this that reliably handles symlinks is by using realpath. Since this requires brew install coreutils, I just automated that step. My implementation looks like this:
#!/usr/bin/env bash
set -e
if ! which realpath >&/dev/null; then
if ! which brew >&/dev/null; then
msg="ERROR: This script requires brew. See https://brew.sh for installation instructions."
echo "$(tput setaf 1)$msg$(tput sgr0)" >&2
exit 1
fi
echo "Installing coreutils/realpath"
brew install coreutils >&/dev/null
fi
thisDir=$( dirname "`realpath "$0"`" )
echo "This script is run from \"$thisDir\""
This errors if they don't have brew installed, but you could alternatively just install that too. I just didn't feel comfortable automating something that curls arbitrary ruby code from the net.
Note that this an automated variation on Oleg Mikheev's answer.
One important test
One good test of any of these solutions is:
put the code in a script file somewhere
in another directory, symlink (ln -s) to that file
run the script from that symlink
Does the solution dereference the symlink, and give you the original directory? If so, it works.
This seems to work for OSX, doesnt require any binaries, and was pulled from here
function normpath() {
# Remove all /./ sequences.
local path=${1//\/.\//\/}
# Remove dir/.. sequences.
while [[ $path =~ ([^/][^/]*/\.\./) ]]; do
path=${path/${BASH_REMATCH[0]}/}
done
echo $path
}
I like this:
#!/usr/bin/env bash
function realpath() {
local _X="$PWD"
local _LNK=$1
cd "$(dirname "$_LNK")"
if [ -h "$_LNK" ]; then
_LNK="$(readlink "$_LNK")"
cd "$(dirname "$_LNK")"
fi
echo "$PWD/$(basename "$_LNK")"
cd "$_X"
}
I needed a realpath replacement on OS X, one that operates correctly on paths with symlinks and parent references just like readlink -f would. This includes resolving symlinks in the path before resolving parent references; e.g. if you have installed the homebrew coreutils bottle, then run:
$ ln -s /var/log/cups /tmp/linkeddir # symlink to another directory
$ greadlink -f /tmp/linkeddir/.. # canonical path of the link parent
/private/var/log
Note that readlink -f has resolved /tmp/linkeddir before resolving the .. parent dir reference. Of course, there is no readlink -f on Mac either.
So as part of the a bash implementation for realpath I re-implemented what a GNUlib canonicalize_filename_mode(path, CAN_ALL_BUT_LAST) function call does, in Bash 3.2; this is also the function call that GNU readlink -f makes:
# shellcheck shell=bash
set -euo pipefail
_contains() {
# return true if first argument is present in the other arguments
local elem value
value="$1"
shift
for elem in "$#"; do
if [[ $elem == "$value" ]]; then
return 0
fi
done
return 1
}
_canonicalize_filename_mode() {
# resolve any symlink targets, GNU readlink -f style
# where every path component except the last should exist and is
# resolved if it is a symlink. This is essentially a re-implementation
# of canonicalize_filename_mode(path, CAN_ALL_BUT_LAST).
# takes the path to canonicalize as first argument
local path result component seen
seen=()
path="$1"
result="/"
if [[ $path != /* ]]; then # add in current working dir if relative
result="$PWD"
fi
while [[ -n $path ]]; do
component="${path%%/*}"
case "$component" in
'') # empty because it started with /
path="${path:1}" ;;
.) # ./ current directory, do nothing
path="${path:1}" ;;
..) # ../ parent directory
if [[ $result != "/" ]]; then # not at the root?
result="${result%/*}" # then remove one element from the path
fi
path="${path:2}" ;;
*)
# add this component to the result, remove from path
if [[ $result != */ ]]; then
result="$result/"
fi
result="$result$component"
path="${path:${#component}}"
# element must exist, unless this is the final component
if [[ $path =~ [^/] && ! -e $result ]]; then
echo "$1: No such file or directory" >&2
return 1
fi
# if the result is a link, prefix it to the path, to continue resolving
if [[ -L $result ]]; then
if _contains "$result" "${seen[#]+"${seen[#]}"}"; then
# we've seen this link before, abort
echo "$1: Too many levels of symbolic links" >&2
return 1
fi
seen+=("$result")
path="$(readlink "$result")$path"
if [[ $path = /* ]]; then
# if the link is absolute, restart the result from /
result="/"
elif [[ $result != "/" ]]; then
# otherwise remove the basename of the link from the result
result="${result%/*}"
fi
elif [[ $path =~ [^/] && ! -d $result ]]; then
# otherwise all but the last element must be a dir
echo "$1: Not a directory" >&2
return 1
fi
;;
esac
done
echo "$result"
}
It includes circular symlink detection, exiting if the same (intermediary) path is seen twice.
If all you need is readlink -f, then you can use the above as:
readlink() {
if [[ $1 != -f ]]; then # poor-man's option parsing
# delegate to the standard readlink command
command readlink "$#"
return
fi
local path result seenerr
shift
seenerr=
for path in "$#"; do
# by default readlink suppresses error messages
if ! result=$(_canonicalize_filename_mode "$path" 2>/dev/null); then
seenerr=1
continue
fi
echo "$result"
done
if [[ $seenerr ]]; then
return 1;
fi
}
For realpath, I also needed --relative-to and --relative-base support, which give you relative paths after canonicalizing:
_realpath() {
# GNU realpath replacement for bash 3.2 (OS X)
# accepts --relative-to= and --relative-base options
# and produces canonical (relative or absolute) paths for each
# argument on stdout, errors on stderr, and returns 0 on success
# and 1 if at least 1 path triggered an error.
local relative_to relative_base seenerr path
relative_to=
relative_base=
seenerr=
while [[ $# -gt 0 ]]; do
case $1 in
"--relative-to="*)
relative_to=$(_canonicalize_filename_mode "${1#*=}")
shift 1;;
"--relative-base="*)
relative_base=$(_canonicalize_filename_mode "${1#*=}")
shift 1;;
*)
break;;
esac
done
if [[
-n $relative_to
&& -n $relative_base
&& ${relative_to#${relative_base}/} == "$relative_to"
]]; then
# relative_to is not a subdir of relative_base -> ignore both
relative_to=
relative_base=
elif [[ -z $relative_to && -n $relative_base ]]; then
# if relative_to has not been set but relative_base has, then
# set relative_to from relative_base, simplifies logic later on
relative_to="$relative_base"
fi
for path in "$#"; do
if ! real=$(_canonicalize_filename_mode "$path"); then
seenerr=1
continue
fi
# make path relative if so required
if [[
-n $relative_to
&& ( # path must not be outside relative_base to be made relative
-z $relative_base || ${real#${relative_base}/} != "$real"
)
]]; then
local common_part parentrefs
common_part="$relative_to"
parentrefs=
while [[ ${real#${common_part}/} == "$real" ]]; do
common_part="$(dirname "$common_part")"
parentrefs="..${parentrefs:+/$parentrefs}"
done
if [[ $common_part != "/" ]]; then
real="${parentrefs:+${parentrefs}/}${real#${common_part}/}"
fi
fi
echo "$real"
done
if [[ $seenerr ]]; then
return 1
fi
}
if ! command -v realpath > /dev/null 2>&1; then
# realpath is not available on OSX unless you install the `coreutils` brew
realpath() { _realpath "$#"; }
fi
I included unit tests in my Code Review request for this code.
For those nodejs developers in a mac using bash:
realpath() {
node -p "fs.realpathSync('$1')"
}
Just an idea of using pushd:
realpath() {
eval echo "$(pushd $(dirname "$1") | cut -d' ' -f1)/$(basename "$1")"
}
The eval is used to expand tilde such as ~/Downloads.
Based on the communication with commenter, I agreed that it is very hard and has no trival way to implement a realpath behaves totally same as Ubuntu.
But the following version, can handle corner cases best answer can't and satisfy my daily needs on macbook. Put this code into your ~/.bashrc and remember:
arg can only be 1 file or dir, no wildcard
no spaces in the dir or file name
at least the file or dir's parent dir exists
feel free to use . .. / thing, these are safe
# 1. if is a dir, try cd and pwd
# 2. if is a file, try cd its parent and concat dir+file
realpath() {
[ "$1" = "" ] && return 1
dir=`dirname "$1"`
file=`basename "$1"`
last=`pwd`
[ -d "$dir" ] && cd $dir || return 1
if [ -d "$file" ];
then
# case 1
cd $file && pwd || return 1
else
# case 2
echo `pwd`/$file | sed 's/\/\//\//g'
fi
cd $last
}

Resources