Bash what bash alias actually is? [duplicate] - bash

I'm surprised this hasn't been asked before, but…
What is the difference between
alias ⇢ alias EXPORT='alias'
function ⇢ function exporter() { echo $EXPORT; }
and
export ⇢ export ALIAS='export'
and for that matter...
alias export=$(function) (j/k)
in bash (zsh, et al.)
Specifically, I'd be most interested in knowing the lexical/practical difference between
alias this=that
and
export that=this
I have both forms... all over the place - and would prefer to stop arbitrarily choosing one, over the other. 😂
I'm sure there is a great reference to a "scopes and use-cases for unix shells", somewhere... but thought I'd post the question here, in the name of righteous-canonicalicism.

You're asking about two very different categories of things: aliases and functions define things that act like commands; export marks a variable to be exported to child processes. Let me go through the command-like things first:
An alias (alias ll='ls -l') defines a shorthand for a command. They're intended for interactive use (they're actually disabled by default in shell scripts), and are simple but inflexible. For example, any arguments you specify after the alias simply get tacked onto the end of the command; if you wanted something like alias findservice='grep "$1" /etc/services', you can't do it, because $1 doesn't do anything useful here.
A function is like a more flexible, more powerful version of an alias. Functions can take & process arguments, contain loops, conditionals, here-documents, etc... Basically, anything you could do with a shell script can be done in a function. Note that the standard way to define a function doesn't actually use the keyword function, just parentheses after the name. For example: findservice() { grep "$1" /etc/services; }
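As a rough, self-contained illustration of the point about arguments (a sketch, not part of the original answer): the function below reads its first argument through $1, which an alias cannot do. The temporary file and the services_file variable are hypothetical stand-ins for /etc/services, so the snippet runs anywhere.

```sh
# Hypothetical sample data standing in for /etc/services.
services_file=$(mktemp)
printf '%s\n' 'ssh 22/tcp' 'http 80/tcp' 'https 443/tcp' > "$services_file"

# A function receives its arguments as $1, $2, ... just like a script does.
findservice() {
    grep -- "$1" "$services_file"
}

result=$(findservice ssh)
echo "$result"

rm -f "$services_file"
```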
Ok, now on to shell variables. Before I get to export, I need to talk about unexported variables. Basically, you can define a variable to have some (text) value, and then if you refer to the variable by $variablename it'll be substituted into the command. This differs from an alias or function in two ways: an alias or function can only occur as the first word in the command (e.g. ll filename will use the alias ll, but echo ll will not), and variables must be explicitly invoked with $ (echo $foo will use the variable foo, but echo foo will not). More fundamentally, aliases and functions are intended to contain executable code (commands, shell syntax, etc), while variables are intended to store non-executable data.
(BTW, you should almost always put variable references inside double-quotes -- that is, use echo "$foo" instead of just echo $foo. Without double-quotes the variable's contents get parsed in a somewhat weird way that tends to cause bugs.)
There are also some "special" shell variables, that are automatically set by the shell (e.g. $HOME), or influence how the shell behaves (e.g. $PATH controls where it looks for executable commands), or both.
An exported variable is available in the current shell and is also passed to any subprocesses (subshells, other commands, whatever). For example, if I do LC_ALL=en_US.UTF-8, that tells my current shell to use the "en_US.UTF-8" locale settings. On the other hand, if I did export LC_ALL=en_US.UTF-8, that would tell the current shell and all subprocesses and commands it executes to use that locale setting.
Note that a shell variable can be marked as exported separately from defining it, and once exported it stays exported. For example, $PATH is (as far as I know) always exported, so PATH=/foo:/bar has the same effect as export PATH=/foo:/bar (although the latter may be preferred just in case $PATH somehow wasn't already exported).
It's also possible to export a variable to a particular command without defining it in the current shell, by using the assignment as a prefix for the command. For example LC_ALL=en_US.UTF-8 sort filename will tell the sort command to use the "en_US.UTF-8" locale settings, but not apply that to the current shell (or any other commands).
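The three cases described above (plain variable, exported variable, one-shot prefix assignment) can be checked with a small experiment. This is only a sketch: DEMO_VAR is a made-up name, and a child sh -c process stands in for "any other command".

```sh
unset DEMO_VAR

# 1. Plain shell variable: the child process does not see it.
DEMO_VAR=local-only
child_sees_plain=$(sh -c 'echo "${DEMO_VAR-unset}"')

# 2. Mark the existing variable as exported: now the child sees it.
export DEMO_VAR
child_sees_exported=$(sh -c 'echo "${DEMO_VAR-unset}"')

# 3. Prefix assignment: visible to that one command only.
unset DEMO_VAR
child_sees_prefix=$(DEMO_VAR=one-shot sh -c 'echo "${DEMO_VAR-unset}"')
parent_after_prefix=${DEMO_VAR-unset}

echo "plain:    $child_sees_plain"
echo "exported: $child_sees_exported"
echo "prefix:   $child_sees_prefix (parent afterwards: $parent_after_prefix)"
```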

TL;DR:
The shell evaluation order (per POSIX) for the entities in your question is:
aliases --> variables --> command substitutions --> special built-ins --> functions --> regular built-ins
Aliases do not persist across subshells, but variables (and in Bash, functions) can be made to do so with the export command.
Regular built-ins can be overridden by writing functions that have the same name as the regular built-in (since functions expand before regular built-ins). (NOTE: If you're trying to add functionality to the regular built-in, call the built-in with command in your function definition so you don't accidentally create a recursive function.)
Variables can be made readonly with the (special built-in) readonly command, but aliases cannot.
USE CASES:
Export a variable if you need to use a variable across subshells.
Make a variable readonly if you don't want it changed for the life of the parent shell (once performed, this cannot be undone with unset; you must restart the parent shell).
If you want to override or add functionality to a regular built-in, use a function.
NOTE: If you want to be sure that you're using a special or regular built-in and not someone else's function, use builtin the_builtin, or if the shell doesn't support the builtin command, use the POSIX command command -p the_builtin, where the -p switch tells command to use the $PATH that ships with the shell by default (in case the user has overridden PATH).
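A minimal sketch of the "wrap a regular built-in in a function" use case, assuming a POSIX-like shell. The last_dir variable is invented for the example. Without the command prefix, the function would call itself forever.

```sh
# Wrap the regular built-in cd: do the real cd, then record where we ended up.
# "command" skips function lookup, so this does not recurse.
cd() {
    command cd "$@" && last_dir=$PWD
}

cd /
echo "now in: $last_dir"

# Remove the wrapper again so later code sees the plain built-in.
unset -f cd
```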
NOTE: A variable can be made to act like an alias that also persists across subshells and cannot be changed. For example,
#! /bin/sh
my_cmd='ls -al'
export my_cmd
readonly my_cmd
will act like
#! /bin/sh
alias my_cmd='ls -al'
so long as
my_cmd is used without double-quotes (i.e. ${my_cmd}, NOT "${my_cmd}") so it isn't treated as a single string, and
IFS is the standard space-tab-newline and not switched to something else so that the elements of my_cmd are globbed and each part separated by a space is evaluated as a single token (otherwise it will be evaluated as a single string).
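The quoting condition can be seen in a short experiment. This is a sketch that assumes sh or bash with the default IFS. zsh does not split unquoted expansions by default, so the unquoted form behaves differently there.

```sh
my_cmd='ls -a -d'

# Unquoted: the value is split into the words ls, -a, -d, so this runs "ls -a -d /".
unquoted=$(${my_cmd} /)

# Quoted: the shell looks for a single command literally named "ls -a -d" and fails.
quoted_status=0
"${my_cmd}" / 2>/dev/null || quoted_status=$?

echo "unquoted output: $unquoted"
echo "quoted exit status: $quoted_status"
```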
Each shell (e.g. bash, zsh, ksh, yash, etc.) is a bit different, so be sure to review the reference manual for it (they each implement POSIX in a unique way, or sometimes not at all).

Related

How to predictably run shell script in unknown user environment?

Summary
How can I guarantee that my shell scripts will do what I expect, regardless of the environment?
(Let's assume that people have alias'd and function'd everything they can, but that they haven't touched any system binaries eg. /bin/ls)
Explanation
I am distributing shell scripts as part of an app. These shell scripts are executed in the user's environment - this cannot be changed.
This means users may have aliases for anything and functions redefining "standard" behavior. There have already been a few cases when normal shell keywords have been redefined (eg. local), causing unexpected side effects and crashes.
The only tokens that cannot be defined as functions are as follows:
Bash:
! [[ ]] case coproc do done elif else esac fi for function if in select then time until while { }
ZSH:
! [[ case coproc do done elif else end esac fi for foreach function if nocorrect repeat select then time until while { }
I am aware that:
You can escape a word to skip alias lookup
You can use builtin to always run a builtin
You can use command to always run a command
However, builtin and command can be redefined, so \builtin <command> may not always do what I expect.
Aliases are not expanded in bash scripts (unless you explicitly request this), and functions are usually not inherited by child processes. The caller of your script just has to avoid sourcing it. Problems could be environment variables and file handles.
It is difficult to make a script completely self-containing. For instance, I have seen cases where even standard programs (ls, cat,....) are stored in different locations, which means that if you set up your own PATH and don't know anything about the target platform, you have to apply some heuristics (searching a list of "commonly known directories") and hope that your search is correct.
A more reliable way would be to require from the user of the script to provide a certain minimal configuration (typically containing the basic definition for a PATH) and pass this configuration as parameter to your script.
There is one problem pointed out in the comment by Renaud Pacalet: bash allows functions to be exported (using export -f), so in bash you would have to find out which functions exist and explicitly remove their definitions (similarly to what you would do with environment variables). However, I see that you have tagged your question with bash and zsh. If you don't mind which scripting language you use, writing the script in zsh might be better, because zsh does not have exported functions.
One point to keep in mind is that every shell, bash and zsh alike, processes certain files on startup, before the commands in your script have any chance to run. For instance, no matter how you start zsh, it will always process /etc/zshenv. If your script at some point invokes a zsh child script, that child would run /etc/zshenv again.
Of course, those startup files could set up functions, and in zsh, aliases are (AFAIK) even expanded inside scripts. The strategy would therefore be to loop initially over your environment variables, the currently defined functions, and the currently defined aliases (in zsh), and remove those definitions. Then you set up your own definitions (functions, variables).
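A different technique from the unset loop described above, shown only as a hedged sketch: run the sensitive part under env -i, which starts the child with an empty environment. Since exported bash functions also travel through the environment, they are dropped along with ordinary variables. This does not help with anything a startup file defines, and it assumes /bin/sh exists on the target system. HOSTILE_VAR is a made-up name.

```sh
# Simulate a user environment that exports something unexpected.
HOSTILE_VAR=surprise
export HOSTILE_VAR

# A normal child process inherits the variable...
inherited_view=$(/bin/sh -c 'echo "${HOSTILE_VAR-unset}"')

# ...while a child started through env -i begins with an empty environment.
clean_view=$(env -i /bin/sh -c 'echo "${HOSTILE_VAR-unset}"')

echo "inherited: $inherited_view"
echo "clean:     $clean_view"
```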

Script runs when executed but fails when sourced

Original Title: Indirect parameter substitution breaks when the script is sourced (zsh)
zsh 5.7.1 (x86_64-apple-darwin19.0)
GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
I’m developing a shell script on a Mac and I’m trying to keep it portable between bash & zsh, so array indexing is a consideration. I know that I can set KSH_ARRAYS to get indexing to start at 0, but I decided to query the OS for the shell that’s in use and set the start index accordingly, which led to the issue described below.
It made sense (to me anyway!) to use indirect expansion, which is what led to the problem. Consider the script indirect.sh:
#! /bin/bash
declare -r ARRAY_START_BASH=0
declare -r ARRAY_START_ZSH=1
declare -r SHELL_BASH=0
declare -r SHELL_ZSH=1
# Indirect expansion is used to reference the values of the variables declared
# in this case statement e.g. ${!ARRAY_START}
case $(basename $SHELL) in
"bash" )
declare -r SHELL_ID=SHELL_BASH
declare -r ARRAY_START=ARRAY_START_BASH
;;
"zsh" )
declare -r SHELL_ID=SHELL_ZSH
declare -r ARRAY_START=ARRAY_START_ZSH
;;
* )
return 1
;;
esac
echo "Shell ID: ${!SHELL_ID} Index arrays from: ${!ARRAY_START}"
It works fine when run from the command line while in the same directory:
<my home> ~ % echo "$(./indirect.sh)"
Shell ID: 1 Index arrays from: 1
Problems arise when I source the script:
<my home> ~ % echo "$(. ~/indirect.sh)"
/Users/<me>/indirect.sh:28: bad substitution
I don’t understand why sourcing the script changes the behavior of the parameter expansion.
Is this expected behavior? If so, I’d be grateful if someone could explain it and hopefully, offer a work around.
The problem described in the original post has nothing to do with indirect expansion. The difference in behavior is a result of different shells being invoked depending on whether the script is “executed” or “sourced”. These differences reveal the basic flaw in deriving the shell from the $SHELL variable that underpins the script's design. If the shell defined in $SHELL does not match the shebang, the script will fail either when sourced or executed. An explanation follows.
Indirect expansion doesn’t offer value in the given scenario because values could just as easily be assigned directly. They’ll have to be assigned that way regardless given the different syntax used for indirect expansion between shells. In fact, other syntax differences between shells makes the entire premise for detecting the shell moot! However, putting that aside, the difference in behavior is a result of different shells being invoked based on whether the script is “executed” or “sourced”. The behavior of sourcing is well documented with numerous explanations on the web, but for context here’s how it works:
Executing a Script
Use the “./“ syntax to execute a script.
When run this way, the script executes in a sub-shell. Any changes the
script makes to its shell are applied to the sub-shell, not the shell
in which the script was launched, so those changes are lost when the
shell exits because the sub-shell in which it executed is destroyed as
well. For example, if the script changes the working directory, it
does so in the sub-shell. The working directory of the main shell that
launched the script is unchanged when the script terminates. If you
want to make changes to the shell in which the script was launched, it
must be sourced.
Sourcing a Script
Use the “source “ syntax to source a
script. When run this way, the script essentially becomes an argument
for the source command, which handles invoking the appropriate
execution. Some shells (e.g. ksh) use a single period “.” instead of
“source”.
When a script is executed with the “./“ syntax, the shebang at the top of the file is used to determine which shell to use. When a script is sourced, the shebang is ignored and the shell in which the script is launched is used instead. Also note that the period that appears in the “./“ command syntax used to execute a script, is not related to the period that’s occasionally used as an alias for the source command.
The script in the post uses bash in the shebang statement, so it works when executed because it’s run using bash. When it’s sourced from zsh, it encounters the incorrect indirect expansion syntax:
"${!A_VAR}"
The correct syntax is:
"${(P)A_VAR}"
However, correcting the syntax won’t help because it will then fail when executed. The shebang will invoke bash and the syntax will be wrong again. That renders indirection useless for accessing a variable designed to indicate the shell in use. More importantly, a design based on querying an environment variable for the shell is flawed due to differences in the shell that’s ultimately used depending on whether the script is executed or sourced.
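If the shell really must be detected at run time, one common alternative to $SHELL (not what the original script does) is to look at variables that the running interpreter sets itself: bash sets BASH_VERSION and zsh sets ZSH_VERSION. The sketch below assumes zsh is running with its default options (KSH_ARRAYS unset). The current_shell and array_start names are invented for the example.

```sh
# Ask the interpreter that is actually running this code,
# instead of $SHELL, which only names the user's login shell.
if [ -n "${ZSH_VERSION-}" ]; then
    current_shell=zsh
    array_start=1      # zsh default; KSH_ARRAYS would change this
elif [ -n "${BASH_VERSION-}" ]; then
    current_shell=bash
    array_start=0
else
    current_shell=unknown
    array_start=
fi

echo "Shell: $current_shell Index arrays from: ${array_start:-n/a}"
```

Because the check uses only plain sh syntax, it gives the same answer whether the file is executed or sourced.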
To add to your answer (what I'm going to say is too long for a comment): I cannot think of any application where your script would be useful if not sourced. Actually, I came across the need for such a script myself on exactly one occasion:
As my interactive shell I use not only zsh but sometimes also bash, so I have written my .zshrc and .bashrc to set up everything (including defining variables and shell functions for interactive use). In order to save work,
I try to put code which works under both bash and zsh into a single file (say: .commonrc), and my .zshrc and .bashrc have inside them a
source .commonrc
While many things are so different in bash and zsh that I can't put them into .commonrc, some can go there, provided I do some tweaking. One cause of headaches is obviously the different indexing of arrays, which you are apparently trying to solve. So I have a similar feature as well. However, I don't need a case construct for this. Instead, my .bashrc looks like this (using your naming of the variables):
...
declare -r ARRAY_START=0
source .commonrc
...
and my .zshrc looks like this:
...
declare -r ARRAY_START=1
source .commonrc
...
Since it does not happen that the .bashrc is run from a zsh and vice versa, I don't need to query what kind of shell I have.

Access local shell variables in vim

In vim I can access my bash environment variables such as $PWD and $PATH. I would like to know how to access my temporary shell variables in vim too.
For example, suppose I was in my terminal and define a variable foo="bar". Then I enter vim and try to access this variable with the following command :!echo $foo, but it does not recognize this variable. From my understanding, vim starts a new shell each time a bash command is invoked and then closes it immediately after. Is there a way to use the same shell in vim that my local variable foo was defined in?
No, you can't interact with the parent shell from a subprocess it spawned (without that shell's active participation, which isn't reasonably/practically available in the scenario at hand) -- but you can export your variables to make them accessible to new shells started in child processes.
Running
set -a
...will make any variable defined going forward be automatically exported to the environment, even without an explicit export command.
Since (unlike the C system() function) vim's system() honors the SHELL environment variable, if SHELL=/bin/bash (or :set shell=/bin/bash has been run in vim), you can also invoke exported functions from vim. That is, if you define the function and export it as follows:
foo() { echo "bar"; }
export -f foo
...then you can invoke it with !foo from inside vim.
Even then, however, this is running in a new, transient shell instance, not the original parent process.
Explanation
Environment variables and shell variables are two entirely different concepts, but as we manipulate them in a similar way in bash, it's easy to get confused.
Whenever a process is created (by fork), it may include an environment, given by its parent at fork-time. The child process may then access and modify its content. How this is done as a user depends on the program :
In vim, you can access an environment variable like this : :echo $foo
In bash, you can access it like this : $ echo "$foo"
In most programming languages, you can access it with a syntax coherent with the rest of the language, such as ENV['foo'] in ruby
On the other hand, a program may allocate memory for any internal use, but notably, it will quite often define and use variables. Once again, this depends on the program :
In vim, you would use the :let command to assign an internal variable
In bash, you would assign a variable with $ foo='bar', and then read it with $ echo "$foo"
In most programming languages, you have a variation of the foo='bar' syntax, sometimes with type declarations, etc
As you can see, bash uses the same syntax to read an environment variable and one of its own private variables, which can lead to some confusion.
When you execute vim from your bash shell, the environment is copied over from the parent process (bash) to the child (vim), but the private memory of bash (including the variables you may have defined) is not.
Thus, accessing them from the child process would require some inter-process communication mechanism, between parent and child. While technically doable, this option is not implemented in bash nor vim.
Solution
In order for your variable to be accessible from vim (or any forked process, for that matter), you need it to be present in the environment of your vim process.
Several options to do that :
$ export foo='bar' : This will mark your variable for export to the environment of subsequently executed commands. That's what you want in most cases.
$ foo='bar' vim : This adds your variable to the environment of this vim command. Very useful for troubleshooting, or for one-liners.
$ set -a : As you can see in the bash manpage, this marks every subsequent definition for export to the environment of subsequent commands. It's essentially equivalent to prefixing every subsequent definition with export.
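A quick way to see what set -a does, as a sketch: foo is the variable from the question, baz is made up for contrast, and a child sh -c process stands in for vim.

```sh
unset foo baz

set -a          # from here on, every assignment is exported automatically
foo=bar
set +a          # back to normal: assignments stay private to this shell
baz=qux

# The child only finds foo in its environment.
seen_by_child=$(sh -c 'echo "${foo-unset} ${baz-unset}"')
echo "$seen_by_child"
```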
To go further
The question uses the :!echo $foo syntax to display the value of foo, which is yet another usecase. The ! here is actually an escape sequence that allows you to execute a shell command from vim.
However, vim cannot execute anything in the parent shell (the one you executed the vim command in), so it creates a new bash shell in a child process, executes echo in it, and displays the result.
In the current case, the result is mostly the same, but it could easily be misleading in other situations, so it's important to understand what is happening here.
There is another vim syntax, using expand, that allows one to lookup variables : :echo expand("$foo")
It however works entirely differently.
If no environment variable named foo is known inside vim, vim will invoke a shell to look it up (similarly to what ! would do).
This option is much slower than an environment lookup, and is not recommended for most use cases.
If you want to use a value from your shell on the :substitute command, there's actually a way to do it.
I don't know if it solves your need but here we go.
Let's say we want to substitute Mydir by your PWD:
:s/Mydir/\=expand($PWD)/g

Create shell alias with semi-colon character

I've noticed that I have a tendency to mistype ls as ;s, so I decided that I should just create an alias so that instead of throwing an error, it just runs the command I mean.
However, knowing that the semi-colon character has a meaning in shell scripts/commands, is there any way to allow me to create an alias with that the semi-colon key? I've tried the following to no avail:
alias ;s=ls
alias ";s"=ls
alias \;=ls
Is it possible to use the semi-colon as a character in a shell alias? And how do I do so in ZSH?
First and foremost: Consider solving your problem differently - fighting the shell grammar this way is asking for trouble.
As far as I can tell, while you can define such a command - albeit not with an alias - the only way to call it is quoted, e.g. as \;s - which defeats the purpose; read on for technical details.
An alias won't work: while zsh allows you to define it (which, arguably, it shouldn't), the very mechanism that would be required to call it - quoting - is also the very mechanism that bypasses aliases and thus prevents invocation.
You can, however, define a function (zsh only) or a script in your $PATH (works in zsh as well as in bash, ksh, and dash), as long as you invoke it quoted (e.g., as \;s or ';s' or ";s"), which defeats the purpose.
For the record, here are the command definitions, but, again, they can only be invoked quoted.
Function (works in zsh only; place in an initialization file such as ~/.zshrc):
';s'() { ls "$@" }
Executable script ;s (works in dash, bash, ksh and zsh; place in a directory in your $PATH):
#!/bin/sh
ls "$@"

'which' command is incorrect

I have a shell script in my home directory called "echo". I added my home directory to my path, so that this echo would replace the other one.
To do this, I used: export PATH=/home/me:$PATH
When I do which echo, it shows the one I want. /home/me/echo
But when I actually do something like echo asdf it uses the system echo.
Am I doing something wrong?
which is an external command, so it doesn't have access to your current shell's built-in commands, functions, or aliases. In fact, at least on my system, /usr/bin/which is a shell script, so you can examine it and see how it works.
If you want to know how your shell will interpret a command, use type rather than which. If you're using bash, type -a will print all possible meanings in order of precedence. Consult your shell's documentation for details.
For most shells, built-in commands take precedence over commands in your $PATH. The whole point of having a built-in echo, for example, is that it's faster than loading /bin/echo into memory.
If you want your own echo command to override the shell's built-in echo, you can define it as a shell function.
On the other hand, overriding the built-in echo command doesn't strike me as a good idea in the first place. If it behaves the same as the built-in echo, there's not much point. If it doesn't, then it could break scripts that use echo expecting it to work a certain way. If possible, I suggest giving your command a different name. If it's an enhanced version of echo, you could even call it Echo.
It is likely using the shell's builtin.
If you want the one in your path you can do
`which echo` asdf
From this little article that explains the rules, here's a list in descending order of precedence:
Aliases
Shell functions
Shell builtin commands
Hash tables
PATH variable
echo is a shell builtin command (at least in bash) and PATH has the lowest priority. I guess you'll need to create a function or an alias.
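The precedence described in these answers can be reproduced in a throwaway directory. This is a sketch: demo_dir and the "custom:" script are invented for the experiment, and it assumes the temporary directory allows executing files.

```sh
# Put a custom "echo" script first in PATH, as in the question.
demo_dir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'printf "custom: %s\n" "$*"' > "$demo_dir/echo"
chmod +x "$demo_dir/echo"
old_path=$PATH
PATH=$demo_dir:$PATH

via_name=$(echo asdf)               # the builtin still wins over PATH
via_path=$("$demo_dir/echo" asdf)   # an explicit path runs the script

# A function takes precedence over the builtin.
echo() { "$demo_dir/echo" "$@"; }
via_function=$(echo asdf)

# Clean up: drop the function, restore PATH, remove the directory.
unset -f echo
PATH=$old_path
rm -rf "$demo_dir"

printf '%s\n' "by name: $via_name" "by path: $via_path" "by function: $via_function"
```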

Resources