How does bash tab completion work? - bash

I have been spending a lot of time in the shell lately and I'm wondering how the tab autocomplete works. What's the mechanism behind it? How does the bash know the contents of every directory?

There are two parts to the autocompletion:
The readline library, as already mentioned by fixje, manages the command line editing, and calls back to bash when tab is pressed, to enable completion. Bash then gives (see next point) a list of possible completions, and readline inserts as much characters as are identified unambiguously by the characters already typed in. (You can configure the readline library quite much, see the section Command line editing of the Bash manual for details.)
Bash itself has the built-in complete to define a completion mechanism for individual commands. If for the current command nothing is defined, it used completion by file name (using opendir/readdir, as Ignacio said).
The part to define your own completions is described in the section Programmable Completion. In short, with
complete «options» «command» you define the completion for some command. For example complete -u su says
when completing an argument for the su command, search for users of the current system.
If this is more complicated than the
normal options can cover (e.g. different completions depending on argument index, or depending on previous arguments),
you can use -F function, which will then invoke a shell function to generate the list of possible completions.
(This is used for example for the git completion, which is very complicated, depending on subcommand and sometimes
on options given, and using sometimes names of branches (which are nothing bash knows about).
You can list the existing completions defined in your current bash environment using simply complete, to have an impression on what is possible. If you have the bash-completion package installed (or however it is named on your system), completions for a lot of commands are installed, and as Wrikken said, /etc/bash_completion contains a bash script which is then often executed at shell startup to configure this. Additional custom completion scripts may be placed in /etc/bash_completion.d; those are all sourced from /etc/bash_completion.

If you are interested in the basics:
Bash uses readline which features history and basic completion. You could inspect the source if you want to get a detailed understanding.
Furthermore, you can use readline to build your own CLI interfaces with completion

Related

What is the proper install location of shell function scripts, in the context of Homebrew?

I'm writing a Homebrew formula for EVM, a utility that consists of shell function scripts (as distinct from shell executables). The shell functions in question are normally sourced, then executed manually in an interactive shell. It is unlikely you would ever write a script to call them.
I would therefore like to know what the proper install location for shell function scripts is, in the context of a Homebrew formula.
I have looked at equivalent utilities, but there does not seem to be a consensus on this:
Some put shell functions in libexec, which IIRC in Homebrew is treated as strictly private to the formula (something in libexec will never be added to the PATH). The thing is that these functions must then be sourced by the user, which seems (on the surface) like it breaks the formula-private visibility contract.
Others put shell functions in /usr/share/<shell>/site-functions. This location is on the FPATH for most shells, so for functions that are used for scripting this seems like a good place. But I'm unclear if this is the right place for functions which would generally be used interactively.
For bonus points: is there an install location from which shell functions can be auto-loaded by the shell (i.e. no need to source them)? Or is there no alternative but to add source /full/path/to/the-script.sh to one's .bashrc / .zshrc?
My Homebrew formula (work in progress) is here: homebrew-evm

How can I generate a list of every valid syntactic operator in Bash including input and output?

According to the Bash Reference Manual, the Bash scripting language is constituted of 4 distinct subclasses of syntactic elements:
built-in commands (alias, cd)
reserved words (if, function)
parameters and variables ($, IFS)
functions (abort, end-of-file - activated with keybindings such as Ctrl-d)
Apart from reading the manual, I became inherently curious if there was a programmatic way to list out or generate all such keywords, at least from one of the above categories. I think this could be useful in some contexts. Sometimes I wish I could see all the options available to me for what I can write in any given moment, and having that information as data, instead of a formatted manual, is convenient, focused, and can be edited, in case you want to strike out commands you know well, or that are too obscure for now.
My understanding is that Bash takes the input into stdin and passes it to the running shell process. When code is distributed in a production-ready form, it is compiled, so it runs faster. Unlike using a Python REPL, you don’t have access to the Bash source code from within Bash, so it is not a very direct route to write a program that searches through source files to find various defined commands. I mean that if you wanted to list all functions, Python has the dir() function which programmatically looks for function names in the namespace. But I don’t think Bash can do that. I think it doesn’t have a special syntax in its source files which makes it easy to find and identify all the keywords. Instead, they will be found if you simply enter them - like cd will “find” the program cd because $PATH returns the path to that command - but there’s no special way to discover them.
Or am I wrong? Technically, you could run a “brute force” search by generating every combination of symbols of every length and record when you did not get “error: unknown command” as a response.
Is there any other clever programmatic way to do this?
I mean I want to see a list of every symbol or string that the bash
compiler
Bash is not a compiler. It and every other shell I know are interpreters of various languages.
recognises and knows what to do with, including commands like
“ls” or just a symbol like “*”. I also want to see the inputs and
outputs for each symbol, i.e., some commands are executed in the shell
prompt by themselves, but what data type do they return?
All commands executed by the shell have an exit status, which is a number between 0 and 255. This is as close to a "return type" as you get. Many of them also produce idiosyncratic output to one or two streams (a standard output stream and a standard error stream) under some conditions, and many have other effects on the shell environment or operating environment.
And some
require a certain data type to standard input.
I can't think of a built-in utility whose expected input is well characterized as having a particular data type. That's not really a stream-oriented concept.
I want to do this just as a rigorous way to study the language.
If you want to rigorously study the language, then you should study its manual, where everything you describe has already been compiled. You might also want to study the POSIX shell command language manual for a slightly different perspective, which is more thorough in some areas, though what it documents differs in a few details from Bash's default behavior.
If you want to compile your own summary of Bash syntax and behavior, then those are the best source materials for such an effort.
You can get a list of all reserved words and syntactic elements of bash using this trick:
help -s '*' | cut -d: -f1
Or more accurately:
help -s \* | awk -F ': ' 'NR>2&&!/variables/{print $1}'

Why does Scala use a reversed shebang (!#) instead of just setting interpreter to scala

The scala documentation shows that the way to create a scala script is like this:
#!/bin/sh
exec scala "$0" "$#"
!#
/* Script here */
I know that this executes scala with the name of the script file and the arguments passed to it, and that the scala command apparently knows to read a file that starts like this and ignore everything up to the reversed shebang !#
My question is: is there any reason why I should use this (rather verbose) format for a scala script, rather than just:
#!/bin/env scala
/* Script here */
This, as far a I can tell from a quick test, does exactly the same thing, but is less verbose.
How old is the documentation? Usually, this sort of thing (often referred to as 'the exec hack') was recommended before /bin/env was common, and this was the best way to get the functionality. Note that /usr/bin/env is more common than /bin/env, and ought to be used instead.
Note that it's /usr/bin/env, not /bin/env.
There are no benefits to using an intermediate shell instead of /usr/bin/env, except running in some rare antique Unix variants where env isn't in /usr/bin. Well, technically SCO still exists, but does Scala even run there?
However the advantage of the shell variant is that it gives an opportunity to tune what is executed, for example to add elements to PATH or CLASSPATH, or to add options such as -savecompiled to the interpreter (as shown in the manual). This may be why the documentation suggests the shell form.
I am not on the Scala development team and I don't know what the historical motivation for the Scala documentation was.
Scala did not always support /usr/bin/env. No particular reason for it, just, I imagine, the person who wrote the shell scripting support was not familiar with that syntax, back in the mid 00's. The documentation followed what was supported, and I added /usr/bin/env support at some point (iirc), but never bothered changing the documentation, it would seem.

Determine if bash/zsh/etc. is running under Midnight Commander

Simple question. I'd like to know how to tell whether the current shell is running as a mc subshell or not. If it is, I'd like to enter a degraded mode without some features mc can't handle.
In particular, I'd like this to
Be as portable as possible
Not rely on anything outside the shell and basic universal external commands.
Though it's not documented in the man page, a quick experiment shows that mc sets two environment variables: $MC_TMPDIR and $MC_SID. (It also sets $HISTCONTROL, but that's not specific to mc; it affects the behavior of bash, and could have been set by something other than mc.)
If you don't want to depend on undocumented features, you can always set an environment variable yourself. For example, in bash:
mc() { MC_IS_RUNNING=1 command mc "$#" ; }
Entering a "degraded mode" is another matter; I'm not sure how you'd do that. I don't know of any way in bash to disable specified features. You could disable selected built-in commands by defining functions that override them. What features do you have in mind?

How to enable tab completion from the terminal specific to the executable

In bash, I believe it is possible to enable tab completion on the terminal for terms that are specific to the executable being invoked.
For example, given an executable "eat" with valid arguments {cake, carrot, banana}, typing 'eat car' should complete to 'eat carrot'.
I believe this is possible because I have seen it with 'ant' tab-completing its targets (though how this was set up I don't know).
How can this behaviour be implemented?
This is done with scripts in /etc/bash_completion.d/ and if you want to write your own completion support for an executable, here's a tutorial to get you started.
If you only need to get the behaviour working for common executables, your Linux distro probably has a bash-completion package available with support for common commands.
This is quite similar to filename globbing where the shell will attempt to autocomplete based on the globbing wildcard...for instance....
echo foo*
will list all files in the current directory beginning with 'foo'...the bash shell globbed the wildcard and expanded it into a list of files...
MSDOS had a similar concept, although it was not explicitly linked in at run-time, I'm talking about the old Turbo C stuff, when the wildcard globbing was activated by linking with 'wildargs.obj' (if my memory serves me correct), internally, that code will iterate through the directory and expand the list based on wildcard pattern matching.
In Linux/*nix land, globbing is standard, but however, you cannot manually hit the sequence Tab key to do the pattern matching or completion...as different terminals may translate the tab key differently and of course handle it differently...

Resources