What are the minimum required environment variables? - shell

I am writing a shell.
With the execvpe system call, I can run a program and control its environment. What are the minimum values I need to pass through here?
Alternatively, I understand that child processes should have a copy of their parent's environment, possibly with some values added. While testing my shell, I am running it from within bash from within my terminal from within a window manager, etc etc. What are the bare basics that I can assume are in my environment? If I were to run my shell straight from a TTY (the "lowest level", as far as I understand), what can I expect?

That’s a very broad question.  To a certain extent,
programs should be able to run with no environment at all.
“X” display (i.e., GUI) programs need to know
where they are supposed to display. 
This information is usually provided
through the DISPLAY environment variable,
but can also be passed on the command line. 
There are probably other environment variables that are essential
(or nearly so) to “X” programs;
it’s been a while since I’ve looked under that hood.
Any program that needs to use special characteristics of your terminal
needs the TERM environment variable. 
“Special characteristics” means being able to set colors
(as ls and grep can do, subject to options),
move around the screen (like vi / vim),
or even know the size of the screen (like less). 
Note that size of the screen may also be available
through ROWS and COLUMNS.
Any program that needs to know the date and time
as perceived / understood by the user needs to know the time zone (TZ) —
although, if you’re willing to work with absolute (GMT / UTC),
you don’t need this.
etc.

The minimum that you need is a working PATH variable. Any extras beyond that depend on what programs you want to execute.
POSIX has a list of commonly-used environment variables, very few programs use more than a few of those.
Generally if you're using execvp*, you're not giving full pathnames for the executables. It makes your programs much simpler, you do not have to provide a full pathname for each executable, as is needed by the plain execv. POSIX describes these functions as
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
and (referring to the parameters of the various exec* functions):
The argument path points to a pathname that identifies the new process image file.
The argument file is used to construct a pathname that identifies the new process image file. If the file argument contains a slash character, the file argument shall be used as the pathname for this file. Otherwise, the path prefix for this file is obtained by a search of the directories passed as the environment variable PATH (see XBD Environment Variables). If this environment variable is not present, the results of the search are implementation-defined.
and (remember that "file" is referring to execvp rather than execv, so the environ variable applies to the search using PATH for the "file" parameter):
For those forms not containing an envp pointer (execl(), execv(), execlp(), and execvp()), the environment for the new process image shall be taken from the external variable environ in the calling process.
So... you could technically remove the entire PATH variable, but the result would be implementation-defined.

The minimum neccessary environment is empty. You don't need anything.
e.g.
$ env -i env
$
We can see that env -i has created a blank environment.
We can take this further:
$ env -i /bin/bash
sweh#server:/home/sweh$ env
LS_COLORS=
PWD=/home/sweh
SHLVL=1
_=/usr/bin/env
We can see that bash has set a few variables, but nothing was inherited.
Now such an environment may break some things; e.g. a missing TERM variable means that vi or less may not work properly
$ less foo
WARNING: terminal is not fully functional
foo (press RETURN)
So, really, you need to determine what programs you expect to run inside the environment and what their needs are.

Related

How can I generate a list of every valid syntactic operator in Bash including input and output?

According to the Bash Reference Manual, the Bash scripting language is constituted of 4 distinct subclasses of syntactic elements:
built-in commands (alias, cd)
reserved words (if, function)
parameters and variables ($, IFS)
functions (abort, end-of-file - activated with keybindings such as Ctrl-d)
Apart from reading the manual, I became inherently curious if there was a programmatic way to list out or generate all such keywords, at least from one of the above categories. I think this could be useful in some contexts. Sometimes I wish I could see all the options available to me for what I can write in any given moment, and having that information as data, instead of a formatted manual, is convenient, focused, and can be edited, in case you want to strike out commands you know well, or that are too obscure for now.
My understanding is that Bash takes the input into stdin and passes it to the running shell process. When code is distributed in a production-ready form, it is compiled, so it runs faster. Unlike using a Python REPL, you don’t have access to the Bash source code from within Bash, so it is not a very direct route to write a program that searches through source files to find various defined commands. I mean that if you wanted to list all functions, Python has the dir() function which programmatically looks for function names in the namespace. But I don’t think Bash can do that. I think it doesn’t have a special syntax in its source files which makes it easy to find and identify all the keywords. Instead, they will be found if you simply enter them - like cd will “find” the program cd because $PATH returns the path to that command - but there’s no special way to discover them.
Or am I wrong? Technically, you could run a “brute force” search by generating every combination of symbols of every length and record when you did not get “error: unknown command” as a response.
Is there any other clever programmatic way to do this?
I mean I want to see a list of every symbol or string that the bash
compiler
Bash is not a compiler. It and every other shell I know are interpreters of various languages.
recognises and knows what to do with, including commands like
“ls” or just a symbol like “*”. I also want to see the inputs and
outputs for each symbol, i.e., some commands are executed in the shell
prompt by themselves, but what data type do they return?
All commands executed by the shell have an exit status, which is a number between 0 and 255. This is as close to a "return type" as you get. Many of them also produce idiosyncratic output to one or two streams (a standard output stream and a standard error stream) under some conditions, and many have other effects on the shell environment or operating environment.
And some
require a certain data type to standard input.
I can't think of a built-in utility whose expected input is well characterized as having a particular data type. That's not really a stream-oriented concept.
I want to do this just as a rigorous way to study the language.
If you want to rigorously study the language, then you should study its manual, where everything you describe has already been compiled. You might also want to study the POSIX shell command language manual for a slightly different perspective, which is more thorough in some areas, though what it documents differs in a few details from Bash's default behavior.
If you want to compile your own summary of Bash syntax and behavior, then those are the best source materials for such an effort.
You can get a list of all reserved words and syntactic elements of bash using this trick:
help -s '*' | cut -d: -f1
Or more accurately:
help -s \* | awk -F ': ' 'NR>2&&!/variables/{print $1}'

% in Windows environmental variables value

What does % mean in windows environmental variables ?
%SystemRoot%\system32;
%SystemRoot%;
%SystemRoot%\System32\Wbem;
Especially the Path, TMP, TEMP variable values have this sign. There might be other variables also, but I came across only these three.
Do I need to bother about it while setting my own path variables ?
Do I need to bother about it while setting my own path variables ?
Under normal circumstances, no, you don't. You would only do this if you wanted the effective value of PATH to depend on some other environment variable. Even then it is only a convenience, never a necessity.
As a real-world example of when it might be convenient, suppose you've written a program that automates updating the Java SDK to the latest version, so your users don't have to do it by hand. Updating the SDK moves it to a different location, so you probably want to add the new location of the SDK to the path, and remove the old one.
You could do that the hard way, by parsing PATH each time, locating the part that points to the old location and changing it appropriately. But that's a pain, and if you're doing this globally, the users don't have any choice over whether Java is in the path, even if they don't use it. So instead you might create a variable JAVA_PATH that points to the current SDK location. That way, it is easy to change, and individual users can choose whether or not to put %JAVA_PATH% in their own paths.
In Microsoft's case (the examples you noticed) the system root is never going to move, but by using a variable they could hard-code the default value of PATH rather than having to explicitly generate it during operating system installation.
PS: the environment variables referenced in PATH must all be system variables. Referencing a user variable will not work.
%VariableName% is the syntax for referencing an environment variable. The actual name is the part between the % symbols.
So your first line, when fully expanded, would evaluate to the value of the SystemRoot variable, followed by \system32;.
You'll need to use %...% if you want to make use of environment variables in the Windows shell, or if you want to define environment variables that reference other variables.

Make Variables from Parent Shell Script

In a makefile I'm reading, there are many references to a variable, projdir, defined and exported in the shell script that calls make; there are many instances of $(projdir) in the makefile.
From the make manual at http://www.gnu.org/software/make/manual/make.html#Reference, there are only two types of variables: recursively expanded variables and simply expanded variables. It seems projdir is neither one.
Question 1: Since the makefile works, it must be true that makefiles can access shell variables defined in the parent shell environment. Why is this not documented (or perhaps I did not find the correct documentation)?
Question 2: Unrelated to question 1, I see in Section 6.2 the line all:;echo $(foo). Why is the semicolon ; needed here?
Question 1. Environment variables automatically become make variables. This is explained in section 6.10:
Variables in make can come from the environment in which make is run. Every environment variable that make sees when it starts up is transformed into a make variable with the same name and value.
Question 2: Putting a semicolon after a rule name and optional prerequisites allows you to start the recipe on the same line. This is explained in section 4.2:
The first recipe line may appear on the line after the prerequisites, with a tab character, or may appear on the same line, with a semicolon.
They used this syntax in the example in 6.2 for brevity, so they could show the whole rule inline in the sentence; I think it's rarely used in practice.
From the make manual:
Variables in make can come from the environment in which make is run. Every environment variable that make sees when it starts up is transformed into a make variable with the same name and value.
As to your second question:
Why is the semicolon ; needed here?
Without the semicolon the right-hand side of : would specify the list of dependencies. The semicolon terminates that (empty) list.
Variables have both a flavor and an origin.
Expansion type is a flavor.
make contains a flavor function which can be used to find the type of variable in question. It will return one of undefined, recursive or simple.
Variables also have an origin which is where their value has come from. What you are seeing here is one of those possible origins.
The origin function will tell you the origin of a variable.
There are seven possible origins: undefined, default, environment, environment override, file, command line, override and automatic.
So your projdir variable has a flavor of simple and an origin of environment.

Determine if bash/zsh/etc. is running under Midnight Commander

Simple question. I'd like to know how to tell whether the current shell is running as a mc subshell or not. If it is, I'd like to enter a degraded mode without some features mc can't handle.
In particular, I'd like this to
Be as portable as possible
Not rely on anything outside the shell and basic universal external commands.
Though it's not documented in the man page, a quick experiment shows that mc sets two environment variables: $MC_TMPDIR and $MC_SID. (It also sets $HISTCONTROL, but that's not specific to mc; it affects the behavior of bash, and could have been set by something other than mc.)
If you don't want to depend on undocumented features, you can always set an environment variable yourself. For example, in bash:
mc() { MC_IS_RUNNING=1 command mc "$#" ; }
Entering a "degraded mode" is another matter; I'm not sure how you'd do that. I don't know of any way in bash to disable specified features. You could disable selected built-in commands by defining functions that override them. What features do you have in mind?

finding variable and function memory addresses in shell

In shell programming, is there any way to find the variable and function addresses?
Something like in c or c++ when we output the &x, we can get x memory address.
It depends on what you want. Most shells only support symbolic (or "soft") references, variables that contain the names of other variables. Korn shell 93 has name references, created using typeset -n, but the address is not exposed. Take a look at that in man ksh, it might help.
But even if you could get the memory address of variables, what would you do with it? Shell variables are not structured in a primitive manner as in C, so pointer arithemtic would be no use. You could not iterate through an array or character string just by incrementing an address. The best you could do is investigate the memory of that particular version of that particular shell.
Since the shells are mostly written in C, you could always compile the shell source with debug (-g) and connect using a debugger. No idea what that would buy you though, so my question has to be: why?
Edit; from a comment I see you want the stack size - I'm pretty sure that shell variables will be allocated either on the heap or in the environment block.

Resources