This question already has answers here:
Meaning of $? (dollar question mark) in shell scripts
(8 answers)
Closed 4 years ago.
I'm trying to learn shell scripting, and I need to understand someone else's code. What is the $? variable hold? I can't Google search the answer because they block punctuation characters.
$? is used to find the return value of the last executed command.
Try the following in the shell:
ls somefile
echo $?
If somefile exists (regardless whether it is a file or directory), you will get the return value thrown by the ls command, which should be 0 (default "success" return value). If it doesn't exist, you should get a number other then 0. The exact number depends on the program.
For many programs you can find the numbers and their meaning in the corresponding man page. These will usually be described as "exit status" and may have their own section.
That is the exit status of the last executed function/program/command. Refer to:
exit / exit status # tldp.org
Special Shell Variables # tldp.org
Special Characters # tlpd.org
A return value of the previously executed process.
10.4 Getting the return value of a program
In bash, the return value of a program is stored in a special variable
called $?.
This illustrates how to capture the return value of a program, I
assume that the directory dada does not exist. (This was also
suggested by mike)
#!/bin/bash
cd /dada &> /dev/null
echo rv: $?
cd $(pwd) &> /dev/null
echo rv: $?
See Bash Programming Manual for more details.
Minimal POSIX C exit status example
To understand $?, you must first understand the concept of process exit status which is defined by POSIX. In Linux:
when a process calls the exit system call, the kernel stores the value passed to the system call (an int) even after the process dies.
The exit system call is called by the exit() ANSI C function, and indirectly when you do return from main.
the process that called the exiting child process (Bash), often with fork + exec, can retrieve the exit status of the child with the wait system call
Consider the Bash code:
$ false
$ echo $?
1
The C "equivalent" is:
false.c
#include <stdlib.h> /* exit */
int main(void) {
exit(1);
}
bash.c
#include <unistd.h> /* execl */
#include <stdlib.h> /* fork */
#include <sys/wait.h> /* wait, WEXITSTATUS */
#include <stdio.h> /* printf */
int main(void) {
if (fork() == 0) {
/* Call false. */
execl("./false", "./false", (char *)NULL);
}
int status;
/* Wait for a child to finish. */
wait(&status);
/* Status encodes multiple fields,
* we need WEXITSTATUS to get the exit status:
* http://stackoverflow.com/questions/3659616/returning-exit-code-from-child
**/
printf("$? = %d\n", WEXITSTATUS(status));
}
Compile and run:
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o bash bash.c
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o false false.c
./bash
Output:
$? = 1
In Bash, when you hit enter, a fork + exec + wait happens like above, and bash then sets $? to the exit status of the forked process.
Note: for built-in commands like echo, a process need not be spawned, and Bash just sets $? to 0 to simulate an external process.
Standards and documentation
POSIX 7 2.5.2 "Special Parameters" http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_02 :
? Expands to the decimal exit status of the most recent pipeline (see Pipelines).
man bash "Special Parameters":
The shell treats several parameters specially. These parameters may only be referenced; assignment to them is not allowed. [...]
? Expands to the exit status of the most recently executed foreground pipeline.
ANSI C and POSIX then recommend that:
0 means the program was successful
other values: the program failed somehow.
The exact value could indicate the type of failure.
ANSI C does not define the meaning of any vaues, and POSIX specifies values larger than 125: What is the meaning of "POSIX"?
Bash uses exit status for if
In Bash, we often use the exit status $? implicitly to control if statements as in:
if true; then
:
fi
where true is a program that just returns 0.
The above is equivalent to:
true
result=$?
if [ $result = 0 ]; then
:
fi
And in:
if [ 1 = 1 ]; then
:
fi
[ is just an program with a weird name (and Bash built-in that behaves like it), and 1 = 1 ] its arguments, see also: Difference between single and double square brackets in Bash
$? is the result (exit code) of the last executed command.
It is the returned error code of the last executed command. 0 = success
$? is the exit status of a command, such that you can daisy-chain a series of commands.
Example
command1 && command2 && command3
command2 will run if command1's $? yields a success (0) and command3 will execute if $? of command2 will yield a success
The exit code of the last command ran.
It is well suited for debugging in case your script exit if set -e is used. For example, put echo $? after the command that cause it to exit and see the returned error value.
Related
I wanted to understand bash builtins. Hence the following questions:
When I think of the term builtin I am thinking that the bash executable has a function defined in its symbol table that other parts of the executable can access without actually having to fork. Is this what builtin means?
I also see that some builtins have a separate executable. For instance type [ returns [ is a shell builtin. But then I also see an executable named /usr/bin/[. Is it correct to say that the same code is available through two executables: one through bash program and another through /usr/bin/[?
Loosely speaking, the program version of the built-ins is used when the shell interpreter is not available or not needed. Let's explain it in more details...
When you run a shell script, the interpreter recognizes the built-ins and will not fork/exec but merely call the function corresponding to the built-in. Even if you call them from an C/C++ executable through system(), the latter launches a shell first and then makes the spawn shell run the built-in.
Here is an example program, which runs echo message thanks to system() library service:
#include <stdlib.h>
int main(void)
{
system("echo message");
return 0;
}
Compile it and run it:
$ gcc msg.c -o msg
$ ./msg
message
Running the latter under strace with the -f option shows the involved processes. The main program is executed:
$ strace -f ./msg
execve("./msg", ["./msg"], 0x7ffef5c99838 /* 58 vars */) = 0
Then, system() triggers a fork() which is actually a clone() system call. The child process#5185 is launched:
clone(child_stack=0x7f7e6d6cbff0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD
strace: Process 5185 attached
<unfinished ...>
The child process executes /bin/sh -c "echo message". The latter shell calls the echo built-in to display the message on the screen (write() system call):
[pid 5185] execve("/bin/sh", ["sh", "-c", "echo message"], 0x7ffdd0fafe28 /* 58 vars */ <unfinished ...>
[...]
[pid 5185] write(1, "message\n", 8message
) = 8
[...]
+++ exited with 0 +++
The program version of the built-ins is useful when you need them from a C/C++ executable without an intermediate shell for the sake of the performances. For instance, when you call them through execv() function.
Here is an example program which does the same thing as the preceding example but with execv() instead of system():
#include <unistd.h>
int main(void)
{
char *av[3];
av[0] = "/bin/echo";
av[1] = "message";
av[2] = NULL;
execv(av[0], av);
return 0;
}
Compile and run it to see that we get the same result:
$ gcc msg1.c -o msg1
$ ./msg1
message
Let's run it under strace to get the details. The output is shorter because no sub-process is involved to execute an intermediate shell. The actual /bin/echo program is executed instead:
$ strace -f ./msg1
execve("./msg1", ["./msg1"], 0x7fffd5b22ec8 /* 58 vars */) = 0
[...]
execve("/bin/echo", ["/bin/echo", "message"], 0x7fff6562fbf8 /* 58 vars */) = 0
[...]
write(1, "message \1\n", 10message
) = 10
[...]
exit_group(0) = ?
+++ exited with 0 +++
Of course, if the program is supposed to do additional things, a simple call to execv() is not sufficient as it overwrites itself by the /bin/echo program. A more elaborated program would fork and execute the latter program but without the need to run an intermediate shell:
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
int main(void)
{
if (fork() == 0) {
char *av[3];
av[0] = "/bin/echo";
av[1] = "message";
av[2] = NULL;
execv(av[0], av);
}
wait(NULL);
// Make additional things before ending
return 0;
}
Compile and run it under strace to see that the intermediate child process executes the /bin/echo program without the need of an intermediate shell:
$ gcc msg2.c -o msg2
$ ./msg2
message
$ strace -f ./msg2
execve("./msg2", ["./msg2"], 0x7ffc11a5e228 /* 58 vars */) = 0
[...]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 5703 attached
, child_tidptr=0x7f8e0b6e0810) = 5703
[pid 5703] execve("/bin/echo", ["/bin/echo", "message"], 0x7ffe656a9d08 /* 58 vars */ <unfinished ...>
[...]
[pid 5703] write(1, "message\n", 8message
) = 8
[...]
[pid 5703] +++ exited with 0 +++
<... wait4 resumed>NULL, 0, NULL) = 5703
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5703, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
exit_group(0) = ?
+++ exited with 0 +++
the bash executable has a function defined in its symbol table
There are builtins that are included inside Bash executable. You can load builtins dynamically from a separate shared library on runtime.
can access without actually having to fork
Yes.
Is it correct to say that the same code is available through two executables: one through bash executable and another through /usr/bin/[?
No, it's a different source code. One is a Bash builtin and the other is a program. It will be a different source code. There is also different behavior in grey areas.
$ printf "%q\n" '*'
\*
$ /bin/printf "%q\n" '*'
'*'
$ time echo 1
1
real 0m0.000s
user 0m0.000s
sys 0m0.000s
$ /bin/time echo 1
1
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext+0avgdata 2392maxresident)k
64inputs+0outputs (1major+134minor)pagefaults 0swaps
$ [ -v a ]
$ /bin/[ -v a ]
/bin/[: ‘-v’: unary operator expected
I am running the following script using tcsh. In my while loop, I'm running a C++ program that I created and will return a different exit code depending on certain things. While it returns an exit code of 0, I want the script to increment counter and run the program again.
#!/bin/tcsh
echo "Starting the script."
set counter = 0
while ($? == 0)
# counter ++
./auto $counter
end
I have verified that my program is definitely returning with exit code = 1 after a certain point. However, the condition in the while loop keeps evaluating to true for some reason and running.
I found that if I stick the following line at the end of my loop and then replace the condition check in the while loop with this new variable, it works fine.
while ($return_code == 0)
# counter ++
./auto $counter
set return_code = $?
end
Why is it that I can't just use $? directly? Is another operation underneath the hood performed in between running my custom program and checking the loop condition that's causing $? to change value?
That is peculiar.
I've altered your example to something that I think illustrates the issue more clearly. (Note that $? is an alias for $status.)
#!/bin/tcsh -f
foreach i (1 2 3)
false
# echo false status=$status
end
echo Done status=$status
The output is
Done status=0
If I uncomment the echo command in the loop, the output is:
false status=1
false status=1
false status=1
Done status=0
(Of course the echo in the loop would break the logic anyway, because the echo command completes successfully and sets $status to zero.)
I think what's happening is that the end that terminates the loop is executed as a statement, and it sets $status ($?) to 0.
I see the same behavior with both tcsh and bsd-csh.
Saving the value of $status in another variable immediately after the command is a good workaround -- and arguably just a better way of doing it, since $status is extremely fragile, and will almost literally be clobbered if you look at it.
Note that I've add a -f option to the #! line. This prevents tcsh from sourcing your init file(s) (.cshrc or .tcshrc) and is considered good practice. (That's not the case for sh/bash/ksh/zsh, which assign a completely different meaning to -f.)
A digression: I used tcsh regularly for many years, both as my interactive login shell and for scripting. I would not have anticipated that end would set $status. This is not the first time I've had to find out how tcsh or csh behaves by trial and error and been surprised by the result. It is one of the reasons I switched to bash for interactive and scripting use. I won't tell you to do the same, but you might want to read Tom Christiansen's classic "csh.whynot".
Slightly shorter/simpler explanation:
Recall that with tcsh/csh EACH command (including shell builtin) return a status. Therefore $? (aliases to $status) is updated by 'if' statements, 'for' loops, assignments, ...
From practical point of view, better to limit the usage of direct use of $? to an if statement after the command execution:
do-something
if ( $status == 0 )
...
endif
In all other cases, capture the status in a variable, and use only that variable
do-something
something_status=$?
if ( $something_status == 0 )
...
endif
To expand on the $status, even a condition test in an if statement will modify the status, therefore the following repeated test on $status will not never hit the '$status == 5', even when do-something will return status of 5
do-something
if ( $status == 2 ) then
echo FOO
else if ( $status == 5 ) then
echo BAR
endif
Suppose I have this simple C program (test.c):
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
exit (1);
}
Obviously, the exit code of this program is 1:
$ gcc test.c
$ ./a.out
$ echo $?
1
But when I run test ./a.out, the result of the test doesn't match the exit code:
$ test ./a.out
$ echo $?
0
So what is actually being tested? Why is the result of the test 0?
test is a Bash built-in, often invoked by the alternative name [.
The last command (test ./a.out) exits with status 0 indicating success because test ./a.out checks whether ./a.out as a string has one or more characters in it (is not an empty string), and because it isn't an empty string, returns success or 0. The test ./a.out command line does not execute your a.out program — as you could see by printing something from within your program.
As written, your program doesn't need the <stdio.h> header or the arguments to main() — it should be int main(void). You could lose <stdlib.h> too if you use return 1; instead of exit(1);:
int main(void)
{
return 1;
}
To use the exit status in an if condition in the shell, just use it directly:
if ./a.out ; then
echo Success
else
echo Failure
fi
Rule of Thumb: Don't call C programs test because you will be confused sooner or later — usually sooner rather than later.
Your C program returns "1" to the shell (I'd prefer"return()" over exit()", but...)
If you wanted to actually run "a.out" in conjunction with the "*nix" test command, you'd use syntax like:
`./a.out` # classic *nix
or
$(./a.out) # Bash
If you did that, however, "test" would read the value printed to "stdout", and NOT the value returned by your program on exit.
You can read more about test here:
test(1) - Linux man page
The classic test command: Bash hackers wiki
Understanding exit codes and how to use them in Bash scripts
Here is an example:
C program:
#include <stdio.h>
int main (int argc, char *argv[]) {
printf("%d\n", argc);
return 2;
}
Shell script:
echo "Assign RETVAL the return value of a.out:"
./a.out RETVAL=$? echo " " RETVAL=$RETVAL
echo "Assign RETVAL the value printed to stdout by a.out:"
RETVAL=$(./a.out) echo " " RETVAL=$RETVAL
echo "Turn an 'trace' and run a.out with 'test':"
set -x
if [ $(./a.out) -eq 1 ]; then
echo "One"
else
echo "Not One"
fi
Example output:
paulsm#vps2:~$ ./tmp.sh
Assign RETVAL the return value of a.out:
1
RETVAL=2
Assign RETVAL the value printed to stdout by a.out:
RETVAL=1
Turn an 'trace' and run a.out with 'test':
+++ ./a.out
++ '[' 1 -eq 1 ']'
++ echo One
One
ALSO:
A couple of points that have already been mentioned:
a. return 1 is generally a better choice than exit (1).
b. "test" is probably a poor name for your executable - because it collides with the built-in "test" command. Something like "test_return" might be a better choice.
I am experiencing some strange return values from system() when a child process receives a SIGINT from the terminal. To explain, from a Perl script parent.pl I used system() to run another Perl script as a child process, but I also needed to run the child through the shell, so I used the system 'sh', '-c', ... form.. So the parent of the child became the sh process and the parent of the sh process became parent.pl. Also, to avoid having the sh process receiving the SIGINT signal, I trapped it.
For example, parent.pl:
use feature qw(say);
use strict;
use warnings;
for (1..3) {
my $res = system 'sh', '-c', "trap '' INT; child$_.pl";
say "Parent received return value: " . ($res >> 8);
}
where child1.pl:
local $SIG{INT} = "DEFAULT";
sleep 10;
say "Child timed out..";
exit 1;
child2.pl:
local $SIG{INT} = sub { die };
sleep 10;
say "Child timed out..";
exit 1;
and child3.pl is:
eval {
local $SIG{INT} = sub { die };
sleep 10;
};
if ( $# ) {
print $#;
exit 2;
}
say "Child timed out..";
exit 0;
If I run parent.pl (from the command line) and press CTRL-C to abort each child process, the output is:
^CParent received return value: 130
^CDied at ./child2.pl line 7.
Parent received return value: 4
^CDied at ./child3.pl line 8.
Parent received return value: 2
Now, I would like to know why I get a return value of 130 for case 1, and a return value of 4 for case 2.
Also, it would be nice to know exactly what the "DEFAULT" signal handler does in this case.
Note: the same values are returned if I replace sh with bash ( and trap SIGINT instead of INT in bash ).
See also:
Propagation of signal to parent when using system
perlipc
Chapter 15, in Programming Perl, 4th Edition
This question is very similar to Propagation of signal to parent when using system that you asked earlier.
From my bash docs:
When a command terminates on a fatal signal N, bash uses the value of 128+N as the exit status.
SIGINT is typically 2, so 128 + 2 give you 130.
Perl's die figures out its exit code by inspecting $! or $? for an uncaught exception (so, not the case where you use eval):
exit $! if $!; # errno
exit $? >> 8 if $? >> 8; # child exit status
exit 255; # last resort
Notice that in this case, Perl exits with the value as is, not shifted up 8 bits.
The errno value happens to be 4 (see errno.h). The $! variable is a dualvar with different string and numeric values. Use it numerically (like adding zero) to get the number side:
use v5.10;
local $SIG{INT}=sub{
say "numeric errno is ", $!+0;
die
};
sleep 10;
print q(timed out);
exit 1;
This prints:
$ bash -c "perl errno.pl"
^Cnumeric errno is 4
Died at errno.pl line 6.
$ echo $?
4
Taking your questions out of order:
Also, it would be nice to know exactly what the "DEFAULT" signal handler does in this case.
Setting the handler for a given signal to "DEFAULT" affirms or restores the default signal handler for the given signal, whose action depends on the signal. Details are available from the signal(7) manual page. The default handler for SIGINT terminates the process.
Now, I would like to know why I get a return value of 130 for case 1, and a return value of 4 for case 2.
Your child1 explicitly sets the default handler for SIGINT, so that signal causes it to terminate abnormally. Such a process has no exit code in the conventional sense. The shell also receives the SIGINT, but it traps and ignores it. The exit status it reports for the child process (and therefore for itself) reflects the signal (number 2) that killed the child.
Your other two child processes, on the other hand, catch SIGINT and terminate normally in response. These do produce exit codes, which the shell passes on to you (after trapping and ignoring the SIGINT). The documentation for die() describes how the exit code is determined in this case, but the bottom line is that if you want to exit with a specific code then you should use exit instead of die.
What does
echo $?
mean in shell programming?
This is the exit status of the last executed command.
For example the command true always returns a status of 0 and false always returns a status of 1:
true
echo $? # echoes 0
false
echo $? # echoes 1
From the manual: (acessible by calling man bash in your shell)
? Expands to the exit status of the most recently executed foreground pipeline.
By convention an exit status of 0 means success, and non-zero return status means failure. Learn more about exit statuses on wikipedia.
There are other special variables like this, as you can see on this online manual: https://www.gnu.org/s/bash/manual/bash.html#Special-Parameters
$? returns the exit value of the last executed command. echo $? prints that value on console. zero implies a successful execution while non-zero values are mapped to various reason for failure.
Hence when scripting; I tend to use the following syntax
if [ $? -eq 0 ]; then
# do something
else
# do something else
fi
The comparison is to be done on equals to 0 or not equals 0.
** Update Based on the comment: Ideally, you should not use the above code block for comparison, refer to #tripleee comments and explanation.
echo $? - Gives the EXIT STATUS of the most recently executed command . This EXIT STATUS would most probably be a number with ZERO implying Success and any NON-ZERO value indicating Failure
? - This is one special parameter/variable in bash.
$? - It gives the value stored in the variable "?".
Some similar special parameters in BASH are 1,2,*,# ( Normally seen in echo command as $1 ,$2 , $* , $# , etc., ) .
It has the last status code (exit value) of a command.
Minimal POSIX C exit status example
To understand $?, you must first understand the concept of process exit status which is defined by POSIX. In Linux:
when a process calls the exit system call, the kernel stores the value passed to the system call (an int) even after the process dies.
The exit system call is called by the exit() ANSI C function, and indirectly when you do return from main.
the process that called the exiting child process (Bash), often with fork + exec, can retrieve the exit status of the child with the wait system call
Consider the Bash code:
$ false
$ echo $?
1
The C "equivalent" is:
false.c
#include <stdlib.h> /* exit */
int main(void) {
exit(1);
}
bash.c
#include <unistd.h> /* execl */
#include <stdlib.h> /* fork */
#include <sys/wait.h> /* wait, WEXITSTATUS */
#include <stdio.h> /* printf */
int main(void) {
if (fork() == 0) {
/* Call false. */
execl("./false", "./false", (char *)NULL);
}
int status;
/* Wait for a child to finish. */
wait(&status);
/* Status encodes multiple fields,
* we need WEXITSTATUS to get the exit status:
* http://stackoverflow.com/questions/3659616/returning-exit-code-from-child
**/
printf("$? = %d\n", WEXITSTATUS(status));
}
Compile and run:
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o bash bash.c
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o false false.c
./bash
Output:
$? = 1
In Bash, when you hit enter, a fork + exec + wait happens like above, and bash then sets $? to the exit status of the forked process.
Note: for built-in commands like echo, a process need not be spawned, and Bash just sets $? to 0 to simulate an external process.
Standards and documentation
POSIX 7 2.5.2 "Special Parameters" http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_02 :
? Expands to the decimal exit status of the most recent pipeline (see Pipelines).
man bash "Special Parameters":
The shell treats several parameters specially. These parameters may only be referenced; assignment to them is not allowed. [...]
? Expands to the exit status of the most recently executed foreground pipeline.
ANSI C and POSIX then recommend that:
0 means the program was successful
other values: the program failed somehow.
The exact value could indicate the type of failure.
ANSI C does not define the meaning of any vaues, and POSIX specifies values larger than 125: What is the meaning of "POSIX"?
Bash uses exit status for if
In Bash, we often use the exit status $? implicitly to control if statements as in:
if true; then
:
fi
where true is a program that just returns 0.
The above is equivalent to:
true
result=$?
if [ $result = 0 ]; then
:
fi
And in:
if [ 1 = 1 ]; then
:
fi
[ is just an program with a weird name (and Bash built-in that behaves like it), and 1 = 1 ] its arguments, see also: Difference between single and double square brackets in Bash
From http://www.gnu.org/s/bash/manual/bash.html#Special-Parameters
?
Expands to the exit status of the most recently executed foreground pipeline.
See The Bash Manual under 3.4.2 Special Parameters:
? - Expands to the exit status of the most recently executed foreground pipeline.
It is a little hard to find because it is not listed as $? (the variable name is "just" ?). Also see the exit status section, of course ;-)
Happy coding.
Outputs the result of the last executed unix command
0 implies true
1 implies false