I encountered a bash script ending with the exit line. Would anything change if it were removed (save scaring users who 'source' the script rather than calling it straight, for whom the terminal closes)?
Note that I am not particularly interested in the difference between exit and return. Here I am only interested in the difference made by having exit without parameters at the end of a bash script (one difference being that it closes the console or process which sources the script rather than calling it).
Could it be there to address some lesser-known shell dialects?
There are generally no benefits to doing this. There are only downsides, specifically the inability to source the script, as you say.
You can construct scenarios where it matters, such as having a sourcing script rely on it for termination on errors, or having a self-extracting archive header avoid executing its payload, but these unusual cases should not be the basis for a general guideline.
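To illustrate the self-extracting-archive case, here is a minimal sketch (the marker name and payload are hypothetical): the exit keeps the shell from falling through and interpreting the appended binary data as commands.

#!/bin/sh
# Strip everything up to and including the marker line from this very
# file, and feed the remainder (the appended archive) to tar.
sed '1,/^__ARCHIVE__$/d' "$0" | tar xzf -
exit 0
__ARCHIVE__
(binary tar.gz data appended here)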
The one significant advantage is that it gives you explicit control over the return code.
Otherwise the return code of the script is going to be the return code of whatever the last command it executed happened to be, which may or may not be indicative of the actual success or failure of the script as a whole.
A slightly less significant advantage is that if the last command's exit code is significant and you follow it up with exit $?, that tells the maintenance programmer coming along later that yes, you did consider what the exit code of the program should be, and he shouldn't monkey with it without understanding why.
Conversely, of course, I wouldn't recommend ending a bash script with an explicit call to exit unless you really mean "ignore all previous exit codes and use this one", because that's what anyone else looking at your code is going to assume you wanted, and they're going to be annoyed that you wasted their time figuring out why you did it if you did it just by rote and not for a reason.
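A minimal sketch of the endings discussed above (do_work is a stand-in for whatever the script actually runs):

#!/bin/bash
do_work
exit $?    # deliberate: propagate do_work's status as the script's status
# exit 0   # alternative: ignore all previous statuses and report success
# (with no exit at all, the script's status is simply do_work's status anyway)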
Is it possible to raise exceptions in bash? This can be useful, for example, when we want the script to exit when an error happens in a subcommand. Without exceptions, it seems the best we can do is to append || exit after each subcommand, which gives poor readability.
I didn't find any description of exceptions in the bash manual. But I'm wondering whether there are ways to simulate them.
No, Bash does not have a notion of exceptions the way languages like Java do. The key unit of error reporting in Bash is the exit code: functions, commands, and scripts all return 0 on success and non-zero to report some sort of error condition. Many programs document specific exit codes to report certain failure modes; for instance, grep uses 1 to mean no match was found and 2 to report other errors.
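A short sketch of exception-style handling built on those exit codes, using grep's documented values (the file name is hypothetical):

grep -q 'pattern' ./input.txt
case $? in
    0) echo "match found" ;;
    1) echo "no match" ;;                        # grep's documented "not found"
    *) echo "grep itself failed" >&2; exit 1 ;;  # e.g. an unreadable file
esac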
There are a number of useful debugging tricks you can take advantage of despite the lack of exceptions, including the caller command, which enables some introspection of the current execution context.
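For example, a minimal sketch of caller walking the call stack from an ERR trap (the function names are hypothetical):

#!/bin/bash
set -E    # let the ERR trap fire inside functions too
on_error() {
    local frame=0
    # caller prints "line function file" per frame; it fails past the last one
    while caller "$frame"; do
        frame=$((frame + 1))
    done
}
trap 'on_error' ERR

fail_here() { false; }    # hypothetical failing function
fail_here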
Other resources:
How to debug a bash script?
Trace of executed programs called by bash script
Accessing function call stack in trap function
I want to press a key at any point, causing the simulation to stop without losing the data collected until that point. I don't know how to set up the exit handling. Can you give me some examples?
I think WandMaker's comment tells only half of the story.
First, there is no general rule that Control-C will interrupt your program (see for instance here), but assume that it works in your case (since it will work in many cases):
If I understand you right, you want to somehow "process" the data collected up to this point. This means that you need to intercept the effect of Control-C (which, IF it works as expected, will make the controlling shell deliver a SIGINT), or that you need to intercept the "exit" (since the default behaviour upon receiving a SIGINT is to exit the program).
If you want to go along the first path, you need to catch the Interrupt exception; see for example here.
If you want to follow the second route, you need to install an exit handler. Note that it will also be called when the program exits in the normal way.
If you are unsure which way is better - and I see no general way to recommend one over the other - try the first one. There is less chance that you will accidentally ruin something.
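If the simulation happens to be driven by a shell script, both routes can be sketched with trap (save_data, run_simulation_step, and the file names are hypothetical):

#!/bin/bash
save_data() { cp ./results.tmp ./results.saved; }    # hypothetical save step

# Route 1: intercept Control-C (SIGINT) directly.
trap 'save_data; exit 130' INT
# Route 2 (alternative): an exit handler; note it also runs on normal exit.
# trap 'save_data' EXIT

while true; do
    run_simulation_step    # hypothetical; appends data to ./results.tmp
done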
I'm writing a rather trivial bash script; if I detect an error (not with a bad exit status from some other process) I want to exit with an exit status indicating an error (without being too specific).
It seems like I should be doing exit 1 (e.g. as per the TLDP Advanced Bash Scripting Guide, and the C Standard Library's stdlib.h header); yet I notice many people exit -1. Why is that?
TLDP's ABS is of questionable validity (in that it often uses, without comment, sub-par practices) so I wouldn't take it as a particular bastion of correctness about this.
That said, valid command return codes are between 0 and 255, with 0 being "success". So yes, 1 is a perfectly valid (and common) error code.
Obviously I cannot say for certain why other people do that but I have two thoughts on the topic.
A failure to context switch (possibly combined with a lack of domain knowledge).
In many languages a return value of -1 from a function is a perfectly valid value and stands out from all the positive values that might (one assumes) normally be returned.
So attempting to extend that pattern (which the writer has picked up over time) to a shell script is a reasonable thing for them to do, especially if they don't have the domain knowledge to realize that valid return codes are between 0 and 255.
An attempt to have those error exit lines "stand out" from normal exit cases (which may or may not be successful exits themselves), visually distinguishing a certain set of extremely unlikely or otherwise extraordinary exit cases.
An exit of -1 does, actually, work; it just doesn't get you a return code of -1, it gets you a return code of 255. (Try (exit -1); echo $? in your shell to see that.) So this isn't an entirely unreasonable thing to want to do (despite being confusing and complicit in perpetuating confusion about exit codes).
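The wrap-around is simply modulo 256, as a quick experiment in a shell shows:

$ (exit -1); echo $?
255
$ (exit 256); echo $?
0
$ (exit 257); echo $?
1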
Is there a minimally POSIX.2 compliant shell (let's call it mpcsh) in the following sense:
if mpcsh myscript.sh behaves correctly on my (compliant) system then xsh myscript.sh will behave identically for any POSIX.2 compliant shell xsh on any compliant system. ("Identically" up to less relevant things like the wording of error messages etc.)
Does dash qualify?
If not, is there any way to verify compliance of myscript.sh?
Edit (9 years later):
The accepted answer still stands, but have a look at this blog post and the checkbashisms command (source). Avoiding bashisms is not the same as writing a POSIX.2 compliant shell script, but it comes close.
The sad answer in advance
It won't help you (not as much or as reliably as you would expect and want it to, anyway).
Here is why.
One big problem that cannot be addressed by a hypothetical "POSIX shell" is behavior that is ambiguously worded or simply not addressed in the standard, so that shells may implement things in different ways while still adhering to it.
Take these two examples regarding pipelines, the first of which is well known:
Example 1 - scoping
$ ksh -c 'printf "foo" | read s; echo "[${s}]"'
[foo]
$ bash -c 'printf "foo" | read s; echo "[${s}]"'
[]
ksh executes the last command of a pipe in the current shell, whereas bash executes all - including the last command - in a subshell. bash 4 introduced the lastpipe option which makes it behave like ksh:
$ bash -c 'shopt -s lastpipe; printf "foo" | read s; echo "[${s}]"'
[foo]
All of this is (debatably) according to the standard:
Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment.
I am not 100% certain what they meant by extension, but based on other examples in the document it does not mean that the shell has to provide a way to switch between behaviors, but simply that it may, if it wishes, implement things in this "extended way". Other people read this differently and argue that the ksh behavior is non-standards-compliant, and I can see why. Not only is the wording unfortunate, it is not a good idea to allow this in the first place.
In practice it doesn't really matter which behavior is correct, since those are the """two big shells""". People assume that if you avoid their extensions and use only supposedly POSIX-compliant code, your script will work in either, but the truth is that if you rely on one or the other behavior mentioned above, it can break in horrible ways.
Example 2 - redirection
This one I learnt about just a couple of days ago; see my answer here:
foo | bar 2>./qux | quux
Common sense and POLA tell me that when the next line of code is hit, both quux and bar should have finished running, meaning that the file ./qux is fully populated. Right? No.
POSIX states that
If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.
May (!) wait for all commands to complete! WTH!
bash waits:
The shell waits for all commands in the pipeline to terminate before returning a value.
but ksh doesn't:
Each command, except possibly the last, is run as a separate process; the shell waits for the last command to terminate.
So if you use redirection in the middle of a pipeline, make sure you know what you are doing, since this is treated differently across shells and can horribly break in edge cases, depending on your code.
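Based on the two manual excerpts above, here is a contrived experiment (the file name ./qux is arbitrary) that makes the difference visible: head exits after one line, but the left-hand command keeps running for two more seconds before writing "late" to ./qux.

$ bash -c '{ echo begin; sleep 2; echo late >&2; } 2>./qux | head -n1; cat ./qux'
begin
late
$ ksh -c '{ echo begin; sleep 2; echo late >&2; } 2>./qux | head -n1; cat ./qux'
begin

bash waits for the whole pipeline, so cat sees the finished file; ksh returns as soon as head is done, so cat runs while the left-hand side is still asleep.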
I could give another example not related to pipelines, but I hope these two suffice.
Conclusion
Having a standard is good, continuously revising it is even better, and adhering to it is great. But if the standard fails due to ambiguity or permissiveness, things can still unexpectedly break, rendering the usefulness of the standard practically void.
What this means in practice is that on top of writing "POSIX-compliant" code you still need to think and know what you are doing to prevent certain things from happening.
All that being said, one shell which has not yet been mentioned is posh, which is supposedly POSIX plus even fewer extensions than dash has (primarily echo -n and the local keyword), according to its manpage:
BUGS
Any bugs in posh should be reported via the Debian BTS.
Legitimate bugs are inconsistencies between manpage and behavior,
and inconsistencies between behavior and Debian policy
(currently SUSv3 compliance with the following exceptions:
echo -n, binary -a and -o to test, local scoping).
YMMV.
Probably the closest thing to a canonical shell is ash, which is maintained by The NetBSD Foundation, among other organizations.
A downstream variant of this shell called dash is better known.
Currently, there is no single role model for the POSIX shell.
Since the original Bourne shell, the POSIX shell has adopted a number of additional features.
All of the shells that I know that implement those features also have extensions that go beyond the feature set of the POSIX shell.
For instance, POSIX allows for arithmetic expressions in the format:
var=$(( expression ))
but it does not allow the equivalent:
(( var = expression ))
supported by bash and ksh93.
I know that bash has a set -o posix option, but that will not disable any extensions.
$ set -o posix
$ (( a = 1 + 1 ))
$ echo $a
2
To the best of my knowledge, ksh93 tries to conform to POSIX out of the box, but still allows extensions.
The POSIX developers spent years (not an exaggeration) wrestling with the question: "What does it mean for an application program to conform to the standard?" While the POSIX developers were able to define a conformance test suite for an implementation of the standards (POSIX.1 and POSIX.2), and could define the notion of a "strictly conforming application" as one which used no interface beyond the mandatory elements of the standard, they were unable to define a testing regime that would confirm that a particular application program was "strictly conforming" to POSIX.1, or that a shell script was "strictly conforming" to POSIX.2.
The original question seeks just that: a conformance test that verifies a script uses only elements of the standard which are fully specified. Alas, the standard is full of "weasel words" that loosen definitions of behavior, making such a test effectively impossible for a script of any significant level of usefulness. (This is true even setting aside the fact that shell scripts can generate and execute shell scripts, thus rendering the question of "strictly conforming" equivalent to the Halting Problem.)
(Full disclosure: I was a working member and committee leader within IEEE-CS TCOS, the creators of the POSIX family of standards, from 1988-1999.)
If not, is there any way to verify compliance of myscript.sh?
This is basically a case of Quality Assurance. Start with:
code review
unit tests (yes, I've done this)
functional tests
perform the test suite with as many different shell programs as you can find (ash, bash, dash, ksh93, mksh, zsh); a loop like the sketch below works well.
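A minimal sketch of that last step (myscript.sh and the shell list are placeholders; adjust to what is installed):

for sh in ash bash dash ksh93 mksh zsh; do
    command -v "$sh" >/dev/null 2>&1 || continue    # skip shells not installed
    printf '=== %s ===\n' "$sh"
    "$sh" ./myscript.sh || printf '%s exited with status %d\n' "$sh" "$?"
done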
Personally, I aim for the common set of extensions as supported by bash and ksh93. They're the oldest and most widely available interpreters of the shell language.
EDIT: Recently I happened upon rylnd/shpec - a testing framework for your shell code. You can describe features of your code in test cases, and specify how they should be verified.
Disclosure: I helped make it work across bash, ksh, and dash.