Why would someone use two shebangs in a shell script?

I have inherited a medium-sized collection of scripts where some of them start with two shebangs, like this:
#!/bin/sh
#!/bin/bash
[do stuff]
Is there a valid reason for using this construct?
In my experience, any Unix will only respect the first line as a shebang, and the second line will be the first line of the script, which the sh interpreter will ignore as being a comment. Should I assume this is a mistake by the programmer? Is there any difference in, say, compatibility or portability if I were to simply remove the second shebang?

You're right: there is no use for having two shebang lines. Only the first will ever be used.
Perhaps it was a mistake during some previous automated refactoring. Just look in your revision control system--I'm absolutely certain whoever did this left a clear and detailed comment explaining their rationale.
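If you want to convince yourself before deleting the second line, a small experiment along these lines may help. It is only a sketch: the file comes from mktemp, the interpreter path on the second line is deliberately bogus, and it assumes the temporary directory allows executing files.

```shell
# Build a throwaway script whose second line is a second "shebang".
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
#!/nonexistent/interpreter
echo "ran under the first shebang"
EOF
chmod +x "$demo"

# The kernel reads only line 1 to pick the interpreter; sh then sees
# line 2 as an ordinary comment, so the bogus path is never looked up.
result=$("$demo")
echo "$result"
rm -f "$demo"
```

If the second line mattered, the script would fail to start instead of printing its message.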

Related

Why don't makefiles behave more like shell scripts within recipes?

I find makefiles very useful, and the header of each recipe
<target> : [dependencies]
is helpful. Within a recipe, the prefixes @ and - are useful, as well as the automatically-defined variables like $@ and $?. However, besides that, I find the way of coding the actual recipe to be strange and unhelpful. There are so many questions on StackOverflow along the lines of "how to do this in a makefile" for something that's simple (or at least more familiar) to do in bash.
Is there a reason why the recipe contents are not just interpreted as a regular shell script? Reading the manual pages, there seem to be many tools with equivalent functionality to a shell script but with different syntax. I end up specifying .ONESHELL and escaping $ with $$, or sometimes just call a script from the recipe when I can't figure out how to make it work in a makefile. My question is whether this is just unfortunate design, or whether there are important features of makefiles that force them to be designed this way.
I don't really know how to answer your question. Probably that means it's not really appropriate for StackOverflow.
The requirement for using $$ instead of $ is obvious: make uses $ to introduce its own variables, so a literal $ meant for the shell has to be escaped. The reasoning for using a separate shell for each logical line of a makefile, instead of passing the entire recipe to a single shell, is less clear. It could have worked either way, and this is the way it was chosen.
There is one advantage to the way it works now, although maybe most people don't care about it: you only have to indent the first recipe line with TAB, if you use backslash newline to continue each line. If you don't use backslash newline, then every line has to be indented with TAB else you don't know where the recipe ends.
If your question is, could Stuart Feldman have made very different syntax decisions that would have made it easier to write long/complex recipes in makefiles, then sure. Choosing a more obscure character than $ as a variable introducer would reduce the amount of escaping (although, shell scripting uses pretty much every special character somewhere so "reduce" is the best you can do). Choosing an explicit "start/stop" character sequence for recipes would make it simpler to write long recipes, possibly at the expense of some readability.
But that's not how it was done.
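The one-shell-per-line behaviour can be imitated in plain shell, without make at all. The sketch below is only an analogy: each sh -c call stands in for one recipe line, and the variable names are made up for illustration.

```shell
start=$(pwd)

# Two recipe lines: make starts a fresh shell for each, so the "cd"
# in the first shell is forgotten before the second one runs.
separate=$(sh -c 'cd /'; sh -c 'pwd')

# One logical line (joined with backslash-newline, or under .ONESHELL):
# both commands share a shell, so the "cd" is still in effect.
joined=$(sh -c 'cd / && pwd')

echo "separate shells: $separate"
echo "single shell:    $joined"
```

The first result is whatever directory you started in; the second is /.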

What is the rationale behind variable assignment without space in bash script

I am trying to write an automated process for AWS that requires some JSON processing and other things in a bash script. I am following a few blogs on bash scripting and I found this:
a=b
with the following note:
There is no space on either side of the equals ( = ) sign. We
also leave off the $ sign from the beginning of the variable name when
setting it
This is ugly and very difficult to read, and compared to other scripting languages it makes it easy to break a bash script by accidentally leaving a space in between. I think everyone likes to write clean and readable code, and this restriction is certainly bad for readability.
Can you explain why? Explanations with examples are highly appreciated.
It's because otherwise the syntax would be ambiguous. Consider this command line:
cat = foo
Is that an assignment to the variable cat, or running the command cat with the arguments "=" and "foo"? Note that "=" and "foo" are both perfectly legal filenames, and therefore reasonable things to run cat on. Shell syntax settles this in favor of the command interpretation, so to avoid this interpretation you need to leave out the spaces. cat =foo has the same problem.
On the other hand, consider:
var= cat
Is that the command cat run with the variable var set to the empty string (i.e. a shorthand for var='' cat), or an assignment to the shell variable var? Again, the shell syntax favors the command interpretation so you need to avoid the temptation to add spaces.
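Here is a rough sketch of the forms discussed so far, using sh -c as a stand-in command so the effect is visible. The variable names are arbitrary.

```shell
# No spaces: a plain assignment to a shell variable.
greeting=hello

# Assignment followed by a command: the variable is set only in that
# command's environment; the shell's own variable is left alone.
in_child=$(greeting=goodbye sh -c 'echo "$greeting"')

# "var= cmd" is the same form with an empty value.
empty_in_child=$(greeting= sh -c 'echo "[$greeting]"')

echo "$greeting $in_child $empty_in_child"
```

This prints hello goodbye [], showing that the prefixed forms never touched the shell's own greeting.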
There are many places in shell syntax where spaces are important delimiters. Another commonly-messed-up place is in tests, where if you leave out any of the spaces in:
if [ "$foo" = "$bar" ]
...it will lead to a different meaning, which might cause an error, or might just silently do the wrong thing.
What I'm getting at is that shell syntax does not allow you to arbitrarily add or remove spaces to improve readability. Don't even try, you'll just break things.
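To make the "silently do the wrong thing" case concrete, here is a small sketch with two values that are clearly different. The variable names are only for illustration.

```shell
foo=apple
bar=orange

# With spaces, [ receives three arguments and compares the two strings.
if [ "$foo" = "$bar" ]; then spaced=equal; else spaced=different; fi

# Without spaces, [ receives the single word "apple=orange". A lone
# non-empty string counts as true, so this test can never fail.
if [ "$foo"="$bar" ]; then unspaced=equal; else unspaced=different; fi

echo "with spaces: $spaced, without spaces: $unspaced"
```

The second test reports "equal" with no error message at all, which is why this mistake is easy to miss.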
What you need to understand is that the shell language and syntax is old. Really old. The first version of the UNIX shell with variables was the Bourne shell which was designed and implemented in 1977. Back then, there were few precedents. (AFAIK, just the Thompson shell, which didn't support variables according to the manual entry.)
The rationale for the design decisions in the 1970s is ... lost in the mists of time. The design decisions were made by Steve Bourne and colleagues working at Bell Labs on v6 UNIX. They probably had no idea that their decisions would still be relevant 40+ years later.
The Bourne shell was designed to be general purpose and simple to use ... compared with the alternative of writing programs in C. And small. It was an outstanding success in those terms.
However, any language that is successful has the "problem" that it gets widely adopted. And that makes it more difficult to fix any issues (real or perceived) that may arise. Any proposal to change a language needs to be balanced against the impact of that change on existing users / uses of the language. You don't want to break existing programs or scripts.
Irrespective of arguments about whether spaces around = should be allowed in a shell variable assignment, changing this would break millions of shell scripts. It is just not going to happen.
Of course, Linux (and UNIX before it) allow you to design and implement your own shell. You could (in theory) replace the default shell. It is just a lot of work.
And there is nothing stopping you from writing your scripts in another scripting language (e.g. Python, Ruby, Perl, etc) or designing and implementing your own scripting language.
In summary:
We cannot know for sure why they designed the shell with this syntax for variable assignment, but it is moot anyway.
Reference:
Evolution of shells in Linux: a history of shells.
It prevents ambiguity in a lot of cases. Otherwise, if you have a statement foo = bar, it could then either mean run the foo program with = and bar as arguments, or set the foo variable to bar. When you require that there are no spaces, now you've limited ambiguity to the case where a program name contains an equals sign, which is basically unheard of.
I agree with @StephenC, and here's some more context with sources:
Unix v6 from 1975 did not have an environment, there was just an exec syscall that took a program and a string array of arguments. The system sh, written by Thompson, did not support variables, only single digit numbered arguments like $1 (probably why $12 to this day is interpreted as ${1}2).
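The $12 quirk is easy to check in a POSIX-style shell such as bash or dash. This is a minimal sketch; the function name and arguments are invented for the demonstration.

```shell
show_params() {
  unbraced="$12"     # parsed as ${1} followed by a literal 2
  braced="${12}"     # braces are needed to reach the twelfth argument
}

show_params a b c d e f g h i j k l
echo "unbraced: $unbraced, braced: $braced"
```

With these arguments the unbraced form gives a2 and the braced form gives l.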
Unix v7 from 1979, emboldened by advances in hardware, added a ton of features including a second string array to the exec call. The man page described it like this, which is still how it works to this day:
An array of strings called the environment is made available by exec(2) when a process begins. By convention these strings have the form name=value
The system sh, now written by Bourne, worked much like v6 shell, but now allowed you to specify these environment strings in the same format in front of commands (because which other format would you use?). The simplistic parser essentially split words by spaces, and flagged a word as destined for a variable if it contained a = and all preceding characters had been alphanumeric.
Thanks to Unix v7's incredible popularity, forks and clones copied a lot of things including this behavior, and that's what we're still seeing today.

Why does Scala use a reversed shebang (!#) instead of just setting interpreter to scala

The scala documentation shows that the way to create a scala script is like this:
#!/bin/sh
exec scala "$0" "$@"
!#
/* Script here */
I know that this executes scala with the name of the script file and the arguments passed to it, and that the scala command apparently knows to read a file that starts like this and ignore everything up to the reversed shebang !#
My question is: is there any reason why I should use this (rather verbose) format for a scala script, rather than just:
#!/bin/env scala
/* Script here */
This, as far as I can tell from a quick test, does exactly the same thing, but is less verbose.
How old is the documentation? Usually, this sort of thing (often referred to as 'the exec hack') was recommended before /bin/env was common, and this was the best way to get the functionality. Note that /usr/bin/env is more common than /bin/env, and ought to be used instead.
Note that it's /usr/bin/env, not /bin/env.
There are no benefits to using an intermediate shell instead of /usr/bin/env, except running in some rare antique Unix variants where env isn't in /usr/bin. Well, technically SCO still exists, but does Scala even run there?
However the advantage of the shell variant is that it gives an opportunity to tune what is executed, for example to add elements to PATH or CLASSPATH, or to add options such as -savecompiled to the interpreter (as shown in the manual). This may be why the documentation suggests the shell form.
I am not on the Scala development team and I don't know what the historical motivation for the Scala documentation was.
Scala did not always support /usr/bin/env. No particular reason for it, just, I imagine, the person who wrote the shell scripting support was not familiar with that syntax, back in the mid 00's. The documentation followed what was supported, and I added /usr/bin/env support at some point (iirc), but never bothered changing the documentation, it would seem.

Is there any shell script and/or Makefile static code analyser?

Or how can I ensure reliability of my Makefiles/scripts?
Update: by shell scripts I mean the sh dialects (bash, zsh, whatever); by Makefiles I mean GNU make. I know they are different beasts, but they have much in common.
P.S. Yes, I know that static code analysis can't verify all possible cases, and that I need to write my Makefiles and shell scripts in a way that is reliable. I just need a tool that will tell me when I use bad practices that I forgot about or didn't notice in a big script. It shouldn't fix errors for me, just take a second look.
For sh scripts, ShellCheck will do some static analysis checks, like detecting when variable modifications are hidden by subshells, when you accidentally use [ $foo=bar ] or when you neglect to quote variables that could contain spaces. It also comments on some stylistic issues like useless use of cat or using sed when you could use parameter expansion.
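As an illustration of the first kind of problem (variable modifications hidden by subshells), here is a sketch of the pitfall and one possible fix. ShellCheck itself is not invoked here, and the exact behaviour of the first loop depends on the shell: it is written with bash and dash in mind.

```shell
# Pitfall: in bash and dash each part of a pipeline runs in a subshell,
# so the increments below are lost when the loop ends.
count=0
printf 'a\nb\n' | while read -r line; do
  count=$((count + 1))
done
piped_count=$count

# One fix: feed the loop from a here-document so it runs in the
# current shell and the variable survives.
count=0
while read -r line; do
  count=$((count + 1))
done <<'EOF'
a
b
EOF
fixed_count=$count

echo "piped: $piped_count, here-document: $fixed_count"
```

Nothing in the first loop looks wrong at a glance, which is exactly the sort of thing a second look from a tool is good for.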

What are the most important shell/terminal concepts/commands for novice to learn?

Although I've had to dabble in shell scripting and commands, I still consider myself a novice and I'm interested to hear from others what they consider to be crucial bits of knowledge.
Here's an example of something that I think is important:
I think understanding $PATH is crucial. In order to run psql, for instance, the PostgreSQL folder has to be added to the $PATH variable, a step easily overlooked by beginners.
Concept of pipes. The fact that you can easily redirect output and divide a complex task into several simple ones is crucial.
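A sketch of what adding a folder to $PATH can look like. The directory below is a hypothetical install location, not something every system has, so adjust it to wherever psql actually lives on your machine.

```shell
# Hypothetical PostgreSQL bin directory; yours may differ.
pg_bin="/usr/local/pgsql/bin"

# Append it only if it is not already listed, so that running this
# twice does not keep growing PATH.
case ":$PATH:" in
  *":$pg_bin:"*) ;;
  *) PATH="$PATH:$pg_bin" ;;
esac
export PATH

echo "$PATH"
```

Put something like this in your shell start-up file if you want it to apply to every new session.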
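For instance, finding the most frequent word in a list can be split into small steps, each handled by a standard tool. This is just one possible arrangement, with made-up input.

```shell
most_common=$(printf '%s\n' apple banana apple cherry apple banana \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1 \
  | awk '{print $2}')

echo "$most_common"
```

Each stage does one simple job: group identical lines, count them, order by count, keep the top line, and strip the count.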
Do yourself a favor and get this book: Learning the Bash Shell
Read and understand:
The Official Bash FAQs
Greg Wooledge's Bash FAQs and Bash Pitfalls and everything else on that site
If you're writing shell scripts, an important habit to get into is to always put double quotes around variable substitutions. That is, always write "$myvariable" (and similarly "$(mycommand)"), never plain $myvariable or $(mycommand), unless you understand exactly why you need to leave them out. (Again, the question is not “should I use quotes?”, it's “why would I want to omit the quotes?”)
The reason is that the shell does nasty things when you leave a variable substitution unquoted. (Those nasty things are called field splitting and pathname expansion. They're good in some situations, but almost never on the result of a variable or command substitution.)
If you leave out the quotes, your script may appear to work at first glance. This is because nasty things only happen if the value of the variable contains some special characters (whitespace, \, *, ? and [). This sort of latent bug tends to be revealed the day you create a file whose name contains a space and your script ends up deleting your source tree/thesis/baby pictures/...
So for example, if you have a variable $filename that contains the name of a file you want to pass to a command, always write
mycommand "$filename"
and not mycommand $filename.
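A quick way to see field splitting in action is to count how many arguments a command actually receives. This sketch assumes a POSIX-style shell with the default IFS; count_args is a helper made up for the demonstration.

```shell
# Print how many arguments were received.
count_args() { echo "$#"; }

filename="my thesis.txt"

unquoted=$(count_args $filename)     # split at the space into two words
quoted=$(count_args "$filename")     # passed through as a single word

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

The unquoted form hands the command two separate names, my and thesis.txt, neither of which is the file you meant.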
