Circular Shebang #! (too many levels of symbolic links) - bash

For context: I was reading this example of trickery one can do with a shebang. It uses
#!/bin/rm
which ends up deleting the file you executed (very funny; you can extend this to create self-deleting messages). This shows that the program (rm) is invoked with the filename as an argument and therefore has access to the whole file including the shebang that invoked it.
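For instance, a throwaway "self-deleting message" could be set up like this (a minimal sketch; the file name /tmp/secret-note is just an illustration):
printf '%s\n' '#!/bin/rm' 'This note destroys itself when executed.' > /tmp/secret-note
chmod +x /tmp/secret-note
/tmp/secret-note        # the kernel runs: /bin/rm /tmp/secret-note
cat /tmp/secret-note    # fails now; the file has deleted itself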
Another trick I've come up with is to make a script its own interpreter in the shebang, creating an infinite loop. For example, if /usr/bin/loop starts with
#!/usr/bin/loop
it should invoke itself with itself forever. Obviously at some point an error will occur, and in my particular case I get:
bash: /usr/bin/loop: /usr/bin/loop: bad interpreter: Too many levels of symbolic links
It looks like it got two levels deep. Can somebody explain to me why this particular error occurs? Or maybe share some other error messages for different shells.
In particular I would like to understand why there are symbolic links involved and whether this is an implementation detail of bash or not.

why this particular error occurs?
Because when the kernel tries to run the executable, it sees the interpreter /usr/bin/loop, tries to run that, sees the interpreter /usr/bin/loop again, and so on, until it finally fails with ELOOP. The kernel explicitly checks for and limits this interpreter recursion:
https://elixir.bootlin.com/linux/latest/source/fs/exec.c#L1767
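You can reproduce the same failure without touching /usr/bin (a quick sketch; /tmp/loop is an arbitrary path):
printf '#!/tmp/loop\n' > /tmp/loop
chmod +x /tmp/loop
/tmp/loop   # execve() gives up on the interpreter recursion and returns ELOOP, which bash reports with the message quoted above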
share some other error messages for different shells.
While different shells could in principle word it differently, this particular errno message text comes from glibc's strerror().
why there are symbolic links involved
Because ELOOP is most commonly returned when path resolution runs into too many symbolic links, its standard error string mentions them. No symbolic links are actually involved here; the kernel simply reuses ELOOP for excessive interpreter recursion.
this is an implementation detail of bash or not.
It is not: the ELOOP error comes from the kernel, and bash merely reports it.

Related

What is the mysterious relationship between `fchown()` and `flock()`?

Reading the man pages for fchown, I find this statement:
The fchown() system call is particularly useful when used in conjunction with the file locking primitives (see flock(2)).
Here's the mystery: the man page for flock(2) makes no mention of fchown or even how ownership affects it in general.
So can anyone explain what happens when fchown and flock are used together and why it's so "useful?"
I'm developing for macOS (Darwin), but I find the same statement (and lack of an explanation) in Linux, BSD, POSIX, and virtually every other *NIX man page I've searched.
Backstory ('cause every great villain has a backstory):
I have a set-UID helper process that gets executed as root, but spends much of its time running as user. While it's running as user, the files it creates belong to user. Good so far.
However, occasionally it needs to create files while running as root. When this happens, the files belong to root and I want them to belong to user. So my plan was to create+open the file, then call fchown() to change the ownership back to user.
But a few of these files are shared and I use flock() to block concurrent access to the file and now I'm wondering what will happen to my flocks.

Are there any benefits of ending a bash script with exit?

I encountered a bash script ending with an exit line. Would anything change (apart from scaring users who 'source' the script rather than calling it directly, when their terminal closes)?
Note that I am not particularly interested in the difference between exit and return. Here I am only interested in what difference a parameterless exit at the end of a bash script makes (one difference being that it closes the console or process that sources the script rather than calling it).
Could it be there to accommodate some lesser-known shell dialects?
There are generally no benefits to doing this. There are only downsides, specifically the inability to source the script (since the exit would terminate the sourcing shell), as you say.
You can construct scenarios where it matters, such as having a sourcing script rely on it for termination on errors, or having a self-extracting archive header avoid executing its payload, but these unusual cases should not be the basis for a general guideline.
The one significant advantage is that it gives you explicit control over the return code.
Otherwise the return code of the script is going to be the return code of whatever the last command it executed happened to be, which may or may not be indicative of the actual success or failure of the script as a whole.
A slightly less significant advantage is that if the last command's exit code is significant and you follow it up with "exit $?", that tells the maintenance programmer coming along later that yes, you did consider what the exit code of the program should be, and that they shouldn't monkey with it without understanding why.
Conversely, of course, I wouldn't recommend ending a bash script with an explicit call to exit unless you really mean "ignore all previous exit codes and use this one", because that's what anyone else looking at your code is going to assume you wanted, and they're going to be annoyed at wasting time figuring out why if you did it by rote rather than for a reason.
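To illustrate the difference (an illustrative sketch; do_work and cleanup_tempfiles are hypothetical commands):
#!/bin/bash
do_work             # may fail
cleanup_tempfiles   # without an explicit exit, the script's status is this command's, not do_work's
versus making the intent explicit:
#!/bin/bash
do_work             # may fail
status=$?
cleanup_tempfiles
exit "$status"      # deliberately report do_work's result, ignoring cleanup's exit code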

Whether to redirect stderr to stdout OR redirect both to the same file?

Which is better?
cmd >>file 2>&1
cmd 1>>file 2>>file
Is there even a difference?
I know two reasons to choose the first one: it also works with > instead of >>, and it is more popular, so someone who knows shell scripting would recognize it right away.
But I still feel the second one is more readable, and it works without having to know the [n]>&[n] syntax, which IMHO is kinda confusing.
What is the difference?
Let's examine what each of these commands means. I will assume that the POSIX shell specification applies since the question doesn't ask about anything more specific.
The first command is cmd >>file 2>&1. This runs cmd after setting up the specified redirections.
The redirection >>file opens the named file with O_APPEND. As explained in the specification of open, this creates a new Open File Description, which notably contains the current file offset, and arranges for File Descriptor 1 to refer to that description. The meaning of O_APPEND is "the file offset shall be set to the end of the file prior to each write".
The redirection 2>&1 says that file descriptor 2 "shall be made to be a copy" of file descriptor 1. That specification is a little vague, but I think the only sensible interpretation (and what shells actually do) is it means to call dup2(1, 2), which "shall cause the file descriptor [2] to refer to the same open file description as the file descriptor [1]". Crucially, we get another file descriptor, but continue to use the same file description, meaning they both have the same file offset.
The second command is cmd 1>>file 2>>file. Based on the specifications cited above, this creates two separate file descriptions for file, each with their own offset.
Now, if the only thing that cmd does to file descriptors 1 and 2 is to call write, then these two situations are equivalent, because every call to write will atomically update the offset to point to the end of the file before performing the write, and therefore the existence of two separate offsets in the second command will not have any observable effect.
However, if cmd performs some other operation, for example lseek, then the two cases are not equivalent because that will reveal that the first command has one shared offset while the second command has two independent offsets.
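One way to make the two-offsets effect visible on a local filesystem is to drop the append flag (use > instead of >>), since O_APPEND otherwise moves both offsets to the end before every write; the exact result depends on the order of the writes, but roughly (a hedged sketch):
{ echo 111111; echo 22 >&2; } > f 2> f    # two descriptions, both starting at offset 0: the later write overwrites the start of the earlier one
{ echo 111111; echo 22 >&2; } > f 2>&1    # one shared description: the second write lands after the first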
Additionally, the above assumes the POSIX-specified semantics of O_APPEND. But real computer systems do not always implement that; for example, NFS does not have atomic append. Without atomic append, the second command may behave differently (most likely corrupting the output) even when only write is performed.
Which is better?
As the two commands do not mean the same thing, which is better presumably depends on which meaning is closer to what you intend. I speculate that, in almost all cases, the intent is to append to file both the standard output and standard error from cmd, which is presumed to only write to these descriptors. That is precisely the meaning of the first command (cmd >>file 2>&1), and hence is the better choice.
While the second command does use fewer shell features, and hence might be easier to understand for some people, it would probably seem odd to those who do have greater familiarity with redirection syntax, and might even behave differently than intended in some circumstances. Therefore I would advise against it, and if I found it in some code I was maintaining, would be inclined to change it to the first form.
Of course, if you truly want separate file descriptions, and hence separate file offsets, then the second command makes sense, so long as you put a comment nearby explaining the rationale for the unusual construction.

Changing directories in goroutines

I am trying to change directories in a goroutine to directory x. I now want to use a different goroutine that changes the directory to directory y. Will the execution of my first goroutine be affected by this change to the current working directory in the second goroutine? The purpose of doing this is to introduce parallelism while doing similar tasks. If it does end up changing the CWD, what would be an alternative approach (forking...)?
As mentioned in the comments, keeping track of the current working directory in each goroutine will cause problems.
Try using filepath.Abs to capture the absolute directory and store that instead. Then each goroutine can operate on its own directory without worrying about it being "switched" under the hood. Just be sure you're not accidentally modifying the same file from multiple goroutines.
Edit: Removed a chunk of text per #Evan's comment. Use absolute paths :p
#Evan has identified a fundamental flaw in attempting to change the current working directory (CWD) with a system call.
I believe that #Evan is correct, and that the CWD is a per-thread property on some OSes.
As #Evan pointed out, a goroutine could be rescheduled (for example at a function call, channel access, or system call) onto a different thread.
The implication is that it may be impossible to rely on changing the CWD (if Chdir() were to change the thread's CWD), because Go's runtime may choose to reschedule the goroutine onto a different thread; its CWD could then change invisibly and unpredictably.
Edit: I would not expect Chdir() to do anything other than change the CWD for the process. However, the documentation for the package has no mention of 'process'.
Worse, the runtime's behaviour in this respect may change between releases.
Even worse, it would be very hard to debug. It may be a 'Heisenberg problem', where any attempt to debug it (for example by calling a function, which the runtime may use as a reschedule point) may actually change the behaviour in an unpredictable way.
Keep track of absolute path names. This is explicit, clear, and would even work across goroutines without any need for synchronisation. Hence it is simpler, and easier to test and debug.

Help in understanding this bash file

I am trying to understand the code in this page: https://github.com/corroded/git-achievements/blob/gh-pages/git-achievements
and I'm kinda at a loss on how it actually works. I do know some bash and shell scripting, but how does this script actually "store" how many times you've used a command (I'm guessing it saves to a text file?) and how does it "sense" that you actually typed in a git command? I have a feeling it's line 464 onwards that does it, but I don't seem to quite follow the logic.
Can anyone explain this in a bit more understandable context?
I plan to do some achievements for other commands and I hope to have an idea on HOW to go about it without randomly copying and pasting stuff and voodoo.
Yes, the script proper starts at line 464; everything before that is helper functions. I don't know how it gets installed, but I would assume you have to call this script instead of the normal git command. It checks whether the first parameter is achievement, and if not it runs regular git with the remaining parameters. Afterwards it checks whether an error happened (and exits if so). Then it calls log_action and check_for_achievments. log_action just writes the issued command with a date into a text file, while check_for_achievments scans that log file for certain events. If you want to add another achievement, you have to do it in check_for_achievments.
Just look at how the big case statement handles it (most of the achievements call count_function, which counts the number of usages of the command and matches when a power of 2 is reached).
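The core idea reduces to something like the following (an illustrative sketch only, with made-up names; not the actual git-achievements code):
LOG=~/.git-command-log
log_cmd() {
    # append the date and the issued command to a plain text file
    printf '%s %s\n' "$(date +%F)" "$*" >> "$LOG"
}
check_cmd() {
    # count how many times this command has been logged so far
    count=$(grep -cF -- "$1" "$LOG")
    # award an "achievement" whenever the count reaches a power of two
    if [ "$count" -gt 0 ] && [ $(( count & (count - 1) )) -eq 0 ]; then
        echo "Achievement unlocked: used '$1' $count times"
    fi
}
log_cmd "$@"
check_cmd "$1"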

Resources