When calling shell scripts from Erlang, I generally need their exit status (0 or something else), so I run them using this function:
%% in module util
os_cmd_exitstatus(Action, Cmd) ->
    ?debug("~ts starting... Shell command: ~ts", [Action, Cmd]),
    try erlang:open_port({spawn, Cmd}, [exit_status, stderr_to_stdout]) of
        Port ->
            os_cmd_exitstatus_loop(Action, Port)
    catch
        _:Reason ->
            Message =
                case Reason of
                    badarg ->
                        "Bad input arguments";
                    system_limit ->
                        "All available ports in the Erlang emulator are in use";
                    _ ->
                        file:format_error(Reason)
                end,
            ?error("~ts: shell command error: ~ts", [Action, Message]),
            error
    end.

os_cmd_exitstatus_loop(Action, Port) ->
    receive
        {Port, {data, Data}} ->
            ?debug("~ts... Shell output: ~ts", [Action, Data]),
            os_cmd_exitstatus_loop(Action, Port);
        {Port, {exit_status, 0}} ->
            ?info("~ts finished successfully", [Action]),
            ok;
        {Port, {exit_status, Status}} ->
            ?error("~ts failed with exit status ~p", [Action, Status]),
            error;
        {'EXIT', Port, Reason} ->
            ?error("~ts failed with port exit: reason ~ts",
                   [Action, file:format_error(Reason)]),
            error
    end.
This worked fine, until I used this to start a script which forks off a program and exits:
#!/bin/sh
FILENAME=$1
eog $FILENAME &
exit 0
(In the actual use case, there are quite a few more arguments, and some massaging before they are passed to the program.) When run from the terminal, it shows the image and exits immediately, as expected.
But running from Erlang, it doesn't. In the log file I see that it starts fine:
22/Mar/2011 13:38:30.518 Debug: Starting player starting... Shell command: /home/aromanov/workspace/gmcontroller/scripts.dummy/image/show-image.sh /home/aromanov/workspace/media/images/9e89471e-eb0b-43f8-8c12-97bbe598e7f7.png
and the eog window appears. But I don't get
22/Mar/2011 13:47:14.709 Info: Starting player finished successfully
until killing the eog process (with kill or just closing the window), which isn't suitable for my requirements. Why the difference in behavior? Is there a way to fix it?
Normally, if you run a command in the background with & in a shell script and the script terminates before the command, the command gets orphaned. It might be that Erlang tries to prevent orphaned processes in open_port and waits for eog to terminate. Normally, if you want to run something in the background during a shell script, you should put a wait at the end of the script to wait for your background processes to terminate. But this is exactly what you don't want to do.
You might try the following in your shell script:
#!/bin/sh
FILENAME=$1
daemon eog $FILENAME
# exit 0 not needed: daemon returns 0 if everything is ok
This works if your operating system has a daemon command. I checked FreeBSD and it has one: daemon(8)
This command is not available on all Unix-like systems; however, there might be a different command doing the same thing on your operating system.
The daemon utility detaches itself from the controlling terminal and executes the program specified by its arguments.
I'm not sure if this solves your problem, but I suspect that eog somehow stays attached to stdin/stdout as a kind of controlling terminal. Worth a try anyway.
This should also rule out the possibility that job control is erroneously switched on, which could cause the same problem. Since daemon exits normally, your shell can't try to wait for the background job on exit, because from the shell's point of view there is none.
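If no daemon command is available, a rough equivalent can be tried directly in the script. This is a minimal sketch that assumes the suspicion above is right, namely that the background program keeps the script's stdin/stdout open and the caller keeps waiting for them to close. The helper name launch_detached is made up for illustration.

```shell
# launch_detached is a hypothetical helper, not part of any standard tool.
# It starts "$@" in the background with stdin, stdout and stderr pointed at
# /dev/null, so the background program does not hold on to the pipes that
# the script inherited from its caller.
launch_detached() {
    "$@" </dev/null >/dev/null 2>&1 &
}

# Usage inside the script (commented out so the sketch has no side effects):
# launch_detached eog "$FILENAME"
```

Whether this is enough for the Erlang port to report the exit status right away depends on the assumption above, so it is worth testing against the real script.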
Having said all this: why not just keep the port open in Erlang while eog runs?
Start it with:
#!/bin/sh
FILENAME=$1
exec eog $FILENAME
Calling it with exec doesn't fork it but replaces the shell process with eog. The exit status you'll see in Erlang will then be the status of eog when it terminates. You also have the option of closing the port and terminating eog from Erlang if you want to.
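The effect of exec on the exit status can be checked without eog. In this sketch, run_with_exec is a made-up helper and the inner sh -c stands in for the wrapper script; both are assumptions for illustration only.

```shell
# run_with_exec is a hypothetical helper. The inner shell plays the role of
# the wrapper script: exec replaces that shell with the given command, so the
# status the caller sees is the command's own exit status.
run_with_exec() {
    sh -c 'exec "$@"' wrapper "$@"
}

# Stand-in for eog: a command that exits with status 7.
status=0
run_with_exec sh -c 'exit 7' || status=$?
echo "wrapper exited with status $status"
```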
Perhaps your /bin/sh doesn't support job control when it isn't run interactively? At least the /bin/sh (actually dash(1)!) on my Ubuntu system mentions:
-m monitor Turn on job control (set automatically
when interactive).
When you run the script from a terminal, the shell probably recognizes that it is being run interactively and supports job control. When you run the shell script as a port, the shell probably runs without job control.
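One way to check this guess is to print the job control state from inside the script, once from a terminal and once through the port. This is a minimal sketch; job_control_state is a made-up name.

```shell
# job_control_state is a hypothetical helper. The special parameter $- holds
# the shell's current option flags; the letter m is present when the monitor
# option (job control) is on.
job_control_state() {
    case $- in
        *m*) echo "on" ;;
        *)   echo "off" ;;
    esac
}

echo "job control is $(job_control_state)"
```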
Related
I'm trying to achieve the following:
from a fish script, open a PDF reader as a background job. Once it is opened, spawn another fish process (that runs an infinite while loop), also as a background job.
Next, open an editor (neovim) and allow it to take control of the running terminal. Once neovim terminates, also suspend the previous 2 background jobs (mupdf and the other fish process).
My current attempt looks something along the lines of:
mupdf $pdfpath &
set pid_mupdf $last_pid
fish -c "while inotifywait ...; [logic to rebuild the pdf file..]; end" &
set pid_sub $last_pid
nvim $mdpath && kill -2 $pid_mudf $pid_sub
First I open mupdf as a background job and save its PID in a variable. Next I spawn the other fish process, also as a background job, and I save its PID as well.
Next I run nvim (but not as a background job, as I intend to actually control it), and after it is terminated by the user, I gracefully kill the previous 2 background jobs.
However this doesn't work as intended.
mupdf and the second fish process open successfully, and so does nvim, but it quickly closes after around half a second, after which I get the following in the controlling terminal window: [screenshot not included] (bote is just the filename of the script from which the lines above originate)
The 2 background processes stay running after that and I have to kill them manually.
I understand that the script is sent a SIGHUP because the controlling terminal now executes another application (neovim), but why does neovim close after that?
I also tried disowning the background processes after they're spawned but that didn't help.
How would I solve this issue?
The problem is that $last_pid in fish 3, and %last in fish 2, don't work by default in scripts. See https://github.com/fish-shell/fish-shell/issues/5036. You can "fix" this by putting status job-control full at the top of the script or using the (jobs -lp) hack that Glenn mentioned.
Regarding the background process remaining running... I can't reproduce that. It works for me. However, note that your nvim && kill will only run the kill if nvim exits with a status of zero. If you always want the kill to be run you should just unconditionally execute it. Also, your use of signal two (SIGINT) should produce the desired result but is unusual. You should use kill -15 or just omit the signal in which case it defaults to 15 (SIGTERM).
You're getting the PID incorrectly. The $pid_mudf and $pid_sub variables are empty. You want
set pid_mupdf (jobs -lp)
I have a bash script server.sh which is maintained by an external source and ideally should not be modified. This script writes to stdout and stderr.
In fact, this server.sh itself is doing an exec tclsh immediately:
#!/bin/sh
# \
exec tclsh "$0" ${1+"$@"}
so in fact, it is just a wrapper around a Tcl script. I just mention this in case you think that this matters.
I need a Tcl script setup.tcl which is supposed to do some preparatory work, then invoke server.sh (in the background), then do some cleanup work (and display the PID of the background process), and terminate.
server.sh is supposed to continue running until explicitly killed.
setup.tcl is usually invoked manually, either from a Cygwin bash shell or from a Windows cmd shell. In the latter case, it is ensured that Cygwin's bash.exe is in the PATH.
The environment is Windows 7 and Cygwin. The Tcl is either Cygwin's (8.5) or ActiveState 8.4.
The first version (omitting error handling) went like this:
# setup.tcl:
# .... preparatory work goes here
set childpid [exec bash.exe server.sh &]
# .... clean up work goes here
puts $childpid
exit 0
While this works when started as ActiveState Tcl from a Windows CMD shell, it does not work in a pure Cygwin setup. The reason is that as soon as setup.tcl ends, a signal is sent to the child process and this is killed too.
Using nohup would not help here, because I want to see the output of server.sh as soon as it occurs.
My next idea was to create an intermediate bash script, mediator.sh, which uses disown -h to detach the child process and keep it from being killed:
#!/usr/bin/bash
# mediator.sh
server.sh &
child=$!
disown -h $child
and invoke mediator.sh from setup.tcl. But aside from the fact that I don't see an easy way to pass the child PID up to setup.tcl, the main problem is that it doesn't work either: While mediator.sh indeed keeps the child alive when called from the Cygwin command line directly, we have the same behaviour again (server.sh being killed when setup.tcl exits), when I call it via setup.tcl.
Anybody knowing a solution for this?
You'll want to set a trap handler in your server script so you can handle/ignore certain signals.
For example, to ignore HUP signals, you can do something like the following:
#!/bin/bash
handle_signal() {
echo "Ignoring HUP signal"
}
trap handle_signal SIGHUP
# Rest of code goes here
In the example case, if the script receives a HUP signal it will print a message and continue as normal. It will still die to Ctrl-C as that's the INT signal which is unhandled.
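The behaviour described above can be tried in isolation. This is a minimal sketch with a made-up helper name, hup_demo; it assumes HUP was not already ignored when the shell started (for example under nohup), in which case the handler message would not appear.

```shell
# hup_demo is a hypothetical helper. It starts a child shell that installs a
# HUP handler, sends HUP to itself, and then carries on.
hup_demo() {
    sh -c '
        trap "echo Ignoring HUP signal" HUP
        kill -HUP $$
        echo "still running"
    '
}

hup_demo
```

Without the trap line, the child shell would be terminated by the signal and never reach the final echo.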
I'm trying to use a shell script to start a command. I don't care if/when/how/why it finishes. I want the process to start and run, but I want to be able to get back to my shell immediately...
You can just run the script in the background:
$ myscript &
Note that this is different from putting the & inside your script, which probably won't do what you want.
Everyone just forgot disown. So here is a summary:
& puts the job in the background.
Makes it block on attempting to read input, and
Makes the shell not wait for its completion.
disown removes the process from the shell's job control, but it still leaves it connected to the terminal.
One of the results is that the shell won't send it a SIGHUP. (If the shell receives a SIGHUP, it also sends a SIGHUP to its jobs, which normally causes them to terminate.)
And obviously, it can only be applied to background jobs (because you cannot enter it when a foreground job is running).
nohup shields the process from SIGHUP and, if its output would go to the terminal, redirects it to nohup.out.
The process ignores any SIGHUP that is sent to it.
It's completely independent from job control and could in principle also be used for foreground jobs (although that's not very useful).
Usually used with & (as a background job).
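The effect of disown on the shell's job table can be observed directly. This is a minimal sketch: disown_demo is a made-up name, and because disown is a bash builtin the demonstration runs inside bash explicitly, which is assumed to be installed.

```shell
# disown_demo is a hypothetical helper. It prints the job table (as PIDs)
# before and after disown so the difference is visible.
disown_demo() {
    bash -c '
        sleep 5 </dev/null >/dev/null 2>&1 &   # & puts the job in the background
        echo "before=[$(jobs -p)]"             # the job is in the job table
        disown                                 # remove it from the job table
        echo "after=[$(jobs -p)]"              # bash no longer tracks it
    '
}

disown_demo
```

The sleep process keeps running after disown; the shell just no longer lists it as a job.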
nohup cmd
doesn't hangup when you close the terminal. output by default goes to nohup.out
You can combine this with backgrounding,
nohup cmd &
and get rid of the output,
nohup cmd > /dev/null 2>&1 &
You can also disown a command: type cmd, press Ctrl-Z, then run bg and disown.
Alternatively, after you got the program running, you can hit Ctrl-Z which stops your program and then type
bg
which puts your last stopped program in the background. (Useful if you started something without '&' and still want it in the background without restarting it.)
screen -d -m command starts the command in a detached session. You can use screen -r to attach to the started session. It is a wonderful tool, and extremely useful for remote sessions too. Read more at man screen.
I logged in to a remote server via ssh and started a php script. Apparently it will take 17 hours to complete. Is there a way to break the connection but keep the script executing? I didn't set up any output redirection, so I am seeing all the output.
Can you stop the process right now? If so, launch screen, start the process and detach screen using ctrl-a then ctrl-d. Use screen -r to retrieve the session later.
This should be available in most distros, failing that, a package will definitely be available for you.
ctrl + z
will pause it. Then type
bg
to send it to background. Write down the PID of the process for later usage ;)
EDIT: I forgot, after that you have to execute
disown $PID
where $PID is the PID of your process.
Once that is done, the process will not be killed when you close the terminal.
You said it is important that the script keeps running. Unfortunately, I don't know whether you need to interact with the script or whether you wrote it yourself.
The screen command protects against interruption: your connection will break, but screen keeps the pseudo-terminal alive and you can reconnect to it later; see man screen.
If nobody needs to interact with the script, you can simply put it in the background from the start and write all of its output to a log file. Use the command:
nohup /where/is/your.script.php >output.log 2>&1 &
>output.log redirects output into the log file, and 2>&1 sends the error stream to the same place as the output, so it also ends up in the log file. The last & puts the command in the background. Note that nohup makes the process ignore the hangup signal (SIGHUP).
Now you can safely exit from the ssh shell. Because your script ignores the hangup, it won't be killed. It will be re-parented from your shell process to the system init process. That is how Unix-like systems behave. You can monitor the complete output using the command
tail -f output.log # can always be interrupted with ^C; it only watches the file
With this method you do not need shell tricks such as ^Z and bg to put the command in the background.
Note that redirecting the output of the nohup command explicitly is preferred. Otherwise nohup will redirect all output to a nohup.out file in the current directory.
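The nohup-plus-redirection recipe can be wrapped in a small function so the log file is always given explicitly. This is only a sketch; run_logged is a made-up name and the way it reports the PID is an illustrative choice.

```shell
# run_logged is a hypothetical helper: run_logged LOGFILE COMMAND [ARGS...]
# It starts the command under nohup in the background, sends stdout and
# stderr to LOGFILE, and prints the PID of the background process.
run_logged() {
    log=$1
    shift
    nohup "$@" </dev/null >"$log" 2>&1 &
    echo $!
}

# Usage (the script path is the placeholder used in the text above):
# pid=$(run_logged output.log /where/is/your.script.php)
# tail -f output.log
```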
You can use screen.
I want to know why I am seeing different behaviour for a background process in the Bash shell.
Case 1: Logged in to Unix server using Putty(SSH)
By default it uses csh shell
I changed to bash shell
typed sleep 2000 &
press enter
It gave me the job number. Then I killed my session by clicking the x in the Putty window.
I opened another session and tried to look up the process: the process had died.
Case 2: Logged in to Unix server using Putty (SSH)
By default it uses csh shell
I changed to bash shell
vi mysleep.sh
sleep 2000 &
Saved mysleep.sh
./mysleep.sh
The difference here is that instead of executing the sleep command directly, I store the sleep command in a file and execute the file.
Then I killed my session by clicking the x in the Putty window.
I opened another session and tried to look up the process: the process was still there.
I'm not sure why this is happening. I thought I needed to use disown in bash to keep the process running even after logging out.
One difference I see is in the parent process ID: in the second case, the parent process ID of sleep 2000 becomes 1. It looks like as soon as the process for mysleep.sh died, the kernel reassigned the parent to process 1.
The difference here is indeed the intervening process.
When you close the terminal window, a HUP signal (related to "nohup" as an0nymo0usc0ward mentioned) is sent to the processes running in it. The default action on receiving HUP is to die - from the signal(3) manpage,
No    Name      Default Action       Description
1     SIGHUP    terminate process    terminal line hangup
In your first example, the sleep process directly receives this HUP signal and dies because it isn't set to do anything else. (Some processes catch HUP and use it to perform some action, e.g. reread some configuration files)
In the second example, the shell process running your shell script has already died, so the sleep process never gets the signal. In UNIX, every process must have a parent process due to the internals of how the wait(2) family of calls works and indeed processes in general. So when the parent process dies, the kernel hands the orphaned child to init (pid 1, as you note) as a foster child.
Orphan process (on wikipedia) has some more information available about it, also see Zombie process for some additional technical details.
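The second case can be reproduced without logging out. In this minimal sketch, orphan_demo is a made-up name, the inner sh stands in for mysleep.sh, and the sleep is shortened to 5 seconds.

```shell
# orphan_demo is a hypothetical helper. The inner sh plays the role of
# mysleep.sh: it starts sleep in the background, prints the PID, and exits.
orphan_demo() {
    pid=$(sh -c 'sleep 5 </dev/null >/dev/null 2>&1 & echo $!')
    # The inner shell is gone by now, but its child is still running.
    if kill -0 "$pid" 2>/dev/null; then
        echo "alive $pid"
    else
        echo "gone $pid"
    fi
}

orphan_demo
# To see the new parent (assumes a ps that supports -o and -p):
# ps -o pid=,ppid= -p <PID printed above>
```

The sleep process outlives the shell that started it, which is why no shell is left to pass a HUP on to it when the session is closed.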
Already running process?
^z
bg
disown %<jobid>
New process/script (on local machine's console)?
nohup script.sh &
New process/script (on remote machine's console)?
Depending on your need,
there are two options [ there will be more ;-) ]
ssh remotehost 'nohup /path/to/script.sh </dev/null > nohup.out 2>&1 &'
OR
use 'screen'
Try "nohup cmd args..."
Steven's answer is correct, but I'd like to highlight the tricky part here again:
=> Using a bash script that just executes sleep in the background
The effect of this is that the "script" exits almost immediately (since it's done all its commands). However, it did create a child process (sleep) during its lifetime. The effect of this is that:
The "script" cannot be the parent anymore, and sleep is orphaned to init (which shows nicely in a pstree)
The bash shell where you started the script from has no underlying jobs anymore
Note that this stuff all happens when you executed the script, and has nothing to do with any ssh logout/putty closing.
When you then finally close your putty session, bash receives a "SIGHUP", but doesn't forward it to any other process (since there are no jobs left)
In the other case, bash did still have a job left, which it then sent the SIGHUP to, causing it to end (as you noticed)
Hope this helps